Histogram of letters of a text

Julozert on 2 May 2020
Commented: Walter Roberson on 2 May 2020
Hi guys,
I want to create a histogram that shows how often each letter is used in a text, but I have no idea how to count the letters or how to plot the histogram so that I can see each letter on the x-axis.
Does anyone have an idea how I can do it?
  Comments: 1
Rik on 2 May 2020
You might also be interested in the Text Analytics Toolbox.


Accepted Answer

Walter Roberson on 2 May 2020
Compare the current input character against the first possible letter that you want to count. If you get a match, increment the counter associated with that letter. Otherwise compare against the second possible letter, and if there is a match, increment the counter associated with that letter. And so on. Eventually move on to the next input character.
OR
Compare all of the input characters against the first letter you want to count. Set the counter associated with that letter to the number of matches you got; do the same thing for the second letter you want to count, and so on.
Hint: you can create a vector of the letters you want to count, and do the counting in a loop.
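As a minimal sketch of the second approach, the loop below compares the whole text against one letter at a time. The sample string txt, the choice to count only a to z, and the decision to fold upper and lower case together are all assumptions made for illustration, not part of the question.

```matlab
% Hypothetical sample text (assumption; substitute your own string).
txt = 'Hello World, hello MATLAB';

letters = 'a':'z';                    % vector of the letters to count
counts  = zeros(1, numel(letters));   % one counter per letter
lowered = lower(txt);                 % count 'A' and 'a' together (assumption)

for k = 1:numel(letters)
    % Compare every input character against the k-th letter and
    % store the number of matches in that letter's counter.
    counts(k) = sum(lowered == letters(k));
end

% Plot one bar per letter and label each tick with its letter.
bar(counts)
set(gca, 'XTick', 1:numel(letters), 'XTickLabel', num2cell(letters))
```

Characters that are not in letters (spaces, punctuation, digits) are simply never matched, so they do not need to be filtered out first.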
  Comments: 2
Julozert on 2 May 2020
So I have found a way to create my histogram, but is there a way to compare the letter counts of two texts in one histogram?
Let's say I have counted the letter 'A' 1000 times in text1 and 2000 times in text2, and I want the two bars next to each other in the same figure so I can compare them. How could I do that? I tried using hold on, but it looked odd and the bars were stacked on top of each other.
Walter Roberson on 2 May 2020
You can use bar() with the 'grouped' option.
Use one column (important that it be column!) in Y for each bar-in-a-group that you want drawn.
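A rough sketch of that layout, assuming two hypothetical sample strings text1 and text2 and counting only the letters a to z without regard to case: each text's counts become one column of Y, so each row of Y is drawn as one group of bars.

```matlab
% Hypothetical sample texts (assumptions; substitute your own strings).
text1 = 'the quick brown fox';
text2 = 'jumps over the lazy dog';

letters = 'a':'z';

% Count each letter in each text, ignoring case (assumption).
counts1 = arrayfun(@(c) sum(lower(text1) == c), letters);
counts2 = arrayfun(@(c) sum(lower(text2) == c), letters);

% One column per text, one row per letter.
Y = [counts1(:), counts2(:)];

bar(Y, 'grouped')
set(gca, 'XTick', 1:numel(letters), 'XTickLabel', num2cell(letters))
legend('text1', 'text2')
```

The (:) forces each count vector into a column. If the counts were left as rows and stacked with a semicolon, bar would instead draw two groups of 26 bars each.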


More Answers (1)

Rik on 2 May 2020
Once you have the text in a MATLAB array, it is stored as numbers (character codes), so you can use the normal numeric tools.
lorem='Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.';
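letters=unique(lorem); % sorted distinct characters (includes space and punctuation)
letter_counts=histcounts(double(lorem),'BinMethod','integers'); % one bin per character code from min to max
letter_counts(letter_counts==0)=[]; % drop codes that never occur, leaving one count per entry in letters
bar(1:numel(letters),letter_counts)
set(gca,'XTick',1:numel(letters)) % one tick per distinct character
set(gca,'XTickLabel',num2cell(letters)) % label each tick with its character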
  Comments: 1
Walter Roberson on 2 May 2020
I believe that this is a homework problem...


Release

R2020a
