How to get the most frequent words

Dezdi on 26 Nov 2018
Commented: Dezdi on 27 Nov 2018
How can I create a function that takes n as input and returns a cell array containing the n most frequent words in a text, together with the number of times each one appeared, and that returns the same for the least frequent words? The text is in a separate text file.
Many thanks.

Answers (1)

Matt J on 26 Nov 2018
Edited: Matt J on 26 Nov 2018
Say str is your text. Then you could get a histogram of all the words in str by doing,
[u,i,j]=unique(strsplit(str));
H=histcounts(j,1:max(j)+1)
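For readers unsure what the three outputs of unique hold, here is a minimal sketch on a made-up five-word string (the sample text is an assumption for illustration only). It keeps the names u, i and j from the answer above; note that using i and j as variable names hides MATLAB's imaginary unit inside this workspace.

```matlab
str = 'the cat and the hat';     % made-up sample text (assumption)
words = strsplit(str);           % split on whitespace: 1x5 cell array of words
[u, i, j] = unique(words);       % u: distinct words in sorted order

% i holds, for each entry of u, a position in words where that entry occurs,
% so indexing words with i rebuilds u.
rebuildsU = isequal(u, words(i));

% j holds, for each word of the text, its position in u,
% so indexing u with j rebuilds the original word list.
rebuildsWords = isequal(words, u(j));
```

Because j replaces every word by an integer between 1 and numel(u), counting how often each integer occurs in j is the same as counting how often each word occurs in the text.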
Comments: 6
Steven Lord on 26 Nov 2018
Let's choose a sample piece of text (thanks to Mr. Dickens and Earl Bulwer-Lytton):
str = 'it was the best of times it was the worst of times it was a dark and stormy night';
[u,i,j]=unique(strsplit(str));
H=histcounts(j,1:max(j)+1);
The +1 in the histcounts call is necessary so the last bin does not contain both max(j) and max(j)-1. Look at the elements of u and H. To see them side-by-side easily I'll put them in a table.
t = table(u.', H.', 'VariableNames', {'u', 'H'})
Looking at the information in the table t, what would you assign to the most and least variables if you were solving this problem manually? How do you think you could generate those most and least variables in your program using the information stored in the u and H variables?
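The remark about the +1 in the histcounts call can be checked on a small example. This is only a sketch: the index vector below is hypothetical, and the accumarray line is an alternative way to count that is not part of the original answer.

```matlab
j = [1 2 2 3 3 3];                   % hypothetical word indices with values 1..3

% Edges [1 2 3] give only two bins, [1,2) and [2,3]. The last bin includes
% both of its edges, so the counts for 2 and 3 are merged.
Hshort = histcounts(j, 1:max(j));    % [1 5]

% Edges [1 2 3 4] give one bin per index value.
Hfull = histcounts(j, 1:max(j)+1);   % [1 2 3]

% Alternative count without bin edges: add 1 at position j(k) for every k.
Hacc = accumarray(j(:), 1).';        % [1 2 3]
```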
Dezdi on 27 Nov 2018
table doesn't work for me.
I understand what u is, but what are i and idx?
maxk(H,3) returns only numbers, and if I use those numbers to look up the words I get the same word back.
How could I put them into a cell array that contains each word and how many times it appeared?
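One possible way to build such a cell array is sketched below. It is not the method of anyone in this thread, only an illustration under stated assumptions: n is set by hand, the sample sentence from the earlier comment stands in for the file contents, and the names order, top, bottom, most and least are made up here. The key point is that the counts themselves are not valid positions in u; the second output of sort gives the positions of the words, and those are what should index u.

```matlab
str = 'it was the best of times it was the worst of times it was a dark and stormy night';
n = 3;                                  % number of words to report (assumed input)

[u, ~, j] = unique(strsplit(str));      % distinct words and a word index per token
H = histcounts(j, 1:max(j)+1);          % H(k) is how often u{k} appears

[~, order] = sort(H, 'descend');        % positions in u, most frequent first
n = min(n, numel(u));                   % do not ask for more words than exist

top = order(1:n);                       % positions of the n most frequent words
bottom = order(end:-1:end-n+1);         % positions of the n least frequent words

% n-by-2 cell arrays: first column the word, second column its count
most  = [u(top).',    num2cell(H(top).')];
least = [u(bottom).', num2cell(H(bottom).')];
```

To use text from a file, str could be read with fileread, for example str = fileread('mytext.txt') where the file name is hypothetical. Wrapping the lines after the first in a function that takes str and n as inputs and returns most and least would give the interface asked for in the question. This sketch treats words as case-sensitive and leaves punctuation attached to them, and when several words share the same count, which of them is reported depends on the sort order.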
