Does anyone know how the sonnetsCounts.mat file used on the following MATLAB page was created? https://uk.mathworks.com/help/textanalytics/ref/ldamodel.predict.html
Predict Top LDA Topics of Word Count Matrix
Load the example data. sonnetsCounts.mat contains a matrix of word counts and a corresponding vocabulary of preprocessed versions of Shakespeare's sonnets.
load sonnetsCounts.mat
size(counts)
ans = 1×2
154 3092
When I open the sonnetsCounts.mat file, it contains the following data:
val =
(1,1) 1
(106,1) 1
(131,1) 2
(154,1) 1
(1,2) 1
(143,2) 1
I presume the second column is the frequency of words, but I'm not sure whether the vector in the first column represents two words.
Peter

Accepted Answer

Walter Roberson, 24 Dec 2018 (edited 24 Dec 2018)

counts is a sparse matrix, so only its nonzero entries are displayed, each in the form (row,column) value. For example,
(143,2) 1
means that sonnet #143, unique word #2, had a count of 1.
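
To see how that display maps onto an ordinary matrix, here is a small sketch that rebuilds the six entries shown in the question as a sparse matrix. Only those six entries come from the question; the 154-by-2 size is an assumption made to keep the example small (the real matrix is 154-by-3092).

```matlab
% Row indices (sonnet numbers), column indices (vocabulary word numbers),
% and counts, copied from the six entries displayed in the question.
rows = [1 106 131 154 1 143];
cols = [1 1 1 1 2 2];
vals = [1 1 2 1 1 1];

% Assumption: keep only the first two vocabulary columns for illustration.
counts = sparse(rows, cols, vals, 154, 2);

% Count of word #2 in sonnet #143, converted from sparse to a plain scalar.
c = full(counts(143, 2));

% Sonnets in which word #1 occurs, and how often it occurs in each.
[sonnetIdx, ~, wordCount] = find(counts(:, 1));
disp([sonnetIdx wordCount])
```

So each displayed line is one (sonnet, word) pair with its count, not a pair of words.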

4 Comments

Peter Mayhew, 26 Dec 2018 (edited 26 Dec 2018)
Thanks, Walter, for your response. I've spent more time looking into this since your reply.
Would I be correct in saying that the sonnet sparse matrix would be created using the encode command, i.e. encode(bag,documents)?
Peter
Walter Roberson, 26 Dec 2018
No. It is the Counts property of the bag directly, not the result of encoding an additional document against the bag.
Peter Mayhew, 26 Dec 2018 (edited 26 Dec 2018)
OK, so if I understand correctly, I would run the following command:
bag = bagOfWords(documents);
and then check the counts property of the variable bag.
Walter Roberson, 26 Dec 2018
Counts with a capital C, but yes.
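
As a rough sketch of the steps discussed above (not necessarily how MathWorks produced the file), the following builds a bag-of-words model and reads its Counts property. The two documents are made up for illustration, the actual preprocessing applied to the sonnets is not stated in this thread, and the variable name vocabulary is an assumption. The code requires Text Analytics Toolbox.

```matlab
% Assumption: two made-up documents stand in for the preprocessed sonnets.
str = [
    "fairest creatures desire increase"
    "beauty rose fairest beauty"];
documents = tokenizedDocument(str);

% Build the bag-of-words model from the tokenized documents.
bag = bagOfWords(documents);

% Counts is a sparse NumDocuments-by-NumWords matrix;
% Vocabulary lists the unique word that each column refers to.
counts = bag.Counts;
vocabulary = bag.Vocabulary;

% One row per document, one column per unique word.
size(counts)

% Count of the word "beauty" in document 2.
beautyCount = full(counts(2, vocabulary == "beauty"))
```

Column j of counts corresponds to vocabulary(j), which is how a column number in the sparse display can be mapped back to a word.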

Release: R2018b

Asked: 23 Dec 2018

Commented: 26 Dec 2018
