Does Text Analytics Toolbox allow users to test out-of-sample perplexity with LDA?

I want to split my data into two samples: one for training and one for testing. I then want to fit an LDA model on the training sample and compute the perplexity of the test sample under the fitted model. Is this possible with Text Analytics Toolbox?

Accepted Answer

Christopher Creutzig on 26 Nov 2018
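Yes. The second output of logp is the perplexity of the documents you pass in, so you can fit the model on the training documents and then call logp on the held-out ones: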
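% Load the text and split it into blocks at blank lines
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);   % keep every second block, starting at the fifth
td = tokenizedDocument(sonnets);
% Fit a 5-topic LDA model on the first 50 documents (training sample)
bow = bagOfWords(td(1:50));
mdl = fitlda(bow,5,'Verbose',0);
% Encode three held-out documents with the training vocabulary and score them
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999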
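
The example above uses a fixed split (documents 1 to 50 for training, 51 to 53 for testing). If you want a random split instead, the same idea can be sketched as follows. This is only an illustration: the 10% hold-out fraction, the seed, and the variable names (idx, numTest, tdTrain, tdTest, bowTrain, perplTest) are my own choices, not anything the toolbox prescribes.

```matlab
% Sketch: random hold-out split, then out-of-sample perplexity.
% Assumes sonnets.txt is on the MATLAB path, as in the example above.
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);

rng(0);                          % fix the seed so the split is repeatable
n = numel(td);
idx = randperm(n);               % random ordering of the document indices
numTest = round(0.1*n);          % 10% hold-out is an arbitrary choice
tdTest  = td(idx(1:numTest));
tdTrain = td(idx(numTest+1:end));

% Fit the model on the training documents only
bowTrain = bagOfWords(tdTrain);
mdl = fitlda(bowTrain,5,'Verbose',0);

% Encode the test documents against the training vocabulary and score them
[~,perplTest] = logp(mdl, encode(bowTrain,tdTest));
```

Because the test documents are encoded with the training bag-of-words, words that occur only in the test sample are not counted. Lower perplexity on the held-out documents indicates a better fit, so perplTest can be compared across different numbers of topics.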
