MATLAB Answers

Question on running fitlda

I want to run fitlda, with the following specification:
* use Griffiths and Steyvers (2004) Gibbs Sampling algorithm for LDA as they ran it,
* 12 topics (i.e. K=12),
* a symmetric alpha of 50/K (no updating),
* a symmetric beta of .01 (no updating), and
* exactly 2000 iterations (without early termination).
Would that be:
numTopics = 12;
mdl = fitlda(bag,numTopics,'Verbose',1,'InitialTopicConcentration',50,'FitTopicConcentration',false,'WordConcentration',.01,'LogLikelihoodTolerance',0,'IterationLimit',2000);


Accepted Answer

Christopher Creutzig 10 Dec 2018
Gibbs sampling involves stochastic elements (i.e., a pseudorandom number generator), so reproducing the results of the 2004 paper exactly would require using their code and their rng settings. (This is also why, in degenerate cases, multiple fitlda calls give substantially different answers.)
Without looking up the definition of β in the original paper, I'm not sure if you want to set 'WordConcentration',.01 or 'WordConcentration',.01*bag.NumWords.
Other than that, the call looks like it should do what you ask, yes.
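To illustrate the point about the pseudorandom number generator, here is a minimal sketch. It assumes a small made-up corpus (the four sentences are hypothetical, not from the question) and the default global random stream. Seeding with rng before each call is the usual way to make repeated fitlda runs comparable; it does not reproduce the numbers from the 2004 paper.

```matlab
% Hypothetical toy corpus for illustration only; substitute your own bag.
docs = tokenizedDocument([
    "the cat sat on the mat";
    "the dog chased the cat";
    "stocks fell as markets closed";
    "markets rallied and stocks rose"]);
bag = bagOfWords(docs);
numTopics = 2;   % small value chosen to suit the toy corpus

% Seed the global generator before each fit so that both runs draw the
% same pseudorandom sequence.
rng(0);
mdlA = fitlda(bag,numTopics,'Verbose',0);
rng(0);
mdlB = fitlda(bag,numTopics,'Verbose',0);

% Compare the fitted topic-word distributions of the two seeded runs.
maxDiff = max(abs(mdlA.TopicWordProbabilities(:) - mdlB.TopicWordProbabilities(:)));
fprintf('Largest difference between seeded runs: %g\n', maxDiff);
```

Dropping the second rng(0) call is a quick way to see how much two unseeded runs differ on your own data.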

3 Comments

Stephen Bruestle 10 Dec 2018
Thanks.
For future users trying to do the same thing: Chris's answer helped me figure out that I would need to set β to be:
'WordConcentration',.01*bag.NumWords
The reason to use this alpha and beta is to be consistent with the recommendations in Steyvers and Griffiths (2007).
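A minimal sketch of that conversion, assuming fitlda treats 'WordConcentration' as the total over the vocabulary, so that the per-word β is WordConcentration divided by the vocabulary size. The toy corpus and the variable names betaPerWord and impliedBeta are hypothetical.

```matlab
% Hypothetical toy corpus for illustration only; substitute your own bag.
docs = tokenizedDocument([
    "the cat sat on the mat";
    "the dog chased the cat";
    "stocks fell as markets closed";
    "markets rallied and stocks rose"]);
bag = bagOfWords(docs);

numTopics   = 12;     % K from the question
betaPerWord = 0.01;   % symmetric per-word beta from the question

% Scale the per-word value by the vocabulary size (assumption: fitlda
% divides WordConcentration by the number of words internally).
wordConcentration = betaPerWord * bag.NumWords;

mdl = fitlda(bag,numTopics, ...
    'WordConcentration',wordConcentration, ...
    'Verbose',0);

% Recover the per-word value from the fitted model as a sanity check.
impliedBeta = mdl.WordConcentration / bag.NumWords;
fprintf('Implied per-word beta: %g\n', impliedBeta);
```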
Kai Friedrich 11 Jun 2020
Hey Stephen,
I am trying to do the same thing.
Great answer for the beta parameter.
What about alpha?
Is it sufficient to just insert 50 when I want my alpha parameter in MATLAB to be 50/K?
Thanks!
Stephen Bruestle 11 Jun 2020
I think that you just insert 50.
That said, I was never able to get results similar to those of the GibbsLDA++ program. Some sort of optimization still seems to be going on. In the end, I used GibbsLDA++, as I had more confidence in it.
If you are writing an academic paper, I would recommend GibbsLDA++, as it is better documented and is used in many academic works. If you really want to use MATLAB, the original code by Griffiths and Steyvers is MATLAB code.
It is a shame that fitlda is not properly documented. Without precise empirical definitions of each function, fitlda seems to be worthless for academic purposes.
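One way to check whether fitlda is still adjusting anything under these settings is to inspect the fitted model. The sketch below assumes that the ldaModel properties TopicConcentration and FitInfo behave as documented, and that the per-topic α is TopicConcentration divided by NumTopics. The toy corpus and the variable name alphaPerTopic are hypothetical.

```matlab
% Hypothetical toy corpus for illustration only; substitute your own bag.
docs = tokenizedDocument([
    "the cat sat on the mat";
    "the dog chased the cat";
    "stocks fell as markets closed";
    "markets rallied and stocks rose"]);
bag = bagOfWords(docs);

numTopics = 12;   % K from the question
rng(0);           % fixed seed so the run can be repeated

mdl = fitlda(bag,numTopics, ...
    'Solver','cgs', ...                       % collapsed Gibbs sampling
    'InitialTopicConcentration',50, ...       % total alpha (assumed 50/K per topic)
    'FitTopicConcentration',false, ...        % do not update alpha
    'WordConcentration',0.01*bag.NumWords, ...% total beta (assumed 0.01 per word)
    'LogLikelihoodTolerance',0, ...
    'IterationLimit',2000, ...
    'Verbose',0);

% If alpha was left alone, this should equal 50/12.
alphaPerTopic = mdl.TopicConcentration / mdl.NumTopics;
fprintf('Per-topic alpha: %g\n', alphaPerTopic);

% Report how many iterations actually ran and why the solver stopped.
fprintf('Iterations run: %d\n', mdl.FitInfo.NumIterations);
disp(mdl.FitInfo.TerminationStatus);
```

If the reported iteration count is below 2000, the solver stopped early for some reason, and the termination status should indicate why.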

Release

R2018b
