Improving the consistency of the NNMF function
I'm attempting to use non-negative matrix factorization on a matrix containing spectral information (A). Whenever I run the nnmf function, the output matrices W and H usually differ from those of previous runs. This behavior is noted in the help documentation for the nnmf function:
"Because the root-mean-squared residual D may have local minima, repeated factorizations may yield different W and H."
However, as a result, I find it difficult to use this method to say anything scientifically meaningful, because it lets me introduce considerable bias (I can effectively rerun the function until I get a result that fits my narrative).
My question: how can I get the nnmf function to return W and H matrices with higher reproducibility thereby improving my confidence in the method? I've tried tweaking the input options by decreasing the tolerances, increasing the number of replicates in the initial run, and increasing the number of iterations, all with little effect.
My code is currently very similar to what is written in the help documentation and looks like this:
numcom = 2; % The rank. My datasets typically can be described by very low-rank approximations
opt = statset('MaxIter', 10, 'Display', 'final');
[W0,H0] = nnmf(A, numcom, 'replicates', 10, 'options', opt, 'algorithm', 'mult'); %Get starting values
opt = statset('MaxIter', 1000, 'Display', 'final');
[W,H] = nnmf(A, numcom, 'w0', W0, 'h0', H0, 'options', opt, 'algorithm', 'als'); % Refine from the starting values
Of course, I can set the random number generator to default before running the function every time:
rng('default')
But that kind of defeats the purpose ;)
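A middle ground between one fixed seed and fully unseeded runs is to loop over several seeds and compare the residual D returned by each run. The following is only a sketch: the matrix A is a synthetic stand-in for the real spectral data, and the seed values and replicate count are arbitrary choices for illustration.

```matlab
% Sketch: compare factorizations across several fixed seeds.
% A is hypothetical test data; replace it with the real spectral matrix.
rng(0);
A = rand(50, 2) * rand(2, 30);     % synthetic nonnegative rank-2 matrix
numcom = 2;
seeds = 1:5;                       % arbitrary seeds, chosen for illustration
D = zeros(size(seeds));            % residual of each run
opt = statset('MaxIter', 1000, 'Display', 'off');
for k = 1:numel(seeds)
    rng(seeds(k));                 % makes this particular run repeatable
    [W, H, D(k)] = nnmf(A, numcom, 'replicates', 10, ...
        'options', opt, 'algorithm', 'als');
end
disp(D)                            % a wide spread suggests distinct local minima
```

If the residuals agree closely across seeds, the result is less likely to depend on one lucky starting point. If they differ noticeably, more replicates may be needed.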
Answers (1)
Jakub
19 Aug 2019
In my experience, using only the 'als' algorithm with many replicates usually gives a better estimate. So something like this:
opt = statset('Maxiter',100,'Display','final','useparallel',true);
[coeff,score] = nnmf(A, numcom,'replicates',1e6,'options',opt);
Comments: 1
Guy Reading
12 Nov 2019
Agreed. With enough replicates, the search space should be adequately explored, so the global minimum of the residual is more likely to be found and returned on every run. How many is enough? That depends on how large your input space (m) is...
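One rough way to judge how many replicates are enough is to repeat the factorization a few times at each replicate count and see when the spread of the residual D stops shrinking. This is a sketch under stated assumptions: A is synthetic placeholder data, and the replicate counts and number of trials are arbitrary values, not recommendations.

```matlab
% Sketch: how the run-to-run spread of the residual changes with replicates.
% A is hypothetical test data; replace it with the real spectral matrix.
rng(1);
A = rand(40, 3) * rand(3, 25);     % synthetic nonnegative matrix
numcom = 2;
reps = [1 5 25];                   % replicate counts to compare (arbitrary)
nTrials = 4;                       % repeated runs per replicate count (arbitrary)
D = zeros(nTrials, numel(reps));
opt = statset('MaxIter', 100, 'Display', 'off');
for j = 1:numel(reps)
    for t = 1:nTrials
        [~, ~, D(t, j)] = nnmf(A, numcom, 'replicates', reps(j), ...
            'options', opt, 'algorithm', 'als');
    end
end
spread = max(D) - min(D);          % per-column range of the residuals
disp([reps; spread])               % first row: replicates, second row: spread
```

A spread that is close to zero at some replicate count suggests the runs are converging to the same minimum, although that alone does not prove the minimum is global.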