K Means Clustering Question
Hi,
I have been running k-means clustering in MATLAB after setting a seed with rng. Sometimes it runs without issue, but at other times, with the same rng seed, I get the message "Warning: Failed to converge in 100 iterations."
I am wondering why I get this message sporadically. Given that I set the rng seed, I would expect it to work fine if it did in the past.
Thanks,
Answers (1)
Adithya Addanki
1 Dec 2015
Edited: Adithya Addanki
1 Dec 2015
Hi Munaf,
If you are comparing results between different MATLAB releases, please confirm which release you are using. The release notes at the link below list the changes made to "kmeans" and related functions: http://www.mathworks.com/help/stats/release-notes.html
It may be that the algorithm is not converging within the default number of iterations (100). The "MaxIter" parameter of the "kmeans" function lets you raise that limit.
For instance:
[idx,C,sumd,D] = kmeans(X,20,'MaxIter',10000)
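To see whether a particular run hit the iteration limit, one option is to capture the generator state before the call, check for a warning afterwards, and rerun from the same state with a larger "MaxIter". The following is only a sketch under assumptions: it uses the fisheriris sample data rather than your own, the limit of 1000 is an arbitrary choice, and the lastwarn check catches any warning raised during the call, not only the convergence one.

```matlab
% Sketch: detect a warning from kmeans and retry with a higher iteration limit.
load fisheriris
X = meas(:,3:4);

s = rng;              % save the current generator state
lastwarn('');         % clear any previously recorded warning
[idx,C] = kmeans(X,3,'MaxIter',100);
[msg,id] = lastwarn;  % empty msg means no warning was raised

if ~isempty(msg)
    fprintf('kmeans warned: %s\n', msg);
    rng(s);           % restore the state so the retry uses the same initial centroids
    [idx,C] = kmeans(X,3,'MaxIter',1000,'Display','final');
end
```

Restoring the saved state means the retry uses the same random initialization as the call that warned, so the only thing that changes is the iteration limit.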
As I understand it, the purpose of passing a seed to "rng" is to produce a predictable sequence of random numbers. Consider a simple example (the first example from the link below):
%load sample data
load fisheriris
X = meas(:,3:4);
figure;
plot(X(:,1),X(:,2),'k*','MarkerSize',5);
title 'Fisher''s Iris Data';
xlabel 'Petal Lengths (cm)';
ylabel 'Petal Widths (cm)';
% usage of rng with seed = 1
rng(1);
[idx,C] = kmeans(X,3);
rng(1);
[idx2,C2] = kmeans(X,3);
[idx3,C3] = kmeans(X,3);
rng(1);
[idx4,C4] = kmeans(X,3);
[idx5,C5] = kmeans(X,3);
[idx6,C6] = kmeans(X,3);
[idx7,C7] = kmeans(X,3);
Notice that the centroids C, C2 and C4 are the same, because the seed is set immediately before each of those "kmeans" calls (Case 1). C3, C5, C6 and C7 are computed from a later generator state, because the seed is not reset before those calls, so they can differ from C (Case 2). C3 and C5 are both the second call after rng(1), so they should match each other.
In the second case, the number of iterations required may exceed the default, since it depends on the initial centroids as well as on other factors: the size of the data, the number of clusters and the underlying algorithm used.
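The effect of the call position after seeding can be checked directly. This is a minimal sketch, assuming the default "kmeans" options (no parallel computation or custom random streams); the variable names Ca to Cd are only for illustration.

```matlab
% Sketch: results depend on how many calls were made since the seed was set.
load fisheriris
X = meas(:,3:4);

rng(1);
[~,Ca] = kmeans(X,3);   % first call after seeding
[~,Cb] = kmeans(X,3);   % second call after seeding

rng(1);
[~,Cc] = kmeans(X,3);   % first call after seeding, again
[~,Cd] = kmeans(X,3);   % second call after seeding, again

sameFirst  = isequal(Ca,Cc)   % both started from the seeded state
sameSecond = isequal(Cb,Cd)   % both started from the state left by the first call
```

So if a script calls "kmeans" (or any other function that draws random numbers) a different number of times between rng and the call of interest, that call starts from a different state and may need more iterations.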
I hope this answers your question.
Thanks,
Adithya