Why does KMEANS return different results when invoked on the same input?

Question


When I run the following code multiple times, KMEANS returns different partitions (and hence a different vector s of within-cluster sums of point-to-centroid distances) although the data matrix a is the same:

a = [0 -1 0 2 0]';   % column vector: five one-dimensional observations
[b, c, s] = kmeans(a,2,'distance','cityblock')

Output 1 and Output 2: two different results from the same call (the output listings are not preserved).


Answer 1

MathWorks Support Team, 24 Feb 2011


This is expected behavior: by default, KMEANS selects the initial cluster centroid positions at random from the observations. That is, the 'start' parameter is set to 'sample', as described in the documentation. If you run your code several times, you will also see KMEANS occasionally error out because an empty cluster is created at the first iteration (i.e., b would be all 1's or all 2's). To get a deterministic result, you can pass a matrix of initial centroid positions as the value of the 'start' parameter, for example:

[b c s] = kmeans(a,2,'distance','cityblock','start',[0 1]')

This yields the same result every time. However, because the partition returned by KMEANS depends strongly on the initial centroid positions, you would probably get a sub-optimal partition (unless you provide a "lucky" vector for the 'start' parameter). The typical use of KMEANS is to set the 'Replicates' parameter to an integer n, the number of times to repeat the clustering from new initial centroids. KMEANS then returns the solution with the lowest total of the within-cluster sums in s.
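As a rough sketch of the 'Replicates' approach described above, the call might look like the following. The seed value, the replicate count of 5, and the use of 'emptyaction','singleton' to avoid the empty-cluster error are illustrative assumptions, not part of the original answer; rng is assumed to be available (R2011a or later).

```matlab
% Column vector: each row is one observation (five 1-D points).
a = [0 -1 0 2 0]';

% Assumption: fix the random seed so the random starts are repeatable.
rng(1);

% Repeat the clustering 5 times from random starts and keep the
% replicate with the lowest total within-cluster sum of distances.
% 'emptyaction','singleton' replaces an empty cluster with a new
% one-point cluster instead of raising an error.
[b, c, s] = kmeans(a, 2, 'distance', 'cityblock', ...
    'replicates', 5, 'emptyaction', 'singleton');

% Total within-cluster sum of distances for the returned solution.
total = sum(s);
```

Resetting the seed to the same value before each call should give the same partition on every run, while the replicates reduce the chance of ending up in a poor local minimum.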
