Check if data are well separated by clustering
I have data labelled 0 or 1, and I want to know how well clustering can separate the two classes. However, each time I run k-means, the grouping is consistent but the cluster number assigned to a given group is arbitrary (either 0 or 1 from one run to the next). What I currently do is flip the cluster labels if their agreement with the true labels is below 50%. Does that make sense? I am also wondering how to do this comparison when I have three categories.
Answers (1)
the cyclist
20 July 2018
This happens because k-means starts from a random initial guess of the centroid positions, so the numbering of the clusters can differ from run to run even when the grouping itself is the same.
I think a better way to get consistent labeling is to use the final centroid positions (the second output of the kmeans function) and assign labels based on those. For example, you could number the clusters systematically from the smallest to the largest centroid value along the first dimension.
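A minimal sketch of that idea, assuming the Statistics and Machine Learning Toolbox is available. The data `X` and the variable names `order`, `newLabel` and `newIdx` are made up for illustration and are not from the question:

```matlab
% Hypothetical example data: two well-separated 2-D blobs (not the asker's data).
rng(1);                                  % fix the seed so this run is repeatable
X = [randn(50, 2); randn(50, 2) + 10];

k = 2;
[idx, C] = kmeans(X, k);                 % idx: arbitrary cluster numbers, C: k-by-2 centroids

% Sort the centroids by their first coordinate (ties broken by later columns).
[~, order] = sortrows(C);                % order(r) = original cluster that has rank r

% Build a lookup table: newLabel(c) is the new number of original cluster c.
newLabel = zeros(k, 1);
newLabel(order) = 1:k;
newIdx = newLabel(idx);                  % relabelled assignments, n-by-1
```

With this relabelling, cluster 1 is always the one whose centroid lies furthest to the left along the first dimension, regardless of how kmeans happened to number it. If the true labels are 0 and 1, `newIdx - 1` puts the cluster numbers on the same scale.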
I don't think this will be perfect (especially if some clusters are not well separated), but it might work out.
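For the three-category case in the question, one possible alternative (not the centroid-ordering approach described above) is to try every renaming of the clusters and keep the one that agrees best with the true labels. The sketch below assumes both label vectors use the values 1..k (add 1 first if the labels start at 0); the toy vectors and the names `bestMap`, `bestAcc` and `alignedIdx` are hypothetical:

```matlab
% Hypothetical toy inputs: same grouping idea, different numbering, one mismatch.
trueLabels = [1 1 1 2 2 2 3 3 3]';
idx        = [2 2 2 3 3 1 1 1 1]';       % e.g. the first output of kmeans
k = 3;

P = perms(1:k);                          % each row is one possible renaming of the clusters
bestAcc = 0;
bestMap = 1:k;
for i = 1:size(P, 1)
    mapped = P(i, idx)';                 % apply renaming: cluster c becomes P(i, c)
    acc = mean(mapped == trueLabels);    % fraction of points that match the true labels
    if acc > bestAcc
        bestAcc = acc;
        bestMap = P(i, :);
    end
end
alignedIdx = bestMap(idx)';              % cluster numbers renamed to best match trueLabels
```

For k = 2 there are only two renamings, so this reduces to flipping the labels when the agreement is below 50%. The number of permutations grows factorially with k, so this brute-force search is only practical for a small number of categories.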