Question about the "evalclusters" function for the gap statistic

ke wang on 21 Apr 2022
Answered: Himanshu on 4 Oct 2024
%% Example: clustering evaluation based on the gap statistic
YYc = rand(100,1);  % rows are observations: 100 points, 1 variable
% 'SearchMethod': method for selecting the optimal number of clusters, 'globalMaxSE' (default) | 'firstMaxSE'
evaluation = evalclusters(YYc,"kmeans","gap","KList",2:10,"SearchMethod","globalMaxSE","B",40);
I find that almost all references use "firstMaxSE" as the search method, and very little literature is based on "globalMaxSE". I would like to know why "globalMaxSE" was designed and where to find a reference for it.

Answers (1)

Himanshu on 4 Oct 2024
Hello,
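I see that you are trying to understand why the "globalMaxSE" search method is included in the "evalclusters" function for the gap statistic and how to find relevant references.
"globalMaxSE" selects the smallest number of clusters K whose gap value lies within one standard error of the global maximum gap value, that is, Gap(K) >= GAPMAX - SE(GAPMAX), where SE(GAPMAX) is the standard error at the largest gap value. By contrast, "firstMaxSE" selects the smallest K satisfying Gap(K) >= Gap(K+1) - SE(K+1), so it can stop at an early local maximum. Looking at the whole curve can provide a more reliable choice when the gap statistic has multiple local maxima.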
This method is designed to ensure robustness in cluster selection, especially when the gap statistic curve is noisy or has several peaks.
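To make the difference between the two search methods concrete, here is a minimal sketch that applies both selection rules by hand. The gap values and standard errors are made-up numbers chosen for illustration, not output from "evalclusters", and the variable names are my own. The case where no K satisfies the "firstMaxSE" condition is not handled here.

```matlab
% Hypothetical gap values and standard errors for K = 2:6 (illustrative only).
K   = 2:6;
gap = [0.30 0.28 0.45 0.50 0.48];
se  = [0.02 0.02 0.03 0.06 0.03];

% 'globalMaxSE' rule: smallest K with Gap(K) >= GAPMAX - SE(GAPMAX)
[gapMax, iMax] = max(gap);
iGlobal = find(gap >= gapMax - se(iMax), 1, 'first');
kGlobal = K(iGlobal);

% 'firstMaxSE' rule: smallest K with Gap(K) >= Gap(K+1) - SE(K+1)
iFirst = find(gap(1:end-1) >= gap(2:end) - se(2:end), 1, 'first');
kFirst = K(iFirst);

fprintf('globalMaxSE picks K = %d, firstMaxSE picks K = %d\n', kGlobal, kFirst);
```

With these numbers the curve has a small early bump at K = 2 and its global maximum at K = 5, so the two rules disagree: "firstMaxSE" stops at K = 2, while "globalMaxSE" picks K = 4. To try this on a real result, the same vectors should be available from the evaluation object as evaluation.InspectedK, evaluation.CriterionValues, and evaluation.SE.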
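For references, the "firstMaxSE" rule corresponds to the criterion proposed in the original gap statistic paper by Tibshirani, Walther, and Hastie (2001), which is why most of the literature uses it. I would also recommend exploring other academic papers on clustering and the gap statistic, as they may discuss variations in methods for selecting the optimal number of clusters.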
I hope this helps.

Release

R2021a
