
How does the 'ward' linkage work during cluster analysis?

Franziska Ba, 2 Dec 2019
I have the following problem: I would like to examine my data with a cluster analysis.
As the distance measure (dissimilarity measure) I use 'correlation'. As the linkage I use 'ward', because Ward's method is known for recovering the “real” clusters well (I know that 'ward' should actually be used with 'euclidean'). Furthermore, 'ward' should not merge the objects that have the smallest distance from each other, but the objects whose merger increases a given variance criterion the least.
cgObj = clustergram(data(2:264,:),'Standardize',2 ,'Colormap','jet','RowPDist','correlation', ...
'ColumnPDist','correlation' ,'Linkage','ward','DisplayRange',1, 'Symmetric',1, 'Cluster',1);
I have now checked the theoretical procedure behind the code with a simple example.
First, the dissimilarity is quantified for each pair of objects and stored in an internal (not visible) distance matrix. Then the two objects whose merger increases the variance the least should be grouped first. After that, the dissimilarities between the newly created group and the remaining objects are recalculated. This is followed by a new grouping step, and so on.
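For reference, the steps above can be written down as a formula. This is a sketch that assumes Euclidean distances $d_{ij}$ between objects; the symbols $\Delta$, $n_A$ and $\bar{x}_A$ are introduced here for illustration and do not come from the MATLAB documentation. $\Delta(A,B)$ is the increase in the within-cluster sum of squares caused by merging clusters $A$ and $B$, and the last line is the Lance-Williams update for Ward's criterion, which needs only previously computed values and the cluster sizes.

```latex
\begin{align}
\Delta(A,B) &= \frac{n_A\, n_B}{n_A + n_B}\,\bigl\lVert \bar{x}_A - \bar{x}_B \bigr\rVert^2 ,
\qquad
\Delta(\{i\},\{j\}) = \tfrac{1}{2}\, d_{ij}^2 , \\[4pt]
\Delta(A \cup B,\, C) &=
\frac{(n_A + n_C)\,\Delta(A,C) + (n_B + n_C)\,\Delta(B,C) - n_C\,\Delta(A,B)}
     {n_A + n_B + n_C} .
\end{align}
```

At each step the pair with the smallest $\Delta$ is merged. With non-Euclidean distances such as 'correlation', the recurrence can still be evaluated, but $\Delta$ no longer has to correspond to a variance increase in the original data.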
As far as I understand it, the variance that should increase as little as possible is not calculated from the distance matrix but from the original data matrix.
Why is the distance matrix calculated if it is not used by 'linkage' anyway? Or have I misunderstood something? I would like to understand the exact grouping procedure during cluster analysis.
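One way to test this by hand is the small sketch below. It assumes Euclidean distances, made-up toy data (not the data from this question), and the Statistics and Machine Learning Toolbox functions pdist, squareform and linkage. It carries out the first Ward merge and the following Lance-Williams update using the distance matrix only, then compares the result with the same quantity computed from the raw data via the cluster centroid.

```matlab
% Toy data (hypothetical; not the data from the question)
rng(0);
X = randn(6, 3);                 % 6 objects, 3 variables
n = size(X, 1);

y = pdist(X);                    % Euclidean distances as a condensed vector
D = squareform(y);               % the same distances as an n-by-n matrix
Z = linkage(y, 'ward');          % linkage receives only the distances, not X

% Step 1: between single objects the variance increase is d^2/2,
% so the closest pair is merged first.
Dm = D;
Dm(1:n+1:end) = Inf;             % ignore the zero diagonal
[dmin, idx] = min(Dm(:));
[i, j] = ind2sub([n n], idx);

% Step 2: Lance-Williams update for Ward, from distances only.
% All cluster sizes are 1 here, so the weights are 2/3, 2/3 and -1/3.
k = setdiff(1:n, [i j]);         % the remaining single objects
d2_lw = (2*D(i,k).^2 + 2*D(j,k).^2 - D(i,j)^2) / 3;

% The same quantity from the raw data, using the centroid of the new cluster:
% d^2 = 2*nr*ns/(nr+ns) * ||centroid - x_k||^2 with nr = 2, ns = 1.
c = mean(X([i j], :), 1);
d2_raw = (2*2*1/(2+1)) * sum((X(k,:) - c).^2, 2)';   % implicit expansion, R2016b or later

disp(max(abs(d2_lw - d2_raw)));  % difference between the two computations
disp(abs(Z(1,3) - dmin));        % first merge height vs. smallest pairwise distance
```

Both displayed differences should be at rounding-error level for Euclidean input. With 'correlation' distances the same update can still be evaluated, but it is then not guaranteed to match a variance computed from the original data, and MATLAB may warn about non-Euclidean input.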
I would be grateful for any suggestions!

Answers (0)
