How to change color of the data on biplot by the result of clustering

Question

徹也長島 on 11 Nov 2022

https://kr.mathworks.com/matlabcentral/answers/1848558-how-to-change-color-of-the-data-on-biplot-by-the-result-of-clustering

Commented: Adam Danz on 11 Nov 2022

Test.xlsx

Hello, everyone

I'm a beginner with MATLAB.

I want to show the result of clustering with PCA and a biplot.

However, I don't know how to color the data points on the biplot according to the clustering result. In the picture, all the data points are red. I want to split them into 3 colors (because there are 3 clusters).

Could you share your ideas?

D=readmatrix("Test.xlsx");

[coeff,score,latent]=pca(D);

[idx,H,sumd]=kmeans(D,3,MaxIter=1000,Display="final",Replicates=5);

Replicate 1, 10 iterations, total sum of distances = 10012.1.
Replicate 2, 12 iterations, total sum of distances = 10011.4.
Replicate 3, 11 iterations, total sum of distances = 10012.1.
Replicate 4, 10 iterations, total sum of distances = 10012.1.
Replicate 5, 13 iterations, total sum of distances = 10017.1.
Best total sum of distances = 10011.4

vbls = {'Depth','Sample','Ping','sea bottom mean','Length','Height','Perimeter','Area','BAmean','TAmean','Elongation','UNEVENNESS1','UNEVENNESS"','Rectangularity','Fractal dimension','Circularity'};

figure

biplot(coeff(:,1:3),'scores',score(:,1:3),"VarLabels",vbls)


Answer 1

Atsushi Ueno on 11 Nov 2022

https://kr.mathworks.com/matlabcentral/answers/1848558-how-to-change-color-of-the-data-on-biplot-by-the-result-of-clustering#answer_1097293

Edited: Atsushi Ueno on 11 Nov 2022

Reference: "Modify Biplot Properties" in the MATLAB biplot documentation (mathworks.com)

For the attached data, the output of the biplot function is as follows.

The graphics handle array "h" in this example contains 1013 object handles.

Handles h(1:16) correspond to line handles for the 16 variables.
Handles h(17:32) correspond to marker handles for the 16 variables.
Handles h(33:48) correspond to text handles for the 16 variables.
Handles h(49:1012) correspond to line handles for the observations.
The last handle h(1013) corresponds to a line handle for the axis lines.
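The handle layout listed above can be checked on a small data set before relying on index arithmetic. The following is only a sketch: the random matrix X and the labels 'a' to 'd' are made up for illustration, and the 'obsmarker' tag is the one used elsewhere in this thread to pick out the observation markers.

```matlab
rng(0);                                   % repeatable made-up data
X = randn(40, 4);                         % 40 observations, 4 variables (hypothetical)
[coeff, score] = pca(X);

fig = figure('Visible', 'off');           % avoid popping up a window for this check
h = biplot(coeff(:,1:3), 'Scores', score(:,1:3), ...
           'VarLabels', {'a','b','c','d'});

tags  = {h.Tag};                          % one tag per graphics handle
isObs = strcmp(tags, 'obsmarker');        % true for observation markers
[n, p] = size(X);

nHandles = numel(h);                      % total number of handles returned
nObs     = sum(isObs);                    % expected to match the number of rows of X
firstObs = find(isObs, 1);                % expected to follow the 3*p variable handles
fprintf('%d handles, %d observation markers, first at h(%d)\n', ...
        nHandles, nObs, firstObs);
close(fig)
```

If nObs matches the number of rows of the data and firstObs equals 3*p + 1, the offset used to reach the observation handles is consistent with the layout described here.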

The cluster indices (idx), one of the outputs of the kmeans function, are used as the color index.

One drawback is that the label assigned to each cluster (1 to 3 in this case) can change every time the code is run.

D=readmatrix("https://jp.mathworks.com/matlabcentral/answers/uploaded_files/1188973/Test.xlsx");

[coeff,score,latent]=pca(D);

[idx,H,sumd]=kmeans(D,3,MaxIter=1000,Display="final",Replicates=5);

Replicate 1, 14 iterations, total sum of distances = 10080.8.
Replicate 2, 14 iterations, total sum of distances = 10804.3.
Replicate 3, 9 iterations, total sum of distances = 10014.8.
Replicate 4, 12 iterations, total sum of distances = 10796.3.
Replicate 5, 8 iterations, total sum of distances = 11103.3.
Best total sum of distances = 10014.8

vbls = {'Depth','Sample','Ping','sea bottom mean','Length','Height','Perimeter','Area','BAmean','TAmean','Elongation','UNEVENNESS1','UNEVENNESS"','Rectangularity','Fractal dimension','Circularity'};

figure

h = biplot(coeff(:,1:3),'scores',score(:,1:3),"VarLabels",vbls); % output h has been added

% added from here

xlim([-0.1 0.5]); ylim([-0.1 0.5]); zlim([-0.5 0.3]); % to make it look good

color = 'rgb'; % just for this example

for k = 1:size(D,1)

h(k + size(D,2)*3).MarkerEdgeColor = color(idx(k)); % change the color of each data point according to its cluster

end
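Because the cluster numbers returned by kmeans are arbitrary, the same group can come out red in one run and green in the next. One possible way to soften this, sketched here under assumptions, is to renumber the clusters by the position of their centroids. The data below is made up, and ordering by the first variable is an arbitrary choice for illustration, not something taken from the original data set.

```matlab
% Made-up, well-separated 2-D data standing in for the real matrix D.
rng(1);                                   % seed only so this sketch is repeatable
D = [randn(50,2); randn(50,2) + [6 0]; randn(50,2) + [12 0]];

[idx, C] = kmeans(D, 3, Replicates=5);    % idx: raw labels, C: 3-by-2 centroids

% Rank the centroids along the first variable and renumber the clusters so
% that label 1 always means "smallest centroid on variable 1".
[~, order] = sort(C(:,1));                % order(j) = raw label with the j-th smallest centroid
newLabel = zeros(3,1);
newLabel(order) = 1:3;                    % lookup table: raw label -> stable label
idxStable = newLabel(idx);                % column vector, same size as idx
```

Using color(idxStable(k)) in place of color(idx(k)) in the loop would then keep the color assignment tied to the centroid order rather than to the raw label.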

Adam Danz on 11 Nov 2022

I would encourage you to investigate this approach further by using a simpler data set with fewer points so you can see what's going on and confirm that this is what you want to do.

The demo below plots the results twice using the same data and same exact code but the results differ. This is because kmeans uses a random starting point so the grouping indices will likely differ each time you run it.
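As a side note on the random start, resetting the random number generator with rng before each call is one way to get the same grouping indices back. This is a minimal sketch with made-up data; the seed 42 is arbitrary.

```matlab
rng(0);
X = [randn(30,2); randn(30,2) + 4];       % made-up data with two loose groups

rng(42);  idxA = kmeans(X, 3);            % first run from a known generator state
rng(42);  idxB = kmeans(X, 3);            % second run from the same state

sameLabels = isequal(idxA, idxB);         % true when both runs give identical labels
```

Fixing the seed only makes a run repeatable; it does not make the label numbers meaningful, so colors can still swap if the data or the seed changes.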

Load and compute data

load carsmall
X = [Acceleration Displacement Horsepower MPG Weight];
X = rmmissing(X);
Z = zscore(X); % Standardized data

Plot the results

[coefs,score] = pca(Z);

nClusters = width(Z);

[idx,H,sumd]=kmeans(Z,nClusters,MaxIter=1000,Replicates=5);

figure()

h = biplot(coefs(:,1:2),'Scores',score(:,1:2));

% Change color of varlines and observations according to kmeans results

colors = lines(width(Z));

tags = {h.Tag};

observationHandles = h(strcmp(tags, 'obsmarker'));

for i = 1:nClusters

h(i).Color = colors(i,:);

h(i).LineWidth = 2;

set(observationHandles(idx==i), 'Color', colors(i,:))

end

set(observationHandles, 'MarkerSize', 12)

Copy-pasted from the block above to plot this again

[coefs,score] = pca(Z);

nClusters = width(Z);

[idx,H,sumd]=kmeans(Z,nClusters,MaxIter=1000,Replicates=5);

figure()

h = biplot(coefs(:,1:2),'Scores',score(:,1:2));

% Change color of varlines and observations according to kmeans results

colors = lines(width(Z));

tags = {h.Tag};

observationHandles = h(strcmp(tags, 'obsmarker'));

for i = 1:nClusters

h(i).Color = colors(i,:);

h(i).LineWidth = 2;

set(observationHandles(idx==i), 'Color', colors(i,:))

end

set(observationHandles, 'MarkerSize', 12)
