Index number of k-means clusters

Gabriel on 25 Feb 2016
Commented: wathiq dukhan on 26 Jul 2019
Hi guys, I have a program that clusters some data and then calculates the optimal number of clusters.
I want to know how I can get the cluster index of each data point so that I can plot the points separated by cluster.
My code is the following:
% Import data from Excel
imp = xlsread('Academia.xlsx');
%% Loop k-means over 1..k clusters
k=20
CH=zeros(1,k);
SH=zeros(1,k);
DB=zeros(1,k);
SUB=zeros(1,k);
%
% for i=1:k
% CH{i}=0;
% SH{i}=0;
% DB{i}=0;
% end
% eva = evalclusters(x,clust,criterion)
for i=1:k
    [idx,C]=kmeans(imp,i,'MaxIter',10000); % C = centroids
    % How do I get the value of idx?
    eva = evalclusters(imp, idx, 'CalinskiHarabasz')
    CH(1,i)=eva.CriterionValues
    eva2 = evalclusters(imp, idx, 'Silhouette')
    SH(1,i)=eva2.CriterionValues
    eva3 = evalclusters(imp, idx, 'DaviesBouldin')
    DB(1,i)=eva3.CriterionValues
end
CH(1,1)=0
% Find the differences between consecutive CH values
while i>1
    SUB(1,i)=CH(1,i)-CH(1,i-1);
    i=i-1;
end
SUB(1,1)=0
% Find the largest jump in CH between consecutive cluster counts:
% cell2mat(SUB); % Converts to a matrix
%SUB2=cell2mat(SUB)
[V,N]=max(SUB) % For some reason this skips the empty cell and returns an incorrect value
SH
%SHF=cell2mat(SH)
[V2,N2]=max(SH) % Value closest to 1
DB
%DBF=cell2mat(DB)
[V3,N3]=min(DB) % Value closest to 0
if (N==N2) && (N2==N3)
    disp('CH=SH=DB')
    N
    N2
    N3
    i=N
    % y=xlswrite('Academia_target.xlsx',M(1,i),'D1:D80')
elseif N2==N3
    disp('SH=DB')
elseif N==N3
    disp('CH=DB')
elseif N==N2
    disp('CH=SH')
else
    disp('All metrics gave different values')
end
In this case I want to retrieve the idx from the clustering with 4 clusters, which is the optimal one in my case (the Excel file from which I take the data is attached to this question).
Thanks in advance!
  2 Comments
jgg on 25 Feb 2016
I'm unclear what the problem is here; idx is a vector corresponding to the cluster ID for each observation for your fitted cluster. Isn't that what you want?
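For the plotting part of the question, one possible sketch is shown here. It assumes two-dimensional data (or that the first two columns are the ones worth plotting); the synthetic matrix X is a made-up stand-in for the Excel data, and gscatter comes from the Statistics and Machine Learning Toolbox, which kmeans already requires.

```matlab
% Sketch only: synthetic 2-D data stands in for the Excel file.
rng(0);                                  % make the example reproducible
X = [randn(50,2); randn(50,2) + 5];      % assumed stand-in data: two blobs
[idx, C] = kmeans(X, 2);                 % idx(j) is the cluster ID of row X(j,:)

% Draw each cluster in its own color, then overlay the centroids.
figure;
gscatter(X(:,1), X(:,2), idx);
hold on;
plot(C(:,1), C(:,2), 'kx', 'MarkerSize', 12, 'LineWidth', 2);
hold off;
```

Since idx has one entry per row of the data, logical indexing such as X(idx==1,:) also works if you prefer to plot each cluster yourself.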
Gabriel on 27 Feb 2016
Hi jgg,
The thing is that each time k-means runs (up to 20 times), the value of idx changes, and I only know which value of "i" I need after the for loop has completed. So I need to store all of the idx values and have a way to retrieve the one that corresponds to i=4 in this case.
I already tried using a variable R{i}=idx after the k-means call, but for some reason I receive an error message saying "Cell contents assignment to a non-cell array object."
Error in import_excelv4 (line 27) R{i}=idx
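That error message usually means a variable named R already exists and is not a cell array (for example, a numeric array left over from an earlier run), so brace indexing into it fails. A minimal sketch of one way around it, assuming that is the cause here: create R as a cell array before the loop. The data X and the loop limit below are made up for illustration, and the loop starts at 2 clusters because the criteria in the question are not meaningful for a single cluster.

```matlab
% Sketch only: synthetic data stands in for the Excel file.
rng(0);
X = [randn(40,2); randn(40,2) + 4; randn(40,2) - 4];  % assumed stand-in data
k = 6;

R  = cell(1, k);    % overwrites any earlier non-cell R; R{i} holds idx for i clusters
CH = nan(1, k);     % NaN marks cluster counts that were not evaluated

for i = 2:k
    idx  = kmeans(X, i, 'MaxIter', 10000);
    R{i} = idx;                                   % keep every clustering
    eva  = evalclusters(X, idx, 'CalinskiHarabasz');
    CH(i) = eva.CriterionValues;
end

% After the loop, pull out whichever clustering you decided on.
idx4 = R{4};        % cluster ID of each observation for the 4-cluster run
```

A numeric matrix with one column per run (size(X,1)-by-k) would work just as well, since every idx has the same length; the cell array simply matches what was attempted with R{i}.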


Answers (1)

wathiq dukhan on 26 Jul 2019
I would like a program to calculate the number of clusters by genetic algorithm and k-means.
  2 Comments
Walter Roberson on 26 Jul 2019
k-means itself must be passed the number of clusters to use; it is not able to calculate the number of clusters. However, there are algorithms that can be used that run k-means a number of times and take estimates of what the most likely number of clusters is under certain conditions.
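As one illustration of that idea (not a genetic algorithm), evalclusters can run k-means itself over a list of candidate cluster counts and report the count that scores best under a chosen criterion. The sketch below assumes the Calinski-Harabasz criterion and a candidate range of 1 to 10; the data X is made up.

```matlab
% Sketch only: synthetic data with three blobs stands in for real data.
rng(0);
X = [randn(40,2); randn(40,2) + 4; randn(40,2) - 4];  % assumed stand-in data

% Let evalclusters run k-means for each candidate number of clusters.
eva = evalclusters(X, 'kmeans', 'CalinskiHarabasz', 'KList', 1:10);

bestK   = eva.OptimalK;   % cluster count with the best criterion value
bestIdx = eva.OptimalY;   % cluster ID of each observation for that count
```

The estimate depends on the criterion: 'Silhouette', 'DaviesBouldin', and 'gap' can be passed instead and may disagree with each other, as the original question's code already found.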
wathiq dukhan on 26 Jul 2019
I need to use GA to calculate the number of clusters.
