clustering, matlab, nominal data

Radoslav Vandzura on 14 Jan 2016
Commented: Tom Lane on 30 Jan 2016
Hello all. I need some advice: can you recommend a clustering method in MATLAB that is suitable for nominal data? I would appreciate any ideas. Thank you in advance.

Accepted Answer

Walter Roberson on 15 Jan 2016

More Answers (2)

Image Analyst on 15 Jan 2016
Try the Classification Learner app on the Apps tab.
  1 Comment
Tom Lane on 16 Jan 2016
This could work as a post-processing step to assign new data to classes found from the original data. But classificationLearner would require that you know the clusters (groups) for the original data.
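A rough command-line sketch of that post-processing idea follows. It is not from the original comment: it uses fitcknn as a stand-in for the Classification Learner app, the data are made up, and the choice of a 5-nearest-neighbour classifier is only an assumption for illustration. The cluster labels are found first, and a classifier is then trained on them so that new rows can be assigned to a cluster.

```matlab
rng(1);                                      % repeatable random data (assumption)
x = [randi(2,50,3); 2 + randi(2,50,3)];      % made-up integer-coded nominal data
z = linkage(x,'average','hamming');          % hierarchical clustering, Hamming distance
idx = cluster(z,'maxclust',2);               % cluster labels for the original rows

% Train a classifier on the cluster labels found above
mdl = fitcknn(x,idx,'Distance','hamming','NumNeighbors',5);

xnew = [1 2 1; 4 3 4];                       % two new, made-up observations
newIdx = predict(mdl,xnew);                  % cluster assigned to each new row
disp(newIdx)
```

The classifier only reproduces whatever grouping the clustering step found, so its usefulness depends on that grouping being meaningful.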



Tom Lane on 16 Jan 2016
For hierarchical clustering, consider using Hamming distance. Here's an example that isn't realistic but that illustrates what to do:
x = randi(3,100,4);                   % noisy coordinates
x(1:50,5:6) = randi(2,50,2);          % try to make 1st 50 points closer
x(51:100,5:6) = 2 + randi(2,50,2);    % next 50 points different
z = linkage(x,'average','hamming');   % try average linkage clustering
dendrogram(z,100)                     % show dendrogram with all points
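As a possible follow-up (not part of the original answer), the tree can be cut into a fixed number of groups with cluster. The sketch below rebuilds the same kind of data with a fixed seed, which is my addition, and tabulates the cluster labels against the grouping that was built into x.

```matlab
rng(0);                                  % fixed seed so the run is repeatable (assumption)
x = randi(3,100,4);                      % four noisy nominal columns
x(1:50,5:6)   = randi(2,50,2);           % first 50 rows use codes 1-2 in columns 5-6
x(51:100,5:6) = 2 + randi(2,50,2);       % last 50 rows use codes 3-4 in columns 5-6
z = linkage(x,'average','hamming');      % average linkage on Hamming distances

idx = cluster(z,'maxclust',2);           % cut the tree into at most two clusters
truth = [ones(50,1); 2*ones(50,1)];      % the grouping built into x
tbl = crosstab(truth,idx);               % rows: built-in group, columns: cluster label
disp(tbl)
```

Because four of the six columns are pure noise, the recovered clusters need not match the built-in groups exactly; the table just shows how well they line up on a given run.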
  2 Comments
Radoslav Vandzura on 20 Jan 2016
But I have categorical data and I need to cluster it. My data consist of operating system names (UNIX, WINDOWS, ...), types of virtualisation (virtual system, virtualization host, ...), and so on. Can I convert these data to numbers, for example UNIX = 1, WINDOWS = 2, ...?
I am not sure what you mean. The Classification Learner app is for classification, not for clustering, isn't it?
Tom Lane on 30 Jan 2016
You are right that the clustering functions operate on matrices, so you would need to convert your data to numbers. The grp2idx function could be helpful. And yes, the Classification Learner app is aimed at classifying data into known groups. Here is a simple example where you can see the Hamming distance between rows of data described by two integer-coded categorical variables. Each distance is the fraction of variables on which the two rows differ.
>> x = [1 1;2 1;3 1;1 2;2 2;2 3];
>> squareform(pdist(x,'hamming'))
ans =
0 0.5000 0.5000 0.5000 1.0000 1.0000
0.5000 0 0.5000 1.0000 0.5000 0.5000
0.5000 0.5000 0 1.0000 1.0000 1.0000
0.5000 1.0000 1.0000 0 0.5000 1.0000
1.0000 0.5000 1.0000 0.5000 0 0.5000
1.0000 0.5000 1.0000 1.0000 0.5000 0
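To connect this to the data described in the question, here is a minimal sketch of the grp2idx conversion followed by clustering. The records are hypothetical: the labels only mirror the kinds of values mentioned above, and 'LINUX' and the choice of two clusters are my own assumptions.

```matlab
% Hypothetical sample records, one row per machine
os   = {'UNIX';'WINDOWS';'UNIX';'LINUX';'WINDOWS';'UNIX'};
virt = {'virtual system';'virtualization host';'virtual system'; ...
        'virtual system';'virtualization host';'virtualization host'};

x = [grp2idx(os), grp2idx(virt)];   % one integer-coded column per nominal variable
d = pdist(x,'hamming');             % fraction of variables on which two records differ
z = linkage(d,'average');           % average linkage on the precomputed distances
idx = cluster(z,'maxclust',2);      % cut the tree into at most two clusters
disp([x idx])
```

The integer codes are arbitrary labels with no meaningful order, which is why a match/mismatch measure such as Hamming distance suits them better than a distance that treats the codes as magnitudes.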
