Assigning clusters to new data

Aravin on 1 Jun 2018
Commented: Abdullah on 13 Feb 2022
Hello experts,
Let's assume I have done k-means clustering using
[a, C] = kmeans(X,1000);
Now I want to assign cluster IDs to new data, say newX. Which built-in method should I use?
  Comments: 3
Aravin on 1 Jun 2018
I have centroids (C) and now I want to map newX to centroids/clusters.
Abdullah on 13 Feb 2022
Xtest is your new data,
and C is the matrix of cluster centroid locations.
Use the Euclidean distance to find the nearest centroid for each row of Xtest:
[~,idx_test] = pdist2(C,Xtest,'euclidean','Smallest',1);
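
To make the comment above concrete, here is a minimal end-to-end sketch. The toy data, the choice of 2 clusters, and the variable names newX and newIdx are assumptions made for illustration only; the pdist2 call is the one from the comment. It needs the Statistics and Machine Learning Toolbox.

```matlab
% Minimal sketch: learn centroids, then assign new points to the nearest one.
rng(0);                                   % reproducible toy data (assumption)
X = [randn(100,2); randn(100,2) + 8];     % two well-separated training blobs
[idx, C] = kmeans(X, 2);                  % idx: training labels, C: 2-by-2 centroids

newX = [0 0; 8 8; 0.5 -0.5];              % hypothetical new observations, one per row

% For each row of newX, find the index of the closest row of C.
[~, newIdx] = pdist2(C, newX, 'euclidean', 'Smallest', 1);
newIdx = newIdx(:);                       % pdist2 returns a row here; use a column

% Optional sanity check: re-assigning the training data should agree with idx.
[~, chk] = pdist2(C, X, 'euclidean', 'Smallest', 1);
agreement = mean(chk(:) == idx);          % fraction of training points that match
```

The value in newIdx(k) is the row of C, and therefore the cluster ID, for newX(k,:). The ID numbers themselves are arbitrary and can change between runs of kmeans.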


Answers (2)

Aditya Adhikary on 1 Jun 2018
Edited: Aditya Adhikary on 1 Jun 2018
K-means clustering is an unsupervised method. From what I gather, you would like to learn the cluster centroids with kmeans and then use them to assign cluster IDs to new test data. You could do either of the following:
1. Assign each new data point to its closest centroid, using a distance measure such as squared Euclidean distance (which picks the same nearest centroid as Euclidean distance) or cosine similarity. You can compute these directly from the formulae, or use built-in functions such as norm (see also: How to calculate Euclidean distance) or pdist2, which computes distances between two sets of points.
Another (perhaps better) way, instead of reusing the centroids:
2. Label every sample in your training set with the ID of the cluster it was assigned to (e.g. all points in one cluster become class 1), then train a classifier (any algorithm, such as an SVM) on this labeled training data. Afterwards, classify your test samples with this model.
Hope this helps!
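
The second option in the answer above could look roughly like this. It is only a sketch under assumptions: the toy data, the 3 clusters, and the choice of fitcecoc (which uses SVM binary learners by default) are illustrative, and any other classifier could be swapped in. It needs the Statistics and Machine Learning Toolbox.

```matlab
% Minimal sketch: treat k-means cluster IDs as class labels and train a classifier.
rng(0);                                   % reproducible toy data (assumption)
X = [randn(50,2); ...
     randn(50,2) + 8; ...
     randn(50,2) + [8 -8]];               % three well-separated blobs
idx = kmeans(X, 3, 'Replicates', 5);      % cluster IDs for the training data

% Multiclass model built from SVM learners, trained on (X, idx).
mdl = fitcecoc(X, idx);

newX   = [0 0; 8 8; 8 -8];                % hypothetical new observations
newIdx = predict(mdl, newX);              % predicted cluster ID for each row of newX
```

Compared with nearest-centroid assignment, the classifier learns its own decision boundaries, so the two approaches can disagree for points that lie between clusters. With a large number of clusters, such as the 1000 in the question, training many SVM learners may be slow, and a simpler model may be more practical.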

KSSV on 1 Jun 2018
Read about knnsearch. It returns the nearest neighbors of a given point. If those neighbors belong to a certain cluster ID, the given point can be assigned the same ID.
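
As a rough illustration of the knnsearch suggestion, the sketch below shows two possible readings of it: searching among the centroids, and searching among the training points and then borrowing the neighbor's label. The toy data and the names byCentroid and byNeighbor are assumptions for illustration. It needs the Statistics and Machine Learning Toolbox.

```matlab
% Minimal sketch: two ways to use knnsearch for assigning cluster IDs.
rng(0);                                   % reproducible toy data (assumption)
X = [randn(100,2); randn(100,2) + 8];     % two well-separated training blobs
[idx, C] = kmeans(X, 2);

newX = [0 0; 8 8];                        % hypothetical new observations

% (a) Nearest centroid: the returned row index into C is the cluster ID.
byCentroid = knnsearch(C, newX);

% (b) Nearest training point: look up that point's cluster label.
nn = knnsearch(X, newX);                  % row index into X for each new point
byNeighbor = idx(nn);
```

For points well inside a cluster both variants give the same ID. Near a cluster boundary they can differ, because (b) follows the closest individual training point rather than the closest centroid.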
