Form a matrix with variable number of columns (with fixed row numbers)

I'm clustering a 70,000x3 data matrix (say X is the data and K is the number of clusters). For that, I want to store the index values (positions) of the data points belonging to each cluster (required for further calculations). As per the problem, it would be something like Y[c, j] (c = cluster number, j = index value of a data point). But the clusters formed aren't of equal size: the number of data points belonging to each cluster will be different. Is there a way to form such a matrix with a variable number of columns and a fixed number of rows? If not, please suggest another way.
Thanks in advance!

Accepted Answer

Walter Roberson on 19 Dec 2018
Just store the cluster number for each point. If you need to know which points belong to one particular cluster, use a logical mask, idx == cluster_number, or find(idx == cluster_number). If you do that a lot, calculate it once and store the results in a cell array.
If for some reason it is particularly important to store everything in a single numeric array, then zero-pad or NaN-pad the shorter rows.
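A minimal sketch of the mask, find, and cell-array approach described above. The label vector idx is randomly generated here purely as a stand-in (an assumption; in practice it would come from the clustering step), and the names members and counts are illustrative, not from the original thread.

```matlab
% Hypothetical stand-in data: idx(i) is the cluster label (1..K) of point i.
rng(0);                         % fixed seed so the sketch is repeatable
N = 20;                         % small N for illustration; the question has 70000
K = 5;
idx = randi(K, N, 1);

% One-off lookup for a single cluster: logical mask or find
mask3    = (idx == 3);          % N-by-1 logical, true where the point is in cluster 3
members3 = find(idx == 3);      % row indices of the points in cluster 3

% Compute once and keep: a K-by-1 cell array, one index vector per cluster
members = cell(K, 1);
for c = 1:K
    members{c} = find(idx == c);
end

counts = cellfun(@numel, members);   % cluster sizes, which are free to differ
```

With this layout, X(members{c}, :) would pull out the rows of the data matrix that belong to cluster c, and no padding is needed because each cell holds a vector of its own length.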
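If a single numeric array is required, the padding idea might look like the sketch below. The small label vector idx is made up for illustration, and Y follows the Y[c, j] layout from the question: row c holds the indices of cluster c, with NaN filling the unused columns.

```matlab
% Hypothetical labels: idx(i) is the cluster (1..K) of point i.
idx = [2; 1; 3; 1; 2; 1];
K = 3;

counts = accumarray(idx, 1, [K 1]);   % number of points in each cluster
Y = NaN(K, max(counts));              % K rows, as wide as the largest cluster
for c = 1:K
    Y(c, 1:counts(c)) = find(idx == c).';   % row c holds cluster c's indices
end

% Read a row back without its padding
row1 = Y(1, ~isnan(Y(1, :)));
```

NaN is used rather than zero here so that padding cannot be mistaken for a usable value, but every consumer of Y then has to strip the padding before indexing, which is one reason the cell-array form is usually more convenient.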

More Answers (1)

Image Analyst on 19 Dec 2018
Why should they have the same number? What if they don't? I guess you could force it by taking the principal components with pca(), sorting on PC1, and then splitting at the halfway point into two clusters. Would that do what you want?
Also, what are X and K? Is K the number of clusters you want, like 2 or 3? Then what is X?
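For reference, the PCA-based split suggested above could be sketched roughly as follows. This assumes the Statistics and Machine Learning Toolbox (for pca) and uses random data as a stand-in for X; rounding the halfway point down for odd N is also an assumption.

```matlab
% Hypothetical stand-in for the N-by-3 data matrix X from the question.
rng(0);
X = randn(11, 3);
N = size(X, 1);

[~, score] = pca(X);                 % score(:,1) is each point's coordinate along PC1
[~, order] = sort(score(:, 1));      % point indices ordered along PC1

half = floor(N / 2);                 % halfway point (rounded down for odd N)
idx = zeros(N, 1);
idx(order(1:half))     = 1;          % lower half along PC1
idx(order(half+1:end)) = 2;          % upper half along PC1
```

This forces two groups whose sizes differ by at most one, which is a different goal from storing the indices of clusters that are naturally unequal in size.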
Attach your data with the paper clip icon, and a screenshot of it plotted with scatter3() using the insert frame icon.
  1 Comment
Pushkar Khatri on 20 Dec 2018
Sorry for the confusion: X is the data matrix and K is the number of clusters (5 or 7).
I am saying that the number of data points in each cluster won't be the same, since the clusters are allocated randomly at first (also, 70000/K will not be an integer for every K).
Why is the sorting needed? Can't I just store the index values of the data points in a cluster (if that's possible) and then use them for further clustering?
