Question on agglomerative hierarchical clustering: "Distance matrix has more elements than the maximum allowed size in MATLAB"

Per the MATLAB documentation, we can do hierarchical clustering as follows:
Y = pdist(X)
Z = linkage(Y)
T = cluster(Z,'cutoff',1.2)
My data set has 50663 samples. I cannot load the entire data set at once because each sample has ~500000 features. To work around the memory limit, I first compute the square-form distance matrix two samples at a time. Then I vectorize the lower triangle of that matrix using (:), which should produce the same output as pdist (Y in the above example, with 50663*50662/2 = 1283344453 elements).
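To make the vectorization step concrete, here is a minimal sketch on toy data. It assumes Euclidean distance, and the variable names (X, D, Y, Ycol) are illustrative, not taken from the original code. It fills a condensed distance vector in pdist order, one pair at a time, and compares that with extracting the lower triangle from the square matrix. Note that indexing a matrix with a logical mask, or with (:), returns a column vector, whereas pdist returns a row vector.

```matlab
% Toy stand-in for the real data; in practice two samples are loaded at a time.
m = 5;
X = rand(m, 3);

% Condensed distance vector, preallocated as a ROW vector like pdist output.
Y = zeros(1, m*(m-1)/2);
D = zeros(m);                 % square form, only feasible for small m

k = 0;
for j = 1:m-1                 % column of the square distance matrix
    for i = j+1:m             % rows below the diagonal
        k = k + 1;
        d = norm(X(i,:) - X(j,:));   % Euclidean distance (assumed metric)
        Y(k) = d;
        D(i,j) = d;
        D(j,i) = d;
    end
end

% Extracting the lower triangle column by column gives the same values,
% but as a COLUMN vector.
Ycol = D(tril(true(m), -1));

disp(size(Y));                % 1-by-10
disp(size(Ycol));             % 10-by-1
disp(isequal(Ycol.', Y));     % same values once transposed
```

Filling Y directly in the loop avoids ever holding the full square matrix in memory, which matters at m = 50663.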
When I try to apply linkage I get the error "Distance matrix has more elements than the maximum allowed size in MATLAB".
1) The distance matrix and its vectorized form are both loaded in my workspace, so .... no
2) Is it trying to calculate a new distance matrix based on the differences between the 1283344453 elements? The documentation does not say to load the square-form distance matrix.
I am lost. The only way I can get around the size of my data set is to use the distance matrix instead of the actual data. Any thoughts would be appreciated.

Answers (1)

Michael Moore on 15 Aug 2022
The linkage function expects a row vector. If you pass it a column vector, you will get this error. Try passing the transpose of your vector to the linkage function.
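A small sketch of this suggestion on toy data, assuming the Statistics and Machine Learning Toolbox is available (pdist, linkage, and cluster come from it). The variable names are illustrative. As far as I understand, linkage treats an input with more than one row as a data matrix rather than as a distance vector, which would explain why a 1283344453-by-1 column triggers the size error; treat that reading as an assumption.

```matlab
% Toy data: 6 samples, 4 features.
X = rand(6, 4);

Y = pdist(X);        % condensed distances, 1-by-15 row vector
Ycol = Y(:);         % what (:) or logical indexing produces: 15-by-1 column

% Make sure linkage receives a row vector.
if ~isrow(Ycol)
    Yrow = Ycol.';   % transpose the column back into a row
else
    Yrow = Ycol;
end

Z = linkage(Yrow);               % (m-1)-by-3 linkage tree
T = cluster(Z, 'cutoff', 1.2);   % one cluster label per sample

disp(size(Yrow));
disp(size(Z));
```

On the full data set the same check applies: confirm that size(Y) is 1-by-1283344453 before calling linkage.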

Release

R2019a
