Note: This page has been translated by MathWorks. Click here to see

To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

To view all translated materials including this page, select Country from the country navigator on the bottom of this page.

This example shows how to use `cmdscale`

to perform classical (metric) multidimensional scaling, also known as principal coordinates analysis.

`cmdscale`

takes as an input a matrix of inter-point distances and creates a configuration of points. Ideally, those points are in two or three dimensions, and the Euclidean distances between them reproduce the original distance matrix. Thus, a scatter plot of the points created by `cmdscale`

provides a visual representation of the original distances.

As a very simple example, you can reconstruct a set of points from only their inter-point distances. First, create some four dimensional points with a small component in their fourth coordinate, and reduce them to distances.

rng default; % For reproducibility X = [normrnd(0,1,10,3),normrnd(0,.1,10,1)]; D = pdist(X,'euclidean');

Next, use `cmdscale`

to find a configuration with those inter-point distances. `cmdscale`

accepts distances as either a square matrix, or, as in this example, in the vector upper-triangular form produced by `pdist`

.

[Y,eigvals] = cmdscale(D);

`cmdscale`

produces two outputs. The first output, `Y`

, is a matrix containing the reconstructed points. The second output, `eigvals`

, is a vector containing the sorted eigenvalues of what is often referred to as the "scalar product matrix," which, in the simplest case, is equal to `Y*Y'`

. The relative magnitudes of those eigenvalues indicate the relative contribution of the corresponding columns of `Y`

in reproducing the original distance matrix `D`

with the reconstructed points.

format short g [eigvals eigvals/max(abs(eigvals))]

`ans = `*10×2*
35.41 1
11.158 0.31511
1.6894 0.04771
0.1436 0.0040553
7.9529e-15 2.246e-16
4.564e-15 1.2889e-16
2.6538e-15 7.4944e-17
-2.2475e-17 -6.3471e-19
-3.6359e-16 -1.0268e-17
-3.3335e-15 -9.4139e-17

If `eigvals`

contains only positive and zero (within round-off error) eigenvalues, the columns of `Y`

corresponding to the positive eigenvalues provide an exact reconstruction of `D`

, in the sense that their inter-point Euclidean distances, computed using `pdist`

, for example, are identical (within round-off) to the values in `D`

.

`maxerr4 = max(abs(D - pdist(Y))) % Exact reconstruction`

maxerr4 = 3.5527e-15

If two or three of the eigenvalues in `eigvals`

are much larger than the rest, then the distance matrix based on the corresponding columns of `Y`

nearly reproduces the original distance matrix `D`

. In this sense, those columns form a lower-dimensional representation that adequately describes the data. However it is not always possible to find a good low-dimensional reconstruction.

`maxerr3 = max(abs(D - pdist(Y(:,1:3)))) % Good reconstruction in 3D`

maxerr3 = 0.043142

`maxerr2 = max(abs(D - pdist(Y(:,1:2)))) % Poor reconstruction in 2D`

maxerr2 = 0.98315

The reconstruction in three dimensions reproduces `D`

very well, but the reconstruction in two dimensions has errors that are of the same order of magnitude as the largest values in `D`

.

max(max(D))

ans = 5.8974

Often, `eigvals`

contains some negative eigenvalues, indicating that the distances in `D`

cannot be reproduced exactly. That is, there might not be any configuration of points whose inter-point Euclidean distances are given by `D`

. If the largest negative eigenvalue is small in magnitude relative to the largest positive eigenvalues, then the configuration returned by `cmdscale`

might still reproduce `D`

well.