Cosine distance range interpretation

Louis on 13 Dec 2013
Edited: Roger Stafford on 14 Dec 2013
I am trying to use the cosine distance in pdist2, and I am confused about its output. As far as I know, it should be between 0 and 1. Since MATLAB uses 1-(cosine), 1 would be the highest variability and 0 the lowest. However, the output seems to range from about 0.5 to 1.5!
Can somebody please advise me on how to interpret its output, and why it behaves this way?
  Comments: 1
dpb on 13 Dec 2013
Looking at the m-file, it doesn't appear to do what it says, precisely...
...
case 'cos' % Cosine
[X,Y,flag] = normalizeXY(X,Y);
...
case 'cor' % Correlation
X = bsxfun(@minus,X,mean(X,2));
Y = bsxfun(@minus,Y,mean(Y,2));
[X,Y,flag] = normalizeXY(X,Y);
...
case 'spe'
X = tiedrank(X')'; % treat rows as a series
Y = tiedrank(Y')';
X = X - (p+1)/2; % subtract off the (constant) mean
Y = Y - (p+1)/2;
[X,Y,flag] = normalizeXY(X,Y);
...
case {'cos' 'cor' 'spe'} % Cosine, Correlation, Rank Correlation
    % This assumes that data have been appropriately preprocessed
    for i = 1:ny
        d = zeros(nx,1,outClass);
        for q = 1:p
            d = d + (X(:,q).*Y(i,q));
        end
...
There's some other normalization and ordering, but no cos() in sight. The only difference between the three alternatives here seems to be how the input values are preconditioned before the distance computation.
I don't have time at the moment to read this more thoroughly; perhaps the above will give you some clues...
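One way to read the excerpt above: the accumulation loop is a dot product, and a dot product of unit-length vectors is the cosine of the angle between them, so no explicit cos() call is needed. The sketch below illustrates this in base MATLAB. It assumes that normalizeXY scales each row to unit Euclidean length, which the excerpt does not show, and the vectors x and y are made-up values for illustration only.

```matlab
% Sketch: dot product of unit-length vectors = cosine of the angle.
% Assumption: normalizeXY divides each row by its Euclidean norm
% (its body is not shown in the excerpt above).
x = [3 4];                 % hypothetical example vector
y = [4 3];                 % hypothetical example vector

xn = x / norm(x);          % unit-length copy of x
yn = y / norm(y);          % unit-length copy of y

c = sum(xn .* yn);         % same accumulation as the loop over q; c = 0.96
d = 1 - c;                 % cosine distance; d = 0.04
theta = acos(c);           % included angle in radians, for reference
```

Under that assumption, the 'cor' and 'spe' cases would differ only in centering or ranking the rows before the same normalization and dot product.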


Accepted Answer

Roger Stafford on 14 Dec 2013
Edited: Roger Stafford on 14 Dec 2013
Your statement, "Since Matlab uses 1-(cosine), then 1 would be the highest variability while 0 would be the lowest", is not true. The cosine distance as defined by MATLAB can range anywhere between 0 and 2. The cosine of the included angle between two vectors ranges from -1 up to +1, so one minus the cosine ranges from 2 down to 0. The cosine distance is 0 for two vectors pointing in the same direction, 1 for two perpendicular vectors, and 2 for two vectors pointing in opposite directions. The formula for computing the cosine distance between two vectors v1 and v2 is:
d = 1 - dot(v1,v2)/norm(v1)/norm(v2);
Suppose you have v1 = [6,-8] and v2 = [-9,12], which point in exactly opposite directions. The computation would go:
d = 1 - ((6)*(-9)+(-8)*(12))/...
sqrt((6)^2+(-8)^2)/sqrt((-9)^2+(12)^2)
= 1 - (-150)/10/15 = 1 - (-1) = 1 + 1 = 2
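The three landmark values of the range can be checked numerically with a short sketch that uses only base MATLAB. The helper name cosdist and the extra test vectors are illustrative assumptions, not part of pdist2.

```matlab
% Sketch: cosine distance at the three landmark angles.
% cosdist is a hypothetical helper implementing the formula above.
cosdist = @(a, b) 1 - dot(a, b) / (norm(a) * norm(b));

v1 = [6, -8];

d_same = cosdist(v1, [3, -4]);    % same direction      -> 0
d_orth = cosdist(v1, [8, 6]);     % perpendicular       -> 1
d_opp  = cosdist(v1, [-9, 12]);   % opposite direction  -> 2

% With the Statistics Toolbox installed, pdist2(v1, [-9, 12], 'cosine')
% is expected to agree with d_opp (not run here).
```

Read this way, an output between 1 and 1.5 means the two vectors are more than 90 degrees apart; a distance of 1.5, for example, corresponds to a cosine of -0.5, or an angle of 120 degrees.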
