What threshold is plotconfusion applying?

Caleb Begly on 2 May 2018
Edited: Greg Heath on 3 May 2018
There is some interesting behaviour with plotconfusion when passing it double values instead of categorical values. If I have some predicted responses and plot the confusion matrix like this:
plotconfusion(Y', yEst)
I get different results from when I do this to set the decision boundary at 0.5 (which is what the confusion() function documentation claims is the boundary used, although I'm not sure if that's what plotconfusion uses).
plotconfusion(Y', double(yEst > 0.5))
What is it actually doing behind the scenes?
Code and data provided

Accepted Answer

Greg Heath on 3 May 2018
Edited: Greg Heath on 3 May 2018
% The example in the "help PLOTCONFUSION" documentation doesn't help because there are no errors with the simpleclass_dataset! Therefore, consider the cancer dataset in the "doc PLOTCONFUSION" example.
close all, clear all, clc
[ x t ] = cancer_dataset;
[ I N ] = size(x) %[ 9 699 ]
[ O N ] = size(t) %[ 2 699 ]
vart = mean(var(t',1)) % 0.2259
t1 = t(1,:); t2 = t(2,:);
N1 = sum(t1), N2 = sum(t2) % 458, 241
m1 = mean(t1), m2 = mean(t2) % 0.6552, 0.3448
vart1 = var(t1,1), vart2 = var(t2,1)% 0.2259, 0.2259
net = patternnet(10);
rng(0)
[net tr y e ] = train(net,x,t);
% y = net(x); e = t - y;
NMSE = mse(e)/vart % 0.0984
% NOTE: Although the regression error is ~10%,
% the classification error will only be ~3%.
[c,cm,ind,per] = confusion(t,y)
% cm  = 446   12    ( 12/458 = 0.0262 )
%         8  233    (  8/241 = 0.0332 )
% c   = 0.0286      ( 20/699 = 0.0286 )
%                   (  8/454 = 0.0176 , 12/245 = 0.0490 )
% per = 0.0490  0.0176  0.9824  0.9510
%       0.0176  0.0490  0.9510  0.9824
plotconfusion(t,y)
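The ratios in the comments above can be checked by hand from cm alone. The following is a minimal sketch using only base MATLAB, assuming rows of cm are target classes and columns are predicted classes (consistent with the row sums 458 and 241 reported above). The variable names rowErr and colErr are illustrative and are not outputs of confusion.

```matlab
% Confusion matrix copied from the output above.
cm = [446  12;
        8 233];

N = sum(cm(:));                       % 699 samples in total
c = (N - trace(cm)) / N;              % overall error: 20/699

% Fraction of each target class that was misclassified (by row).
rowTotals = sum(cm, 2);               % [458; 241]
rowErr = (rowTotals - diag(cm)) ./ rowTotals;    % [12/458; 8/241]

% Fraction of each predicted class that was wrong (by column).
colTotals = sum(cm, 1);               % [454 245]
colErr = (colTotals - diag(cm)') ./ colTotals;   % [8/454 12/245]

fprintf('c      = %.4f\n', c);
fprintf('rowErr = %.4f %.4f\n', rowErr);
fprintf('colErr = %.4f %.4f\n', colErr);
```

The column-wise ratios are the ones that reappear in the first two columns of per, in swapped order for the two classes.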
NOTE: There are no thresholds to apply. The classification is determined by the class with the highest output which is interpreted as a posterior probability.
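To see why the highest-output rule and a 0.5 threshold can disagree, here is a small sketch in base MATLAB. The yEst values are made up for illustration, and the code imitates the rule described in the note rather than calling plotconfusion itself.

```matlab
% Hypothetical 2-class outputs for 4 samples (one column per sample).
yEst = [0.45 0.60 0.30 0.55;
        0.40 0.70 0.20 0.35];

% Highest-output rule: index of the largest value in each column.
[~, winner] = max(yEst, [], 1);

% One-hot matrix with a single 1 per column at the winning row.
onehot = zeros(size(yEst));
onehot(sub2ind(size(yEst), winner, 1:size(yEst, 2))) = 1;

% Thresholding every output independently at 0.5.
thresholded = double(yEst > 0.5);

disp(onehot)
disp(thresholded)

% Columns where the two encodings differ.
differs = any(onehot ~= thresholded, 1);
```

With a threshold, a column can end up with no 1 at all (both outputs below 0.5) or with two 1s (both above 0.5), so it is no longer a valid one-hot prediction. That is one plausible reason the two plotconfusion calls in the question give different matrices.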
Hope this helps.
Thank you for formally accepting my answer
Greg
2 Comments
Greg Heath on 3 May 2018
IT SHOULD BE NOTED THAT THE TARGET MATRIX COLUMNS SHOULD ALWAYS BE COLUMNS OF THE UNIT MATRIX !!!
The corresponding values are often interpreted as prior probabilities.
Greg
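A target matrix of that form can be built directly from integer class labels. This is a minimal sketch in base MATLAB with made-up labels. The toolbox function ind2vec, used in the reply below, produces the same encoding as a sparse matrix.

```matlab
% Hypothetical class labels for 5 samples drawn from 3 classes.
labels = [2 1 3 3 1];
nClasses = 3;

% Each target column is the matching column of the identity matrix.
I = eye(nClasses);
t = I(:, labels);

% Sanity check: entries are 0 or 1 and every column sums to 1.
isOneHot = all(t(:) == 0 | t(:) == 1) && all(sum(t, 1) == 1);

% Recover the labels from the one-hot columns.
[~, recovered] = max(t, [], 1);
```

Running the isOneHot check on a target matrix before calling confusion or plotconfusion is a cheap way to catch badly encoded targets.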
Caleb Begly on 3 May 2018
Ok, thanks for the help. So it seems to treat the yEst values as "probabilities" and take the largest one. I'm not sure what you mean by taking the class with the lowest error, because it does not pick, for each sample, the class where the difference between y and yEst is smallest. However, the following works:
[~, class] = max(yEst);
yEstNew = full(ind2vec(class));
plotconfusion(Y', yEstNew); % gives the same values as plotconfusion(Y', yEst);
I'm still not sure exactly what it is doing internally, though, because this doesn't explain how it would handle the case where y isn't 0 or 1.
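For what it's worth, a highest-output rule depends only on the ordering of the values within each column, so it still yields a winner when the outputs fall outside [0,1]. The sketch below uses base MATLAB and made-up scores to tabulate a confusion matrix by hand under that rule. It assumes ties go to the first row (as max does) and is not a reproduction of plotconfusion's internals.

```matlab
% Hypothetical one-hot targets and raw scores outside [0,1].
t    = [1 0 1 0 1;
        0 1 0 1 0];
yRaw = [ 2.3 -0.5  0.1  4.0 -1.0;
        -1.2  1.7  0.4  1.5 -3.0];

% Class index per sample from the largest entry in each column.
[~, trueClass] = max(t, [], 1);
[~, predClass] = max(yRaw, [], 1);

% Rows = target class, columns = predicted class.
cm = accumarray([trueClass' predClass'], 1, [2 2]);

% Fraction of misclassified samples.
errRate = sum(predClass ~= trueClass) / numel(trueClass);

% A positive rescaling plus a shift leaves the winners unchanged.
[~, predShifted] = max(10 * yRaw + 3, [], 1);
sameWinners = isequal(predShifted, predClass);
```

Because only the ordering matters, rescaling the outputs into [0,1] beforehand would not change the resulting matrix under this rule.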
