I have a dataset that I classified using 10 different thresholds. I then computed the true and false positive rates (TPR, FPR) to generate a ROC curve. However, the curve looks strange. Did I compute the curve correctly? Below is the code I used to generate the ROC curve.
TPR=[0.214091009346534 0.231387608987612 0.265932891531049 ...
0.324782536928746 0.407704239947213 0.497932979272465 ...
0.566189022386499 0.587833185570207 0.546182718263242 ...
0.434923996561788];
FPR=[0.006017495627892 0.008669605012233 0.013377312018797 ...
0.022621821298088 0.039994426565193 0.069264094928662 ...
0.108694153334795 0.148784394110204 0.178634096117665 ...
0.194756822274831];
plot(FPR,TPR);

Comments: 2

Cretu Calin on 7 Apr 2017
I think that the last two values are wrong. You cannot have any point on the right side of the diagonal from (0,0) to (1,1).
Image Analyst on 7 Apr 2017
Cretu, please explain why you believe the last two points are to the right of the 0-to-1 diagonal:


Accepted Answer

Thorsten on 25 Nov 2015

1 vote

I agree that the curve looks strange. If you decrease the threshold, the true positive rate cannot get smaller; it can only stay the same or increase. So the two points at the end of your curve are wrong. Also, you should vary the threshold through the full range, from max to 0, so that the curve starts at (0,0) and ends at (1,1).

Comments: 9

Karolina on 25 Nov 2015
Why do you think that my last two points are wrong? I have other statistics computed for this data (Type I and Type II error), and their values look reasonable (see below). Type I starts almost at zero and ends at 0.93, so I thought that would be enough for a nice ROC curve. How should I select my thresholds so that the ROC curve begins at zero and ends at one?
TypeI=[0.001690304992780 0.005603857605008 0.016671577454581 ...
0.045346619920683 0.110468974349514 0.236186829620592 ...
0.427461429256604 0.645969927783621 0.826361530690066 ...
0.932744106170055];
TypeII=[0.929205323099390 0.837527907084626 0.688232020353751 ...
0.503233198019267 0.327662747863992 0.195276741144506 ...
0.111228585773647 0.062940711525258 0.036581507396483 ...
0.022156258145306];
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
The last two points in TPR are smaller than the third-to-last point. This means that you get fewer TPs for lower thresholds. That's wrong. If N points are a hit at threshold t, they are also a hit at thresholds t - dt and t - 2*dt. So the true positive rate should be monotonically non-decreasing for decreasing thresholds. But that's not the case in your data. To be more specific, please expand on how you determined TPR and FPR.
You can normalize the response of your operator to the range [0, 1] and then vary the thresholds over [0, 1]. Or, if you don't want to normalize, vary the thresholds over [Xmin, Xmax], where Xmin and Xmax bound your operator response.
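As a rough illustration of the normalization idea, here is a minimal sketch on synthetic data. The arrays GT and R, their sizes, and the noise level are made-up stand-ins, not the data from this thread. It rescales the response to [0, 1], sweeps the threshold from 1 down to 0, and checks that both rates never decrease.

```matlab
% Synthetic stand-ins (assumptions): R plays the operator response, GT the ground truth.
rng(0);                                  % fixed seed so the run is repeatable
GT = rand(200, 200) > 0.7;               % logical mask, roughly 30% positives
R  = double(GT) + 0.8*randn(200, 200);   % noisy response, higher on positives

% Normalize the response to [0, 1].
Rn = (R - min(R(:))) / (max(R(:)) - min(R(:)));

% Sweep the threshold from high to low across the full range.
thresholds = linspace(1, 0, 101);
TPR = zeros(size(thresholds));
FPR = zeros(size(thresholds));
for i = 1:numel(thresholds)
    Rt = Rn > thresholds(i);             % thresholded response
    TPR(i) = nnz(Rt & GT)  / nnz(GT);
    FPR(i) = nnz(Rt & ~GT) / nnz(~GT);
end

% Lowering the threshold can only add detections, so both rates are non-decreasing.
isMonotone = all(diff(TPR) >= 0) && all(diff(FPR) >= 0);
plot(FPR, TPR); xlabel('FPR'); ylabel('TPR');
```

With the strict comparison `Rn > t`, the curve starts exactly at (0,0) for t = 1 and ends just short of (1,1) for t = 0, because the single pixel with the minimum response is never above the threshold.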
Karolina on 25 Nov 2015
Edited: Karolina on 25 Nov 2015
I evaluated TPR and FPR based on these formulas:
TPR=TP/TP+FN
FPR=FP/FP+TN
where TP,FN, FP, and TN are taken from https://en.wikipedia.org/wiki/Confusion_matrix
% T1 = dataset with the lowest threshold
A1=nnz(T1==3); %TP
B1=nnz(T1==4); %FP
C1=nnz(T1==6); %FN
D1=nnz(T1==8); %TN
% import A2-A10, ..., D2-D10
TPRA1=A1/(A1+C1); %True Positive Rate
FPRA1=B1/(B1+D1); %False Positive Rate
% evaluate TPRA2-TPRA10; FPRA2-FPRA10
TPR=[TPRA1 TPRA2 TPRA3 TPRA4 TPRA5 TPRA6 TPRA7 TPRA8 TPRA9 TPRA10];
FPR=[FPRA1 FPRA2 FPRA3 FPRA4 FPRA5 FPRA6 FPRA7 FPRA8 FPRA9 FPRA10];
plot(FPR,TPR);
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
You missed the parentheses TPR = TP/(TP + FN); FPR = FP/(FP + TN);
But I see that this error is not in your code.
How are the T1, ..., T10 data generated? Do you have the operator response and the ground truth data?
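To see why the parentheses matter, here is a tiny sketch with made-up counts (TP = 30 and FN = 70 are illustrative values only). MATLAB evaluates division before addition, so the unparenthesized form is not a rate at all.

```matlab
TP = 30; FN = 70;            % made-up counts for illustration only
noParens   = TP/TP+FN;       % parsed as (TP/TP) + FN
withParens = TP/(TP+FN);     % intended true positive rate
fprintf('TP/TP+FN   = %g\n', noParens);
fprintf('TP/(TP+FN) = %g\n', withParens);
```

The first expression evaluates to 71 and the second to 0.3.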
Karolina on 25 Nov 2015
Edited: Karolina on 25 Nov 2015
Yes, I forgot the parentheses, but only in the post. In my code it is fine. My ground truth is data that has been manually classified and verified (reference.tif in the attached zip), the data I am thresholding is in data.tif (this is only a small part of my dataset), and the thresholds are in thresholds.txt.
T1-T10 are computed as follows (I did this in software other than MATLAB):
X=data.tif
Z1= if X is > 0.9 give 3 else 4
Z2= if X is > 1.0 give 3 else 4
Z3= if X is > 1.1 give 3 else 4
Z4= if X is > 1.2 give 3 else 4
Z5= if X is > 1.3 give 3 else 4
Z6= if X is > 1.4 give 3 else 4
Z7= if X is > 1.5 give 3 else 4
Z8= if X is > 1.6 give 3 else 4
Z9= if X is > 1.7 give 3 else 4
Z10= if X is > 1.8 give 3 else 4
Y=reference.tif % reference data have values 1 or 2
T1=Z1*Y
T2=Z2*Y
T3=Z3*Y
T4=Z4*Y
T5=Z5*Y
T6=Z6*Y
T7=Z7*Y
T8=Z8*Y
T9=Z9*Y
T10=Z10*Y
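It may help to spell out what each product value means. The sketch below uses toy 2-by-2 arrays (made up, not the attached files) and assumes that reference value 1 is the positive class, which is how the script in the following reply reads it with `GT == 1`. Under that assumption, and with 3 meaning "above threshold", the products 4 and 6 correspond to FN and FP respectively. That is the reverse of the labels in the counting code above, so the mapping is worth checking.

```matlab
% Toy arrays (assumptions) standing in for data.tif and reference.tif.
X = [2.0 0.5; 1.5 0.2];      % response values
Y = [1   1  ; 2   2  ];      % reference: 1 assumed positive, 2 assumed negative
t = 1.0;                     % one threshold

Z = 4*ones(size(X));         % 4 = not above threshold (predicted negative)
Z(X > t) = 3;                % 3 = above threshold (predicted positive)
T = Z .* Y;                  % element-wise product, as in T1 = Z1*Y

% Each product identifies one (prediction, reference) pair:
TP = nnz(T == 3);            % Z=3, Y=1: predicted positive, reference positive
FN = nnz(T == 4);            % Z=4, Y=1: predicted negative, reference positive
FP = nnz(T == 6);            % Z=3, Y=2: predicted positive, reference negative
TN = nnz(T == 8);            % Z=4, Y=2: predicted negative, reference negative

TPR = TP/(TP + FN);
FPR = FP/(FP + TN);
```

If 4 and 6 are swapped, the quantity labelled TPR becomes TP/(TP + FP), which is precision rather than a true positive rate, and precision is not required to change monotonically with the threshold.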
Thorsten on 25 Nov 2015
Here's a way to compute the ROC curve for your data:
% ground truth
GT = imread('../../Downloads/matlab/reference.tif');
GT = GT == 1; % convert to binary image
P = nnz(GT); % number of positive responses in ground truth
N = nnz(1-GT);
% responses
R = imread('../../Downloads/matlab/data.tif');
% your thresholds
thresholds = [0.9 1.0 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8];
% alternatively, use 100 thresholds between min(R) and max(R)
% thresholds = linspace(min(R(:)), max(R(:)));
% pre-allocate for speed
tp = nan(1, length(thresholds));
fp = nan(1, length(thresholds));
for i = 1:numel(thresholds)
    t = thresholds(end-i+1); % thresholds from high to low as i increases
    Rt = R > t; % thresholded response
    tp(i) = nnz(Rt & GT);
    fp(i) = nnz(Rt & ~GT);
end
% convert to rates
TPR = tp/P;
FPR = fp/N;
plot(FPR, TPR) % ROC
Karolina on 25 Nov 2015
Thank you! I have one more question: after applying the script to my whole dataset I get an error, which I think is related to the pixels that have no-data values (-3.4028235e+38). What should I do to exclude these values from the evaluation? The message is:
Error using &
Matrix dimensions must agree.
Thorsten on 25 Nov 2015
Edited: Thorsten on 25 Nov 2015
You have to restrict all computations to the valid indices:
valid_ind = R > -3.40e+38;
P = nnz(GT(valid_ind)); % number of positive responses in ground truth
N = nnz(1-GT(valid_ind));
and in the loop
tp(i) = nnz(Rt(valid_ind) & GT(valid_ind));
fp(i) = nnz(Rt(valid_ind) & ~GT(valid_ind));
The best way to do it is to write a function
function [TPR, FPR] = roc(GT, R, thresholds)
and call this function with
GT(valid_ind), R(valid_ind)
in case you have to exclude some pixels from the analysis.
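One possible shape for such a function is sketched below. The name `rocFromThresholds` is hypothetical (chosen to avoid clashing with any existing `roc` on the path), and the arrays are synthetic stand-ins, not the attached files. Local functions at the end of a script need R2016b or later; in older releases the function would go in its own file.

```matlab
% Synthetic stand-ins (assumptions) for reference.tif and data.tif.
rng(1);
GT = rand(50, 50) > 0.5;                 % logical ground truth
R  = single(GT) + single(randn(50, 50)); % response, higher on positives
R(1:5, 1:5) = -3.4028235e+38;            % no-data block, as in the question

valid_ind  = R > -3.40e+38;              % mask of usable pixels
thresholds = 0.9:0.1:1.8;
[TPR, FPR] = rocFromThresholds(GT(valid_ind), R(valid_ind), thresholds);
plot(FPR, TPR);

function [TPR, FPR] = rocFromThresholds(GT, R, thresholds)
% Hypothetical helper: rates for each threshold, ordered from high to low.
    thresholds = sort(thresholds, 'descend');
    P = nnz(GT);                         % positives in ground truth
    N = nnz(~GT);                        % negatives in ground truth
    TPR = nan(1, numel(thresholds));
    FPR = nan(1, numel(thresholds));
    for i = 1:numel(thresholds)
        Rt = R > thresholds(i);          % thresholded response
        TPR(i) = nnz(Rt & GT)  / P;
        FPR(i) = nnz(Rt & ~GT) / N;
    end
end
```

Indexing with `valid_ind` before the call turns both inputs into column vectors of equal length, so the function itself does not need to know about no-data pixels.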
Natsu dragon on 3 Feb 2018
Edited: Natsu dragon on 3 Feb 2018
Hello, I used the same code with your attached data and got results, but when I used it with my own results I got nothing; it didn't plot. Can you help me understand why?


More Answers (0)

Asked: 25 Nov 2015
Edited: 3 Feb 2018