ROC curve for multiclass classifier
Hi,
I want to plot ROC curves for multiclass classifiers (6 classes in total), including SVM, KNN, Naive Bayes, Random Forest, and Ensemble. I calculated the confusion matrix along with precision and recall, but I am not able to generate the ROC curve and the AUC. I tried perfcurve, but it is for binary classification. Can anyone please help me figure out how to resolve this? Thank you.
Answers (3)
Mobayode Akinsolu
3 Jan 2021
Edited: Mobayode Akinsolu
3 Jan 2021
See the three-class example in the section "Plot ROC Curve for Classification Tree" on this page: https://www.mathworks.com/help/stats/perfcurve.html
For multiple classes, I suggest the following, based on that example:
[validationPredictions, validationScores] = validationPredictFcn(validationPredictors);
% validationPredictFcn is any user-defined prediction function that returns the scores (see the help for "predict" and similar functions).
% validationPredictors is the test data, e.g., an n-by-p table holding the predictors
% n is the number of samples or observations
% p is the number of predictors and k is the number of classes
% validationPredictions holds the predicted class labels (size is n-by-1)
% validationScores (n-by-k) are the posterior probabilities that an observation (a row vector of predictors) belongs to each class
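As a rough illustration of where such scores can come from, the sketch below uses the built-in predict function on a classification tree. The Fisher iris data, the 30% hold-out split, and the tree model are stand-ins chosen for the example, not the asker's data or classifiers, and the Statistics and Machine Learning Toolbox is assumed to be installed.

```matlab
% Illustration only: the iris data and a classification tree stand in for
% the real data and models (assumes Statistics and Machine Learning Toolbox).
load fisheriris                      % meas: 150-by-4 predictors, species: 150-by-1 cell of labels
rng(1)                               % fixed seed so the hold-out split is repeatable
cv  = cvpartition(species, 'HoldOut', 0.3);
mdl = fitctree(meas(training(cv),:), species(training(cv)));

validationPredictors = meas(test(cv),:);      % n-by-p test predictors
validationResponse   = species(test(cv));     % n-by-1 true labels

% predict returns the predicted labels and an n-by-k score matrix.
[validationPredictions, validationScores] = predict(mdl, validationPredictors);

% The score columns follow the order of mdl.ClassNames.
disp(mdl.ClassNames')
```

Other classifiers return scores on different scales, so it is worth checking the documentation of each model's predict method before treating its scores as posterior probabilities.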
Assuming that I have four classes ('A', 'B', 'C', and 'D') and I want to plot the ROC curve for the second class which is labelled 'B' (making 'B' the positive class and the other classes, 'A', 'C', and 'D' the negative classes), I will try the following to obtain a score for the positive class 'B' which factors in the negative classes 'A', 'C', and 'D':
diffscore = zeros(size(validationScores,1),1); % preallocate one difference score per observation
for i = 1:size(validationScores,1)
temp = validationScores(i,:); % a row vector holding the scores for the classes [A, B, C, D] for the ith observation out of the total.
% score of +ve class minus the maximum of the scores of all the negative classes (similar to the example available via the webpage link)
diffscore(i) = temp(2) - max([temp(1),temp(3),temp(4)]);
end
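The same difference score can also be computed without a loop. The sketch below is one possible vectorized form; the 3-by-4 score matrix is made up purely for illustration, and posIdx and negIdx are names introduced here, not part of the original answer.

```matlab
% Made-up scores for three observations; columns assumed ordered [A, B, C, D].
validationScores = [0.10 0.70 0.15 0.05;
                    0.60 0.20 0.10 0.10;
                    0.25 0.25 0.40 0.10];

posIdx = 2;                                              % column of the positive class 'B'
negIdx = setdiff(1:size(validationScores,2), posIdx);    % columns of the negative classes

% Positive-class score minus the largest negative-class score, row by row.
diffscore = validationScores(:,posIdx) - max(validationScores(:,negIdx), [], 2);
disp(diffscore)
```

Changing posIdx is then enough to switch the positive class, whatever the number of classes.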
Once the new scores have been obtained, you can proceed to plot the ROC curve for the positive B class ('A', 'C' and 'D' are the negative classes) as follows (similar to the example available via the link):
[X,Y,T,AUC,OPTROCPT,suby,subnames] = perfcurve(validationResponse,diffscore,'B'); % validationResponse: true class labels; AUC: area under the ROC curve
Here, "validationResponse" is the cell array of character vectors or the categorical array that holds the true class labels for "validationPredictors", the test data.
% Plot ROC Curve
plot(X,Y)
hold on
plot(OPTROCPT(1),OPTROCPT(2),'ro')
xlabel('False positive rate')
ylabel('True positive rate')
title('ROC Curve')
hold off
You can repeat the above for each of your classes by revising this line in the code to match the current positive class and the negative classes. For example, with six classes and the second class as the positive class:
diffscore(i,:)=temp(2)-max([temp(1),temp(3),temp(4),temp(5),temp(6)]); % for six classes
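Rather than editing that line by hand for each class, the one-vs-rest procedure can be wrapped in a loop over the classes. The sketch below is one way to do this and is not part of the original answer: the iris data and the classification tree are stand-ins for the real data and models, classNames is a name introduced here, and the Statistics and Machine Learning Toolbox is assumed.

```matlab
% Illustration only: iris data and a tree model stand in for the real setup.
load fisheriris
rng(1)                                          % repeatable hold-out split
cv  = cvpartition(species, 'HoldOut', 0.3);
mdl = fitctree(meas(training(cv),:), species(training(cv)));
validationResponse = species(test(cv));
[~, validationScores] = predict(mdl, meas(test(cv),:));
classNames = mdl.ClassNames;                    % same order as the score columns

k   = numel(classNames);
AUC = zeros(k,1);                               % one AUC per positive class
figure; hold on
for c = 1:k
    negIdx = setdiff(1:k, c);                   % every class except the current one
    % Positive-class score minus the largest negative-class score.
    diffscore = validationScores(:,c) - max(validationScores(:,negIdx), [], 2);
    [X, Y, ~, AUC(c)] = perfcurve(validationResponse, diffscore, classNames{c});
    plot(X, Y, 'DisplayName', sprintf('%s (AUC = %.3f)', classNames{c}, AUC(c)))
end
plot([0 1], [0 1], 'k--', 'DisplayName', 'Chance')
xlabel('False positive rate')
ylabel('True positive rate')
title('One-vs-rest ROC curves')
legend('show')
hold off
```

Each curve treats one class as positive and all others as negative, so a six-class problem yields six curves and six AUC values per classifier.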
Please let me know if this helps.
Comments: 5
Mobayode Akinsolu
26 Mar 2022
I reckon that any good mathematics/statistics text focused on probability ought to provide you with a very detailed rationale for using these scores.
For "quick" references:
Please see Section 5.6.5 Area under the ROC Curve (AUC) in: A. Burkov, The Hundred-Page Machine Learning Book, vol. 1. Québec, QC, Canada: Andriy Burkov, 2019.
We have also explained this briefly in the 2nd paragraph of Section VI A, in this fairly recently published paper.
I hope that helps.