loss, the classification error
I'm using the "loss" function to calculate the classification error.
Below is a confusion matrix that has one misclassification.
In the above case, I think the loss should be 1/7*100 = 14.3%.
But the "loss" function returns 15.9%.
It seems the "loss" function uses some special logic to calculate the loss.
So I'd like to know what that logic is and, if possible, how to get a loss value of 14.3% by changing an option of the "loss" function.
Answers (1)
Drew
October 2, 2024
Edited: Drew, October 2, 2024
As indicated at https://www.mathworks.com/help/stats/classreg.learning.classif.compactclassificationensemble.loss.html#bst1mt4-4, "The software normalizes the observation weights so that they sum to the corresponding prior class probability stored in the Prior property." So, the unexpected behavior is due to the loss function making use of the prior class probabilities stored in the model.
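To make the effect of that normalization concrete, here is a minimal sketch that reproduces the arithmetic by hand. It is not the internal implementation of "loss". It assumes class priors of 0.523 and 0.477 (the training class proportions in the example in this answer), four test observations of class 1 and three of class 2, and that the single misclassified observation belongs to class 2. The variable names are illustrative only.

```matlab
% Assumed values: priors in the order of the class names, true test labels,
% and predictions in which the last class-2 observation is misclassified.
prior      = [0.523 0.477];
classNames = [1 2];
Ytest      = [1 1 1 1 2 2 2]';
Ypred      = [1 1 1 1 2 2 1]';

% Start from unit observation weights, then rescale within each class so
% that the weights of class k sum to prior(k).
w = ones(size(Ytest));
for k = 1:numel(classNames)
    inClass = (Ytest == classNames(k));
    w(inClass) = prior(k) * w(inClass) / sum(w(inClass));
end

% Weighted classification error: share of the total weight that falls on
% misclassified observations.
weightedError = sum(w .* (Ypred ~= Ytest)) / sum(w)

% Unweighted classification error: plain fraction of misclassified observations.
unweightedError = mean(Ypred ~= Ytest)
```

Under these assumptions, each class-2 observation carries a weight of 0.477/3 = 0.159, so one misclassified class-2 observation gives a weighted error of 0.159 (15.9%), while the unweighted error is 1/7, or about 0.1429 (14.3%).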
Since you are looking for the unweighted classification error, with no dependence on the prior class probabilities stored in the model, you can get it from the "loss" function by specifying a custom loss function handle in the "LossFun" name-value argument. The custom loss function "function loss = unweighted_classiferror_LossFun(C, s, W, Cost)" in the following code is passed as "LossFun=@unweighted_classiferror_LossFun". The code sets up a simple classifier that runs into exactly the situation described in the question, and then uses the "loss" function to compute the unweighted classification error, giving the desired 14.3% (that is, 1/7 or 0.1429).
% For reproducibility
rng(1);
% Two classes of data
% class 1, N(0,1) with 523 observations
% class 2, N(3,1) with 477 observations
Xtrain=[randn(523,1);(randn(477,1)+3)];
% Vector of target class labels, 1 and 2
Ytrain=[ones(523,1);2*ones(477,1)];
% Make a simple tree model with one split
mdl=fitctree(Xtrain,Ytrain,MaxNumSplits=1);
view(mdl)
model_priors_based_on_training_data = mdl.Prior
model_class_names = mdl.ClassNames
% Create some fake test data that has the confusion matrix in the question
Xtest = [0 0 0 0 3 3 1.7]';
Ytest = [1 1 1 1 2 2 2]';
cm = confusionmat(Ytest,predict(mdl,Xtest))
% classiferror loss weighted by class priors from training data
weighted_classiferror_loss = loss(mdl,Xtest,Ytest)
% unweighted classiferror loss using a simple custom loss function
unweighted_classiferror_loss = loss(mdl,Xtest,Ytest,LossFun=@unweighted_classiferror_LossFun)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Calculate unweighted loss with custom function-handle loss function
function loss = unweighted_classiferror_LossFun(C, s, W, Cost)
    % C is the N-by-K logical matrix for N observations and K classes
    % indicating the class to which the corresponding observation belongs.
    % The column order corresponds to the class order in mdl.ClassNames.
    % s is the N-by-K matrix of predicted scores
    % W is the N-by-1 vector of observation weights
    % Cost is the K-by-K numeric matrix of misclassification costs
    % This particular implementation ignores inputs "W" and "Cost"

    % Find the class with the highest score for each observation
    [~, predictedClass] = max(s, [], 2);
    % Find the true class for each observation
    [~, trueClass] = max(C, [], 2);
    % Calculate the number of misclassified instances
    misclassified = trueClass ~= predictedClass;
    % Calculate the unweighted classification error
    loss = sum(misclassified) / length(misclassified);
end
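As a sanity check on the custom loss function, the unweighted error can also be computed directly from a confusion matrix. The sketch below assumes a confusion matrix with rows as true classes and columns as predicted classes, in which one of the three class-2 observations is predicted as class 1. The matrix values are an assumption based on the test data above, not output copied from a run.

```matlab
% Assumed confusion matrix: rows are true classes, columns are predicted classes.
cm = [4 0; 1 2];

% Unweighted error = 1 - (correctly classified count) / (total count).
unweightedError = 1 - trace(cm) / sum(cm(:))
```

With a fitted model in the workspace, the same quantity should also be available as mean(predict(mdl,Xtest) ~= Ytest), which does not use the stored priors at all.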