Why is the accuracy of a classifier trained with the function generated from Classification Learner lower than that of the model exported directly from the Classification Learner app?

load("savedPumpData.mat");
disp(pumpData);
Data = removevars(pumpData,"flow");
save("Data.mat","Data");
disp(Data)
trainRatio = 0.7;
% Create a random partition of the data into training and test sets
c = cvpartition(size(Data, 1), 'HoldOut', 1 - trainRatio);
% Create the training and test sets
trainingData = Data(c.training, :);
testData = Data(c.test, :);
[featureTableTrain,outputTable0] = Features(trainingData);
disp(featureTableTrain)
[trainedClassifier, validationAccuracy] = BagTrees(featureTableTrain);
[featureTableTest,outputTable] = Features(testData);
disp(featureTableTest)
[yfit,scores] = BaggedTrees.predictFcn(featureTableTest);
disp(yfit);
accuracy = sum(yfit==testData.faultCode)/numel(testData.faultCode)*100;
fprintf('Accuracy: %.2f%%\n', accuracy);
figure;
confusionchart(testData.faultCode, yfit);
title('Confusion Matrix RF');
[yfit1,scores1] = trainedClassifier.predictFcn(featureTableTest);
disp(yfit1);
accuracy = sum(yfit1==testData.faultCode)/numel(testData.faultCode)*100;
fprintf('Accuracy: %.2f%%\n', accuracy);
figure;
confusionchart(testData.faultCode, yfit1);
title('Confusion Matrix');
% Features is the function generated with Diagnostic Feature Designer
% BaggedTrees is the model exported to the workspace from Classification Learner; it gets 90% accuracy
% BagTrees is the function generated for the same exported model; it gets 70% accuracy
  Comments: 1
Vinay Maruvada on 19 Oct 2023
I have a dataset of 240 rows in total, which I split as shown in the code above.
I imported featureTableTrain into Classification Learner for training and featureTableTest for testing.


Accepted Answer

Drew on 18 Oct 2023
Based on what you sent, it looks like the short answer is that the model exported from Classification Learner was trained on all of the data (100%), while the model trained with the training function was trained with 70% of the data.
The final model Classification Learner exports is always trained using the full data set, excluding any data reserved for testing (See https://www.mathworks.com/help/stats/export-classification-model-for-use-with-new-data.html ). If you don't want Classification Learner to use the holdout validation data when training its final model for export, then do the following:
  • Start the Classification Learner session by loading only the training data (70%). Choose whichever validation scheme you would like to use within this 70% of data.
  • After the session is started, load the remaining 30% of the data as the test set.
  • Then, when the final model is exported, it will be trained on only 70% of the data.
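The effect described above can be reproduced outside the app with a minimal sketch. It assumes the Statistics and Machine Learning Toolbox is installed and uses the built-in fisheriris data as a stand-in for the pump feature tables; the variable names (mdlTrainOnly, mdlAllData, and so on) are hypothetical and do not come from the original code.

```matlab
% Illustrative sketch: fisheriris stands in for the pump feature table.
load fisheriris              % meas (150x4 predictors), species (150x1 labels)
rng(0);                      % fix the seed so the split is repeatable

c = cvpartition(species, 'HoldOut', 0.3);    % stratified 70/30 split
XTrain = meas(training(c), :);
YTrain = species(training(c));
XTest  = meas(test(c), :);
YTest  = species(test(c));

% Model trained on the 70% training rows only (what the generated
% training function does when it is given featureTableTrain).
mdlTrainOnly = fitcensemble(XTrain, YTrain, 'Method', 'Bag');

% Model trained on every row, including the rows later used for testing
% (what an exported model amounts to if the session held all the data).
mdlAllData = fitcensemble(meas, species, 'Method', 'Bag');

% Accuracy of both models on the same 30% test rows.
accTrainOnly = mean(strcmp(predict(mdlTrainOnly, XTest), YTest)) * 100;
accAllData   = mean(strcmp(predict(mdlAllData,  XTest), YTest)) * 100;

fprintf('Trained on %d rows: test accuracy %.2f%%\n', ...
    size(mdlTrainOnly.X, 1), accTrainOnly);
fprintf('Trained on %d rows: test accuracy %.2f%%\n', ...
    size(mdlAllData.X, 1), accAllData);
```

The second model has already seen the test rows during training, so its test accuracy is not an independent estimate and will usually look better than the first one. The size of the gap depends on the data set.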
When exporting the model, if you check the box to "Include training data in the exported model", then you can take a look at the size of the training data by examining the properties of the exported model. For example, if the exported trainedModel is an ensemble of trees, take a look at:
size(trainedModel.ClassificationEnsemble.X)
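As a rough sketch of how that check could be used, the row count stored in the model can be compared with the size of the intended training partition. The example below assumes the Statistics and Machine Learning Toolbox and builds a stand-in struct with a ClassificationEnsemble field from the fisheriris data, since the real exported trainedModel is not available here; nModelRows is a hypothetical name.

```matlab
% Illustrative sketch: build a stand-in for an exported model struct.
load fisheriris              % meas (150x4 predictors), species (150x1 labels)
rng(0);                      % fix the seed so the split is repeatable
c = cvpartition(species, 'HoldOut', 0.3);    % 70% training, 30% test

% Assumption: only the ClassificationEnsemble field is mimicked here.
trainedModel.ClassificationEnsemble = fitcensemble( ...
    meas(training(c), :), species(training(c)), 'Method', 'Bag');

% Number of observations the model was actually trained on.
nModelRows = size(trainedModel.ClassificationEnsemble.X, 1);

if nModelRows == c.TrainSize
    fprintf('Model was trained on the %d training rows only.\n', nModelRows);
else
    fprintf('Model was trained on %d rows, but the training set has %d.\n', ...
        nModelRows, c.TrainSize);
end
```

With a real exported model, a row count equal to the full data set rather than the training partition would indicate that the test rows were part of training.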
If this answer helps you, please remember to accept the answer.

More Answers (0)

Release

R2023a
