Why are NaN values found in the score from kfoldPredict?
Names = {'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'};
isCategoricalPredictor = [false, false, true, false, true, false, false, false];
% Use tree learner
template = templateTree('NumVariablesToSample', 'all', ... % to analyse predictor importance
    'Reproducible', true, ...
    'Surrogate', 'on', ... % surrogate splits on, to obtain measures of association
    'MaxNumSplits', maxNumSplits, ...
    'MinLeafSize', minLeafSize);
% optimizable variable does not accept
BestEnsembleMdl = fitcensemble(X_train,y_train,...
'Learners',template, ...
'Method', method, ...
'NumLearningCycles', numLearningCycles, ...
'Holdout', 0.2, ...
'LearnRate', learnRate, ...
'ScoreTransform','logit',... % transform scores to probabilistic estimates
'CategoricalPredictors', isCategoricalPredictor,...
'PredictorNames', Names);
[~, score] = kfoldPredict(BestEnsembleMdl);
Hi, I tried to run kfoldPredict on the ClassificationPartitionedEnsemble object returned by fitcensemble.
When I run kfoldPredict, the score variable it returns contains many NaN values. Please refer to the score variable in the attached mat file.
I expected score to contain only real values.
In the example above, I use the following values:
learnRate = 0.9702
maxNumSplits = 16826
method = 'LogitBoost'
numLearningCycles = 2
minLeafSize = 1
I have saved X_train & y_train variables in the attached mat file. I have reduced the number of rows in X_train & y_train to 10 rows as a demonstration.
1) Why are there NaN values in the score?
2) What should I do to ensure that there are no NaN values in the score?
Thank you
Answers (1)
Shashank Gupta
20 Nov 2020
Hey Yean,
Yes, you get NaNs in the output score, and they come from the 'Holdout' option. With 'Holdout', 0.2, fitcensemble sets aside 20% of the training sample as validation data and trains on the remaining 80%. kfoldPredict returns scores only for the held-out validation observations; the rows for observations that were used for training are filled with NaN. You can check this by changing the 'Holdout' value and watching the number of NaN rows change. One more suggestion: make sure the classes are well represented in both the training and the validation portions, which is hard to guarantee with only 10 rows.
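As a rough sketch of how to work with this, the snippet below reads back only the rows that were scored, and then shows k-fold cross-validation as one possible alternative when a score is needed for every observation. It uses the ionosphere sample data that ships with the Statistics and Machine Learning Toolbox rather than the attached mat file, and the variable names (cvMdl, isHeldOut, heldOutScore, cvMdlK, scoreK) as well as the choice of 10 learning cycles and 5 folds are assumptions made for illustration only.

```matlab
% Illustrative data only (not the poster's X_train/y_train): ionosphere has a
% binary response, which LogitBoost requires.
load ionosphere   % X: numeric predictor matrix, Y: cell array of class labels
rng(1);           % fix the random partition so the run is repeatable

% Same validation setup as in the question: hold out 20% of the rows.
cvMdl = fitcensemble(X, Y, 'Method', 'LogitBoost', ...
    'NumLearningCycles', 10, 'Holdout', 0.2);
[~, score] = kfoldPredict(cvMdl);

% Logical mask of the observations that belong to the holdout set.
isHeldOut = test(cvMdl.Partition);

% Keep only the rows that were actually scored; the other rows are NaN.
heldOutScore = score(isHeldOut, :);

% Alternative (an assumption about what is wanted): with k-fold
% cross-validation every observation is held out in exactly one fold,
% so every row of the score matrix is filled in.
cvMdlK = fitcensemble(X, Y, 'Method', 'LogitBoost', ...
    'NumLearningCycles', 10, 'KFold', 5);
[~, scoreK] = kfoldPredict(cvMdlK);
```

Which route fits depends on the goal: the holdout mask keeps the original 80/20 split and simply ignores the training rows, while 'KFold' trains several ensembles and therefore costs more time.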
I hope this clears up the confusion and gives you enough to explore further.
Cheers