Dimensionality reduction: select 3 random attributes for each tree in fitcensemble

Hi,
I want to use fitcensemble to determine which of my original attributes are the most important.
My idea is to grow a large forest of 2000 trees. Each tree should use only 3 attributes, selected at random. I have also set the maximum number of splits to 2. With this model and the method oobPermutedPredictorImportance(), I understand I will get the most important attributes.
I think my code (below) satisfies all of these conditions except the second one. How can I specify that each tree in the forest uses only a small number of attributes (3), and that these attributes change randomly from tree to tree?
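% Shallow trees: at most 2 splits per tree
treeTemplate1 = templateTree('MaxNumSplits', 2, 'PredictorSelection', 'allsplits');
% Bagged ensemble of 2000 classification trees
Mdl1 = fitensemble(X, Y, 'bag', 2000, treeTemplate1, 'type', 'classification');
% Out-of-bag permutation importance, one value per predictor
Imp1 = oobPermutedPredictorImportance(Mdl1);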
  Comments: 1
Ela Markovic on 28 Nov 2022
Replying to this question from 2017 as it is still relevant and I am dealing with the same problem.
I found in MathWorks documentation the following statement:
"By default, the number of predictors to select at random for each split is equal to the square root of the number of predictors for classification, and one third of the number of predictors for regression."
The question is how to set a value other than the default.


Answers (1)

Ela Markovic on 29 Nov 2022
Edited: Ela Markovic on 29 Nov 2022
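I managed to find out how to limit the number of features (or attributes) that the trees consider. Note that, as in the documentation quote above, the features are drawn at random for each split rather than once per tree.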
You need the following Name-Value pair:
'NumVariablesToSample', 3
Your template tree is then:
treeTemplate1 = templateTree('MaxNumSplits', 2,'PredictorSelection',...
'allsplits','NumVariablesToSample', 3);
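
For reference, here is a minimal end-to-end sketch of how this template could be used to rank predictors. It is only an illustration under some assumptions: it uses the fisheriris sample data (4 predictors) as a stand-in for the original X and Y, and it calls fitcensemble (as in the question title) instead of the fitensemble call from the question. Keep in mind that 'NumVariablesToSample' draws the 3 candidate predictors again at every split.

```matlab
% Sketch only: fisheriris stands in for the asker's data (assumption).
rng(1);                                  % make the random sampling reproducible
load fisheriris                          % meas: 150-by-4 predictors, species: class labels
X = meas;
Y = species;

% Shallow trees (at most 2 splits), 3 candidate predictors drawn at each split
treeTemplate1 = templateTree('MaxNumSplits', 2, ...
    'PredictorSelection', 'allsplits', ...
    'NumVariablesToSample', 3);

% Bagged ensemble of 2000 classification trees
Mdl1 = fitcensemble(X, Y, 'Method', 'Bag', ...
    'NumLearningCycles', 2000, 'Learners', treeTemplate1);

% Out-of-bag permutation importance: one value per predictor column of X
Imp1 = oobPermutedPredictorImportance(Mdl1);

% Predictor indices sorted from most to least important
[~, order] = sort(Imp1, 'descend');
disp(order)
```

The values in Imp1 are in the same order as the columns of X, so order(1) is the column index of the predictor with the highest estimated importance.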
