What is a 'categorical predictor' in a decision tree for regression?

Salad Box, 10 Jan 2019
Commented: Rui An, 23 Nov 2020
Hi, I'd like to use Matlab's own example for the question. Please refer to https://uk.mathworks.com/help/stats/fitrtree.html for the original example.
>> load carsmall
>> whos
  Name              Size      Bytes  Class     Attributes
  Acceleration      100x1       800  double
  Cylinders         100x1       800  double
  Displacement      100x1       800  double
  Horsepower        100x1       800  double
  MPG               100x1       800  double
  Mfg               100x13     2600  char
  Model             100x33     6600  char
  Model_Year        100x1       800  double
  Origin            100x7      1400  char
  Weight            100x1       800  double
>> tree = fitrtree([Weight, Cylinders],MPG,...
'categoricalpredictors',2,'MinParentSize',20,...
'PredictorNames',{'W','C'})
tree =
RegressionTree
PredictorNames: {'W' 'C'}
ResponseName: 'Y'
CategoricalPredictors: 2
ResponseTransform: 'none'
NumObservations: 94
Properties, Methods
What exactly is CategoricalPredictors in this case, and why is it 2?

Accepted Answer

Adam Danz, 10 Jan 2019
Edited: Adam Danz, 10 Jan 2019
MATLAB's fitrtree() function returns a RegressionTree object; its properties are described in the RegressionTree documentation.
The "CategoricalPredictors" property contains index values corresponding to the columns of the predictor data that are categorical (if none of the predictors are categorical, it is empty, []).
So why is CategoricalPredictors equal to 2?
Look at the documentation for the function you're using, fitrtree().
One of its name-value pair arguments is 'CategoricalPredictors', which you set to 2 in your call to fitrtree(). That value is a column index, not a count: it marks the second column of [Weight, Cylinders], i.e. Cylinders, as categorical, while Weight is treated as continuous.
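A minimal sketch of how the 'CategoricalPredictors' value is interpreted, assuming the Statistics and Machine Learning Toolbox and the carsmall data from the question. The variable names t0 to t3 are only illustrative. It fits the same two-column predictor matrix four ways and inspects the resulting property each time:

```matlab
load carsmall
X = [Weight, Cylinders];          % column 1 = Weight, column 2 = Cylinders
names = {'W','C'};

% No categorical predictors requested: for a numeric matrix the
% property is expected to come back empty.
t0 = fitrtree(X, MPG, 'PredictorNames', names);

% Three ways of flagging column 2 (Cylinders) as categorical.
t1 = fitrtree(X, MPG, 'PredictorNames', names, ...
    'CategoricalPredictors', 2);              % column index
t2 = fitrtree(X, MPG, 'PredictorNames', names, ...
    'CategoricalPredictors', [false true]);   % logical mask over columns
t3 = fitrtree(X, MPG, 'PredictorNames', names, ...
    'CategoricalPredictors', {'C'});          % predictor name

% The property stores column indices, so the last three should agree.
disp(t0.CategoricalPredictors)
disp(t1.CategoricalPredictors)
disp(t2.CategoricalPredictors)
disp(t3.CategoricalPredictors)
```

If both columns were meant to be categorical, the value would be [1 2] (or 'all'), not 2.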
Comments: 3
Salad Box, 15 Jan 2019
Actually, after reading it a few times: "the 'CategoricalPredictors' contains index values corresponding to the columns of the categorical predictor data" means that an index value (entry) of 1 refers to the first column of the predictor data, here 'Weight', and an index value of 2 refers to the second column, here 'Cylinders'.
So
tree = fitrtree([Weight, Cylinders],MPG,...
'categoricalpredictors',2,'MinParentSize',20,...
'PredictorNames',{'W','C'})
indicates that 'Cylinders' (the 2nd column in the predictor data) is a categorical predictor. In fact, the cars have only 4, 6, or 8 cylinders, so the variable is not continuous.
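A rough way to check this reading, again assuming the Statistics and Machine Learning Toolbox and the carsmall data, is to list the distinct cylinder counts and then look at which kind of split the fitted tree uses for each predictor. The variable names cylValues, isCatSplit, and isWeightSplit are only illustrative:

```matlab
load carsmall

% Distinct values of the second predictor; the comment above reports 4, 6, 8.
cylValues = unique(Cylinders)';
disp(cylValues)

tree = fitrtree([Weight, Cylinders], MPG, ...
    'CategoricalPredictors', 2, 'MinParentSize', 20, ...
    'PredictorNames', {'W','C'});

% CutType is 'continuous' or 'categorical' for branch nodes and '' for leaves;
% CutPredictor names the predictor used at each branch node.
isCatSplit    = strcmp(tree.CutType, 'categorical');
isWeightSplit = strcmp(tree.CutPredictor, 'W');

disp(unique(tree.CutPredictor(isCatSplit)))   % predictors split by category
disp(unique(tree.CutType(isWeightSplit)))     % how Weight is split

% Text description of the tree: continuous splits read like "W<...",
% categorical splits like "C in {...}".
view(tree)
```

Any categorical split should involve only C, while splits on W should be threshold comparisons on the numeric weight.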
Rui An, 23 Nov 2020
How could I combine the tree with a boxplot?
