Decision tree/regression tree: how does the algorithm choose a value for the root node?

I'm getting a seemingly random value for the first split. It isn't the median of the dataset. How is MATLAB choosing the starting value? For example, in my data the maximum is 25, the minimum is 3, and the median is 8, but the tree chooses a root node split of 11.5, a value that isn't even in the dataset. How is this number chosen?

Answers (1)

Raunak Gupta on 7 Aug 2020
Hi,
The fitctree documentation (fitctree fits a decision tree) has a section called Node Splitting Rules, which describes in detail how the split for a node is decided. In short, a weighted impurity is calculated at the current node, and that determines where the split is placed. The impurity measure can be selected with the SplitCriterion name-value pair. You may want to read those rules for a thorough understanding of splitting in decision trees.
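To make "weighted impurity" concrete, here is a minimal sketch in plain Python (not MATLAB, and not fitctree's actual implementation). It assumes the Gini diversity index as the impurity measure and uses made-up data and labels; the function names `gini` and `weighted_child_impurity` are hypothetical helpers. It only scores one given cut point, by weighting each child's impurity by the fraction of observations it receives.

```python
from collections import Counter

def gini(labels):
    """Gini diversity index of a list of class labels (0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def weighted_child_impurity(x, labels, cut):
    """Impurity of the two children, each weighted by its share of the data.

    Observations with x < cut go left, the rest go right.
    """
    left = [lab for xi, lab in zip(x, labels) if xi < cut]
    right = [lab for xi, lab in zip(x, labels) if xi >= cut]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Made-up illustration data (not from the original question).
x = [3, 5, 6, 8, 8, 10, 13, 20, 25]
labels = ["a", "a", "a", "a", "a", "a", "b", "b", "b"]

parent = gini(labels)
for cut in (7.0, 11.5):
    gain = parent - weighted_child_impurity(x, labels, cut)
    print(f"cut={cut}: impurity gain={gain:.4f}")
```

With this toy data the cut at 11.5 separates the two classes completely, so it yields the larger impurity gain of the two cuts tried.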
Hope this clarifies it.
Comments: 2
Christiana Sasser on 31 Aug 2020
I understand the criteria for splitting but not the actual number chosen. Is it just local optimization of the numbers? I understand why one variable would be chosen over another (lower MSE) but how is the value within that variable chosen? For example, why 11.5 rather than 10 or 12?
Risyad Zaidan on 15 Jul 2023
Sorry, have you found a way to know how fitctree chooses the value for the decision tree node? Can it be done by manual calculation?
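
On the manual-calculation question: in CART-style trees, the candidate cut points for a continuous predictor are typically the midpoints between adjacent sorted unique values. That would explain a split at 11.5 even though 11.5 is not in the data, since it can lie halfway between two neighbouring observations. The sketch below is plain Python (not MATLAB, and not the toolbox's actual code). It uses made-up data chosen to have min 3, max 25 and median 8, and a hypothetical helper `best_split` that tries every midpoint and keeps the one with the smallest summed squared error of the two children, which is equivalent to minimizing the weighted MSE.

```python
def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(x, y):
    """Return (cut_point, total_sse) for the best single split of y on x.

    Candidate cut points are midpoints between adjacent distinct x values.
    Observations with x < cut go left, the rest go right.
    """
    distinct = sorted(set(x))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best = None
    for cut in candidates:
        left = [yi for xi, yi in zip(x, y) if xi < cut]
        right = [yi for xi, yi in zip(x, y) if xi >= cut]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (cut, score)
    return best

# Made-up illustration data: min 3, max 25, median 8 (not the asker's data).
x = [3, 5, 6, 8, 8, 10, 13, 20, 25]
y = [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 5.0, 5.2, 4.9]

cut, score = best_split(x, y)
print(cut)  # 11.5, the midpoint between the neighbouring values 10 and 13
```

So with this reading, 11.5 rather than 10 or 12 is chosen because only midpoints are tried, and any cut strictly between 10 and 13 would partition the training data identically; the midpoint is just the conventional representative.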
