regressionTreeComponent
Description
regressionTreeComponent is a pipeline component that creates a regression
model using a binary decision tree. The pipeline component uses the functionality of the
fitrtree function during the learn phase to train the tree regression model. The
component uses the functionality of the predict and loss functions during the run phase to perform
regression.
Creation
Description
component = regressionTreeComponent creates a pipeline component for a tree regression model.
component = regressionTreeComponent(Name=Value) sets writable Properties using one or more
name-value arguments. For example, you can specify the maximum number of decision splits,
pruning criterion, and minimum leaf size.
Properties
Structural Parameters
The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.
UseWeights
This property is read-only after the component is created.
Observation weights flag, specified as 0 (false)
or 1 (true). If UseWeights is
true, the component adds a third input "Weights" to the
Inputs component property, and a third input tag
3 to the InputTags component
property.
Example: c = regressionTreeComponent(UseWeights=1)
Data Types: logical
Learn Parameters
The software sets learn parameters when you create the component. You can modify learn
parameters using dot notation any time before you use the learn object
function. Any unset learn parameters use the corresponding default values.
MaxNumSplits
Maximum number of decision splits (or branch nodes), specified as a nonnegative
scalar. The software splits MaxNumSplits or fewer branch
nodes.
The default value is n - 1, where
n is the number of observations in the first data argument of
learn.
Example: c = regressionTreeComponent(MaxNumSplits=5)
Example: c.MaxNumSplits = 10
Data Types: single | double
MergeLeaves
Leaf merge flag, specified as "on" or "off".
When MergeLeaves is "on", the component:
- Merges leaves originating from the same parent node if that yields a sum of risk values greater than or equal to the risk associated with the parent node.
- Estimates the optimal sequence of pruned subtrees, but does not prune the regression tree.
Example: c = regressionTreeComponent(MergeLeaves="off")
Example: c.MergeLeaves = "on"
Data Types: char | string
MinLeafSize
Minimum number of leaf node observations, specified as a positive integer scalar.
Each leaf has at least MinLeafSize observations. If
you supply both MinParentSize and MinLeafSize, then the component
uses the setting that gives larger leaves:
MinParentSize = max(MinParentSize,2*MinLeafSize).
Example: c = regressionTreeComponent(MinLeafSize=3)
Example: c.MinLeafSize = 1
Data Types: single | double
MinParentSize
Minimum number of branch node observations, specified as a positive integer
scalar. Each branch node has at least MinParentSize
observations. If you supply both MinParentSize and MinLeafSize, then the component uses the setting that gives larger leaves:
MinParentSize = max(MinParentSize,2*MinLeafSize).
Example: c = regressionTreeComponent(MinParentSize=8)
Example: c.MinParentSize = 12
Data Types: single | double
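The interaction between MinParentSize and MinLeafSize comes down to one line of arithmetic. This sketch uses plain variables with hypothetical values (not component properties) to show which setting wins.

```matlab
% Hypothetical settings, chosen so that the leaf-size constraint dominates.
minParentSize = 8;
minLeafSize = 6;

% The effective branch-node minimum is the larger of the two constraints,
% because a branch node must be able to produce two leaves of minLeafSize.
effectiveMinParentSize = max(minParentSize,2*minLeafSize);
disp(effectiveMinParentSize)
```

With these values, a node needs at least 12 observations to be split, even though MinParentSize alone would allow 8.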
NumBins
Number of bins for numeric predictors, specified as a positive integer scalar or
[] (empty).
- If NumBins is empty ([]), then the component does not bin any predictors.
- If you specify NumBins as a positive integer scalar, then the component bins every numeric predictor into at most NumBins equiprobable bins, and then grows trees on the bin indices instead of the original data.
Example: c = regressionTreeComponent(NumBins=50)
Example: c.NumBins = []
Data Types: single | double
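To make "equiprobable bins" concrete, the following sketch assigns rank-based bin indices to a small predictor vector so that each bin receives the same number of observations. It is an illustration only: the variable names are hypothetical, ties are not handled, and the binning that the component performs internally may differ.

```matlab
% Hypothetical numeric predictor with 8 distinct values.
x = [2.1; 9.7; 4.3; 0.5; 7.8; 3.3; 6.2; 1.4];
numBins = 4;
n = numel(x);

% Rank each observation (1 = smallest value).
[~,order] = sort(x);
ranks = zeros(n,1);
ranks(order) = (1:n)';

% Map ranks to bin indices so each bin holds n/numBins observations.
binIdx = ceil(ranks*numBins/n);
disp(binIdx')
```

A tree grown on binIdx instead of x has at most numBins - 1 candidate split points for this predictor, which is where binning can reduce training time.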
NumVariablesToSample
Number of predictors to select at random for each split, specified as
"all" or a positive integer scalar.
Example: c = regressionTreeComponent(NumVariablesToSample=3)
Example: c.NumVariablesToSample = "all"
Data Types: single | double | char | string
PredictorSelection
Algorithm used to select the best split predictor at each node, specified as a value in this table.
| Value | Description |
|---|---|
| "allsplits" | Standard CART: Selects the split predictor that maximizes the split-criterion gain over all possible splits of all predictors [1]. |
| "curvature" | Curvature test: Selects the split predictor that minimizes the p-value of chi-square tests of independence between each predictor and the response [2]. Training speed is similar to standard CART. |
| "interaction-curvature" | Interaction test: Chooses the split predictor that minimizes the p-value of chi-square tests of independence between each predictor and the response, and that minimizes the p-value of a chi-square test of independence between each pair of predictors and response [2]. Training speed can be slower than standard CART. |
For "curvature" and "interaction-curvature",
if all tests yield p-values greater than 0.05, then the component
stops splitting nodes.
Example: c = regressionTreeComponent(PredictorSelection="curvature")
Example: c.PredictorSelection = "interaction-curvature"
Data Types: char | string
Prune
Flag to estimate the optimal sequence of pruned subtrees, specified as
"on" or "off". If Prune
is "on", then the component grows the regression tree without
pruning it, but estimates the optimal sequence of pruned subtrees. If
Prune is "off" and MergeLeaves
is also "off", then the component grows the regression tree without
estimating the optimal sequence of pruned subtrees.
Example: c = regressionTreeComponent(Prune="off")
Example: c.Prune = "on"
Data Types: char | string
Pruning criterion, specified as "mse".
Data Types: char | string
QuadraticErrorTolerance
Quadratic error tolerance per node, specified as a positive scalar. The component
stops splitting nodes when the weighted mean squared error per node drops below
QuadraticErrorTolerance*ε, where ε is the weighted mean squared error of all
n responses computed before growing the decision tree:
$$\varepsilon = \sum_{i=1}^{n} w_i \left(y_i - \bar{y}\right)^2$$
Here, $w_i$ is the weight of observation i, given that the weights of all the observations sum to one ($\sum_{i=1}^{n} w_i = 1$), and
$$\bar{y} = \sum_{i=1}^{n} w_i y_i$$
is the weighted average of all the responses.
Example: c = regressionTreeComponent(QuadraticErrorTolerance=1e-4)
Example: c.QuadraticErrorTolerance = 1e-5
Data Types: single | double
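The stopping threshold for QuadraticErrorTolerance can be worked through by hand on a small data set. The responses, weights, and tolerance below are hypothetical values chosen for illustration; the computation follows the definition of ε as the weighted mean squared error of all responses about their weighted average.

```matlab
% Hypothetical responses and raw observation weights.
y = [10; 12; 15; 11];
w = [1; 1; 2; 1];

% Normalize the weights so that they sum to one.
w = w/sum(w);

% Weighted average of the responses.
ybar = sum(w.*y);

% Weighted mean squared error of all responses before growing the tree.
epsilon = sum(w.*(y - ybar).^2);

% A node stops splitting once its weighted MSE drops below this threshold.
tol = 1e-4;
threshold = tol*epsilon;
fprintf("ybar = %.4f, epsilon = %.4f, threshold = %.3g\n",ybar,epsilon,threshold)
```

Because the threshold scales with ε, the same tolerance value behaves consistently across response variables measured in different units.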
Reproducible
Flag to enforce reproducibility over repeated runs of training a model, specified
as 0 (false) or 1 (true).
If NumVariablesToSample is not "all", then the component
selects predictors at random for each split. To reproduce the random selections, you
must specify Reproducible as true and set the
seed of the random number generator using rng.
Example: c = regressionTreeComponent(Reproducible=true)
Example: c.Reproducible = 0
Data Types: logical
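A minimal sketch of a reproducible setup follows. The seed value and the number of sampled predictors are arbitrary choices for illustration.

```matlab
% Fix the random number generator seed before training.
rng(0,"twister");

% Sample 2 predictors at random for each split and request reproducibility.
c = regressionTreeComponent(NumVariablesToSample=2,Reproducible=true);
```

Resetting the seed to the same value before each call to learn should then reproduce the same random predictor selections.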
Split criterion, specified as "MSE".
Data Types: char | string
Surrogate
Surrogate decision splits flag, specified as "off",
"on", "all", or a positive integer scalar.
- If Surrogate is "on", the component finds at most 10 surrogate splits at each branch node.
- If Surrogate is "all", the component finds all surrogate splits at each branch node, which can use considerable time and memory.
- If Surrogate is a positive integer scalar, the component finds at most the specified number of surrogate splits at each branch node.
Example: c = regressionTreeComponent(Surrogate="on")
Example: c.Surrogate = "all"
Data Types: single | double | char | string
Run Parameters
The software sets run parameters when you create the component. You can modify the run parameters using dot notation at any time. Any unset run parameters use the corresponding default values.
LossFun
Loss function, specified as "mse" (mean squared error) or a
function handle.
To specify a custom loss function, use function handle notation. For more
information on custom loss functions, see LossFun.
Example: c = regressionTreeComponent(LossFun=@lossfun)
Example: c.LossFun = "mse"
Data Types: char | string | function_handle
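As an illustration of a custom loss, the sketch below defines a weighted mean absolute error. It assumes the handle is called with the three-argument form (Y,Yfit,W) that the loss function for regression trees uses, where Y is the observed response, Yfit is the predicted response, and W is the observation weights; this page does not confirm that signature for the component, so verify it before relying on it. The name maeLoss and the toy values are hypothetical.

```matlab
% Weighted mean absolute error (assumed signature: Y, Yfit, W).
maeLoss = @(Y,Yfit,W) sum(W.*abs(Y - Yfit))/sum(W);

% Check the handle on toy values before attaching it to a component.
Y = [1; 2; 3];
Yfit = [1.5; 2; 2];
W = [1; 1; 1];
lossValue = maeLoss(Y,Yfit,W);
disp(lossValue)

% Attach the custom loss to the component.
c = regressionTreeComponent(LossFun=maeLoss);
```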
ResponseTransform
Function for transforming raw response values, specified as a function handle or function
name. The default is "none", which means @(y)y, or
no transformation. The function must accept a vector (the original response values) and
return a vector of the same size (the transformed response values).
Example: c = regressionTreeComponent(ResponseTransform=@(y)exp(y))
Example: c.ResponseTransform = "exp"
Data Types: char | string | function_handle
TreeSize
Tree size, specified as one of the following values.
- "se": The component returns the best pruning level, which corresponds to the smallest tree whose mean squared error (MSE) is within one standard error of the minimum MSE.
- "min": The component returns the best pruning level, which corresponds to the minimal MSE tree.
Example: c = regressionTreeComponent(TreeSize="min")
Example: c.TreeSize = "se"
Data Types: char | string
Component Properties
The software sets component properties when you create the component. You can modify the
component properties (excluding HasLearnables and
HasLearned) using dot notation at any time. You cannot modify the
HasLearnables and HasLearned properties
directly.
Name
Component identifier, specified as a character vector or string scalar.
Example: c = regressionTreeComponent(Name="Tree")
Example: c.Name = "TreeRegression"
Data Types: char | string
Inputs
Names of the input ports, specified as a character vector, string array, or cell
array of character vectors. If UseWeights is true, the software adds the input port
"Weights" to Inputs.
Example: c = regressionTreeComponent(Inputs=["X","Y"])
Example: c.Inputs = ["X1","Y1"]
Data Types: char | string | cell
Outputs
Names of the output ports, specified as a character vector, string array, or cell array of character vectors.
Example: c = regressionTreeComponent(Outputs=["Responses","LossVal"])
Example: c.Outputs = ["X","Y"]
Data Types: char | string | cell
InputTags
Tags that enable the automatic connection of the component inputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
InputTags, then the number of tags must match the number of
inputs in Inputs. If
UseWeights is true, the software adds a third input tag to
InputTags.
Example: c = regressionTreeComponent(InputTags=[0 1])
Example: c.InputTags = [1 0]
Data Types: single | double
OutputTags
Tags that enable the automatic connection of the component outputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
OutputTags, then the number of tags must match the number of
outputs in Outputs.
Example: c = regressionTreeComponent(OutputTags=[0 1])
Example: c.OutputTags = [1 2]
Data Types: single | double
HasLearnables
This property is read-only.
Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the
component contains Learnables.
Data Types: logical
HasLearned
This property is read-only.
Indicator showing the learning status of the component, returned as
0 (false) or 1 (true). A value of 1 indicates that the learn
object function has been applied to the component and the Learnables are nonempty.
Data Types: logical
Learnables
The software sets learnables when you use the learn object
function. You cannot modify learnables directly.
TrainedModel
This property is read-only.
Trained model, returned as a CompactRegressionTree model object.
Object Functions
| Function | Description |
|---|---|
| learn | Initialize and evaluate pipeline or component |
| run | Execute pipeline or component for inference after learning |
| reset | Reset pipeline or component |
| series | Connect components in series to create pipeline |
| parallel | Connect components or pipelines in parallel to create pipeline |
| view | View diagram of pipeline inputs, outputs, components, and connections |
Examples
Create a regressionTreeComponent component.
component = regressionTreeComponent
component =
regressionTreeComponent with properties:
Name: "RegressionTree"
Inputs: ["Predictors" "Response"]
InputTags: [1 2]
Outputs: ["Predictions" "Loss"]
OutputTags: [1 0]
Learnables (HasLearned = false)
TrainedModel: []
Structural Parameters (locked)
UseWeights: 0
component is a regressionTreeComponent object
that contains one learnable, TrainedModel. This property remains
empty until you pass data to the component during the learn phase.
To limit the number of splits in the tree model, set the
MaxNumSplits property of the component to
7.
component.MaxNumSplits = 7;
Load the carsmall data set and remove missing entries from the
data. Separate the predictor and response variables into two tables.
load carsmall
carData = table(Cylinders,Displacement,Horsepower,Weight,MPG);
R = rmmissing(carData);
X = R(:,["Cylinders","Displacement","Horsepower","Weight"]);
Y = R(:,"MPG");
Train the regressionTreeComponent.
component = learn(component,X,Y)
component =
regressionTreeComponent with properties:
Name: "RegressionTree"
Inputs: ["Predictors" "Response"]
InputTags: [1 2]
Outputs: ["Predictions" "Loss"]
OutputTags: [1 0]
Learnables (HasLearned = true)
TrainedModel: [1×1 classreg.learning.regr.CompactRegressionTree]
Structural Parameters (locked)
UseWeights: 0
Learn Parameters (locked)
MaxNumSplits: 7
Note that the HasLearned property is set to
true, which indicates that the software trained the tree model
TrainedModel. You can use component to predict
response values for new data using the run
function.
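The run phase might look like the sketch below, which repeats the data preparation so that it stands alone. The call to run is an assumption: it supposes that run accepts the same data arguments as learn and returns its outputs in the order listed in the Outputs property ("Predictions", then "Loss"). This page does not show the run syntax, so check the run reference page before using this form.

```matlab
% Prepare the data as in the example above.
load carsmall
carData = table(Cylinders,Displacement,Horsepower,Weight,MPG);
R = rmmissing(carData);
X = R(:,["Cylinders","Displacement","Horsepower","Weight"]);
Y = R(:,"MPG");

% Create and train the component.
component = regressionTreeComponent(MaxNumSplits=7);
component = learn(component,X,Y);

% Assumed syntax: outputs follow the order of component.Outputs.
[predictions,lossValue] = run(component,X,Y);
```

Here the loss is computed on the training data, so it gives an optimistic estimate; a held-out set would give a fairer measure of predictive accuracy.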
References
[1] Breiman, L., J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Boca Raton, FL: CRC Press, 1984.
[2] Loh, W.Y. “Regression Trees with Unbiased Variable Selection and Interaction Detection.” Statistica Sinica, Vol. 12, 2002, pp. 361–386.
[3] Loh, W.Y. and Y.S. Shih. “Split Selection Methods for Classification Trees.” Statistica Sinica, Vol. 7, 1997, pp. 815–840.
Version History
Introduced in R2026a