regressionLinearComponent

Pipeline component for regression of high-dimensional data using a linear model

Since R2026a

    Description

    regressionLinearComponent is a pipeline component that creates a linear model for regression. The pipeline component uses the functionality of the fitrlinear function during the learn phase to train the linear regression model. The component uses the functionality of the predict and loss functions during the run phase to perform regression.

    Creation

    Description

    component = regressionLinearComponent creates a pipeline component for a linear regression model.

    component = regressionLinearComponent(Name=Value) sets writable Properties using one or more name-value arguments. For example, you can specify the type of linear regression model, the technique used to minimize the objective function, and the learning rate.
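
    As a minimal sketch of the name-value syntax, the following call combines several of the learn parameters documented below and then adjusts one of them with dot notation before learning. The specific values are arbitrary illustrations, not recommended settings.

```matlab
% Create a component for a least-squares learner trained with SGD.
% The values below are illustrative only.
c = regressionLinearComponent(Learner="leastsquares", ...
    Solver="sgd", ...
    LearnRate=0.01);

% Learn parameters can still be changed with dot notation
% any time before calling learn.
c.LearnRate = 0.1;
```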

    Properties

    Structural Parameters

    The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.

    This property is read-only after the component is created.

    UseWeights: Observation weights flag, specified as 0 (false) or 1 (true). If UseWeights is true, the component adds a third input "Weights" to the Inputs component property, and a third input tag 3 to the InputTags component property.

    Example: c = regressionLinearComponent(UseWeights=1)

    Data Types: logical

    Learn Parameters

    The software sets learn parameters when you create the component. You can modify learn parameters using dot notation any time before you use the learn object function. Any unset learn parameters use the corresponding default values.

    BatchLimit: Maximal number of batches to process, specified as a positive integer. When the component processes BatchLimit batches, it terminates optimization. If you specify BatchLimit, then the component uses the argument that results in processing the fewest observations, either BatchLimit or PassLimit.

    This property is only valid when Solver is "sgd" or "asgd".

    The default value is ceil(1e6/BatchSize) if you specify multiple solvers and use (A)SGD to get an initial approximation for the next solver. Otherwise the component passes through the data PassLimit times by default.

    Example: c = regressionLinearComponent(BatchLimit=100)

    Example: c.BatchLimit = 500

    Data Types: single | double
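
    The interaction between BatchLimit and PassLimit can be sketched with simple arithmetic. This is an illustration only: it assumes every batch contains exactly BatchSize observations, and the variable names and values are hypothetical.

```matlab
n = 1000;          % hypothetical number of observations
batchSize = 10;    % BatchSize
batchLimit = 50;   % BatchLimit
passLimit = 1;     % PassLimit

% Observations visited under each limit
obsByBatchLimit = batchLimit*batchSize;   % 50 batches of 10 observations
obsByPassLimit  = passLimit*n;            % one full pass over the data

% The limit that processes fewer observations is the one that applies
if obsByBatchLimit <= obsByPassLimit
    bindingLimit = "BatchLimit";
else
    bindingLimit = "PassLimit";
end
maxObservations = min(obsByBatchLimit,obsByPassLimit);
```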

    BatchSize: Mini-batch size, specified as a positive integer. At each iteration, the component estimates the subgradient using BatchSize observations from the first data argument of learn.

    This property is only valid when Solver is "sgd" or "asgd".

    The default value is 10 if the first data argument of learn is a numeric matrix. If it is a sparse matrix, the component uses the value max([10,ceil(sqrt(ff))]), where ff = numel(X)/nnz(X) and X is the first data argument of learn.

    Example: c = regressionLinearComponent(BatchSize=100)

    Example: c.BatchSize = 50

    Data Types: single | double
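
    The default mini-batch size for sparse data can be reproduced with base MATLAB functions. The sketch below uses a hypothetical sparse identity matrix as the predictor data and nnz to count its nonzero elements.

```matlab
% Hypothetical sparse predictor data: 400-by-400 with 400 nonzeros
X = speye(400);

% Ratio of total elements to nonzero elements
ff = numel(X)/nnz(X);

% Default mini-batch size for sparse data, per the rule above
defaultBatchSize = max([10,ceil(sqrt(ff))]);
```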

    Beta: Initial linear coefficient estimates, specified as a p-dimensional numeric vector or a p-by-L numeric matrix. p is the number of predictor variables after dummy variables are created for categorical variables, and L is the number of regularization-strength values in Lambda.

    The component optimizes the objective function L times.

    • If you specify a p-dimensional vector, the component uses Beta as the initial value for the first optimization. For each subsequent optimization, the component uses the estimate from the previous optimization as the initial value.

    • If you specify a p-by-L matrix, at iteration j, the component uses Beta(:,j) as the initial value.

    If you set Solver to "dual", then the component ignores Beta.

    Data Types: single | double

    BetaTolerance: Relative tolerance on the linear coefficients and the bias term (intercept), specified as a nonnegative scalar.

    Let $B_t = [\beta_t' \; b_t]$, that is, the vector of the coefficients and the bias term at optimization iteration t. If $\left\| \dfrac{B_t - B_{t-1}}{B_t} \right\|_2 < \text{BetaTolerance}$, then optimization terminates.

    When Solver is "dual" and you also specify DeltaGradientTolerance, then optimization terminates when the component satisfies either stopping criterion. When Solver is "bfgs", "lbfgs", or "sparsa" and you also specify GradientTolerance, then optimization terminates when the component satisfies either stopping criterion.

    If the component converges for the last solver specified in Solver, then optimization terminates. Otherwise, the component uses the next solver specified in Solver.

    Example: c = regressionLinearComponent(BetaTolerance=1e-6)

    Example: c.BetaTolerance = 1e-5

    Data Types: single | double
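
    A rough sketch of the relative-change test follows. It assumes the quotient in the criterion is taken element by element, which is an interpretation and not something this page states, and the iterates and tolerance are hypothetical values.

```matlab
% Hypothetical iterates [coefficients; bias] at iterations t-1 and t
Bprev = [1.0; 2.0; 0.5];
Bt    = [1.000001; 2.000002; 0.5000005];

betaTolerance = 1e-4;   % illustrative tolerance

% Relative change, assuming element-wise division (assumption)
relChange = norm((Bt - Bprev)./Bt,2);

% Optimization would stop once the relative change is below the tolerance
converged = relChange < betaTolerance;
```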

    Bias: Initial intercept estimate, specified as a numeric scalar or an L-dimensional numeric vector. L is the number of regularization-strength values in Lambda.

    The component optimizes the objective function L times.

    • If you specify a scalar, then the component uses Bias as the initial value for the first optimization. For each subsequent iteration, the component uses the estimate from the previous optimization as the initial value.

    • If you specify an L-dimensional vector, at iteration j, the component uses Bias(j) as the initial value.

    By default:

    • If Learner is "leastsquares", then Bias is the weighted average of the second data input of learn.

    • If Learner is "svm", then Bias is the weighted median of the second data input of learn.

    Data Types: single | double

    DeltaGradientTolerance: Gradient-difference tolerance between upper and lower pool Karush-Kuhn-Tucker (KKT) complementarity conditions, specified as a nonnegative scalar. If the magnitude of the KKT violators is less than DeltaGradientTolerance, then the component terminates optimization.

    If the component converges for the last solver specified in Solver, then optimization terminates. Otherwise, the component uses the next solver specified in Solver.

    This property is only valid when Solver is "dual".

    Example: c = regressionLinearComponent(DeltaGradientTolerance=1e-2)

    Example: c.DeltaGradientTolerance = 0.1

    Data Types: single | double

    Epsilon: Half width of the epsilon-insensitive band, specified as a nonnegative scalar.

    This property is only valid when Learner is "svm".

    The default value is iqr(Y)/13.49, where Y is the second data input of learn. If iqr(Y) is equal to zero, then the default value is 0.1.

    Example: c = regressionLinearComponent(Epsilon=0.3)

    Example: c.Epsilon = 0.2

    Data Types: single | double
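
    The default value of Epsilon can be approximated by hand as follows. The response vector is hypothetical, and the sketch assumes the iqr function is available in your installation.

```matlab
Y = [10; 12; 15; 18; 21; 25; 30];   % hypothetical response values

% Interquartile range of the response
spread = iqr(Y);

% Default half width: iqr(Y)/13.49, or 0.1 when iqr(Y) is zero
if spread == 0
    epsilonDefault = 0.1;
else
    epsilonDefault = spread/13.49;
end
```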

    FitBias: Linear model intercept inclusion flag, specified as 1 (true) or 0 (false).

    If you specify FitBias as true, then the component includes the bias term, b, in the linear model and estimates it. Otherwise, the component sets b = 0 during estimation.

    Example: c = regressionLinearComponent(FitBias=false)

    Example: c.FitBias = true

    Data Types: logical

    GradientTolerance: Absolute gradient tolerance, specified as a nonnegative scalar.

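    Let $\nabla \mathcal{L}_t$ be the gradient vector of the objective function with respect to the coefficients and bias term at optimization iteration t. If $\left\| \nabla \mathcal{L}_t \right\|_\infty = \max \left| \nabla \mathcal{L}_t \right| < \text{GradientTolerance}$, then optimization terminates.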

    If you also specify BetaTolerance, then optimization terminates when the software satisfies either stopping criterion.

    This property is only valid when Solver is "bfgs", "lbfgs", or "sparsa".

    Example: c = regressionLinearComponent(GradientTolerance=1e-7)

    Example: c.GradientTolerance = 1e-5

    Data Types: single | double
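
    The gradient test compares the largest absolute gradient entry with the tolerance. A small sketch with a hypothetical gradient vector and tolerance:

```matlab
% Hypothetical gradient of the objective at the current iteration
g = [1e-8; -5e-7; 2e-9];

gradientTolerance = 1e-6;   % illustrative tolerance

% The infinity norm is the largest absolute entry of the gradient
gradNorm = norm(g,Inf);

% Optimization would stop once the norm is below the tolerance
converged = gradNorm < gradientTolerance;
```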

    HessianHistorySize: Size of history buffer for Hessian approximation, specified as a positive integer. At each iteration, the component composes the Hessian using statistics from the latest HessianHistorySize iterations.

    This property is only valid when Solver is "bfgs" or "lbfgs".

    Example: c = regressionLinearComponent(HessianHistorySize=10)

    Example: c.HessianHistorySize = 20

    Data Types: single | double

    IterationLimit: Maximal number of optimization iterations, specified as a positive integer.

    This property is only valid when Solver is "bfgs", "lbfgs", or "sparsa".

    Example: c = regressionLinearComponent(IterationLimit=500)

    Example: c.IterationLimit = 700

    Data Types: single | double

    Lambda: Regularization term strength, specified as "auto", a nonnegative scalar, or a vector of nonnegative values.

    If you specify "auto", the value of Lambda is 1/n, where n is the number of observations in the first data argument of learn.

    If you specify a vector of nonnegative values, the component sequentially optimizes the objective function for each distinct value in Lambda in ascending order and computes coefficient estimates for each specified regularization strength.

    • The component uses the previous coefficient estimate as the initial estimate for the next optimization iteration unless Solver is "sgd" or "asgd" and Regularization is "lasso".

    • If Regularization is "lasso", then any coefficient estimate of 0 retains its value when the component optimizes using subsequent values in Lambda.

    Example: c = regressionLinearComponent(Lambda=10.^(-(10:-2:2)))

    Example: c.Lambda = "auto"

    Data Types: single | double | char | string

    Learner: Linear regression model type, specified as "svm" or "leastsquares".

    If you specify "svm", the component uses a support vector machine algorithm for linear regression. If you specify "leastsquares", the component uses a linear regression algorithm via ordinary least squares.

    Example: c = regressionLinearComponent(Learner="leastsquares")

    Example: c.Learner = "svm"

    LearnRate: Learning rate, specified as a positive scalar. LearnRate specifies how many steps to take per iteration. At each iteration, the gradient specifies the direction and magnitude of each step.

    • If Regularization is "lasso", then LearnRate is constant for all iterations.

    • If Regularization is "ridge", then LearnRate specifies the initial learning rate γ0. The component determines the learning rate for iteration t, γt, using

      $\gamma_t = \dfrac{\gamma_0}{\left(1 + \lambda \gamma_0 t\right)^{c}}.$

      • λ is the value of Lambda.

      • If Solver is "sgd", then c = 1. If Solver is "asgd", then c is 3/4 when Learner is "svm" and 2/3 otherwise.

    This property is only valid when Solver is "sgd" or "asgd".

    By default, LearnRate = 1./sqrt(1+max((sum(X.^2,obsDim)))), where X is the first input value used by learn. obsDim is 1 if the observations compose the columns of X and 2 otherwise.

    Example: c = regressionLinearComponent(LearnRate=0.01)

    Example: c.LearnRate = 0.1

    Data Types: single | double
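
    The ridge learning-rate schedule and the default initial rate can be sketched numerically. The sketch reads the schedule as γ0 divided by (1 + λγ0t) raised to the power c, and all numeric values and variable names are hypothetical.

```matlab
gamma0 = 0.1;      % initial learning rate (LearnRate)
lambda = 0.01;     % regularization strength (Lambda)
t = 0:4;           % iteration index

% Exponent c: 1 for "sgd"; for "asgd", 3/4 with "svm" and 2/3 otherwise
cSgd = 1;
cAsgdSvm = 3/4;

% Learning rate at each iteration for the two solvers
gammaSgd  = gamma0 ./ (1 + lambda*gamma0*t).^cSgd;
gammaAsgd = gamma0 ./ (1 + lambda*gamma0*t).^cAsgdSvm;

% Default LearnRate for data with observations in rows (obsDim = 2)
X = [3 4; 0 1];
obsDim = 2;
learnRateDefault = 1./sqrt(1 + max(sum(X.^2,obsDim)));
```

    Both schedules start at gamma0 and decay as t grows. The smaller exponent used with "asgd" makes the decay slower.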

    NumCheckConvergence: Data to process before the next convergence check, specified as a positive integer.

    • If Solver is "sgd" or "asgd", NumCheckConvergence specifies the number of batches to process before the next convergence check. By default, the component checks for convergence about 10 times per pass through the entire data set.

    • If Solver is "dual", NumCheckConvergence specifies the number of passes through the entire data set to process before the next convergence check. By default, the component passes through the data set five times.

    Example: c = regressionLinearComponent(NumCheckConvergence=100)

    Example: c.NumCheckConvergence = 10

    Data Types: single | double

    OptimizeLearnRate: Flag to decrease learning rate when the component detects divergence, specified as 1 (true) or 0 (false).

    If OptimizeLearnRate is true, then the component starts optimizing using LearnRate as the learning rate. If the value of the objective function increases, then the component restarts and uses half of LearnRate as the learning rate. This continues until the objective function decreases.

    This property is only valid when Solver is "sgd" or "asgd".

    Example: c = regressionLinearComponent(OptimizeLearnRate=false)

    Example: c.OptimizeLearnRate = true

    Data Types: logical
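
    The halving behavior of OptimizeLearnRate can be illustrated on a toy problem. This is not the component's internal divergence check, which this page does not describe. It is a sketch that tests a single gradient step on a hypothetical one-dimensional objective.

```matlab
% Toy objective f(w) = w^2 with gradient 2*w (illustration only)
f = @(w) w.^2;
gradF = @(w) 2*w;

w0 = 1;            % starting point
learnRate = 1.5;   % deliberately too large, so the first step diverges

% Halve the rate until a step from w0 lowers the objective
while f(w0 - learnRate*gradF(w0)) >= f(w0)
    learnRate = learnRate/2;
end
```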

    PassLimit: Maximal number of passes through the data, specified as a positive integer. When the component passes through the data PassLimit times, it terminates optimization.

    If you specify BatchLimit, then the component uses the argument that results in processing the fewest observations, either BatchLimit or PassLimit.

    When Solver is "sgd" or "asgd", the default value is 1. When Solver is "dual", the default value is 10.

    Example: c = regressionLinearComponent(PassLimit=5)

    Example: c.PassLimit = 10

    Data Types: single | double

    PostFitBias: Flag to fit linear model intercept after optimization, specified as 0 (false) or 1 (true).

    When PostFitBias is false, the component estimates the bias term and coefficients during optimization. Otherwise, the component estimates the bias terms and coefficients before refitting the bias term after optimization.

    This property is only valid when FitBias is true.

    Example: c = regressionLinearComponent(PostFitBias=true)

    Example: c.PostFitBias = false

    Data Types: logical

    Regularization: Complexity penalty type, specified as "lasso" or "ridge". The component composes the objective function for minimization from the sum of the average loss function and the regularization term in this table.

    Value      Description
    "lasso"    Lasso (L1) penalty: $\lambda \sum_{j=1}^{p} \left| \beta_j \right|$
    "ridge"    Ridge (L2) penalty: $\dfrac{\lambda}{2} \sum_{j=1}^{p} \beta_j^2$

    To specify the regularization term strength (λ), use Lambda. The component excludes the bias term from the regularization penalty.

    If Solver is "sparsa", then the default value of Regularization is "lasso". Otherwise, the default is "ridge".

    Example: c = regressionLinearComponent(Regularization="lasso")

    Example: c.Regularization = "ridge"
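
    The two penalty terms can be evaluated directly for a given coefficient vector. In this sketch, the coefficients and regularization strength are hypothetical, and the bias term is left out because the component excludes it from the penalty.

```matlab
beta = [1; -2; 0; 3];   % hypothetical coefficient estimates (no bias term)
lambda = 0.1;           % regularization strength (Lambda)

% Lasso (L1) penalty: lambda times the sum of absolute coefficients
lassoPenalty = lambda*sum(abs(beta));

% Ridge (L2) penalty: lambda/2 times the sum of squared coefficients
ridgePenalty = (lambda/2)*sum(beta.^2);
```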

    Solver: Objective function minimization technique, specified as a character vector or string scalar, a string array, or a cell array of character vectors with values from this table.

    Value       Description                                                       Restrictions
    "sgd"       Stochastic gradient descent (SGD)
    "asgd"      Average stochastic gradient descent (ASGD)
    "dual"      Dual SGD for SVM                                                  Regularization must be "ridge" and Learner must be "svm".
    "bfgs"      Broyden-Fletcher-Goldfarb-Shanno quasi-Newton algorithm (BFGS)    Regularization must be "ridge".
    "lbfgs"     Limited-memory BFGS (LBFGS)                                       Regularization must be "ridge".
    "sparsa"    Sparse Reconstruction by Separable Approximation (SpaRSA)         Regularization must be "lasso".

    If you specify multiple solvers, then, for each value in Lambda, the component uses the solutions of the previous solver as a warm start for the next solver.

    By default:

    • If Regularization is "ridge" and the first data argument of learn contains 100 or fewer predictor variables, then Solver is "bfgs".

    • If Learner is "svm", Regularization is "ridge", and the first data argument of learn contains more than 100 predictor variables, then Solver is "dual".

    • If Regularization is "lasso" and the first data argument of learn contains 100 or fewer predictor variables, then Solver is "sparsa".

    Otherwise, the default solver is "sgd".

    Example: c = regressionLinearComponent(Solver=["sgd","lbfgs"])

    Example: c.Solver = "sparsa"

    Data Types: char | string | cell
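
    The default-solver rules for Solver can be written as a short decision chain. This sketch only mirrors the rules listed above and does not call the component. The variable names and input values are hypothetical.

```matlab
learner = "svm";            % Learner
regularization = "ridge";   % Regularization
numPredictors = 250;        % predictor count in the first data argument

% Apply the documented rules in order; "sgd" is the fallback
if regularization == "ridge" && numPredictors <= 100
    solver = "bfgs";
elseif learner == "svm" && regularization == "ridge" && numPredictors > 100
    solver = "dual";
elseif regularization == "lasso" && numPredictors <= 100
    solver = "sparsa";
else
    solver = "sgd";
end
```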

    TruncationPeriod: Number of mini-batches between lasso truncation runs, specified as a positive integer.

    After a truncation run, the component applies a soft threshold to the linear coefficients. That is, after processing k = TruncationPeriod mini-batches, the component truncates the estimated coefficient j using

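    $\hat{\beta}_j = \begin{cases} \hat{\beta}_j - u_t & \text{if } \hat{\beta}_j > u_t, \\ 0 & \text{if } |\hat{\beta}_j| \le u_t, \\ \hat{\beta}_j + u_t & \text{if } \hat{\beta}_j < -u_t. \end{cases}$

    • When Solver is "sgd", $\hat{\beta}_j$ is the estimate of coefficient j after processing k mini-batches, and $u_t = k \gamma_t \lambda$. $\gamma_t$ is the learning rate at iteration t. λ is the value of Lambda.

    • When Solver is "asgd", $\hat{\beta}_j$ is the averaged estimate of coefficient j after processing k mini-batches, and $u_t = k \lambda$.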

    This property is only valid when Solver is "sgd" or "asgd" and Regularization is "lasso".

    Example: c = regressionLinearComponent(TruncationPeriod=100)

    Example: c.TruncationPeriod = 50

    Data Types: single | double
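
    The soft-threshold step for TruncationPeriod can be sketched in vectorized form. Coefficients whose magnitude is at most the threshold become zero, and the rest shrink toward zero by the threshold. The coefficient values, learning rate, and Lambda below are hypothetical, and the threshold uses the "sgd" form.

```matlab
betaHat = [0.5; -0.05; 0.02; -0.8];   % hypothetical coefficient estimates

k = 100;         % TruncationPeriod
gammaT = 0.01;   % learning rate at the current iteration
lambda = 0.1;    % regularization strength (Lambda)

% Threshold for the "sgd" solver
uT = k*gammaT*lambda;

% Soft threshold: shrink by uT and zero out anything within [-uT, uT]
betaTrunc = sign(betaHat).*max(abs(betaHat) - uT,0);
```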

    Run Parameters

    The software sets run parameters when you create the component. You can modify the run parameters using dot notation at any time. Any unset run parameters use the corresponding default values.

    LossFun: Loss function, specified as a built-in loss function name or a function handle.

    • "mse" — Weighted mean squared error.

    • "epsiloninsensitive" — Epsilon-insensitive loss.

    • Function handle — To specify a custom loss function, use function handle notation. For more information on custom loss functions, see LossFun.

    Example: c = regressionLinearComponent(LossFun="epsiloninsensitive")

    Example: c.LossFun = "mse"

    Data Types: char | string | function_handle

    ResponseTransform: Function for transforming raw response values, specified as a function handle or function name. The default is "none", which means @(y)y, or no transformation. The function must accept a vector (the original response values) and return a vector of the same size (the transformed response values).

    Example: c = regressionLinearComponent(ResponseTransform=@(y)exp(y))

    Example: c.ResponseTransform = "exp"

    Data Types: char | string | function_handle

    Component Properties

    The software sets component properties when you create the component. You can modify the component properties (excluding HasLearnables and HasLearned) using dot notation at any time. You cannot modify the HasLearnables and HasLearned properties directly.

    Name: Component identifier, specified as a character vector or string scalar.

    Example: c = regressionLinearComponent(Name="Linear")

    Example: c.Name = "LinearRegression"

    Data Types: char | string

    Inputs: Names of the input ports, specified as a character vector, string array, or cell array of character vectors. If UseWeights is true, the software adds the input port "Weights" to Inputs.

    Example: c = regressionLinearComponent(Inputs=["X","Y"])

    Example: c.Inputs = ["X1","Y1"]

    Data Types: char | string | cell

    Outputs: Names of the output ports, specified as a character vector, string array, or cell array of character vectors.

    Example: c = regressionLinearComponent(Outputs=["Responses","LossVal"])

    Example: c.Outputs = ["X","Y"]

    Data Types: char | string | cell

    InputTags: Tags that enable the automatic connection of the component inputs with other components or pipelines, specified as a nonnegative integer vector. If you specify InputTags, then the number of tags must match the number of inputs in Inputs. If UseWeights is true, the software adds a third input tag to InputTags.

    Example: c = regressionLinearComponent(InputTags=[0 1])

    Example: c.InputTags = [1 0]

    Data Types: single | double

    OutputTags: Tags that enable the automatic connection of the component outputs with other components or pipelines, specified as a nonnegative integer vector. If you specify OutputTags, then the number of tags must match the number of outputs in Outputs.

    Example: c = regressionLinearComponent(OutputTags=[0 1])

    Example: c.OutputTags = [1 2]

    Data Types: single | double

    This property is read-only.

    HasLearnables: Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the component contains Learnables.

    Data Types: logical

    This property is read-only.

    HasLearned: Indicator showing the learning status of the component, returned as 0 (false) or 1 (true). A value of 1 indicates that the learn object function has been applied to the component and the Learnables are nonempty.

    Data Types: logical

    Learnables

    The software sets learnables when you use the learn object function. You cannot modify learnables directly.

    This property is read-only.

    TrainedModel: Trained model, returned as a RegressionLinear model object.

    Object Functions

    learn       Initialize and evaluate pipeline or component
    run         Execute pipeline or component for inference after learning
    reset       Reset pipeline or component
    series      Connect components in series to create pipeline
    parallel    Connect components or pipelines in parallel to create pipeline
    view        View diagram of pipeline inputs, outputs, components, and connections

    Examples

    Create a regressionLinearComponent component.

    component = regressionLinearComponent
    component = 
    
      regressionLinearComponent with properties:
    
                Name: "RegressionLinear"
              Inputs: ["Predictors"    "Response"]
           InputTags: [1 2]
             Outputs: ["Predictions"    "Loss"]
          OutputTags: [1 0]
    
       
    Learnables (HasLearned = false)
        TrainedModel: []
    
       
    Structural Parameters (locked)
          UseWeights: 0
    
    
    Show all parameters

    component is a regressionLinearComponent object that contains one learnable, TrainedModel. This property remains empty until you pass data to the component during the learn phase.

    To use a least squares linear regression model, set the Learner property of the component to "leastsquares".

    component.Learner = "leastsquares";

    Load the carsmall data set and remove missing entries from the data. Separate the predictor and response variables into two tables.

    load carsmall
    carData = table(Cylinders,Displacement,Horsepower,Weight,MPG);
    R = rmmissing(carData);
    X = R(:,["Cylinders","Displacement","Horsepower","Weight"]);
    Y = R(:,"MPG");

    Train the regressionLinearComponent.

    component = learn(component,X,Y)
    component = 
      regressionLinearComponent with properties:
    
                Name: "RegressionLinear"
              Inputs: ["Predictors"    "Response"]
           InputTags: [1 2]
             Outputs: ["Predictions"    "Loss"]
          OutputTags: [1 0]
    
       
    Learnables (HasLearned = true)
        TrainedModel: [1×1 RegressionLinear]
    
       
    Structural Parameters (locked)
          UseWeights: 0
    
       
    Learn Parameters (locked)
             Learner: 'leastsquares'
    
    
    Show all parameters
    

    Note that the HasLearned property is set to true, which indicates that the software trained the linear model TrainedModel. You can use component to predict response values for new data using the run function.

    Version History

    Introduced in R2026a
