
classificationEnsembleComponent

Pipeline component for ensemble classification

Since R2026a

    Description

    classificationEnsembleComponent is a pipeline component that creates a set of weak learner models for ensemble classification. The component uses the functionality of the fitcensemble function during the learn phase to train the ensemble classification model. The component uses the functionality of the predict and loss functions during the run phase to perform classification.

    Creation

    Description

    component = classificationEnsembleComponent creates a pipeline component for an ensemble classification model.

    component = classificationEnsembleComponent(Name=Value) sets writable Properties using one or more name-value arguments. For example, you can specify the ensemble aggregation method, the number of learning cycles, and the cost of misclassification.
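
As a minimal sketch, the three settings mentioned above can be combined in a single call. The property names Method, NumLearningCycles, and Cost are taken from the Properties section; the values are arbitrary and chosen only for illustration.

```matlab
% Create a bagged ensemble component with 100 learning cycles and an
% asymmetric two-class cost matrix (illustrative values only).
component = classificationEnsembleComponent( ...
    Method="Bag", ...
    NumLearningCycles=100, ...
    Cost=[0 1; 2 0]);

% Learn parameters stay editable with dot notation until learn is called.
component.NumLearningCycles = 200;
```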

    Properties

    Structural Parameters

    The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.

    This property is read-only after the component is created.

    UseWeights: Observation weights flag, specified as 0 (false) or 1 (true). If UseWeights is true, the component adds a third input, "Weights", to the Inputs component property and a third input tag, 3, to the InputTags component property.

    Example: c = classificationEnsembleComponent(UseWeights=1)

    Data Types: logical

    Learn Parameters

    The software sets learn parameters when you create the component. You can modify learn parameters using dot notation any time before you use the learn object function. Any unset learn parameters use the corresponding default values.

    Cost: Misclassification cost, specified as a square matrix or a structure.

    • If Cost is a square matrix, Cost(i,j) is the cost of classifying a point into class j if its true class is i.

    • If Cost is a structure S, it has two fields: S.ClassificationCosts, which contains the cost matrix; and S.ClassNames, which contains the group names and defines the class order of the rows and columns of the cost matrix.

    The default is Cost(i,j)=1 if i~=j, and Cost(i,j)=0 if i=j.

    Example: c = classificationEnsembleComponent(Cost=[0 1 2; 1 0 2; 2 2 0])

    Example: c.Cost = [0 2 1; 1 0 1; 2 1 0]

    Data Types: single | double | struct
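
The two accepted forms of Cost can be sketched in plain MATLAB. The field names ClassificationCosts and ClassNames come from the description above; the class names and cost values below are hypothetical.

```matlab
% Default cost for K classes: 0 on the diagonal, 1 everywhere else.
K = 3;
defaultCost = ones(K) - eye(K);

% Structure form. ClassNames fixes the row and column order of the matrix.
S.ClassNames = ["setosa" "versicolor" "virginica"];   % hypothetical names
S.ClassificationCosts = [0 1 2; 1 0 2; 2 2 0];

% Row index = true class, column index = predicted class, so this is the
% cost of predicting class 3 when the true class is 1.
costTrue1Pred3 = S.ClassificationCosts(1,3);
```

A structure built this way could then be assigned with c.Cost = S before calling learn.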

    FResample: Fraction of the training set to resample, specified as a positive scalar in the range (0,1]. To use FResample, set Resample to "on".

    This property is valid only when Method is a bagging or boosting aggregation method.

    Example: c = classificationEnsembleComponent(FResample=0.75)

    Example: c.FResample = 0.5

    Data Types: single | double

    Learners: Weak learners to use in the ensemble, specified as a weak learner template object, a cell array of weak learner template objects, or a value in this table.

    Value | Description | Method Setting
    "discriminant" | Discriminant analysis. For the default options, see templateDiscriminant. | Recommended for "Subspace"
    "knn" | k-nearest neighbors. For the default options, see templateKNN. | For "Subspace" only
    "tree" | Classification trees. For the default options, see templateTree. | All methods except "Subspace"

    If Learners is a built-in learner template name, the component trains each weak learner using the default values of the specified algorithm. To train weak learners using custom options, create a template object using the corresponding template function.

    If Learners is a cell array of m weak learner template objects, the component grows m learners per learning cycle.

    If the value of Method is "Subspace", the default value of Learners is "knn". If the value of Method is "Bag" or any boosting method, the default value of Learners is "tree".

    Example: c = classificationEnsembleComponent(Learners=templateTree(MaxNumSplits=5))

    Example: c.Learners = "knn"

    Data Types: char | string | cell
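
A hedged sketch of the cell array form of Learners follows. It uses templateTree and templateDiscriminant with example options; the specific option values are illustrative and not recommendations.

```matlab
% Two weak learner templates with custom options.
treeTemplate = templateTree(MaxNumSplits=5);
discTemplate = templateDiscriminant(DiscrimType="linear");

% With a cell array of m = 2 templates, each learning cycle grows two
% learners, so 50 cycles produce 100 weak learners in total.
c = classificationEnsembleComponent( ...
    Learners={treeTemplate, discTemplate}, ...
    NumLearningCycles=50);
```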

    LearnRate: Learning rate for shrinkage, specified as a numeric scalar in the range (0,1]. If you specify a learning rate below 1, the ensemble learns at a slower rate, but can converge to a better solution.

    This property is valid only when Method is "AdaBoostM1", "AdaBoostM2", "LogitBoost", "GentleBoost", or "RUSBoost".

    Example: c = classificationEnsembleComponent(LearnRate=0.1)

    Example: c.LearnRate = 0.5

    Data Types: single | double

    MarginPrecision: Margin precision used to control the convergence speed, specified as a numeric scalar in the range [0,1]. MarginPrecision affects the number of boosting iterations required for convergence.

    This property is valid only when Method is "LPBoost" or "TotalBoost".

    Example: c = classificationEnsembleComponent(MarginPrecision=0.5)

    Example: c.MarginPrecision = 0.2

    Data Types: single | double

    Method: Ensemble aggregation method, specified as a value in this table.

    Value | Method | Classification Problem Support
    "Bag" | Bootstrap aggregation | Binary and multiclass
    "Subspace" | Random subspace | Binary and multiclass
    "AdaBoostM1" | Adaptive boosting | Binary only
    "AdaBoostM2" | Adaptive boosting | Multiclass only
    "GentleBoost" | Gentle adaptive boosting | Binary only
    "LogitBoost" | Adaptive logistic regression | Binary only
    "LPBoost" | Linear programming boosting (requires Optimization Toolbox™) | Binary and multiclass
    "RobustBoost" | Robust boosting (requires Optimization Toolbox) | Binary only
    "RUSBoost" | Random undersampling boosting | Binary and multiclass
    "TotalBoost" | Totally corrective boosting (requires Optimization Toolbox) | Binary and multiclass

    The default value of Method depends on the value of Learners:

    • If Learners includes only tree learners, the default value is "LogitBoost" for binary problems and "AdaBoostM2" for multiclass problems.

    • If Learners includes both tree and discriminant analysis learners, the default value is "AdaBoostM1" for binary problems and "AdaBoostM2" for multiclass problems.

    • If Learners does not include tree learners, the default value is "Subspace".

    Example: c = classificationEnsembleComponent(Method="Bag")

    Example: c.Method = "GentleBoost"

    Data Types: char | string

    NPredToSample: Number of predictors to sample for each random subspace learner, specified as a positive integer in the range [1,P], where P is the total number of predictors in the first data argument of learn.

    This property is valid only when Method is "Subspace".

    Example: c = classificationEnsembleComponent(NPredToSample=2)

    Example: c.NPredToSample = 5

    Data Types: single | double

    NumBins: Number of bins for binning numeric predictors, specified as [] (empty) or a positive integer scalar.

    • If NumBins is empty ([]), the component does not bin any predictors.

    • If NumBins is a positive integer scalar, then the component bins every numeric predictor into at most NumBins equiprobable bins, and then grows trees on the bin indices instead of the original data.

    This property is valid only when Learners is "tree" or a template object created with templateTree.

    Example: c = classificationEnsembleComponent(NumBins=50)

    Example: c.NumBins = 20

    Data Types: single | double
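
To illustrate what binning into equiprobable bins means for NumBins, the sketch below bins one synthetic predictor at evenly spaced quantiles. It only illustrates the idea and is not the component's internal implementation; the data and variable names are made up.

```matlab
rng(0);                 % make the example reproducible
x = randn(1000,1);      % one numeric predictor (synthetic data)
numBins = 4;

% Place bin edges at evenly spaced quantiles so that each bin holds
% roughly the same number of observations.
edges = quantile(x, linspace(0,1,numBins+1));

% Replace each value with the index of the bin that contains it.
binIndex = discretize(x, edges);

% Count the observations that fall into each bin.
countsPerBin = accumarray(binIndex, 1)';
```

Trees grown on binIndex instead of x see at most numBins distinct values per predictor, which is what makes binning faster on large data sets.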

    NumLearningCycles: Number of ensemble learning cycles, specified as a positive integer or "AllPredictorCombinations".

    • If you specify a positive integer, the component trains one weak learner for each template object in Learners at every learning cycle.

    • If you specify "AllPredictorCombinations", you must set Method to "Subspace" and specify only one learner for Learners. The component trains learners for all possible combinations of predictors taken NPredToSample at a time.

    The component creates the ensemble using all trained learners and stores them in the Trained property of TrainedModel.

    Example: c = classificationEnsembleComponent(NumLearningCycles=500)

    Example: c.NumLearningCycles = "AllPredictorCombinations"

    Data Types: single | double | char | string
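
For NumLearningCycles, the number of learners implied by "AllPredictorCombinations" is a binomial coefficient. The following sketch counts and enumerates the predictor subsets for a small case; the variable names are illustrative.

```matlab
P = 4;               % number of predictors (fisheriris has four)
nPredToSample = 2;   % predictors sampled per learner

% Number of distinct predictor subsets, and therefore of learners.
numLearners = nchoosek(P, nPredToSample);

% Enumerate the subsets as rows of predictor indices.
subsets = nchoosek(1:P, nPredToSample);
```

Because this count grows quickly with P, the option is practical mainly for data sets with few predictors.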

    Prior: Prior probabilities for each class, specified as a value in this table.

    Value | Description
    "empirical" | The class prior probabilities are the class relative frequencies. The class relative frequencies are determined by the second data argument of learn.
    "uniform" | All class prior probabilities are equal to 1/K, where K is the number of classes.
    numeric vector | A numeric vector with one value for each class. Each element is a class prior probability. The component normalizes the elements such that they sum to 1.
    structure | A structure S with two fields: S.ClassNames, which contains a list of the class names; and S.ClassProbs, which contains a vector of corresponding prior probabilities. The component normalizes the elements such that they sum to 1.

    If you set UseWeights to true, the component renormalizes the weights to add up to the value of the prior probability in the respective class.

    Example: c = classificationEnsembleComponent(Prior="uniform")

    Example: c.Prior = "empirical"

    Data Types: single | double | char | string | struct
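
The Prior options reduce to simple arithmetic, sketched here with made-up values. This illustrates the normalization described above and is not code taken from the component.

```matlab
% Numeric vector option: values are normalized to sum to 1.
rawPrior = [2 1 1];
prior = rawPrior / sum(rawPrior);          % [0.5 0.25 0.25]

% "uniform" option: every class gets 1/K.
K = numel(rawPrior);
uniformPrior = repmat(1/K, 1, K);

% "empirical" option: class relative frequencies in the response.
Y = categorical(["a" "a" "b" "c" "c" "c"]);
empiricalPrior = countcats(Y) / numel(Y);  % [1/3 1/6 1/2]
```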

    RatioToSmallest: Sampling proportion with respect to the lowest represented class, specified as a positive numeric scalar or a numeric vector of positive values with length equal to the number of classes in the second data argument of learn.

    Suppose that the training data has K classes, and the lowest represented class has m observations in the training data.

    • If you specify the positive numeric scalar s, the component samples s*m observations from each class.

    • If you specify the numeric vector [s1,s2,...,sK], the component samples si*m observations from class i, i = 1,...,K.

    The default value of RatioToSmallest is ones(K,1), which specifies to sample m observations from each class.

    This property is valid only when Method is "RUSBoost".

    Example: c = classificationEnsembleComponent(RatioToSmallest=[2,1])

    Example: c.RatioToSmallest = 2

    Data Types: single | double
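
The per-class sample sizes implied by RatioToSmallest can be worked out directly. The class counts below are hypothetical.

```matlab
classCounts = [50 20 10];   % observations per class (hypothetical)
m = min(classCounts);       % size of the lowest represented class: 10

% Scalar s: the same number of observations, s*m, from every class.
sScalar = 2;
perClassScalar = repmat(sScalar*m, size(classCounts));   % [20 20 20]

% Vector [s1,...,sK]: si*m observations from class i.
sVector = [3 2 1];
perClassVector = sVector .* m;                           % [30 20 10]
```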

    Replace: Flag indicating to sample with replacement, specified as "on" or "off".

    This property is valid only when Method is "Bag" or when Resample is "on" and Method is a boosting aggregation method.

    Example: c = classificationEnsembleComponent(Replace="off")

    Example: c.Replace = "on"

    Data Types: char | string

    Resample: Flag indicating to resample, specified as "off" or "on".

    • If Method is a boosting method and Resample is "on", the component samples training observations using updated weights as the multinomial sampling probabilities. If Method is a boosting method and Resample is "off", the component reweights observations at every learning iteration.

    • If Method is "Bag", Resample must be "on". The component resamples a fraction of the training observations specified by FResample, with or without replacement based on the value of Replace.

    This property is valid only when Method is a bagging or boosting aggregation method.

    Example: c = classificationEnsembleComponent(Resample="on")

    Example: c.Resample = "off"

    Data Types: char | string
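
The interaction of Resample, FResample, and Replace can be pictured with randsample. This is a rough sketch of the three kinds of draw described above, assuming ten training observations and made-up weights; it is not how the component implements resampling.

```matlab
rng(1);                      % make the example reproducible
n = 10;                      % number of training observations
fResample = 0.5;             % fraction of the training set to resample
k = round(fResample * n);

% Bagging-style draw: uniform sampling with replacement.
bagIdx = randsample(n, k, true);

% The same draw without replacement, so no index repeats.
noReplaceIdx = randsample(n, k, false);

% Boosting-style draw: observation weights act as multinomial
% sampling probabilities.
w = (1:n)' / sum(1:n);
boostIdx = randsample(n, k, true, w);
```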

    RobustErrorGoal: Target classification error, specified as a nonnegative numeric scalar. The upper bound of possible values depends on the values of RobustMarginSigma and RobustMaxMargin, but cannot exceed 1.

    This property is valid only when Method is "RobustBoost".

    Example: c = classificationEnsembleComponent(RobustErrorGoal=0.05)

    Example: c.RobustErrorGoal = 0.15

    Data Types: single | double

    RobustMarginSigma: Distribution spread of the classification margin, specified as a positive numeric scalar.

    This property is valid only when Method is "RobustBoost".

    Example: c = classificationEnsembleComponent(RobustMarginSigma=0.05)

    Example: c.RobustMarginSigma = 0.15

    Data Types: single | double

    RobustMaxMargin: Maximal classification margin, specified as a nonnegative numeric scalar. The component minimizes the number of observations in the training data with classification margins below RobustMaxMargin.

    This property is valid only when Method is "RobustBoost".

    Example: c = classificationEnsembleComponent(RobustMaxMargin=1)

    Example: c.RobustMaxMargin = 0.5

    Data Types: single | double

    Run Parameters

    The software sets run parameters when you create the component. You can modify the run parameters using dot notation at any time. Any unset run parameters use the corresponding default values.

    LossFun: Loss function, specified as a built-in loss function name or a function handle.

    This table lists the available built-in loss functions.

    Value | Description
    "binodeviance" | Binomial deviance
    "classifcost" | Observed misclassification cost
    "classiferror" | Misclassified rate in decimal
    "exponential" | Exponential loss
    "hinge" | Hinge loss
    "logit" | Logistic loss
    "mincost" | Minimal expected misclassification cost (for classification scores that are posterior probabilities)
    "quadratic" | Quadratic loss

    To specify a custom loss function, use function handle notation. For more information on custom loss functions, see LossFun.

    Example: c = classificationEnsembleComponent(LossFun="classifcost")

    Example: c.LossFun = "hinge"

    Data Types: char | string | function_handle
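
A custom LossFun might look like the sketch below. This page does not state the required signature, so the four-argument form (C,S,W,Cost) is an assumption borrowed from the loss function for classification ensembles; check the LossFun documentation before relying on it. The handle name weightedError is hypothetical.

```matlab
% Assumed inputs (not stated on this page):
%   C     n-by-K logical matrix; C(i,k) is true if observation i is in class k
%   S     n-by-K matrix of classification scores
%   W     n-by-1 vector of observation weights
%   Cost  K-by-K misclassification cost matrix (unused in this example)
% An observation counts as correct when its true class has the top score.
weightedError = @(C,S,W,Cost) ...
    sum(W .* ~any(C & (S == max(S,[],2)), 2)) / sum(W);

% Quick check on hand-made data: observations 2 and 3 are misclassified.
C = logical([1 0; 0 1; 1 0]);
S = [0.9 0.1; 0.6 0.4; 0.2 0.8];
W = [1; 1; 1];
lossValue = weightedError(C, S, W, []);   % 2/3
```

Under the same assumption, the handle would be assigned with c.LossFun = weightedError.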

    ScoreTransform: Score transformation, specified as a built-in function name or a function handle.

    This table summarizes the available built-in score transform functions.

    Value | Description
    "doublelogit" | 1/(1 + e^(–2x))
    "invlogit" | log(x / (1 – x))
    "ismax" | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to 0
    "logit" | 1/(1 + e^(–x))
    "none" or "identity" | x (no transformation)
    "sign" | –1 for x < 0; 0 for x = 0; 1 for x > 0
    "symmetric" | 2x – 1
    "symmetricismax" | Sets the score for the class with the largest score to 1, and sets the scores for all other classes to –1
    "symmetriclogit" | 2/(1 + e^(–x)) – 1

    To specify a custom score transform function, use function handle notation. The function must accept a matrix containing the original scores and return a matrix of the same size containing the transformed scores.

    Example: c = classificationEnsembleComponent(ScoreTransform="logit")

    Example: c.ScoreTransform = "symmetric"

    Data Types: char | string | function_handle
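
Several of the built-in ScoreTransform functions can be written as anonymous functions, which also shows the shape a custom handle needs: a score matrix in, a matrix of the same size out. The handles below use the standard logistic forms and are illustrative equivalents, not the toolbox's own code.

```matlab
doublelogit    = @(x) 1 ./ (1 + exp(-2*x));
invlogit       = @(x) log(x ./ (1 - x));
logit          = @(x) 1 ./ (1 + exp(-x));
symmetric      = @(x) 2*x - 1;
symmetriclogit = @(x) 2 ./ (1 + exp(-x)) - 1;

% "ismax" analogue: 1 for the largest score in each row, 0 elsewhere
% (tied maxima would all be marked in this simplified version).
ismaxTransform = @(s) double(s == max(s,[],2));

% Apply one transform to a small score matrix.
scores = [0 2; -1 1];
transformed = logit(scores);
```

A handle of this kind can be assigned directly, for example c.ScoreTransform = logit.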

    UseObsForLearner: Option to use observations for learners, specified as a logical matrix of size N-by-T, where N is the number of observations in the first data argument of run, and T is the number of weak learners.

    When UseObsForLearner(i,j) is true, the component uses learner j to predict the class of observation i.

    Example: c = classificationEnsembleComponent(UseObsForLearner=logical([1 1; 0 1; 1 0]))

    Example: c.UseObsForLearner = logical([0 1; 1 1; 1 0])

    Data Types: logical
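
A UseObsForLearner matrix is often easiest to build by starting from all true and switching off individual entries. The sizes below are illustrative and must match your own data and trained ensemble.

```matlab
N = 3;   % observations passed to run
T = 2;   % weak learners in the trained ensemble

useObs = true(N, T);     % every learner predicts every observation
useObs(2,1) = false;     % skip learner 1 for observation 2
useObs(3,2) = false;     % skip learner 2 for observation 3

% Number of learners that contribute to each observation's prediction.
learnersPerObs = sum(useObs, 2);   % [2; 1; 1]
```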

    Component Properties

    The software sets component properties when you create the component. You can modify the component properties (excluding HasLearnables and HasLearned) using dot notation at any time. You cannot modify the HasLearnables and HasLearned properties directly.

    Name: Component identifier, specified as a character vector or string scalar.

    Example: c = classificationEnsembleComponent(Name="Ensemble")

    Example: c.Name = "EnsembleClassifier"

    Data Types: char | string

    Inputs: Names of the input ports, specified as a character vector, string array, or cell array of character vectors. If UseWeights is true, the component adds the input port "Weights" to Inputs.

    Example: c = classificationEnsembleComponent(Inputs=["X","Y"])

    Example: c.Inputs = ["In1","In2"]

    Data Types: char | string | cell

    Outputs: Names of the output ports, specified as a character vector, string array, or cell array of character vectors.

    Example: c = classificationEnsembleComponent(Outputs=["Class","ClassScore","LossVal"])

    Example: c.Outputs = ["X","Y","Z"]

    Data Types: char | string | cell

    InputTags: Tags that enable automatic connection of the component inputs with other components or pipelines, specified as a nonnegative integer vector. If you specify InputTags, the number of tags must match the number of inputs in Inputs. If UseWeights is true, the component adds a third input tag to InputTags.

    Example: c = classificationEnsembleComponent(InputTags=[1 0])

    Example: c.InputTags = [0 1]

    Data Types: single | double

    OutputTags: Tags that enable automatic connection of the component outputs with other components or pipelines, specified as a nonnegative integer vector. If you specify OutputTags, the number of tags must match the number of outputs in Outputs.

    Example: c = classificationEnsembleComponent(OutputTags=[1 0 4])

    Example: c.OutputTags = [1 2 0]

    Data Types: single | double

    This property is read-only.

    HasLearnables: Indicator for learnables, returned as 1 (true). A value of 1 indicates that the component contains Learnables.

    Data Types: logical

    This property is read-only.

    HasLearned: Indicator showing the learning status of the component, returned as 0 (false) or 1 (true). A value of 1 indicates that the learn object function has been applied to the component, and the Learnables are nonempty.

    Data Types: logical

    Learnables

    The software sets learnables when you use the learn object function. You cannot modify learnables directly.

    This property is read-only.

    TrainedModel: Trained model, returned as a CompactClassificationEnsemble model object.

    Object Functions

    Function | Description
    learn | Initialize and evaluate pipeline or component
    run | Execute pipeline or component for inference after learning
    reset | Reset pipeline or component
    series | Connect components in series to create pipeline
    parallel | Connect components or pipelines in parallel to create pipeline
    view | View diagram of pipeline inputs, outputs, components, and connections

    Examples

    Create a classificationEnsembleComponent pipeline component.

    component = classificationEnsembleComponent
    component = 
      classificationEnsembleComponent with properties:
    
                Name: "ClassificationEnsemble"
              Inputs: ["Predictors"    "Response"]
           InputTags: [1 2]
             Outputs: ["Predictions"    "Scores"    "Loss"]
          OutputTags: [1 0 0]
    
       
    Learnables (HasLearned = false)
        TrainedModel: []
    
       
    Structural Parameters (locked)
          UseWeights: 0
    
    
    Show all parameters
    

    component is a classificationEnsembleComponent object that contains one learnable, TrainedModel. This property remains empty until you pass data to the component during the learn phase.

    To use bootstrap aggregation (bagging), set the Method property of the component to "Bag".

    component.Method = "Bag";

    Read the fisheriris data set into a table. Store the predictor and response data in the tables X and Y, respectively.

    fisheriris = readtable("fisheriris.csv");
    X = fisheriris(:,1:end-1);
    Y = fisheriris(:,end);

    Use the learn object function to train the classificationEnsembleComponent using the entire data set.

    component = learn(component,X,Y)
    component = 
      classificationEnsembleComponent with properties:
    
                Name: "ClassificationEnsemble"
              Inputs: ["Predictors"    "Response"]
           InputTags: [1 2]
             Outputs: ["Predictions"    "Scores"    "Loss"]
          OutputTags: [1 0 0]
    
       
    Learnables (HasLearned = true)
        TrainedModel: [1×1 classreg.learning.classif.CompactClassificationEnsemble]
    
       
    Structural Parameters (locked)
          UseWeights: 0
    
       
    Learn Parameters (locked)
              Method: 'Bag'
    
    
    Show all parameters
    

    Note that the HasLearned property is set to true, which indicates that the software trained the ensemble model TrainedModel. You can use component to classify new data using the run object function.
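
Because TrainedModel is a CompactClassificationEnsemble object, one way to inspect the result is to call the standard predict function on it directly. The sketch below repeats the setup from this example and assumes the trained model accepts predictor data in the same table form used for learning; it does not show the run object function, whose exact calling syntax is not given on this page.

```matlab
% Repeat the setup from the example above.
component = classificationEnsembleComponent(Method="Bag");
fisheriris = readtable("fisheriris.csv");
X = fisheriris(:,1:end-1);
Y = fisheriris(:,end);
component = learn(component,X,Y);

% TrainedModel is a CompactClassificationEnsemble, so predict applies.
model = component.TrainedModel;

% Classify the first five observations (assumes table input is accepted).
[labels,scores] = predict(model,X(1:5,:));
```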

    Version History

    Introduced in R2026a