perObservationLoss

Per observation classification error of model for incremental learning

Since R2022a

Syntax

Err = perObservationLoss(Mdl,X,Y)

Err = perObservationLoss(Mdl,X,Y,Name=Value)

Description

Err = perObservationLoss(Mdl,X,Y) returns the per observation classification error for the model Mdl, computed using the predictor data in X and the true labels in Y.

Err is an n-by-1 vector, where n is the number of observations.


Err = perObservationLoss(Mdl,X,Y,Name=Value) specifies additional options using one or more Name=Value arguments.

Examples


Compute per Observation Loss for Incremental Classification Model


Load the human activity data set. Randomly shuffle the data.

load humanactivity
n = numel(actid);
rng(1); % For reproducibility
idx = randsample(n,n);
X = feat(idx,:);
Y = actid(idx);

For details on the data set, enter Description at the command line.

Responses can be one of five classes: Sitting, Standing, Walking, Running, or Dancing. Dichotomize the response by identifying whether the subject is moving (actid > 2).

Y = Y > 2;

Create an incremental linear SVM model for binary classification. Configure it for loss by specifying the class names, prior class distribution (uniform), and arbitrary coefficient and bias values. Specify a metrics window size of 1000 observations.

p = size(X,2);
Beta = randn(p,1);
Bias = randn(1);
Mdl = incrementalClassificationLinear('Beta',Beta,'Bias',Bias,...
    'ClassNames',unique(Y),'Prior','uniform','MetricsWindowSize',1000,'Metrics','classiferror');

Mdl is an incrementalClassificationLinear model. All its properties are read-only.

Set the number of observations in each chunk of the data stream, and preallocate variables to store the performance metrics.

numObsPerChunk = 50;
nchunk = floor(n/numObsPerChunk);
L = zeros(nchunk,1); % To store loss values
PoL = zeros(nchunk,numObsPerChunk); % To store per observation loss values

Simulate a data stream with incoming chunks of 50 observations each. At each iteration:

Call updateMetricsAndFit to update the performance metrics and fit the model to the incoming data. Overwrite the previous incremental model with the new one.
Call loss to measure the model performance on the incoming data, and call perObservationLoss to compute the classification error for each observation in the chunk. Store both performance metrics.

for j = 1:nchunk
    ibegin = min(n,numObsPerChunk*(j-1) + 1);
    iend   = min(n,numObsPerChunk*j);
    idx = ibegin:iend;    
    Mdl = updateMetricsAndFit(Mdl,X(idx,:),Y(idx));
    L(j) = loss(Mdl,X(idx,:),Y(idx));
    PoL(j,:) = perObservationLoss(Mdl,X(idx,:),Y(idx));
end

perObservationLoss computes the loss for each observation in each chunk of data after the warmup period, that is, after the IsWarm property is 1 (true). PoL is an nchunk-by-numObsPerChunk matrix, which in this example is a 481-by-50 matrix. Each row corresponds to a chunk of observations in the stream, and each column corresponds to an observation in that chunk. The default warmup period is 1000 observations, which corresponds to 20 chunks of incoming data. Hence, the first 20 rows of PoL contain only NaN values. In contrast, loss computes the classification error for each chunk of data whether the model is warm or not, so L contains a loss value for the first 20 chunks as well.
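As a rough sketch of how the stored values might be inspected afterward, the following code flags the chunks that contain only NaN values and averages the per observation losses of the remaining chunks. The small PoL matrix is a made-up stand-in for the matrix computed in the loop, and the names isColdChunk and chunkMeanLoss are illustrative only.

```matlab
% Hypothetical stand-in for PoL: 4 chunks of 5 observations each.
% The first two rows mimic chunks processed before the model is warm.
PoL = [NaN NaN NaN NaN NaN
       NaN NaN NaN NaN NaN
       0   1   0   0   1
       0   0   0   1   0];

% Flag the chunks for which every per observation loss is NaN.
isColdChunk = all(isnan(PoL),2);

% Average the per observation losses within each remaining chunk.
chunkMeanLoss = mean(PoL(~isColdChunk,:),2);
```

Assuming the per observation losses take only the values 0 and 1, as in this stand-in, each row average is the fraction of misclassified observations in that chunk.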

Input Arguments


`Mdl` — Incremental learning model
`incrementalClassificationKernel` model object | `incrementalClassificationLinear` model object | `incrementalClassificationECOC` model object | `incrementalClassificationNaiveBayes` model object

Incremental learning model, specified as an incrementalClassificationKernel, incrementalClassificationLinear, incrementalClassificationECOC, or incrementalClassificationNaiveBayes model object. You can create Mdl directly or by converting a supported, traditionally trained machine learning model using the incrementalLearner function. For more details, see the corresponding reference page.

`X` — Chunk of predictor data
floating-point matrix

Chunk of predictor data with which to compute the per observation loss, specified as a floating-point matrix of n observations and Mdl.NumPredictors predictor variables. The value of the ObservationsIn name-value argument determines the orientation of the variables and observations.

The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.

Note

perObservationLoss supports only floating-point input predictor data. If your input data includes categorical data, you must prepare an encoded version of the categorical data. Use dummyvar to convert each categorical variable to a numeric matrix of dummy variables. Then, concatenate all dummy variable matrices and any other numeric predictors. For more details, see Dummy Variables.

Data Types: single | double
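The encoding step described in the note above can be sketched as follows. The variables color and weight are hypothetical predictors invented for illustration; only dummyvar comes from the text.

```matlab
% Hypothetical raw predictors: one categorical and one numeric variable.
color  = categorical(["red";"green";"blue";"green"]);
weight = [1.5; 2.0; 0.7; 1.1];

% Convert the categorical variable to a numeric matrix of dummy variables,
% with one column per category (here: blue, green, red).
D = dummyvar(color);

% Concatenate the dummy variables with the numeric predictor to form
% a floating-point predictor matrix.
X = [D weight];
```

The number of columns in the resulting matrix (4 in this sketch) must match Mdl.NumPredictors for the model that receives the data.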

`Y` — Chunk of labels
categorical array | character array | string array | logical vector | cell array of character vectors

Chunk of labels with which to compute the per observation loss, specified as a categorical, character, or string array, logical vector, or cell array of character vectors.

The length of the observation labels Y and the number of observations in X must be equal; Y(j) is the label of observation j (row or column) in X.

For classification problems:

If Y contains a label that is not a member of Mdl.ClassNames, perObservationLoss issues an error.
The data type of Y and Mdl.ClassNames must be the same.

Data Types: char | string | cell | categorical | logical
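One way to check a chunk of labels against these two requirements before calling perObservationLoss is sketched below. The variable classNames is a hypothetical stand-in for Mdl.ClassNames, and the label values are invented for illustration.

```matlab
% Hypothetical stand-in for Mdl.ClassNames.
classNames = categorical(["Sitting";"Standing";"Walking"]);

% Hypothetical chunk of labels; "Running" is not a known class.
Y = categorical(["Walking";"Sitting";"Running"]);

% Check that the labels and the class names have the same data type.
sameType = strcmp(class(Y),class(classNames));

% Identify labels that are not members of the class names.
isKnown = ismember(Y,classNames);
unknownLabels = unique(Y(~isKnown));
```

If unknownLabels is not empty, you might remove or relabel the corresponding observations before passing the chunk to the function.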

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: ObservationsIn="columns",LossFun="hinge" specifies that the observations are in columns and the loss function is the built-in hinge loss.

`ObservationsIn` — Orientation of data in `X`
`"rows"` (default) | `"columns"`

Orientation of data in X, specified as either "rows" or "columns".

Example: ObservationsIn="columns"
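A minimal sketch of the two orientations, assuming a model configured with arbitrary coefficients in the same way as in the example on this page. The data are random and purely illustrative, and setting MetricsWarmupPeriod to 0 is a choice made for this sketch. Whether the returned values are NaN depends on whether the model is warm.

```matlab
rng(0); % For reproducibility
n = 20; % Number of observations (illustrative)
p = 3;  % Number of predictors (illustrative)
X = randn(n,p);   % Observations in rows
Y = X(:,1) > 0;   % Logical labels

% Configure a model with arbitrary coefficient and bias values.
Mdl = incrementalClassificationLinear(Beta=randn(p,1),Bias=0, ...
    ClassNames=[false; true],MetricsWarmupPeriod=0);

% Observations in rows (default orientation).
ErrRows = perObservationLoss(Mdl,X,Y);

% Same data with observations in columns.
ErrCols = perObservationLoss(Mdl,X',Y,ObservationsIn="columns");
```

Both calls describe the same n observations, so each result should contain one loss value per observation.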

`LossFun` — Loss function
`"classiferror"` | `"binodeviance"` | `"exponential"` | `"hinge"` | `"logit"` | `"quadratic"` | `"mincost"` | function handle

Loss function, specified as a built-in loss function name or function handle.

The following table lists the built-in loss function names.

Name	Description
`"binodeviance"`	Binomial deviance
`"classiferror"`	Misclassification error rate
`"exponential"`	Exponential
`"hinge"`	Hinge
`"logit"`	Logistic
`"quadratic"`	Quadratic
`"mincost"`	Minimal expected misclassification cost (for `incrementalClassificationNaiveBayes` only)

The default is "mincost" for an incrementalClassificationNaiveBayes model object and "classiferror" for all other model objects.

Note

For an incrementalClassificationECOC model, you can specify only "classiferror".

To specify a custom loss function, use function handle notation. The function must have this form:

lossval = lossfcn(C,S)

The output argument lossval is an n-by-1 floating-point vector, where n is the number of observations in X. The value in lossval(j) is the classification loss of observation j.
You specify the function name (lossfcn).
C is an n-by-K logical matrix with rows indicating the class to which the corresponding observation belongs. K is the number of distinct classes (numel(Mdl.ClassNames)), and the column order corresponds to the class order in the ClassNames property. Create C by setting C(p,q) = 1, if observation p is in class q, for each observation in the specified data. Set the other elements in row p to 0.
S is an n-by-K numeric matrix of predicted classification scores. S is similar to the Score output of predict, where rows correspond to observations in the data and the column order corresponds to the class order in the ClassNames property. S(p,q) is the classification score of observation p being classified in class q.
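The following sketch shows one possible function of this form. It is an illustrative zero-one loss, not one of the built-in losses, and the name zeroOneLoss and the small C and S matrices are made up for the example.

```matlab
% Illustrative custom loss: 0 if the true class has the highest score in
% its row, 1 otherwise. A tie that includes the true class counts as correct.
% C is an n-by-K logical matrix; S is an n-by-K numeric score matrix.
zeroOneLoss = @(C,S) double(~any(C & (S == max(S,[],2)),2));

% Hypothetical inputs for three observations and two classes.
C = logical([1 0; 0 1; 1 0]);
S = [0.9 0.1; 0.7 0.3; 0.2 0.8];

% Returns an n-by-1 vector with one loss value per observation.
lossval = zeroOneLoss(C,S);

% Possible use with an existing model Mdl and a chunk of data X and Y:
% Err = perObservationLoss(Mdl,X,Y,LossFun=zeroOneLoss);
```

In this sketch only the first observation has its highest score in the true class, so the second and third observations receive a loss of 1.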

Example: LossFun="logit"

Example: LossFun=@lossfcn

Data Types: char | string | function_handle

Version History

Introduced in R2022a
