# loss

Regression loss for Gaussian kernel regression model

## Syntax

``L = loss(Mdl,X,Y)``
``L = loss(Mdl,Tbl,ResponseVarName)``
``L = loss(Mdl,Tbl,Y)``
``L = loss(___,Name,Value)``

## Description


`L = loss(Mdl,X,Y)` returns the mean squared error (MSE) for the Gaussian kernel regression model `Mdl` using the predictor data in `X` and the corresponding responses in `Y`.

`L = loss(Mdl,Tbl,ResponseVarName)` returns the MSE for the model `Mdl` using the predictor data in `Tbl` and the true responses in `Tbl.ResponseVarName`.

`L = loss(Mdl,Tbl,Y)` returns the MSE for the model `Mdl` using the predictor data in table `Tbl` and the true responses in `Y`.


`L = loss(___,Name,Value)` specifies options using one or more name-value pair arguments in addition to any of the input argument combinations in previous syntaxes. For example, you can specify a regression loss function and observation weights. Then, `loss` returns the weighted regression loss using the specified loss function.

Note: If the predictor data `X` or the predictor variables in `Tbl` contain any missing values, the `loss` function can return NaN. For more details, see *loss can return NaN for predictor data with missing values* in the Version History.

## Examples


### Estimate Resubstitution Loss for Tall Data

Train a Gaussian kernel regression model for a tall array, then calculate the resubstitution mean squared error and epsilon-insensitive error.

When you perform calculations on tall arrays, MATLAB® uses either a parallel pool (default if you have Parallel Computing Toolbox™) or the local MATLAB session. To run the example using the local MATLAB session when you have Parallel Computing Toolbox, change the global execution environment by using the `mapreducer` function.

`mapreducer(0)`

Create a datastore that references the folder location with the data. The data can be contained in a single file, a collection of files, or an entire folder. Treat `'NA'` values as missing data so that `datastore` replaces them with `NaN` values. Select a subset of the variables to use. Create a tall table on top of the datastore.

```matlab
varnames = {'ArrTime','DepTime','ActualElapsedTime'};
ds = datastore('airlinesmall.csv','TreatAsMissing','NA',...
    'SelectedVariableNames',varnames);
t = tall(ds);
```

Specify `DepTime` and `ArrTime` as the predictor variables (`X`) and `ActualElapsedTime` as the response variable (`Y`). Select the observations for which `ArrTime` is later than `DepTime`.

```matlab
daytime = t.ArrTime>t.DepTime;
Y = t.ActualElapsedTime(daytime);      % Response data
X = t{daytime,{'DepTime' 'ArrTime'}};  % Predictor data
```

Standardize the predictor variables.

`Z = zscore(X); % Standardize the data`

Train a default Gaussian kernel regression model with the standardized predictors. Set `'Verbose',0` to suppress diagnostic messages.

`[Mdl,FitInfo] = fitrkernel(Z,Y,'Verbose',0)`
```
Mdl = 
  RegressionKernel
            PredictorNames: {'x1'  'x2'}
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 64
               KernelScale: 1
                    Lambda: 8.5385e-06
             BoxConstraint: 1
                   Epsilon: 5.9303
```

```
FitInfo = struct with fields:
                  Solver: 'LBFGS-tall'
            LossFunction: 'epsiloninsensitive'
                  Lambda: 8.5385e-06
           BetaTolerance: 1.0000e-03
       GradientTolerance: 1.0000e-05
          ObjectiveValue: 30.7814
       GradientMagnitude: 0.0191
    RelativeChangeInBeta: 0.0228
                 FitTime: 31.5546
                 History: []
```

`Mdl` is a trained `RegressionKernel` model, and the structure array `FitInfo` contains optimization details.

Determine how well the trained model generalizes to new predictor values by estimating the resubstitution mean squared error and epsilon-insensitive error.

`lossMSE = loss(Mdl,Z,Y) % Resubstitution mean squared error`
```
lossMSE =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
```
`lossEI = loss(Mdl,Z,Y,'LossFun','epsiloninsensitive') % Resubstitution epsilon-insensitive error`
```
lossEI =

  MxNx... tall array

    ?    ?    ?    ...
    ?    ?    ?    ...
    ?    ?    ?    ...
    :    :    :
    :    :    :
```

Evaluate the tall arrays and bring the results into memory by using `gather`.

`[lossMSE,lossEI] = gather(lossMSE,lossEI)`
```
Evaluating tall expression using the Local MATLAB Session:
- Pass 1 of 1: Completed in 0.84 sec
Evaluation completed in 1 sec
```
```lossMSE = 2.8851e+03 ```
```lossEI = 28.0050 ```

### Specify Custom Regression Loss

Specify a custom regression loss (Huber loss) for a Gaussian kernel regression model.

Load the `carbig` data set.

`load carbig`

Specify the predictor variables (`X`) and the response variable (`Y`).

```matlab
X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;
```

Delete rows of `X` and `Y` where either array has `NaN` values. Removing rows with `NaN` values before passing data to `fitrkernel` can speed up training and reduce memory usage.

```matlab
R = rmmissing([X Y]);
X = R(:,1:4);
Y = R(:,end);
```

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

```matlab
rng(10)  % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp);  % Training set indices
idxTest = test(cvp);     % Test set indices
```

Standardize the training data and train the regression kernel model.

```matlab
Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
[Ztrain,tr_mu,tr_sigma] = zscore(Xtrain);  % Standardize the training data
tr_sigma(tr_sigma==0) = 1;
Mdl = fitrkernel(Ztrain,Ytrain)
```
```
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617
```

`Mdl` is a `RegressionKernel` model.

Create an anonymous function that measures Huber loss ($\delta = 1$), that is,

$$L = \frac{1}{\sum_j w_j} \sum_{j=1}^{n} w_j \ell_j,$$

where

$$\ell_j = \begin{cases} 0.5\,\hat{e}_j^{\,2}, & |\hat{e}_j| \le 1 \\ |\hat{e}_j| - 0.5, & |\hat{e}_j| > 1. \end{cases}$$

$\hat{e}_j$ is the residual for observation j. Custom loss functions must be written in a particular form. For rules on writing a custom loss function, see the `'LossFun'` name-value pair argument.

```matlab
huberloss = @(Y,Yhat,W)sum(W.*((0.5*(abs(Y-Yhat)<=1).*(Y-Yhat).^2) + ...
    ((abs(Y-Yhat)>1).*abs(Y-Yhat)-0.5)))/sum(W);
```

Estimate the training set regression loss using the Huber loss function.

`eTrain = loss(Mdl,Ztrain,Ytrain,'LossFun',huberloss)`
```eTrain = 1.7210 ```

Standardize the test data using the same mean and standard deviation of the training data columns. Estimate the test set regression loss using the Huber loss function.

```matlab
Xtest = X(idxTest,:);
Ztest = (Xtest-tr_mu)./tr_sigma;  % Standardize the test data
Ytest = Y(idxTest);
eTest = loss(Mdl,Ztest,Ytest,'LossFun',huberloss)
```
```eTest = 1.3062 ```

## Input Arguments


**`Mdl` — Kernel regression model**

Kernel regression model, specified as a `RegressionKernel` model object. You can create a `RegressionKernel` model object using `fitrkernel`.

**`X` — Predictor data**

Predictor data, specified as an n-by-p numeric matrix, where n is the number of observations and p is the number of predictors. p must be equal to the number of predictors used to train `Mdl`.

Data Types: `single` | `double`

**`Y` — Response data**

Response data, specified as an n-dimensional numeric vector. The length of `Y` must be equal to the number of observations in `X` or `Tbl`.

Data Types: `single` | `double`

**`Tbl` — Sample data**

Sample data used to train the model, specified as a table. Each row of `Tbl` corresponds to one observation, and each column corresponds to one predictor variable. Optionally, `Tbl` can contain additional columns for the response variable and observation weights. `Tbl` must contain all the predictors used to train `Mdl`. Multicolumn variables and cell arrays other than cell arrays of character vectors are not allowed.

If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName` or `Y`.

If you train `Mdl` using sample data contained in a table, then the input data for `loss` must also be in a table.

**`ResponseVarName` — Response variable name**

Response variable name, specified as the name of a variable in `Tbl`. The response variable must be a numeric vector. If `Tbl` contains the response variable used to train `Mdl`, then you do not need to specify `ResponseVarName`.

If you specify `ResponseVarName`, then you must specify it as a character vector or string scalar. For example, if the response variable is stored as `Tbl.Y`, then specify `ResponseVarName` as `'Y'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.Y`, as predictors.

Data Types: `char` | `string`

### Name-Value Arguments

Specify optional pairs of arguments as `Name1=Value1,...,NameN=ValueN`, where `Name` is the argument name and `Value` is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose `Name` in quotes.

Example: `L = loss(Mdl,X,Y,'LossFun','epsiloninsensitive','Weights',weights)` returns the weighted regression loss using the epsilon-insensitive loss function.

**`LossFun` — Loss function**

Loss function, specified as the comma-separated pair consisting of `'LossFun'` and a built-in loss function name or a function handle.

• The following table lists the available loss functions. Specify one using its corresponding character vector or string scalar. Also, in the table, $f\left(x\right)=T\left(x\right)\beta +b.$

• x is an observation (row vector) from p predictor variables.

• $T(\cdot)$ is a transformation of an observation (row vector) for feature expansion. $T(x)$ maps $x$ in $\mathbb{R}^p$ to a high-dimensional space ($\mathbb{R}^m$).

• β is a vector of m coefficients.

• b is the scalar bias.

| Value | Description |
| --- | --- |
| `'epsiloninsensitive'` | Epsilon-insensitive loss: $\ell[y,f(x)] = \max[0, \lvert y - f(x) \rvert - \varepsilon]$ |
| `'mse'` | MSE: $\ell[y,f(x)] = [y - f(x)]^2$ |

`'epsiloninsensitive'` is appropriate for SVM learners only.

• Specify your own function by using function handle notation.

Let `n` be the number of observations in `X`. Your function must have this signature:

``lossvalue = lossfun(Y,Yhat,W)``

• The output argument `lossvalue` is a scalar.

• You choose the function name (`lossfun`).

• `Y` is an n-dimensional vector of observed responses. `loss` passes the input argument `Y` into `lossfun` for `Y`.

• `Yhat` is an n-dimensional vector of predicted responses, which is similar to the output of `predict`.

• `W` is an `n`-by-1 numeric vector of observation weights.

Specify your function using `'LossFun',@lossfun`.

Data Types: `char` | `string` | `function_handle`
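As a language-agnostic illustration of this three-argument contract (a minimal Python sketch, not MATLAB code; the function name, data values, and weights are made up), a custom weighted mean absolute error following the same `(Y, Yhat, W)` signature would look like:

```python
def weighted_mae(Y, Yhat, W):
    """Custom regression loss with the (Y, Yhat, W) contract:
    observed responses, predicted responses, and observation weights
    all have length n; the result is a single scalar."""
    return sum(w * abs(y - yhat) for y, yhat, w in zip(Y, Yhat, W)) / sum(W)

# Three observations with equal weights: residuals 0.5, 0.0, 1.0
print(weighted_mae([1.0, 2.0, 3.0], [1.5, 2.0, 2.0], [1.0, 1.0, 1.0]))  # (0.5+0+1)/3 = 0.5
```

In MATLAB, the equivalent would be an anonymous function (as in the Huber loss example above) passed via `'LossFun',@lossfun`.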

Observation weights, specified as the comma-separated pair consisting of `'Weights'` and a numeric vector or the name of a variable in `Tbl`.

• If `Weights` is a numeric vector, then the size of `Weights` must be equal to the number of rows in `X` or `Tbl`.

• If `Weights` is the name of a variable in `Tbl`, you must specify `Weights` as a character vector or string scalar. For example, if the weights are stored as `Tbl.W`, then specify `Weights` as `'W'`. Otherwise, the software treats all columns of `Tbl`, including `Tbl.W`, as predictors.

If you supply the observation weights, `loss` computes the weighted regression loss, that is, the Weighted Mean Squared Error or Epsilon-Insensitive Loss Function.

`loss` normalizes `Weights` to sum to 1.

Data Types: `double` | `single` | `char` | `string`

## Output Arguments

collapse all

**`L` — Regression loss**

Regression loss, returned as a numeric scalar. The interpretation of `L` depends on `Weights` and `LossFun`. For example, if you use the default observation weights and specify `'epsiloninsensitive'` as the loss function, then `L` is the epsilon-insensitive loss.

## More About

### Weighted Mean Squared Error

The weighted mean squared error is calculated as follows:

$$\mathrm{mse} = \frac{\sum_{j=1}^{n} w_j \left( f(x_j) - y_j \right)^2}{\sum_{j=1}^{n} w_j},$$

where:

• n is the number of observations.

• $x_j$ is the jth observation (row of predictor data).

• $y_j$ is the observed response to $x_j$.

• $f(x_j)$ is the response prediction of the Gaussian kernel regression model `Mdl` to $x_j$.

• w is the vector of observation weights.

Each observation weight in w is equal to `ones(n,1)/n` by default. You can specify different values for the observation weights by using the `'Weights'` name-value pair argument. `loss` normalizes `Weights` to sum to 1.
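The computation itself is compact. Here is a minimal Python sketch of the weighted MSE formula (illustrative data only, not MathWorks code); note that dividing by `sum(w)` is equivalent to first normalizing the weights to sum to 1:

```python
def weighted_mse(y, yhat, w):
    """Weighted mean squared error: sum_j w_j*(f(x_j) - y_j)^2 / sum_j w_j."""
    num = sum(wj * (fj - yj) ** 2 for wj, fj, yj in zip(w, yhat, y))
    return num / sum(w)

y = [1.0, 2.0, 4.0]     # observed responses y_j
yhat = [1.5, 2.0, 3.0]  # model predictions f(x_j)
w = [1.0, 1.0, 2.0]     # observation weights w_j
print(weighted_mse(y, yhat, w))  # (1*0.25 + 1*0 + 2*1) / 4 = 0.5625
```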

### Epsilon-Insensitive Loss Function

The epsilon-insensitive loss function ignores errors that are within the distance epsilon (ε) of the function value. The function is formally described as:

$$\mathrm{Loss}_\varepsilon = \begin{cases} 0, & \text{if } |y - f(x)| \le \varepsilon \\ |y - f(x)| - \varepsilon, & \text{otherwise}. \end{cases}$$

The mean epsilon-insensitive loss is calculated as follows:

$$\mathrm{Loss} = \frac{\sum_{j=1}^{n} w_j \max\left(0, |y_j - f(x_j)| - \varepsilon\right)}{\sum_{j=1}^{n} w_j},$$

where:

• n is the number of observations.

• $x_j$ is the jth observation (row of predictor data).

• $y_j$ is the observed response to $x_j$.

• $f(x_j)$ is the response prediction of the Gaussian kernel regression model `Mdl` to $x_j$.

• w is the vector of observation weights.

Each observation weight in w is equal to `ones(n,1)/n` by default. You can specify different values for the observation weights by using the `'Weights'` name-value pair argument. `loss` normalizes `Weights` to sum to 1.
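The same formula can be sketched in a few lines of Python (the data values and the epsilon value here are illustrative, not tied to any trained model): residuals within ε contribute nothing, and larger residuals contribute their excess over ε, averaged by the weights.

```python
def epsilon_insensitive_loss(y, yhat, w, epsilon):
    """Mean epsilon-insensitive loss: residuals within epsilon count as 0;
    larger residuals contribute |y_j - f(x_j)| - epsilon, weight-averaged."""
    num = sum(wj * max(0.0, abs(yj - fj) - epsilon)
              for wj, yj, fj in zip(w, y, yhat))
    return num / sum(w)

y = [1.0, 2.0, 4.0]
yhat = [1.2, 2.0, 2.5]  # residuals: 0.2, 0.0, 1.5
w = [1.0, 1.0, 1.0]
print(epsilon_insensitive_loss(y, yhat, w, epsilon=0.5))  # only 1.5 > 0.5: (1.5-0.5)/3
```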

## Version History

Introduced in R2018a


**loss can return NaN for predictor data with missing values**

Behavior changed in R2022a