# predict

Predict responses for Gaussian kernel regression model

## Syntax

``YFit = predict(Mdl,X)``
``YFit = predict(Mdl,X,PredictionForMissingValue=prediction)``

## Description


`YFit = predict(Mdl,X)` returns a vector of predicted responses for the predictor data in the matrix or table `X`, based on the Gaussian kernel regression model `Mdl`.

`YFit = predict(Mdl,X,PredictionForMissingValue=prediction)` uses the `prediction` value as the predicted response for observations with missing values in the predictor data `X`. By default, `predict` uses the median of the observed response values in the training data. (since R2023b)

## Examples


Predict the test set responses using a Gaussian kernel regression model for the `carbig` data set.

Load the `carbig` data set.

`load carbig`

Specify the predictor variables (`X`) and the response variable (`Y`).

```
X = [Weight,Cylinders,Horsepower,Model_Year];
Y = MPG;
```

Delete rows of `X` and `Y` where either array has `NaN` values. Removing rows with `NaN` values before passing data to `fitrkernel` can speed up training and reduce memory usage.

```
R = rmmissing([X Y]);
X = R(:,1:4);
Y = R(:,end);
```

Reserve 10% of the observations as a holdout sample. Extract the training and test indices from the partition definition.

```
rng(10) % For reproducibility
N = length(Y);
cvp = cvpartition(N,'Holdout',0.1);
idxTrn = training(cvp); % Training set indices
idxTest = test(cvp); % Test set indices
```

Train the regression kernel model. Standardize the training data.

```
Xtrain = X(idxTrn,:);
Ytrain = Y(idxTrn);
Mdl = fitrkernel(Xtrain,Ytrain,'Standardize',true)
```

```
Mdl = 
  RegressionKernel
              ResponseName: 'Y'
                   Learner: 'svm'
    NumExpansionDimensions: 128
               KernelScale: 1
                    Lambda: 0.0028
             BoxConstraint: 1
                   Epsilon: 0.8617
```

`Mdl` is a `RegressionKernel` model.

Predict responses for the test set.

```
Xtest = X(idxTest,:);
Ytest = Y(idxTest);
YFit = predict(Mdl,Xtest);
```

Create a table containing the first 10 observed response values and predicted response values.

```
table(Ytest(1:10),YFit(1:10),'VariableNames', ...
    {'ObservedValue','PredictedValue'})
```

```
ans=10×2 table
    ObservedValue    PredictedValue
    _____________    ______________

         18              17.616
         14              25.799
         24              24.141
         25              25.018
         14              13.637
         14              14.557
         18              18.584
         27              26.096
         21              25.031
         13              13.324
```

Estimate the test set regression loss using the mean squared error loss function.

```
L = loss(Mdl,Xtest,Ytest)
```

```
L = 9.2664
```
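As a cross-check, the reported loss can be recomputed directly from the predictions. The following sketch assumes the default behavior of `loss` (mean squared error with equal observation weights). It repeats the setup of this example so that it runs on its own, and `mseManual` is an illustrative variable name, not part of the original example.

```matlab
% Rebuild the data and the model exactly as in the example above.
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);
X = R(:,1:4);
Y = R(:,end);

rng(10) % For reproducibility
cvp = cvpartition(length(Y),'Holdout',0.1);
Mdl = fitrkernel(X(training(cvp),:),Y(training(cvp)),'Standardize',true);

Xtest = X(test(cvp),:);
Ytest = Y(test(cvp));
YFit = predict(Mdl,Xtest);

% Loss reported by the model (assumed default: mean squared error).
L = loss(Mdl,Xtest,Ytest);

% The same quantity computed by hand from the predicted responses.
mseManual = mean((Ytest - YFit).^2);
```

Under these assumptions, `L` and `mseManual` should agree up to floating-point rounding.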

## Input Arguments


`Mdl`: Kernel regression model, specified as a `RegressionKernel` model object. You can create a `RegressionKernel` model object using `fitrkernel`.

`X`: Predictor data used to generate responses, specified as a numeric matrix or table.

Each row of `X` corresponds to one observation, and each column corresponds to one variable.

• For a numeric matrix:

  • The variables in the columns of `X` must have the same order as the predictor variables that trained `Mdl`.

  • If you trained `Mdl` using a table (for example, `Tbl`) and `Tbl` contains all numeric predictor variables, then `X` can be a numeric matrix. To treat numeric predictors in `Tbl` as categorical during training, identify categorical predictors using the `CategoricalPredictors` name-value pair argument of `fitrkernel`. If `Tbl` contains heterogeneous predictor variables (for example, numeric and categorical data types) and `X` is a numeric matrix, then `predict` throws an error.

• For a table:

  • `predict` does not support multicolumn variables or cell arrays other than cell arrays of character vectors.

  • If you trained `Mdl` using a table (for example, `Tbl`), then all predictor variables in `X` must have the same variable names and data types as those that trained `Mdl` (stored in `Mdl.PredictorNames`). However, the column order of `X` does not need to correspond to the column order of `Tbl`. Also, `Tbl` and `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.

  • If you trained `Mdl` using a numeric matrix, then the predictor names in `Mdl.PredictorNames` and corresponding predictor variable names in `X` must be the same. To specify predictor names during training, see the `PredictorNames` name-value pair argument of `fitrkernel`. All predictor variables in `X` must be numeric vectors. `X` can contain additional variables (response variables, observation weights, and so on), but `predict` ignores them.

Data Types: `double` | `single` | `table`
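The table rules for `X` can be illustrated with a short sketch. It assumes the `carbig` data set and a model trained with the `fitrkernel(Tbl,ResponseVarName)` syntax. The names `Tbl`, `Xreordered`, `YFitFull`, and `YFitReordered` are illustrative only.

```matlab
% Build a training table with two numeric predictors and the response.
load carbig
Tbl = rmmissing(table(Weight,Horsepower,MPG));

rng(1) % For reproducibility of the random feature expansion
Mdl = fitrkernel(Tbl,'MPG');

% Pass the full table: the response variable MPG is present,
% and predict is documented to ignore it.
YFitFull = predict(Mdl,Tbl(1:5,:));

% Pass only the predictors, in a different column order than in Tbl.
% Variables are matched by name, so the order should not matter.
Xreordered = Tbl(1:5,{'Horsepower','Weight'});
YFitReordered = predict(Mdl,Xreordered);
```

If the variables are matched by name as described, both calls should return the same predicted responses.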

Since R2023b

`PredictionForMissingValue`: Predicted response value to use for observations with missing predictor values, specified as `"median"`, `"mean"`, or a numeric scalar.

| Value | Description |
| --- | --- |
| `"median"` | `predict` uses the median of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
| `"mean"` | `predict` uses the mean of the observed response values in the training data as the predicted response value for observations with missing predictor values. |
| Numeric scalar | `predict` uses this value as the predicted response value for observations with missing predictor values. |

Example: `"mean"`

Example: `NaN`

Data Types: `single` | `double` | `char` | `string`
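The following sketch compares the three kinds of `PredictionForMissingValue` settings on a small batch with one artificially introduced missing value. It assumes R2023b or later and the `carbig` data set. The names `Xnew`, `YMedian`, `YMean`, and `YNaN` are illustrative.

```matlab
% Train a model on complete observations only.
load carbig
R = rmmissing([Weight,Cylinders,Horsepower,Model_Year,MPG]);

rng(1) % For reproducibility
Mdl = fitrkernel(R(:,1:4),R(:,end),'Standardize',true);

% Take three observations and introduce a missing predictor value
% in the second one.
Xnew = R(1:3,1:4);
Xnew(2,3) = NaN;

% Default behavior ("median" of the training responses).
YMedian = predict(Mdl,Xnew);

% Use the mean of the training responses instead.
YMean = predict(Mdl,Xnew,PredictionForMissingValue="mean");

% Propagate the missing value to the output.
YNaN = predict(Mdl,Xnew,PredictionForMissingValue=NaN);
```

Only the second element should differ among the three outputs, because the first and third observations have no missing predictor values.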

## Output Arguments


`YFit`: Predicted responses, returned as a numeric vector.

`YFit` is an n-by-1 vector of the same data type as the response data (`Y`) used to train `Mdl`, where n is the number of observations in `X`.

## Version History

Introduced in R2018a
