
## Create Regression Models with ARIMA Errors

### Default Regression Model with ARIMA Errors

This example shows how to apply the shorthand `regARIMA(p,D,q)` syntax to specify the regression model with ARIMA errors.

Specify the default regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}=c+{X}_{t}\beta +{u}_{t}\\ \left(1-{a}_{1}L-{a}_{2}{L}^{2}-{a}_{3}{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{2}{L}^{2}\right){\epsilon }_{t}.\end{array}$$

`Mdl = regARIMA(3,1,2)`
```
Mdl = 
  regARIMA with properties:

     Description: "ARIMA(3,1,2) Error Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
       Intercept: NaN
            Beta: [1×0]
               P: 4
               D: 1
               Q: 2
              AR: {NaN NaN NaN} at lags [1 2 3]
             SAR: {}
              MA: {NaN NaN} at lags [1 2]
             SMA: {}
        Variance: NaN
```

The software sets each parameter to `NaN`, and the innovation distribution to `Gaussian`. The AR coefficients are at lags 1 through 3, and the MA coefficients are at lags 1 and 2. The property `P` = p + D = 3 + 1 = 4. Therefore, the software requires at least four presample values to initialize the time series.
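The value of `P` is the degree of the compound polynomial formed by multiplying the AR polynomial by the differencing polynomial. The following minimal sketch checks this with base MATLAB polynomial multiplication. The coefficient values are hypothetical and serve only to make the product concrete; the degree does not depend on them.

```matlab
% Hypothetical AR coefficient values, chosen only for illustration.
a = [0.7 -0.3 0.1];           % a_1, a_2, a_3

% Polynomials in the lag operator L, stored in ascending powers of L.
arPoly   = [1, -a];           % 1 - a_1*L - a_2*L^2 - a_3*L^3
diffPoly = [1, -1];           % (1 - L), that is, D = 1

% conv multiplies two polynomials given as coefficient vectors.
compoundPoly = conv(arPoly, diffPoly);

% The degree of the product is the number of presample values required.
P = numel(compoundPoly) - 1;  % 3 + 1 = 4
```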

Pass `Mdl` into `estimate` with data to estimate the parameters set to `NaN`. By default, `Beta` is empty (displayed as `[1×0]`). If you pass a matrix of predictors (${X}_{t}$) into `estimate`, then `estimate` estimates `Beta` and infers the number of regression coefficients from the number of columns in ${X}_{t}$.

Tasks such as simulation and forecasting using `simulate` and `forecast` do not accept models with at least one `NaN` for a parameter value. Use dot notation to modify parameter values.
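As a minimal sketch of the dot-notation approach, the following code fills in every `NaN` parameter of the default model and then checks that none remains. The numeric values are illustrative assumptions (they match the fully specified example later on this page), not estimates from data, and `isFullySpecified` is a hypothetical helper variable.

```matlab
Mdl = regARIMA(3,1,2);

% Illustrative values only, not estimates from data.
Mdl.Intercept = 0;
Mdl.AR        = {0.7, -0.3, 0.1};
Mdl.MA        = {0.5, 0.2};
Mdl.Variance  = 1;

% Collect the scalar parameters and confirm that none is still NaN.
params = [Mdl.Intercept, cell2mat(Mdl.AR), cell2mat(Mdl.MA), Mdl.Variance];
isFullySpecified = ~any(isnan(params));
```

Because `Beta` is empty here, the model has no regression component. Set `Mdl.Beta` as well if you plan to supply predictor data.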

Be aware that the regression model intercept (`Intercept`) is not identifiable in regression models with integrated ARIMA errors, such as this one (D = 1). Before you pass `Mdl` to `estimate`, set `Intercept` to a value using, for example, dot notation. Otherwise, `estimate` might return a spurious estimate of `Intercept`.

### ARIMA Error Model Without an Intercept

This example shows how to specify a regression model with ARIMA errors without a regression intercept.

Specify the regression model with ARIMA(3,1,2) errors and no intercept:

$$\begin{array}{c}{y}_{t}={X}_{t}\beta +{u}_{t}\\ \left(1-{a}_{1}L-{a}_{2}{L}^{2}-{a}_{3}{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{2}{L}^{2}\right){\epsilon }_{t}.\end{array}$$

`Mdl = regARIMA('ARLags',1:3,'MALags',1:2,'D',1,'Intercept',0)`
```
Mdl = 
  regARIMA with properties:

     Description: "ARIMA(3,1,2) Error Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
       Intercept: 0
            Beta: [1×0]
               P: 4
               D: 1
               Q: 2
              AR: {NaN NaN NaN} at lags [1 2 3]
             SAR: {}
              MA: {NaN NaN} at lags [1 2]
             SMA: {}
        Variance: NaN
```

The software sets `Intercept` to 0, but all other parameters in `Mdl` are `NaN` values by default.

Because `Intercept` is not `NaN`, the software treats it as an equality constraint during estimation. In other words, if you pass `Mdl` and data into `estimate`, then `estimate` holds `Intercept` at 0 and estimates only the remaining parameters.
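The following sketch illustrates the equality constraint under stated assumptions. No data set accompanies this example, so the code first simulates a response from a hypothetical data-generating model (`TrueMdl`) with placeholder predictors, and then fits the template. The seed, sample size, and true parameter values are all arbitrary choices.

```matlab
rng(1);                       % arbitrary seed, for reproducibility only
T = 200;                      % assumed sample size
X = randn(T,2);               % placeholder predictor data

% Hypothetical data-generating model, used only to create a response.
TrueMdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6], ...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2}, ...
    'Variance',1,'D',1);
y = simulate(TrueMdl,T,'X',X);

% Template with Intercept fixed at 0; all other parameters are NaN.
Mdl = regARIMA('ARLags',1:3,'MALags',1:2,'D',1,'Intercept',0);

% Estimate the NaN-valued parameters; Intercept stays at its fixed value.
EstMdl = estimate(Mdl,y,'X',X,'Display','off');
```

After estimation, `EstMdl.Intercept` is still 0, and `EstMdl.Beta` has one coefficient per column of `X`.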

In general, if you want to use `estimate` to estimate a regression model with ARIMA errors where D > 0 or s > 0, then you must set `Intercept` to a value before estimation.

You can modify the properties of `Mdl` using dot notation.

### ARIMA Error Model with Nonconsecutive Lags

This example shows how to specify a regression model with ARIMA errors, where the nonzero AR and MA terms are at nonconsecutive lags.

Specify the regression model with ARIMA(8,1,4) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\beta +{u}_{t}\\ \left(1-{a}_{1}L-{a}_{4}{L}^{4}-{a}_{8}{L}^{8}\right)\left(1-L\right){u}_{t}=\left(1+{b}_{1}L+{b}_{4}{L}^{4}\right){\epsilon }_{t}.\end{array}$$

```
Mdl = regARIMA('ARLags',[1,4,8],'D',1,'MALags',[1,4],...
    'Intercept',0)
```
```
Mdl = 
  regARIMA with properties:

     Description: "ARIMA(8,1,4) Error Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
       Intercept: 0
            Beta: [1×0]
               P: 9
               D: 1
               Q: 4
              AR: {NaN NaN NaN} at lags [1 4 8]
             SAR: {}
              MA: {NaN NaN} at lags [1 4]
             SMA: {}
        Variance: NaN
```

The AR coefficients are at lags 1, 4, and 8, and the MA coefficients are at lags 1 and 4. The software sets the coefficients at the interim lags to 0.

Pass `Mdl` and data into `estimate`. The software estimates all parameters that have the value `NaN` and holds the coefficients at the interim lags at 0 during estimation.
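One way to see the zero constraints is to inspect the `AR` and `MA` properties directly. This sketch assumes that each property returns one cell per lag up to the largest specified lag, with zeros at the interim lags; the variable names `arCoeffs` and `maCoeffs` are illustrative.

```matlab
Mdl = regARIMA('ARLags',[1,4,8],'D',1,'MALags',[1,4], ...
    'Intercept',0);

% Convert the cell vectors of coefficients to numeric row vectors.
arCoeffs = cell2mat(Mdl.AR);   % one entry per lag, 1 through 8
maCoeffs = cell2mat(Mdl.MA);   % one entry per lag, 1 through 4

% NaN marks a coefficient to estimate; 0 marks a fixed interim lag.
arFreeLags = find(isnan(arCoeffs));
maFreeLags = find(isnan(maCoeffs));
```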

### Known Parameter Values for a Regression Model with ARIMA Errors

This example shows how to specify values for all parameters of a regression model with ARIMA errors.

Specify the regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\left[\begin{array}{l}2.5\\ -0.6\end{array}\right]+{u}_{t}\\ \left(1-0.7L+0.3{L}^{2}-0.1{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+0.5L+0.2{L}^{2}\right){\epsilon }_{t},\end{array}$$

where ${\epsilon }_{t}$ is Gaussian with unit variance.

```
Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6],...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},...
    'Variance',1,'D',1)
```
```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(3,1,2) Error Model (Gaussian Distribution)"
    Distribution: Name = "Gaussian"
       Intercept: 0
            Beta: [2.5 -0.6]
               P: 4
               D: 1
               Q: 2
              AR: {0.7 -0.3 0.1} at lags [1 2 3]
             SAR: {}
              MA: {0.5 0.2} at lags [1 2]
             SMA: {}
        Variance: 1
```

No parameter in `Mdl` is `NaN`, so there is nothing left to estimate. Because the model is fully specified, you can simulate or forecast responses by passing `Mdl` to `simulate` or `forecast`.
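As a minimal sketch of that workflow, the following code simulates one response path from the fully specified model and then forecasts beyond it. The predictor matrices `X` and `XF` are random placeholders, and the sample size, horizon, and seed are arbitrary assumptions; in practice you supply your own predictor data.

```matlab
rng(1);                        % arbitrary seed, for reproducibility only

Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6], ...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2}, ...
    'Variance',1,'D',1);

T = 100;                       % assumed in-sample length
numPeriods = 10;               % assumed forecast horizon
X  = randn(T,2);               % placeholder in-sample predictors
XF = randn(numPeriods,2);      % placeholder future predictors

% Simulate one response path that uses the in-sample predictors.
y = simulate(Mdl,T,'X',X);

% Forecast, using the simulated path and its predictors as presample data.
[yF,yMSE] = forecast(Mdl,numPeriods,'Y0',y,'X0',X,'XF',XF);
```

Here `yF` holds the forecasts and `yMSE` the corresponding forecast mean squared errors, one row per period in the horizon.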

### Regression Model with ARIMA Errors and t Innovations

This example shows how to set the innovation distribution of a regression model with ARIMA errors to a t distribution.

Specify the regression model with ARIMA(3,1,2) errors:

$$\begin{array}{c}{y}_{t}={X}_{t}\left[\begin{array}{l}2.5\\ -0.6\end{array}\right]+{u}_{t}\\ \left(1-0.7L+0.3{L}^{2}-0.1{L}^{3}\right)\left(1-L\right){u}_{t}=\left(1+0.5L+0.2{L}^{2}\right){\epsilon }_{t},\end{array}$$

where ${\epsilon }_{t}$ has a t distribution with the default degrees of freedom and unit variance.

```
Mdl = regARIMA('Intercept',0,'Beta',[2.5; -0.6],...
    'AR',{0.7, -0.3, 0.1},'MA',{0.5, 0.2},'Variance',1,...
    'Distribution','t','D',1)
```
```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(3,1,2) Error Model (t Distribution)"
    Distribution: Name = "t", DoF = NaN
       Intercept: 0
            Beta: [2.5 -0.6]
               P: 4
               D: 1
               Q: 2
              AR: {0.7 -0.3 0.1} at lags [1 2 3]
             SAR: {}
              MA: {0.5 0.2} at lags [1 2]
             SMA: {}
        Variance: 1
```

The default degrees of freedom is `NaN`. If you don't know the degrees of freedom, then you can estimate it by passing `Mdl` and the data to `estimate`.

Specify a ${t}_{10}$ distribution.

`Mdl.Distribution = struct('Name','t','DoF',10)`
```
Mdl = 
  regARIMA with properties:

     Description: "Regression with ARIMA(3,1,2) Error Model (t Distribution)"
    Distribution: Name = "t", DoF = 10
       Intercept: 0
            Beta: [2.5 -0.6]
               P: 4
               D: 1
               Q: 2
              AR: {0.7 -0.3 0.1} at lags [1 2 3]
             SAR: {}
              MA: {0.5 0.2} at lags [1 2]
             SMA: {}
        Variance: 1
```

You can simulate or forecast responses by passing `Mdl` to `simulate` or `forecast` because `Mdl` is completely specified.

In applications, such as simulation, the software normalizes the random t innovations. In other words, `Variance` overrides the theoretical variance of the t random variable (which is `DoF`/(`DoF` - 2)), but preserves the kurtosis of the distribution.
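The arithmetic behind this normalization can be sketched in a few lines of base MATLAB. This is the standard rescaling of a t random variable to a target variance, shown for `DoF` = 10 and `Variance` = 1; it illustrates the idea and is not a description of the software's internal implementation. The variable names are illustrative.

```matlab
DoF = 10;                        % degrees of freedom from the example
targetVar = 1;                   % value of the Variance property

% Variance of an unscaled t random variable (requires DoF > 2).
theoreticalVar = DoF/(DoF - 2);

% Multiplying raw t draws by this factor gives them variance targetVar.
scale = sqrt(targetVar/theoreticalVar);
scaledVar = scale^2 * theoreticalVar;

% Excess kurtosis of a t random variable (requires DoF > 4). Kurtosis is
% invariant to rescaling, so the normalized innovations keep this value.
excessKurtosis = 6/(DoF - 4);
```

For `DoF` = 10, the unscaled variance is 1.25 and the excess kurtosis is 1, so the normalized innovations have unit variance but heavier tails than Gaussian innovations.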