## Vector Autoregression (VAR) Model Creation

Econometrics Toolbox™ has a class of functions for modeling multivariate time series using a VAR model. The `varm` function creates a `varm` object that represents a VAR model. `varm` properties specify the VAR model structure, including the number of response series (dimensionality), number of autoregressive (AR) lags, and the presence of constant or time trend coefficients in the model.
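For orientation, the model form that these properties describe can be sketched as follows. The symbols are labels chosen here to line up with the `Constant` ($c$), `Trend` ($\delta$), `AR` ($\Phi_j$), `Beta` ($\beta$), and `Covariance` ($\Sigma$) properties; they are an assumption of this sketch rather than notation defined on this page.

$$y_t = c + \delta t + \sum_{j=1}^{p} \Phi_j y_{t-j} + \beta x_t + \epsilon_t, \qquad \operatorname{E}(\epsilon_t) = 0, \quad \operatorname{Cov}(\epsilon_t) = \Sigma.$$

Here $y_t$ is the `numseries`-by-1 response vector and $x_t$ is the vector of exogenous predictors, if any.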

A `varm` object can serve as a model template for estimation, in which case you must specify at least the number of response series and the degree of the AR polynomial. Optionally, you can specify values for other parameters (coefficients or innovations covariance matrix) to test hypotheses or economic theory. The `estimate` object function fits unspecified estimable parameters of the model to specified data, and returns a fully specified `varm` object. Supply a fully specified model to other `varm` object functions for further analysis.

### Create VAR Model

You can create a `varm` object using one of two syntaxes: shorthand or longhand.

The shorthand syntax is suited for the quick creation of a model, usually when the model serves as a template for estimation. The required inputs are the response series dimensionality (`numseries`) and the degree of the AR polynomial (`p`). The AR polynomial of the resulting VAR model has nonzero lags 1 through `p`. For an example, see Create and Adjust VAR Model Using Shorthand Syntax.

The longhand syntax allows for more flexibility in parameter specification than the shorthand syntax. For example, you can specify values of autoregressive coefficient matrices or which lags have nonzero coefficient matrices. Whereas the `varm` function requires the inputs `numseries` and `p` when you use the shorthand syntax, the function must be able to infer these structural characteristics from the values you supply when you use the longhand syntax. In other words, these structural characteristics are not estimable. For an example, see Create and Adjust VAR Model Using Longhand Syntax.
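As a minimal sketch of the difference between the two syntaxes, the following code builds the same VAR(3) template for two response series both ways. The variable names `MdlShort` and `MdlLong` are hypothetical, chosen for illustration.

```matlab
% Shorthand: dimensionality and AR degree are passed directly.
MdlShort = varm(2,3);

% Longhand: varm infers both from the AR cell vector
% (three 2-by-2 matrices imply numseries = 2 and p = 3).
MdlLong = varm('AR',{nan(2) nan(2) nan(2)});

% Both templates should report the same structure.
sameStructure = (MdlShort.NumSeries == MdlLong.NumSeries) && ...
                (MdlShort.P == MdlLong.P)
```

The longhand call never states `numseries` or `p`; both follow from the size and number of the matrices supplied.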

Regardless of syntax, the resulting VAR model is an object. Values of the object properties completely determine the structure of the VAR model. After creating a model, you can display it to verify its structure, and you can change parameter values by adjusting properties using dot notation (see Display and Change Model Objects).

Depending on your analysis goals, you can use one of several methods to create a model using the `varm` function.

• Fully Specified Model Object – Use this method when you know the values of all parameters of your model. That is, you do not plan to fit the model to data.

• Model Template for Unrestricted Estimation – Use this method when you know the response dimensionality and the AR polynomial degree, and you want to fit the entire model to data using `estimate`.

• Partially Specified Model Object for Restricted Estimation – Use this method when you know the response dimensionality, AR polynomial degree, as well as some of the parameter values. For example:

• You know the values of some AR coefficient matrices or you want to test hypotheses.

• You want to exclude some lags from an equation.

• You want to exclude some exogenous predictor variables from an equation.

To estimate any unknown parameter values, pass the model object and data to `estimate`, which applies equality constraints to all known parameters at their specified values during optimization.

• Model objects with a regression component for exogenous variables:

• If you plan to estimate a multivariate model containing an unrestricted regression component, specify the structure of the model, except the regression component, when you create the model. Then, specify the model and exogenous data (`'X'` name-value pair argument) when you call `estimate`. Consequently, `estimate` includes an appropriately sized regression coefficient matrix in the model, and estimates it. `estimate` includes all exogenous variables in the regression component of each response equation by default.

• If you plan to specify equality constraints in the regression coefficient matrix for estimation, or you want to fully specify the matrix, use the longhand syntax and the `'Beta'` name-value pair argument to specify the matrix when you create the model. Alternatively, after creating the model, you can specify the `Beta` model property by using dot notation. For example, to exclude an exogenous variable from an equation, set the coefficient element corresponding to the variable (column) and equation (row) to `0`.

`varm` objects do not store data. Instead, you specify data when you operate on a model by using an object function.
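The following sketch illustrates the unrestricted regression case described above, under the assumption of randomly generated placeholder data (`Y` and `X` are hypothetical and carry no economic meaning). The template is created without a regression component, and the data enters only in the call to `estimate`.

```matlab
rng(1);                       % reproducible placeholder data
Y = randn(100,2);             % hypothetical responses: 100 observations, 2 series
X = randn(100,1);             % hypothetical exogenous predictor

Mdl = varm(2,1);              % template has no regression component yet
sizeBefore = size(Mdl.Beta);  % numseries-by-0

% Data is supplied here; estimate sizes Beta from X and fits it.
EstMdl = estimate(Mdl,Y,'X',X);
sizeAfter = size(EstMdl.Beta);
```

Comparing `sizeBefore` with `sizeAfter` shows that the column dimension of `Beta` comes from the supplied predictor data, not from the template.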

### Fully Specified Model Object

If you know the values of all model coefficients and the innovations covariance matrix, create a model object and specify the parameter values using the longhand syntax. This table describes the name-value pair arguments you can pass to the `varm` function for known parameter values in a `numseries`-dimensional VAR(`p`) model.

| Name | Value |
| --- | --- |
| `'Constant'` | A `numseries`-D vector of model constants or intercepts. The default is a `NaN` vector. |
| `'Lags'` | A numeric vector of autoregressive polynomial lags. The largest lag determines `p`, the degree of the VAR model. If you specify `Lags`, then each element corresponds to the elements of `AR`. The default value is `1:p`. |
| `'AR'` | A cell vector of `numseries`-by-`numseries` AR coefficient matrices. Cells correspond to the elements of `Lags`. By default, all cells contain matrices composed of `NaN` values. |
| `'Trend'` | A `numseries`-D vector of linear time trends. The default is a vector of zeros. |
| `'Beta'` | A `numseries`-by-`numpreds` matrix of regression coefficients. The default is empty, which means no regression component exists in the model. |
| `'Covariance'` | A `numseries`-by-`numseries` symmetric, positive definite innovations covariance matrix. The default is a matrix composed of `NaN` values. |

You can also create a model object using the shorthand syntax, and then adjust corresponding property values (except `Lags`) using dot notation.

The `Lags` name-value pair argument allows you to specify which lags you want to include. For example, to specify AR lags 1 and 3 without lag 2, set `'Lags'` to `[1 3]`. Although this syntax specifies only two lags, `p` is `3` because the largest lag determines the degree of the model.
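A short sketch of the `Lags` case, assuming a two-series model with unknown coefficients (the dimension is an arbitrary choice for illustration):

```matlab
% Two coefficient matrices, assigned to lags 1 and 3 (lag 2 is excluded).
Mdl = varm('AR',{nan(2) nan(2)},'Lags',[1 3]);

% The degree is the largest lag, not the number of matrices supplied.
Mdl.P
```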

The following example shows how to create a model object when you have known parameters. Consider the VAR(1) model

$$y_t = \left[\begin{array}{c} 0.05 \\ 0 \\ -0.05 \end{array}\right] + \left[\begin{array}{ccc} 0.5 & 0 & 0 \\ 0.1 & 0.1 & 0.3 \\ 0 & 0.2 & 0.3 \end{array}\right] y_{t-1} + \epsilon_t.$$

The independent disturbances $\epsilon_t$ are distributed as standard 3-D normal random variables.

This code shows how to create a model object using `varm`.

```
c = [0.05; 0; -0.05];
AR = {[.5 0 0;.1 .1 .3;0 .2 .3]};
Covariance = eye(3);
Mdl = varm('Constant',c,'AR',AR,'Covariance',Covariance)
```

```
Mdl = 
  varm with properties:

     Description: "AR-Stationary 3-Dimensional VAR(1) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 1
        Constant: [0.05 0 -0.05]'
              AR: {3×3 matrix} at lag [1]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 diagonal matrix]
```

The object display shows property values. The `varm` function identifies this model as a stationary VAR(1) model with three dimensions, additive constants, no time trend, and no regression component.
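The "AR-Stationary" label in the description can be cross-checked by hand. For a VAR(1) model, the standard condition is that every eigenvalue of the lag 1 coefficient matrix lies strictly inside the unit circle. The sketch below applies that textbook check with base MATLAB only; it is an independent verification, not a description of how `varm` determines the label internally.

```matlab
Phi = [.5 0 0; .1 .1 .3; 0 .2 .3];   % AR coefficient matrix from the example
maxModulus = max(abs(eig(Phi)));     % largest eigenvalue modulus
isStationary = maxModulus < 1        % true when the VAR(1) process is stationary
```

Because the first row of `Phi` is `[0.5 0 0]`, one eigenvalue is exactly 0.5, and the remaining two are smaller in modulus.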

### Model Template for Unrestricted Estimation

The easiest way to create a multivariate model template for estimation is by using the shorthand syntax. For example, to create a VAR(`2`) model template for `3` response series by using `varm` and its shorthand syntax, enter this code.

```
numseries = 3;
p = 2;
Mdl = varm(numseries,p);
```

`Mdl` represents a VAR(`2`) model containing unknown, estimable parameters, including the constant vector and `3`-by-`3` lag coefficient matrices from lags 1 through `2`.

`NaN` elements in the arrays of the model properties indicate estimable parameters. The `Beta` property can be a `numseries`-by-0 array and can be estimable; `estimate` infers its column dimension from specified exogenous data. When you use the shorthand syntax, `varm` sets the constant vector, all autoregressive coefficient matrices, and the innovations covariance matrix to appropriately sized arrays of `NaN`s.

To display the VAR(`2`) model template `Mdl` and see which parameters are estimable, enter this code.

```
Mdl
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(2) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 2
        Constant: [3×1 vector of NaNs]
              AR: {3×3 matrices of NaNs} at lags [1 2]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]
```

`Mdl.Trend` is a vector of zeros, which indicates that the linear time trend is not a model parameter.

To specify model characteristics that are different from the defaults, use the longhand syntax or adjust writable properties of an existing model by using dot notation. For example, this code shows how to create a model containing a linear time-trend term, with an estimable coefficient, by using the longhand syntax.

```
AR = cell(p,1);
AR(:) = {nan(numseries)}; % varm can infer response dimension and AR degree from AR
MdlLT = varm('AR',AR,'Trend',nan(numseries,1));
```

This code shows how to adjust the model `Mdl` to include an estimable linear time-trend term.

```
Mdl.Trend = nan(numseries,1);
```

`estimate` fits all unspecified parameters, including the model constant vector, autoregressive coefficient matrices, regression coefficient matrix, linear time-trend vector, and innovations covariance matrix.
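As a hedged end-to-end sketch, the following code fits an all-`NaN` template to data and confirms that the result is fully specified. The data is simulated from the VAR(1) model in Fully Specified Model Object purely so that the example is self-contained; `TrueMdl`, the sample size of 500, and the seed are arbitrary assumptions.

```matlab
rng(0);                                  % reproducible simulation
TrueMdl = varm('Constant',[0.05; 0; -0.05], ...
    'AR',{[.5 0 0; .1 .1 .3; 0 .2 .3]},'Covariance',eye(3));
Y = simulate(TrueMdl,500);               % 500-by-3 simulated response matrix

Mdl = varm(3,1);                         % template: every parameter is NaN
EstMdl = estimate(Mdl,Y);                % fitted varm object

% No NaNs should remain in the fitted parameters.
anyUnknown = any(isnan(EstMdl.Constant)) || ...
             any(isnan(EstMdl.AR{1}(:))) || ...
             any(isnan(EstMdl.Covariance(:)))
```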

### Partially Specified Model Object for Restricted Estimation

You can create a model object with some known parameters to test hypotheses about their values. `estimate` treats the known values as equality constraints during estimation, and fits the remaining unknown parameters to the data. All VAR model coefficients can contain a mix of `NaN` and valid real numbers, but the innovations covariance matrix must be completely unknown (composed entirely of `NaN`s) or completely known (a positive definite matrix).

This code shows how to specify the model in Fully Specified Model Object, except that the AR coefficient matrix is diagonal with unknown (estimable) diagonal elements, and the innovations covariance matrix is unknown. `varm` infers the dimensionality of the response variable from the parameters `c` and `AR`, and infers the degree of the VAR model from `AR`.

```
c = [.05; 0; -.05];
AR = {diag(nan(3,1))};
Mdl = varm('Constant',c,'AR',AR)
Mdl.AR{:}
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(1) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 1
        Constant: [0.05 0 -0.05]'
              AR: {3×3 matrix} at lag [1]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]

ans =

   NaN     0     0
     0   NaN     0
     0     0   NaN
```
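To see the equality constraints in action, the sketch below fits the partially specified model to data and checks that the known values survive estimation unchanged. The data is simulated from a hypothetical diagonal VAR(1) model (`TrueMdl`), chosen only to make the example runnable.

```matlab
rng(0);                                   % reproducible simulation
TrueMdl = varm('Constant',[0.05; 0; -0.05], ...
    'AR',{diag([.5 .1 .3])},'Covariance',eye(3));
Y = simulate(TrueMdl,500);                % hypothetical response data

c = [.05; 0; -.05];
Mdl = varm('Constant',c,'AR',{diag(nan(3,1))});
EstMdl = estimate(Mdl,Y);

% Known values act as equality constraints: the constant is unchanged
% and the off-diagonal AR elements stay at zero.
offDiag = EstMdl.AR{1}(~eye(3));
constraintsHeld = isequal(EstMdl.Constant,c) && all(offDiag == 0)
```

Only the three diagonal AR elements and the innovations covariance matrix are estimated in this setup.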

### Display and Change Model Objects

Suppose the variable name of a model object is `Mdl`. After you create `Mdl`, you can examine it in several ways:

• Enter `Mdl` at the MATLAB® command line.

• Double-click the object in the MATLAB Workspace browser.

• Enter `Mdl.PropertyName` at the MATLAB command line, where `PropertyName` is the name of the property you want to examine or reassign.

You can change any writable property of a model object using dot notation:

`Mdl.PropertyName = value;`
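For instance, this sketch relabels the response series of a template by assigning to the `SeriesNames` property. The labels are hypothetical placeholders.

```matlab
Mdl = varm(3,2);
Mdl.SeriesNames = ["GDP" "CPI" "UNRATE"];   % hypothetical series labels
Mdl.SeriesNames                             % read the property back
```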

#### Display Model Object

Create a VAR(2) model object for three response variables. Use the shorthand syntax.

```
numseries = 3;
p = 2;
Mdl = varm(numseries,p);
```

Display the VAR(2) model.

```
Mdl
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(2) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 2
        Constant: [3×1 vector of NaNs]
              AR: {3×3 matrices of NaNs} at lags [1 2]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]
```

`Mdl` is a `varm` model object. Its properties (left) and corresponding values (right) are listed at the command line.

The coefficients included in the model are the model constant vector `Constant` and the autoregressive polynomial coefficient matrices `AR` at lags 1 and 2. Their corresponding property values are appropriately sized arrays of `NaN`s, which indicates that the values are unknown but estimable. Similarly, the innovations covariance matrix `Covariance` is a `NaN` matrix, so it is also unknown but estimable.

By default, the linear time-trend vector `Trend` is composed of zeros, and the regression coefficient matrix `Beta` has a column dimension of zero. If you supply exogenous data when you estimate `Mdl` by using `estimate`, MATLAB® infers the column dimension of `Beta` from the specified data, sets `Beta` to a matrix of `NaN`s, and estimates it. Otherwise, MATLAB® ignores the regression component of the model.

#### Adjust Property of Existing Model

This example shows how to exclude the first lag from the AR polynomial of a VAR(2) model.

Create a VAR(2) model template that represents three response variables. Use the shorthand syntax.

```
numseries = 3;
p = 2;
Mdl = varm(numseries,p)
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(2) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 2
        Constant: [3×1 vector of NaNs]
              AR: {3×3 matrices of NaNs} at lags [1 2]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]
```

The `AR` property of `Mdl` stores the AR polynomial coefficient matrices in a cell array. The first cell contains the lag 1 coefficient matrix, and the second cell contains the lag 2 coefficient matrix.

Set the lag 1 AR coefficient to a matrix of zeros by using dot notation. Display the updated model.

```
Mdl.AR{1} = zeros(numseries);
Mdl
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(2) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 2
        Constant: [3×1 vector of NaNs]
              AR: {3×3 matrix} at lag [2]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]
```

The lag 1 coefficient is removed from the AR polynomial of the model.

#### Select Exogenous Variables for Response Equations

This example shows how to choose which exogenous variables occur in the regression component of a VARX(4) model.

Create a VAR(4) model template that represents three response variables. Use the shorthand syntax.

```
numseries = 3;
p = 4;
Mdl = varm(numseries,p)
```

```
Mdl = 
  varm with properties:

     Description: "3-Dimensional VAR(4) Model"
     SeriesNames: "Y1"  "Y2"  "Y3" 
       NumSeries: 3
               P: 4
        Constant: [3×1 vector of NaNs]
              AR: {3×3 matrices of NaNs} at lags [1 2 3 ... and 1 more]
           Trend: [3×1 vector of zeros]
            Beta: [3×0 matrix]
      Covariance: [3×3 matrix of NaNs]
```

The `Beta` property contains the model regression coefficient matrix, a `3`-by-`0` matrix. Because it has `0` columns, `Mdl` does not have a regression component.

Assume the following:

• You plan to include two exogenous variables in the regression component of `Mdl` to make it a VARX(4) model.

• Your exogenous data is in the matrix `X`, which is not loaded in memory.

• You want to include exogenous variable 1 (stored in `X(:,1)`) in all response equations, and exclude exogenous variable 2 (stored in `X(:,2)`) from the response variable equations 2 and 3.

• You plan to fit `Mdl` to data.

Set the regression coefficient to a matrix of `NaN`s. Then, set the elements corresponding to excluded exogenous variables to zero.

```
numpreds = 2;
Mdl.Beta = nan(numseries,numpreds);
Mdl.Beta(2:3,2) = 0;
Mdl.Beta
```

```
ans = 3×2

   NaN   NaN
   NaN     0
   NaN     0
```

During estimation, `estimate` fits all estimable parameters (NaN-valued elements) to the data while applying these equality constraints during optimization:

• ${\beta }_{22}=0.$

• ${\beta }_{32}=0.$
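The following sketch carries the example through estimation to show the two constraints holding. Because `X` is not loaded in the example above, the code substitutes randomly generated placeholder matrices for `Y` and `X`; these are assumptions for illustration only.

```matlab
rng(2);                          % reproducible placeholder data
Y = randn(200,3);                % hypothetical responses
X = randn(200,2);                % hypothetical exogenous predictors

Mdl = varm(3,4);
Mdl.Beta = nan(3,2);
Mdl.Beta(2:3,2) = 0;             % exclude predictor 2 from equations 2 and 3

EstMdl = estimate(Mdl,Y,'X',X);
EstMdl.Beta                      % column 2, rows 2 and 3 remain zero
```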

### Select Appropriate Lag Order

A goal of time series model development is to identify a lag order p yielding a model that represents the data-generating process well and produces reliable forecasts. These functions help identify an appropriate lag order:

• `lratiotest` performs a likelihood ratio test to compare specifications of nested models by assessing the significance of restrictions to an extended model with unrestricted parameters. In context, the lag order of the restricted model is less than the lag order of the unrestricted model.

• `aicbic` returns the Akaike and Bayesian information criteria (AIC and BIC, respectively) given loglikelihoods and active parameter counts of fitted candidate models. `aicbic` does not conduct a statistical hypothesis test. The model that yields the minimum fit statistic has the best, parsimonious fit among the candidate models.
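For reference, the quantities behind these two functions can be written in their standard textbook form. The notation is an assumption of this sketch: $\ell$ is a maximized loglikelihood, $k$ is the active parameter count, $n$ is the sample size, and the subscripts $U$ and $R$ mark the unrestricted and restricted models. The reference pages for `lratiotest` and `aicbic` remain the authority on each function's exact conventions.

$$LR = 2\left(\ell_U - \ell_R\right) \overset{a}{\sim} \chi^2_{\text{dof}}, \qquad \text{AIC} = -2\ell + 2k, \qquad \text{BIC} = -2\ell + k\log n.$$

A large $LR$ statistic relative to the $\chi^2$ critical value is evidence against the restricted model, whereas AIC and BIC are simply ranked across candidates, with smaller values preferred.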

#### Determine Minimal Number of Lags Using Likelihood Ratio Test

`lratiotest` requires inputs of the loglikelihood of an unrestricted model, the loglikelihood of a restricted model, and the number of degrees of freedom (DoF). DoF is the difference between the active parameter counts of the unrestricted and restricted models. The lag order of the restricted model is less than the lag order of the unrestricted model.

`lratiotest` returns a logical value: `1` means reject the restricted model in favor of the unrestricted model, and `0` means insufficient evidence exists to reject the restricted model.

To conduct a likelihood ratio test:

1. Obtain the loglikelihood of the restricted and unrestricted models when you fit them to data using `estimate`. The loglikelihood is the third output (`logL`).

`[EstMdl,EstSE,logL,E] = estimate(...)`

2. Obtain the active parameter count of each estimated model (`numparams`) from the `NumEstimatedParameters` field in the output structure of `summarize`.

```
results = summarize(EstMdl);
numparams = results.NumEstimatedParameters;
```

3. Conduct a likelihood ratio test, with 5% level of significance, by passing the following to `lratiotest`: the loglikelihood of the unrestricted model `logLU`, the loglikelihood of the restricted model `logLR`, and the DoF (`dof`).

`h = lratiotest(logLU,logLR,dof)`

For example, suppose you fit four models: model 1 has a lag order of 1, model 2 has a lag order of 2, and so on. The models have loglikelihoods `logL1`, `logL2`, `logL3`, and `logL4`, and active parameter counts `numparams1`, `numparams2`, `numparams3`, and `numparams4`. Conduct likelihood ratio tests of models 1, 2, and 3 against model 4, as follows:

```
h1 = lratiotest(logL4,logL1,(numparams4 - numparams1))
h2 = lratiotest(logL4,logL2,(numparams4 - numparams2))
h3 = lratiotest(logL4,logL3,(numparams4 - numparams3))
```

If `h1` = `1`, reject model 1; proceed in the same way for models 2 and 3. If `lratiotest` returns `0`, insufficient evidence exists to reject the model with a lag order lower than 4.
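The steps above can be sketched end to end as follows. The data is simulated from a hypothetical VAR(1) model so that the code is self-contained. One design choice deserves attention: the first four observations are held out as a common presample (`Y0`), so that every candidate model is fit to the same estimation sample and the loglikelihoods are comparable.

```matlab
rng(3);                                   % reproducible simulation
TrueMdl = varm('Constant',[0.05; 0; -0.05], ...
    'AR',{[.5 0 0; .1 .1 .3; 0 .2 .3]},'Covariance',eye(3));
Yall = simulate(TrueMdl,304);             % hypothetical data set

maxP = 4;
Y0 = Yall(1:maxP,:);                      % common presample for all fits
Y  = Yall(maxP+1:end,:);                  % common estimation sample

logL = zeros(maxP,1);
numparams = zeros(maxP,1);
for p = 1:maxP
    [EstMdl,~,logL(p)] = estimate(varm(3,p),Y,'Y0',Y0);
    results = summarize(EstMdl);
    numparams(p) = results.NumEstimatedParameters;
end

% Test each lower-order model against the VAR(4) model.
h = false(maxP-1,1);
for p = 1:maxP-1
    dof = numparams(maxP) - numparams(p);
    h(p) = lratiotest(logL(maxP),logL(p),dof);
end
h
```

Each additional lag in a three-series model adds one 3-by-3 coefficient matrix, so consecutive parameter counts differ by 9.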

#### Determine Minimal Number of Lags Using Information Criterion

You can obtain the AIC and BIC fit statistics in two ways:

• Pass an estimated model to `summarize`, and extract the appropriate fit statistic from the output structure.

1. Estimate a model using `estimate`.

`EstMdl = estimate(...);`

2. Obtain the AIC and BIC of the estimated model from the `AIC` and `BIC` fields of the output structure `results`.

```
results = summarize(EstMdl);
aic = results.AIC;
bic = results.BIC;
```

• Use `aicbic`, which requires the loglikelihood of a candidate model and its active parameter count. `aicbic` also accepts a vector of loglikelihoods and a vector of corresponding active parameter counts, enabling you to compare multiple model fits using one function call.

1. Obtain the loglikelihood of each candidate model when you fit each model to data using `estimate`. The loglikelihood is the third output (`logL`).

`[EstMdl,EstSE,logL,E] = estimate(...)`

2. Obtain the active parameter count of each candidate model (`numparams`) from the `NumEstimatedParameters` field in the output structure of `summarize`.

```
results = summarize(EstMdl);
numparams = results.NumEstimatedParameters;
```

For example, suppose you fit four models: model 1 has a lag order of 1, model 2 has a lag order of 2, and so on. The models have loglikelihoods `logL1`, `logL2`, `logL3`, and `logL4`, and active parameter counts `numparams1`, `numparams2`, `numparams3`, and `numparams4`. Calculate the AIC of each model.

```
AIC = aicbic([logL1 logL2 logL3 logL4],...
    [numparams1 numparams2 numparams3 numparams4])
```

The most suitable model has the lowest value of the AIC.
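A compact sketch of the whole selection loop, under the same assumptions as any simulated example: the data comes from a hypothetical VAR(1) model, and a common presample (`Y0`) keeps the estimation sample identical across candidates so that the criteria are comparable.

```matlab
rng(3);                                   % reproducible simulation
TrueMdl = varm('Constant',[0.05; 0; -0.05], ...
    'AR',{[.5 0 0; .1 .1 .3; 0 .2 .3]},'Covariance',eye(3));
Yall = simulate(TrueMdl,304);             % hypothetical data set

maxP = 4;
Y0 = Yall(1:maxP,:);                      % common presample
Y  = Yall(maxP+1:end,:);                  % common estimation sample

logL = zeros(maxP,1);
numparams = zeros(maxP,1);
for p = 1:maxP
    [EstMdl,~,logL(p)] = estimate(varm(3,p),Y,'Y0',Y0);
    results = summarize(EstMdl);
    numparams(p) = results.NumEstimatedParameters;
end

aic = aicbic(logL,numparams);             % one AIC value per candidate
[~,bestP] = min(aic)                      % lag order with the lowest AIC
```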