forecast

Forecast responses from Bayesian vector autoregression (VAR) model

Since R2020a

Syntax

YF = forecast(PriorMdl,numperiods,Y)

[YF,YFStd] = forecast(PriorMdl,numperiods,Y)

Description

forecast is well suited for computing out-of-sample unconditional forecasts of a Bayesian VAR(p) model that does not contain an exogenous regression component. For advanced applications, such as out-of-sample conditional forecasting, VARX(p) model forecasting, missing value imputation, and Gibbs sampler specification for posterior predictive distribution estimation, see simsmooth.

YF = forecast(PriorMdl,numperiods,Y) returns a path of forecasted responses YF over a forecast horizon of length numperiods. Each period in YF is the mean of the posterior predictive distribution, which is derived from the posterior distribution obtained by updating the prior Bayesian VAR(p) model PriorMdl with the response data Y. The output YF represents the continuation of Y.

NaNs in the data indicate missing values, which forecast removes using list-wise deletion.
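As a rough illustration of list-wise deletion, the following sketch removes every row that contains at least one NaN from a small, made-up response matrix. The data values are hypothetical, and the sketch only mimics the row removal; it does not reproduce what forecast does internally.

```matlab
% Hypothetical 5-by-2 response matrix; rows are observations.
Y = [1.0 2.0; NaN 3.0; 4.0 5.0; 6.0 NaN; 7.0 8.0];

% List-wise deletion: drop any row with at least one missing value.
hasMissing = any(isnan(Y),2);
YClean = Y(~hasMissing,:);

% For a numeric matrix, rmmissing performs the same row-wise removal.
YCheck = rmmissing(Y);
disp(isequal(YClean,YCheck))
```

Because interior rows are dropped, the remaining observations are no longer evenly spaced in time, so it can be worth inspecting where the NaNs fall before forecasting.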

[YF,YFStd] = forecast(PriorMdl,numperiods,Y) also returns the corresponding standard deviations of the posterior predictive distribution YFStd.

Examples

Forecast Responses from Posterior Predictive Distribution

Consider the 3-D VAR(4) model for the US inflation (INFL), unemployment (UNRATE), and federal funds (FEDFUNDS) rates.

$[\begin{matrix} \text{INFL}_{t} \\ \text{UNRATE}_{t} \\ \text{FEDFUNDS}_{t} \end{matrix}] = c + \sum_{j = 1}^{4} Φ_{j} [\begin{matrix} \text{INFL}_{t - j} \\ \text{UNRATE}_{t - j} \\ \text{FEDFUNDS}_{t - j} \end{matrix}] + [\begin{matrix} ε_{1, t} \\ ε_{2, t} \\ ε_{3, t} \end{matrix}] .$

For all $t$, the innovations $ε_{t}$ are independent 3-D normal random vectors with a mean of 0 and covariance $Σ$. Assume a diffuse prior distribution for the parameters $({[Φ_{1}, \ldots, Φ_{4}, c]}^{'}, Σ)$.

Load and Preprocess Data

Load the US macroeconomic data set. Compute the inflation rate, stabilize the unemployment and federal funds rates by taking first differences, and remove missing values. The model is fit to the differenced series DUNRATE and DFEDFUNDS.

load Data_USEconModel
seriesnames = ["INFL" "UNRATE" "FEDFUNDS"];
DataTimeTable.INFL = 100*[NaN; price2ret(DataTimeTable.CPIAUCSL)];
DataTimeTable.DUNRATE = [NaN; diff(DataTimeTable.UNRATE)];
DataTimeTable.DFEDFUNDS = [NaN; diff(DataTimeTable.FEDFUNDS)];
seriesnames(2:3) = "D" + seriesnames(2:3);
rmDataTimeTable = rmmissing(DataTimeTable);

Create Prior Model

Create a diffuse prior model. Specify the response series names.

numseries = numel(seriesnames);
numlags = 4;

PriorMdl = bayesvarm(numseries,numlags,'SeriesNames',seriesnames);

Forecast Responses

Directly forecast two years (eight quarters) of observations from the posterior predictive distribution. forecast estimates the posterior distribution of the parameters, and then forms the posterior predictive distribution.

numperiods = 8;
YF = forecast(PriorMdl,numperiods,rmDataTimeTable{:,seriesnames});

YF is an 8-by-3 matrix of forecasted responses.

Plot the forecasted responses.

fh = rmDataTimeTable.Time(end) + calquarters(0:8);
tiledlayout(3,1)
for j = 1:PriorMdl.NumSeries
    nexttile
    plot(rmDataTimeTable.Time(end - 20:end),rmDataTimeTable{end - 20:end,seriesnames(j)},'r',...
        fh,[rmDataTimeTable{end,seriesnames(j)}; YF(:,j)],'b');
    legend("Observed","Forecasted",'Location','NorthWest')
    title(seriesnames(j))
end

Estimate Standard Deviations of Posterior Predictive Distribution

Consider the 3-D VAR(4) model in the example Forecast Responses from Posterior Predictive Distribution.

Load the US macroeconomic data set. Compute the inflation rate, stabilize the unemployment and federal funds rates by taking first differences, and remove missing values.

load Data_USEconModel
seriesnames = ["INFL" "UNRATE" "FEDFUNDS"];
DataTimeTable.INFL = 100*[NaN; price2ret(DataTimeTable.CPIAUCSL)];
DataTimeTable.DUNRATE = [NaN; diff(DataTimeTable.UNRATE)];
DataTimeTable.DFEDFUNDS = [NaN; diff(DataTimeTable.FEDFUNDS)];
seriesnames(2:3) = "D" + seriesnames(2:3);
rmDataTimeTable = rmmissing(DataTimeTable);

Create a diffuse prior model. Specify the response series names.

numseries = numel(seriesnames);
numlags = 4;

PriorMdl = bayesvarm(numseries,numlags,'SeriesNames',seriesnames);

Directly forecast two years (eight quarters) of response observations from the posterior predictive distribution. Return the posterior standard deviations.

numperiods = 8;
[YF,YStd] = forecast(PriorMdl,numperiods,rmDataTimeTable{:,seriesnames});

YF and YStd are 8-by-3 matrices of forecasted responses and corresponding standard deviations, respectively.

Plot the forecasted responses and approximate 95% credible intervals.

fh = rmDataTimeTable.Time(end) + calquarters(0:8);
for j = 1:PriorMdl.NumSeries
    subplot(3,1,j)
    plot(rmDataTimeTable.Time(end - 20:end),rmDataTimeTable{end - 20:end,seriesnames(j)},'r',...
        fh,[rmDataTimeTable{end,seriesnames(j)}; YF(:,j)],'b',...
        fh,[rmDataTimeTable{end,seriesnames(j)}; YF(:,j) + 1.96*YStd(:,j)],'b--',...
        fh,[rmDataTimeTable{end,seriesnames(j)}; YF(:,j) - 1.96*YStd(:,j)],'b--');
    legend("Observed","Forecasted","Approximate 95% Credible Interval",'Location','NorthWest')
    title(seriesnames(j))
end

Input Arguments

`PriorMdl` — Prior Bayesian VAR model
`conjugatebvarm` model object | `semiconjugatebvarm` model object | `diffusebvarm` model object | `normalbvarm` model object

Prior Bayesian VAR model, specified as a model object in this table.

Model Object	Description
`conjugatebvarm`	Dependent, matrix-normal-inverse-Wishart conjugate model returned by `bayesvarm`, `conjugatebvarm`, or `estimate`
`semiconjugatebvarm`	Independent, normal-inverse-Wishart semiconjugate prior model returned by `bayesvarm` or `semiconjugatebvarm`
`diffusebvarm`	Diffuse prior model returned by `bayesvarm` or `diffusebvarm`
`normalbvarm`	Normal conjugate model with a fixed innovations covariance matrix, returned by `bayesvarm`, `normalbvarm`, or `estimate`

`numperiods` — Forecast horizon
positive integer

Forecast horizon, or the number of time points in the forecast period, specified as a positive integer.

Data Types: double

`Y` — Presample and estimation sample multivariate response series
numeric matrix

Presample and estimation sample multivariate response series, specified as a (numlags + numobs)-by-numseries numeric matrix.

Rows correspond to observations, and the last row contains the latest observation. forecast uses the first numlags = PriorMdl.P observations as a presample to initialize the prior model PriorMdl for posterior estimation. forecast estimates the posterior using the remaining numobs observations and PriorMdl.

numseries is the number of response variables PriorMdl.NumSeries. Columns correspond to individual response variables PriorMdl.SeriesNames.

For more details, see Algorithms.

Data Types: double
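The row layout of Y can be pictured with a minimal sketch that splits a synthetic matrix into its presample and estimation sample parts. The sizes and the random data are assumptions chosen for illustration; forecast performs this partition itself, so you pass the full matrix Y.

```matlab
% Hypothetical sizes: a VAR(4) in three series with 20 estimation observations.
numlags = 4;
numseries = 3;
numobs = 20;

rng(1)  % For reproducibility of the synthetic data
Y = randn(numlags + numobs,numseries);

% The first numlags rows initialize the model; the rest drive estimation.
Y0 = Y(1:numlags,:);          % presample, numlags-by-numseries
YEst = Y(numlags + 1:end,:);  % estimation sample, numobs-by-numseries

disp(size(Y0))
disp(size(YEst))
```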

Output Arguments

`YF` — Path of multivariate response series forecasts
numeric matrix

Path of multivariate response series forecasts, returned as a numperiods-by-numseries numeric matrix. YF is the mean of the posterior predictive distribution of each period in the forecast horizon.

YF represents the continuation of the response series Y. Rows correspond to observations; row j is the j-period-ahead forecast. Columns correspond to the columns in Y.
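Because YF carries no time stamps, it can help to attach forecast dates to its rows. The sketch below does so for a quarterly series; the placeholder forecast values, the date of the last observation, and the variable names are assumptions for illustration only.

```matlab
% Placeholder forecasts standing in for the output of forecast.
numperiods = 8;
seriesnames = ["INFL" "DUNRATE" "DFEDFUNDS"];
YF = zeros(numperiods,numel(seriesnames));

% Hypothetical date of the last observation in Y.
lastObs = datetime(2009,3,31);

% Row j of YF is the j-period-ahead forecast, so it is dated j quarters later.
fhTimes = lastObs + calquarters((1:numperiods)');
YFTable = array2timetable(YF,'RowTimes',fhTimes,'VariableNames',seriesnames);

disp(head(YFTable,3))
```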

`YFStd` — Forecast standard deviations
numeric matrix

Forecast standard deviations, returned as a numperiods-by-numseries numeric matrix. YFStd is the standard deviation of the posterior predictive distribution of each period in the forecast horizon. Dimensions correspond to the dimensions of YF.

More About

Bayesian Vector Autoregression (VAR) Model

A Bayesian VAR model treats all coefficients and the innovations covariance matrix as random variables in the m-dimensional, stationary VARX(p) model. The model has one of the three forms described in this table.

Model	Equation
Reduced-form VAR(p) in difference-equation notation	$y_{t} = Φ_{1} y_{t - 1} + ... + Φ_{p} y_{t - p} + c + δ t + Β x_{t} + ε_{t} .$
Multivariate regression	$y_{t} = Z_{t} λ + ε_{t} .$
Matrix regression	$y_{t} = Λ^{'} z_{t}^{'} + ε_{t} .$

For each time t = 1,...,T:

y_t is the m-dimensional observed response vector, where m = numseries.
Φ₁,…,Φ_p are the m-by-m AR coefficient matrices of lags 1 through p, where p = numlags.
c is the m-by-1 vector of model constants if IncludeConstant is true.
δ is the m-by-1 vector of linear time trend coefficients if IncludeTrend is true.
Β is the m-by-r matrix of regression coefficients of the r-by-1 vector of observed exogenous predictors x_t, where r = NumPredictors. All predictor variables appear in each equation.
$z_{t} = [\begin{matrix} y_{t - 1}^{'} & y_{t - 2}^{'} & \dots & y_{t - p}^{'} & 1 & t & x_{t}^{'} \end{matrix}],$ which is a 1-by-(mp + r + 2) vector, and Z_t is the m-by-m(mp + r + 2) block diagonal matrix

$[\begin{matrix} z_{t} & 0_{z} & \dots & 0_{z} \\ 0_{z} & z_{t} & \dots & 0_{z} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0_{z} & 0_{z} & \dots & z_{t} \end{matrix}],$
where 0_z is a 1-by-(mp + r + 2) vector of zeros.
$Λ = {[\begin{matrix} Φ_{1} & Φ_{2} & \dots & Φ_{p} & c & δ & Β \end{matrix}]}^{'}$ , which is an (mp + r + 2)-by-m random matrix of the coefficients, and the m(mp + r + 2)-by-1 vector λ = vec(Λ).
ε_t is an m-by-1 vector of random, serially uncorrelated, multivariate normal innovations with the zero vector for the mean and the m-by-m matrix Σ for the covariance. This assumption implies that the data likelihood is

$ℓ (Λ, Σ | y, x) = \prod_{t = 1}^{T} f (y_{t}; Λ, Σ, z_{t}),$
where f is the m-dimensional multivariate normal density with mean z_tΛ and covariance Σ, evaluated at y_t.
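To see that the three forms describe the same conditional mean, the following sketch builds z_t, Z_t, Λ, and λ for small, arbitrary dimensions and compares the results numerically. All values are randomly generated for illustration; the variable names are hypothetical and are not part of the toolbox.

```matlab
% Small, arbitrary dimensions: m series, p lags, r predictors.
m = 2; p = 2; r = 1;
k = m*p + r + 2;   % length of z_t

rng(1)  % For reproducibility of the synthetic values
Phi1 = randn(m); Phi2 = randn(m);
c = randn(m,1); delta = randn(m,1); Beta = randn(m,r);
yLag1 = randn(m,1); yLag2 = randn(m,1);
t = 5; xt = randn(r,1);

% Reduced-form VAR(p) conditional mean.
meanReduced = Phi1*yLag1 + Phi2*yLag2 + c + delta*t + Beta*xt;

% Matrix regression form: Lambda is k-by-m and z_t is 1-by-k.
Lambda = [Phi1 Phi2 c delta Beta]';
zt = [yLag1' yLag2' 1 t xt'];
meanMatrix = Lambda'*zt';

% Multivariate regression form: Z_t is block diagonal, lambda = vec(Lambda).
Zt = kron(eye(m),zt);   % m-by-m*k
lambda = Lambda(:);
meanMultivariate = Zt*lambda;

% Both differences should be zero up to rounding error.
disp(norm(meanReduced - meanMatrix))
disp(norm(meanReduced - meanMultivariate))
```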

Before considering the data, you impose a joint prior distribution assumption on (Λ,Σ), which is governed by the distribution π(Λ,Σ). In a Bayesian analysis, the distribution of the parameters is updated with information about the parameters obtained from the data likelihood. The result is the joint posterior distribution π(Λ,Σ|Y,X,Y₀), where:

Y is a T-by-m matrix containing the entire response series {y_t}, t = 1,…,T.
X is a T-by-r matrix containing the entire exogenous series {x_t}, t = 1,…,T.
Y₀ is a p-by-m matrix of presample data used to initialize the VAR model for estimation.

Posterior Predictive Distribution

The posterior predictive distribution of a posterior Bayesian VAR(p) model, π(Y_f|Y,X), is the distribution of the next τ future response variables after the final observation in the estimation sample, Y_f = [y_{T+1}, y_{T+2},…,y_{T+τ}]. It is conditioned on the data and obtained by integrating out (marginalizing over) the coefficients Λ and the innovations covariance matrix Σ. The quantities involved are:

Presample and estimation sample response data Y
Coefficients Λ
Innovations covariance matrix Σ
Estimation and future sample exogenous data X

Symbolically,

$π (Y_{f} | Y, X) = \int π (Y_{f} | Y, X, Λ, Σ) π (Λ, Σ | Y, X) d Λ d Σ .$
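When the integral has no closed form, one standard way to approximate it is by Monte Carlo averaging over posterior draws. The following is a generic sketch of that idea under the assumption that N draws $(Λ^{(i)}, Σ^{(i)})$ from $π (Λ, Σ | Y, X)$ are available; it is not a description of the exact sampler that forecast uses.

$π (Y_{f} | Y, X) \approx \frac{1}{N} \sum_{i = 1}^{N} π (Y_{f} | Y, X, Λ^{(i)}, Σ^{(i)}) .$

Under the same assumption, if $Y_{f}^{(i)}$ is a future path simulated from $π (Y_{f} | Y, X, Λ^{(i)}, Σ^{(i)})$, then the posterior predictive mean is approximated by the sample average

$E (Y_{f} | Y, X) \approx \frac{1}{N} \sum_{i = 1}^{N} Y_{f}^{(i)} .$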

Tips

Monte Carlo simulation is subject to variation. If forecast uses Monte Carlo simulation, then estimates and inferences might vary when you call forecast multiple times under seemingly equivalent conditions. To reproduce estimation results, set a random number seed by using rng before calling forecast.
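A minimal sketch of this tip follows. It uses synthetic data and a semiconjugate prior, chosen on the assumption that its posterior predictive distribution requires simulation; the data, dimensions, and seed values are arbitrary. The sketch requires Econometrics Toolbox.

```matlab
% Synthetic response data: 60 observations of 2 series (illustration only).
rng(0)
Y = randn(60,2);

% Semiconjugate prior VAR(1) model for two series.
PriorMdl = bayesvarm(2,1,'ModelType','semiconjugate');

% Reset the seed to the same value before each call.
rng(1)
YF1 = forecast(PriorMdl,4,Y);
rng(1)
YF2 = forecast(PriorMdl,4,Y);

% With identical seeds, the two forecast paths should match.
disp(isequal(YF1,YF2))
```

Without the second call to rng, the two forecast paths would typically differ slightly because each call consumes random numbers.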

Algorithms

- If the posterior predictive distribution is analytically intractable (true for most cases), forecast implements Markov chain Monte Carlo (MCMC) sampling with Bayesian data augmentation to compute the mean and standard deviation of the posterior predictive distribution. To do so, forecast calls simsmooth, which uses a computationally intensive procedure.
- Most Econometrics Toolbox™ forecast functions accept an estimated or posterior model object from which to generate forecasts. Such a model encompasses the parametric structure and data. However, the forecast function of Bayesian VAR models requires presample and estimation sample data to do the following:
  - Perform Bayesian parameter updating to estimate posterior distributions. forecast implements MCMC sampling with Bayesian data augmentation, which includes a Kalman filter smoothing step that requires the entire observed series.
  - Predict future responses in the presence of two sources of uncertainty:
    - Estimation noise ε_1,…,ε_T, which induces parameter uncertainty
    - Forecast period noise ε_{T+1},…,ε_{T+numperiods}
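
The two sources of uncertainty can be made explicit with the law of total variance. As a sketch for a model without exogenous predictors, and writing h for a step in the forecast horizon (a symbol introduced here for illustration), the posterior predictive variance of a future response splits as

$\operatorname{Var} (y_{T + h} | Y) = E [\operatorname{Var} (y_{T + h} | Y, Λ, Σ) | Y] + \operatorname{Var} (E [y_{T + h} | Y, Λ, Σ] | Y) .$

The first term reflects forecast period noise averaged over the posterior, and the second term reflects parameter uncertainty. A forecast that plugs in fixed parameter estimates accounts only for the first term.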

Version History

Introduced in R2020a
