summarize

Distribution summary statistics of standard Bayesian linear regression model

Description

To obtain a summary of a Bayesian linear regression model for predictor selection, see summarize.

summarize(Mdl) displays a tabular summary of the random regression coefficients and disturbance variance of the standard Bayesian linear regression model Mdl at the command line. For each parameter, the summary includes the:

• Mean (expected value)

• Standard deviation (square root of the variance)

• 95% equitailed credible intervals

• Probability that the parameter is greater than 0

• Description of the distributions, if known

SummaryStatistics = summarize(Mdl) returns a structure array that stores a:

• Table containing the summary of the regression coefficients and disturbance variance

• Table containing the covariances between parameters

• Description of the joint distribution of the parameters

Examples

Consider the multiple linear regression model that predicts the US real gross national product (GNPR) using a linear combination of industrial production index (IPI), total employment (E), and real wages (WR).

${\text{GNPR}}_{t}={\beta }_{0}+{\beta }_{1}{\text{IPI}}_{t}+{\beta }_{2}{\text{E}}_{t}+{\beta }_{3}{\text{WR}}_{t}+{\epsilon }_{t}.$

For all time points $t$, ${\epsilon }_{t}$ is a series of independent Gaussian disturbances with a mean of 0 and variance ${\sigma }^{2}$.

Assume these prior distributions:

• $\beta |{\sigma }^{2}\sim {N}_{4}\left(M,{\sigma }^{2}V\right)$. $M$ is a 4-by-1 vector of means, and $V$ is a scaled 4-by-4 positive definite covariance matrix.

• ${\sigma }^{2}\sim IG\left(A,B\right)$. $A$ and $B$ are the shape and scale, respectively, of an inverse gamma distribution.

These assumptions and the data likelihood imply a normal-inverse-gamma conjugate model.

Create a normal-inverse-gamma conjugate prior model for the linear regression parameters. Specify the number of predictors p and the variable names.

p = 3;
VarNames = ["IPI" "E" "WR"];
PriorMdl = bayeslm(p,'ModelType','conjugate','VarNames',VarNames);

PriorMdl is a conjugateblm Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance.

Summarize the prior distribution.

summarize(PriorMdl)

          |  Mean     Std            CI95         Positive       Distribution
-----------------------------------------------------------------------------------
Intercept |  0      70.7107  [-141.273, 141.273]    0.500   t (0.00, 57.74^2,  6)
IPI       |  0      70.7107  [-141.273, 141.273]    0.500   t (0.00, 57.74^2,  6)
E         |  0      70.7107  [-141.273, 141.273]    0.500   t (0.00, 57.74^2,  6)
WR        |  0      70.7107  [-141.273, 141.273]    0.500   t (0.00, 57.74^2,  6)
Sigma2    | 0.5000   0.5000    [ 0.138,  1.616]     1.000   IG(3.00,    1)

The function displays a table of summary statistics and other information about the prior distribution at the command line.
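As a quick consistency check on this table (a worked calculation, not part of the function's output), note that the Distribution column lists t (0.00, 57.74^2, 6) while the Std column reports 70.7107. The two agree if the second argument is read as a squared scale rather than the variance itself. A Student's t distribution with scale $s$ and $\nu >2$ degrees of freedom has standard deviation $s\sqrt{\nu /\left(\nu -2\right)}$, so

$\text{Std}\approx 57.74\sqrt{\frac{6}{6-2}}\approx 70.7,$

which matches the reported value up to rounding of the displayed scale.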

Load the Nelson-Plosser data set and create variables for the predictor and response data.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable.GNPR;

Estimate the posterior distributions. Suppress the estimation display.

PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);

PosteriorMdl is a conjugateblm model object that contains the posterior distributions of $\beta$ and ${\sigma }^{2}$.

Obtain summary statistics from the posterior distribution.

summary = summarize(PosteriorMdl);

summary is a structure array containing three fields: MarginalDistributions, Covariances, and JointDistribution.

Display the marginal distribution summary and covariances by using dot notation.

summary.MarginalDistributions
ans=5×5 table
               Mean          Std                  CI95              Positive            Distribution
             _________    __________    ________________________    _________    __________________________

Intercept      -24.249        8.7821       -41.514       -6.9847    0.0032977    {'t (-24.25, 8.65^2, 68)'}
IPI             4.3913        0.1414        4.1134        4.6693            1    {'t (4.39, 0.14^2, 68)'  }
E            0.0011202    0.00032931    0.00047284     0.0017676      0.99952    {'t (0.00, 0.00^2, 68)'  }
WR              2.4683       0.34895        1.7822        3.1543            1    {'t (2.47, 0.34^2, 68)'  }
Sigma2          44.135         7.802        31.427        61.855            1    {'IG(34.00, 0.00069)'    }

summary.Covariances
ans=5×5 table
             Intercept         IPI             E             WR         Sigma2
             __________    ___________    ___________    ___________    ______

Intercept        77.125        0.77133     -0.0023655         0.5311         0
IPI             0.77133       0.019994    -6.5001e-06       -0.02948         0
E            -0.0023655    -6.5001e-06     1.0844e-07    -8.0013e-05         0
WR               0.5311       -0.02948    -8.0013e-05        0.12177         0
Sigma2                0              0              0              0    60.871

The MarginalDistributions field is a table of summary statistics and other information about the posterior distribution. Covariances is a table containing the covariance matrix of the parameters.
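Because both fields are ordinary MATLAB tables, standard table indexing applies. The following is a minimal sketch, assuming Econometrics Toolbox is installed; it rebuilds the example model so that it runs on its own. The variable names Summary, M, C, s, R, and RTbl are illustrative choices, not part of the summarize interface.

```matlab
% Recreate the example model (requires Econometrics Toolbox).
load Data_NelsonPlosser
PriorMdl = bayeslm(3,'ModelType','conjugate','VarNames',["IPI" "E" "WR"]);
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable.GNPR;
PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);

% Summary is an illustrative name, chosen here to avoid shadowing
% the MATLAB summary function.
Summary = summarize(PosteriorMdl);

% Pull individual entries out of the marginal summary by row name.
M = Summary.MarginalDistributions;
ipiMean = M{'IPI','Mean'};   % posterior mean of the IPI coefficient
ipiCI   = M{'IPI','CI95'};   % 1-by-2 vector: lower and upper bounds

% Convert the covariance table to a correlation matrix.
C = Summary.Covariances{:,:};   % numeric covariance matrix
s = sqrt(diag(C));              % standard deviations of the parameters
R = C ./ (s*s');                % scale each covariance by both deviations

% Reattach the row and column names for display.
RTbl = array2table(R, ...
    'RowNames',Summary.Covariances.Properties.RowNames, ...
    'VariableNames',Summary.Covariances.Properties.VariableNames);
```

The square roots of the diagonal of C should agree with the Std column of MarginalDistributions, which gives a simple cross-check between the two fields.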

Input Arguments

Mdl is a standard Bayesian linear regression model, specified as one of the model objects in this table.

Model Object | Description
conjugateblm | Dependent, normal-inverse-gamma conjugate model returned by bayeslm or estimate
semiconjugateblm | Independent, normal-inverse-gamma semiconjugate model returned by bayeslm
diffuseblm | Diffuse prior model returned by bayeslm
empiricalblm | Prior model characterized by samples from prior distributions, returned by bayeslm or estimate
customblm | Prior distribution function that you declare, returned by bayeslm

Output Arguments

SummaryStatistics is the parameter distribution summary, returned as a structure array with the following fields.

MarginalDistributions

Table containing a summary of the parameter distributions. Rows correspond to parameters. Columns correspond to the:

• Estimated posterior mean (Mean)

• Standard deviation (Std)

• 95% equitailed credible interval (CI95)

• Posterior probability that the parameter is greater than 0 (Positive)

• Description of the marginal or conditional posterior distribution of the parameter (Distribution)

Row names are the names in Mdl.VarNames, and the name of the last row is Sigma2.
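When the Distribution column reports a Student's t distribution, the CI95, Positive, and Std entries can be recovered from that description. The sketch below is illustrative only: it assumes Statistics and Machine Learning Toolbox (for tinv and tcdf), hard-codes the Intercept row of the posterior example above, and uses the rounded scale 8.65 as displayed, so the results should land close to the table entries rather than match them exactly. The variable names are hypothetical.

```matlab
% Values copied from the Intercept row of the posterior example.
mu    = -24.249;   % location (Mean column)
scale = 8.65;      % rounded scale from 't (-24.25, 8.65^2, 68)'
dof   = 68;        % degrees of freedom

% 95% equitailed interval: cut 2.5% from each tail.
ci = mu + scale*tinv([0.025 0.975],dof);

% Probability that the parameter is greater than 0.
pPositive = 1 - tcdf((0 - mu)/scale,dof);

% Standard deviation implied by the scale and degrees of freedom.
sd = scale*sqrt(dof/(dof - 2));
```

An equitailed interval places equal probability in each tail, which is why the two quantile levels are 0.025 and 0.975.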

Covariances

Table containing covariances between parameters. Rows and columns correspond to the intercept (if one exists), the regression coefficients, and the disturbance variance. Row and column names are the same as the row names in MarginalDistributions.

JointDistribution

A string scalar that describes the distributions of the regression coefficients (Beta) and the disturbance variance (Sigma2) when known.

For distribution descriptions:

• N(Mu,V) denotes the normal distribution with mean Mu and variance matrix V. This distribution can be multivariate.

• IG(A,B) denotes the inverse gamma distribution with shape A and scale B.

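• t(Mu,V,DoF) denotes the Student’s t distribution with location Mu, squared scale V, and degrees of freedom DoF. For DoF greater than 2, the variance is V·DoF/(DoF − 2), which is why the Std column can exceed the square root of V.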
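The inverse gamma entries can be checked against the example output in the same spirit. For shape $A>2$, the variance of an inverse gamma distribution equals the squared mean divided by $A-2$, whichever scale convention is in use. Taking the posterior row Sigma2 from the example (Mean 44.135, IG(34.00, 0.00069)):

$\text{Var}\left({\sigma }^{2}\right)=\frac{{\text{Mean}}^{2}}{A-2}=\frac{{44.135}^{2}}{34-2}\approx 60.87,\phantom{\rule{1em}{0ex}}\sqrt{60.87}\approx 7.80.$

These values agree with the Sigma2 entry of Covariances (60.871) and the Std column (7.802). The displayed numbers also suggest, as an inference from the output rather than a statement from this page, that B enters the mean as a reciprocal:

$\frac{1}{B\left(A-1\right)}=\frac{1}{0.00069×33}\approx 43.9,$

which is close to the reported mean given the rounding of B, whereas $B/\left(A-1\right)$ is not.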

Bayesian Linear Regression Model

A Bayesian linear regression model treats the parameters $\beta$ and ${\sigma }^{2}$ in the multiple linear regression (MLR) model ${y}_{t}={x}_{t}\beta +{\epsilon }_{t}$ as random variables.

For times $t=1,\dots ,T$:

• ${y}_{t}$ is the observed response.

• ${x}_{t}$ is a 1-by-(p + 1) row vector of observed values of p predictors. To accommodate a model intercept, ${x}_{1t}=1$ for all t.

• $\beta$ is a (p + 1)-by-1 column vector of regression coefficients corresponding to the variables that compose the columns of ${x}_{t}$.

• ${\epsilon }_{t}$ is the random disturbance with a mean of zero and $\text{Cov}\left(\epsilon \right)={\sigma }^{2}{I}_{T×T}$, where $\epsilon$ is a T-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

$\ell \left(\beta ,{\sigma }^{2}|y,x\right)=\prod _{t=1}^{T}\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma }^{2}\right).$

$\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma }^{2}\right)$ is the Gaussian probability density with mean ${x}_{t}\beta$ and variance ${\sigma }^{2}$ evaluated at ${y}_{t}$.
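Written out, the Gaussian density in each factor of the likelihood is

$\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma }^{2}\right)=\frac{1}{\sqrt{2\pi {\sigma }^{2}}}\mathrm{exp}\left(-\frac{{\left({y}_{t}-{x}_{t}\beta \right)}^{2}}{2{\sigma }^{2}}\right),$

so taking logarithms of the product gives the log-likelihood

$\mathrm{log}\ell \left(\beta ,{\sigma }^{2}|y,x\right)=-\frac{T}{2}\mathrm{log}\left(2\pi {\sigma }^{2}\right)-\frac{1}{2{\sigma }^{2}}\sum _{t=1}^{T}{\left({y}_{t}-{x}_{t}\beta \right)}^{2}.$

This expansion is standard for independent Gaussian disturbances and is included only to make the dependence on $\beta$ and ${\sigma }^{2}$ explicit.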

Before considering the data, you impose a joint prior distribution assumption on $\left(\beta ,{\sigma }^{2}\right)$. In a Bayesian analysis, you update the distribution of the parameters by using information about the parameters obtained from the likelihood of the data. The result is the joint posterior distribution of $\left(\beta ,{\sigma }^{2}\right)$ or the conditional posterior distributions of the parameters.
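In symbols, the update is Bayes' rule. Writing $\pi \left(\beta ,{\sigma }^{2}\right)$ for the joint prior density (the symbol $\pi$ is notation introduced here for illustration, not taken from this page), the joint posterior density satisfies

$\pi \left(\beta ,{\sigma }^{2}|y,x\right)\propto \pi \left(\beta ,{\sigma }^{2}\right)\ell \left(\beta ,{\sigma }^{2}|y,x\right),$

where the constant of proportionality does not depend on $\beta$ or ${\sigma }^{2}$.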