summarize

Distribution summary statistics of Bayesian linear regression model for predictor variable selection

Syntax

summarize(Mdl)

SummaryStatistics = summarize(Mdl)

Description

summarize(Mdl) displays a tabular summary of the random regression coefficients and disturbance variance of the Bayesian linear regression model Mdl at the command line. For each parameter, the summary includes the:

Standard deviation (square root of the variance)
95% equitailed credible intervals
Probability that the parameter is greater than 0
Description of the distributions, if known
Marginal probability that a coefficient should be included in the model, for stochastic search variable selection (SSVS) predictor-variable-selection models

SummaryStatistics = summarize(Mdl) returns a structure array with a table summarizing the regression coefficients and disturbance variance, and a description of the joint distribution of the parameters.

Examples

Summarize Prior and Posterior Distributions

Consider the multiple linear regression model that predicts the US real gross national product (GNPR) using a linear combination of industrial production index (IPI), total employment (E), and real wages (WR).

${GNPR}_{t} = β_{0} + β_{1} {IPI}_{t} + β_{2} E_{t} + β_{3} {WR}_{t} + ε_{t} .$

For all $t$, $ε_{t}$ is a series of independent Gaussian disturbances with a mean of 0 and variance $σ^{2}$.

Assume these prior distributions for $k$ = 0,...,3:

$β_{k} | σ^{2}, γ_{k} = γ_{k} σ \sqrt{V_{k 1}} Z_{1} + (1 - γ_{k}) σ \sqrt{V_{k 2}} Z_{2}$, where $Z_{1}$ and $Z_{2}$ are independent, standard normal random variables. Therefore, the coefficients have a Gaussian mixture distribution. Assume all coefficients are conditionally independent, a priori, but they are dependent on the disturbance variance.
$σ^{2} \sim IG(A, B)$. $A$ and $B$ are the shape and scale, respectively, of an inverse gamma distribution.
$γ_{k} \in \{0, 1\}$ is the random variable-inclusion regime variable, which has a discrete uniform distribution.
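
The prior moments implied by these assumptions can be worked out directly. The following is a sketch: $π_{k} = P(γ_{k} = 1)$ is notation introduced here (equal to 1/2 under the discrete uniform assumption), and the derivation assumes that $E[σ^{2}]$ is finite and that $γ_{k}$ and $σ^{2}$ are independent a priori. Because $γ_{k}$ takes only the values 0 and 1, and $Z_{1}$ and $Z_{2}$ are independent with mean 0 and variance 1, the law of total variance gives:

```latex
\begin{aligned}
E[β_{k} \mid σ^{2}, γ_{k}] &= 0, \\
\operatorname{Var}(β_{k} \mid σ^{2}, γ_{k}) &= σ^{2} \left( γ_{k} V_{k1} + (1 - γ_{k}) V_{k2} \right), \\
\operatorname{Var}(β_{k}) &= E[σ^{2}] \left( π_{k} V_{k1} + (1 - π_{k}) V_{k2} \right).
\end{aligned}
```

The prior mean of every coefficient is therefore 0, and the prior standard deviation is the square root of the last line.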

Create a prior model for SSVS. Specify the number of predictors p.

p = 3;
VarNames = ["IPI" "E" "WR"];
PriorMdl = bayeslm(p,'ModelType','mixconjugateblm','VarNames',VarNames);

PriorMdl is a mixconjugateblm Bayesian linear regression model object for SSVS predictor selection representing the prior distribution of the regression coefficients and disturbance variance.

Summarize the prior distribution.

summarize(PriorMdl)

 
           |  Mean     Std         CI95        Positive      Distribution     
------------------------------------------------------------------------------
 Intercept |  0      1.5890  [-3.547,  3.547]    0.500   Mixture distribution 
 IPI       |  0      1.5890  [-3.547,  3.547]    0.500   Mixture distribution 
 E         |  0      1.5890  [-3.547,  3.547]    0.500   Mixture distribution 
 WR        |  0      1.5890  [-3.547,  3.547]    0.500   Mixture distribution 
 Sigma2    | 0.5000  0.5000  [ 0.138,  1.616]    1.000   IG(3.00,    1)

The function displays a table of summary statistics and other information about the prior distribution at the command line.
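
As a sanity check, several entries of this table can be reproduced by hand using only base MATLAB functions. This is a sketch under stated assumptions, not the computation that summarize performs. A = 3 and B = 1 are read from the IG(3.00, 1) entry. The values V = [10 0.1] and an inclusion probability of 0.5 are assumptions about the model defaults. The inverse gamma convention assumed here is that 1/σ² is gamma distributed with shape A and scale B; with B = 1, this convention coincides with the other common one.

```matlab
% Assumed hyperparameters (see the lead-in for which values are assumptions).
A = 3;  B = 1;        % inverse gamma shape and scale
V = [10 0.1];         % assumed mixture component variance factors
prob = 0.5;           % assumed prior inclusion probability

% Moments of sigma2 when 1/sigma2 ~ Gamma(shape A, scale B); requires A > 2.
sigma2Mean = 1/(B*(A - 1));
sigma2Std  = sigma2Mean/sqrt(A - 2);

% The p-quantile of sigma2 is the reciprocal of the (1 - p)-quantile of the
% gamma variable. gammaincinv inverts the regularized lower incomplete gamma.
sigma2CI = 1./(B*gammaincinv([0.975 0.025],A));

% Mixture prior: Var(beta_k) = E[sigma2]*(prob*V1 + (1 - prob)*V2).
betaStd = sqrt(sigma2Mean*(prob*V(1) + (1 - prob)*V(2)));

fprintf('Sigma2: mean %.4f, std %.4f, CI95 [%.3f, %.3f]\n', ...
    sigma2Mean,sigma2Std,sigma2CI);
fprintf('Coefficient std: %.4f\n',betaStd);
```

Under these assumptions, the computed values agree with the Sigma2 row (0.5000, 0.5000, [0.138, 1.616]) and with the coefficient Std entries (1.5890) in the display.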

Load the Nelson-Plosser data set, and create variables for the predictor and response data.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable.GNPR;

Estimate the posterior distributions. Suppress the estimation display.

PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);

PosteriorMdl is an empiricalblm model object that contains the posterior distributions of $β$ and $σ^{2}$.

Obtain summary statistics from the posterior distribution.

summary = summarize(PosteriorMdl);

summary is a structure array containing two fields: MarginalDistributions and JointDistribution.

Display the marginal distribution summary by using dot notation.

summary.MarginalDistributions

ans=5×5 table
                    Mean          Std                 CI95              Positive    Distribution 
                 __________    _________    ________________________    ________    _____________

    Intercept        -18.66       10.348       -37.006        0.8406     0.0412     {'Empirical'}
    IPI              4.4555      0.15287        4.1561        4.7561          1     {'Empirical'}
    E            0.00096765    0.0003759    0.00021479     0.0016644     0.9968     {'Empirical'}
    WR               2.4739      0.36337        1.7607        3.1882          1     {'Empirical'}
    Sigma2           47.773       8.6863        33.574        67.585          1     {'Empirical'}

The MarginalDistributions field is a table of summary statistics and other information about the posterior distribution.
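
Because PosteriorMdl is an empiricalblm object, its summary can be cross-checked against the stored posterior draws. The sketch below assumes that PosteriorMdl and summary from the preceding steps are in the workspace, and that the draws are stored in the BetaDraws and Sigma2Draws properties with one column per draw. The statistics computed from the draws should agree closely with the corresponding table columns, although this is a check, not a description of how summarize computes them.

```matlab
% Stack coefficient draws and variance draws so that rows follow the order of
% the summary table: Intercept, IPI, E, WR, Sigma2.
draws = [PosteriorMdl.BetaDraws; PosteriorMdl.Sigma2Draws];

drawMean     = mean(draws,2);       % compare with the Mean column
drawStd      = std(draws,0,2);      % compare with the Std column
drawPositive = mean(draws > 0,2);   % compare with the Positive column

% Place the reported and recomputed statistics side by side.
tbl = summary.MarginalDistributions;
comparison = table(tbl.Mean,drawMean,tbl.Std,drawStd,tbl.Positive,drawPositive, ...
    'VariableNames',{'Mean','DrawMean','Std','DrawStd','Positive','DrawPositive'}, ...
    'RowNames',tbl.Properties.RowNames)

% Display the text description of the joint distribution.
summary.JointDistribution
```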

Input Arguments

`Mdl` — Bayesian linear regression model for predictor variable selection
`mixconjugateblm` model object | `mixsemiconjugateblm` model object | `lassoblm` model object

Bayesian linear regression model for predictor variable selection, specified as a model object in this table.

Model Object	Description
`mixconjugateblm`	Dependent, Gaussian-mixture-inverse-gamma conjugate model for SSVS predictor variable selection, returned by `bayeslm`
`mixsemiconjugateblm`	Independent, Gaussian-mixture-inverse-gamma semiconjugate model for SSVS predictor variable selection, returned by `bayeslm`
`lassoblm`	Bayesian lasso regression model returned by `bayeslm`

Output Arguments

`SummaryStatistics` — Parameter distribution summary
structure array

Parameter distribution summary, returned as a structure array containing the information in this table.

Structure Field	Description
`MarginalDistributions`	Table containing a summary of the parameter distributions. Rows correspond to parameters. Columns correspond to the estimated posterior mean (`Mean`), standard deviation (`Std`), 95% equitailed credible interval (`CI95`), posterior probability that the parameter is greater than 0 (`Positive`), and description of the marginal or conditional posterior distribution of the parameter (`Distribution`). Row names are the names in `Mdl.VarNames`. The name of the last row is `Sigma2`.
`JointDistribution`	A string scalar that describes the distributions of the regression coefficients (`Beta`) and the disturbance variance (`Sigma2`) when known.

For distribution descriptions:

N(Mu,V) denotes the normal distribution with mean Mu and variance matrix V. This distribution can be multivariate.
IG(A,B) denotes the inverse gamma distribution with shape A and scale B.
Mixture distribution denotes a Student’s t mixture distribution.

Note

If Mdl is a lassoblm model and Mdl.Probability is a function handle representing the regime probability distribution, then summarize cannot estimate prior distribution statistics for the coefficients. Therefore, entries corresponding to coefficient statistics are NaN values.

More About

Bayesian Linear Regression Model

A Bayesian linear regression model treats the parameters β and σ² in the multiple linear regression (MLR) model y_t = x_tβ + ε_t as random variables.

For times t = 1,...,T:

y_t is the observed response.
x_t is a 1-by-(p + 1) row vector of observed values of p predictors. To accommodate a model intercept, x_1t = 1 for all t.
β is a (p + 1)-by-1 column vector of regression coefficients corresponding to the variables that compose the columns of x_t.
ε_t is the random disturbance with a mean of zero and Cov(ε) = σ²I_T×T, where ε is a T-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

$ℓ (β, σ^{2} | y, x) = \prod_{t = 1}^{T} ϕ (y_{t}; x_{t} β, σ^{2}) .$

ϕ(y_t;x_tβ,σ²) is the Gaussian probability density with mean x_tβ and variance σ² evaluated at y_t.

Before considering the data, you impose a joint prior distribution assumption on (β,σ²). In a Bayesian analysis, you update the distribution of the parameters by using information about the parameters obtained from the likelihood of the data. The result is the joint posterior distribution of (β,σ²) or the conditional posterior distributions of the parameters.
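
To make the likelihood concrete, the following sketch evaluates its logarithm on simulated data using only base MATLAB. The sample size, coefficient values, and variance are hypothetical values chosen for illustration, and the function handle loglik is a helper introduced here, not a toolbox function.

```matlab
rng(1);                                  % for reproducibility
T = 50;  p = 2;                          % hypothetical sample size and predictor count
X = [ones(T,1) randn(T,p)];              % first column of ones for the intercept
beta = [1; 0.5; -2];                     % hypothetical (p + 1)-by-1 coefficient vector
sigma2 = 0.25;                           % hypothetical disturbance variance
y = X*beta + sqrt(sigma2)*randn(T,1);    % simulate responses from the MLR model

% Gaussian log-likelihood: sum over t of log phi(y_t; x_t*b, s2).
loglik = @(b,s2) -0.5*T*log(2*pi*s2) - sum((y - X*b).^2)/(2*s2);

llTrue = loglik(beta,sigma2);             % at the generating parameter values
llZero = loglik(zeros(p + 1,1),sigma2);   % at beta = 0
fprintf('Log-likelihood at generating values: %.2f\n',llTrue);
fprintf('Log-likelihood at beta = 0:          %.2f\n',llZero);
```

Working on the log scale avoids numerical underflow in the product of T densities. A Bayesian analysis combines this function with the prior on (β,σ²) to obtain the posterior.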

Algorithms

If Mdl is a lassoblm model object and Mdl.Probability is a numeric vector, then the 95% credible intervals on the regression coefficients are Mean + [–2 2]*Std, where Mean and Std are variables in the summary table.
If Mdl is a mixconjugateblm or mixsemiconjugateblm model object, then the 95% credible intervals on the regression coefficients are estimated from the mixture cdf. If the estimation fails, then summarize returns NaN values instead.
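
The two interval constructions can be compared on a toy distribution. The sketch below is not the toolbox implementation: it substitutes a hypothetical two-component, zero-mean Gaussian mixture (so that only base MATLAB functions are needed), finds the equitailed 95% interval by root-finding on the mixture cdf with fzero, and compares it with Mean + [–2 2]*Std. The weights, component variances, and helper names are all assumptions made for illustration.

```matlab
% Hypothetical two-component, zero-mean Gaussian mixture.
w  = [0.5 0.5];             % component weights
sd = sqrt([10 0.1]);        % component standard deviations

% Zero-mean normal cdf written with erf, and the resulting mixture cdf.
normcdf0 = @(x,s) 0.5*(1 + erf(x./(s*sqrt(2))));
mixcdf   = @(x) w(1)*normcdf0(x,sd(1)) + w(2)*normcdf0(x,sd(2));

% Equitailed 95% interval: solve mixcdf(x) = 0.025 and mixcdf(x) = 0.975
% inside a bracket wide enough to contain both roots.
bracket = 10*max(sd)*[-1 1];
lower = fzero(@(x) mixcdf(x) - 0.025,bracket);
upper = fzero(@(x) mixcdf(x) - 0.975,bracket);

% Approximate interval, Mean + [-2 2]*Std, where the mixture mean is 0.
mixStd = sqrt(w(1)*sd(1)^2 + w(2)*sd(2)^2);
approxCI = 0 + [-2 2]*mixStd;

fprintf('Mixture-cdf interval: [%.3f, %.3f]\n',lower,upper);
fprintf('Mean + [-2 2]*Std:    [%.3f, %.3f]\n',approxCI);
```

The two intervals generally differ when the distribution is far from normal, which is one reason a cdf-based interval can be preferable when the mixture cdf is available.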

Version History

Introduced in R2018b
