Bayesian linear regression model with conjugate priors for stochastic search variable selection (SSVS)

The Bayesian linear regression model object `mixconjugateblm` specifies the joint prior distribution of the regression coefficients and the disturbance variance (*β*, *σ*^{2}) for implementing SSVS (see [1] and [2]), assuming *β* and *σ*^{2} are dependent random variables.

In general, when you create a Bayesian linear regression model object, it specifies the joint prior distribution and characteristics of the linear regression model only. That is, the model object is a template intended for further use. Specifically, to incorporate data into the model for posterior distribution analysis and feature selection, pass the model object and data to the appropriate object function.

`PriorMdl = mixconjugateblm(NumPredictors)`

`PriorMdl = mixconjugateblm(NumPredictors,Name,Value)`

`PriorMdl = mixconjugateblm(NumPredictors)` creates a Bayesian linear regression model object (`PriorMdl`) composed of `NumPredictors` predictors and an intercept. The joint prior distribution of (*β*, *σ*^{2}) is appropriate for implementing SSVS for predictor selection [2]. `PriorMdl` is a template defining the prior distributions and dimensionality of *β*.

`PriorMdl = mixconjugateblm(NumPredictors,Name,Value)` uses additional options specified by one or more `Name,Value` pair arguments. `Name` is a property name, except `NumPredictors`, and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several `Name,Value` pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

For example, `mixconjugateblm(3,'Probability',abs(rand(4,1)))` specifies random prior regime probabilities for all four coefficients in the model.

You can set property values when you create the model object using name-value pair argument syntax, or after model creation using dot notation. For example, to exclude an intercept from the model, enter

PriorMdl.Intercept = false;

`NumPredictors` — Number of predictor variables
nonnegative integer

Number of predictor variables in the Bayesian multiple linear regression model, specified as a nonnegative integer.

`NumPredictors` must be the same as the number of columns in your predictor data, which you specify during model estimation or simulation.

When specifying `NumPredictors`, exclude any intercept term from the value.

After creating a model, if you change the value of `NumPredictors` using dot notation, then all these parameters revert to the default values:

- Variable names (`VarNames`)
- The prior mean of *β* (`Mu`)
- The prior variances of *β* for each regime (`V`)
- The prior correlation matrix of *β* (`Correlation`)
- The prior regime probabilities (`Probability`)

**Data Types: **`double`

`Intercept` — Indicate whether to include regression model intercept
`true` (default) | `false`

Indicate whether to include regression model intercept, specified as the comma-separated pair consisting of `'Intercept'` and a value in this table.

Value | Description |
---|---|
`false` | Exclude an intercept from the regression model. Hence, *β* is a `p`-dimensional vector, where `p` is the value of the `NumPredictors` property. |
`true` | Include an intercept in the regression model. Hence, *β* is a (`p` + 1)-dimensional vector. During estimation, simulation, and forecasting, MATLAB® prepends the predictor data with an appropriately sized vector of ones. |

If you include a column of ones in the predictor data for an intercept term, then set `Intercept` to `false`.

**Example: **`'Intercept',false`

**Data Types: **`logical`

`VarNames` — Predictor variable names
string vector | cell vector of character vectors

Predictor variable names for displays, specified as a string vector or cell vector of character vectors. `VarNames` must contain `NumPredictors` elements. `VarNames(j)` is the name of the variable in column `j` of the predictor data.

The default is `{'Beta(1)','Beta(2)',...,'Beta(p)'}`, where `p` is the value of `NumPredictors`.

**Example: **`'VarNames',["UnemploymentRate"; "CPI"]`

**Data Types: **`string` | `cell` | `char`

`Mu` — Component-wise mean hyperparameter of Gaussian mixture prior on *β*
`zeros(Intercept + NumPredictors,2)` (default) | numeric matrix

Component-wise mean hyperparameter of Gaussian mixture prior on *β*, specified as an (`Intercept + NumPredictors`)-by-2 numeric matrix. The first column contains the prior means for component 1, which is for the variable-inclusion regime (that is, *γ* = 1). The second column contains the prior means for component 2, which is for the variable-exclusion regime (that is, *γ* = 0).

- If `Intercept` is `false`, then `Mu` has `NumPredictors` rows. `mixconjugateblm` sets the prior mean of the `NumPredictors` coefficients corresponding to the columns in the predictor data set, which is specified during estimation, simulation, or forecasting.
- Otherwise, `Mu` has `NumPredictors + 1` rows. The first row corresponds to the prior means of the intercept, and all other rows correspond to the predictor variables.

To perform SSVS, use the default value of `Mu`.

**Data Types: **`double`

`V` — Component-wise variance factor hyperparameter of Gaussian mixture prior on *β*
`repmat([10 0.1],Intercept + NumPredictors,1)` (default) | positive numeric matrix

Component-wise variance factor hyperparameter of Gaussian mixture prior on *β*, specified as an (`Intercept + NumPredictors`)-by-2 positive numeric matrix. The first column contains the prior variance factors for component 1, which is for the variable-inclusion regime (that is, *γ* = 1). The second column contains the prior variance factors for component 2, which is for the variable-exclusion regime (that is, *γ* = 0). Regardless of regime or coefficient, the prior variance of a coefficient is the variance factor times *σ*^{2}.

- If `Intercept` is `false`, then `V` has `NumPredictors` rows. `mixconjugateblm` sets the prior variance factor of the `NumPredictors` coefficients corresponding to the columns in the predictor data set, which is specified during estimation, simulation, or forecasting.
- Otherwise, `V` has `NumPredictors + 1` rows. The first row corresponds to the prior variance factors of the intercept, and all other rows correspond to the predictor variables.

To perform SSVS, specify a larger variance factor for regime 1 than for regime 2. That is, for all `j`, specify `V(j,1)` > `V(j,2)`.

For details on what value to specify for `V`, see [1].

**Data Types: **`double`

`Probability` — Prior probability distribution for the variable inclusion and exclusion regimes
`0.5*ones(Intercept + NumPredictors,1)` (default) | numeric vector of values in [0,1] | function handle

Prior probability distribution for the variable inclusion and exclusion regimes, specified as an (`Intercept` + `NumPredictors`)-by-1 numeric vector of values in [0,1] or a function handle in the form `@fcnName`, where `fcnName` is the function name. `Probability` represents the prior probability distribution of *γ* = {*γ*_{1},…,*γ*_{K}}, where:

- *K* = `Intercept` + `NumPredictors`, which is the number of coefficients in the regression model.
- *γ*_{k} ∈ {0,1} for *k* = 1,…,*K*. Hence, the sample space has a cardinality of 2^{K}.
- *γ*_{k} = 1 indicates that variable `VarNames(k)` is included in the model, and *γ*_{k} = 0 indicates that the variable is excluded from the model.

If `Probability` is a numeric vector:

- Rows correspond to the variable names in `VarNames`. For models containing an intercept, the prior probability for intercept inclusion is `Probability(1)`.
- For `k` = 1,…,*K*, the prior probability for excluding variable `k` is 1 - `Probability(k)`.
- Prior probabilities of the variable-inclusion regime, among all variables and the intercept, are independent.
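As a worked illustration of the independence statement, the following sketch computes the prior probability of one regime configuration from a numeric `Probability` vector. The values in `prob` and `gamma` are hypothetical and chosen only for illustration; the code uses base MATLAB only and does not call `mixconjugateblm`.

```matlab
% Hypothetical prior inclusion probabilities for an intercept and 3 predictors.
prob  = [0.9; 0.5; 0.5; 0.25];
% Hypothetical regime configuration: intercept and predictor 2 are included.
gamma = [1; 0; 1; 0];

% Under independence, the joint prior probability is a product of
% Bernoulli terms: prob(k) if included, 1 - prob(k) if excluded.
jointProb = prod(prob.^gamma .* (1 - prob).^(1 - gamma));

% Sanity check: the probabilities of all 2^K configurations sum to 1.
K = numel(prob);
allGamma = dec2bin(0:2^K - 1) - '0';   % 2^K-by-K matrix of 0/1 indicators
allProb = prod(prob'.^allGamma .* (1 - prob').^(1 - allGamma),2);
total = sum(allProb);
```

Here `jointProb` is 0.9 × 0.5 × 0.5 × 0.75, the product of one term per coefficient.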

If `Probability` is a function handle, then it represents a custom prior distribution of the variable-inclusion regime probabilities. The corresponding function must have this declaration statement (the argument and function names can vary):

logprob = regimeprior(varinc)

where:

- `logprob` is a numeric scalar representing the log of the prior distribution. You can write the prior distribution up to a proportionality constant.
- `varinc` is a *K*-by-1 logical vector. Elements correspond to the variable names in `VarNames` and indicate the regime in which the corresponding variable exists. `varinc(k)` = `true` indicates that `VarNames(k)` is included in the model, and `varinc(k)` = `false` indicates otherwise.

You can include more input arguments, but they must be known when you call `mixconjugateblm`.

For details on what value to specify for `Probability`, see [1].

**Data Types: **`double` | `function_handle`

`Correlation` — Prior correlation matrix of *β*
`eye(Intercept + NumPredictors)` (default) | numeric, positive definite matrix

Prior correlation matrix of *β* for both components in the mixture model, specified as an (`Intercept` + `NumPredictors`)-by-(`Intercept` + `NumPredictors`) numeric, positive definite matrix. Consequently, the prior covariance matrix for component `j` in the mixture model is `sigma2*diag(sqrt(V(:,j)))*Correlation*diag(sqrt(V(:,j)))`, where `sigma2` is *σ*^{2} and `V` is the matrix of coefficient variance factors.

Rows and columns correspond to the variable names in `VarNames`.

By default, regression coefficients are uncorrelated, conditional on the regime.
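The covariance expression for each mixture component can be evaluated directly. The sketch below uses the default `V` and `Correlation` values stated on this page for a model with an intercept and three predictors; the value of `sigma2` is hypothetical, because *σ*^{2} is itself a random variable under the prior.

```matlab
K = 4;                        % intercept plus 3 predictors
V = repmat([10 0.1],K,1);     % default variance factors
Correlation = eye(K);         % default prior correlation matrix
sigma2 = 0.5;                 % hypothetical value of the disturbance variance

% Prior covariance of beta, conditional on sigma2, for each component j.
priorCov = cell(1,2);
for j = 1:2
    D = diag(sqrt(V(:,j)));   % diagonal matrix of prior standard deviation factors
    priorCov{j} = sigma2*D*Correlation*D;
end
```

With the defaults, both matrices are diagonal: the inclusion component has variance `sigma2*10` on the diagonal and the exclusion component has `sigma2*0.1`.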

You can supply any appropriately sized numeric matrix. However, if your specification is not positive definite, `mixconjugateblm` issues a warning and replaces your specification with `CorrelationPD`, where:

CorrelationPD = 0.5*(Correlation + Correlation.');

For details on what value to specify for `Correlation`, see [1].

**Data Types: **`double`

`A` — Shape hyperparameter of inverse gamma prior on *σ*^{2}
`3` (default) | numeric scalar

Shape parameter of inverse gamma prior on *σ*^{2}, specified as a numeric scalar.

`A` must be at least `-(Intercept + NumPredictors)/2`.

With `B` held fixed, as `A` increases, the inverse gamma distribution becomes taller and more concentrated. This characteristic weighs the prior model of *σ*^{2} more heavily than the likelihood during posterior estimation.

For the functional form of the inverse gamma distribution, see Analytically Tractable Posteriors.

**Example: **`'A',0.1`

**Data Types: **`double`

`B` — Scale hyperparameter of inverse gamma prior on *σ*^{2}
`1` (default) | positive scalar | `Inf`

Scale parameter of inverse gamma prior on *σ*^{2}, specified as a positive scalar or `Inf`.

With `A` held fixed, as `B` increases, the inverse gamma distribution becomes taller and more concentrated. This characteristic weighs the prior model of *σ*^{2} more heavily than the likelihood during posterior estimation.

**Example: **`'B',5`

**Data Types: **`double`

`estimate` | Perform predictor variable selection for Bayesian linear regression models |

`simulate` | Simulate regression coefficients and disturbance variance of Bayesian linear regression model |

`forecast` | Forecast responses of Bayesian linear regression model |

`plot` | Visualize prior and posterior densities of Bayesian linear regression model parameters |

`summarize` | Distribution summary statistics of Bayesian linear regression model for predictor variable selection |

Consider the multiple linear regression model that predicts U.S. real gross national product (`GNPR`) using a linear combination of industrial production index (`IPI`), total employment (`E`), and real wages (`WR`).

$$\text{GNPR}_t = \beta_0 + \beta_1 \text{IPI}_t + \beta_2 \text{E}_t + \beta_3 \text{WR}_t + \varepsilon_t.$$

For all *t*, *ε*_{t} is a series of independent Gaussian disturbances with a mean of 0 and variance *σ*^{2}.

Assume that the prior distributions are, for *k* = 0,...,3:

- $$\beta_k|\sigma^2,\gamma_k = \gamma_k \sigma \sqrt{V_{k1}} Z_1 + (1-\gamma_k) \sigma \sqrt{V_{k2}} Z_2,$$ where *Z*_{1} and *Z*_{2} are independent, standard normal random variables. Hence, the coefficients have a Gaussian mixture distribution. Assume all coefficients are conditionally independent, a priori, but they are dependent on the disturbance variance.
- *σ*^{2} ~ IG(*A*, *B*). *A* and *B* are the shape and scale, respectively, of an inverse gamma distribution.
- *γ*_{k} ∈ {0,1}, and it represents the random variable-inclusion regime variable with a discrete uniform distribution.

Create a prior model for SSVS. Specify the number of predictors, `p`.

p = 3;
Mdl = mixconjugateblm(p);

`Mdl` is a `mixconjugateblm` Bayesian linear regression model object representing the prior distribution of the regression coefficients and disturbance variance. At the command window, `mixconjugateblm` displays a summary of the prior distributions.

Alternatively, you can create a prior model for SSVS by passing the number of predictors to `bayeslm` and setting the `ModelType` name-value pair argument to `'mixconjugate'`.

MdlBayesLM = bayeslm(p,'ModelType','mixconjugate')

MdlBayesLM = mixconjugateblm with properties:

NumPredictors: 3
Intercept: 1
VarNames: {4x1 cell}
Mu: [4x2 double]
V: [4x2 double]
Probability: [4x1 double]
Correlation: [4x4 double]
A: 3
B: 1

Parameter | Mean | Std | CI95 | Positive | Distribution |
---|---|---|---|---|---|
Intercept | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
Beta(1) | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
Beta(2) | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
Beta(3) | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
Sigma2 | 0.5000 | 0.5000 | [ 0.138, 1.616] | 1.000 | IG(3.00, 1) |

`Mdl` and `MdlBayesLM` are equivalent model objects.

You can set writable property values of created models using dot notation. Specify the regression coefficient names to the corresponding variable names.

Mdl.VarNames = ["IPI" "E" "WR"]

Mdl = mixconjugateblm with properties:

NumPredictors: 3
Intercept: 1
VarNames: {4x1 cell}
Mu: [4x2 double]
V: [4x2 double]
Probability: [4x1 double]
Correlation: [4x4 double]
A: 3
B: 1

Parameter | Mean | Std | CI95 | Positive | Distribution |
---|---|---|---|---|---|
Intercept | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
IPI | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
E | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
WR | 0 | 1.5890 | [-3.547, 3.547] | 0.500 | Mixture distribution |
Sigma2 | 0.5000 | 0.5000 | [ 0.138, 1.616] | 1.000 | IG(3.00, 1) |

MATLAB® associates the variable names to the regression coefficients in displays.
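The `Std` value of 1.5890 reported for each coefficient can be reproduced by hand. The sketch below assumes that both mixture components have mean 0, so the marginal prior variance is the probability-weighted average of the component variances, averaged over the prior of *σ*^{2}. It takes the prior mean of *σ*^{2} directly from the `Sigma2` row of the display rather than from a formula, and the variable names are illustrative.

```matlab
meanSigma2 = 0.5;   % prior mean of sigma^2, read from the Sigma2 row of the display
V = [10 0.1];       % default variance factors (inclusion, exclusion)
prob = 0.5;         % default prior inclusion probability

% Marginal prior variance of a coefficient: E[sigma^2] times the
% probability-weighted average of the two variance factors.
priorVar = meanSigma2*(prob*V(1) + (1 - prob)*V(2));
priorStd = sqrt(priorVar);
fprintf('Prior Std of each coefficient: %.4f\n',priorStd);
```

The computed value is sqrt(2.525), which agrees with the displayed 1.5890 to four decimal places.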

Plot the prior distributions.

plot(Mdl);

The prior distributions of the coefficients are centered at 0 and have the *spike-and-slab* appearance.

This example is based on Create Prior Model for SSVS.

Create a prior model for performing SSVS. Assume that *β* and *σ*^{2} are dependent (a conjugate mixture model). Specify the number of predictors, `p`, and the names of the regression coefficients.

p = 3;
PriorMdl = mixconjugateblm(p,'VarNames',["IPI" "E" "WR"]);

Display the prior regime probabilities and Gaussian mixture variance factors of the prior *β*.

priorProbabilities = table(PriorMdl.Probability,'RowNames',PriorMdl.VarNames,...
    'VariableNames',"Probability")

`priorProbabilities=`*4×1 table*
Probability
___________
Intercept 0.5
IPI 0.5
E 0.5
WR 0.5

priorV = array2table(PriorMdl.V,'RowNames',PriorMdl.VarNames,...
    'VariableNames',["gammaIs1" "gammaIs0"])

`priorV=`*4×2 table*
gammaIs1 gammaIs0
________ ________
Intercept 10 0.1
IPI 10 0.1
E 10 0.1
WR 10 0.1

`PriorMdl` stores prior regime probabilities in the `Probability` property and the regime variance factors in the `V` property. The default prior probability of variable inclusion is 0.5. The default variance factors for each coefficient are 10 for the variable-inclusion regime and 0.1 for the variable-exclusion regime.

Load the Nelson-Plosser data set. Create variables for the response and predictor series.

load Data_NelsonPlosser
X = DataTable{:,PriorMdl.VarNames(2:end)};
y = DataTable{:,'GNPR'};

Implement SSVS by estimating the marginal posterior distributions of *β* and *σ*^{2}. Because SSVS uses Markov chain Monte Carlo for estimation, set a random number seed to reproduce the results.

rng(1);
PosteriorMdl = estimate(PriorMdl,X,y);

Method: MCMC sampling with 10000 draws
Number of observations: 62
Number of predictors: 4

Parameter | Mean | Std | CI95 | Positive | Distribution | Regime |
---|---|---|---|---|---|---|
Intercept | -18.8333 | 10.1851 | [-36.965, 0.716] | 0.037 | Empirical | 0.8806 |
IPI | 4.4554 | 0.1543 | [ 4.165, 4.764] | 1.000 | Empirical | 0.4545 |
E | 0.0010 | 0.0004 | [ 0.000, 0.002] | 0.997 | Empirical | 0.0925 |
WR | 2.4686 | 0.3615 | [ 1.766, 3.197] | 1.000 | Empirical | 0.1734 |
Sigma2 | 47.7557 | 8.6551 | [33.858, 66.875] | 1.000 | Empirical | NaN |

`PosteriorMdl` is an `empiricalblm` model object storing draws from the posterior distributions of *β* and *σ*^{2} given the data. `estimate` displays a summary of the marginal posterior distributions to the command window. Rows of the summary correspond to regression coefficients and the disturbance variance, and columns to characteristics of the posterior distribution. The characteristics include:

- `CI95`, which contains the 95% Bayesian equitailed credible intervals for the parameters. For example, the posterior probability that the regression coefficient of `E` is in [0.000, 0.002] is 0.95.
- `Regime`, which contains the marginal posterior probability of variable inclusion (*γ* = 1 for a variable). For example, the posterior probability that `E` should be included in the model is 0.0925.
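The `Regime` column can drive a simple selection rule. The sketch below copies the posterior inclusion probabilities from the summary shown in this example and applies a cutoff of 0.1, which is an arbitrary choice. The variable names are illustrative, and the values are typed in by hand rather than read from `PosteriorMdl`.

```matlab
% Posterior inclusion probabilities copied from the Regime column.
names  = ["Intercept" "IPI" "E" "WR"];
regime = [0.8806 0.4545 0.0925 0.1734];
threshold = 0.1;                      % arbitrary cutoff

dropped = names(regime < threshold);  % candidates for removal
kept    = names(regime >= threshold); % variables retained
```

With these values, only `E` falls below the cutoff.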

Assuming, arbitrarily, that variables having `Regime` < 0.1 should be removed from the model, the results suggest that you can exclude total employment (`E`) from the model.

By default, `estimate` draws and discards a burn-in sample of size 5000. However, it is good practice to inspect a trace plot of the draws for adequate mixing and lack of transience. Create a trace plot of the draws for each parameter. You can access the draws that compose the distribution, that is, the properties `BetaDraws` and `Sigma2Draws`, using dot notation.

figure;
for j = 1:(p + 1)
    subplot(2,2,j);
    plot(PosteriorMdl.BetaDraws(j,:));
    title(sprintf('%s',PosteriorMdl.VarNames{j}));
end

```
figure;
plot(PosteriorMdl.Sigma2Draws);
title('Sigma2');
```

The trace plots indicate that the draws seem to be mixing well, that is, there is no detectable transience or serial correlation, and the draws do not jump between states.

This example is based on Create Prior Model for SSVS.

Load the Nelson-Plosser data set. Create variables for the response and predictor series. To the MATLAB® path, add example-specific files.

load Data_NelsonPlosser
VarNames = ["IPI" "E" "WR"];
X = DataTable{:,VarNames};
y = DataTable{:,"GNPR"};
path = fullfile(matlabroot,'examples','econ','main');
addpath(path);

Suppose you have prior knowledge that:

- The intercept is in the model with probability 0.9.
- `IPI` and `E` are in the model with probability 0.75.
- If `E` is included in the model, then the probability that `WR` is included in the model is 0.9.
- If `E` is excluded from the model, then the probability that `WR` is included is 0.25.

Declare a function called `priorssvsexample.m` that:

- Accepts a logical vector indicating whether the intercept and variables are in the model (`true` for model inclusion). Element 1 corresponds to the intercept, and the rest of the elements correspond to the variables in the data.
- Returns a numeric scalar representing the log of the described prior regime probability distribution.

function logprior = priorssvsexample(varinc)
%PRIORSSVSEXAMPLE Log prior regime probability distribution for SSVS
%   PRIORSSVSEXAMPLE is an example of a custom log prior regime probability
%   distribution for SSVS with dependent random variables. varinc is
%   a 4-by-1 logical vector indicating whether 4 coefficients are in a model
%   and logprior is a numeric scalar representing the log of the prior
%   distribution of the regime probabilities.
%
%   Coefficients enter a model following these rules:
%   * varinc(1) is included with probability 0.9.
%   * varinc(2) and varinc(3) are in the model with probability 0.75.
%   * If varinc(3) is included in the model, then the probability that
%     varinc(4) is included in the model is 0.9.
%   * If varinc(3) is excluded from the model, then the probability
%     that varinc(4) is included is 0.25.
logprior = log(0.9) + 2*log(0.75) + log(varinc(3)*0.9 + (1-varinc(3))*0.25);
end

`priorssvsexample.m` is an example-specific file included with the Econometrics Toolbox™. To access it, enter `edit priorssvsexample.m`.
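As written, the expression in `priorssvsexample` depends on `varinc` only through `varinc(3)`. A prior that also assigns the complementary probability when a coefficient is excluded (for example, 0.1 when the intercept is excluded) could be sketched as follows. The names `logbern` and `fullprior` are hypothetical and not part of the toolbox, and results obtained with such a prior would differ from the output shown in this example.

```matlab
% Hypothetical helper: log-probability of one inclusion indicator, where
% p is the probability of inclusion.
logbern = @(inc,p) log(inc.*p + (1 - inc).*(1 - p));

% Hypothetical log prior applying the four stated rules to both the
% inclusion and the exclusion of each coefficient.
fullprior = @(varinc) logbern(varinc(1),0.9) + ...
    logbern(varinc(2),0.75) + logbern(varinc(3),0.75) + ...
    logbern(varinc(4),varinc(3)*0.9 + (1 - varinc(3))*0.25);

% Check that the prior sums to 1 over all 2^4 regime configurations.
configs = logical(dec2bin(0:15) - '0');
total = 0;
for i = 1:size(configs,1)
    total = total + exp(fullprior(configs(i,:)'));
end
```

When all four coefficients are included, `fullprior` returns the same value as the shipped expression, `log(0.9) + 2*log(0.75) + log(0.9)`.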

Create a prior model for performing SSVS. Assume that *β* is dependent on *σ*^{2} (a conjugate mixture model). Specify the number of predictors, `p`, the names of the regression coefficients, and the custom prior probability distribution of the variable-inclusion regimes.

p = 3;
PriorMdl = mixconjugateblm(p,'VarNames',["IPI" "E" "WR"],...
    'Probability',@priorssvsexample);

Implement SSVS by estimating the marginal posterior distributions of *β* and *σ*^{2}. Because SSVS uses Markov chain Monte Carlo for estimation, set a random number seed to reproduce the results.

rng(1);
PosteriorMdl = estimate(PriorMdl,X,y);

Method: MCMC sampling with 10000 draws
Number of observations: 62
Number of predictors: 4

Parameter | Mean | Std | CI95 | Positive | Distribution | Regime |
---|---|---|---|---|---|---|
Intercept | -18.7971 | 10.1644 | [-37.002, 0.765] | 0.039 | Empirical | 0.8797 |
IPI | 4.4559 | 0.1530 | [ 4.166, 4.760] | 1.000 | Empirical | 0.4623 |
E | 0.0010 | 0.0004 | [ 0.000, 0.002] | 0.997 | Empirical | 0.2665 |
WR | 2.4684 | 0.3618 | [ 1.759, 3.196] | 1.000 | Empirical | 0.1727 |
Sigma2 | 47.7391 | 8.6741 | [33.823, 67.024] | 1.000 | Empirical | NaN |

Assuming, arbitrarily, that variables having `Regime` < 0.1 should be removed from the model, the results suggest that you can include all variables in the model.

Clean up the workspace by removing `path` from the MATLAB® path.

rmpath(path);

This example is based on Create Prior Model for SSVS.

Perform SSVS by:

- Creating a Bayesian regression model for SSVS with a conjugate prior for the data likelihood. Use the default settings.
- Holding out the last 10 periods of data from estimation.
- Estimating the marginal posterior distributions.

p = 3;
PriorMdl = bayeslm(p,'ModelType','mixconjugate','VarNames',["IPI" "E" "WR"]);
load Data_NelsonPlosser
fhs = 10; % Forecast horizon size
X = DataTable{1:(end - fhs),PriorMdl.VarNames(2:end)};
y = DataTable{1:(end - fhs),'GNPR'};
XF = DataTable{(end - fhs + 1):end,PriorMdl.VarNames(2:end)}; % Future predictor data
yFT = DataTable{(end - fhs + 1):end,'GNPR'}; % True future responses
rng(1); % For reproducibility
PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);

Forecast responses using the posterior predictive distribution and the future predictor data `XF`. Plot the true values of the response and the forecasted values.

yF = forecast(PosteriorMdl,XF);
figure;
plot(dates,DataTable.GNPR);
hold on
plot(dates((end - fhs + 1):end),yF)
h = gca;
hp = patch([dates(end - fhs + 1) dates(end) dates(end) dates(end - fhs + 1)],...
    h.YLim([1,1,2,2]),[0.8 0.8 0.8]);
uistack(hp,'bottom');
legend('True GNPR','Forecasted GNPR','Forecast Horizon','Location','NW')
title('Real Gross National Product: 1909 - 1970');
ylabel('rGNP');
xlabel('Year');
hold off

`yF` is a 10-by-1 vector of future values of real GNP corresponding to the future predictor data.

Estimate the forecast root mean squared error (RMSE).

frmse = sqrt(mean((yF - yFT).^2))

frmse = 18.8470

Forecast RMSE is a relative measure of forecast accuracy. Specifically, you estimate several models using different assumptions. The model with the lowest forecast RMSE is the best performing model of the ones being compared.

When you perform Bayesian regression with SSVS, it is best practice to tune the hyperparameters. One way to tune the hyperparameters is to estimate forecast RMSE over a grid of hyperparameter values, and choose the values that minimize forecast RMSE.
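The grid search described above could be sketched as follows, reusing the functions and data set from this example. The grid `slabGrid` of inclusion-regime variance factors is hypothetical, the exclusion-regime factor is held at its default of 0.1, and the same random seed is reused so that candidates differ only in the prior. Running the code requires the Econometrics Toolbox, and a single hold-out window is only a rough guide to forecast accuracy.

```matlab
load Data_NelsonPlosser
varNames = ["IPI" "E" "WR"];
p = numel(varNames);
fhs = 10;                                      % forecast horizon size
X   = DataTable{1:(end - fhs),varNames};       % estimation sample predictors
y   = DataTable{1:(end - fhs),'GNPR'};         % estimation sample response
XF  = DataTable{(end - fhs + 1):end,varNames}; % hold-out predictors
yFT = DataTable{(end - fhs + 1):end,'GNPR'};   % hold-out responses

slabGrid = [1 10 100];                         % hypothetical grid of inclusion-regime factors
frmse = zeros(size(slabGrid));
for i = 1:numel(slabGrid)
    V = repmat([slabGrid(i) 0.1],p + 1,1);     % one row per coefficient, including the intercept
    PriorMdl = mixconjugateblm(p,'VarNames',varNames,'V',V);
    rng(1);                                    % same seed for every candidate
    PosteriorMdl = estimate(PriorMdl,X,y,'Display',false);
    yF = forecast(PosteriorMdl,XF);
    frmse(i) = sqrt(mean((yF - yFT).^2));      % forecast RMSE on the hold-out sample
end
[~,best] = min(frmse);
bestSlab = slabGrid(best);                     % grid value with the lowest forecast RMSE
```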

A *Bayesian linear regression model* treats the parameters *β* and *σ*^{2} in the multiple linear regression (MLR) model *y*_{t} = *x*_{t}*β* + *ε*_{t} as random variables.

For times *t* = 1,...,*T*:

- *y*_{t} is the observed response.
- *x*_{t} is a 1-by-(*p* + 1) row vector of observed values of *p* predictors. To accommodate a model intercept, *x*_{1t} = 1 for all *t*.
- *β* is a (*p* + 1)-by-1 column vector of regression coefficients corresponding to the variables composing the columns of *x*_{t}.
- *ε*_{t} is the random disturbance having a mean of zero and Cov(*ε*) = *σ*^{2}*I*_{T×T}, while *ε* is a *T*-by-1 vector containing all disturbances. These assumptions imply that the data likelihood is

$$\ell \left(\beta ,{\sigma}^{2}|y,x\right)={\displaystyle \prod _{t=1}^{T}\varphi \left({y}_{t};{x}_{t}\beta ,{\sigma}^{2}\right).}$$

*ϕ*(*y*_{t};*x*_{t}*β*,*σ*^{2}) is the Gaussian probability density with mean *x*_{t}*β* and variance *σ*^{2} evaluated at *y*_{t}.
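To make the likelihood concrete, the following sketch evaluates its logarithm for simulated data in base MATLAB. All values (`T`, `p`, `beta`, `sigma2`) are hypothetical, and the closed-form expression is compared with a term-by-term sum of log Gaussian densities.

```matlab
rng(1);                            % for reproducibility
T = 50;  p = 2;                    % hypothetical sample size and predictor count
X = [ones(T,1) randn(T,p)];        % first column of ones for the intercept
beta = [1; 2; -1];                 % hypothetical coefficients
sigma2 = 0.25;                     % hypothetical disturbance variance
y = X*beta + sqrt(sigma2)*randn(T,1);

% Log of the product of Gaussian densities phi(y_t; x_t*beta, sigma2).
resid = y - X*beta;
loglik = -0.5*T*log(2*pi*sigma2) - sum(resid.^2)/(2*sigma2);

% Equivalent term-by-term evaluation for comparison.
phi = exp(-resid.^2/(2*sigma2))/sqrt(2*pi*sigma2);
loglikCheck = sum(log(phi));
```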

Before considering the data, a *joint prior distribution* assumption
is imposed on (*β*,*σ*^{2}).
In a Bayesian analysis, the beliefs about the distribution of the
parameters are updated using information about the parameters gleaned
from the likelihood of the data. The result is the *joint
posterior distribution* of (*β*,*σ*^{2})
or the *conditional posterior distributions* of
the parameters.

*Stochastic search variable selection* (SSVS) is a
predictor variable selection method for Bayesian linear regression that searches the space
of potential models for models with high posterior probability, and averages the models it
visits after it completes the search.

SSVS assumes that the prior distribution of each regression coefficient is a mixture of two Gaussian distributions, and the prior distribution of *σ*^{2} is inverse gamma with shape *A* and scale *B*. Let *γ* = {*γ*_{1},…,*γ*_{K}} be a latent, random regime indicator for the regression coefficients, where:

- *K* is the number of coefficients in the model (`Intercept` + `NumPredictors`).
- *γ*_{k} = 1 means that *β*_{k}|*σ*^{2},*γ*_{k} is Gaussian with mean 0 and variance *c*_{1}.
- *γ*_{k} = 0 means that *β*_{k}|*σ*^{2},*γ*_{k} is Gaussian with mean 0 and variance *c*_{2}.
- A probability mass function governs the distribution of *γ*, and the sample space of *γ* is composed of 2^{K} elements.

More concretely, given *γ*_{k} and *σ*^{2},

$$\beta_k = \gamma_k \sqrt{c_1} Z + (1-\gamma_k) \sqrt{c_2} Z,$$

where:

- *Z* is a standard normal random variable.
- For conjugate models (`mixconjugateblm`), *c*_{j} = *σ*^{2}*V*_{j}, *j* = 1,2.
- For semiconjugate models (`mixsemiconjugateblm`), *c*_{j} = *V*_{j}.

*c*_{1} is relatively large, which implies that the corresponding predictor is more likely to be in the model. *c*_{2} is relatively small, which implies that the corresponding predictor is less likely to be in the model because the distribution is dense around 0.
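The spike-and-slab shape that results from a large *c*_{1} and a small *c*_{2} can be seen by simulating from the mixture prior. The sketch below uses base MATLAB only, fixes *σ*^{2} at a hypothetical value of 1 instead of drawing it from the inverse gamma prior, and uses the default variance factors and inclusion probability stated on this page.

```matlab
rng(1);                       % for reproducibility
N = 1e5;                      % number of prior draws
sigma2 = 1;                   % hypothetical fixed disturbance variance
V = [10 0.1];                 % default variance factors (inclusion, exclusion)
c = sigma2*V;                 % conjugate case: c_j = sigma2*V_j
probInc = 0.5;                % default prior inclusion probability

gamma = rand(N,1) < probInc;  % regime indicators
Z = randn(N,1);               % standard normal draws
beta = gamma.*sqrt(c(1)).*Z + (1 - gamma).*sqrt(c(2)).*Z;

% With zero means, the mixture variance is the weighted average of c.
theoryVar = probInc*c(1) + (1 - probInc)*c(2);
sampleVar = var(beta);
```

Calling `histogram(beta)` on the draws shows a narrow spike near 0 from the exclusion regime on top of a wide slab from the inclusion regime.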

In this framework, if there is the potential for a total of *K* coefficients in a model, then there are 2^{K} models through which to search. Because computing posterior probabilities of all 2^{K} models can be computationally expensive, SSVS uses MCMC to sample *γ* = {*γ*_{1},…,*γ*_{K}} and estimate posterior probabilities of the corresponding models. The models that the algorithm visits more often have higher estimated posterior probabilities. The estimated posterior distributions of *β* and *σ*^{2} average over the models visited during the search.

The resulting posterior distribution for conjugate mixture models is analytically tractable (see Algorithms). For details on the posterior distribution, see Analytically Tractable Posteriors.

A closed-form posterior exists for conjugate mixture priors in the SSVS framework with
*K* coefficients. However, because the prior
*β*|*σ*^{2},*γ*,
marginalized by *γ*, is a
2^{K}-component Gaussian mixture, MATLAB uses Markov Chain Monte Carlo instead to sample from the posterior for numerical
stability.

The `bayeslm` function can create any supported prior model object for Bayesian linear regression.

[1] George, E. I. and R. E. McCulloch. "Variable Selection Via Gibbs Sampling." *Journal of the American Statistical Association*. Vol. 88, No. 423, 1993, pp. 881–889.

[2] Koop, G., D. J. Poirier, and J. L. Tobias. *Bayesian Econometric Methods*. New York, NY: Cambridge University Press, 2007.
