ecmlsrmle
Least-squares regression with missing data
Syntax
[Parameters,Covariance,Resid,Info] = ecmlsrmle(Data,Design,MaxIterations,TolParam,TolObj,Param0,Covar0,CovarFormat)
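Arguments
Data: NUMSAMPLES-by-NUMSERIES matrix with NUMSAMPLES samples of a NUMSERIES-dimensional random vector. Missing values are represented as NaN values. Only samples that are entirely NaN are ignored.
Design: A matrix or a cell array that handles two model structures:
If NUMSERIES = 1, Design can be a NUMSAMPLES-by-NUMPARAMS matrix of known values, the standard form for regression on a single series.
If NUMSERIES ≥ 1, Design can be a cell array with either one cell or NUMSAMPLES cells, where each cell contains a NUMSERIES-by-NUMPARAMS matrix of known values. A single cell is used as the Design matrix for every sample.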
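MaxIterations: (Optional) Maximum number of iterations for the estimation algorithm. Default value is 100.
TolParam: (Optional) Convergence tolerance for the estimation algorithm based on changes in model parameter estimates. Default value is sqrt(eps), which is about 1.0e-8 for double precision. The convergence test for changes in model parameters is
$$\Vert Para{m}_{k}-Para{m}_{k-1}\Vert <TolParam\times \left(1+\Vert Para{m}_{k}\Vert \right)$$
where Param represents the output Parameters, and iteration k = 2, 3, ... . Convergence is assumed when both the TolParam and TolObj conditions are satisfied. If both TolParam ≤ 0 and TolObj ≤ 0, the algorithm runs for the maximum number of iterations (MaxIterations), regardless of the convergence tests.
TolObj: (Optional) Convergence tolerance for the estimation algorithm based on changes in the objective function. Default value is eps^(3/4), which is about 1.0e-12 for double precision. The convergence test for changes in the objective function is
$$\left|Ob{j}_{k}-Ob{j}_{k-1}\right|<TolObj\times \left(1+\left|Ob{j}_{k}\right|\right)$$
for iteration k = 2, 3, ... . Convergence is assumed when both the TolParam and TolObj conditions are satisfied.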
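Param0: (Optional) NUMPARAMS-by-1 column vector containing a user-supplied initial estimate for the parameters of the regression model.
Covar0: (Optional) NUMSERIES-by-NUMSERIES matrix. For covariance-weighted least-squares calculations, this matrix corresponds with weights for each series in the regression. The matrix also serves as an initial guess for the residual covariance in the expectation conditional maximization (ECM) algorithm.
CovarFormat: (Optional) Character vector that specifies the format for the covariance matrix. The choices are:
'full': Compute the full covariance matrix (default).
'diagonal': Force the covariance matrix to be a diagonal matrix.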
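The two convergence tests described for TolParam and TolObj can be sketched outside the toolbox. The following Python fragment is an illustration only, not the toolbox implementation: the function names `norm` and `has_converged` are hypothetical, and the Euclidean norm is an assumption, since the page does not say which norm is used.

```python
import math

def norm(v):
    """Euclidean norm of a sequence of floats (assumed norm)."""
    return math.sqrt(sum(x * x for x in v))

def has_converged(param_k, param_prev, obj_k, obj_prev, tol_param, tol_obj):
    """Return True only when both the parameter test and the objective test pass."""
    diff = [a - b for a, b in zip(param_k, param_prev)]
    param_ok = norm(diff) < tol_param * (1.0 + norm(param_k))
    obj_ok = abs(obj_k - obj_prev) < tol_obj * (1.0 + abs(obj_k))
    return param_ok and obj_ok

EPS = 2.0 ** -52             # double-precision machine epsilon
TOL_PARAM = math.sqrt(EPS)   # on the order of 1e-8
TOL_OBJ = EPS ** 0.75        # on the order of 1e-12
```

Because both tests use strict inequalities, non-positive tolerances can never be satisfied in this sketch, so a loop built on `has_converged` would always run to its iteration limit in that case.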

Description
[Parameters,Covariance,Resid,Info] = ecmlsrmle(Data,Design,MaxIterations,TolParam,TolObj,Param0,Covar0,CovarFormat) estimates a least-squares regression model with missing data. The model has the form
$$Dat{a}_{k}\sim N\left(Desig{n}_{k}\times Parameters,\text{\hspace{0.17em}}Covariance\right)$$
for samples k = 1, ... , NUMSAMPLES.
ecmlsrmle estimates a NUMPARAMS-by-1 column vector of model parameters called Parameters, and a NUMSERIES-by-NUMSERIES matrix of covariance parameters called Covariance.
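To make the model form concrete, the sketch below computes the model mean Design_k × Parameters and the residual for one fully observed sample. It is written in Python purely as an illustration. The helper names `mat_vec` and `residual` and all numeric values are made up, and how the toolbox treats residuals for missing entries is not covered here.

```python
def mat_vec(design_k, parameters):
    """Multiply a NUMSERIES-by-NUMPARAMS matrix by a NUMPARAMS vector."""
    return [sum(d * p for d, p in zip(row, parameters)) for row in design_k]

def residual(data_k, design_k, parameters):
    """Residual of one fully observed sample: Data_k - Design_k * Parameters."""
    mean_k = mat_vec(design_k, parameters)
    return [y - m for y, m in zip(data_k, mean_k)]

# Hypothetical sample with NUMSERIES = 2 and NUMPARAMS = 2.
design_k = [[1.0, 0.5],
            [1.0, 2.0]]
parameters = [0.1, 2.0]
data_k = [1.2, 4.0]

resid_k = residual(data_k, design_k, parameters)  # approximately [0.1, -0.1]
```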
ecmlsrmle(Data,Design) with no output arguments plots the log-likelihood function for each iteration of the algorithm.
To summarize the outputs of ecmlsrmle:
Parameters is a NUMPARAMS-by-1 column vector of estimates for the parameters of the regression model.
Covariance is a NUMSERIES-by-NUMSERIES matrix of estimates for the covariance of the regression model's residuals. For least-squares models, this estimate may not be a maximum likelihood estimate except under special circumstances.
Resid is a NUMSAMPLES-by-NUMSERIES matrix of residuals from the regression.
Another output, Info, is a structure that contains additional information from the regression. The structure has these fields:
Info.Obj: A variable-extent column vector, with no more than MaxIterations elements, that contains the value of the objective function at each iteration of the estimation algorithm. The last value in this vector, Obj(end), is the terminal estimate of the objective function. If you do least-squares, the objective function is the least-squares objective function.
Info.PrevParameters: NUMPARAMS-by-1 column vector of estimates for the model parameters from the iteration just prior to the terminal iteration.
Info.PrevCovariance: NUMSERIES-by-NUMSERIES matrix of estimates for the covariance parameters from the iteration just prior to the terminal iteration.
Notes
If doing covariance-weighted least squares (CWLS), Covar0 should usually be a diagonal matrix. Series with greater influence should have smaller diagonal elements in Covar0, and series with lesser influence should have larger diagonal elements. Note that if doing CWLS, Covar0 does not need to be a diagonal matrix even if CovarFormat = 'diagonal'.
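The effect of the diagonal elements can be seen in a small sketch. It assumes a diagonal Covar0 and one common form of weighted objective, in which each squared residual is divided by the corresponding diagonal element. The page does not spell out the exact objective the toolbox uses, so treat this as an illustration. The function name `cwls_contributions` and the numbers are hypothetical.

```python
def cwls_contributions(resid, covar0_diag):
    """Per-series sum of squared residuals, each scaled by 1 / Covar0(j,j).

    resid: list of fully observed samples, one residual per series.
    covar0_diag: diagonal elements of an assumed diagonal Covar0.
    """
    contributions = [0.0] * len(covar0_diag)
    for row in resid:
        for j, (r, c) in enumerate(zip(row, covar0_diag)):
            contributions[j] += r * r / c
    return contributions

# Two series with identical residuals in both samples.
resid = [[1.0, 1.0],
         [2.0, 2.0]]

equal = cwls_contributions(resid, [1.0, 1.0])     # [5.0, 5.0]
weighted = cwls_contributions(resid, [0.5, 2.0])  # [10.0, 2.5]
```

With identical residuals, the series given the smaller diagonal element (0.5) contributes more to the objective than the series given the larger one (2.0), which is why smaller elements correspond to greater influence.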
You can configure Design as a matrix if NUMSERIES = 1 or as a cell array if NUMSERIES ≥ 1.
If Design is a cell array and NUMSERIES = 1, each cell contains a NUMPARAMS row vector.
If Design is a cell array and NUMSERIES > 1, each cell contains a NUMSERIES-by-NUMPARAMS matrix.
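The shape rules for Design can be written as simple checks. The Python sketch below models a matrix as a list of rows and a cell array as a list of matrices. It assumes that a cell array holds either one cell shared by all samples or one cell per sample. The function names are hypothetical and this is not how the toolbox validates its inputs.

```python
def shape(matrix):
    """Return (rows, cols) of a list-of-rows matrix; reject ragged input."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    if any(len(row) != cols for row in matrix):
        raise ValueError("ragged matrix")
    return rows, cols

def check_design_matrix(design, num_samples, num_params):
    """Matrix form (NUMSERIES = 1): one row of regressors per sample."""
    return shape(design) == (num_samples, num_params)

def check_design_cells(cells, num_samples, num_series, num_params):
    """Cell-array form: 1 or NUMSAMPLES cells, each NUMSERIES-by-NUMPARAMS."""
    if len(cells) not in (1, num_samples):
        return False
    return all(shape(cell) == (num_series, num_params) for cell in cells)

# Single series, three samples, two parameters (intercept and slope).
single_series = [[1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
# Two series, two parameters, one cell shared by every sample.
shared_cell = [[[1.0, 0.0], [0.0, 1.0]]]
```

When NUMSERIES = 1, the cell-array check reduces to requiring that each cell be a 1-by-NUMPARAMS row vector.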
These points concern how Design handles missing data:
Although Design should not have NaN values, ignored samples due to NaN values in Data are also ignored in the corresponding Design array.
If Design is a 1-by-1 cell array, which has a single Design matrix for each sample, no NaN values are permitted in the array. A model with this structure must have NUMSERIES ≥ NUMPARAMS with rank(Design{1}) = NUMPARAMS.
ecmlsrmle is more strict than mvnrmle about the presence of NaN values in the Design array.
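The pairing between ignored Data samples and their Design cells can be sketched as follows. The sketch assumes that a sample is ignored only when every series in it is NaN, and it models Design as a list of per-sample matrices or a single shared matrix. The helper names `drop_ignored_samples` and `design_has_nan` are hypothetical, and the toolbox's internal handling may differ.

```python
import math

def drop_ignored_samples(data, design_cells):
    """Drop samples whose Data row is entirely NaN, with their Design cells.

    A single shared Design cell is returned unchanged.
    """
    shared = len(design_cells) == 1
    kept_data, kept_design = [], []
    for k, row in enumerate(data):
        if all(math.isnan(x) for x in row):
            continue  # ignored sample: skip its Design cell as well
        kept_data.append(row)
        if not shared:
            kept_design.append(design_cells[k])
    return kept_data, (design_cells if shared else kept_design)

def design_has_nan(design_cells):
    """True if any Design entry is NaN, which the shared-cell form forbids."""
    return any(math.isnan(x) for cell in design_cells for row in cell for x in row)

nan = float("nan")
data = [[1.0, nan], [nan, nan], [3.0, 4.0]]
design = [[[1.0, 0.0], [0.0, 1.0]],
          [[1.0, 1.0], [0.0, 1.0]],
          [[1.0, 2.0], [0.0, 1.0]]]

kept_data, kept_design = drop_ignored_samples(data, design)
```

Here the second sample is dropped together with its Design cell, while the first sample, which is only partially missing, is kept.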
Use the estimates in the optional output structure Info for diagnostic purposes.
Examples
See Multivariate Normal Regression, Least-Squares Regression, Covariance-Weighted Least Squares, Feasible Generalized Least Squares, and Seemingly Unrelated Regression.
References
Roderick J. A. Little and Donald B. Rubin. Statistical Analysis with Missing Data. 2nd Edition. John Wiley & Sons, Inc., 2002.
Xiao-Li Meng and Donald B. Rubin. “Maximum Likelihood Estimation via the ECM Algorithm.” Biometrika. Vol. 80, No. 2, 1993, pp. 267–278.
Joe Sexton and Anders Rygh Swensen. “ECM Algorithms that Converge at the Rate of EM.” Biometrika. Vol. 87, No. 3, 2000, pp. 651–662.
A. P. Dempster, N. M. Laird, and D. B. Rubin. “Maximum Likelihood from Incomplete Data via the EM Algorithm.” Journal of the Royal Statistical Society. Series B, Vol. 39, No. 1, 1977, pp. 1–37.