Rank deficiency issues with GeneralizedLinearModel

I'm not sure whether I'm encountering a bug associated with GeneralizedLinearModel objects, or whether this is simply a result of my shaky knowledge of the math underlying this process. Hopefully someone can point me in the right direction...
I'm attempting to fit logistic regression models to several datasets via stepwise regression (xobs is a 3615 x n array, where n ranges from 2 to 6, and yobs is a 3615 x 1 vector):
mdl = GeneralizedLinearModel.stepwise(xobs, yobs>0, 'purequadratic', ...
'distribution', 'binomial', ...
'link', 'logit', ...
'criterion', 'aic');
For some of my datasets, this results in a regression model with a rank-deficient regression matrix.
Warning: Regression design matrix is rank deficient to within machine precision.
> In TermsRegression>TermsRegression.checkDesignRank at 98
In GeneralizedLinearModel>GeneralizedLinearModel.stepwise at 1553
Here's an example of one such model:
Generalized Linear regression model:
logit(H) ~ 1 + Smax + Smin + Tavg + Savg*Srange + Smax^2 + Smin^2 + Tavg^2
Distribution = Binomial
Estimated Coefficients:
                   Estimate     SE           tStat      pValue
    (Intercept)    -30.528      10.947       -2.7887    0.0052926
    Savg           -0.54303     0.17605      -3.0845    0.0020389
    Smax           0.95502      0.26295      3.632      0.00028129
    Smin           0            0            NaN        NaN
    Srange         -0.56718     0.20256      -2.8       0.0051102
    Tavg           1.5211       0.75819      2.0063     0.044827
    Savg:Srange    0.051677     0.016717     3.0913     0.0019928
    Smax^2         -0.027523    0.0078208    -3.5192    0.00043288
    Smin^2         0.022062     0.0071608    3.0809     0.0020637
    Tavg^2         -0.027077    0.013544     -1.9992    0.045585
3249 observations, 3240 error degrees of freedom
Dispersion: 1
Chi^2-statistic vs. constant model: 26, p-value = 0.00104
With this model, several functions associated with GLMs, such as plotSlice and predict, throw errors. Is this by design? I'm honestly not sure whether the resulting model is mathematically sound, or whether I need to add and/or remove terms manually to get an allowable model.
Any suggestions for working with models like this? Are there tests I can run on the predictor matrix (xobs) to isolate potential troublemaker interactions? Alternatively, is there a way to prevent the stepwise method from adding terms that would result in a rank-deficient regression matrix?
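
For reference, here is a minimal sketch of the kind of test on the predictor matrix I have in mind. It uses only base MATLAB (rank, qr with column pivoting, null) on synthetic stand-in data, not my real xobs. The variable names mirror my predictors, and the relationship Srange = Smax - Smin is an assumption made purely for illustration; whether my actual columns are related this way is exactly what such a check would reveal.

```matlab
% Synthetic stand-in data (hypothetical values, NOT the real xobs).
rng(0);
n      = 200;
Smax   = 30 + 2*randn(n,1);
Smin   = 25 + 2*randn(n,1);
Tavg   = 20 + 3*randn(n,1);
Srange = Smax - Smin;            % assumed exact linear dependency

% Design matrix: intercept plus main effects.
X        = [ones(n,1), Smax, Smin, Tavg, Srange];
colNames = {'(Intercept)', 'Smax', 'Smin', 'Tavg', 'Srange'};

% 1) Numerical rank versus number of columns.
r = rank(X);
fprintf('rank = %d of %d columns\n', r, size(X,2));

% 2) Column-pivoted QR: columns pivoted to the end with a negligible
%    diagonal entry in R are linear combinations of the earlier ones.
[~, R, p] = qr(X, 0);            % p is a permutation vector
d         = abs(diag(R));
tol       = max(size(X)) * eps(d(1));
nIndep    = sum(d > tol);
redundant = colNames(p(nIndep+1:end));
fprintf('redundant column(s): %s\n', strjoin(redundant, ', '));

% 3) Null-space basis: each vector lists the columns taking part
%    in one linear dependency (nonzero entries).
N = null(X);
for k = 1:size(N,2)
    v        = N(:,k) / max(abs(N(:,k)));
    involved = abs(v) > 1e-8;
    fprintf('dependency %d involves: %s\n', k, ...
            strjoin(colNames(involved), ', '));
end
```

If a check like this flags a column on the real data, one option would be to drop that column from xobs before calling GeneralizedLinearModel.stepwise, so the stepwise search never sees the redundant predictor. Note that the pivoted QR only reports one column per dependency; which member of the dependent group to remove is a modeling choice.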

Asked: June 28, 2013