multivariate regression with many dummy variables
Mutlu on 24 Dec 2012
Hello,

I have a data set with 1800 observations and 7 predictors, the first of which needs to be coded as a dummy variable. The issue is that this variable has 73 categories. I used the dummyvar() function to turn it into an 1800-row by 73-column dummy variable matrix. Now I would like to run a linear regression using this dummy matrix plus the remaining 6 original predictors, but I am not sure how to do it. I tried merging the dummy matrix with the 6-column predictor matrix and calling regress() on the result, but it gave me errors and warnings.

How would I run a regression with so many categories, extract the slopes and intercepts (I assume all categories will have a similar slope), and then use these coefficients to predict an outcome for individual categories? I am using release R2010b with all the toolboxes.

Thank you.
Mutlu

Answers (1)

Greg Heath on 27 Dec 2012
It should work. You will have to post more info, e.g., relevant code and error messages.
How, exactly, are your dummies coded?
Are they mutually exclusive?
Are the prior probabilities comparable?
Are the means and variances of all variables comparable?
Are you weighting the error function?
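
The first two questions above can be checked directly on the dummy matrix. The following is a minimal sketch on made-up labels, not the poster's data: the vector g, the 6-category setup, and the variable names are all assumptions for illustration. It shows one possible source of rank deficiency, namely a category index that never occurs, which dummyvar turns into an all-zero column.

```matlab
% Hypothetical labels: categories 1..6, but category 4 never occurs.
g = [1 2 3 5 6 1 2 3 5 6 2 6]';
D = dummyvar(g);                         % 12-by-6; column 4 is all zeros

% Mutually exclusive and exhaustive: every row contains exactly one 1.
isExclusive = all(sum(D, 2) == 1);

% Categories with no observations produce all-zero columns,
% which make the design matrix rank deficient.
emptyCols = find(sum(D, 1) == 0);

% Keep only populated categories, then drop one as the reference level.
D = D(:, sum(D, 1) > 0);
D = D(:, 2:end);

% With a column of ones for the intercept, X should now have full column rank.
X = [ones(size(D, 1), 1) D];
isFullRank = (rank(X) == size(X, 2));
```

If isFullRank is still false on real data after these steps, the remaining collinearity would have to come from the other predictors, which this sketch does not cover.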
  Comments: 1
Mutlu on 28 Dec 2012
Dear Greg, thank you for the answer. Here is some additional information.

The warning I get is "X is rank deficient to within machine precision". I am coding my dummy variables with the dummyvar function, which generates an 1800x73 matrix where 73 corresponds to the number of categories. I then drop one of the columns, as suggested in the dummyvar help. The dummies seem to be mutually exclusive. I am not sure what you mean by prior probabilities, and I have not checked whether the means and variances are comparable (I assume you mean the statistics of each category for the 6 predictor variables). Finally, I am not weighting the error function. Should I be?

At this point I am simply looking for a hypothetical example with, say, 5 or more dummy variables, where regress is applied to find the coefficients and those coefficients are then used to predict outcomes for individual categories.

Thank you again,
Mutlu
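
A hypothetical example of the kind requested might look like the sketch below. Everything in it is invented for illustration: the 5 categories, the two continuous predictors x1 and x2, the coefficients used to generate y, and all variable names. It assumes a model with one shared set of slopes and a separate intercept per category, and it relies on the fact that regress does not add an intercept on its own, so a column of ones is included explicitly.

```matlab
% Hypothetical data: 5 categories, 2 continuous predictors, 60 observations.
n  = 60;
g  = repmat((1:5)', n/5, 1);           % category label 1..5 for each row
x1 = linspace(0, 10, n)';              % first continuous predictor
x2 = sin((1:n)');                      % second continuous predictor

% Assumed generating model (illustration only): shared slopes,
% per-category offsets, and a small deterministic disturbance.
offsets = [0; 1.5; -2; 3; 0.5];
y = 4 + offsets(g) + 2*x1 - 3*x2 + 0.1*cos(7*(1:n)');

% Dummy-code the categories and drop the first column (reference category).
D = dummyvar(g);                       % n-by-5 indicator matrix
D = D(:, 2:end);                       % n-by-4; category 1 is the baseline

% Design matrix: intercept column, dummies, then continuous predictors.
X = [ones(n, 1) D x1 x2];
fprintf('rank(X) = %d, columns = %d\n', rank(X), size(X, 2));

b = regress(y, X);                     % [intercept; 4 dummy coefs; 2 slopes]

% Intercept per category: baseline intercept plus that category's dummy coefficient.
catIntercept = b(1) + [0; b(2:5)];
slopes       = b(6:7);

% Predict the outcome for category 3 at x1 = 5, x2 = 0.2.
k    = 3;
yhat = catIntercept(k) + [5 0.2] * slopes;
```

The same layout should carry over to 73 categories and 6 predictors: the dummy block becomes 72 columns after one is dropped, and the slope vector has 6 entries. If each category needed its own slope as well, interaction columns between the dummies and the predictors would be required, which this sketch does not include.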
