I'm confused about how ridge regression coefficients are generated in MATLAB. Any help would be appreciated. An example of the issue is shown below.
Thanks,
JG
N = 200;
p = 30;
y = rand(N,1);
X = [ones(N,1),rand(N,p)];
lambda = 1;
R = X'*X + lambda*eye(size(X,2));
Rinv = inv(R);
b_ridge = Rinv*X'*y;
y_ridge = X*b_ridge;
XX = X(:,2:end);
b_ridge_matlab = ridge(y,XX,lambda,0);
y_ridge_matlab = X*b_ridge_matlab;
% Why are b_ridge and b_ridge_matlab different? I thought that
% the 0 option in ridge eliminated all scaling and was useful for
% prediction (i.e., y_pred = X_new*b).

Accepted Answer

Tom Lane on 17 Feb 2012


Good question! This took a while to figure out, and I can see the help text is not clear about it. The calculations are actually always based on a scaled X under the hood, but the results are adjusted later to be usable with the unscaled data. In particular, the ridge parameter is interpreted as applying to the scaled data. You can reproduce the ridge results by computing R in your code as follows:
R = X'*X + lambda*diag(var(X));
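The equivalence described above can be checked without calling ridge at all. The following is a minimal sketch, assuming that "scaled X" means each predictor is centered to mean 0 and divided by its standard deviation, and that the post-processing divides the slopes by those standard deviations and recovers the intercept from the means. It is a reading of the behavior described in this answer, not a copy of ridge's source. The names b_backtransformed, b_modified, and max_diff are made up for illustration.

```matlab
N = 200; p = 30; lambda = 1;
XX = rand(N,p);                  % predictors without the intercept column
y  = rand(N,1);

% Step 1: center and scale each predictor (assumed: mean 0, std 1).
mu = mean(XX);
sd = std(XX);
Z  = bsxfun(@rdivide, bsxfun(@minus, XX, mu), sd);

% Step 2: ridge fit on the scaled predictors. No intercept column is
% needed here because every column of Z has zero mean.
b_scaled = (Z'*Z + lambda*eye(p)) \ (Z'*y);

% Step 3: map the slopes back to the original units and recover the
% intercept (assumed post-processing).
b_slopes = b_scaled ./ sd';
b0       = mean(y) - mu*b_slopes;
b_backtransformed = [b0; b_slopes];

% The modified R from this answer, applied to the unscaled design matrix.
% var(ones(N,1)) is 0, so the intercept is not penalized.
X = [ones(N,1), XX];
R = X'*X + lambda*diag(var(X));
b_modified = R \ (X'*y);

% Largest disagreement between the two coefficient vectors.
max_diff = max(abs(b_backtransformed - b_modified));
```

If the assumptions hold, max_diff should sit at round-off level, which is why lambda*diag(var(X)) reproduces coefficients that are fitted on scaled data and then converted back.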

Comments: 3

Jason on 17 Feb 2012
Hi Tom,
Thank you for your response. Yes, the modified definition of R gives results consistent with MATLAB. However, this points to a broader question: why would I want to use a set of coefficients for prediction that are not defined according to the standard definition (i.e., the one that appears in MATLAB's documentation: beta_hat = inv(X'*X + lambda*eye(p+1))*X'*y)? Maybe I don't understand enough about ridge regression generally, or maybe the coefficients coming from b_ridge_matlab = ridge(y,XX,lambda,0) are meant to be used with some special prediction routine and not just y_ridge_matlab = X*b_ridge_matlab.
Thanks again.
JG
Tom Lane on 17 Feb 2012
I agree the help text is confusing. The definition you quote is accurate when X is scaled. I think the alternative with the "0" flag ought to be described as presenting the ridge coefficients, computed the same way, but then post-processed so they can be used with the original X variables. Unless I misunderstand, they do serve that purpose. Try changing your script to include a real relationship between X and y, and at the end plot the fitted and observed values:
y = X*(5./(1:31)')+rand(N,1);
...
scatter(y_ridge_matlab,y)
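One way the suggested experiment might look as a complete script is sketched below. To keep it runnable without the Statistics Toolbox, it uses the modified R from the accepted answer as a stand-in for ridge(y,XX,lambda,0). That substitution is an assumption, and the names b_fit, y_fit, and r are hypothetical.

```matlab
N = 200; p = 30; lambda = 1;
X = [ones(N,1), rand(N,p)];

% A real relationship between X and y, as suggested in the comment above.
y = X*(5./(1:p+1)') + rand(N,1);

% Stand-in for ridge(y,X(:,2:end),lambda,0): the penalty on each
% predictor is weighted by its variance (assumption, see lead-in).
R = X'*X + lambda*diag(var(X));
b_fit = R \ (X'*y);

% Plain prediction with the unscaled design matrix.
y_fit = X*b_fit;

% Correlation between fitted and observed values.
C = corrcoef(y_fit, y);
r = C(1,2);
```

Calling scatter(y_fit,y) afterwards gives the plot described above. A value of r close to 1 would indicate that the coefficients work with plain X*b prediction and need no special routine.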
Jason on 19 Feb 2012
Thanks, Tom.


Asked: 16 Feb 2012

Edited: 16 Oct 2013
