regress and stats

jenka on 20 Jan 2012
The regress function can return a stats output that includes R^2, among other things. I am trying to understand the relationship between R^2 and corrcoef. In simple linear regression, with one response variable y and one independent variable x, R = corrcoef(x, y), and also R = corrcoef(y, y_from_regress_function). However, when I have, say, two independent variables x1 and x2, the relationships above do not hold. One relationship should still hold, though: R from the regress output should still equal corrcoef(y, y_from_regress_function). Any suggestions on why MATLAB does not produce the expected R^2 in multiple regression? Here is the code I use:
X = [ones(size(x1)) x1 x2 x1.*x2];
[b,bind,r,rint,stats] = regress(y,X);
model = b(1) + b(2)*x1 + b(3)*x3 + b(4).*x1.*x2;
corr = corrcoef(model,y);
I expected stats(1) = corr^2, but it is not. Any suggestions?
the cyclist on 20 Jan 2012
It would also be helpful if you posted code with specification of x1, etc., such that it is a self-contained example that exhibits the issue. That saves people who might help you a lot of guesswork, and gives a common example to work with.


Answers (2)

Léon on 24 Jan 2012
That is not really a MATLAB question, since it comes down to econometrics/statistics.
In every case the coefficient of determination R^2 is the ratio of the explained sum of squares to the total sum of squares, R^2 = SSE / SST (SSE here means the explained sum of squares, not the sum of squared errors). In the bivariate case we can show that the (Pearson) correlation coefficient is sufficient to describe the explanatory power of the model, so that r^2 = R^2. That is, the covariance between y and x (where x is the single explanatory variable), taken relative to the variances of y and x, captures the variation that this specific model can explain.
In other words, the R^2 in the bivariate case can be rewritten as r_(y,x) * (beta_x * s_x/s_y). Hence the coefficient of determination is the correlation coefficient weighted by the standardized regression coefficient of x. This relationship carries over to the trivariate and multivariate cases, where R^2 can be expressed as the sum of the bivariate correlations between y and each regressor, each weighted by its standardized regression coefficient. So the point is that in the bivariate case the standardized regression coefficient equals the correlation coefficient (!), such that
--> R^2 = r * r = r * (beta_x * s_x / s_y) (for the bivariate case).
I hope this helps you see the relation between R^2 and the correlations between your variables more clearly. But once again, this is the subject of elementary econometrics courses/books, and you should be aware of these things before using such models, which might otherwise give you biased/wrong results.
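The relationships described in this answer can be written out compactly. The block below is only a sketch in notation that is not in the original post: it assumes an ordinary least-squares fit with an intercept and k regressors, writes \hat{y} for the fitted values, \bar{y} for the mean of y, and \beta^{*}_{j} for the standardized coefficient of regressor x_j, and keeps the answer's use of SSE for the explained sum of squares. It needs the amsmath package.

```latex
\begin{align*}
R^2 &= \frac{SSE}{SST}
     = \frac{\sum_i (\hat{y}_i - \bar{y})^2}{\sum_i (y_i - \bar{y})^2}
     = \bigl[\operatorname{corr}(y, \hat{y})\bigr]^2, \\
\beta^{*}_{j} &= \beta_{j}\,\frac{s_{x_j}}{s_y}, \qquad
R^2 = \sum_{j=1}^{k} \beta^{*}_{j}\, r_{y,x_j}, \\
k = 1:\quad \beta^{*}_{x} &= r_{y,x}
\;\Longrightarrow\; R^2 = r_{y,x}\cdot r_{y,x} = r_{y,x}^2 .
\end{align*}
```

For the model in the question, the regressors are x1, x2 and x1.*x2 (k = 3), so R^2 is not expected to equal the squared correlation of y with any single regressor, but it should equal the squared correlation of y with the fitted values.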

Tom Lane on 24 Jan 2012
One problem is that the model you fit is not the same as the "model" value you computed afterward. Or maybe the "x3" was just a typo. Either way, here's some code showing that the square of the correlation between the observed and fitted y is equal to the R^2 value in the stats output, stats(1):
x1 = randn(100,1); x2 = 5*rand(100,1);
y = 100 + 10*x1 - 4*x1.*x2 + 3*x2.^2;
X = [ones(size(x1)) x1 x2 x1.*x2];
[b,bind,r,rint,stats] = regress(y,X);
model = X*b;
corr = corrcoef(model,y)
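To make the comparison explicit, the snippet below is a minimal sketch that puts stats(1) next to the squared off-diagonal entry of the corrcoef matrix. It assumes regress is available (Statistics Toolbox); the seed and the names yhat, C and r2 are illustrative choices, not part of the code above.

```matlab
% Sketch: compare the R^2 reported by regress with the squared
% correlation between observed y and fitted values.
% Assumes the Statistics Toolbox (regress); the data are simulated.
rng(0);                                  % arbitrary fixed seed for repeatability
x1 = randn(100,1);
x2 = 5*rand(100,1);
y  = 100 + 10*x1 - 4*x1.*x2 + 3*x2.^2;   % x2.^2 is not in the fitted model, so R^2 < 1

X = [ones(size(x1)) x1 x2 x1.*x2];       % intercept column plus three regressors
[b,~,~,~,stats] = regress(y,X);

yhat = X*b;                              % fitted values from the same design matrix
C    = corrcoef(yhat,y);                 % 2-by-2 correlation matrix
r2   = C(1,2)^2;                         % squared correlation of fitted vs. observed

fprintf('stats(1) = %.6f, corrcoef^2 = %.6f\n', stats(1), r2);
```

Two details matter here. Computing the fit as X*b guarantees it uses exactly the columns that were passed to regress, which avoids mismatches such as the x3 in the question. And corrcoef returns a matrix, so the value to square is the off-diagonal element C(1,2), not the whole matrix.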

