regress and stats
In the regress function there is an option to save stats, which includes R^2 among other things. I am trying to see the relationship between R^2 and corrcoef. With simple linear regression (a response y and one independent variable x), R = corrcoef(x, y); also, R = corrcoef(y, y_from_regress_function). However, when I have, say, two independent variables x1 and x2, the relationships above no longer hold. One relationship still has to hold, though: the R^2 from the regress output should still equal the squared correlation corrcoef(y, y_from_regress_function). Any suggestions on why MATLAB does not produce the expected R^2 in multiple regression? Here is the code I use:
X = [ones(size(x1)) x1 x2 x1.*x2];
[b,bind,r,rint,stats] = regress(y,X);
model = b(1) + b(2)*x1 + b(3)*x3 + b(4).*x1.*x2;
corr = corrcoef(model,y);
I expected stats(1) = corr^2, but it is not. Any suggestions?
Léon, 24 Jan 2012
That is not really a MATLAB-related question, since it concerns econometrics/statistics.
In every case the coefficient of determination R^2 is the ratio of the explained sum of squares to the total sum of squares, R^2 = SS_explained / SS_total. In the bivariate case we can show that the (Pearson) correlation coefficient is sufficient to describe the explanatory power of the model, so that r^2 = R^2: the covariance between y and x (where x is the single explanatory variable), scaled by the variances of y and x, captures exactly the variation that can be explained by that specific model.

In other words, the R^2 in the bivariate case can be rewritten as r_(y,x) * (beta_x * s_x/s_y). Hence the coefficient of determination is the correlation coefficient weighted by the standardized regression coefficient of x. This relationship carries over to the trivariate and multivariate case, where R^2 can be expressed as the sum of all bivariate correlations with y, each weighted by its standardized regression coefficient. So the point is that in the bivariate case the standardized regression coefficient equals the correlation coefficient (!), such that

--> R^2 = r * r = r * (beta_x * s_x / s_y), (bivariate case).
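The bivariate identities above (r^2 = R^2 and r = beta_x * s_x / s_y) can be checked numerically. Here is a small sketch in Python using only the standard library; the data and all variable names are purely illustrative:

```python
# Demo: in simple (bivariate) OLS regression, the coefficient of
# determination R^2 equals the squared Pearson correlation r^2,
# and r equals the standardized slope beta * s_x / s_y.
from statistics import mean

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.3]
mx, my = mean(x), mean(y)

# Centered cross-products and sums of squares
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
sxx = sum((xi - mx) ** 2 for xi in x)
syy = sum((yi - my) ** 2 for yi in y)

# OLS slope and intercept from the closed-form solution
beta = sxy / sxx
alpha = my - beta * mx
fitted = [alpha + beta * xi for xi in x]

# R^2 as explained sum of squares over total sum of squares
ss_explained = sum((fi - my) ** 2 for fi in fitted)
r2_from_anova = ss_explained / syy

# Pearson correlation between x and y
r = sxy / (sxx * syy) ** 0.5

print(abs(r ** 2 - r2_from_anova) < 1e-12,      # r^2 == R^2
      abs(beta * (sxx / syy) ** 0.5 - r) < 1e-12)  # r == beta * s_x/s_y
```

Both checks succeed because, with an intercept in the model, the ANOVA decomposition makes the two expressions algebraically identical, not just numerically close.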
I hope this helps you see the relation between R^2 and the correlation between your variables more clearly. But once again, this is the subject of elementary econometrics courses/books, and you should be aware of these things before using such models, which might otherwise give you biased/wrong results.
Tom Lane, 24 Jan 2012
One problem is that the model you fit is not the same as the "model" value you computed afterward. Or maybe the "x3" was just a typo. Either way, here is some code showing that the square of the correlation between the observed and fitted y equals the R^2 value in the stats output:
x1 = randn(100,1); x2 = 5*rand(100,1);
y = 100 + 10*x1 - 4*x1.*x2 + 3*x2.^2;
X = [ones(size(x1)) x1 x2 x1.*x2];
[b,bind,r,rint,stats] = regress(y,X);
model = X*b;
corr = corrcoef(model,y);
[corr(1,2)^2 stats(1)]   % squared correlation matches R^2
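The same identity is language-agnostic. For readers without MATLAB, here is a standard-library-only Python sketch (the data, model, and names are illustrative, not the regress internals) that fits a similar interaction model via the normal equations and checks that the squared correlation between y and the fitted values equals R^2:

```python
# Multiple regression: with an intercept in the design matrix, the squared
# Pearson correlation between observed y and fitted y equals R^2,
# even when the model is misspecified (here y contains an x2^2 term
# that the design matrix omits, mirroring the MATLAB example).
from statistics import mean
import random

random.seed(0)
n = 100
x1 = [random.gauss(0, 1) for _ in range(n)]
x2 = [random.uniform(0, 5) for _ in range(n)]
y = [100 + 10 * a - 4 * a * b + 3 * b ** 2 + random.gauss(0, 1)
     for a, b in zip(x1, x2)]

# Design matrix with intercept and interaction, as in the MATLAB code
X = [[1.0, a, b, a * b] for a, b in zip(x1, x2)]
k = 4

# Solve the normal equations (X'X) beta = X'y by Gaussian elimination
A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
     for r in range(k)]
rhs = [sum(X[i][r] * y[i] for i in range(n)) for r in range(k)]
for col in range(k):
    p = max(range(col, k), key=lambda r: abs(A[r][col]))  # partial pivoting
    A[col], A[p] = A[p], A[col]
    rhs[col], rhs[p] = rhs[p], rhs[col]
    for r in range(col + 1, k):
        f = A[r][col] / A[col][col]
        for c in range(col, k):
            A[r][c] -= f * A[col][c]
        rhs[r] -= f * rhs[col]
beta = [0.0] * k
for r in range(k - 1, -1, -1):
    beta[r] = (rhs[r] - sum(A[r][c] * beta[c]
                            for c in range(r + 1, k))) / A[r][r]

fitted = [sum(X[i][c] * beta[c] for c in range(k)) for i in range(n)]

# R^2 from the ANOVA decomposition ...
my = mean(y)
r2 = 1 - sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) / \
         sum((yi - my) ** 2 for yi in y)

# ... and the squared correlation between y and fitted y
mf = mean(fitted)
num = sum((yi - my) * (fi - mf) for yi, fi in zip(y, fitted))
den = (sum((yi - my) ** 2 for yi in y) *
       sum((fi - mf) ** 2 for fi in fitted)) ** 0.5
corr_sq = (num / den) ** 2

print(abs(r2 - corr_sq) < 1e-9)  # the two agree
```

The key requirement is the intercept column: it guarantees the residuals are orthogonal to the fitted values and sum to zero, which is what makes the ANOVA R^2 and the squared correlation coincide.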