Different confidence intervals for regression slope
Can anyone explain why I get different confidence limits for the slope of a linear regression when I use polyfit and polyparci compared with fitlm and coefCI? For example, the following code generates some linearly correlated data with added noise, then does the least-squares fit three ways (directly, using polyfit, and using fitlm), extracting the key items of data at each step:
clear variables
x = (0:10)';
Y = 3.5*x + (((rand(size(x))-0.5)/3).*x);
% option 1: direct least squares with the backslash operator
X = [ones(size(Y)), x];
B1 = X\Y;
Ycalc = X*B1;
R21 = 1 - sum((Y - Ycalc).^2)/sum((Y - mean(Y)).^2);
R2a1 = 1 - ((1-R21)*(length(Y)-1)/(length(Y)-length(B1)));
clear X Ycalc
% option 2: polyfit, with polyparci for the confidence limits
[p,S] = polyfit(x,Y,1);
B2 = fliplr(p)';
coef = corrcoef(x,Y);
R22 = coef(1,2)^2;
R2a2 = 1 - ((1-R22)*(length(Y)-1)/(length(Y)-length(B2)));
ci2 = polyparci(p,S,0.95);
clear p S coef
% option 3: fitlm, with coefCI for the confidence limits
mdl = fitlm(x,Y,'y ~ x1');
B3 = mdl.Coefficients{:,1};
R23 = mdl.Rsquared.Ordinary;
R2a3 = mdl.Rsquared.Adjusted;
ci3 = coefCI(mdl,0.05);
ci3 = fliplr(ci3');
clear mdl
As one would expect, all of the approaches produce the same regression coefficients, R-squared and adjusted R-squared values. However, the confidence intervals generated by polyparci and coefCI are different. In all cases I have tried, the range of the confidence limits returned by coefCI is wider than that from polyparci.
Can anyone explain why the methods produce different results?
Thanks, Brian
Answers (2)
Star Strider
28 April 2017
I originally tested polyparci only with nlparci, and the estimates then were essentially the same. I posted it before fitlm appeared.
Change the ‘tstat’ assignment in polyparci to:
tstat = @(tval) (max(alpha,(1-alpha)) - t_cdf(tval,PolyS.df) ); % Function to calculate t-statistic for p = ‘alpha’ and v = ‘PolyS.df’
and the results are identical with nlparci, fitlm and regress.
Thank you for discovering this glitch with the ‘alpha’ argument. I’ll update polyparci and post it.
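For anyone who wants to check the numbers independently of polyparci, here is a minimal sketch of one way to get the 95% limits directly from the polyfit output. It is not the polyparci source. It assumes the coefficient covariance formula given in the polyfit documentation for the S structure, and it needs the Statistics and Machine Learning Toolbox for tinv, fitlm and coefCI. The names covB, seB, tcrit, ciPoly and ciLM are only illustrative.

```matlab
% Sketch: 95% confidence limits for a straight-line fit from polyfit output.
% Assumes the Statistics and Machine Learning Toolbox (tinv, fitlm, coefCI).
x = (0:10)';
Y = 3.5*x + (((rand(size(x))-0.5)/3).*x);

[p,S] = polyfit(x,Y,1);                      % p = [slope, intercept]

alpha = 0.05;                                % 5% significance -> 95% confidence
Rinv  = inv(S.R);
covB  = (Rinv*Rinv') * S.normr^2 / S.df;     % coefficient covariance (per polyfit doc)
seB   = sqrt(diag(covB))';                   % standard errors, same order as p
tcrit = tinv(1 - alpha/2, S.df);             % two-sided critical t value

ciPoly = [p - tcrit*seB; p + tcrit*seB];     % row 1 = lower, row 2 = upper

% Same limits from fitlm, reordered as in the question (slope first)
mdl  = fitlm(x,Y);
ciLM = fliplr(coefCI(mdl,alpha)');           % row 1 = lower, row 2 = upper

disp(max(abs(ciPoly(:) - ciLM(:))))          % should be close to zero
```

If the two sets of limits agree to rounding error, any remaining difference from a third function is most likely down to how that function interprets its alpha argument.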
Comments: 2
Brian Scannell
28 April 2017
Star Strider
28 April 2017
My pleasure.
With the correction I posted, there is no ambiguity, and the confidence interval will be the same.
My impression is that the confidence interval calculation in nlparci changed between the time I wrote the function and now. I changed my function to accord with the current behavior of the MATLAB Statistics and Machine Learning Toolbox functions.
‘Taken together, it means there is a 10% chance that the "true" gradient is outside the bounds defined by the upper and lower limits.’
That is incorrect, at least as I read it. For a 95% confidence interval (alpha = 0.05), there is a 95% probability that the true value is within those limits and a 5% probability (2.5% in each tail) that it lies outside them.
The terms ‘confidence limits’ and ‘confidence interval’ are essentially the same. The context must be clear if either term is used. I prefer the term ‘confidence limits’.
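To make the alpha convention concrete, here is a small sketch of how a 5% significance level splits into two 2.5% tails, and how max(alpha,1-alpha) in the corrected tstat line treats 0.05 and 0.95 as the same confidence level. It assumes the Statistics and Machine Learning Toolbox for tinv and tcdf; df = 9 is my assumption for the example in the question (11 points, 2 fitted coefficients), and the variable names are illustrative.

```matlab
% Sketch: two-sided critical value and tail probabilities for a 95% interval.
df    = 9;                          % assumed: 11 data points minus 2 coefficients
alpha = 0.05;                       % significance level for a 95% interval
tcrit = tinv(1 - alpha/2, df);      % two-sided critical t value
pLow  = tcdf(-tcrit, df);           % probability below the lower limit
pHigh = 1 - tcdf(tcrit, df);        % probability above the upper limit
fprintf('tcrit = %.4f, tails = %.4f + %.4f\n', tcrit, pLow, pHigh)

% Passing 0.05 or 0.95 gives the same confidence level after max(alpha,1-alpha)
levelA = max(0.05, 1 - 0.05);
levelB = max(0.95, 1 - 0.95);
```

With this convention, a call with 0.95 and a call with 0.05 both request the 95% interval, which is why the corrected polyparci no longer depends on which form is passed.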