Difference between regress function and basic fitting wizard

Victoria Dutch on 21 Nov 2023
Answered: Sulaymon Eshkabilov on 21 Nov 2023
I have 3-D matrices for two variables, A and B. A contains measured values that are incomplete in space and time, so it has a large number of NaN values. B is a modelled value of A, with far fewer NaN values. I would like an equation of the form B = mA + c, so that I can see what the predicted value of B would be for a given value of A. I have used the regress function, first converting each matrix to a column like so:
v = reshape(A,[],1);
onez = ones(length(v),1);
v_regress = horzcat(onez,v);
u = reshape(B,[],1);
[w,x,y,~,z] = regress(u,v_regress);
I have also made a scatter plot (using the scatter function) of the column versions of A and B (i.e. v and u), and then applied a linear fit with the Basic Fitting tool. The resulting fitted line looks wildly different from (and much better than) the line plotted from the regress output. Additionally, the regress function gives a numerical R^2 value, whereas the Basic Fitting tool gives an R^2 value of NaN.
What does the linear fit option in the Basic Fitting tool compute differently from the regress function?

Answers (1)

Sulaymon Eshkabilov on 21 Nov 2023
They both produce the same results: R^2 is the same from regress and from fitlm. For example:
A = randi([-13, 13], 20, 5);
B = randi([-130, 130], 20, 5);
IDXA_1 = randi(10, 7,1);
IDXA_2 = randi(5, 7,1);
for ii=1:numel(IDXA_1)
A(IDXA_1(ii), IDXA_2(ii)) = NaN; % A contains some NaNs
end
IDXB_1 = randi(10, 5,1);
IDXB_2 = randi(5, 5,1);
for ii=1:numel(IDXB_1)
B(IDXB_1(ii), IDXB_2(ii)) = NaN; % B contains some NaNs
end
v = reshape(A,[],1);
onez = ones(length(v),1);
v_regress = horzcat(onez,v);
u = reshape(B,[],1);
[w,x,y,~,z] = regress(u,v_regress);
R2=z(1);
disp(R2)
0.0171
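MDL = fitlm(v_regress,u) % fitlm adds its own intercept, so the ones column in v_regress is redundant and triggers the warning below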
Warning: Regression design matrix is rank deficient to within machine precision.
MDL =
Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate      SE       tStat     pValue
                   ________    ______    ______    _______
    (Intercept)          0          0       NaN        NaN
    x1              13.208     8.2415    1.6026    0.11269
    x2             -1.2607     1.0258    -1.229    0.22242

Number of observations: 89, Error degrees of freedom: 87
Root Mean Squared Error: 77.4
R-squared: 0.0171, Adjusted R-Squared: 0.00577
F-statistic vs. constant model: 1.51, p-value = 0.222
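The rank-deficiency warning in this output comes from passing a design matrix that already contains a column of ones to fitlm, which adds an intercept term of its own. A minimal sketch of the same comparison with only the predictor column passed to fitlm is given here. The data are synthetic and purely illustrative (the slope 3, offset 5, and NaN counts are assumptions, not the original poster's values), and both functions require the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical synthetic data, not the original poster's measurements
rng(0);                                  % fixed seed for repeatability
v = randi([-13, 13], 100, 1);            % predictor (plays the role of A)
u = 3*v + 5 + randn(100, 1);             % response (plays the role of B)
v(randperm(100, 7)) = NaN;               % some missing predictor values
u(randperm(100, 5)) = NaN;               % some missing response values

% regress needs the explicit column of ones to estimate an intercept
[b, ~, ~, ~, stats] = regress(u, [ones(size(v)) v]);

% fitlm adds the intercept itself, so only the predictor column is passed
mdl = fitlm(v, u);

% Columns: regress estimates, fitlm estimates (rows: intercept, slope)
disp([b, mdl.Coefficients.Estimate])
% R^2 from regress and from fitlm
disp([stats(1), mdl.Rsquared.Ordinary])
```

Both functions treat rows containing NaN as missing, so they fit the same subset of observations, and the two columns of estimates should agree.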
% You may also consider filling in the NaNs (here with a moving median) before fitting
A = randi([-13, 13], 20, 5);
B = randi([-130, 130], 20, 5);
IDXA_1 = randi(10, 7,1);
IDXA_2 = randi(5, 7,1);
for ii=1:numel(IDXA_1)
A(IDXA_1(ii), IDXA_2(ii)) = NaN; % A contains some NaNs
end
AF = fillmissing(A,'movmedian',10); % NaNs in A are substituted with moving median of 10 points
IDXB_1 = randi(10, 5,1);
IDXB_2 = randi(5, 5,1);
for ii=1:numel(IDXB_1)
B(IDXB_1(ii), IDXB_2(ii)) = NaN; % B contains some NaNs
end
BF = fillmissing(B,'movmedian',10); % NaNs in B are substituted with moving median of 10 points
v = reshape(AF,[],1);
onez = ones(length(v),1);
v_regress = horzcat(onez,v);
u = reshape(BF,[],1);
[w,x,y,~,z] = regress(u,v_regress);
R2=z(1);
disp(R2)
0.0029
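MDL = fitlm(v_regress,u) % fitlm adds its own intercept, so the ones column in v_regress is redundant and triggers the warning below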
Warning: Regression design matrix is rank deficient to within machine precision.
MDL =
Linear regression model:
    y ~ 1 + x1 + x2

Estimated Coefficients:
                   Estimate      SE        tStat      pValue
                   ________    ______    _______    _______
    (Intercept)          0          0        NaN        NaN
    x1               4.408     7.7121    0.57157    0.56893
    x2             0.53641     1.0037    0.53445    0.59425

Number of observations: 100, Error degrees of freedom: 98
Root Mean Squared Error: 77.1
R-squared: 0.00291, Adjusted R-Squared: -0.00727
F-statistic vs. constant model: 0.286, p-value = 0.594
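For the B = mA + c form asked about in the question, one way to check whether two fitting routes differ only in how they handle missing values is to keep just the pairs where both A and B are present and fit those directly. The sketch below uses base MATLAB only (polyfit and the backslash operator). The 3-D arrays are synthetic and hypothetical, and treating a degree-1 polyfit as comparable to the Basic Fitting linear fit is an assumption that this thread does not confirm.

```matlab
% Hypothetical 3-D data standing in for the measured (A) and modelled (B) fields
rng(1);
A = randi([-13, 13], 20, 5, 3);
B = 2*A + 10 + 5*randn(size(A));
A(randperm(numel(A), 40)) = NaN;         % A has many missing values
B(randperm(numel(B), 10)) = NaN;         % B has fewer missing values

v = A(:);                                % column versions, as in the question
u = B(:);
ok = ~isnan(v) & ~isnan(u);              % keep pairs where both are present

% Degree-1 polynomial fit: p(1) is the slope m, p(2) is the offset c
p = polyfit(v(ok), u(ok), 1);

% Same least-squares problem written as a design matrix: b(1) = c, b(2) = m
b = [ones(nnz(ok), 1), v(ok)] \ u(ok);

% R^2 computed by hand from the residuals of the fitted line
res = u(ok) - polyval(p, v(ok));
R2 = 1 - sum(res.^2) / sum((u(ok) - mean(u(ok))).^2);

fprintf('m = %.4f, c = %.4f, R^2 = %.4f, pairs used = %d\n', p(1), p(2), R2, nnz(ok));
```

If the two routes agree on the NaN-free pairs but not on the raw columns, the discrepancy most likely lies in the missing-value handling rather than in the regression itself.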

Release: R2020a
