# Convergence can't be reached in lsqcurvefit

조회 수: 6(최근 30일)
Shengyue Shan 2022년 11월 24일
답변: John D'Errico 2022년 12월 22일
Hello,
I am trying to fit a generalized Maxwell model to my data. The model looks like below, Where and ω are vectors of the same length, the fitting coefficients are , , , , , and . The goal is to identify the coefficients that minimize the difference between experimental and model predicted .
I tried to use lsqcurvefit, but the residual norm ended up being ~4e7. I am not sure where the problem could be. Below is my code.
I have also tried to fix the values of , , to 10, 1, 0.1 so there would be fewer unknowns. But that did not lead to convergence either. I have also contructured the problem to be solved with fminsearch to search for the minimal squared norm of difference between experimental and model predicted , and that method gave a similar result. I wonder what went wrong and any suggestions would be appreciated!
Thank you!
G_prime = ...
[9517.91000000000;
10133;
10670.2000000000;
11195.4000000000;
11686.6000000000;
12182.3000000000;
12698.1000000000;
13215.4000000000;
13727.3000000000;
14260.2000000000;
14827.4000000000;
15416.6000000000];
omega = ...
[0.314159000000000;
0.395503000000000;
0.497909000000000;
0.626830000000000;
0.789137000000000;
0.993460000000000;
1.25070000000000;
1.57452000000000;
1.98221000000000;
2.49548000000000;
3.14159000000000;
3.95499000000000];
x0 = ones(7,1); % In the order of g1, g2, g3, lambda1, lambda2, lambda3, ge
[x,resnorm,residual,exitflag,output] = lsqcurvefit(@maxwell, x0, omega, G_prime)
Local minimum possible. lsqcurvefit stopped because the final change in the sum of squares relative to its initial value is less than the value of the function tolerance.
x = 7×1
1.0e+04 * -0.3368 0.4538 0.4639 -1.5831 -0.0002 0.0000 1.1138
resnorm = 9.5722e+03
residual = 12×1
26.5939 -43.2090 -11.7220 14.0310 35.2619 19.8081 -21.8194 -40.8288 -12.7832 28.0335
exitflag = 3
output = struct with fields:
firstorderopt: 0.8764 iterations: 84 funcCount: 680 cgiterations: 0 algorithm: 'trust-region-reflective' stepsize: 0.2990 message: 'Local minimum possible.↵↵lsqcurvefit stopped because the final change in the sum of squares relative to ↵its initial value is less than the value of the function tolerance.↵↵<stopping criteria details>↵↵Optimization stopped because the relative sum of squares (r) is changing↵by less than options.FunctionTolerance = 1.000000e-06.'
sprintf("Squared norm of residual = %7.2f",resnorm)
ans = "Squared norm of residual = 9572.20"
hold on
plot(omega,G_prime,'o')
plot(omega,maxwell(x, omega))
hold off %
function F = maxwell(x, xdata)
%xdata = omega
F = xdata; %predicted G'
chunk_length = (length(x) - 1)/2;
chunk = zeros(1, chunk_length);
for j = 1:length(F)
for i = 1:length(chunk)
chunk(i) = (x(i)*x(i + chunk_length)^2*xdata(j)^2) ...
/(1 + x(i + chunk_length)^2*xdata(j)^2);
end
F(j) = sum(chunk) + x(end);
end
end
##### 댓글 수: 11표시숨기기 이전 댓글 수: 10
Walter Roberson 2022년 11월 26일
"Relative magnitude of search direction is smaller than the step tolerance" or "Change in the residual is less than the specified tolerance"
Those are not errors.
There are comparatively few situations in which lsqcurvefit can say for certain that it has definitely found a solution.
The lsq part of lsqcurvefit means "least squared" -- so the code is trying to minimize
sum( (f(parameters,xdata)-ydata).^2 )
if it encounters a set of parameters such that the sum is exactly zero, then it can give exitflag 1, "converged to a solution".
Suppose that instead it encounters a set of parameters such that sum( (f(parameters,xdata)-ydata).^2 ) is 1e-300 and that for each parameter in the list, varying the parameter to the next higher and next lower value always gives a value that is at least as large as 1e-300, possibly higher. Has it found a "solution" or has it not found a solution?
Maybe it has found a solution and the 1e-300 is just the inevitable result of the fact that operations on floating point values can never be exact. For example there is no representable IEEE754 Double Precision floating point value x near π such that in double precision is exactly 0. We know mathematically that is exactly 0, but π is an irrational value and if you truncate it to any finite number of digits then it cannot mathematically equal 0 and in practice sin(closest_53_bit_representation_to_pi) turns out to not be 0. If lsqcurvefit() encounters a situation like that where the residue could plausibly be attributed to floating point round-off then should lsqcurvefit() confidently give out the exitflag saying that Yes, it has definitely found a solution ?
But what if the function to be fitted turned out to be (x-pi)^2+1e-150 ? Then mathematically the only way to get an exact zero out of that would be to make x complex-valued. So maybe that residue of 1e-300 is a "hard" residue that cannot be eliminated. Should it excuse the 1e-300 residue in that case and confidentally give out the exit flag saying that Yes, it has definitely found a solution?
Can the function distinguish between the two cases? NO, not in the case of a general model. It would have the apply the symbolic toolbox to the model and do mathematical analysis of the expression in order to prove that it was in one situation (lack of exact convergence is solely because of floating point limitations) or the other (lack of exact convergence is because of something inherent in the model considering real valued coefficients.)
The two statuses you mention are two different conditions under which lsqcurvefit() might decide that it is no longer profitable to try to get a better answer. When you get either of them, the output might be the closest possible representation to the exact solution, or it might only be the closest you can expect to get to within the tolerance boundaries that you set up; or it might possibly be the case that mathematically there is no exact solution in reals.
The situation where lsqcurvefit detects that it is stuck, not able to get a better answer, but that the location it is at currently doesn't seem to be very good: that generates a negative error code. Sometimes it means that you need to try again with different positions; sometimes though it is just a badly non-linear model and the place you got to is really the best possible representation within floating point limitations.

댓글을 달려면 로그인하십시오.

### 답변(1개)

John D'Errico 2022년 12월 22일
Not every model you pose will always fit every set of data. But first, we should look at your model, as well as your data.
G_prime = [9517.91000000000;
10133;
10670.2000000000;
11195.4000000000;
11686.6000000000;
12182.3000000000;
12698.1000000000;
13215.4000000000;
13727.3000000000;
14260.2000000000;
14827.4000000000;
15416.6000000000];
omega = [0.314159000000000;
0.395503000000000;
0.497909000000000;
0.626830000000000;
0.789137000000000;
0.993460000000000;
1.25070000000000;
1.57452000000000;
1.98221000000000;
2.49548000000000;
3.14159000000000;
3.95499000000000];
plot(omega,G_prime,'o-') First, does this model have the correct curve shape?
fplot(@(w) w.^2./(1+w.^2),[0,5]) So, yes, it does. Sort of.
But interestingly, I'd worry that the sum of multiple terms of that form won't really help too much. The parameters g and lambda are just scale prameters. They don't actually allow the curve to take on a different shape.
How well does one term fit?
mdl = fittype('g0 + g1*lambda^2*omega.^2./(1 + lambda^2*omega.^2)','indep','omega')
mdl =
General model: mdl(g0,g1,lambda,omega) = g0 + g1*lambda^2*omega.^2./(1 + lambda^2*omega.^2)
fittedmdl = fit(omega,G_prime,mdl)
Warning: Start point not provided, choosing random start point.
fittedmdl =
General model: fittedmdl(omega) = g0 + g1*lambda^2*omega.^2./(1 + lambda^2*omega.^2) Coefficients (with 95% confidence bounds): g0 = 9512 (8952, 1.007e+04) g1 = 5974 (5374, 6573) lambda = 0.8794 (0.6718, 1.087)
plot(fittedmdl,omega,G_prime) So, one term did reasonably well under the circumstances. We can try to see if a second term might help.
mdl2 = fittype('g0 + g1*lambda1^2*omega.^2./(1 + lambda1^2*omega.^2) + g2*lambda2^2*omega.^2./(1 + lambda2^2*omega.^2)','indep','omega')
mdl2 =
General model: mdl2(g0,g1,g2,lambda1,lambda2,omega) = g0 + g1*lambda1^2*omega.^2./(1 + lambda1^2*omega.^2) + g2*lambda2^2*omega.^2./(1 + lambda2^2*omega.^2)
ee that I've chosen start points for the parameters based on the first fit.
fittedmdl2 = fit(omega,G_prime,mdl2,'start',[9500 6000 0.88 50 .1])
fittedmdl2 =
General model: fittedmdl2(omega) = g0 + g1*lambda1^2*omega.^2./(1 + lambda1^2*omega.^2) + g2*lambda2^2*omega.^2./(1 + lambda2^2*omega.^2) Coefficients (with 95% confidence bounds): g0 = 9091 (7821, 1.036e+04) g1 = 5749 (3906, 7591) g2 = -0.4032 (-6.745e+06, 6.745e+06) lambda1 = -1.132 (-1.821, -0.4441) lambda2 = 0.00667 (-5.053e+04, 5.053e+04)
Note the seriously wide confidence intervals. Wose, I'm not sure I'd call this a much better ft. This second model is complete crap.
plot(fittedmdl2,omega,G_prime) numel(omega)
ans = 12
You have 12 data points only. And a model where a third term will be useless. Trying to fit a 7 parameter model of this form is a waste of time. Certainly so on this data.
Sorry. It is.

댓글을 달려면 로그인하십시오.

### 범주

Find more on Get Started with Curve Fitting Toolbox in Help Center and File Exchange

R2022a

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!