Fitting nonlinear noisy data
이전 댓글 표시
I am fitting a function to some simulated data. The procedure works perfectly, but I would like to know if it can be made more robust to noise. When I use this amount of noise:
y = awgn(CPSC,35,'measured');
It still works very well. But if the amount of noise gets increased to:
y = awgn(CPSC,25,'measured');
In 15% of cases the fit is completely wrong.
This is the function that I use to generate the data:
function [EPSC, IPSC, CPSC, t] = generate_current(G_max_chl, G_max_glu, EGlu, EChl, Vm, tau_rise_In, tau_decay_In, tau_rise_Ex, tau_decay_Ex,tmax)
dt = 0.1; % time step duration (ms)
t = 0:dt:tmax-dt;
% Compute compound current
IPSC = ((G_max_chl) .* ((1 - exp(-t / tau_rise_In)) .* exp(-t / tau_decay_In)) * (Vm - EChl));
EPSC = ((G_max_glu) .* ((1 - exp(-t / tau_rise_Ex)) .* exp(-t / tau_decay_Ex)) * (Vm - EGlu));
CPSC = IPSC + EPSC;
end
And this is the fitting procedure:
% Values generated by simulation
[~,~,CPSC,t] = generate_current(60,40,0,-70,-30,0.44,15,0.73,3,120);
% Initial values
gmc = 90;
gmg = 90;
tde = 1;
tdi = 1;
tre = 1;
tri = 1;
% Apply white noise to the CPSC
y = awgn(CPSC,35,'measured');
% Alternatively, without noise
% y = CPSC;
%% Perform fit
[xData, yData] = prepareCurveData(t, y);
% Set up fittype and options.
ft = fittype( '((G_max_chl) .* ((1 - exp(-t / tau_rise_In)) .* exp(-t / tau_decay_In)) * (Vm - EChl)) + ((G_max_glu) .* ((1 - exp(-t / tau_rise_Ex)) .* exp(-t / tau_decay_Ex)) * (Vm - EGlu))', 'independent', 't', 'dependent', 'y' );
opts = fitoptions( 'Method', 'NonlinearLeastSquares' );
opts.Display = 'Off';
opts.Lower = [-70 0 1 1 -30 0 0 0 0];
opts.StartPoint = [-70 0 gmc gmg -30 tde tdi tre tri]; % Starting values
opts.Upper = [-70 0 150 150 -30 20 20 5 5];
[fitresult1, gof1] = fit(xData, yData, ft, opts)
%% Plot fit with data
figure( 'Name', 'Fit' );
h = plot( fitresult1, xData, yData );
legend( h, 'CPSC at -30mV', 'Fit to CPSC', 'Location', 'NorthEast', 'Interpreter', 'none');
subtitle('Realistic values')
% Label axes
xlabel( 'time', 'Interpreter', 'none' );
ylabel( 'pA', 'Interpreter', 'none' );
grid on
How can I make it more robust to noise?
Thanks!
채택된 답변
추가 답변 (2개)
Alan Weiss
2021년 3월 18일
1 개 추천
Most likely the issue is that there are multiple local minima, as in this example: Nonlinear Data-Fitting Using Several Problem-Based Approaches, especially the section Split Problem is More Robust to Initial Guess.
In that example a "split problem" approach worked. In general, you might want to use multiple start points, as in MultiStart Using lsqcurvefit or lsqnonlin.
Good luck,
Alan Weiss
MATLAB mathematical toolbox documentation
Bjorn Gustavsson
2021년 3월 18일
When I do this I typically write my own error-function or residual-function and then use fminsearch (or some of the general optimization-contributions on the file-exchange, like fminsearchbnd or optimize). These functions do the same job as fit as far as I understand, but with a more direct hands-on interface for the users. That way I can write my error-functions with the properly weighted contributions when I have measurements with varying standard deviation (I typically get away with the assumption that my noise is generaly normal-distributed, but if it isn't it's also possible to generalize to optimizing the liklelihood/log-likelihood for other noise-distributions). Therefore I'd solve your problem by turning away from fit and use lsqnonline (or fminsearch) instead:
function res = your_fit_residuals(pars,t,y_obs,y_std)
% Your residual-function for use with lsqnonlin
% that also takes the estimated standard deviations of your observations
% giving you a weighted least-square solution for your fit
if nargin == 3
y_std = 1; % if no std given defaults to a constant and get a standard lsq-fit
end
G_max_chl = pars(1);
tau_rise_In = pars(2);
tau_decay_In = pars(3);
Vm = pars(4);
EChl = pars(5);
G_max_glu = pars(6);
tau_rise_Ex = pars(7);
tau_decay_Ex = pars(8);
EGlu = pars(9);
y_mod = ((G_max_chl) .* ((1 - exp(-t / tau_rise_In)) .* exp(-t / tau_decay_In)) * (Vm - EChl)) + ...
((G_max_glu) .* ((1 - exp(-t / tau_rise_Ex)) .* exp(-t / tau_decay_Ex)) * (Vm - EGlu))');
res = (y_mod-y_obs)./y_std;
end
Then you call lsqnonlin like this:
Upper = [-70 0 150 150 -30 20 20 5 5];
StartPoint = [-70 0 gmc gmg -30 tde tdi tre tri]; % Starting values
Lower = [-70 0 1 1 -30 0 0 0 0];
% You have fixed the first second and fifth parameter here, I'm also unsure
% about the ordering of parameters the FIT-function uses so that you have
% to check...
sigma_y = % You have to get estimates of the uncertainty of your measurements from somewhere
[pars_lsqnonlin = lsqnonlin(@(pars) your_fit_residuals(pars,xData, yData,sigma_y),...
StartPoint, Lower,Upper);
yModel = your_fit_residuals(pars_lsqnonlin,xData, 0*yData);
With information about the variation of measurement uncertainty the fit will weight the contributions from the different data-points accordingly.
HTH
댓글 수: 7
Samuele Bolotta
2021년 3월 18일
편집: Samuele Bolotta
2021년 3월 18일
Bjorn Gustavsson
2021년 3월 18일
편집: Bjorn Gustavsson
2021년 3월 18일
OK, good - and something I don't fully understand.
Your G_max_Chl and be both 20 or 90? You you have data-sets from experiments where you used G_max_Chl of 20 or 90 ("as input"?) and then randomized the observed data such that you don't know which batch a data-set originates from? Can G_max_Chl be anything between 20 and 90? If so to my understanding you have it set to a fixed -70 in the code? Then You should have the first elements of Upper set to something like +91 (+95 or whatever margin you want to have) and Lower to +19 (or +15...). Or alternatively you might do 2 parameter-fittings for each data-set and use fixed +20 and +90 for G_max_Chl in the runs and determine which batch the data belonged to from the goodness of fit.
Samuele Bolotta
2021년 3월 18일
Bjorn Gustavsson
2021년 3월 18일
OK, I start to understand your plan. For the case where I want to keep some parameters fixed and only let a subset of them vary I use a pattern like this:
function res = your_fitsomepars_residuals(pars,idx4pars,allpars,t,y_obs,y_std)
% Your residual-function for use with lsqnonlin
% that also takes the estimated standard deviations of your observations
% giving you a weighted least-square solution for your fit
if nargin < 6 || isempty(y_std)
y_std = 1; % if no std given defaults to a constant and get a standard lsq-fit
end
% Here the varying parameters are inserted into the allpars array.
allpars(idx4pars) = pars;
G_max_chl = allpars(1);
tau_rise_In = allpars(2);
tau_decay_In = allpars(3);
Vm = allpars(4);
EChl = allpars(5);
G_max_glu = allpars(6);
tau_rise_Ex = allpars(7);
tau_decay_Ex = allpars(8);
EGlu = allpars(9);
y_mod = ((G_max_chl) .* ((1 - exp(-t / tau_rise_In)) .* exp(-t / tau_decay_In)) * (Vm - EChl)) + ...
((G_max_glu) .* ((1 - exp(-t / tau_rise_Ex)) .* exp(-t / tau_decay_Ex)) * (Vm - EGlu))');
res = (y_mod-y_obs)./y_std;
end
Then your call would be modified to something like this:
Upper = [-70 0 150 150 -30 20 20 5 5];
StartPoint = [-70 0 gmc gmg -30 tde tdi tre tri]; % Starting values
FixedPars = StartPoint;
Lower = [-70 0 1 1 -30 0 0 0 0];
idx4varpars = [2 3 4 6 7 8]; % The ordering of the parameters in this sketch is
% clearly messed up relative to yours...
% You have fixed the first second and fifth parameter here, I'm also unsure
% about the ordering of parameters the FIT-function uses so that you have
% to check...
sigma_y = % You have to get estimates of the uncertainty of your measurements from somewhere
[pars_lsqnonlin = lsqnonlin(@(pars) your_fitsomepars_residuals(pars,idx4varpars,FixedPars,xData, yData,sigma_y),...
StartPoint(idx4varpars), Lower(idx4varpars),Upper(idx4varpars));
yModel = your_fitsomepars_residuals(pars_lsqnonlin,idx4varpars,FixedPars,xData, 0*yData);
That way you completely remove the variables that's not in idx4varpars from the optimization.
Samuele Bolotta
2021년 3월 21일
Bjorn Gustavsson
2021년 3월 21일
편집: Bjorn Gustavsson
2021년 3월 22일
Good, that looks OK at a first glance.
A: Your question about hte standard-deviation estimation might be seen as "naive" at a first glance but is anything but naive. It is typically the most tricky one to actually get right in a typical observation scenario. You should try to get estimates of the standard deviation for each individual point of your measurements. That is sometimes relatively easy, sometimes much harder or very difficult indeed. In your case I tried:
y_std = movstd(y_obs-filtfilt(ones(1,11),1,y_obs),23);
Which gives you something. What you will/should do when you get real data you'll have to learn by looking at your data, and discussing with colleagues about your measurement characteristics. Here I simply made some low-pass filtering of your data - which corresponds to the assumption that you should have a trend that's smoothly varying with time.
B: you have to make res the first output variable from the fit_residuals function - that is the variable lsqnonlin tries to fit to. When I did that and simply overplotted y_mod the fit was good.
y_mod is not calculated with the starting-values - the first thing that happens is to replace the varying parameters in the allpars variable with the values in the pars variable and those are the variable that lsqnonlin modifies.
Samuele Bolotta
2021년 3월 22일
카테고리
도움말 센터 및 File Exchange에서 Get Started with Curve Fitting Toolbox에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!
