Curve fitting and parameter estimation with lsqcurvefit

조회 수: 24 (최근 30일)
Mr. P
Mr. P 2024년 5월 10일
댓글: Star Strider 2024년 5월 11일
Dear Matlab family...
I have been following previous chats about this topic but when I try to relate with my model I get the error below. Note that, to improve the accuacy of the model, not everything needs to be estimated from the dataset, some aspects like natural mortality can be obtained from literature and government records, so in this case, I want to pass mu and Lambda as I already obtained them from literature, and only estimate the other parameters. I have attached my matlab script and my data for your convinience. Thank you very much for your assistance. As this is experimental, feel free to vary the initial conditions and boundaries for the parameters.
Error using lsqcurvefit>initEvalErrorHandler
FUN must have two input arguments.
Error in lsqcurvefit (line 258)
userFcn_ME = initEvalErrorHandler(userFcn_ME,funfcn_x_xdata{3}, ...
Error in fit_run (line 24)
estimated_params = lsqcurvefit(objective, initial_guess, lb, ub);

답변 (2개)

Torsten
Torsten 2024년 5월 10일
편집: Torsten 2024년 5월 10일
% Load data
data = readtable('infections.csv');
dates = datetime(data.Date, 'InputFormat', 'dd-MMM-yy');
cases = data.Cases;
% Define time vector
t = 1:numel(cases);
% Initial guess for parameters
initial_guess = [0.5, 0.1, 0.1];
% Define lower and upper bounds
lb = [0, 0, 0];
ub = [1, 1, 1];
% Call lsqcurvefit
estimated_params = lsqcurvefit(@objective, initial_guess, t, cases, lb, ub);
disp(estimated_params);
function I = objective(params,t)
S0 = ...
E0 = ...
I0 = ...
R0 = ...
[~,Y] = ode45(@(t,y)fit(t,y,params),t,[S0;E0;I0;R0]);
I = Y(:,3);
end
function dydt = fit(t, y, params)
S = y(1);
E = y(2);
I = y(3);
R = y(4);
beta = params(1);
sigma = params(2);
gamma = params(3);
Lambda=1000;
mu=0.000065;
dS = Lambda-beta*S*I-mu*S;
dE = beta*S*I - sigma*E-mu*E;
dI = sigma*E - gamma*I-mu*I;
dR = gamma*I-mu*R;
dydt = [dS; dE; dI; dR];
end

Star Strider
Star Strider 2024년 5월 10일
If you look at your data, I do not believe that there is any way to use your ‘SEIR’ model to fit them —
T1 = readtable('infections.csv')
T1 = 1017x2 table
Date Cases ___________ _____ 01-Jan-0020 1 02-Jan-0020 2 03-Jan-0020 0 04-Jan-0020 0 05-Jan-0020 0 06-Jan-0020 0 07-Jan-0020 0 08-Jan-0020 2 09-Jan-0020 2 10-Jan-0020 0 11-Jan-0020 0 12-Jan-0020 1 13-Jan-0020 0 14-Jan-0020 1 15-Jan-0020 0 16-Jan-0020 0
figure
plot(T1{:,1}, T1{:,2})
grid
xlabel('Date')
ylabel('Infections')
That aside, I wrote code using both lsqcurvefit (thatt fails because I serously doube any parameter set will work with your code) andusing the genetic algorithm (ga) that you can use if you have the Global Optimization Toolbox. Thd data span three yeaars, from 20 to 22, so the date data on the x-axis are not accurate, although the plot itself is. I created a linear daily date vector ‘t’ in my code that you can use if you want to.
My code for your problem —
clear all
close all
T1 = readtable('infections.csv')
DOY = day(T1.Date,'dayofyear');
didx = [diff(DOY)<0];
de = DOY(didx~=0);
di = cumsum([0; didx]);
di(di==1) = DOY(di == 1) + de(1);
di(di==2) = DOY(di == 2) + de(1) + de(2);
% size(di)
% size(T1)
figure
plot(di, T1{:,2})
grid
% return
t = di;
c = T1{:,2};
% theta0 = rand(7,1).*[ones(3,1)*1E+5; rand(4,1)];
% tic
% % [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c,zeros(size(theta0)));
% toc
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 7;
optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms).*[ones(1,3) ones(1,4)*1E-3], 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems
% opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],optsAns);
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf('Fitness value at convergence = %.4f\nGenerations \t\t\t\t = %d\n\n',fval,output.generations)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:numel(theta)
fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, compose('C_%d',1:size(c,2)), 'Location','best')
function C=kinetics(params,t)
c0=params(end-3:end);
[T,Cv]=ode15s(@DifEq,t,c0);
function dydt = DifEq(t, y)
S = y(1);
E = y(2);
I = y(3);
R = y(4);
beta = params(1);
sigma = params(2);
gamma = params(3);
Lambda=1000;
mu=0.000065;
dS = Lambda-beta*S*I-mu*S;
dE = beta*S*I - sigma*E-mu*E;
dI = sigma*E - gamma*I-mu*I;
dR = gamma*I-mu*R;
dydt = [dS; dE; dI; dR];
end
C=Cv(:,3);
end
I assumed that the ‘Cases’ were likely described by ‘dI’ so I chose the third column of the output of the integration to fit them. I also chaged the integrator to ode15s because your data are likely ‘stiff’ as are systems of differential equations describing most kinetic data.
My code runs, however it doesn’t converge because your model likely doesn’t describe your data.
.
  댓글 수: 2
Mr. P
Mr. P 2024년 5월 10일
@Star Strider Thank you for your response. I have adjusted my data so that it doesn't have the seasonality pattern, here is my adjusted data (infections1). However, when I run the code you provided above with this data, I get the error:
Index exceeds the number of array elements. Index must not exceed 0.
Error in fit_run (line 8)
di(di==1) = DOY(di == 1) + de(1);
Star Strider
Star Strider 2024년 5월 11일
I seriously doubt if my code (and your sifferential equation system) are going to work, regardless. Ths eimple reason is that they are not set up to model the sort of data you are giving them.
If you already have an identified model, you could probably (no promises) estimate the ‘S’, ‘E’, and ‘R’ data from the available ‘I’ data, however fitting those equations to your data is likely not possible, simply because your differential equations describe an entirely different situation than are present in your data. See for example The SEIRS model for infectious disease dynamics Nat Methods 17, 557–558 (2020). The newer data do not change that.
clear all
close all
T1 = readtable('infections1.csv')
T1 = 145x2 table
Date Cases ___________ _____ 09-Mar-0020 76 10-Mar-0020 17 11-Mar-0020 0 12-Mar-0020 25 13-Mar-0020 4 14-Mar-0020 25 15-Mar-0020 3 16-Mar-0020 16 17-Mar-0020 15 18-Mar-0020 28 19-Mar-0020 14 20-Mar-0020 3 21-Mar-0020 5 22-Mar-0020 27 23-Mar-0020 6 24-Mar-0020 12
DOY = day(T1.Date,'dayofyear');
% didx = [diff(DOY)<0];
% de = DOY(didx~=0)
% di = cumsum([0; didx])
% di(di==1) = DOY(di == 1) + de(1);
% di(di==2) = DOY(di == 2) + de(1) + de(2);
% size(di)
% size(T1)
di = DOY - DOY(1);
figure
plot(di, T1{:,2})
grid
xlabel('Date')
ylabel('Infections')
return
t = di;
c = T1{:,2};
% theta0 = rand(7,1).*[ones(3,1)*1E+5; rand(4,1)];
% tic
% % [theta,Rsdnrm,Rsd,ExFlg,OptmInfo,Lmda,Jmat]=lsqcurvefit(@kinetics,theta0,t,c,zeros(size(theta0)));
% toc
ftns = @(theta) norm(c-kinetics(theta,t));
PopSz = 500;
Parms = 7;
optsAns = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms).*[ones(1,3) ones(1,4)*1E-3], 'MaxGenerations',5E3, 'FunctionTolerance',1E-10); % Options Structure For 'Answers' Problems
% opts = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'PlotFcn',@gaplotbestf, 'PlotInterval',1);
% optshf = optimoptions('ga', 'PopulationSize',PopSz, 'InitialPopulationMatrix',randi(1E+4,PopSz,Parms)*1E-3, 'MaxGenerations',5E3, 'FunctionTolerance',1E-10, 'HybridFcn',@fmincon, 'PlotFcn',@gaplotbestf, 'PlotInterval',1); % With 'HybridFcn'
t0 = clock;
fprintf('\nStart Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t0)
[theta,fval,exitflag,output,population,scores] = ga(ftns, Parms, [],[],[],[],zeros(Parms,1),Inf(Parms,1),[],[],optsAns);
t1 = clock;
fprintf('\nStop Time: %4d-%02d-%02d %02d:%02d:%07.4f\n', t1)
GA_Time = etime(t1,t0)
DT_GA_Time = datetime([0 0 0 0 0 GA_Time], 'Format','HH:mm:ss.SSS');
fprintf('\nElapsed Time: %23.15E\t\t%s\n\n', GA_Time, DT_GA_Time)
fprintf('Fitness value at convergence = %.4f\nGenerations \t\t\t\t = %d\n\n',fval,output.generations)
fprintf(1,'\tRate Constants:\n')
for k1 = 1:numel(theta)
fprintf(1, '\t\tTheta(%2d) = %8.5f\n', k1, theta(k1))
end
tv = linspace(min(t), max(t));
Cfit = kinetics(theta, tv);
figure
hd = plot(t, c, 'p');
for k1 = 1:size(c,2)
CV(k1,:) = hd(k1).Color;
hd(k1).MarkerFaceColor = CV(k1,:);
end
hold on
hlp = plot(tv, Cfit);
for k1 = 1:size(c,2)
hlp(k1).Color = CV(k1,:);
end
hold off
grid
xlabel('Time')
ylabel('Concentration')
legend(hlp, compose('C_%d',1:size(c,2)), 'Location','best')
function C=kinetics(params,t)
c0=params(end-3:end);
[T,Cv]=ode15s(@DifEq,t,c0);
function dydt = DifEq(t, y)
S = y(1);
E = y(2);
I = y(3);
R = y(4);
beta = params(1);
sigma = params(2);
gamma = params(3);
Lambda=1000;
mu=0.000065;
dS = Lambda-beta*S*I-mu*S;
dE = beta*S*I - sigma*E-mu*E;
dI = sigma*E - gamma*I-mu*I;
dR = gamma*I-mu*R;
dydt = [dS; dE; dI; dR];
end
C=Cv(:,3);
end
.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Interpolation에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by