constrained stochastic dynamic programming does not converge

This is my first time using MATLAB to solve a dynamic programming problem numerically. I have been stuck for a while, so I would really appreciate any help. The distance between the value functions P0 and P1 decreases at first, but later it bounces around and does not converge to zero.
I have 2 state variables, 21 control variables, and 2 inequality constraints (which can be written as one equality constraint). I am using fmincon as the solver.
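To make the constraint remark concrete: fmincon takes nonlinear constraints as a function returning [c, ceq], with c(x) <= 0 and ceq(x) = 0. The sketch below assumes the two inequalities are opposing bounds on the same expression, so they collapse into one equality. The expression g and the test point x are hypothetical placeholders, not the constraint from the attached files.

```matlab
% Hypothetical constraint expression, for illustration only.
g = @(x) sum(x) - 1;

% Form 1: two opposing inequalities, g(x) <= 0 and -g(x) <= 0.
nonlcon_ineq = @(x) deal([g(x); -g(x)], []);

% Form 2: the equivalent single equality, g(x) = 0.
nonlcon_eq = @(x) deal([], g(x));

x = [0.25; 0.75];                 % a point that satisfies g(x) = 0
[c1, ceq1] = nonlcon_ineq(x);     % inequality values, no equality
[c2, ceq2] = nonlcon_eq(x);       % no inequality, one equality value
```

Either handle can be passed as the nonlcon argument of fmincon. The equality form avoids carrying a pair of inequalities that are both active at every feasible point.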
My main code is below, and two .m files are attached.
% parameter values
beta = .95;
S = 20;
gamma = 2;
lambda = .95;
% probability distribution vector: Prob
Prob = zeros(S,1);
% productivity realization vector: wbar
wbar = zeros(S,1);
for s = 1:S
    Prob(s) = (1-lambda) / (1 - lambda^S) * lambda^(s-1);
    wbar(s) = s + 5;
end
% walking away continuation value
vspot = 1 / (1 - beta) * Prob' * wbar.^(1 - gamma) / (1 - gamma);
% space of the state variable: vspace
vmin = -30;
vmax = 50;
ngrid = 99;
gridstep = (vmax - vmin) / ngrid; % grid spacing (named gridstep so it does not shadow the built-in grid)
vspace = (vmin:gridstep:vmax)';
%% value function iteration
% initials
P0 = 30 * ones(ngrid + 1, S); % initial value function
P1 = zeros(ngrid + 1, S);
ctrl = NaN(ngrid+1, S, S+1);
iterate = 0;
maxits = 1000;
critical = 0.1;
dif = 1000; % start above the tolerance so the loop runs
% solver inputs that do not change across iterations
options = optimoptions(@fmincon,'Algorithm','sqp','Display','off');
x0 = (2:2:42)'; % initial guess for the S+1 controls
lb = vmin * ones(S+1,1);
ub = vmax * ones(S+1,1);
h = waitbar(0,'Value function iterating... Good luck this time!');
while dif > critical && iterate < maxits
    tic;
    iterate = iterate + 1;
    for s = 1:S
        for i = 1:ngrid+1
            y = @(x) objfun(x,s,gridstep,ngrid,vmin,vspace,S,wbar,beta,Prob,P0);
            z = @(x) confuneq(x,i,s,vspace,wbar,beta,gamma,Prob,vspot);
            [x,fval] = fmincon(y,x0,[],[],[],[],lb,ub,z,options);
            ctrl(i,s,1) = x(1); % assign maximizers to control variables
            ctrl(i,s,2:end) = x(2:end);
            P1(i,s) = - fval; % assign maximized value to value function
        end
    end
    dif = norm(P1 - P0); % for a matrix, norm is the 2-norm (largest singular value), not the sup norm
    P0 = P1; % replace value function with new matrix
    msg = ['Iteration ', num2str(iterate), ' of at most ', num2str(maxits), ': norm(P1-P0) = ', num2str(dif)];
    waitbar(iterate / maxits, h); % update the existing progress bar
    disp(msg);
    toc;
end
close(h); % close the progress bar when the loop ends
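A note on the distance measure used above: for a matrix argument, norm(P1 - P0) returns the 2-norm (the largest singular value), whereas convergence of value function iteration is usually checked in the sup norm. The following is a minimal sketch comparing the two on made-up matrices of the same size as the value function. The 0.01 gap and the matrices are assumptions for illustration, not output from the model.

```matlab
% Made-up value functions with the same shape as P0 and P1 (100 x 20).
P0 = 30 * ones(100, 20);
P1 = P0 + 0.01;                      % every entry differs by only 0.01

d2   = norm(P1 - P0);                % matrix 2-norm (largest singular value)
dsup = max(abs(P1(:) - P0(:)));      % sup norm: largest entrywise gap

critical = 0.1;
fprintf('2-norm   = %.4f, below tolerance: %d\n', d2,   d2   < critical);
fprintf('sup norm = %.4f, below tolerance: %d\n', dsup, dsup < critical);
```

In this example the 2-norm is roughly 0.45 and fails the 0.1 tolerance, while the sup norm is 0.01 and passes, so the two measures can disagree about whether the iteration has converged.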

Answers (0)

Asked: November 18, 2017
