A Simple Loop For Bandit Problem
조회 수: 14 (최근 30일)
이전 댓글 표시
Hello. I wanna write a code for Bandit problem. My problem: We have 5 machines. We wanna play each of these machines 1000 times (we call each 1000 plays a TASK). In each time, the machine gives us a reward (randomly). We wanna check that the mean reward of which one of these TASKs is more than others (maximum). I wrote this code but it doesn't work well. Why?
for j=1:5
h=0;
for i=1:1000
rew=randn(1);
sum=rew+h;
end
mean(j)=sum./1000; [value,index]=max(mean(j))
end end
Thanks in advance
댓글 수: 0
답변 (1개)
Manikanta Aditya
2022년 6월 20일
Hi Zahra,
The calculation of mean is to be done in the inner loop, where as I see it is being done in the outer loop in your code. This is the reason for the errors. Please refer to the code below :
mean = [];
for i = 1 : 5
m = 0;
for j = 1 : 1000
s = 0;
r = randn(1);
s = s + r;
m = s / 1000;
end
mean(i) = m;
end
value = max(mean);
index = find(mean == max(mean));
disp(mean);
disp(value);
disp(index);
댓글 수: 0
참고 항목
카테고리
Help Center 및 File Exchange에서 Parallel and Cloud에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!