Issues with reproducibility in multistart with parallelization
I am running a large number of model fits (1139) using MultiStart with parallelization. I first ran each fit with 50 start points (including my initial guess), then wanted to re-run with 150 start points to compare the reduction in fval. For the 50-point run, I batched the fits manually across a handful of nodes and saved the rng state to a MAT-file. For the 150-point run, I loaded that rng state and batched the fits in the same way, except for a set of about 100 fits whose node was unavailable.
To test the effect of the node (I think I read that the node influences the random number generator, and I was curious how that would affect reproducibility), I ran that set of 100 fits on two different nodes, loading the same state each time. In this case, I got identical outputs and identical function values.
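The save-and-restore step described above could look roughly like the following. This is only a sketch: the seed, the file name `rng_state.mat`, and the use of `tempdir` are placeholders, not details from the original runs.

```matlab
% Capture the generator state before a batch and store it on disk.
rng(42);                                   % arbitrary seed, for illustration only
state = rng;                               % struct with Type, Seed, and State fields
stateFile = fullfile(tempdir, 'rng_state.mat');   % hypothetical file name
save(stateFile, 'state');
firstDraw = rand(1, 5);                    % stands in for the first batch

% Later, possibly on another node: restore the state and draw again.
loaded = load(stateFile);
rng(loaded.state);
secondDraw = rand(1, 5);                   % stands in for the repeated batch

% Restoring the state replays the same sequence of random numbers.
sameDraws = isequal(firstDraw, secondDraw);
```

Restoring the state only guarantees the same sequence of draws on the client. It does not by itself make runs with a different number of start points consume those draws in the same way.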
However, I did not get good reproducibility between the 50-start-point run and the 150-start-point run. For 27% of the 1139 fits, the 150-point result was worse (a higher function value in a minimization problem) than the 50-point result. Of the 1139 fits, 17% had an fval more than 1% higher, and 3% had an fval more than 10% higher. I thought it might be rounding or something similar, but this seems pretty high.
What am I missing? How can I make these fits reproducible?
Answers (1)
Matt J
2 Sep 2025
Edited: Matt J
2 Sep 2025
I don't see why you would expect agreement between a 50-point multistart and a 150-point multistart. Only if both versions succeed in finding the global minimum would the results be guaranteed to agree.
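One way to make the two runs directly comparable is to generate all 150 start points once and reuse the first 50 of them for the smaller run. This is a sketch under assumptions the thread does not state: the objective `fun`, the bounds, the initial guess, and the seed are all placeholders for the real model fit, and `fmincon` is assumed as the local solver.

```matlab
% Hypothetical multimodal objective standing in for the real model fit
fun = @(x) sum(x.^2 - 10*cos(2*pi*x)) + 10*numel(x);
x0  = [3 3];                                   % placeholder initial guess
opts = optimoptions('fmincon', 'Display', 'off');
problem = createOptimProblem('fmincon', 'objective', fun, ...
    'x0', x0, 'lb', [-5 -5], 'ub', [5 5], 'options', opts);

rng(0);                                        % arbitrary seed, fixed for repeatability
rs  = RandomStartPointSet('NumStartPoints', 149);
pts = [x0; list(rs, problem)];                 % 150 points, initial guess first

ms = MultiStart('Display', 'off');             % add 'UseParallel', true if a pool is open

% The 50-point run uses the first 50 rows; the 150-point run uses all rows.
[~, fval50]  = run(ms, problem, CustomStartPointSet(pts(1:50, :)));
[~, fval150] = run(ms, problem, CustomStartPointSet(pts));
```

Because the 150-point set contains every point of the 50-point set, and the local solver returns the same result for the same start point, `fval150` should not exceed `fval50` in this sketch. Whether `run(ms,problem,50)` and `run(ms,problem,150)` started from the same rng state share any start points is not established in this thread, so explicit start point sets avoid relying on that.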
Comments: 4
Matt J
3 Sep 2025
You are quite welcome, but when/if you are convinced this is the correct answer please accept-click it.