## Is it possible for parfor workers to keep data in between iterations?

Asked by jake555 on 1 Aug 2019

Latest activity: comment by jake555 on 19 Sep 2019

Accepted answer by Gaurav Garg

Hi,
I'm new to parallel processing and hoping I can get some suggestions from the larger MATLAB community. I have a set of N column vectors of size (4x1) that can be written as a (4xN) matrix X. At each time step, I need to run update calculations on each of the N (4x1) vectors. The updated values then become part of the input at the next time step. I have already vectorized everything so there are no for loops in calculating the update.
I'm trying to speed up the process more by using parfor. I have seen improvement from 1 to 2 cores but 3 and 4 cores are both comparable to 2 cores. I'd like to see if I can continue improving with additional cores (particularly if I were to run this on a larger cluster). I have read enough elsewhere to understand this may not be possible, but I'd like to try.
I'm currently sending the workers everything at each time step, but it seems like I should be able to keep the updated information on the workers so they can use it at the next time step. To hopefully make this a little clearer, I am doing the following in pseudo-code:
```matlab
X = InitialCondition();
for k=2:numtimesteps
    % turns X into a cell array, where each cell can go to a worker
    XC = SliceFunction(X,numworkers);
    XCnew = cell(1,numworkers);
    parfor i=1:numworkers
        XCnew{i} = UpdateFunction(@CalculateAB,XC{i},otherinputs); % otherinputs is much smaller than X
    end
    % final X, which becomes the input at the next timestep
    X = [XCnew{:}];
end

function XCnew = UpdateFunction(CalculateAB,XC,otherinputs)
    % this function calculates A,B, then solves x=A\B
    % note A,B are each 3D arrays and I need to solve A*x=B for each 2D slice
    [A,B] = CalculateAB(XC,otherinputs);
    % this is a modification of the File Exchange multinv
    % it turns the 3D A,B matrices into sparse 2D matrices and solves using the \ operator
    XCnew = multimldivide(A,B);
end
```
So to summarize, I guess the question is this: can I keep information on the workers so that I don't have to send as much info back and forth? I'm hoping this could reduce the overhead involved with using parfor, so that I can continue to see speed improvements as I increase the number of cores. Or are there other tricks to reduce the overhead? I'm constantly going in and out of the parfor with each iteration of k.
Thanks!


Release: R2017a

## Answers: 1

Answer by Gaurav Garg on 29 Aug 2019 (accepted answer)

Hi,
To avoid sending the data back and forth to the workers at every step, you can create a temporary cell array in which each cell stores the data for one worker, so that each worker reads and updates its own cell. Keep in mind that parfor workers are separate MATLAB processes, so a worker cannot directly read or write a cell held by another worker.
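One way to get per-worker storage that survives across time steps is `spmd` rather than `parfor`. This is a swapped-in technique that the answer does not name, so treat the following as a minimal sketch under assumptions: `X0` stands in for `InitialCondition()`, the halving step stands in for `UpdateFunction`, and each column is assumed to update independently of the others. `numlabs` and `labindex` are the names available in R2017a.

```matlab
% Sketch only: keep each worker's slice resident for the whole time loop.
pool = gcp();                       % start or reuse the current parallel pool
numtimesteps = 5;
X0 = rand(4, 1000);                 % stand-in for InitialCondition()

spmd
    % Each worker picks its own contiguous block of columns, once.
    edges = round(linspace(0, size(X0, 2), numlabs + 1));
    Xloc  = X0(:, edges(labindex)+1 : edges(labindex+1));
    for k = 2:numtimesteps
        Xloc = 0.5 * Xloc;          % placeholder for UpdateFunction
    end
end

% On the client, Xloc is a Composite with one entry per worker.
% Fetch the pieces only after the last time step.
pieces = cell(1, pool.NumWorkers);
for w = 1:pool.NumWorkers
    pieces{w} = Xloc{w};
end
X = [pieces{:}];                    % same column order as X0
```

With this layout the state crosses the client/worker boundary twice in total (once out, once back) instead of twice per time step. It only fits if the update for one block of columns never needs columns held by another worker.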
This can reduce the overhead involved, although it is not guaranteed to. Speed improvements depend on many factors: the number of cores is the first, but context switching can also work against you.
When a worker is switched out, its state and data must be saved to memory so that it can resume execution from the point where it left off, and that state must be loaded again when the worker is switched back in. This is known as context switching. When conditions are not optimal, the time spent saving and loading this state can become excessive. I suspect this might be what you are encountering.
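To see how much of each step is data transfer rather than computation, one option is to time a single `parfor` call and count the bytes it moves with `ticBytes` and `tocBytes` (available from R2016b). The sketch below is illustrative only: the random `X`, the column slicing, and the doubling step are assumed stand-ins for the real state, `SliceFunction`, and `UpdateFunction`.

```matlab
% Sketch only: measure time and bytes moved by one parfor call.
pool = gcp();                       % start or reuse the current parallel pool
numworkers = pool.NumWorkers;
X = rand(4, 1e5);                   % stand-in for the real state matrix

% Slice X into one block of columns per worker (stand-in for SliceFunction).
edges = round(linspace(0, size(X, 2), numworkers + 1));
XC = cell(1, numworkers);
for i = 1:numworkers
    XC{i} = X(:, edges(i)+1 : edges(i+1));
end
XCnew = cell(1, numworkers);

ticBytes(pool);                     % start counting bytes on this pool
t = tic;
parfor i = 1:numworkers
    XCnew{i} = 2 * XC{i};           % placeholder for UpdateFunction
end
elapsed = toc(t);
bytes = tocBytes(pool);             % one row per worker: [sent to, received from]

Xnew = [XCnew{:}];
fprintf('%.3f s, %d bytes to workers, %d bytes back\n', ...
        elapsed, sum(bytes(:, 1)), sum(bytes(:, 2)));
```

If the byte counts are small and the elapsed time is still dominated by the loop itself, the fixed cost of entering and leaving `parfor` at every time step is the more likely bottleneck.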
You can also use persistent variables if they suit your case.
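As a rough illustration of the persistent-variable idea, the sketch below caches a table inside a function so that each worker builds it once and reuses it in later `parfor` loops. The names `demoPersistentCache` and `applyCachedGain` are hypothetical, and `magic(4)` is only a stand-in for data that is costly to build or send. Because `parfor` does not guarantee which worker runs which iteration, this pattern suits read-only data rather than per-slice state that changes every time step.

```matlab
function [out1, out2] = demoPersistentCache()
    % Hypothetical demo: run two parfor loops that share a per-worker cache.
    numchunks = 4;
    out1 = cell(1, numchunks);
    out2 = cell(1, numchunks);
    parfor i = 1:numchunks
        out1{i} = applyCachedGain(i * ones(4, 1));
    end
    parfor i = 1:numchunks
        % A worker that already built the table reuses it here.
        out2{i} = applyCachedGain(i * ones(4, 1));
    end
end

function y = applyCachedGain(x)
    % Hypothetical helper: 'gain' is built on the first call in each
    % worker process and kept until the function is cleared or the pool closes.
    persistent gain
    if isempty(gain)
        gain = magic(4);            % stand-in for expensive or large constant data
    end
    y = gain * x;
end
```

Calling `[out1, out2] = demoPersistentCache();` should give identical results from both loops, with the table built at most once per worker.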

Comment by jake555 on 19 Sep 2019

Thank you for your answer, this was very helpful! It gives me a better understanding of what is happening "under the hood."
