How to transform these three nested FOR loops into a PARFOR loop?

Hi All,
I am trying to modify the three FOR nested loops below to allow the use of the PARFOR command.
A=sparse(m,n)
for j=1:N
for t1=1:T
for t2=t1:T % yes, t1 not 1, this is not a typo
A(f(j,t1,t2),g(j,t1,t2)) = h(j,t1,t2);
end
end
end
f, g and h are linear functions of j, t1 and t2; N>>T.
Replacing the outer FOR loop by a PARFOR loop obviously doesn't do it since A is not a sliced variable as it is right now because of the indexing.
Is there any way to do this?
Thanks in advance,
--Tanguy

 채택된 답변

Matt J
Matt J 2012년 10월 26일
Similar to Chris', I guess, but perhaps a little more readable/generalizable
A=sparse(m,n);
parfor i=1:n*T*T
[j,t1,t2] = ind2sub([N,T,T], i);
A(f(j,t1,t2),g(j,t1,t2)) = h(j,t1,t2);
end

댓글 수: 11

Thanks a lot for the reply. Note that in my initial example t2=t1:T (yes, t1, this is not a typo). Your solution works great for t2=1:T.
Any ideas on how to address the case where t2=t1:T?
Back to the ndgrid approach, I guess
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
parfor i=1:numel(J)
j=J(i); t1=T1(i); t2=T2(i);
A(f(j,t1,t2),g(j,t1,t2)) = h(j,t1,t2);
end
Thanks for your reply Matt. So... when I read your suggestions earlier I thought they would work great, but now that I'm actually testing them on matlab I still get the classification error on A.
Example:
N=10;
T=4;
A=sparse(N+T,N+T);
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
parfor i=1:numel(J)
j=J(i); t1=T1(i); t2=T2(i);
A(j+t1,j+t2)=j+t1+t2;
end
Error: The variable A in a parfor cannot be classified.
Ideas?
In your example, f and g are not 1-to-1 functions of j, t1, and t2. This means that you will be assigning to the same locations A(f,g) multiple times, overwriting previous values of h. Henceforth, I will assume that what you really want is
A(f(j,t1,t2),g(j,t1,t2)) = A(f(j,t1,t2),g(j,t1,t2)) + h(j,t1,t2);
Otherwise, it's not clear that there is any parallel-splitting possible in the computations.
With the above assumption, you can use MAT2TILES from the File Exchange and do as follows
N=10;
T=4;
A=sparse(N+T,N+T);
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
p=numel(J);
q=ceil(p/numlabs);
args=mat2tiles( [J+T1, J+T2, J+T1+T2],[q,1] );
parfor i=1:size(args,1)
A=A+sparse(args{i,:});
end
I've checked that the above code runs error free, but I increasingly doubt that parallel processing is suitable for what you're trying to do. The fastest thing is probably just to build A with a single command, as follows.
A=sparse(J+T1,J+T2,J+T1+T2, N+T, N+T);
Matt,
I really appreciate your input. I just realized reading your response that the small example I was using above to fix ideas was not the best since F: (j,t1,t2) --> (f(j,t1,t2),g(j,t1,t2)) was not injective in that case, which indeed means assigning the same locations A(f,g) multiple times. Sorry about that.
But your response greatly helped me to formulate what is my actual question:
If F: (j,t1,j2) --> (f(j,t1,t2),g(j,t1,t2)) is injective (with f and g not necessarily injective), is parallel processing suitable?
If F is injective, we are not assigning the same A(f,g) multiple times anymore. So it seems to me that we should be able to use parfor, right?
Below is an example closer to what I am trying to do:
N = 100; %in reality this number is much larger, that's why I would like to use parfor
T = 24;
RW=[]; %to track the values of f
CL=[]; %to track the values of g
K = (T+1)*T - sum([1:T]); %constant I use in the loop
A=sparse(N*T*(T+1)/2,(N-1)*T+T); %matrix to be filled in the loop
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
parfor i=1:numel(J)
j=J(i); t1=T1(i); t2=T2(i);
rw = (j-1)*K+(T+1)*(t1-1)-sum([1:(t1-1)])+t2-t1+1; %f(j,t1,t2)
cl = (j-1)*T+t1; %g(j,t1,t2)
% A(rw,cl) = 1 %h=1 to simplify
RW=[RW;rw];
CL=[CL;cl];
end
RC = [RW, CL]; %here I am tracking the values of F(j,t1,t2)
[u,I,J] = unique(RC, 'rows', 'first');
hasDuplicates = size(u,1) < size(RC,1) %here I make sure the same A(f,g) is not assigned multiple times.
First, I evaluate this piece of code without assigning values to A. I get hasDuplicates=0 which means that I am not trying to assign the same A(f,g) multiple times.
But then, when I try to actually assign values to A in the loop, I get the classification error on A... Am I missing something?
Tanguy,
If F is injective, then the PARFOR approach that I gave in my last comment will work just the same. However, it still seems highly doubtful to me that the Parallel Computing Toolbox will give the best performance here when everything you're doing above to build A, RW, and CL can be done instead with a few short vectorized commands. The overhead of transferring data to the labs and then consolidating the results from each of the labs seems like it might be greater than what loopless serial commands could do.
Below is how I would do it, but of course you can time both methods to test which is faster. Note that I've simplified your expressions for f and g somewhat, e,g. using the identity sum(1:n)=n*(n+1)/2
N = 100;
T = 24;
K = (T+1)*T/2; %constant I use in the loop
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
RW= (J-1).*K + (T-T1/2).*(T1-1) + T2;
CL= (J-1).*T +T1;
H=1;
A=sparse(RW,CL,H, N*T*(T+1)/2,(N-1)*T+T);
Matt,
Many thanks for your help! I compared both methods and the one using loopless serial commands is indeed way faster... I had never realized that these commands were so fast, thanks a lot.
Last question: in my initial code, instead of
rw = (j-1)*K+(T-t1/2)*(t1-1)+t2;
cl = (j-1)*T+t1
A(rw,cl) = h
I had:
rw = (j-1)*K+(T-t1/2)*(t1-1)+t2;
cl = [(j-1)*T+t1:(j-1)*T+t2]; % << change here
A(rw,cl) = h
Is there any way to do that only using loopless serial commands?
EDIT: I think I got it
[J,T1,T2,T3]=ndgrid(1:np, 1:T, 1:T, 1:T);
idx=(T2>=T1);
J=J(idx);
T1=T1(idx);
T2=T2(idx);
T3=T3(idx);
idx=(T3<=(T2-T1+1));
J=J(idx);
T1=T1(idx);
T2=T2(idx);
T3=T3(idx);
RW= (J-1).*tempT + (T-T1/2).*(T1-1) + T2;
CL= (J-1).*T +T1-1+T3;
H=1;
A=sparse(RW,CL,H, np*T*(T+1)/2,NbVar);
So everything's fine now?
So, the approach using ndgrid works fine to get rid of the 3 FOR loops (and now actually 4 with t3) I was mentioning in my initial question, and fill the sparse matrix A.
Now, in real life I have to deal with 2 additional nested loops, t4 and t5, to fill another sparse matrix b in addition to A:
N= big number (e.g. 10000)
T=24;
A=sparse(N*T*(T+1)/2,(N-1)*T+T);
b=sparse(N*T*(T+1)/2,1);
tempT = (T+1)*T/2;
for j=1:N
for t1=1:T
for t2=t1:T
for t3=1:t2-t1+1
rw = (j-1)*tempT + (t1-1)*(T-t1/2)+t2;
cl = (j-1)*T+t1-1+t3;
A(rw,cl) = 1;
for t4=t1:t2
for t5=t4:t2
b(rw) = b(rw) + dd(t4,t5,j); % dd is given
end
end
end
end
end
end
Using the same ndgrid approach leads to something like this:
[J,T1,T2,T3,T4,T5]=ndgrid(1:N, 1:T, 1:T, 1:T, 1:T, 1:T);
Do operations on J,T1,T2,T3,T4,T5 to reflect conditions on t2,t3,t4,t5
Calculate A and b
However, at this point:
- I don't see how to calculate b only using loopless serial commands
- Initializing this huge ndgrid takes a lot of time, and memory. I actually run out of memory pretty quickly.
Maybe an option would be to encapsulate the 5 inner FOR loops into a function that uses the ndgrid approach to get rid of the loops (with j as an input to that function), and then make the outer loop a PARFOR loop? Or is there a better approach?
I would really appreciate any comments and/or suggestions. Thanks
Yes, a different approach will be required. First though, Accept-click this answer (since you say it covers the 3-loop problem in your original post) and start a new post for your new question.

댓글을 달려면 로그인하십시오.

추가 답변 (2개)

Chris A
Chris A 2012년 10월 26일
편집: Chris A 2012년 10월 27일
Here is a possible solution:
A=sparse(m,n)
indexes=tril(reshape(1:T*T, T, T));
u1=indexes(indexes>0);
v1=[0:T*T:N*T*T-1];
indexes1=kron(ones(numel(u1),1), v1);
indexes2=kron(u1,ones(1,numel(v1)));
indexes=indexes1+indexes2;
for i=1:numel(indexes),
n=indexes(i),
j=floor((n-1)/(T*T))+1;
t1=mod(floor((n-1)/T),T)+1;
t2=mod(n-1,T)+1;
if (t2 < t1),
error('Incorrect index.');
end
A(f(j,t1,t2),g(j,t1,t2)) = h(j,t1,t2);
end

댓글 수: 6

Thanks Chris, your suggestion works great for t2=1:T. Any ideas on how to address the case where t2=t1:T? Thanks for your help.
EDIT: When I read your suggestion I thought it would work great for t2=1:T, but I just implemented it for the following small example and I still get the classification error on A...
N=10;
T=4;
A=sparse(N+T,N+T);
parfor n=1:(N*T*T),
j=floor((n-1)/(T*T))+1;
t1=mod(floor((n-1)/T),T)+1;
t2=mod(n-1,T)+1;
A(j+t1,j+t2) = j+t1+t2;
end
Error: The variable A in a parfor cannot be classified.
Ideas?
Tanguy,
I have modified the solution for the case t2=t1:T.
Please let me know if this works or you still have problems.
Regards
Chris
Chris,
Thanks for your response. Your solution works great to get rid of the two nested FOR loops, but is there a way to parallel split the remaining FOR loop? When I try on the following example (with a PARFOR loop) I still get the classification error on A. Thanks for your help.
--T
N = 100; %in reality this number is much larger
T = 24;
K = (T+1)*T - sum([1:T]); %constant I use in the loop
A=sparse(N*T*(T+1)/2,(N-1)*T+T);
indexes=tril(reshape(1:T*T, T, T));
u1=indexes(indexes>0);
v1=[0:T*T:N*T*T-1];
indexes1=kron(ones(numel(u1),1), v1);
indexes2=kron(u1,ones(1,numel(v1)));
indexes=indexes1+indexes2;
parfor i=1:numel(indexes),
n=indexes(i);
j=floor((n-1)/(T*T))+1;
t1=mod(floor((n-1)/T),T)+1;
t2=mod(n-1,T)+1;
if (t2 < t1),
error('Incorrect index.');
end
rw = (j-1)*K+(T+1)*(t1-1)-sum([1:(t1-1)])+t2-t1+1; %f
cl = (j-1)*T+t1; %g
A(rw, cl) = 1; %h=1 to simplify
end
Question:
Where is tempT defined? I do not have the parallel module so I cannot test my code out before I post it?
N = 100; %in reality this number is much larger
T = 24;
K = (T+1)*T - sum([1:T]); %constant I use in the loop
A=sparse(N*T*(T+1)/2,(N-1)*T+T);
indexes=tril(reshape(1:T*T, T, T));
u1=indexes(indexes>0);
v1=[0:T*T:N*T*T-1];
indexes1=kron(ones(numel(u1),1), v1);
indexes2=kron(u1,ones(1,numel(v1)));
indexes=indexes1+indexes2;
parfor i=1:numel(indexes),
n=indexes(i);
j=floor((n-1)/(T*T))+1;
t1=mod(floor((n-1)/T),T)+1;
t2=mod(n-1,T)+1;
% if (t2 < t1),
% error('Incorrect index.');
% end
% rw = (j-1)*tempT+(T+1)*(t1-1)-sum([1:(t1-1)])+t2-t1+1; %f
% cl = (j-1)*T+t1; %g
% A(rw, cl) = 1; %h=1 to simplify
end
Does this code run without an error?
Chris
tempT = K. Yes the code above runs without error. The error arises when I try to assign values to A in the parfor loop:
A(rw, cl) = 1;
It says the variable A in parfor cannot be classified, which makes sense according to what they say on this page (see paragraph on Form of Indexing): http://www.mathworks.com/help/distcomp/advanced-topics.html#bq_tcng-1
Does this mean there is no way we can parallel split this FOR loop? I don't know...
See if this works.
N = 100; %in reality this number is much larger
T = 24;
K = (T+1)*T - sum([1:T]); %constant I use in the loop
A=sparse(N*T*(T+1)/2,(N-1)*T+T);
indexes=tril(reshape(1:T*T, T, T));
u1=indexes(indexes>0);
v1=(0:T*T:N*T*T-1);
indexes1=kron(ones(numel(u1),1), v1);
indexes2=kron(u1,ones(1,numel(v1)));
indexes=indexes1+indexes2;
B=zeros(numel(indexes),3);
parfor i=1:numel(indexes),
n=indexes(i);
j=floor((n-1)/(T*T))+1;
t1=mod(floor((n-1)/T),T)+1;
t2=mod(n-1,T)+1;
% if (t2 < t1),
% error('Incorrect index.');
% end
rw = (j-1)*tempT+(T+1)*(t1-1)-sum([1:(t1-1)])+t2-t1+1; %f
cl = (j-1)*T+t1; %g
% A(rw, cl) = 1; %h=1 to simplify
B(i,:)=[rw cl 1];
end
A(sub2ind(size(b), B(:,1), B(:,2)))=1;

댓글을 달려면 로그인하십시오.

Matt J
Matt J 2012년 10월 26일
편집: Matt J 2012년 10월 26일
A=sparse(m,n);
[J,T1,T2]=ndgrid(1:N, 1:T, 1:T);
parfor i=1:numel(J)
j=J(i); t1=T1(i); t2=T2(i);
A(f(j,t1,t2),g(j,t1,t2)) = h(j,t1,t2);
end

카테고리

도움말 센터File Exchange에서 Loops and Conditional Statements에 대해 자세히 알아보기

질문:

2012년 10월 26일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by