parfor variable classification issue revisited

Craig on 11 Aug 2023
Commented: Jeff Miller on 18 Aug 2023
I have a million (literally) text files that I need to read a number from. I currently do this in a nested loop as such:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(prod([len_A, len_B, len_C, len_D, len_E]), 6);
for ind_A = 1 : len_A
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output(line_num, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
end
This is time intensive. Since my disk and processor are not maxed out, I wanted to do this in parallel and speed it up. Based on: https://www.mathworks.com/matlabcentral/answers/838625-parfor-variable-classification-issue, I tried:
output = zeros(prod([5, 6, 7, 8, 9]), 6);
% output = zeros(1, 7);
parfor ind_A = 1 : 5
    output_temp = zeros(prod([6, 7, 8, 9]), 6);
    count = 0;
    for ind_B = 1 : 6
        for ind_C = 1 : 7
            for ind_D = 1 : 8
                for ind_E = 1 : 9
                    count = count + 1;
                    line_num = sub2ind([9, 8, 7, 6, 5], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(count, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    max_line_num = sub2ind([9, 8, 7, 6, 5], 9, 8, 7, 6, ind_A);
    min_line_num = max_line_num - prod([9, 8, 7, 6, 1]) + 1;
    output(min_line_num : max_line_num, :) = output_temp;
end
I am unable to figure out how to make this work. I would truly appreciate any help you could provide.

Accepted Answer

Walter Roberson on 11 Aug 2023
Create a multidimensional array. Use parfor along one of the dimensions, preferably the last.
Within the parfor loop, use nested for loops and multidimensional indexing to assign values to a temporary array that is the right size except for having length 1 along the dimension you are running parfor over. After you have assigned all the values to the temporary array,
output(:,:,:,:,INDEX, :) = output_temp;
If you need to, then after the parfor loop, reshape() to collapse those other dimensions.
It is important that, in the only place where you write into the output variable, each index is one of ":", an expression that is constant throughout the parfor, or a linear transform of the parfor variable. Using a computed range as you are doing is Not Permitted.
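As a minimal sketch of that rule (the variable names `n`, `m`, `out`, and `tmp` are illustrative and do not come from the question), the loop below writes to the output only through ":" and the bare parfor variable, which is a form parfor can classify as a sliced output. The commented-out line shows the computed-range form that it rejects.

```matlab
n = 4;                      % number of parfor iterations (illustrative)
m = 3;                      % values produced per iteration (illustrative)
out = zeros(m, n);          % one column per iteration

parfor k = 1:n
    tmp = k * (1:m).';      % temporary variable, rebuilt in every iteration
    out(:, k) = tmp;        % allowed: ':' plus the parfor variable itself
    % out_flat((k-1)*m + 1 : k*m) = tmp;   % rejected: computed range
end
```

Each iteration fills exactly one slice, so the iterations stay independent and can be sent to different workers.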
Comments: 2
Craig on 18 Aug 2023
Edited: Craig on 18 Aug 2023
By following Walter's suggestions, and after some work such as changing the parfor from Walter's recommendation of the last index to the first, this is what I finally got to work for me:
len_A = 5;
len_B = 6;
len_C = 7;
len_D = 8;
len_E = 9;
output = zeros(len_A, len_B, len_C, len_D, len_E, 6);
parfor ind_A = 1 : len_A
    output_temp = zeros(len_B, len_C, len_D, len_E, 6);
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind([len_E, len_D, len_C, len_B, len_A], ind_E, ind_D, ind_C, ind_B, ind_A);
                    % Real Script
                    % open a file from the disk, read in a number
                    % output_temp(count, :) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E, the number from line above];
                    % Example Script
                    output_temp(ind_B, ind_C, ind_D, ind_E, 1:6) = [line_num, ind_A, ind_B, ind_C, ind_D, ind_E];
                end
            end
        end
    end
    output(ind_A, :, :, :, :, :) = output_temp;
end
output = reshape(output, prod([len_A, len_B, len_C, len_D, len_E]), 6);
output = sortrows(output, 1);
Walter Roberson on 18 Aug 2023
The reason I suggested parfor over the last dimension instead of the first is the way multidimensional arrays are stored: any leading : dimensions are stored in consecutive memory. So if you had A(:,:,idx), then A(1:end,1:end,idx) would be stored in consecutive memory. But if you had A(idx,:,:), then the elements would be size(A,1) apart from each other in memory, which is not as efficient to transfer as consecutive memory.
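A possible variant of Craig's working script that follows this layout advice is sketched below. It is an illustration under assumptions, not code posted in the thread: the six values per file are placed in the first dimension and ind_A in the last, so each parfor iteration writes one contiguous block. The name `dims` is introduced here for convenience.

```matlab
len_A = 5; len_B = 6; len_C = 7; len_D = 8; len_E = 9;
dims = [len_E, len_D, len_C, len_B, len_A];   % same order as in sub2ind

% Values dimension first, parfor dimension (ind_A) last
output = zeros([6, dims]);
parfor ind_A = 1 : len_A
    output_temp = zeros(6, len_E, len_D, len_C, len_B);
    for ind_B = 1 : len_B
        for ind_C = 1 : len_C
            for ind_D = 1 : len_D
                for ind_E = 1 : len_E
                    line_num = sub2ind(dims, ind_E, ind_D, ind_C, ind_B, ind_A);
                    % The real script would read the file here and append its number
                    output_temp(:, ind_E, ind_D, ind_C, ind_B) = ...
                        [line_num; ind_A; ind_B; ind_C; ind_D; ind_E];
                end
            end
        end
    end
    output(:, :, :, :, :, ind_A) = output_temp;   % sliced on the last dimension
end

% Column-major storage means column k of the reshaped array is line_num k
output = reshape(output, 6, []).';
```

Because the trailing dimensions match the order passed to sub2ind, the rows come out already ordered by line_num, so the sortrows call should not be needed in this variant.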


More Answers (1)

Jeff Miller on 16 Aug 2023
Edited: Jeff Miller on 18 Aug 2023
Maybe something like this would be helpful, using the wonderful allcomb.
idx = allcomb(1:5,1:6,1:7,1:8,1:9);
nrows = size(idx,1);
output = zeros(nrows,6);
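parfor ind_row = 1:nrows
    % Look up the five loop indices for this row
    idx_A = idx(ind_row,1);
    idx_B = idx(ind_row,2);
    idx_C = idx(ind_row,3);
    idx_D = idx(ind_row,4);
    idx_E = idx(ind_row,5);
    % yourActualFn stands for whatever the real script does (open the file, read the number)
    result = yourActualFn(idx_A,idx_B,idx_C,idx_D,idx_E);
    output(ind_row,:) = [idx_A, idx_B, idx_C, idx_D, idx_E, result];
end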
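If allcomb (a File Exchange function) is not available, a similar index table can be built with the built-in ndgrid. The sketch below is an assumption-laden illustration, not part of the original answer: `sum(row)` is a hypothetical stand-in for the number read from each file, and the output is given seven columns so that line_num can be stored too. With ind_E varying fastest, the row number should coincide with the line_num from the question.

```matlab
% Build every (A, B, C, D, E) combination, with E varying fastest
[E, D, C, B, A] = ndgrid(1:9, 1:8, 1:7, 1:6, 1:5);
idx = [A(:), B(:), C(:), D(:), E(:)];
nrows = size(idx, 1);

output = zeros(nrows, 7);     % line_num, five indices, value read from file
parfor r = 1:nrows
    row = idx(r, :);          % idx is sliced by the parfor variable
    value = sum(row);         % hypothetical stand-in for reading the file
    output(r, :) = [r, row, value];
end
```

A single flat parfor over all combinations also spreads the work more evenly than a parfor over only five values of ind_A.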
Comments: 2
Craig on 18 Aug 2023
Thanks for the reply Jeff. This might allow the calculation of the "line_num", but I don't see how it would allow me to do all the other work in the real script.
Jeff Miller on 18 Aug 2023
@Craig, Glad you got the problem solved.
Just for future reference, I edited the script to make it clearer what I thought you might do. Could be that I don't understand what other work you want to do in the real script, though.

Release

R2023a
