how to make this faster?

조회 수: 6 (최근 30일)
Yulai Zhang
Yulai Zhang 2022년 8월 13일
댓글: Yulai Zhang 2022년 8월 15일
I have A and B two large arrays of the same size, say, 2000 by 2000 by 5000 in size.
The operation below is very slow and uses lot of memory:
A(B==1)=2;
Is there a better way of doing this?
Thanks!
  댓글 수: 5
Jan
Jan 2022년 8월 13일
@Yulai Zhang: Please mention, what the type of A and B is. Is B logical or double? Can you work with INT8 instead?
Even on a machine with a lot of RAM (1 TB seems sufficient for your problem), writing 150 GB of data needs some time.
dpb
dpb 2022년 8월 13일
Could sparse arrays possibly help here?
Or possibly processing smaller chunks at a time could actually help throughput rather than trying to address the whole thing in one big bite?

댓글을 달려면 로그인하십시오.

답변 (2개)

Walter Roberson
Walter Roberson 2022년 8월 13일
The programming model of
A(B==1)=2;
is as if it first compares each element of B to 1, creating a temporary logical array, and then calls the subsassgin function to assign 2 to the locations in A referred to by the logical array.
That is, the programming model calls for B==1 to be completely completed before the assignment starts. It does not internally optimize the code to a loop along the lines of
for K = 1 : numel(A)
if B(K) == 1; A(K) = 2; end
end
except in some internal programming language (generated machine code for example)
Would it be possible in theory to detect that the indexing expression is "simple" and optimize to this kind of loop? Perhaps. But it would have to be fairly constrained. For example,
A = zeros(3,5,'sym');
syms B
A(A <= 0) = -1
A = 
A(end) = B
A = 
try
A(A <= 0) = -2
catch ME
fprintf('assignment failed\n')
lasterror
end
assignment failed
ans = struct with fields:
message: 'Error using evalin↵Unrecognized function or variable 'x'.' identifier: 'MATLAB:UndefinedFunction' stack: [5×1 struct]
A
A = 
The error message is wacky (should be an error about being unable to decide whether the comparison is true.
The important point here is to notice that A did not get changed by the failed assignment: that in the general case you have to fully evaluate the indexing expression first in case it fails.
In the simple case of a comparison of a double array to a constant, there cannot be a failure (provided the double array does not exceed the size of the original matrix... but there can be a failure if the double array is larger...)
So optimization would sometimes be possible, but mostly not.
What you could do in order to reduce the memory load is to process in chunks, such as a group of 10 three dimensional panes at a time. However if you do that then in order to get the assignment right you would have to calculate the locations of all of the indices using find() and sub2ind() -- because there is no way to ask for something like
A(logical2DArray, [], 5)
meaning that you wanted the logical2DArray to expand to indicate the first two dimensions.
It would certainly be possible to reduce memory pressure using these techniques, but it would probably not be faster if you do have available memory.
  댓글 수: 2
Walter Roberson
Walter Roberson 2022년 8월 13일
The straightforward one-at-a-time loop would use the least memory and if you used linear indexing the indexing calculations would not be too bad, but it would be slow.
Hmmm... perhaps this might not be too bad.
PerGroup = 1e7; %1e7 should give 2000 iterations
for K = 1 : PerGroup : numel(A)-PerGroup+1
start = (K-1)*PerGroup;
idx = find(B(start+1:K*PerGroup) == 1);
A(start + idx) = 1;
end
Yulai Zhang
Yulai Zhang 2022년 8월 15일
Thanks for your time and ideas.
Yes, I think for this particular case the one-at-a-time loop might be a good option since it does not need much memory.

댓글을 달려면 로그인하십시오.


Jan
Jan 2022년 8월 14일
Some ideas:
A = rand(2000, 2000, 5000);
B = randi([0,3], 2000, 2000, 5000);
tic;
A(B==1)=2;
toc
tic;
for k = 1:numel(A)
if B(k)==1
A(k) = 2;
end
end
toc
tic;
parfor k = 1:numel(A)
if B(k)==1
A(k) = 2;
end
end
toc
tic
A = reshape(A, 4e6, 5000);
B = reshape(B, 4e6, 5000);
parfor k = 1:500
a = A(:, k);
a(B(:, k)==1) = 2;
A(:, k) = a;
end
toc
  댓글 수: 1
Yulai Zhang
Yulai Zhang 2022년 8월 15일
let me try...
Thanks a lot!

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Loops and Conditional Statements에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by