More Efficient ismember Calculation
이전 댓글 표시
Hello, I'm working on an interpolation for a specific project that I'm working on and have managed to produce a very fast interpolation function for N-D array for042. There is one final bottleneck that is taking up 84% of the total run-time of the program. Inserted are the profiler and exemplar code, any and all optimizations are greatly appreciated, but the particular line is obvious from the profiler.

alpha = -5:5:30;
mach = 1:6:25;
beta = -5:2.5:5;
num = zeros(length(alpha),length(mach),length(beta));
for cnt = 1:numel(alpha)
for cnt1 = 1:numel(mach)
for cnt2 = 1:length(beta)
num(cnt,cnt1,cnt2) = 3*alpha(cnt)+mach(cnt1)/2+beta(cnt2);
end
end
end
for cnt2 = 1:length(beta)
data.for006{cnt2}.cd = num(:,:,cnt2);
end
Is an example of the data set that we're working with.
And the line in question from the profiler is part of this block:
a = zeros(1,2^size(CURRENT,1)-1);
curridx = zeros(1,size(CURRENT,1));
for cnt = 1:(2^size(CURRENT,1))
change = find((rem(cnt-1,2.^(0:size(CURRENT,1)-1))==0)==1);
curridx(change) = 1*curridx(change)==0;
idx = lidx.*(curridx==0) + uidx.*(curridx==1);
chkvals = zeros(1,length(idx));
for cnt1 = 1:length(idx)
B = RANGES{cnt1};
chkvals(cnt1) = B(idx(cnt1));
end
chkvals = [chkvals(2),chkvals(4:end)];
% INEFFICIENT LINE %
a(cnt) = DATA.for042{find(ismember(DATA.permutation,chkvals,'rows'))}.cn(sub2ind_a(siz,idx));
end
댓글 수: 11
Walter Roberson
2020년 4월 16일
the find() can be skipped. ismember returns logical and that can be used as logical indexing.
I cannot read the rest of the line of code that is invoking ismember.
Walter Roberson
2020년 4월 16일
is there a reason for for006 to be a cell array of struct instead of an array of struct?
Ayden Clay
2020년 4월 16일
Ayden Clay
2020년 4월 16일
Walter Roberson
2020년 4월 16일
What is sub2ind_a() ?
Why are you not assigning size(CURRENT,1) to a variable? It does not change inside the loop.
2^size(CURRENT,1)-1 is used multiple times too, and should be assigned to a variable.
Ayden Clay
2020년 4월 16일
Walter Roberson
2020년 4월 16일
Guidelines for using find():
- If you have a relational test that is used exactly once, and you find(), and you use the result of the find() only to index an array: then skip the find() and use the result of the relational test as a logical index.
- if you are doing computation on the indices returned by find() then it might be worth retaining the find(). For example you might be wanting to compute the distance between events
- If you are updating corresponding locations, matrix(locations) = f(matrix(locations)) then one could hypothesize that taking find(locations) and using that might in some cases be faster than using logical indexing, because there would be less for the assignment to examine (just change a few locations directly, right?) . However in my tests with large arrays, using logical indexing is faster even for a small number of output locations, and is notably faster if there are a large number of output locations.
Note: this timing test takes a few minutes to execute due to the size of the arrays. For each test, a new output array the same size as the input has to be created.
data = rand(83,19,207,51,3);
data(12345) = -1;
data(876543) = -1;
N = 10;
t1 = zeros(N,1);
t2 = zeros(N,1);
t3 = zeros(N,1);
t4 = zeros(N,1);
f1 = @()fun1(data);
f2 = @()fun2(data);
f3 = @()fun3(data);
f4 = @()fun4(data);
for K = 1 : N; t1(K) = timeit(f1,0); end
for K = 1 : N; t2(K) = timeit(f2,0); end
for K = 1 : N; t3(K) = timeit(f3,0); end
for K = 1 : N; t4(K) = timeit(f4,0); end
plot([t1,t2,t3,t4]);
legend({'find0', 'mask0', 'find5', 'mask5'});
m1 = mean(t1); m2 = mean(t2); m3 = mean(t3); m4 = mean(t4);
ms = [m1;m2;m3;m4];
disp('timings')
disp(ms);
m = min(ms);
disp('ratios')
disp(ms ./ m);
function fun1(data)
%small number of changes, find
idx = find(data<0);
data(idx) = data(idx) * 2; %#ok<NASGU>
end
function fun2(data)
%small number of changes, logical indexing
idx = data<0;
data(idx) = data(idx) * 2; %#ok<NASGU>
end
function fun3(data)
%many places, find
idx = find(data<0.5);
data(idx) = data(idx) * 2; %#ok<NASGU>
end
function fun4(data)
%many places, logical indexing
idx = data<0.5;
data(idx) = data(idx) * 2; %#ok<NASGU>
end
Walter Roberson
2020년 4월 16일
I cannot tell what the size() of lidx and uidx are ? And when could length(idx) be different than any other run? Anything that is calculated repeatedly in the loop should be moved out to as far out as it can go in the loops so it is calculated as few times as possible.
Ayden Clay
2020년 4월 17일
Walter Roberson
2020년 4월 17일
But the length of idx does not depend upon the contents of CURRENT, only on the size of CURRENT, right? So you can pre-compute it.
In the great majority of cases, if you can move a computation out of a loop, doing so will result in more efficient code. This is not always the case, but most of the time.
Ayden Clay
2020년 4월 20일
채택된 답변
추가 답변 (0개)
카테고리
도움말 센터 및 File Exchange에서 Shifting and Sorting Matrices에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!

