how to optimize this nested for loop
조회 수: 6 (최근 30일)
이전 댓글 표시
I have 2 nested loops which do the following:
- Get two rows of a matrix
- Check if indices meet a condition or not
- If they do: calculate xcorr between the two rows and put it into new vector
- Find the index of the maximum value of sub vector and replace element of LAG matrix with this value
I dont know how I can speed this code up by vectorizing or otherwise.
b=size(data,1);
F=size(data,2);
LAG= zeros(b,b);
for i=1:b
for j=1:b
if j>i
x=data(i,:);
y=data(j,:);
d=xcorr(x,y);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(i,j)=I-1;
d=xcorr(y,x);
d=d(:,F:(2*F)-1);
[M,I] = max(d);
LAG(j,i)=I-1;
end
end
end
댓글 수: 2
채택된 답변
Jan
2017년 12월 14일
편집: Jan
2017년 12월 17일
Just some tiny modifications:
b = size(data,1);
F = size(data,2);
LAG = zeros(b, b);
for i = 1:b
x = data(i,:);
for j = i+1:b
y = data(j,:);
d = xcorr(x,y);
[~,I] = max(d(F:(2*F)-1));
LAG(i,j) = I-1;
d = xcorr(y,x);
[~,I] = max(d(F:(2*F)-1));
LAG(j,i) = I-1;
end
end
It is a waste of time to calculate the complete xcorr, of you use (F:end) of the output only. Maybe you can create your own xcorr, which calculates the wanted part only. As far as I can see, the maxlag value will not help directly.
[EDITED] Modify the xcorr() function to be leaner:
function ck = leanXCorr(x, y)
len = length(x);
c = [zeros(1,len-1), x];
d = [y, zeros(1,len-1)];
% Compute FFTs
X1 = fft(c);
X2 = fft(d);
% Compute cross correlation
X = X1 .* conj(X2);
ck = ifft(X);
end
[EDITED 2] This was a bad idea. It works faster for input vectors of size 1x100, but the padding with zeros is inefficient for the FFT. Better use a window length based on NEXTPOW2:
function ck = leanXCorr(x, y)
m = length(x);
mxl = m - 1;
m2 = 2 ^ nextpow2(2*m - 1);
% Compute FFTs
X1 = fft(x, m2, 2);
X2 = fft(y, m2, 2);
% Compute cross correlation
c = ifft(X1 .* conj(X2), [], 2);
ck = [c((m2 - mxl + 1):m2), c(1:mxl + 1)];
end
Now I use input: data = rand(15, 41200) and get 3.66 sec instead of 3.82 with the built-in xcorr. Not impressing. But the next step is the actual improvement: If you inline the code into the loop, you can omit the repeated FFT of x and y:
b = size(data, 1);
F = size(data, 2);
LAG = zeros(b, b);
m = F;
mxl = m - 1;
m2 = 2 ^ nextpow2(2*m - 1);
for i = 1:b
X1 = fft(data(i, :), m2, 2);
for j = i+1:b
X2 = fft(data(j, :), m2, 2);
X12 = X1 .* conj(X2);
c = ifft(X12, [], 2);
d = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];
[~, I] = max(d);
LAG(i,j) = I - 1;
c = ifft(conj(X12), [], 2);
d = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];
[~, I] = max(d);
LAG(j,i) = I - 1;
end
end
This inlined version needs 2.14 sec instead of 3.81 of the original.
Unfortunately this still calculates fft(data(j, :), m2, 2) for the same values. If you could store fft(data, m2, 2), the code would run much faster. But your original data need more than 8GB of RAM. But on a 64GB machine, this would be faster:
FM = fft(data.', m2, 1);
for i = 1:b
X1 = FM(:,i); % fft(data(i, :), m2, 2);
for j = i+1:b
% X2 = fft(data(j, :), m2, 2);
X2 = FM(:,j);
X12 = X1 .* conj(X2);
c = ifft(X12, [], 1);
d = [c((m2 - mxl + 1 + F):m2); c(1:mxl+1)];
[~, I] = max(d);
LAG(i,j) = I - 1;
c = ifft(conj(X12), [], 1);
d = [c((m2 - mxl + 1 + F):m2); c(1:mxl+1)];
[~, I] = max(d);
LAG(j,i) = I - 1;
end
end
I have transposed the input, because it is faster to access the data in columnwise order.
댓글 수: 7
Jan
2018년 2월 23일
The FFT works more efficient, if the n-point DFT uses a power of 2. Therefore:
m2 = 2 ^ nextpow2(2*m - 1);
gets such a value. Instead of padding the data with zeros, a larger window length is used. The result is the same (compare [EDITED] with [EDITED2]), but the speed is higher. But now the output after IFFT is sorted differently, such that
d = [c((m2 - mxl + 1 + F):m2), c(1:mxl+1)];
picks out the elements exactly as if they have been created using the padding.
Simply try the two versions of leanXCorr() posted above:
function ck = leanXCorr1(x, y)
len = length(x);
c = [zeros(1,len-1), x];
d = [y, zeros(1,len-1)];
% Compute FFTs
X1 = fft(c);
X2 = fft(d);
% Compute cross correlation
X = X1 .* conj(X2);
ck = ifft(X);
end
and
function ck = leanXCorr2(x, y)
m = length(x);
mxl = m - 1;
m2 = 2 ^ nextpow2(2*m - 1);
% Compute FFTs
X1 = fft(x, m2, 2);
X2 = fft(y, m2, 2);
% Compute cross correlation
c = ifft(X1 .* conj(X2), [], 2);
ck = [c((m2 - mxl + 1):m2), c(1:mxl + 1)];
end
with some small input. Step through the code line by line using the debugger and compare the contents of the variables.
추가 답변 (0개)
참고 항목
카테고리
Help Center 및 File Exchange에서 Systems of Nonlinear Equations에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!