Slow down when accessing arrays
이전 댓글 표시
I have some code that really slows down when storing results. Code 2 runs orders of magnitude faster than Code1, especially as k increase in size (e.g. k = 500). I would need to store the results from each iteration i, but it seems very costly computationally. Why is this? Is there a better way to code it.
Code 1:
Q=zeros(k,k,N)
for i=1:N
Q(:,:,i) = Q(:,:,i) + e(i)^2*XtXt;
end
Code 2:
Q=zeros(k,k)
for i=1:N
Q = Q + e(i)^2*XtXt;
end
Here is the entire function if it is helpful. There are a lot of loops, but I found a clever way to eliminate any of them.
function varBhat = NeweyWest(e, X, L, invXX )
% e = (T x N)
% X = (T x k)
% L = scalar
% invXX = (k x k)
% Determine the size of the matrix of regressors
[T, k] =size(X);
% Determine the number of units
N = size(e,2);
% Calculate the Newey-West autocorrelation consistent covariance
Q = zeros(k,k,N);
for l = 0:L
w_l = 1-l/(L+1);
for t = l+1:T
if (l==0) % This calculates the S_0 portion
XtXt = compute_XtXt(X,t); %(k x k)
% reuse the computed matrix for all of N units
for i=1:N
Q(:,:,i) = Q(:,:,i) + e(t,i)^2 *XtXt;
end
else % This calculates the off-diagonal terms
XtXl = compute_XtXl(X,t,l,w_l); % (k x k)
% reuse the computed matrix for all of the generators
for i=1:N
Q(:,:,i) = Q(:,:,i) + e(t,i)*e(t-l,i)*XtXl;
end
end
end
end
Q = 1/(T-k) * Q;
% Calculate Newey-White standard errors (loops over each unit)
varBhat = finalNW(T, X, Q, invXX, N);
end
댓글 수: 2
Walter Roberson
2012년 5월 25일
Is XtXl a scalar, or is it a k x k array?
Joseph Cullen
2012년 5월 26일
답변 (2개)
Nathaniel
2012년 5월 25일
0 개 추천
How much memory in your computer, and how big is N? If you run out of available physical memory, then you will spend a lot of time waiting while your O/S swaps to and from your hard disk.
Image Analyst
2012년 5월 26일
0 개 추천
Well yeah! If the second case you're just overwriting a scalar N times so you're just doing N operations. In the first case you're assigning a kxk matrix N times. And with k = 500 that means 250,000 times N memory locations are being assigned. So I don't doubt that case 1 would run hundreds of times slower simply because you're assigning and storing hundreds of thousands more elements. What is the value of N? If it's less than about 5, then this should still happen in just a few seconds. But if N is also like 500, then it could take a very long time, and you might even run out of memory.
댓글 수: 5
Joseph Cullen
2012년 5월 26일
Image Analyst
2012년 5월 26일
OK, you're right that Q is a 2D matrix but a 2D matrix is still a lot less memory than a 3D matrix. It matters how you store things because they say that going down columns (incrementing rows first before moving over to the next column) is faster than going across rows first. I'm not sure why - yeah the next memory location is far away in memory rather than the very next memory location, but so what - we're talking ram memory here, not a hard drive. But apparently it does matter for some reason. So because your 3D matrix will involve memory farther away, it will take longer.
Walter Roberson
2012년 5월 26일
Fast CPUs can issue requests for chunks of memory on each read, as that reduces the average number of cycles the bus is occupied under the presumption that sequential memory will be used. The size of the chunks varies with the CPU, and the number of banks of memory can also make a difference. This kind of memory handling is generally known as "look-ahead cache".
On one of the servers I used to use (now obsolete), the optimum was 16 bytes (two double precision numbers) times 8 banks.
Joseph Cullen
2012년 5월 30일
Joseph Cullen
2012년 7월 11일
카테고리
도움말 센터 및 File Exchange에서 Logical에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!