Can you do this calculation any faster?

Hi there,
I am trying to optimize some code; an example is given below. In my real code, v_ustar etc. are calculated elsewhere and depend on q. This piece of code needs to run in a quite large loop (larger than the 1:1000 given as an example here), and I don't think vectorizing the entire loop is possible due to RAM limits. N is typically 16, but can be larger.
I use Ubuntu and MATLAB R2014a (I will probably upgrade to R2014b soon).
Thanks in advance!
N=16;
for q=1:1000
%generate some random test data
v_ustar=rand(2*N,N,N);
vstar_u=rand(2*N,N,N);
u_ustar=rand(2*N,N,N);
vstar_v=rand(2*N,N,N);
F=...
repmat(reshape(v_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_u), [1 2*N N N]),[2*N 1 1 1])-...
repmat(reshape(u_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_v), [1 2*N N N]),[2*N 1 1 1]);
F=reshape(F,4*N^2,[]).';
end

Comments: 4

Oleg Komarov on 15 Oct 2014
Do you create the inputs like that, or are those just for the example?
Henrik on 15 Oct 2014
Edited: Henrik on 15 Oct 2014
Those are just for the example. Calculating them properly is quite quick but takes a lot of code, so I thought it would be easier to use random data of the right size.
In the real code, there is more than one outer loop; v_ustar and u_ustar depend on one loop variable (q1), while vstar_u and vstar_v depend on the other (q2). I have calculated all the needed values of v_ustar etc. before this giant loop, so the actual code looks like
v_ustar=v_ustar_list(q1,:,:,:).
I left those things out since they don't take much time and I think they unnecessarily complicate matters.
The only small improvement I can think of, given this amount of code, is:
F =...
bsxfun(@times, reshape(v_ustar,[2*N 1 N N]), reshape(conj(vstar_u), [1 2*N N N])) -...
bsxfun(@times, reshape(u_ustar,[2*N 1 N N]), reshape(conj(vstar_v), [1 2*N N N]));
You could get rid of the `reshape()` if you store:
v_ustar(:,1,:,:) = v_ustar_list(q1,:,:,:)
and finally get to:
F =...
bsxfun(@times, v_ustar, conj(vstar_u)) -...
bsxfun(@times, u_ustar, conj(vstar_v));
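
A quick way to convince yourself that the bsxfun form is a drop-in replacement is to compare it against the repmat form on small inputs. The sketch below assumes random complex test data (so that conj actually does something) and a reduced N; F_repmat, F_bsx and max_diff are names made up for this check.

```matlab
% Sketch: check that the bsxfun form reproduces the repmat form.
% Random complex data stands in for the real inputs (an assumption).
N  = 4;                          % small N keeps the check quick
sz = [2*N N N];
v_ustar = rand(sz) + 1i*rand(sz);
vstar_u = rand(sz) + 1i*rand(sz);
u_ustar = rand(sz) + 1i*rand(sz);
vstar_v = rand(sz) + 1i*rand(sz);

% Original form: both operands are expanded explicitly with repmat
F_repmat = ...
    repmat(reshape(v_ustar,       [2*N 1 N N]), [1 2*N 1 1]) .* ...
    repmat(reshape(conj(vstar_u), [1 2*N N N]), [2*N 1 1 1]) - ...
    repmat(reshape(u_ustar,       [2*N 1 N N]), [1 2*N 1 1]) .* ...
    repmat(reshape(conj(vstar_v), [1 2*N N N]), [2*N 1 1 1]);

% bsxfun form: singleton dimensions are expanded without explicit copies
F_bsx = ...
    bsxfun(@times, reshape(v_ustar, [2*N 1 N N]), reshape(conj(vstar_u), [1 2*N N N])) - ...
    bsxfun(@times, reshape(u_ustar, [2*N 1 N N]), reshape(conj(vstar_v), [1 2*N N N]));

max_diff = max(abs(F_repmat(:) - F_bsx(:)));
fprintf('largest difference: %g\n', max_diff);
```

Both forms produce a 2N-by-2N-by-N-by-N array; the bsxfun version simply avoids materializing the four repmat copies.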
Henrik on 15 Oct 2014
Thanks, this seems to give quite an increase in performance! If you post this as an answer I will accept it (I don't think comments can be accepted).


Accepted Answer

Sean de Wolski on 15 Oct 2014
Edited: Sean de Wolski on 15 Oct 2014

Another (small) improvement you can make here is to pull the static computations out of the loop. For example,
[2*N 1 N N]
doesn't change at all, so it is being rebuilt on all 1000 iterations. Instead, store it in a variable before the loop and reference that variable everywhere inside it.
What do you actually end up doing with F after the loop?
I also wouldn't be surprised if splitting the F calculation into a few separate lines might help the JIT accelerator.
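
As a rough sketch of the hoisting idea, the size vectors and the reshape length can be built once before the loop. The names sz_col, sz_row and n_pairs are hypothetical, the inputs are the random stand-ins from the question, and the bsxfun form from the comments is used for the product.

```matlab
N = 16;

% Loop-invariant values, built once before the loop.
% (sz_col, sz_row and n_pairs are hypothetical names.)
sz_col  = [2*N 1 N N];
sz_row  = [1 2*N N N];
n_pairs = 4*N^2;

for q = 1:10                      % short loop, for illustration only
    % random stand-ins for the real inputs, as in the question
    v_ustar = rand(2*N,N,N);
    vstar_u = rand(2*N,N,N);
    u_ustar = rand(2*N,N,N);
    vstar_v = rand(2*N,N,N);

    F = bsxfun(@times, reshape(v_ustar, sz_col), reshape(conj(vstar_u), sz_row)) - ...
        bsxfun(@times, reshape(u_ustar, sz_col), reshape(conj(vstar_v), sz_row));
    F = reshape(F, n_pairs, []).';
end

disp(size(F));
```

The gain from this is likely to be small, since building a four-element vector is cheap compared with the array products, so it is worth timing before and after.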

Comments: 2

Henrik on 21 Oct 2014
Thanks, that did speed up the calculation a bit. Sorry I forgot to accept your answer.
Could you explain what you mean about splitting the calculation?
There's more background to what I'm trying to achieve here, if you're interested: http://www.mathworks.com/matlabcentral/answers/158214-find-zero-of-function-with-least-amount-of-iterations
When you have a really long line of code like this:
F=...
repmat(reshape(v_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_u), [1 2*N N N]),[2*N 1 1 1])-...
repmat(reshape(u_ustar,[2*N 1 N N]),[1 2*N 1 1]).*...
repmat(reshape(conj(vstar_v), [1 2*N N N]),[2*N 1 1 1]);
F=reshape(F,4*N^2,[]).';
The JIT might not do as good a job optimizing it. If you break out each piece, i.e. each repmat line, into its own variable and then combine the four variables, it might do a better job optimizing each piece.
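
One possible reading of that suggestion is sketched below, with each repmat term stored in its own variable before the terms are combined. A, B, C and D are hypothetical names, the inputs are random stand-ins, and whether this actually helps the JIT is something to measure (for example with tic/toc), not something this sketch shows.

```matlab
N = 16;
% random stand-ins for the real inputs, as in the question
v_ustar = rand(2*N,N,N);
vstar_u = rand(2*N,N,N);
u_ustar = rand(2*N,N,N);
vstar_v = rand(2*N,N,N);

% Each repmat term gets its own variable (A, B, C, D are hypothetical names)
A = repmat(reshape(v_ustar,       [2*N 1 N N]), [1 2*N 1 1]);
B = repmat(reshape(conj(vstar_u), [1 2*N N N]), [2*N 1 1 1]);
C = repmat(reshape(u_ustar,       [2*N 1 N N]), [1 2*N 1 1]);
D = repmat(reshape(conj(vstar_v), [1 2*N N N]), [2*N 1 1 1]);

% Combine the four pieces, then flatten as in the original code
F = A.*B - C.*D;
F = reshape(F, 4*N^2, []).';
```

The trade-off is that four full-size temporaries are alive at the same time, which matters if memory is already tight.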


Asked: 15 Oct 2014
Commented: 22 Oct 2014
