How can I make this .m code faster?

Question

I wrote some MATLAB code, but it takes more than 25 minutes to run. I think the cause is the for loops and the indexing. I have a data matrix H (435×4258, called h in the code) and I am trying to pick values from it depending on values in two other matrices. x and y have the same size as H (435×4258) and contain the locations of the data in H. The output matrix c is 869×869. What I am trying to do is draw a rectangle around every output grid point, find all the data points that fall inside it, and take their average. Can somebody help me make this run in less than 2 seconds? The code works like an interpolation. Is there any way to make it faster? The code:

distance = 7.5;   % was misspelled "destance" in the original post
d = distance/2;
m = 869;
n = 869;
rx = 3255;
c = zeros(m, n);
newx = -rx:distance:+rx;
newy = -rx:distance:+rx;
for i = 1:m
    yc = newy(i) + d;
    yf = newy(i) - d;
    for j = 1:n
        xc = newx(j) + d;
        xf = newx(j) - d;
        % indices of all data points inside the rectangle around (newx(j), newy(i))
        ind = find(x >= xf & x <= xc & y >= yf & y <= yc);
        if isempty(ind)
            c(i, j) = NaN;
        else
            p = mean(h(ind));
            c(i, j) = p;
        end
    end
end

Comments: 3

Guillaume on 3 Apr 2017

The code is bound to be slow. For each i, it rescans the whole x array, even though the x test for a given j does not depend on i. For each j, it rescans the whole y array, even though the y test does not depend on j. In other words, it performs m*n scans where at most m+n are required.

Stephen23 on 4 Apr 2017

https://www.mathworks.com/help/matlab/matlab_prog/techniques-for-improving-performance.html

https://www.mathworks.com/help/matlab/matlab_prog/preallocating-arrays.html

https://www.mathworks.com/help/matlab/matlab_prog/vectorization.html

https://www.mathworks.com/matlabcentral/answers/228557-experts-of-matlab-how-did-you-learn-any-advice-for-beginner-intermediate-users

Answer 1

Guillaume on 3 Apr 2017

Edited: Guillaume on 4 Apr 2017

First, a piece of advice: don't hardcode values that MATLAB can easily calculate. In your code, m and n are the numbers of elements in the vectors newy and newx, so you shouldn't hardcode them. One day you'll decide to change distance or rx, and your program will fail because you'll forget to recalculate m and n. Much safer:

m = numel(newy);   % rows of c follow newy
n = numel(newx);   % columns of c follow newx

Anyway, if I understood correctly, this will produce the same result as your code:

distance = 7.5;
rx = 3255;
newx = -rx:distance:rx;
newy = -rx:distance:rx;    %your code uses rx here too; shouldn't there be a separate ry?
binx = discretize(x(:), newx - distance/2);  %find index of x bin centered on newx with width distance
biny = discretize(y(:), newy - distance/2);  %find index of y bin centered on newy with width distance
c = accumarray([biny, binx], h(:), [], @mean);  %calculate the mean of all the values that fall within a bin

It should be much faster than your code, which scans the x and y arrays over and over.
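A practical note on the discretize and accumarray calls above, as a minimal sketch rather than a definitive version. discretize returns NaN for values outside the edges, and accumarray does not accept NaN subscripts, so out-of-range points have to be dropped first. Also, newx - distance/2 supplies 869 edges, which makes only 868 bins, so the sketch appends one closing edge. The data below (x, y, h) are synthetic stand-ins, and the names centres, edges, nbins and keep are illustrative assumptions, not part of the original answer.

```matlab
% Sketch only: x, y and h are synthetic stand-ins for the real data.
distance = 7.5;
rx = 3255;
centres = -rx:distance:rx;                    % bin centres, as in the question
edges = [centres - distance/2, centres(end) + distance/2];  % one more edge than bins
nbins = numel(centres);

rng(0);                                       % reproducible stand-in data
x = 1.1 * rx * (2*rand(1000, 1) - 1);         % some points fall outside the grid
y = 1.1 * rx * (2*rand(1000, 1) - 1);
h = rand(1000, 1);

binx = discretize(x, edges);                  % NaN for points outside the edges
biny = discretize(y, edges);
keep = ~isnan(binx) & ~isnan(biny);           % accumarray rejects NaN subscripts

% Explicit output size plus a NaN fill value, so empty bins come out as NaN.
c = accumarray([biny(keep), binx(keep)], h(keep), [nbins, nbins], @mean, NaN);
```

With the explicit size argument the result is always 869-by-869, even when the outermost bins receive no points. Note that discretize uses half-open bins, so a point lying exactly on a shared edge is counted once, whereas the closed intervals in the question's loop count it in both neighbouring cells.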

Comments: 14

Guillaume on 4 Apr 2017

Edited: Guillaume on 5 Apr 2017

I made a silly mistake in the code I wrote in the comment to Jan's answer: a sum matrix initialised to NaN can never sum to anything but NaN. The correct code is:

Edited: fixed an off-by-one error in the size of the bin matrices.

outsize = [2*rx/distance, 2*rx/distance];   %[rows (y bins), columns (x bins)]
binsum = zeros(outsize);
bincount = zeros(outsize);
for idx = 1:numel(h)
   binx = 1 + floor((x(idx)+rx)/distance);  %discretisation of x
   biny = 1 + floor((y(idx)+rx)/distance);  %discretisation of y
   %skip points outside the grid; && short-circuits on scalars
   if binx > 0 && biny > 0 && binx <= outsize(2) && biny <= outsize(1)
      binsum(biny, binx)  = binsum(biny, binx) + h(idx);
      bincount(biny, binx) = bincount(biny, binx) + 1;
   end
end
binmean = binsum ./ bincount; %internally loops over the bins; empty bins give 0/0 = NaN

To my surprise, it is much, much faster than my original solution.

Be aware that there is a minor difference between the two solutions: the grid in this latest solution is centered on the mid-points between the -rx:distance:rx points, whereas your original code and my accumarray code center the grid on the -rx:distance:rx points themselves.

I'm going to assume that was a mistake in your original code (and that it was also why you got NaNs in the discretize calls).
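If the grid should instead stay centered on the -rx:distance:rx points, one possible adjustment (an assumption on my part, not something posted in the thread) is to shift the coordinates by half a bin before taking the floor. The sketch below also replaces the explicit loop with two accumarray calls, one for the per-bin sum and one for the per-bin count. The data are synthetic stand-ins, and nbins, keep and subs are illustrative names.

```matlab
% Sketch only: synthetic stand-in data; variable names follow the thread.
distance = 7.5;
rx = 3255;
nbins = 2*rx/distance + 1;                    % 869 bins centered on -rx:distance:rx

rng(1);
x = rx * (2*rand(5000, 1) - 1);
y = rx * (2*rand(5000, 1) - 1);
h = randi(200, 5000, 1);

% Shift by half a bin: bin k then covers
% [-rx + (k-1.5)*distance, -rx + (k-0.5)*distance), centered on -rx + (k-1)*distance.
binx = 1 + floor((x + rx + distance/2) / distance);
biny = 1 + floor((y + rx + distance/2) / distance);
keep = binx >= 1 & binx <= nbins & biny >= 1 & biny <= nbins;

subs = [biny(keep), binx(keep)];
binsum   = accumarray(subs, h(keep), [nbins, nbins]);  % per-bin sum
bincount = accumarray(subs, 1, [nbins, nbins]);        % per-bin count
binmean  = binsum ./ bincount;                         % 0/0 gives NaN for empty bins
```

The logical mask keep plays the role of the branching if inside the loop; if all points are known to lie within the grid it can be dropped.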

Jony Muller on 5 Apr 2017

Error using bsxfun
Requested 1852230x163817 (2260.7GB) array exceeds maximum array size preference. Creation of arrays greater than this limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more information.

Guillaume on 5 Apr 2017

Well, use a loop then. More than 20% of your bins are NaN. That's a lot. As a result, you're trying to do 303,426,761,910 hypot calculations at once, which is going to need a lot of memory.

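%bin centres sized to match binmean, so that linear indices refer to the same bins
centres = -rx + distance * ((1:size(binmean, 1)) - 0.5);
[binycentres, binxcentres] = ndgrid(centres);
for binidx = find(isnan(binmean))'
   binidx   %no semicolon: displays the current index
   disttocentre = hypot(x(:) - binxcentres(binidx), y(:) - binycentres(binidx));
   [~, nearestidx] = min(disttocentre);   %data point nearest to the bin centre
   binmean(binidx) = h(nearestidx);
end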

That loop only does 1,852,230 hypot calculations at once, but does so 163,817 times. You're trading speed for memory.

No matter what, it's going to be very slow. You probably want to rethink what you're doing if you want to do this in real time.
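One alternative to the brute-force distance loop, offered as a swapped-in technique and not as the method described above, is to build a nearest-neighbour scatteredInterpolant once and evaluate it at the centres of the empty bins. The sketch below uses a small synthetic grid and synthetic data; the variable names follow the thread, while F and empty are illustrative.

```matlab
% Sketch only: small synthetic stand-in data, not the thread's real inputs.
rng(2);
x = 10 * rand(200, 1);
y = 10 * rand(200, 1);
h = x + y;                                    % any sample values will do

[binycentres, binxcentres] = ndgrid(0.5:1:9.5);   % 10-by-10 grid of bin centres
binmean = nan(size(binxcentres));             % start with every bin empty
binmean(1:2:end) = 0;                         % then mark every other bin as filled

% Nearest-neighbour interpolant, also used for extrapolation.
F = scatteredInterpolant(x, y, h, 'nearest', 'nearest');
empty = isnan(binmean);
binmean(empty) = F(binxcentres(empty), binycentres(empty));
```

scatteredInterpolant expects column vectors, so matrix inputs would need x(:), y(:) and h(:). For bin centres outside the convex hull of the data the extrapolation rule applies, and as far as I know it picks the nearest point on the hull boundary, which may differ from the brute-force nearest point.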

Answer 2

Jan on 3 Apr 2017

Edited: Jan on 3 Apr 2017

I assume Guillaume's suggestion is faster. For a comparison, try this cleaned-up loop:

distance = 7.5;  % Not "destance"
d  = distance/2;
m  = 869;
n  = 869;
rx = 3255;
c  = nan(m, n);
newx = -rx:distance:+rx;
newy = -rx:distance:+rx;
for i = 1:m
  yc = newy(i) + d;
  yf = newy(i) - d;
  yi = (y >= yf & y <= yc);  % Once per loop only
  for j = 1:n
     xc  = newx(j) + d;
     xf  = newx(j) - d;
     ind = (x >= xf & x <= xc & yi); 
     if any(ind(:))                       % ind is a matrix, so test all of its elements
       c(i, j) = sum(h(ind)) / nnz(ind);  % mean of the selected values
     end
  end
end

@Jony Muller: Please run a TIC/TOC and post the results for the two methods. Thanks.
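A minimal sketch of such a TIC/TOC comparison, assuming a much smaller grid and synthetic data so that both variants finish quickly. The value of rx and the data sizes are stand-ins chosen for the sketch, and tLoop, tAccum, c2 and edges are illustrative names.

```matlab
% Sketch only: synthetic stand-in data on a reduced grid.
distance = 7.5;  d = distance/2;
rx = 150;                                     % assumption: far smaller than the real 3255
newx = -rx:distance:rx;  newy = newx;
rng(3);
x = rx * (2*rand(200, 100) - 1);
y = rx * (2*rand(200, 100) - 1);
h = randi(200, 200, 100);

tic
c = nan(numel(newy), numel(newx));
for i = 1:numel(newy)
    yi = (y >= newy(i) - d & y <= newy(i) + d);   % y test once per row
    for j = 1:numel(newx)
        ind = (x >= newx(j) - d & x <= newx(j) + d & yi);
        if any(ind(:))
            c(i, j) = mean(h(ind));
        end
    end
end
tLoop = toc;

tic
edges = [newx - d, newx(end) + d];            % one more edge than bins
c2 = accumarray([discretize(y(:), edges), discretize(x(:), edges)], h(:), ...
                [numel(newy), numel(newx)], @mean, NaN);
tAccum = toc;

fprintf('loop: %.3f s, accumarray: %.3f s\n', tLoop, tAccum);
```

A single tic/toc measurement is noisy; wrapping each variant in a function handle and passing it to timeit is one way to get a steadier estimate.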

Comments: 19

Guillaume on 4 Apr 2017

Edited: Guillaume on 4 Apr 2017

@Jony,

Internally, my solution involves 4 loops (I assume, if discretize and accumarray are implemented optimally):

one loop over the x values, for the first discretize
one loop over the y values, for the second discretize
one loop over corresponding x,y and h values by accumarray to accumulate the h values into their corresponding bin
one loop over the bins by accumarray to calculate the bin mean

All these loops are implemented in mex files or directly within the matlab compiled code so they'll run much faster than anything you'd write in m code.

You could reduce it to two loops by doing the discretisation and accumulation in the same loop. Code in m would be something like:

binsum = zeros(2*rx/distance + 1);  %first posted as nan(...), but NaN + h stays NaN, so the sums must start at 0
bincount = zeros(size(binsum));
for idx = 1:numel(h)
   binx = 1 + floor((x(idx)+rx)/distance);  %discretisation of x
   biny = 1 + floor((y(idx)+rx)/distance);  %discretisation of y
   if binx > 0 && biny > 0 && binx <= m && biny <= m   %m = 869, as in the question
      binsum(biny, binx)  = binsum(biny, binx) + h(idx);
      bincount(biny, binx) = bincount(biny, binx) + 1;
   end
end
binmean = binsum ./ bincount; %internally loops over the bins.

This is optimal in the number of loops. You could make it even faster, if you got rid of the branching if by ensuring that all x and y are within -rx:+rx.

However, unless implemented in mex, I doubt it'll be faster than my discretize + accumarray solution.

Jan on 4 Apr 2017

@Jony: As long as you cannot provide input data, I cannot test some ideas I have. I understand that this is not trivial because of the file size. But this is your problem, so it is up to you to find a way to let us test suggestions and improvements.

But as far as I understand, Guillaume's code already solves the problem. Or does it return something different from your code?

Guillaume on 4 Apr 2017

@Jan,

I tested with this:

x = rx*(2*rand(435, 4258) - 1);
y = rx*(2*rand(435, 4258) - 1);
h = randi(200, 435, 4258);

which corresponds more or less to Jony's description.

I don't know why the data is arranged in matrix form. As far as I understand, this has no significance. The three variables could just as well be vectors.
