How can I make this .m code faster?
I wrote MATLAB code whose execution time is more than 25 minutes. I think the cause is the for loops and the indexing. I have a data matrix H (h in the code, 435×4258) and I am trying to take values from it depending on the values in two other matrices: x and y have the same size as H (435×4258) and contain the locations of the data in H. The output matrix c is 869×869. What I am trying to do is put a rectangle around every output point, find every data point that falls inside that rectangle, and take the average. The code works like an interpolation. Can somebody help me make it faster, ideally under 2 seconds? The code:
destance = 7.5;
d = distance/2;
m = 869;
n = 869;
rx = 3255;
c= zeros(m ,n);
newx = -rx:distance:+rx;
newy = -rx:distance:+rx;
for i = 1:m
    yc = newy(i)+d;
    yf = newy(i)-d;
    for j = 1:n
        xc = newx(j)+d;
        xf = newx(j)-d;
        ind = find(x>= xf & x<=xc & y >=yf & y<=yc);
        if isempty(ind)
            c(i,j) = NaN;
        else
            p = mean(h(ind));
            c(i,j) = p;
        end
    end
end
Comments: 3
Look at and use
doc profile
before you start making any assumptions of exactly which part of code is slow. It is a very simple and easy to use profiler compared to many you get for other languages.
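A minimal sketch of a profiling session follows. The loop body is only a small stand-in workload (an assumption, not the code from the question); replace it with the real loops to see where the time goes.

```matlab
% Profile a small stand-in workload; swap in the real loops here.
profile on
a = rand(200);
for k = 1:50
    b = find(a > 0.5);        %#ok<NASGU> stand-in for the expensive find() call
end
profile off
stats = profile('info');      % struct holding the collected timing data
% profile viewer              % uncomment to open the interactive report
```

The report lists time per function and per line, which shows directly whether the find call or the mean call dominates.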
Guillaume
3 Apr 2017
The code is bound to be slow. For a given j, the scan of the x array gives the same result for every i, yet it is repeated for each i. Likewise, for a given i, the scan of the y array gives the same result for every j, yet it is repeated for each j. In other words, the code performs m*n scans where at most m+n scans are required.
Stephen23
4 Apr 2017
Accepted Answer
More Answers (1)
I assume Guillaume's suggestion is faster. For comparison, try this cleaned-up loop:
distance = 7.5; % Not "destance"
d = distance/2;
m = 869;
n = 869;
rx = 3255;
c = nan(m, n);
newx = -rx:distance:+rx;
newy = -rx:distance:+rx;
for i = 1:m
    yc = newy(i) + d;
    yf = newy(i) - d;
    yi = (y >= yf & y <= yc); % Once per loop only
    for j = 1:n
        xc = newx(j) + d;
        xf = newx(j) - d;
        ind = (x >= xf & x <= xc & yi);
        if any(ind)
            c(i, j) = sum(h(ind)) / sum(ind);
        end
    end
end
@Jony Muller: Please time the two methods with TIC/TOC and post the results. Thanks.
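A minimal sketch of such a tic/toc comparison, using synthetic stand-in data and thresholds (both assumptions) rather than the real x, y and h:

```matlab
x = rand(435, 4258);                  % synthetic stand-in for the real x (assumption)

tic
ind = find(x >= 0.4 & x <= 0.6);      % find-based selection, as in the question
tFind = toc;

tic
mask = (x >= 0.4 & x <= 0.6);         % logical mask, as in the cleaned-up loop
tMask = toc;

fprintf('find: %.4f s, logical mask: %.4f s\n', tFind, tMask);
```

Single tic/toc measurements are noisy, so repeating each variant a few times gives a fairer comparison.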
Comments: 19
Guillaume
3 Apr 2017
Really, the two loops should be separate since x and y are completely independent. You've taken out the multi-scanning of the y array but not the x.
Jony Muller
4 Apr 2017
Jony Muller
4 Apr 2017
Jan
4 Apr 2017
@Guillaume: I don't get your point.
@Jony: mean() has more overhead than sum. If you provide some meaningful input data, we could do some time measurements on our own. Optimizations are much easier when, for example, the output of the profiler can be used. This code needs 1350 seconds?! That sounds really strange.
Nevertheless, if Guillaume's version works, you should use it in any case; comparing it with leaner loops is of academic interest only.
Jony Muller
4 Apr 2017
Without the corresponding inputs, I cannot guess what happens. How large are the inputs? Even 3 seconds seems surprisingly slow.
I do not think that the code shown can create a "/ Matrix dimensions must agree" error. Did you modify the code?
It would be very useful to have some data to play with. Then answering is not based on guessing.
Jony Muller
4 Apr 2017
Edited: Jony Muller
4 Apr 2017
Jan
4 Apr 2017
@Jony: Again, please provide some input data. If you only explain what the input data might look like, I would have to sit down and invent some MATLAB code to produce arrays which might match yours. But this would be a waste of my time, and of yours too, if my speculations do not match your data.
I cannot imagine or guess why mean could return anything different from sum(h(ind)) / sum(ind). As long as ind is a vector, both expressions should be scalars.
I really want to help you. But without meaningful data I do not see a way to do this efficiently. So please post either the original data or, if it is enough to reproduce the problem, some rand calls to create pseudo data. The sizes and values of x and y matter; perhaps they are sorted already. H is not huge, therefore I do not have the faintest idea what the computer is doing in the 1357 seconds.
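The "as long as ind is a vector" condition matters here, because x, y and h are described as 435×4258 matrices. The sketch below uses a small stand-in matrix (an assumption) to show that sum on a logical matrix returns one count per column, whereas nnz returns a single scalar count. Whether this is what triggered the dimension error in the thread is a guess, since the failing call is not shown.

```matlab
h   = magic(4);               % small stand-in for the data matrix (assumption)
ind = h > 8;                  % logical matrix with the same size as h

colCounts = sum(ind);         % 1x4 row vector: one count per column
nSel      = nnz(ind);         % scalar: total number of selected elements

avg1 = sum(h(ind)) / nSel;    % scalar average via sum and nnz
avg2 = mean(h(ind));          % h(ind) is a column vector, so mean is a scalar
```

With a matrix mask, sum(ind(:)) or nnz(ind) keeps the denominator scalar.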
Jony Muller
4 Apr 2017
Jan
4 Apr 2017
Can you attach the data as MAT files?
Jony Muller
4 Apr 2017
Jony Muller
4 Apr 2017
Edited: Jony Muller
4 Apr 2017
@Jan, "Guillaume: I don't get your point."
You've extracted yi = (y >= yf & y <= yc) out of the inner loop because it kept being recalculated for each x position. However, the same happens with x. The result x >= xf & x <= xc is the same for a given j regardless of the value of i, yet you recalculate it at each step of the i loop.
That is my point: the binning of x is independent of the binning of y. Therefore, it would make sense to have the two loops independent. Practically, with your approach, that's not possible (without a third loop).
@Jony,
Internally, my solution involves 4 loops (I assume, if discretize and accumarray are implemented optimally)
- one loop over the x values, for the first discretize
- one loop over the y values, for the second discretize
- one loop over corresponding x,y and h values by accumarray to accumulate the h values into their corresponding bin
- one loop over the bins by accumarray to calculate the bin mean
All these loops are implemented in mex files or directly within the matlab compiled code so they'll run much faster than anything you'd write in m code.
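The steps listed above can be sketched as follows. This is an illustration under stated assumptions, not necessarily the code of the accepted answer: the data is synthetic, the bin edges are chosen so that each bin is centred on a grid point of -rx:distance:rx, and the mean is computed as a sum divided by a count. Note that discretize (available from R2015a) uses half-open bins, so a point lying exactly on an edge is counted once, whereas the original loops count it in both neighbouring cells.

```matlab
distance = 7.5;  d = distance/2;  rx = 3255;
edges = (-rx - d):distance:(rx + d);     % 870 edges -> 869 bins centred on the grid
nb = numel(edges) - 1;

rng(0);                                  % synthetic stand-in data (assumption)
x = rx*(2*rand(435, 4258) - 1);
y = rx*(2*rand(435, 4258) - 1);
h = randi(200, 435, 4258);

bx = discretize(x(:), edges);            % column (x) bin of every point, NaN if outside
by = discretize(y(:), edges);            % row (y) bin of every point, NaN if outside
ok = ~isnan(bx) & ~isnan(by);            % keep only points inside the grid

hv     = h(:);
subs   = [by(ok), bx(ok)];
sums   = accumarray(subs, hv(ok), [nb nb]);   % sum of h per bin
counts = accumarray(subs, 1,      [nb nb]);   % number of points per bin
c = sums ./ counts;                      % empty bins give 0/0 = NaN
```

Passing @mean and a NaN fill value to accumarray would give the same result in one call; the sum/count form avoids calling a function handle once per bin.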
You could reduce it to two loops by doing the discretisation and accumulation in the same loop. Code in m would be something like:
binsum = zeros(2*rx/distance + 1);   % must start at 0: NaN + h would stay NaN
bincount = zeros(size(binsum));
for idx = 1:numel(h)
    % adding d centres each bin on the newx/newy grid points
    binx = 1 + floor((x(idx) + rx + d)/distance);   % discretisation of x
    biny = 1 + floor((y(idx) + rx + d)/distance);   % discretisation of y
    if binx > 0 && biny > 0 && binx <= n && biny <= m
        binsum(biny, binx) = binsum(biny, binx) + h(idx);
        bincount(biny, binx) = bincount(biny, binx) + 1;
    end
end
binmean = binsum ./ bincount;   % internally loops over the bins; empty bins give 0/0 = NaN
This is optimal in the number of loops. You could make it even faster by getting rid of the branching if, provided you ensure that all x and y are within -rx:+rx.
However, unless implemented in mex, I doubt it'll be faster than my discretize + accumarray solution.
Jan
4 Apr 2017
@Guillaume: Thanks, it is clear now. I only simplified the original loops in a trivial way. I'm afraid that x and y are so huge that sorting the complete matrix might exhaust the memory.
Jony Muller
4 Apr 2017
Edited: Guillaume
4 Apr 2017
Jan
4 Apr 2017
@Jony: As long as you cannot provide input data, I cannot test some ideas I have. I understand that it is not trivial due to the file size. But this is your problem, so it is your turn to find a way to allow us to test suggestions and improvements.
But as far as I understand, Guillaume's code solves the problem already. Or does it return something different from your code?
Guillaume
4 Apr 2017
@Jan,
I tested with this:
x = rx*(2*rand(435, 4258) - 1);
y = rx*(2*rand(435, 4258) - 1);
h = randi(200, 435, 4258);
which corresponds more or less to Jony's description.
I don't know why the data is arranged in matrix form. As far as I understand, this has no significance. The three variables could just as well be vectors.

