How can I speed up these for loops?

조회 수: 2 (최근 30일)
Timothy Plummer
Timothy Plummer 2017년 6월 1일
편집: Kelly Kearney 2017년 6월 14일
I have a group of 3 nested for loops that will calculate how much light is hitting a rectangular array of points if I know where the fixtures are located, in relation to the array of points. The results are used to determine how to best arrange a group of the same fixture in a space so there is a uniform amount of light at 5 different light levels, and at 8 different mounting heights, and it generally takes 10 to 20 calculations to find the best arrangement. At worst case scenario this is running 5*8*20 (800) times.
In the following code 'rows' and 'columns' are vectors of the same size that contains x and y coordinates of the of calculation points I need. 'xFixtureLocations' and 'yFixtureLocations' are vectors of the same size that contain the x and y coordinates of the fixtures. Finally, there is a struct 'IESdata' that contains 3 variables 'HorizAngles' (a vector of size 37), 'VertAngles' (a vector of size 37), and, 'photoTable' (an array of size m x n) that describes the intensity of the light in the steradian centered on the 'HorizAngles(m)' and 'VertAngles(n)' for the fixture in the calculation.
As of right now, I know that the calculation that is generated by this system is correct because I have compared it to programs that have been verified by the industry, namely AGI32. I have tried to parallelize the code by making the first loop a 'parfor' but I did not see a decent of improvement in how long it takes to run.
Irr = zeros(length(rows),length(columns),length(xFixtureLocations));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPt*180/pi, thetaPt*180/pi,'*nearest', 0.);
Irr(i1,i2,i3) = round((Ipt*cos(thetaPt)/dsq)*(Multiplier/1000),3);
end
end
end
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));
  댓글 수: 1
dpb
dpb 2017년 6월 1일
This would likely be easier to visualize with a (small) sample dataset to accompany it and I don't have time at the moment to pore over it in enough detail to really work out the details, but...
It strikes me you could generate the [X,Y] location arrays for the two sets of points (calculational and fixtures, respectively) and then use pdist2 to return the distance and then just do the calculation on that result. Seems as though could basically remove all the looping...
As said, a small sample case and code that could run in its entirety would surely help folks to do something without having to develop a baseline from which to start as well...

댓글을 달려면 로그인하십시오.

채택된 답변

Kelly Kearney
Kelly Kearney 2017년 6월 1일
You're calling interp2 800 times with a single point each. Just moving that outside the loop should be able to speed things up considerably:
[phiPtall, thetaPtall] = deal(zeros(length(rows), length(columns), length(xFixtureLocations)));
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt = atan(r/MountHeight);
if x==0
phiPt = 0;
else
phiPt = atan2(y,x);
end
phiPt = phiPt+pi + orientation(i3);
phiPt = mod(phiPt,2*pi)-pi;
dsq = r^2+(MountHeight)^2;
phiPtall(i1,i2,i3) = phiPt*180/pi;
thetaPtall(i1,i2,i3) = thetaPt*180/pi;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
phiPtall, thetaPtall,'*nearest', 0.);
Irr = round((Ipt.*cos(thetaPtall)./dsq).*(Multiplier./1000),3);
Irr = sum(Irr,3);
Avg = mean2(Irr);
Max = max(max(Irr));
Min = min(min(Irr));
As dpb mentioned, the calculations of the angles also look like they can be vectorized pretty well, but that would be more easily tested with some sample data.
  댓글 수: 5
Timothy Plummer
Timothy Plummer 2017년 6월 12일
편집: Timothy Plummer 2017년 6월 12일
So after iterating through this solution a lot, and expanding out the size of the x and y locations by almost eight-fold, I am back at the program taking more than 2 hours to full program to compute. If there is anything that can be done to make this loop even better that would be great.
I have attached some data that is generated before one attempted calculation and the results of that calculation.
Kelly Kearney
Kelly Kearney 2017년 6월 14일
편집: Kelly Kearney 2017년 6월 14일
I think your best bet would be to do away with the loop altogether. That may depend on the size of your data, though... the one-loop solution may make more sense if you're dealing with large arrays. I tested your 3-loop option against a fully vectorized option and a 1-loop option:
% Load input
A = load('~/Downloads/TestingData.mat');
rows = A.rows;
columns = A.columns;
xFixtureLocations = A.xFixtureLocations;
yFixtureLocations = A.yFixtureLocations;
IESdata = A.IESdata;
% Some counters
nr = length(rows);
nc = length(columns);
nfix = length(xFixtureLocations);
[m,n] = size(IESdata.photoTable);
% Additional input not provided
orientation = zeros(1, nfix);
Multiplier = 1;
MountHeight = 2;
% 3 loops
tic;
for ii = 1:500
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for i1 = 1:length(rows)
for i2 = 1:length(columns)
for i3 = 1:length(xFixtureLocations)
x = rows(i1)-xFixtureLocations(i3);
y = columns(i2)-yFixtureLocations(i3);
r = sqrt(x^2 + y^2);
thetaPt(i1,i2,i3) = atan(r/MountHeight);
if x==0
phiPt(i1,i2,i3) = 0;
else
phiPt(i1,i2,i3) = atan2(y,x);
end
phiPt(i1,i2,i3) = phiPt(i1,i2,i3) + pi + orientation(i3);
phiPt(i1,i2,i3) = mod(phiPt(i1,i2,i3),2*pi)-pi;
dsq(i1,i2,i3) = r^2+(MountHeight)^2;
end
end
end
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr{1} = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(1) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(1) = toc;
% No loops
tic
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
x = bsxfun(@minus, xg, permute(xFixtureLocations, [1 3 2]));
y = bsxfun(@minus, yg, permute(yFixtureLocations, [1 3 2]));
r = sqrt(x.^2 + y.^2);
thetaPt = atan(r./MountHeight);
phiPt = zeros(size(y));
phiPt(x~=0) = atan2(y(x~=0), x(x~=0));
phiPt = bsxfun(@plus, phiPt + pi, permute(orientation, [1 3 2]));
phiPt = mod(phiPt, 2*pi) - pi;
dsq = r.^2 + MountHeight.^2;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(2) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(2) = toc;
% One loop
for ii = 1:500
[xg, yg] = ndgrid(rows, columns);
phiPt = zeros(nr,nc,nfix);
thetaPt = zeros(nr,nc,nfix);
dsq = zeros(nr,nc,nfix);
for ifix = 1:nfix
x = xg - xFixtureLocations(ifix);
y = yg - yFixtureLocations(ifix);
r = sqrt(x.^2 + y.^2);
thetaPt(:,:,ifix) = atan(r./MountHeight);
tmp = zeros(size(y));
tmp(x~=0) = atan2(y(x~=0),x(x~=0));
phiPt(:,:,ifix) = tmp;
phiPt(:,:,ifix) = phiPt(:,:,ifix) + pi + orientation(ifix);
dsq(:,:,ifix) = r.^2 + MountHeight.^2;
end
phiPt = mod(phiPt, 2*pi) - pi;
Ipt = interp2(IESdata.HorizAngles-180, IESdata.VertAngles, IESdata.photoTable, ...
rad2deg(phiPt), rad2deg(thetaPt), 'nearest', 0);
Irr = round((Ipt.*cos(thetaPt)./dsq).*(Multiplier./1000),3);
Out(3) = struct('phiPt', phiPt, 'thetaPt', thetaPt, 'dsq', dsq, 'Irr', Irr);
end
t(3) = toc;
fprintf('3 loops: %f\n0 loops: %f\n1 loop : %f\n', t);
(The 500-times loop is just for timing purposes).
These were the results on my computer:
3 loops: 0.514277
0 loops: 0.351616
1 loop : 0.852183

댓글을 달려면 로그인하십시오.

추가 답변 (0개)

카테고리

Help CenterFile Exchange에서 Loops and Conditional Statements에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by