# How can the following code be optimized / vectorized?

조회 수: 1 (최근 30일)
L'O.G. 2022년 3월 16일
댓글: L'O.G. 2022년 3월 16일
How can you optimize the following? It is painfully slow and takes around 5 minutes to run on my computer. Here I am calcuatling the distances between atoms in molecules while taking account periodic boundary conditions (which mean an atom on the edge of the box might actually belong to a molecule that wraps around, so to speak). I imagine this could be done in a far more efficient and speedy way than my Fortran-esque implementation.
A % an array where column 1 is the atom ID, column 2 is the molecule ID, and columns 3-5 are the positions in x, y, and z
N = 200; % molecules in system
Sites = 150; % sites in each molecule
length = 120; % length of box on each side
SitesInSystem = N*Sites;
B = zeros(SitesInSystem,SitesInSystem);
% center box at origin
A(:,3) = A(:,3) - length/2;
A(:,4) = A(:,4) - length/2;
A(:,5) = A(:,5) - length/2;
for ii = 1:SitesInSystem
for jj = 1:SitesInSystem
if ii ~= jj % if not the same site
dx = A(jj,3) - A(ii,3);
dy = A(jj,4) - A(ii,4);
dz = A(jj,5) - A(ii,5);
% taking into account periodic boundaries
if (dx > length * 0.5)
dx = dx - length;
elseif (dx <= -length * 0.5)
dx = dx + length;
end
if (dy > length * 0.5)
dy = dy - length;
elseif (dy <= -length * 0.5)
dy = dy + length;
end
if (dz > length * 0.5)
dz = dz - length;
elseif (dz <= -length * 0.5)
dz = dz + length;
end
% calculate distances
B(ii,jj) = sqrt(dx^2+dy^2+dz^2);
end
end
end
B = triu(B); % to avoid double counting
Perhaps this also can be combined with the following snippet, which calculates something based on the values of the matrix B:
for ii = 1:N % molecule
for jj = 1:Sites % site
for kk = 1:N
for mm = 1:Sites
if ii ~= kk % if not the same molecule
if (B(ii*jj,kk*mm) < 1.75) && (B(ii*jj,kk*mm) > 0)
% do a calculation, i.e., add edge ii-kk to a graph
% with N nodes
end
end
end
end
end
end
##### 댓글 수: 4이전 댓글 2개 표시이전 댓글 2개 숨기기
Stephen23 2022년 3월 16일
"Here you can read the data into MATLAB and use readtable and table2array to turn the data into a MATLAB array."
READMATRIX would be simpler and more efficient: https://www.mathworks.com/help/matlab/ref/readmatrix.html
L'O.G. 2022년 3월 16일
@Stephen Thanks. I didn't know about that function!

댓글을 달려면 로그인하십시오.

### 채택된 답변

Bruno Luong 2022년 3월 16일
편집: Bruno Luong 2022년 3월 16일
It takes about 25 seconds only PC
Ac35 = A(:,3:5);
Ai = reshape(Ac35,[],1,3);
B = zeros(SitesInSystem,SitesInSystem);
for j=1:SitesInSystem
Aj = reshape(Ac35(j,:),1,[],3);
dxyz = Ai-Aj;
dxyz = mod(dxyz+length/2,length)-length/2;
B(:,j) = sqrt(sum(dxyz.^2,3));
end
##### 댓글 수: 1이전 댓글 -1개 표시이전 댓글 -1개 숨기기
L'O.G. 2022년 3월 16일
Excellent. Thank you.

댓글을 달려면 로그인하십시오.

### 추가 답변 (2개)

KSSV 2022년 3월 16일
I think pdist2, this inbuilt function will work for you.
T = readtable('Sample.xlsx') ;
A = table2array(T) ;
A = A(:,3:5) ; % pick coordinates (x,y,z)
d = pdist2(A,A) ;
B = triu(d) ; % you may check this with you answer. Use isequal to check.
##### 댓글 수: 3이전 댓글 1개 표시이전 댓글 1개 숨기기
KSSV 2022년 3월 16일
I have checked your code and pdist2 for the provided 274*5 data. Both the methods, gave same solution.
L'O.G. 2022년 3월 16일
편집: L'O.G. 2022년 3월 16일
@KSSV Right, but for the full case (not the sample file I gave) I would need to consider that some atoms might be at the edges of the box, so that periodic boundaries as applied to that molecule must be taken into account. Sorry I didn't make that clearer. I mentioned it in my initial query and coded it using the if statements nested in the orignal set of for loops, but I guess not very clearly.

댓글을 달려면 로그인하십시오.

Walter Roberson 2022년 3월 16일
You can see that the rem() based expression I use here can be used to vectorize the calculation.
format bank
Length = 5;
x = -3.5 : .5 : 3.5
x = 1×15
-3.50 -3.00 -2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
for K = 1 : length(x)
dx = x(K);
if (dx > Length * 0.5)
dx = dx - Length;
elseif (dx <= -Length * 0.5)
dx = dx + Length;
end
newx_if(K) = dx;
end
newx_mod = rem((x - Length/2), Length) + Length/2;
newx_if
newx_if = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 -2.00 -1.50
newx_mod
newx_mod = 1×15
1.50 2.00 2.50 -2.00 -1.50 -1.00 -0.50 0 0.50 1.00 1.50 2.00 2.50 3.00 3.50
newx_if - newx_mod
ans = 1×15
0 0 0 0 0 0 0 0 0 0 0 0 0 -5.00 -5.00
##### 댓글 수: 3이전 댓글 1개 표시이전 댓글 1개 숨기기
Walter Roberson 2022년 3월 16일
newxyz_mod = rem((A - Length/2), Length) + Length/2; % for x, y, z
looks fine, as long as Length is the same in all three directions.
You would follow it with
B = sqrt( sum(newxyz_mod.^2, 2) );
L'O.G. 2022년 3월 16일
편집: L'O.G. 2022년 3월 16일
@Walter Roberson Got it. Thank you! And about my other question, the differences -- rem() should use the differences, correct? Not the positions themselves? That's my understanding, but I don't know how to calculate that efficiently. As I mentioned in my previous comment, something like pdist(A(:,1),@(x,y) x-y); generates a huge vector and is slow, and that's just for x! I think there will be (N*Sites)^2 / 2 possible distances which will fill an upper triangular matrix. I'm not sure how to code that efficiently here, though.

댓글을 달려면 로그인하십시오.

### 카테고리

Help CenterFile Exchange에서 Matrix Indexing에 대해 자세히 알아보기

R2021b

### Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by