Fast calculation of distances between two large arrays
Dear MATLAB-Community,
I would like to calculate the distances between every row of M (1 113 486 x 2) and every row of N (1 960 000 x 2) and store the indices of the pairs whose distance is within a tolerance value tol. Can someone help me do that efficiently? The code below would take about 90 weeks. I have also tried [~, ind] = ismembertol(M, N, tol) which gives me logical 1 for every pair which does not make sense.
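tol = 0.5;
indM = NaN(size(M,1),1);   % NaN marks rows of M without a match
indN = NaN(size(N,1),1);   % NaN marks rows of N without a match
progressbar                % initialise the progress display
for m = 1:size(M,1)
    for n = 1:size(N,1)
        % no else branch: resetting to NaN here would erase earlier matches
        if pdist2(M(m,1:2), N(n,1:2)) <= tol
            indM(m) = m;
            indN(n) = n;
        end
    end
    progressbar(m/size(M,1))
end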
Kind regards
Philipp
Comments: 2
Stephen23
24 Apr 2023
Edited: Stephen23
24 Apr 2023
"I have also tried [~, ind] = ismembertol(M, N, tol) which gives me logical 1 for every pair which does not make sense."
If you want to compare rows, you need to specify the ByRows option, e.g. ismembertol(M, N, tol, 'ByRows', true).
Also note that by default the input tol is scaled to the data magnitude: set DataScale to 1 if you want tol to act as an absolute tolerance.
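As a rough illustration of the two options mentioned above, here is a minimal sketch on made-up points (the values of M and N below are hypothetical, not the asker's data). It also uses 'OutputAllIndices' to collect every match rather than just one. Note that with ByRows, ismembertol compares each column separately, so a match means "inside a box of half-width tol", not "within Euclidean distance tol"; the loop at the end is one possible way to filter the candidates down to true distance matches.

```matlab
% Small illustrative data (hypothetical values, not from the question)
M = [0 0; 1 1; 5 5];
N = [0.3 0.3; 1.4 1.0; 9 9];
tol = 0.5;

% Row-wise comparison with an absolute tolerance (DataScale = 1).
% OutputAllIndices makes loc a cell array: loc{m} lists the rows of N
% that match row m of M.
[tf, loc] = ismembertol(M, N, tol, 'ByRows', true, ...
    'DataScale', 1, 'OutputAllIndices', true);

% ismembertol tests each column on its own, so keep only the candidates
% that are also within tol in Euclidean distance.
for m = find(tf).'
    d = sqrt(sum((N(loc{m}, :) - M(m, :)).^2, 2));
    loc{m} = loc{m}(d <= tol);
end
```

After the filter, a cell in loc may be empty even though tf is true for that row, so tf should be recomputed from loc if it is needed afterwards.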
Accepted Answer
Chris
24 Apr 2023
Edited: Chris
24 Apr 2023
This should be a little bit quicker (my computer indicates ten hours).
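tol = 0.5;
M = rand(1113486,2);   % random stand-ins with the same sizes as the real data
N = rand(1960000,2);
inds = cell(size(N,1),1);
for idx = 1:size(N,1)
    % distances from every row of M to the current row of N
    isClose = pdist2(M,N(idx,:)) <= tol;
    inds{idx} = find(isClose);   % rows of M within tol of N(idx,:)
end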
This would be a good candidate for GPU operations, if you have one.
if canUseGPU
    tol = 0.5;
    M = gpuArray(M);   % move both point sets to GPU memory
    N = gpuArray(N);
    inds = cell(size(N,1),1);
    for idx = 1:size(N,1)
        isClose = pdist2(M,N(idx,:)) <= tol;
        inds{idx} = gather(find(isClose));   % copy the indices back to host memory
    end
end
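A possible variation on the loops above, shown here only as a sketch: process several rows of N per pdist2 call instead of one. The total number of distance evaluations is unchanged, but there are far fewer loop iterations and function calls. The array sizes, the smaller tol, and blockSize below are all hypothetical stand-ins chosen so the example runs quickly; blockSize in particular would need tuning to the available memory, since each block creates a size(M,1)-by-blockSize matrix.

```matlab
tol = 0.05;                % illustrative tolerance for this small demo
M = rand(200, 2);          % small stand-ins for the real arrays
N = rand(300, 2);
blockSize = 100;           % hypothetical value; tune to available memory

nBlocks = ceil(size(N,1) / blockSize);
pairs = cell(nBlocks, 1);
for b = 1:nBlocks
    % rows of N handled in this block, as a column vector
    rowsN = ((b-1)*blockSize + 1 : min(b*blockSize, size(N,1))).';
    D = pdist2(M, N(rowsN, :));      % size(M,1)-by-numel(rowsN) distances
    [iM, iBlock] = find(D <= tol);   % subscripts of all hits in this block
    pairs{b} = [iM, rowsN(iBlock)];  % [row of M, row of N] for each hit
end
pairs = vertcat(pairs{:});           % one row per matching pair
```

Storing matches as a two-column list of pairs rather than one cell per row of N is a design choice of this sketch, not part of the original answer; it keeps the output in a single numeric array.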
If your tolerance is loose relative to the density of your points, that is, if many distances are <= tol, you may run into memory issues. In that case, inds should be a tall array.
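A different technique from the pdist2 loops above, offered as a sketch rather than a tested replacement: rangesearch (Statistics and Machine Learning Toolbox) returns, for each row of the second input, the rows of the first input within a given radius. For low-dimensional Euclidean data it can use a k-d tree instead of evaluating every pair. The points below are made up for illustration. Counting the hits per row also gives a rough idea of how much memory the index lists need, which bears on the memory concern above.

```matlab
tol = 0.5;
M = [0 0; 1 1; 5 5];             % illustrative points, not the real data
N = [0.3 0.3; 1.4 1.0; 9 9];

% idx{k} lists the rows of M within tol of N(k,:), nearest first;
% dist{k} holds the corresponding Euclidean distances.
[idx, dist] = rangesearch(M, N, tol);

% Number of matches per row of N, and a rough lower bound on the bytes
% needed to hold all indices as doubles (cell overhead not included).
nHits = cellfun(@numel, idx);
approxBytes = 8 * sum(nHits);
```

Running this on a small random sample of the real data first would indicate whether the full result is likely to fit in memory.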