Vectorize a loop to save time

Question

Filip on 3 Feb 2019

https://kr.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time

Commented: Walter Roberson on 4 Feb 2019

I have a big data set and my current code takes 2 hours. I am hoping to save time by vectorization if that is possible in my case.

I have a table Table with variables ID, t1, tend, p. My code is something like:

x=zeros(size(Table.ID,1));
for i=1:size(Table.ID,1)
x(i)=sum(Table.t1<Table.t1(i) & Table.tend>Table.tend(i) & abs(Table.p-Table.p(i))>1);
end

So for each observation, I want to count the observations that start before it, end after it, and have a p value in the neighborhood of 1. The loop takes 2 hours to run. Any suggestions?

Thanks in advance!

2 Comments

Walter Roberson on 4 Feb 2019

How are the t1 and tend values arranged? Are tend(i+1) = t1(i) such that together they partition into consecutive ranges that are completely filled between the first and last? Do they act to partition into non-overlapping ranges but with gaps? Are there overlapping regions? Are the boundaries already sorted?

Filip on 4 Feb 2019

There is no arrangement between t1 and tend values across observations. They might overlap for some observations, there might be gaps in time too.

All I know is that t1<tend for an observation.

The table is sorted by ID.


Answer 1

Jan on 4 Feb 2019

https://kr.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359428

Edited: Jan on 4 Feb 2019

2 hours sounds long. Is memory exhausted, so that virtual memory slows down the execution? How large is the input?

Is this a typo:

x = zeros(size(Table.ID,1))

It creates a square matrix, but you access it as a vector only.
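To make the difference concrete, here is a minimal sketch with a small, hypothetical size n = 4 (not the poster's data). It shows that zeros(n) allocates n-by-n elements, while a loop that writes x(i) with a single index only ever touches the first column:

```matlab
n = 4;                  % small hypothetical size, for illustration only
A = zeros(n);           % n-by-n matrix
v = zeros(n, 1);        % n-by-1 column vector

% Fill both the way the loop in the question does, with one index
for i = 1:n
  A(i) = i;             % linear indexing: writes into the first column only
  v(i) = i;
end

disp(size(A))           % size is [4 4]
disp(size(v))           % size is [4 1]
disp(nnz(A(:, 2:end)))  % 0: the remaining columns are never used
```

For a large table the unused columns are pure memory overhead, which may be one reason the original code is slow.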

Does accessing the table take a significant amount of time? Copying the columns into plain vectors once, before the loop, avoids repeated table indexing:

n    = size(Table.ID,1);
t1   = Table.t1;
tend = Table.tend;
p    = Table.p;
x    = zeros(n, 1);
for i = 1:n
  x(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

If you sort one of the vectors, you could save some time:

[t1s, index] = sort(t1);
tends        = tend(index);
ps           = p(index);
for i = 2:n
  m    = t1s < t1s(i);
  x(i) = sum(tends(m) > tends(i) & ...
             abs(ps(m) - ps(i)) > 1);
end

Afterwards, x has to be restored to the original order by inverting the sort. If you provide some inputs, I could check the code before posting. I'm tired, so perhaps I've overlooked an obvious indexing error.
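The step of putting x back into the original row order can be sketched as follows. This is only an illustration on synthetic data: the random vectors, the size n = 500, and the names xs and xRef are assumptions, not part of the original post. The counts are computed in sorted order, scattered back with the sort index, and compared against the plain loop:

```matlab
% Synthetic data (hypothetical; stands in for Table.t1, Table.tend, Table.p)
rng(0);
n    = 500;
t1   = rand(n, 1) * 100;
tend = t1 + rand(n, 1) * 50;     % guarantees t1 < tend
p    = rand(n, 1) * 10;

% Reference: the plain loop on the unsorted vectors
xRef = zeros(n, 1);
for i = 1:n
  xRef(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

% Sorted variant: xs(i) belongs to original row index(i)
[t1s, index] = sort(t1);
tends = tend(index);
ps    = p(index);
xs    = zeros(n, 1);
for i = 2:n
  m     = t1s < t1s(i);
  xs(i) = sum(tends(m) > tends(i) & abs(ps(m) - ps(i)) > 1);
end

% Invert the sort: scatter the counts back to the original row order
x        = zeros(n, 1);
x(index) = xs;

disp(isequal(x, xRef))           % both orderings should agree
```

The assignment x(index) = xs is the inverse permutation; writing x = xs(index) instead would apply the sort a second time rather than undo it.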

Is the shown code really the bottleneck of the original code?

1 Comment

Filip on 4 Feb 2019

I have more variables in the table and do more comparisons, but they are all similar. So, I wrote a sample here to give the idea.

x = zeros(size(Table.ID,1)) is obviously a typo.

I guess sorting t1 will work, and accessing the table might also be time consuming. I will update when I apply the changes, but this seems promising. Thanks!


Answer 2

Walter Roberson on 4 Feb 2019

https://kr.mathworks.com/matlabcentral/answers/442981-vectorize-a-loop-to-save-time#answer_359344

My mind is headed towards creating a pairwise mask matrix,

M = squareform(pdist(Table.p) > 1);    %important that Table.p is a column vector

That would be comparatively fast. If the table is very big then it could fill up memory, though.
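As a rough worked estimate of that memory cost, assuming a hypothetical table of n = 1e5 rows (the actual size is not given in the thread): pdist returns n*(n-1)/2 doubles at 8 bytes each, and the square logical mask needs n^2 bytes.

```matlab
n = 1e5;                          % hypothetical number of rows

bytesPdist  = n*(n-1)/2 * 8;      % pdist output: n*(n-1)/2 doubles, 8 bytes each
bytesSquare = n^2 * 1;            % n-by-n logical mask: 1 byte per element

fprintf('pdist vector: %.1f GB\n', bytesPdist  / 1e9);
fprintf('logical mask: %.1f GB\n', bytesSquare / 1e9);
```

Under this assumption the two arrays come to roughly 40 GB and 10 GB, so the mask approach is only practical for tables that are considerably smaller, or when processed in blocks.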

abs() is not needed for this; pdist will already have calculated distance as a non-negative number.

Now, inside the loop over i:

Mi = M(i,:);
x(i)=sum(Table.t1(Mi)<Table.t1(i) & Table.tend(Mi)>Table.tend(i));

However you should do timing tests against

Mi = M(:,i);    % column (M is symmetric), so & with the column Table.t1 stays elementwise
x(i)=sum(Mi & Table.t1<Table.t1(i) & Table.tend>Table.tend(i));

and

Mi = M(i,:);
Tt = Table(Mi,:);    % tables need both a row and a variable subscript
x(i)=sum(Tt.t1<Table.t1(i) & Tt.tend>Table.tend(i));
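A timing comparison of the first two variants could look like the sketch below. It runs on synthetic data with a hypothetical n = 2000, works on plain vectors rather than a table, and builds the mask with abs(p - p.') (R2016b and later), which for a single column should match squareform(pdist(p)) up to rounding. The timings depend on the machine and the data, so none are quoted here.

```matlab
rng(0);
n    = 2000;                     % hypothetical size, small enough for an n-by-n mask
t1   = rand(n, 1) * 100;
tend = t1 + rand(n, 1) * 50;
p    = rand(n, 1) * 10;
M    = abs(p - p.') > 1;         % pairwise mask, needs R2016b or later

% Variant A: index the vectors with the mask first
xA = zeros(n, 1);
tic
for i = 1:n
  Mi    = M(:, i);               % column of the symmetric mask
  xA(i) = sum(t1(Mi) < t1(i) & tend(Mi) > tend(i));
end
tA = toc;

% Variant B: fold the mask into the logical expression
xB = zeros(n, 1);
tic
for i = 1:n
  Mi    = M(:, i);
  xB(i) = sum(Mi & t1 < t1(i) & tend > tend(i));
end
tB = toc;

fprintf('indexing: %.3f s, masking: %.3f s, equal: %d\n', ...
        tA, tB, isequal(xA, xB));
```

For more careful measurements, timeit on a function wrapping each loop is less sensitive to one-off effects than a single tic/toc pair.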

2 Comments

Filip on 4 Feb 2019

Unfortunately, this answer does not exactly work. But inspired by your answer, I believe that creating a pairwise difference matrix with "bsxfun(@minus, T.t1, T.t1')" might work. I am not sure how much faster it is going to be or whether I will have memory issues. I will try it and update afterwards.

Walter Roberson on 4 Feb 2019

abs(T.t1 - T.t1.')

would work as a distance function for you in R2016b and later.
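Building on the same implicit expansion, the whole count can be written without an explicit per-row loop. The following is a sketch under assumptions, not the method settled on in this thread: it uses synthetic data, a hypothetical n = 1000, and a hypothetical block size blk to keep the temporary matrices bounded.

```matlab
rng(0);
n    = 1000;                     % hypothetical size
t1   = rand(n, 1) * 100;
tend = t1 + rand(n, 1) * 50;
p    = rand(n, 1) * 10;

% Reference loop, as in the question (with a column vector for x)
xRef = zeros(n, 1);
for i = 1:n
  xRef(i) = sum(t1 < t1(i) & tend > tend(i) & abs(p - p(i)) > 1);
end

% Fully vectorized: C(i,j) is true when observation j starts before i,
% ends after i, and differs from i in p by more than 1 (R2016b or later)
C = (t1.' < t1) & (tend.' > tend) & (abs(p.' - p) > 1);
x = sum(C, 2);

% Blockwise variant: same result, but only blk-by-n temporaries
blk = 256;                       % hypothetical block size
xb  = zeros(n, 1);
for s = 1:blk:n
  r     = s:min(s + blk - 1, n);
  Cb    = (t1.' < t1(r)) & (tend.' > tend(r)) & (abs(p.' - p(r)) > 1);
  xb(r) = sum(Cb, 2);
end

disp(isequal(x, xRef) && isequal(xb, xRef))
```

The fully vectorized form needs several n-by-n temporaries, so for a big table the blockwise variant is the safer choice; the work is still proportional to n^2 in both cases.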
