Implicit loop or break for large data set

L N on 11 Mar 2019
Edited: Guillaume on 12 Mar 2019
Hi,
I am not very experienced with MATLAB and am trying to solve the following problem. I have two large data sets, each a column vector of increasing time stamps. I want to find the indices of the entries in vector 1 that occur within x seconds after an entry in vector 2, and the indices of the corresponding entries in vector 2 as well.
My approach: I have column vectors A and B, each with more than 2,000,000 data points, although they are not the same size. Say I want the elements of B that occur within 1 s after an element of A. So I have:
Atol = A + 1; % largest value a matching element of B can have; the smallest is A itself
Then:
for i = 1:L1 % L1 is the length of A
    indx1 = (B >= A(i) & B <= Atol(i));
end
Then, to find the matching elements of A, I would do the same the other way around:
for i = 1:L2 % L2 is the length of B, and Btol = B - 1
    indx2 = (A <= B(i) & A >= Btol(i));
end
However, this would take forever to run because each iteration compares one element against the whole of the other vector. Is there a way to run this faster? I was thinking of using a break so that, once I find a match, I stop and start the next search from that point, since the match for the second element cannot come before the match for the first element, and so on. I have tried to implement this but it does not work. Alternatively, I have read that implicit loops are faster, but I cannot get that working either.
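The "continue from where the previous search stopped" idea described above can be sketched as a single pass over both vectors. This is only an illustration under stated assumptions: both A and B are sorted in increasing order, the tiny test vectors and the names x, k, hasPrev, Bindices and Aindices are made up for the example, and it has not been timed on vectors of 2e6 elements.

```matlab
A = [1; 5; 10];                   % illustrative data, sorted increasing
B = [0.2; 1.5; 3; 5.2; 10; 12];   % illustrative data, sorted increasing
x = 1;                            % window length in seconds

Aidx = zeros(size(B));  % Aidx(j) = index of the last A element <= B(j), 0 if none
k = 0;                  % position reached in A so far; it never moves backwards
for j = 1:numel(B)
    % advance k while the next A element is still at or before B(j)
    while k < numel(A) && A(k+1) <= B(j)
        k = k + 1;
    end
    Aidx(j) = k;
end

hasPrev = Aidx > 0;               % B elements that have an A element at or before them
tokeep = false(size(B));
tokeep(hasPrev) = B(hasPrev) - A(Aidx(hasPrev)) <= x;

Bindices = find(tokeep);          % [2; 4; 5]
Aindices = Aidx(tokeep);          % [1; 2; 3]
```

Because k only ever increases, the inner while loop runs at most numel(A) times in total, so the work grows with numel(A) + numel(B) rather than with their product.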

Accepted Answer

Guillaume on 11 Mar 2019
Edited: Guillaume on 11 Mar 2019
If I understood correctly:
Aidx = discretize(B, [A; Inf]); %for each element of B, index of the largest A element that is less than or equal to it
Bdist = B - A(Aidx); %time elapsed between that A element and the B element
tokeep = Bdist <= 1; %true for the elements of B that are at most one second after an element of A
B1sAfterA = B(tokeep); %keep the elements of B that are at most one second after an element of A
Acorresponding = A(Aidx(tokeep)); %corresponding A elements
I have no idea how fast discretize will run on two vectors of more than 2e6 elements.
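To make the binning step concrete, here is the same code run on a tiny made-up data set (the values of A and B are illustrative only, and every B is at least A(1)). The edges [A; Inf] define the bins [1,5), [5,10) and [10,Inf], so each B lands in the bin whose left edge is the nearest A at or before it.

```matlab
A = [1; 5; 10];              % illustrative data, sorted increasing
B = [1.5; 3; 5.2; 10; 12];   % illustrative data, all >= A(1)

Aidx = discretize(B, [A; Inf]);     % [1; 1; 2; 3; 3]
Bdist = B - A(Aidx);                % approximately [0.5; 2; 0.2; 0; 2]
tokeep = Bdist <= 1;                % [true; false; true; true; false]

B1sAfterA = B(tokeep);              % [1.5; 5.2; 10]
Acorresponding = A(Aidx(tokeep));   % [1; 5; 10]
```

Only the nearest preceding A needs checking for each B: if that one is more than a second earlier, every earlier element of A is further away still.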
2 Comments
L N on 11 Mar 2019
Edited: L N on 12 Mar 2019
Hi, thanks for the help. This works, but only if the first element of B is not smaller than the first element of A; otherwise I get NaN. I have not tried it with my large data set yet.
I removed the values of B smaller than the first element of A and it works beautifully with my data. Thank you!!
Guillaume on 12 Mar 2019
Edited: Guillaume on 12 Mar 2019
Yes, sorry, I assumed that all the elements of B were greater than the minimum of A.
I also forgot to say that the code requires A to be sorted in monotonically increasing order. B does not have to be sorted at all.
If you don't want to remove the small elements of B, you could modify the code as follows:
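Aidx = discretize(B, [A; Inf]); %NaN for the elements of B that are smaller than A(1)
nottoosmall = ~isnan(Aidx); %true for the elements of B that fall in a bin
Bdist = B(nottoosmall) - A(Aidx(nottoosmall)); %time elapsed since the preceding A element
tokeep = false(size(Aidx)); %elements of B before A(1) stay false
tokeep(nottoosmall) = Bdist <= 1; %true for the elements of B at most one second after an element of A
B1sAfterA = B(tokeep);
Acorresponding = A(Aidx(tokeep)); %corresponding A elements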
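Since the original question asked for indices and for a general window of x seconds, the modified code can be adapted along these lines. The data, the variable x and the names Bindices and Aindices are assumptions made for the example and are not part of the code above.

```matlab
A = [1; 5; 10];           % illustrative data, sorted increasing
B = [0.2; 1.5; 3; 5.2];   % 0.2 comes before A(1)
x = 1;                    % window length in seconds (illustrative)

Aidx = discretize(B, [A; Inf]);   % [NaN; 1; 1; 2]
nottoosmall = ~isnan(Aidx);
Bdist = B(nottoosmall) - A(Aidx(nottoosmall));
tokeep = false(size(Aidx));
tokeep(nottoosmall) = Bdist <= x;

Bindices = find(tokeep);   % positions in B: [2; 4]
Aindices = Aidx(tokeep);   % matching positions in A: [1; 2]
```

B(Bindices) and A(Aindices) then give the same values as B1sAfterA and Acorresponding.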
