Fast way to perform multiple searches on a large array

Question


I have a large time series array (10,000,000 elements) :

ts = [2; 1; 3; 4; 6; 7; .......]

I have a corresponding time array (same size as the above) :

times = [d1; d2; d3; d4; d5.......]

I have 2 arrays of start times and end times (also large ~ 30000 elements):

st = [dd1 dd2 dd3 ....]
en = [de1 de2 de3 ....]

I need to create a new matrix with many many finds. Logic is :

results = NaN(300, numel(st));
for i=1:numel(st);
  temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
  results(:,i) = temp;
end;

Is there any way I can do this faster (ideally without a loop)?

I have a 64 bit version so I can try a large in-memory solution.

Many thanks in advance, Nigel

8 comments (6 older comments not shown)

Daniel Shub on 4 Oct 2011

Just to confirm: times, st, and en are all sorted?

Nigel on 4 Oct 2011

Yes, they are sorted by st, and en(i) - st(i) = 300 seconds.


Answer 1

Daniel Shub on 4 Oct 2011

I think that by discarding the times that have already passed you might be able to speed up the find. If st(i+1) > en(i), you could discard even more elements, but I think the savings would be small. This code relies on times, st, and en being sorted.

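results = NaN(300, numel(st));
offset = 0;   % number of elements already removed from the front of times
for i = 1:numel(st)
  idx = find(times > st(i), 1, 'first');   % index into the shortened times
  offset = offset + idx - 1;
  times = times(idx:end);                  % drop everything before the match
  % 300 samples starting at the match (assumes at least 300 samples remain)
  results(:,i) = ts(offset + (1:300));
end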

1 comment

Nigel on 10 Oct 2011

Hi Daniel,

I used a modified version of your solution. Indeed it is a LOT quicker to search over smaller sized arrays.

Thank you all for your help.

N.


Answer 2

Jan on 4 Oct 2011

Never let an array grow in each iteration! Pre-allocate the output:

results = NaN(300, numel(st));
for i = 1:numel(st)   % Not size(st), which is a vector!
  temp = ts(find(times > st(i) & times < en(i), 300, 'first'));
  % Fewer than 300 matches leave the rest of the column as NaN
  results(1:length(temp), i) = temp;
end
results = results(~isnan(results));   % Note: this returns a column vector without the NaNs

If st and times are sorted, comparing all values wastes a lot of time. Vectorizing this directly, however, would need a very large matrix, so I assume it would be slower than the loop.

Can you solve the problem by using HISTC?
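One way HISTC could be applied here, as a sketch rather than a solution tested on the full data set: because times is sorted, it can serve as the bin edges, and the bin index returned for each st(i) points at the last sample with times <= st(i). The toy data and the names bin, first, and hasData are assumptions made for illustration.

```matlab
% Toy data standing in for the real arrays (assumed values, sorted times).
times = [10; 20; 30; 40; 50; 60];
ts    = [ 2;  1;  3;  4;  6;  7];
st    = [5 20 35 60];

% Use times as bin edges. Appending Inf keeps values at or beyond
% times(end) inside the last bin instead of being reported as bin 0.
[~, bin] = histc(st, [times(:); Inf]);

% bin(i) is the last index with times <= st(i) (0 if there is none),
% so the first index with times > st(i) is one further along.
first = bin + 1;

% first(i) > numel(times) means no sample lies after st(i).
hasData = first <= numel(times);
```

This replaces the 30,000 separate find calls for the start index with a single call; the samples themselves still have to be gathered afterwards, for example with ts(first(i) + (0:299)) inside the loop.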

6 comments (4 older comments not shown)

Daniel Shub on 4 Oct 2011

And since times and st are sorted, the indices are simply:

(0:299) + find(times > st(i), 1, 'first')

Nigel on 4 Oct 2011

Wow, removing the < en(i) test nearly halved the processing time!


Answer 3

Nigel on 4 Oct 2011

Certainly taking away the < en(i) helped. I'm a little hesitant to implement the dumping the past times part because I need the data for something a little later on.

Just for my own learning, I would really like to know how I could vectorise this operation so that I don't need a loop.

Thank you all once again for taking the time to look at and respond to my question.

N.
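A possible loop-free version, offered as a minimal sketch under stated assumptions rather than a benchmarked answer: it assumes times is sorted, finds every start index with one histc call, and then builds a matrix of indices with bsxfun. The toy data, the window length w (300 in the question), and the names idx, inRange, tsel, and valid are made up for illustration.

```matlab
% Toy data (assumed values); the question uses w = 300.
times = [10; 20; 30; 40; 50; 60];
ts    = [ 2;  1;  3;  4;  6;  7];
st    = [5 20 35 60];
en    = st + 25;                   % fixed-length windows, as in the question
w     = 3;                         % samples to keep per window
n     = numel(times);

% First index with times > st(i), for every i at once (times must be sorted).
[~, bin] = histc(st, [times(:); Inf]);
first = bin + 1;

% w-by-numel(st) matrix of candidate indices: column i is first(i) + (0:w-1).
idx = bsxfun(@plus, first(:).', (0:w-1).');

% Keep entries that stay inside the array and before the window end.
inRange = idx <= n;
idx(~inRange) = n;                 % clamp so the lookup below is legal
tsel = reshape(times(idx), size(idx));
valid = inRange & bsxfun(@lt, tsel, en(:).');

% Gather the samples; slots without a matching sample stay NaN.
results = NaN(w, numel(st));
results(valid) = ts(idx(valid));
```

With w = 300 and 30,000 windows, each of these matrices has 9 million entries (about 72 MB for a double matrix), so this trades memory for the loop. Whether it is actually faster than the shortened-search loop would need to be timed on the real data.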

2 comments

Bjorn Gustavsson on 10 Oct 2011

Well then, at least do the consecutive 'find's on shortened sections of times (with 'offset' as in Daniel's example, i.e. the number of elements already skipped):

idx = find(times(offset+1:end) > st(i), 1, 'first');

Then you'd get the benefit of increasingly shorter arrays to search over, but without losing the data.
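Put together as a full loop, the idea might look like the sketch below. It follows the convention of Daniel's example, where offset counts the elements already skipped, so the search starts at offset+1. The toy data and the window length w (300 in the question) are assumptions, and the < en(i) test is left out, as discussed earlier in the thread.

```matlab
% Toy data (assumed values); the question uses w = 300.
times = [10; 20; 30; 40; 50; 60];
ts    = [ 2;  1;  3;  4;  6;  7];
st    = [5 20 35 60];
w     = 3;
n     = numel(times);

results = NaN(w, numel(st));
offset = 0;                        % number of leading samples already skipped
for i = 1:numel(st)
  % Search only the part of times not yet ruled out; times itself is untouched.
  k = find(times(offset+1:end) > st(i), 1, 'first');
  if isempty(k)
    break                          % st is sorted, so later windows are empty too
  end
  offset = offset + k - 1;         % times(offset+1) is the first sample after st(i)
  last = min(offset + w, n);       % do not read past the end of ts
  results(1:last-offset, i) = ts(offset+1:last);
end
```

Note that indexing times(offset+1:end) may still create a temporary copy of the tail on each pass, so the saving comes from searching fewer elements, not from avoiding the copy.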

Daniel Shub on 10 Oct 2011

I wonder if this would be faster. I would hope MATLAB is smart enough not to have to reallocate memory for my method. Yours is probably a little safer. I was also thinking that working from the end backwards might ultimately be the fastest.

