How can I efficiently down-sample data that is non-uniformly spaced?

Question

Martin Hoecker, 17 Sep 2012

https://kr.mathworks.com/matlabcentral/answers/48386-how-can-i-efficiently-down-sample-data-that-is-non-uniformly-spaced

Commented: Paul Quelet, 15 Oct 2014

I am working with long time-series. My data acquisition system takes one point per second for months - except when it hangs for a few minutes or hours, then there will be no data for this time interval.

I want to down-sample this data to, say, one point per 60 seconds. However, because there can be "gaps" in the data, I am struggling to find efficient code for this!

I tried the following approach for a typical array (called "data") that holds 0.5 million rows and two columns: the first column is the time, the second column is the actual data.

start_time = data(1,1);
end_time = data(end,1);
time_step = 60/(3600*24);   % 60 seconds, expressed in days
total_time = end_time - start_time;
% Prepare the downsampled data array.
downsampled_data = zeros(floor(total_time/time_step),2);
tic
for i = 1:size(downsampled_data,1)
  % For each time interval, find all points in this interval and
  % average over them. Only the time column is compared.
  downsampled_data(i,1) = start_time + (i-0.5)*time_step;
  downsampled_data(i,2) = mean(data(data(:,1) > start_time + (i-1)*time_step &...
       data(:,1) < start_time + i*time_step,2));
end
toc

As a final step I would have to fish out those points where there is no data... However, the above code takes about 60 seconds to run for 0.5 million points, and I need it to run in less than 10 minutes for an array of 5 million points. Can you guys think of a way of speeding it up?


Paul Quelet, 15 Oct 2014


Thank you for posting this algorithm.

The code is very helpful for something that I am doing. However, when I used it on my dataset, some values that fall exactly on the hour (00:00 for the minutes and seconds) came out as NaN. I would therefore revise the above algorithm to include the "equal to" condition, in my case on the lower bound of each interval:

    mean(data(data(:,1) >= start_time + (i-1)*time_step &...
       data(:,1) < start_time + i*time_step,2));

Many thanks again for the algorithm, Martin, and for the fast solution, Andrei.
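The same half-open interval rule (lower bound included, upper bound excluded) can also be applied without a loop by computing a bin index with floor, so that every sample lands in exactly one bin. The following is only a minimal sketch under some assumptions: the time column is in days as in the question, timestamps are meaningful to about a millisecond, and the names elapsed, bin_idx, n_bins, t_mid, keep and the small synthetic data array are illustrative, not part of the original posts.

```matlab
% Synthetic example: one sample every 20 s, with the third minute missing.
secs = [0:20:100, 180:20:220]';
data = [735129 + secs/86400, secs];      % column 1: time in days, column 2: value

% Elapsed time in seconds, rounded to the millisecond to absorb round-off
% in the day-valued timestamps.
elapsed = round((data(:,1) - data(1,1)) * 86400 * 1000) / 1000;

bin_width = 60;                              % seconds per bin
bin_idx = floor(elapsed / bin_width) + 1;    % a sample at exactly 60 s goes to bin 2
n_bins = max(bin_idx);

% Mean per bin; bins that received no samples are filled with NaN.
avg = accumarray(bin_idx, data(:,2), [n_bins 1], @mean, NaN);

% Bin centre times in days, then drop the empty bins.
t_mid = data(1,1) + ((1:n_bins)' - 0.5) * bin_width / 86400;
keep = ~isnan(avg);
downsampled_data = [t_mid(keep), avg(keep)];
```

Because empty bins are marked with NaN by the fill value, removing the gaps becomes a single logical-indexing step. Note that this sketch would also drop a bin whose data values are themselves NaN.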


Answer 1

Andrei Bobrov, 17 Sep 2012


https://kr.mathworks.com/matlabcentral/answers/48386-how-can-i-efficiently-down-sample-data-that-is-non-uniformly-spaced#answer_59109


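% Split each timestamp into calendar components (seconds are discarded).
[Y, M, D, H, MN] = datevec(data(:,1));
% c maps every sample to its distinct (year, month, day, hour, minute) group.
[~, ~, c] = unique([Y, M, D, H, MN],'rows');
% Average the values in each one-minute group.
out = accumarray(c,data(:,2),[],@mean);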
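To spell out why this works: datevec splits each timestamp into calendar components, unique with 'rows' gives every distinct minute an integer group index, and accumarray averages the values that share an index. The snippet above returns only the averaged values, so the sketch below also recovers one timestamp per minute. It assumes the first column of data holds datenum values (days); the names minutes_present, avg, t_out and the synthetic data array are illustrative and not part of the original answer.

```matlab
% Synthetic example: 1 Hz samples for three minutes, with the second minute missing.
t0 = datenum(2012, 9, 17, 12, 0, 0);
secs = [0:59, 120:179]' + 0.5;           % seconds since t0, offset from minute boundaries
data = [t0 + secs/86400, secs];          % column 1: time in days, column 2: value

% Split timestamps into calendar components; the seconds are not requested.
[Y, M, D, H, MN] = datevec(data(:,1));

% One row per distinct minute, plus a group index for every sample.
[minutes_present, ~, c] = unique([Y, M, D, H, MN], 'rows');

% Average the samples in each group.
avg = accumarray(c, data(:,2), [], @mean);

% Timestamp at the middle of each minute (30 s after its start).
n_minutes = size(minutes_present, 1);
t_out = datenum([minutes_present, 30*ones(n_minutes, 1)]);

out = [t_out, avg];
```

Minutes that contain no samples never appear in minutes_present, so the result has no rows for the gaps and no separate clean-up step is needed. The bins here are aligned to calendar minutes rather than to the first sample.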


Martin Hoecker, 17 Sep 2012

Genius, thank you! Computation time for 0.5 million points improved from 60 seconds to 1.2 seconds! One month worth of data (2.5 million points) takes about 10 seconds now. Thanks again for your help!
