Method to Correlate Time Series Arrays of Differing Lengths

James on 9 Apr 2019
Edited: David Wilson on 9 Apr 2019
I am attempting to correlate the time series from 4 separate tilt monitors that sample every 5 minutes. The time series have slightly different start and end times, and the resulting arrays are slightly different lengths, though they span almost the same period of time (differing by ~3 mins). My goal is to correlate each of these time series with a single "wind speed" time series that covers the same period and is also sampled every 5 minutes, but has a slightly different array length, start time, and end time.
The different array lengths in the tilt measurements are due to instrument error. There are some times within each of the arrays where the instrument missed a measurement and so the sample interval is 10 minutes.
My array sizes look something like this:
Tilt_a = 6236x2
Tilt_b = 6310x3
Tilt_c = 6304x2
Tilt_d = 6309x2
wind_Speed = 6383x2
I imagine that I will need to resample the data using something like interp1, but I do not know how to reconcile the start and end times. Is there a method that comes to mind for handling a situation such as this one? Or a function that allows correlating arrays of differing lengths?

Accepted Answer

David Wilson on 9 Apr 2019
One idea would be to:
  1. Start by finding a common time span across all your data: the latest start time and the earliest finish time among your 5 data sets. (I'm assuming the 1st column in each is the time.)
  2. Then define a regularly sampled time vector as something like
Ts = 5; % sample time
ti = [tstart: Ts: tfinish]; % interpolated time
  3. Now interpolate each of the five data sets onto ti:
ta = Tilt_a(:,1); Ta = Tilt_a(:,2); % I assume this is the correct order.
Tilt_ai = interp1(ta, Ta, ti); % semicolon suppresses printing the whole vector
and so on.
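Now all your data is aligned on a common time base. But be careful: this involves interpolation, which rests on some unstated assumptions. For example, interp1 is linear by default, so it assumes the signal varies roughly linearly between samples, including across the 10-minute gaps.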
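The three steps above can be sketched for several arrays at once. This is only a minimal sketch, assuming column 1 of every array is time in minutes and column 2 is the measurement. The arrays below are synthetic stand-ins for Tilt_a, Tilt_b and wind_Speed, and the names mk, series, Yi and R are illustrative, not from the original post.

```matlab
% Synthetic stand-ins for the real arrays (assumption: col 1 = time in
% minutes, col 2 = measurement). Replace with Tilt_a, ..., wind_Speed.
Ts = 5;                                        % nominal sample time, minutes
mk = @(t, phase) [t(:), sin(t(:)/100 + phase)];
Tilt_a = mk(0:Ts:1000, 0);
Tilt_b = mk(2:Ts:1002, 0.3);
Tilt_a(50,:) = [];                             % simulate one missed sample (10 min gap)
wind_Speed = mk(-3:Ts:997, 0.1);

series = {Tilt_a, Tilt_b, wind_Speed};         % last entry is the wind series

% Step 1: common span = latest start time to earliest finish time
tstart  = max(cellfun(@(s) s(1,1),   series));
tfinish = min(cellfun(@(s) s(end,1), series));

% Step 2: regularly sampled time vector
ti = (tstart:Ts:tfinish)';

% Step 3: interpolate every series onto ti (interp1 is linear by default)
Yi = zeros(numel(ti), numel(series));
for k = 1:numel(series)
    Yi(:,k) = interp1(series{k}(:,1), series{k}(:,2), ti);
end

% Correlation matrix; the last row/column holds each tilt series vs. wind
R = corrcoef(Yi);
```

Because ti stays inside the span covered by every series, interp1 never has to extrapolate, and the missed sample in Tilt_a is simply bridged by a straight line between its neighbours.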

More Answers (1)

David Wilson on 9 Apr 2019
Edited: David Wilson on 9 Apr 2019
OK, I'll generate some fake data that is not aligned, and then apply the above approach.
%% Generate some data series that are slightly offset in time
Ts = 5; % 5 min sample time
ta = [0:5:1000]'; ya = sin(ta/100);
tb = [2.7:5:1055]'; yb = sin(tb/100 + 0.3);
tc = [-25:5:998]'; yc = sin(tc/100 - 5);
plot(ta, ya,'.', tb, yb, 'x', tc, yc, 's')
You can see that the 3 time series are not aligned, and have (slightly) different start and end times.
% Now find latest common starting time
t0 = max([ta(1), tb(1), tc(1)]);
tfinal = min([ta(end), tb(end), tc(end)]);
ti = [t0:Ts:tfinal]';
In this case, I retain the sample time of 5 mins, but start at the latest start time (t = 2.7) and finish no later than the earliest finish time (t = 995). This ensures we don't run past the end of any data set, although we could instead use the 'extrap' option in interp1.
Now we interpolate each of the 3 series with interp1.
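For reference, here is a minimal sketch of what happens outside the sampled range, using made-up data. By default interp1 returns NaN for query points beyond the data, while the 'extrap' flag extends the chosen method past the ends.

```matlab
t  = (0:5:20)';           % sample times
y  = 2*t + 1;             % a straight line, so linear extrapolation is exact here
tq = [-5; 10; 25];        % first and last query points lie outside [0, 20]

y_default = interp1(t, y, tq);                      % NaN outside the range
y_extrap  = interp1(t, y, tq, 'linear', 'extrap');  % extends the end segments
```

Extrapolated values are guesses about data that was never measured, so trimming to the common span is usually the safer choice when the goal is a correlation.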
yai = interp1(ta,ya,ti);
ybi = interp1(tb,yb,ti);
yci = interp1(tc,yc,ti);
plot(ti,[yai, ybi, yci])
cov([yai,ybi, yci])
You will note that the covariance calculation now works.
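Since the original question asks for correlation rather than covariance, corrcoef can be applied to the same aligned columns. The following is a minimal sketch that regenerates the fake data above so it runs on its own. The 'Rows','complete' option is included only as a guard, on the assumption that real data might leave some NaN values after interpolation.

```matlab
Ts = 5;                                        % 5 min sample time
ta = (0:5:1000)';    ya = sin(ta/100);
tb = (2.7:5:1055)';  yb = sin(tb/100 + 0.3);
tc = (-25:5:998)';   yc = sin(tc/100 - 5);

% Common time base: latest start to earliest finish
t0     = max([ta(1), tb(1), tc(1)]);
tfinal = min([ta(end), tb(end), tc(end)]);
ti     = (t0:Ts:tfinal)';

% Aligned series, one per column
Y = [interp1(ta,ya,ti), interp1(tb,yb,ti), interp1(tc,yc,ti)];

% Correlation coefficients in [-1, 1]; rows containing NaN are skipped
R = corrcoef(Y, 'Rows', 'complete');
```

R(i,j) is the correlation between columns i and j, so with a wind-speed column appended to Y, its row of R would give the correlation of each tilt series with the wind.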
