Comparing Timeseries to get similar Timeseries based on Euclidean Distance

Furqan Hashim on 26 Dec 2020
Commented: Furqan Hashim on 27 Dec 2020
I have timeseries data in an array which I want to compare in order to build clusters of similar time series.
Generate sample data using the following piece of code:
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
Here we have 7 time series, where each row represents a time series and each column represents a timestamp.
First I compute the Euclidean distance between every pair of rows in the data generated above. This can be done with
distance = squareform(pdist(timeseries));
From the distance matrix above, the unique distances can be found with the code below:
unique_distances = unique(distance);
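As a quick sanity check on the sample data (a minimal sketch, assuming the Statistics and Machine Learning Toolbox is installed so that `pdist` and `squareform` are available), the distance matrix should be 7-by-7 and should contain 8 distinct values, because rows 1 to 3 are identical and rows 4 to 7 differ only in the last column:

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];

% Pairwise Euclidean distances between rows, expanded to a full square matrix
distance = squareform(pdist(timeseries));

% unique() applied to a matrix returns the distinct values as a sorted column vector
unique_distances = unique(distance);

size(distance)            % 7-by-7: one row and one column per time series
numel(unique_distances)   % 8 distinct distances, including 0
distance(1, 4)            % 6, since sqrt(3^2 + 3^2 + 3^2 + 3^2) = 6
```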
I want to create an n-by-m matrix of counts, where n is the number of time series (7 here) and m is the number of unique distances (8 here).
Rows t1, t2, ... represent time series 1, 2, and so on; columns correspond to the unique distances in ascending order.
The entry in the first row and first column shows how many time series have zero distance to the first time series (counting the first time series itself), and so on.
The entry in the first row and second column shows how many time series have a distance of 1 to the first time series, and so on.
I am new to MATLAB. I obtained the desired result using the code below:
dist = nan(size(timeseries, 1), size(unique_distances, 1));
for i = 1:size(timeseries, 1)
    disp(i)
    for j = 1:size(unique_distances, 1)
        disp(j)
        % count the entries in row i equal to the j-th unique distance
        dist(i, j) = sum(distance(i, :) == unique_distances(j));
    end
end
I am looking for a vectorised version of the above code.
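One possible way to vectorise the double loop is sketched below; it is not the only option. The names `counts`, `counts2`, `label`, and `row` are illustrative and do not come from the original code. Option 1 relies on implicit expansion, which needs R2016b or later, and builds an intermediate n-by-n-by-m logical array, so it may be memory-hungry for many time series. Option 2 uses the third output of `unique` together with `accumarray` and avoids that intermediate array.

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));   % needs Statistics and Machine Learning Toolbox
unique_distances = unique(distance);

n = size(distance, 1);          % number of time series
m = numel(unique_distances);    % number of unique distances

% Option 1: implicit expansion (R2016b or later).
% Lay the unique values out along the third dimension, compare them with
% the whole distance matrix at once, then count matches along each row.
counts = squeeze(sum(distance == reshape(unique_distances, 1, 1, m), 2));

% Option 2: unique() labels every entry of distance(:) with the index of
% its unique value; accumarray then counts each (row, label) pair.
[~, ~, label] = unique(distance);   % label follows column-major order
row = repmat((1:n).', n, 1);        % row index of each entry in distance(:)
counts2 = accumarray([row, label], 1, [n, m]);

isequal(counts, counts2)            % both options should give the same matrix
```

Both variants compare floating-point values for exact equality, just as the loop does. That is safe here because every value in `unique_distances` is taken from `distance` itself.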
I also need to cluster around the time series that has zero distance to the largest number of other time series, so I need to sort the rows of the matrix by that count as well. In this example the matrix is already sorted: t1 has a distance of zero to 3 time series, as can be seen from the matrix, and 3 is also the maximum value.
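The sorting step could be sketched as follows, assuming the count matrix has already been built and that its first column holds the zero-distance counts (which is the case here because `unique` returns values in ascending order and distances are never negative). The names `counts`, `order`, and `sorted_counts` are illustrative.

```matlab
timeseries = [1, 2, 3, 4; 1, 2, 3, 4; 1, 2, 3, 4; 4, 5, 6, 7; 4, 5, 6, 8; 4, 5, 6, 9; 4, 5, 6, 10];
distance = squareform(pdist(timeseries));   % needs Statistics and Machine Learning Toolbox
unique_distances = unique(distance);

% Count matrix: rows are time series, columns are unique distances
m = numel(unique_distances);
counts = squeeze(sum(distance == reshape(unique_distances, 1, 1, m), 2));

% Sort rows by the zero-distance count (first column), largest first
[~, order] = sort(counts(:, 1), 'descend');
sorted_counts = counts(order, :);

% order(k) is the index of the time series that ends up in row k,
% so the same permutation can be applied to the data itself
sorted_timeseries = timeseries(order, :);
```

Keeping `order` around makes it possible to map each sorted row back to its original time series.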
2 Comments
Ameer Hamza on 26 Dec 2020
You mentioned, "I want to create a n (number of time series i.e 4) by m (number of unique distances i.e. 8)." But the matrix you create has seven rows. Are the time series arranged along the columns or the rows? I think you intend to take the transpose of the matrix before passing it to pdist():
distance = squareform(pdist(timeseries.'));
Furqan Hashim on 27 Dec 2020
You've correctly pointed out the mistake. I've edited my question, where I've rephrased
"Here we have 4 timeseries where each column represent a timeseries and each row represents the timestamp."
to
"Here we have 7 timeseries where each row represent a timeseries and each column represents the timestamp."
Now we do not need to take the transpose; for simplicity, we can treat each row, rather than each column, as a time series.

Answers (0)

Release

R2017b
