How to arrange time series data into yearly groups?
조회 수: 4 (최근 30일)
이전 댓글 표시
Hi,
I have daily temperature time series from 01/03/1961 to 28/07/2018. I want to arrange this data as per water year convention (01 June to 31 May) in following form:
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Temperature Temperature Temperature
Date (1960-1961) Date (1961-1962) . . . . . . . . . . . (2017-2018)
01/06/1960 01/06/1961
. .
. .
. .
. .
. .
31/05/1961 31/05/1962
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Finally, I need this data table in following form
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
= = = = = = = = = = = = Temperature = = = = = = = = = = = =
Day 1960-1961 1961-1962 . . . . . . . 2017-2018
1
2
3
.
.
.
.
.
.
365
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Please tell me how to do this.
Thanks in advance.
댓글 수: 2
채택된 답변
Guillaume
2019년 11월 18일
편집: Guillaume
2019년 11월 20일
A simpler and faster way of obtaining the same as Joe's answer:
dataInitial = readtimetable('Data.xlsx');
dataFinal = table(day(dataInitial.DateTime, 'dayofyear'), year(dataInitial.DateTime), TemperatureDifference, ...
'VariableNames', {'Day', 'Year', 'TemperatureDifference'});
uyears = unique(dataFinal.Year);
dataFinal = unstack(dataFinal, 'TemperatureDifference', 'Year', 'NewVariableNames', compose('%d-%d', uyears, uyears+1))
Since tables and matrices can't have a different number of rows per column, empty entries are automatically filled by NaN.
edit: and for versions < R2019b which don't support arbitrary variable names for table variables:
edit: and for versions < R2019a which don't have readtimetable:
dataInitial = table2timetable(readtable('Data.xlsx'));
dataFinal = table(day(dataInitial.DateTime, 'dayofyear'), year(dataInitial.DateTime), TemperatureDifference, ...
'VariableNames', {'Day', 'Year', 'TemperatureDifference'});
uyears = unique(dataFinal.Year);
dataFinal = unstack(dataFinal, 'TemperatureDifference', 'Year', 'NewVariableNames', compose('Y%d_%d', uyears, uyears+1))
Again, this produces the same result as Joe's code (only faster and with a lot less work!)
추가 답변 (1개)
Joe Vinciguerra
2019년 11월 17일
편집: Joe Vinciguerra
2019년 11월 18일
Start by importing your data with either readtable or readtimetable (timetable is more powerful when working with dates, but can be more complicated).
The simplest next step is to create a second table formatted how you want your final data to look: first column is 1:366 (leap years), headers are the year labels. Header labels can be populated using something like
for i=1:57
h(i) = join([string(i+1959),'-',string(i+1960)])
end
Assign h to your table variable names.
Then use a for loop to populate the final table.
Here's one way to do it:
%% The following script assumes dataset is complete, contiguous, and contains more than 2 years of data or you will get errors
% Import the data:
dataInitial = readtimetable("Data.xlsx"); % import the data into a timetable
% Setup time-related variables
% YearFirst = min(year(dataInitial.Datetime)); % the first year in the dataset
YearFirst = 1960; % use this, OR the line above instead
% YearLast = max(year(dataInitial.Datetime)); % the last year in the dataset
YearLast = 2018; % use this, OR the line above instead
dateStart = datetime([YearFirst 6 1]); % the first date in the dataset
dateEnd = datetime([YearLast 5 31]); % the last date in the dataset
oneYear = calendarDuration([1 0 0]); % calendar duration signifying 1 year
bins = dateStart:oneYear:dateEnd; % array containing the start date of each column for the final table
nbins = length(bins); % the number of data columns in the final table
YearRange = YearFirst:YearLast; % array of all years in the dataset
header = strings(1,nbins); % preallocate variables that will be created in a loop
for i = 1:nbins
header(i) = join([string(YearRange(i)),'-',string(YearRange(i+1))]); % generate an array of strings for the headers
end
% setup the output table
tableSize = [366, nbins+1]; % the size of the final table (+1 for column containing day numbers)
varTypes = cell(1, nbins+1); % create an empty cell array for the variable types
varTypes(1) = {'int8'}; % use 'int8' for the day number column
varTypes(2:end) = {'double'}; % use 'double' for all other data elements
dataFinal = table('Size',tableSize,'VariableTypes',varTypes,'VariableNames',['day',header]); % create an empty table to store the results
% arrange the imported dataset into the formatted table
dataFinal.day = (1:366)'; % insert the day number into the first column
for i = 1:length(bins) % for each column of data...
dateRange = timerange(bins(i),bins(i) + oneYear); % identify the date range of interest for this column
dataTemporary = dataInitial(dateRange,:); % extra step for clarity
dataFinal(1:height(dataTemporary),i+1) = dataTemporary; % insert the data extracted into the final table where it belongs
end
댓글 수: 11
Guillaume
2019년 11월 20일
"but final table only shows appropriate headers but in data cells all 'zero'."
I'm not entirely sure what this mean. Anyway, with which code do you get that?
"I get following error"
This would indicate that for some reason i is a timetable, not the loop index. Anyway, I've given you much simpler code (just 4 lines) which will do the same job.
An easier way to get the timetable in pre-R2019a is with:
dataInitial = table2timetable(readtable('Data.xlsx'));
참고 항목
카테고리
Help Center 및 File Exchange에서 Timetables에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!