Comparing datetime value from two tables

조회 수: 11 (최근 30일)
Ziye
Ziye 2017년 7월 11일
댓글: Ziye 2017년 8월 3일
I have two tables a and b both containing time and price data. They are imported from csv files and therefore become two tables.
Both contain data of similar format but the datetime data in a is in minutes eg "11-07-17 09:00" and in b it's in dates eg "11-07-17".
I want to compare the price data given a date, something like "select * from b where b(:,1)=date(a(1,1))"
I'm extremely confused in trying to convert table into other format in order to compare them, I felt that there must be a very easy way to do it which i'm not aware of. someone pls help thanks so much!
  댓글 수: 1
Ziye
Ziye 2017년 7월 11일
Thanks! ya I should have attached the files. Pls refer to the attached file

댓글을 달려면 로그인하십시오.

채택된 답변

Guillaume
Guillaume 2017년 7월 11일
편집: Guillaume 2017년 7월 11일
A datetime always has an hour, minute, second component. The display format of the datetime array may be such that it is not shown, but it is always there. If you've read that array from a text file which didn't have a time component, then it defaults to 00:00:00.
You have several options to compare datetime arrays using just the date
  • You could set the time component in a to 00:00:00 and use standard comparison functions (e.g. ismember, or join)
  • You could extract the year, month and day from each with ymd and do array comparison (e.g. with ismember with the 'rows' option)
  • You could extract the date as strings (using char or datestr) and compare the strings.
Going with option 2, you could do:
ymda = zeros(height(a), 3);
[ymda(:, 1), ymda(:, 2), ymda(:, 3)] = ymd(a.Datetime);
ymdb = zeros(height(b), 3);
[ymdb(:, 1), ymdb(:, 2), ymdb(:, 3)] = ymd(b.Date);
queryresult = b(ismember(ymdb, ymda, 'rows'), :); %equivalent to select * from b where b.date = a.date
  댓글 수: 3
Guillaume
Guillaume 2017년 7월 11일
I wouldn't convert the tables to cell arrays. Tables are much easier to work with.
if you just use plain readtable it's not the month and day that is swapped but the day and year. That's because your year is encoded with only two digits. That confuses matlab so you need to tell it how to decode the date properly:
optionsa = setvaropts(detectImportOptions('a.csv'), 'Datetime', 'InputFormat', 'dd-MM-yy HH:mm');
a = readtable('a.csv', optionsa);
optionsb = setvaropts(detectImportOptions('b.csv'), 'Date', 'InputFormat', 'dd-MM-yy');
b = readtable('b.csv', optionsb);
%change datetime display format to something less ambiguous (optional)
a.Datetime.Format = 'dd MMM yyyy HH:mm';
b.Date.Format = 'dd MMM yyyy';
With that my code above works fine.
Ziye
Ziye 2017년 7월 12일
Thanks Guillaume. It's a nice solution. Too bad i'm using an older version of matlab but I get the ideas.
I've been quite confused with these datatypes, when I need to work with multiple datasets containing datetime, double and strings you think it's better that I put them in tables?

댓글을 달려면 로그인하십시오.

추가 답변 (2개)

Peter Perkins
Peter Perkins 2017년 7월 11일
One of your data files contains daily data with holes, the other contains irregular data of much higher frequency. The only two ways I can make sense of your question is that you want to either
1) braodcast out the daily data to the higher frequency data, or 2) somehow summarize the higher frequency data to a daily value and match them up
If you have access to MATLAB R2016b or later, both are easy to do with two timetables and the synchronize method. If tt1 has "pure dates" as it's row times, and tt2 has dates+times as its row times, then something like
tt = synchronize(tt1,tt2,'first','mean')
says, at the times defined by the first input, take the mean of the values in the second input between each of those times. The output is a timetable that has your prices as its two variables.
tt = synchronize(tt1,tt2,'last','previous')
says, at the times defined by the second input, find the value in the first input immediately before each of those times. This does what is in effect a join on the date.
There are subtleties you probably need to account for due to the holes in your daily data, but that's the general idea.
Prior to R2016b, you should be able to do more or less the same things by first creating a Date variable in your high frequency data using dateshift. To get a daily series,
varfun(@mean,tt2,'GroupingVariable','Date)
on the high frequency data, and then use the join function to combine that result with the daily data.
To get a high frequency series, just use the join function.
Hope this helps.
  댓글 수: 1
Ziye
Ziye 2017년 8월 3일
Thanks Peter, it's very helpful!

댓글을 달려면 로그인하십시오.


Steven Lord
Steven Lord 2017년 7월 11일
If you store your data in a timetable you could index into rows in a specific range of time using a timerange.

카테고리

Help CenterFile Exchange에서 Calendar에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by