I am using the datenum function to convert a time stamp in my dataset to a serial number. This is the line that does it:
wind.t = datenum(wind.TheTime,'mm/dd/yyyy HH:MM:SS');
It is failing on one dataset and not on another and I don't understand why. There is nothing different between the two datasets, other than one is much longer.
Both timestamps are formatted the same way, but the longer one gives this error:
Error using ==> datenum at 174
DATENUM failed.
Caused by:
    Error using ==> dtstr2dtnummx
    Failed on converting date string to date number.
The info about the timestamp column was found using the 'summary' function below. They are identical.
wind = summary(wind)
TheTime: [27878x1 cell string, Units = TheTime]
summary(wind)
TheTime: [2047x1 cell string, Units = TheTime]
Any help or suggestions would be much appreciated.

Comments: 1

Rob Graessle on 6 May 2011
Does your program fail at the same point every time you run it? What version of MATLAB and operating system are you using?


Accepted Answer

Matt Tearle on 6 May 2011

Assuming you haven't looked at every string individually, I'd guess the most likely cause is a malformed string somewhere in the cell array. You could use regexp to check that every string does, in fact, match the assumed pattern.
Something like this should work:
nnz(cellfun(@isempty,regexp(wind.TheTime,'\d+/\d+/\d\d\d\d\s+\d+:\d+:\d+')))

Comments: 14

Braden on 6 May 2011
I'm not 100% sure how regexp is working, but if I understand the other two functions correctly, cellfun gives a 1 wherever it finds an empty, and nnz counts those 1's. So I ran this on both datasets: the short one gave an output of 0 and the longer one gave an output of 4026. I'm not sure how it could catch that many timestamps that didn't match, since they were all made using the same averaging program.
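To make the one-liner easier to follow, here is a minimal sketch that splits it into its three steps. The `stamps` cell array is hypothetical sample data standing in for `wind.TheTime`, and the variable names are illustrative only.

```matlab
% Hypothetical sample data standing in for wind.TheTime
stamps = {'05/06/2011 13:45:00'; '05/06/2011 13:55'; '05/06/2011 14:05:00'};
pattern = '\d+/\d+/\d\d\d\d\s+\d+:\d+:\d+';

% regexp on a cell array of strings returns a cell array: each cell holds
% the start index of every match in that string, or is empty if none matched
starts = regexp(stamps, pattern);

% cellfun(@isempty, ...) turns that into a logical vector: true = no match
noMatch = cellfun(@isempty, starts);

% nnz counts the true entries, i.e. how many strings failed the pattern
numBad = nnz(noMatch);
```

In this sample only the second string, which has no seconds field, fails to match, so `numBad` is 1.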
Walter Roberson on 6 May 2011
Matt!! This cries out for logical indexing!
t = char(wind.TheTime);
if size(t,2) < 19
    disp('all entries have dates too short!')
else
    toolong = false(size(t,1),1)
    if size(t,2) > 19
        toolong = any(t(:,20:end) ~= ' ',2);
    end
    badfmt = ~toolong & ismember(t(:,[1:2 4:5 7:10 12:13 15:16 18:19]),'0':'9') & all(t(:,[3 5]),2)=='/' & t(:,11) == ' ' & all(t(:,[14 17])==':'),2);
    if any(toolong)
        disp('Entries were too long at lines')
        disp(find(toolong).')
        disp('which were:')
        disp(t(toolong,:))
        disp('')
    end
    if any(badfmt)
        disp('Entries had bad format at lines')
        disp(find(badfmt).')
        disp('which were:')
        disp(t(badfmt,:))
        disp('')
    end
Walter Roberson on 6 May 2011
There are some bugs in Matt's answer. The corrected version is:
nnz(cellfun(@isempty,regexp(wind.TheTime,'^\d\d/\d\d/\d\d\d\d\s\d\d:\d\d:\d\d$')))
If single-digit days, months, hours, minutes, and seconds must be accepted, then
nnz(cellfun(@isempty,regexp(wind.TheTime,'^\d\d?/\d\d?/\d\d\d\d\s\d\d?:\d\d?:\d\d?$')))
Matt's version did not find dates with leading or trailing nonsense, and did not find dates with excessive digits in some of the positions.
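The difference between the two patterns can be seen on a few made-up strings. This is only a sketch: the `samples` cell array and the names `loose` and `strict` are hypothetical and do not come from the dataset in question.

```matlab
loose  = '\d+/\d+/\d\d\d\d\s+\d+:\d+:\d+';         % unanchored pattern
strict = '^\d\d/\d\d/\d\d\d\d\s\d\d:\d\d:\d\d$';   % anchored, fixed widths

% Hypothetical test strings
samples = {'05/06/2011 13:45:00';      % well formed
           'x05/06/2011 13:45:00y';    % leading and trailing junk
           '123/06/2011 13:45:00';     % three-digit month
           '05/06/2011 13:45'};        % seconds missing

% true marks a string that the pattern rejects
failsLoose  = cellfun(@isempty, regexp(samples, loose));
failsStrict = cellfun(@isempty, regexp(samples, strict));

disp([failsLoose failsStrict])
```

The unanchored pattern rejects only the string with missing seconds, while the anchored pattern also rejects the two strings with junk or extra digits.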
Braden on 6 May 2011
Walter - thanks for the help! Your answer looks like it will be informative. A heads up - there was an 'end' missing at the bottom, as well as a bracket on the badfmt line. After that, it now gives this error:
??? Error using ==> and
Inputs must have the same size.
Walter Roberson on 6 May 2011
Ah yes,
badfmt = ~toolong & ismember(t(:,[1:2 4:5 7:10 12:13 15:16 18:19]),'0':'9') & all(t(:,[3 5])=='/',2) & t(:,11) == ' ' & all(t(:,[14 17])==':',2);
I think that should fix the "and" problem as well.
Matt Tearle on 6 May 2011
I was using \d+ because (I think) datenum would cope with a single digit month or day, for example. But Walter's right that it also wouldn't catch things like 3-digit months. I was assuming that a very basic pattern search would reveal the problem (such as missing h:m:s data). But the anchoring operators are a good idea, to avoid leading/trailing garbage (an equally likely cause of problems).
Braden on 9 May 2011
Thanks for the follow up Walter. Although even with the new "badfmt" expression, I cannot seem to make that "and" problem go away. The same error keeps coming up.
Walter Roberson on 9 May 2011
Looks like I missed an "all"
badfmt = ~toolong & all(ismember(t(:,[1:2 4:5 7:10 12:13 15:16 18:19]),'0':'9'),2) & all(t(:,[3 5])=='/',2) & t(:,11) == ' ' & all(t(:,[14 17])==':',2);
Braden on 9 May 2011
After getting Walter's answer debugged, it does not catch anything on either dataset - the long or the short one. Although Matt's nnz answer still outputs ans = 4026 for the longer dataset. The longer dataset still throws the datenum error, so I am not sure what the issue is.
Walter Roberson on 9 May 2011
I think it is time to run the dates through a "for" loop, converting one by one in a try/catch block. If that succeeds then it might be worth splitting the dataset into pieces and seeing what is going on.
Also, in Matt's nnz(), if you replace the nnz() by find() then it will output the entry numbers, which you could then examine more closely.
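Both suggestions could look roughly like the sketch below. It assumes a small hypothetical `stamps` cell array in place of `wind.TheTime`; the names `failed`, `badRows`, and `noMatchRows` are illustrative only.

```matlab
% Hypothetical sample data; in the thread this would be wind.TheTime
stamps = {'05/06/2011 13:45:00'; 'not a date'; '05/06/2011 14:05:00'};
fmt = 'mm/dd/yyyy HH:MM:SS';

% Convert one entry at a time so that a single bad string cannot abort
% the whole conversion; NaN marks entries that could not be converted
t = nan(numel(stamps), 1);
failed = false(numel(stamps), 1);
for k = 1:numel(stamps)
    try
        t(k) = datenum(stamps{k}, fmt);
    catch
        failed(k) = true;
    end
end
badRows = find(failed);   % row numbers that datenum rejected

% Same check as Matt's one-liner, with find() in place of nnz() so that
% the row numbers are returned instead of a count
pattern = '\d+/\d+/\d\d\d\d\s+\d+:\d+:\d+';
noMatchRows = find(cellfun(@isempty, regexp(stamps, pattern)));
```

Inspecting `stamps(badRows)` or `stamps(noMatchRows)` then shows the offending strings themselves.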
Braden on 9 May 2011
Thanks for the tip about find(). I zeroed in on those numbers and it turns out there was a month where seconds weren't included in the time stamp. This is fixed and the import is complete.
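The thread does not say how the time stamps were repaired. One possible approach, offered purely as an assumption, is to append ':00' to any string that stops at the minutes before calling datenum. The `stamps` data and the `fixed` variable below are hypothetical.

```matlab
% Hypothetical sample data; the second entry has no seconds field
stamps = {'05/06/2011 13:45:00'; '05/06/2011 13:55'; '05/06/2011 14:05:00'};

% The anchored pattern matches only strings that end right after HH:MM,
% so strings that already carry seconds are left unchanged
fixed = regexprep(stamps, '^(\d\d?/\d\d?/\d\d\d\d\s+\d\d?:\d\d?)$', '$1:00');

% With every string in the same layout, one format string now fits all
t = datenum(fixed, 'mm/dd/yyyy HH:MM:SS');
```

Whether zero seconds is the right value to fill in depends on how the averaging program produced those stamps.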
Walter Roberson on 9 May 2011
Agggh, I need a visit from Captain DeMorgan!
% 'mm/dd/yyyy HH:MM:SS' puts the slashes in columns 3 and 6
badfmt = ~toolong & ~(all(ismember(t(:,[1:2 4:5 7:10 12:13 15:16 18:19]),'0':'9'),2) & all(t(:,[3 6])=='/',2) & t(:,11) == ' ' & all(t(:,[14 17])==':',2));
Why did it work in my test??
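Putting the corrections from this thread together, the character-matrix check might be sketched as follows. The `stamps` data is hypothetical, and the intermediate names (`digitsOK`, `slashOK`, `spaceOK`, `colonOK`) are introduced here only for readability. Note that 'mm/dd/yyyy' puts the slashes in columns 3 and 6, so this sketch tests those columns.

```matlab
% Hypothetical sample data; in the thread this would be wind.TheTime
stamps = {'05/06/2011 13:45:00';      % well formed
          '05/06/2011 13:55';         % seconds missing
          '05/06/2011 14:05:00 x';    % trailing junk
          '05-06-2011 13:45:00'};     % wrong date separator
t = char(stamps);                     % space-padded character matrix

if size(t,2) < 19
    disp('all entries have dates too short!')
else
    % An entry is too long if anything but padding follows column 19
    toolong = false(size(t,1),1);
    if size(t,2) > 19
        toolong = any(t(:,20:end) ~= ' ',2);
    end
    % Column-wise checks against the layout 'mm/dd/yyyy HH:MM:SS'
    digitsOK = all(ismember(t(:,[1:2 4:5 7:10 12:13 15:16 18:19]),'0':'9'),2);
    slashOK  = all(t(:,[3 6]) == '/',2);
    spaceOK  = t(:,11) == ' ';
    colonOK  = all(t(:,[14 17]) == ':',2);
    badfmt   = ~toolong & ~(digitsOK & slashOK & spaceOK & colonOK);

    if any(toolong)
        disp('Entries were too long at lines')
        disp(find(toolong).')
    end
    if any(badfmt)
        disp('Entries had bad format at lines')
        disp(find(badfmt).')
    end
end
```

With this sample data the third entry is reported as too long and the second and fourth as badly formatted.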
Matt Tearle on 10 May 2011
@Braden: excellent, glad it's resolved.
@Walter: "Captain DeMorgan"... Let X be the set of all rum-drinkers, and let Y be the set of all pirates...
Walter Roberson on 10 May 2011
A bit long, but... http://www.youtube.com/watch?v=d67j-Hfgsco


Asked: 6 May 2011
