How to check for gaps between datetimes in list of files in a folder?

I have a folder of files whose names contain the datetime at which each one was recorded, at regular intervals. I want to check that there is in fact one file every 10 minutes, and that no files are missing.
To do this, I converted the datetimes in the file names to serial date numbers and calculated what 10 minutes is as a serial date number. I then check whether the gap between consecutive files equals this value. However, even though I have checked manually that the numbers 'gap' and 'correctgap' are the same, my script tells me that they are not. I also have a problem where the second for loop in my script reverts back to the first, rather than cycling through all of the files in d.
Can anyone see the problem? Thanks a lot!!
%Checking for gaps between files
paths={'H:\SoundTrap\wav\GoatIsland\001_GoatIsland_5100'};
dateFormat='yymmddHHMMSS';
correctgap=0.006944444496185;
lastfile=strsplit('5100.190610124350.wav','.'); %filename of first file in folder
lastfile=lastfile(2);
lastfile=datenum(lastfile,dateFormat);
for i=1:length(paths)
    path=char(paths(i));
    d=dir(fullfile(path,'*.wav'));
    filecount=length(d);
    %for loop that doesn't work:
    for j=2:length(d) %start on second file (comparing second to first)
        DateTime=strsplit(d(j).name,'.');
        DateTime=DateTime(2);
        DateTime=datenum(DateTime,dateFormat);
        gap=DateTime - lastfile; % dateTime
        if gap==correctgap %doesn't work, when values are the same, error('gap!') is returned
            disp('good')
        else
            error('gap!')
        end
        lastfile=DateTime;
    end
end

Accepted Answer

Mohammad Sami on 22 Apr 2020
This is likely an issue with the precision of the double value. Instead of checking for exact equality, you may want to check that the absolute difference is less than a certain threshold:
threshold = 0.1 / (24*3600); % 0.1 second. change as needed
%.... your code .....%
if abs(gap-correctgap) < threshold
    disp('good');
else
    error('gap!');
end
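To illustrate the round-off issue, here is a minimal sketch. The two timestamps are hypothetical, chosen to be exactly 10 minutes apart, and the variable names are only for illustration. The exact comparison may or may not succeed depending on round-off, while the tolerance comparison should.

```matlab
% Two hypothetical timestamps exactly 10 minutes apart,
% parsed the same way as in the question.
dateFormat = 'yymmddHHMMSS';
t1 = datenum('190610124350', dateFormat);
t2 = datenum('190610125350', dateFormat);

gap = t2 - t1;                 % elapsed time in days (datenum units)
expected = 10 / (24*60);       % 10 minutes expressed in days

exactMatch = (gap == expected);                 % can be false due to round-off
tolerance = 0.1 / (24*3600);                    % 0.1 second expressed in days
closeEnough = abs(gap - expected) < tolerance;  % robust comparison

fprintf('exact: %d, within tolerance: %d\n', exactMatch, closeEnough);
```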
2 Comments
Louise Wilson on 26 Apr 2020
Thanks Mohammad! That fixes it. Just to clarify, are you saying that 0.1/(24*3600) is one second in the datenum format?
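For reference: a serial date number counts days, so 1 corresponds to 24 hours and one second is 1/(24*3600). The threshold 0.1/(24*3600) is therefore 0.1 second, not one second. A small sketch of the arithmetic (variable names are illustrative only):

```matlab
secondsPerDay = 24 * 3600;               % number of seconds in one day
oneSecond  = 1 / secondsPerDay;          % one second in datenum units (days)
threshold  = 0.1 / secondsPerDay;        % 0.1 second, the value used in the answer
tenMinutes = 10 * 60 / secondsPerDay;    % 10 minutes in datenum units

% The hard-coded constant from the question differs from the exact
% 10-minute value by a few microseconds, which is why == fails.
offsetSeconds = (0.006944444496185 - tenMinutes) * secondsPerDay;

fprintf('10 minutes = %.15f days, offset = %.2e s\n', tenMinutes, offsetSeconds);
```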
Louise Wilson on 26 Apr 2020
I have this now and it works, but in my output table the only thing I can get to work is single digits. Ideally I would have datetime in the first column and either 'good' or 'gap!' in the second column. How do I change the format of the output so that the output table can contain both a datenum in the first column and text in the second column?
paths={'Y:\SoundTrap\wav\GoatIsland\001_GoatIsland_5100'};
row=1;
output=[];
dateFormat='yymmddHHMMSS';
correctgap=0.006944444496185; %correct gap when 10 minute interval
threshold=0.1/(24*3600);
lastfile=strsplit('5100.190610124350.wav','.'); %filename of first file in folder
lastfile=lastfile(2);
lastfile=datenum(lastfile,dateFormat);
for i=1:length(paths)
    path=char(paths(i));
    d=dir(fullfile(path,'*.wav'));
    filecount=length(d);
    for j=2:length(d) %start on second file (comparing second to first)
        DateTime=strsplit(d(j).name,'.');
        DateTime=DateTime(2);
        DateTime=datenum(DateTime,dateFormat);
        gap=DateTime - lastfile; % dateTime
        output(row,1)=DateTime;
        if abs(gap-correctgap) < threshold
            disp('good')
            output(row,2)='1'; %1=49
        else
            error('gap!')
            output(row,2)='0'; %0=48
        end
        lastfile=DateTime;
        row=row+1;
    end
end
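A numeric matrix can only hold numbers, which is why the text ends up stored as character codes (49 and 48). One possible way to keep a date and a text label side by side is a table. The sketch below is only an illustration under assumptions: the three timestamps are hypothetical (10 minutes apart, then 20 minutes apart) and the column names are made up.

```matlab
% Hypothetical timestamps as they would appear in the file names
dateFormat = 'yymmddHHMMSS';
stampText  = {'190610124350'; '190610125350'; '190610131350'};
stamps     = datenum(stampText, dateFormat);

correctgap = 10 / (24*60);         % 10 minutes in days
threshold  = 0.1 / (24*3600);      % 0.1 second in days

gaps   = diff(stamps);                          % gap to the previous file
isGood = abs(gaps - correctgap) < threshold;    % logical flag per file

status = repmat({'gap!'}, numel(gaps), 1);      % default label
status(isGood) = {'good'};                      % overwrite where gap is correct

% Readable time in the first column, label in the second
timeText = cellstr(datestr(stamps(2:end), 'yyyy-mm-dd HH:MM:SS'));
output = table(timeText, status, 'VariableNames', {'Time', 'Status'});
disp(output)
```

Note that this version records the label instead of calling error, since error would stop the script at the first gap.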


More Answers (1)

Peter Perkins on 27 Apr 2020
Louise, if you really want one array that contains both time and text, timetable is the right thing to use (though I would argue that you want logical, not text).
I recommend that you don't use datenum at all unless you are using a very old version of MATLAB. This will not suffer from the round-off problems you are having:
>> fname = '5100.190610124350.wav';
>> timestamp = datetime(fname,'InputFormat',"'5100.'yyMMddHHmmss'.wav'")
timestamp =
datetime
10-Jun-2019 12:43:50
When you subtract each timestamp from the previous, you will get a duration, and that's fine. Better, in fact.
Notice how I've embedded '5100.' and '.wav' as literals in the input format. There are other ways to peel that onion (e.g. the strsplit you were using), but the above is straightforward. If the '5100.' part is not always the same, you may want to use strsplit instead.
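As a rough sketch of the strsplit route for names whose leading part varies: the file names below are hypothetical (the second serial number is invented for illustration), and the approach assumes every name has the form serial.timestamp.wav.

```matlab
% Hypothetical file names with different leading serial numbers
fnames = {'5100.190610124350.wav'; '5321.190610125350.wav'};

% Split each name on '.' and keep the middle piece (the timestamp text)
parts     = cellfun(@(s) strsplit(s, '.'), fnames, 'UniformOutput', false);
stampText = cellfun(@(p) p{2}, parts, 'UniformOutput', false);

% Parse the timestamp text without any embedded literals
timestamps = datetime(stampText, 'InputFormat', 'yyMMddHHmmss');
disp(timestamps)
```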
There are easy ways to get your output with no loop. Assume you have the struct that you get back from dir, with a name field.
>> d = dir
d =
5×1 struct array with fields:
name
folder
date
bytes
isdir
datenum
>> fnames = {d.name}'
fnames =
6×1 cell array
{'5100.190610124350.wav'}
{'5100.190610125350.wav'}
{'5100.190610130351.wav'}
{'5100.190610131350.wav'}
{'5100.190610132350.wav'}
{'5100.190610133350.wav'}
>> timestamps = datetime(fnames,'InputFormat',"'5100.'yyMMddHHmmss'.wav'")
timestamps =
6×1 datetime array
10-Jun-2019 12:43:50
10-Jun-2019 12:53:50
10-Jun-2019 13:03:51
10-Jun-2019 13:13:50
10-Jun-2019 13:23:50
10-Jun-2019 13:33:50
>> gaps = [NaN; diff(timestamps)]
gaps =
6×1 duration array
NaN
00:10:00
00:10:01
00:09:59
00:10:00
00:10:00
>> output = timetable(timestamps,gaps,gaps==minutes(10))
output =
  6×2 timetable
         timestamps           gaps      Var2
    ____________________    ________    _____
    10-Jun-2019 12:43:50         NaN    false
    10-Jun-2019 12:53:50    00:10:00    true
    10-Jun-2019 13:03:51    00:10:01    false
    10-Jun-2019 13:13:50    00:09:59    false
    10-Jun-2019 13:23:50    00:10:00    true
    10-Jun-2019 13:33:50    00:10:00    true
I don't know if that's actually what you want: notice that the 3rd file's timestamp is off by 1 second, and you get a false, but the 4th also gets a false even though its timestamp is on a ten-minute boundary, because the gap from the 3rd is not 10 minutes. Maybe you want to look at differences from the first file, not the previous one, I don't know. That would be easy in the above code.
>> (timestamps(4) - timestamps(1)) / minutes(10)
ans =
3
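One way this could look, as a sketch only: measure each file's elapsed time from the first file, find the nearest 10-minute slot, and check how far off the file is from that slot. The timestamps are rebuilt here from the example above, and the 2-second tolerance is an arbitrary assumption.

```matlab
% Timestamps matching the example above: every 10 minutes, 3rd one 1 s late
timestamps = datetime(2019, 6, 10, 12, 43, 50) + minutes([0; 10; 20; 30; 40; 50]);
timestamps(3) = timestamps(3) + seconds(1);

elapsed = timestamps - timestamps(1);      % duration since the first file
slots   = elapsed / minutes(10);           % duration / duration gives a double
nearest = round(slots);                    % index of the nearest 10-minute slot

offBy      = elapsed - minutes(10 * nearest);   % deviation from the schedule
onSchedule = abs(offBy) <= seconds(2);          % assumed tolerance of 2 s

disp(table(timestamps, nearest, offBy, onSchedule))
```

With this approach a single late file does not cause the following file to be flagged as well.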
2 Comments
Louise Wilson on 28 Apr 2020
Hi Peter-thanks for your response.
Using duration makes sense to me! The four digit number at the start is the serial number of the device I used to record the file, so that can be any of ten different serial numbers.
I do want to look at differences between subsequent files, not from the first file-I am interested in the difference between each file. Basically this is a hydrophone which is scheduled to record every 10 minutes, and I want to make sure that none were skipped e.g. the gap between recordings is 20 minutes. Seconds aren't important. So I guess I could change 'minutes(10)' to 'minutes(18)' or something and that would catch anything?
Peter Perkins on 30 Apr 2020
Probably you want something like gaps>minutes(11), I would think.
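A minimal sketch of that check, using hypothetical timestamps in which one recording is missing (so one gap is 20 minutes). The variable name missed is made up for illustration.

```matlab
% Hypothetical schedule: the recording expected at +30 minutes is missing
timestamps = datetime(2019, 6, 10, 12, 43, 50) + minutes([0; 10; 20; 40; 50]);

gaps   = [NaN; diff(timestamps)];     % gap to the previous file (NaN for the first)
missed = gaps > minutes(11);          % true where at least one recording was skipped

output = timetable(timestamps, gaps, missed);
disp(output(output.missed, :))        % show only the rows that follow a skipped file
```

Because NaN compares as false, the first file is never flagged.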

Release

R2019a
