Extracting numbers after a particular string from a cell array

sermet OGUTCU on 2 Jul 2021
Edited: dpb on 2 Jul 2021
data={'333', 'AS C37 2021 03 28 00 05 30.000000 1 -0.884071511631E-03','abvc','400 55 a','AS G17 2021 3 28 0 17 30.000000 1 0.416843065644E-03'};
For example, in the cell array above, how can I extract all the YYYY MM DD HH MM SS values (2021 03 28 00 05 30.000000 and 2021 3 28 0 17 30.000000)?
The relevant YYYY MM DD HH MM SS values always come after AS [A-Z][0-9][0-9] (for example, AS C37 and AS G17). Can code be written to extract the values according to this rule? The original data cell array is 1x400000, so speed is also an important factor.
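For reference, the rule described above can be sketched as a single regular expression applied to the whole cell array at once. This is only a minimal sketch, assuming every record of interest starts with AS, one capital letter and two digits, followed by six whitespace-separated date/time fields; the variable names pat, tok, dateStr and ymdhms are illustrative.

```matlab
data = {'333', 'AS C37 2021 03 28 00 05 30.000000 1 -0.884071511631E-03', ...
        'abvc', '400 55 a', 'AS G17 2021 3 28 0 17 30.000000 1 0.416843065644E-03'};

% Capture the six date/time fields that follow "AS <letter><two digits>".
pat = '^AS [A-Z]\d{2}\s+(\d{4}\s+\d{1,2}\s+\d{1,2}\s+\d{1,2}\s+\d{1,2}\s+\d+\.?\d*)';

% One vectorized call over the whole cell array; non-matching cells come back empty.
tok = regexp(data, pat, 'tokens', 'once');
tok = tok(~cellfun(@isempty, tok));

% Flatten to a cell array of date strings.
dateStr = [tok{:}];

% Convert all strings in one sscanf call to an N-by-6 numeric matrix.
ymdhms = sscanf(strjoin(dateStr), '%f', [6 Inf]).';

% Optional: datetime accepts an N-by-6 matrix of date vectors.
t = datetime(ymdhms);
```

Calling regexp and sscanf once each on the full array avoids a MATLAB-level loop over the 400000 cells, which is usually what matters for speed here.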
Comments: 6
dpb on 2 Jul 2021
There may well be (there probably is; no, there undoubtedly is) code available to read these files; there might even be a MATLAB routine already. Have you looked for what routines are available?
sermet OGUTCU on 2 Jul 2021
I just want to extract all dates YYYY MM DD HH MM SS (such as 2021 03 28 00 05 30.000000) from this cell array.


Accepted Answer

dpb on 2 Jul 2021
Edited: dpb on 2 Jul 2021
Oh. I see I didn't look far enough down the file -- the header stuff ends at record 170; the other data starts at record 171.
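% Skip the 170 header records; the data records start at record 171.
tCOD=readtable('COD0MGXFIN_20210870000_01D_30S_CLK.clk','FileType','text', ...
               'HeaderLines',170,'ReadVariableNames',false);
% Name the six date/time columns, then combine them into one datetime variable.
tCOD.Properties.VariableNames(3:8)={'Yr','Mn','Day','Hr','Min','Sec'};
tCOD.DateTime=datetime(tCOD{:,{'Yr','Mn','Day','Hr','Min','Sec'}});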
leaves you with
>> [head(tCOD);tail(tCOD)]
ans =
16×12 table
Var1      Var2               Yr    Mn    Day    Hr    Min    Sec    Var9          Var10         Var11    DateTime
______    _____________    ____    __    ___    __    ___    ___    ____    ___________    __________    ____________________
{'AR'}    {'BADG00RUS'}    2021     3     28     0      0      0       2     0.00044149    3.7396e-11    28-Mar-2021 00:00:00
{'AR'}    {'ABMF00GLP'}    2021     3     28     0      0      0       2    -0.00024309     3.739e-11    28-Mar-2021 00:00:00
{'AR'}    {'AJAC00FRA'}    2021     3     28     0      0      0       2    -0.00038427    3.7166e-11    28-Mar-2021 00:00:00
{'AR'}    {'ALIC00AUS'}    2021     3     28     0      0      0       2    -2.4277e-09    3.7381e-11    28-Mar-2021 00:00:00
{'AR'}    {'AMU200ATA'}    2021     3     28     0      0      0       2    -2.9659e-08    3.7474e-11    28-Mar-2021 00:00:00
{'AR'}    {'ANKR00TUR'}    2021     3     28     0      0      0       2     1.9425e-08    3.7349e-11    28-Mar-2021 00:00:00
{'AR'}    {'AREG00PER'}    2021     3     28     0      0      0       2     0.00046999    3.7485e-11    28-Mar-2021 00:00:00
{'AR'}    {'ASCG00SHN'}    2021     3     28     0      0      0       2    -3.5686e-08    3.7378e-11    28-Mar-2021 00:00:00
{'AS'}    {'R16'      }    2021     3     28    23     59     30       1    -1.3127e-05           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R17'      }    2021     3     28    23     59     30       1     0.00041179           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R18'      }    2021     3     28    23     59     30       1     7.1344e-05           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R19'      }    2021     3     28    23     59     30       1    -0.00013759           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R20'      }    2021     3     28    23     59     30       1    -4.6221e-05           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R21'      }    2021     3     28    23     59     30       1    -0.00019777           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R22'      }    2021     3     28    23     59     30       1    -0.00010502           NaN    28-Mar-2021 23:59:30
{'AS'}    {'R24'      }    2021     3     28    23     59     30       1     3.6747e-05           NaN    28-Mar-2021 23:59:30
>>
The AS records at the end of the table have only two (2) variables past the time field instead of three (3), hence the NaN elements for Var11.
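Once the records are in a table, picking out only the AS rows is a logical-indexing step. The following is a minimal sketch on a three-row stand-in table built from values shown in the output above; tDemo, isAS and asTimes are illustrative names, and the extra check on Var2 (one capital letter plus two digits) is an assumption taken from the rule stated in the question.

```matlab
% Small stand-in for tCOD, using three rows copied from the displayed output.
Var1 = {'AR'; 'AS'; 'AS'};
Var2 = {'BADG00RUS'; 'R16'; 'R17'};
DateTime = datetime([2021 3 28 0 0 0; 2021 3 28 23 59 30; 2021 3 28 23 59 30]);
tDemo = table(Var1, Var2, DateTime);

% Keep rows whose first field is 'AS' and whose second field looks like C37, G17, R16, ...
isAS = strcmp(tDemo.Var1, 'AS') & ...
       ~cellfun(@isempty, regexp(tDemo.Var2, '^[A-Z]\d{2}$', 'once'));

% Date/time values of the AS records only.
asTimes = tDemo.DateTime(isAS);
```

On the real table, the same two lines would apply with tCOD in place of tDemo.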
You can either scan the file for the location of the "END OF HEADER" record to find the number of header lines to skip, or there probably is sufficient data within the file header to compute where that is -- although if the COMMENTS are freeform, there may not be a fixed number of records there, and so it may just take scanning the file first...
Either way, this is much simpler and more straightforward than trying to parse the cell array...that's fraught with difficulty in comparison.
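Scanning for the "END OF HEADER" record, as mentioned above, could look roughly like the sketch below. It is only an illustration: it writes a tiny stand-in file (two made-up header lines followed by the two data records quoted in the question) so that it runs on its own, and the names fname, nHeader and found are assumptions. With the real file, only the scanning loop and the readtable call would be needed.

```matlab
% Build a tiny stand-in file; the header lines here are made up for the demo.
fname = [tempname '.clk'];
fid = fopen(fname, 'w');
fprintf(fid, '%s\n', ...
    'made-up header line', ...
    'END OF HEADER', ...
    'AS C37 2021 03 28 00 05 30.000000 1 -0.884071511631E-03', ...
    'AS G17 2021 3 28 0 17 30.000000 1 0.416843065644E-03');
fclose(fid);

% Count lines up to and including the "END OF HEADER" record.
fid = fopen(fname, 'r');
nHeader = 0;
found = false;
while true
    ln = fgetl(fid);
    if ~ischar(ln), break, end        % end of file reached
    nHeader = nHeader + 1;
    if contains(ln, 'END OF HEADER')
        found = true;
        break
    end
end
fclose(fid);
assert(found, 'No END OF HEADER record found in %s', fname);

% Use the computed count instead of a hard-coded number of header lines.
t = readtable(fname, 'FileType', 'text', ...
              'HeaderLines', nHeader, 'ReadVariableNames', false);
delete(fname);
```

This reads the header twice (once to count, once inside readtable), but the header is short compared with the data, so the extra cost should be small.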

More Answers (1)

sermet OGUTCU on 2 Jul 2021
I attached the original file.
