regexp for time - cannot guess it
regexp('123456.00','(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})\.(?<ff>\d{2})','names')
ans =
struct with fields:
hh: '12'
mm: '34'
ss: '56'
ff: '00'
super. :)
I want to make the ".00" optional, so I tried putting ()* around the last part, but then the ff field disappears:
>> regexp('123456.00','(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})(\.(?<ff>\d{2}))*','names')
ans =
struct with fields:
hh: '12'
mm: '34'
ss: '56'
Even plain () around the last part loses the ff field:
regexp('123456.00','(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})(\.(?<ff>\d{2}))','names')
ans =
struct with fields:
hh: '12'
mm: '34'
ss: '56'
Can anyone point me to the cause? I am done reading and guessing.
Many thanks in anticipation.
Answers (2)
Stephen23
11 April 2019
Edited: Stephen23
11 April 2019
This is easy with a conditional expression based on the fourth token (decimal point) match:
>> rgx = '(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})(\.?)(?<ff>(?(4)\d{2}))';
% Added token and conditional expression:     ^  ^^      ^^^^^     ^
>> regexp('123456.00',rgx,'names')
ans =
hh: '12'
mm: '34'
ss: '56'
ff: '00'
>> regexp('123456',rgx,'names')
ans =
hh: '12'
mm: '34'
ss: '56'
ff: ''
Note that one easy way to experiment with regular expressions is to download my interactive regular expression tool iregexp, which lets you quickly try out different parse strings and match strings, and shows regexp's outputs as you type:
iregexp has exactly the same inputs/outputs as regexp, plus it also has text fields for interactively changing the match and parse strings. For your example:
>> out = iregexp('123456.00',rgx,'names')
and then I interactively changed the parse string to give:
Comments: 2
Stephen23
12 April 2019
Edited: Stephen23
12 April 2019
"Read the doc regexp again & have no clue what ?(4) does."
(?(4)...) is a conditional operator: the expression inside it is matched only if the fourth token (the decimal point) was matched. Read about conditional operators.
You got rid of that conditional operator in your code, which means that your code will match a trailing decimal point. That is incorrect according to the specification that you wrote in your question: "Wanting to make the ".00" optional...". Also note that getting rid of the conditional operator makes the fourth token totally superfluous.
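To see the conditional in action, here is a minimal sketch that runs the pattern from the answer on the two inputs already shown above and collects the ff field each time. The variable names inputs and ff are just illustrative choices.

```matlab
% Pattern copied from the answer: token 4 is the optional decimal point,
% and (?(4)\d{2}) asks for two digits only when token 4 matched.
rgx = '(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})(\.?)(?<ff>(?(4)\d{2}))';

inputs = {'123456.00', '123456'};   % the two cases shown in the answer
ff = cell(size(inputs));            % illustrative container for the results

for k = 1:numel(inputs)
    t = regexp(inputs{k}, rgx, 'names');
    ff{k} = t.ff;                   % empty char when there is no fraction
    disp([inputs{k} ' -> ff = "' ff{k} '"'])
end
```

With the decimal point present the condition is true and ff holds the two digits; without it the condition is false and ff is left empty, as in the outputs in the answer.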
"Experimenting from your answer, I see taht all I needed to do it the ()* way, I was looking for was to put the \d{2} in brackets. Then it accepts ()* brackets round the outside, for some reason."
The code I gave you already works correctly according to the specification in your question.
Your changes have made the regular expression buggy, because it now matches strings which are not valid time strings, e.g. it will now match '1234567890123456' and '123456......' and '123456.........78901234'. I have no idea why you think you needed to change my answer, or why you think you need to use *, or why you want to match those kinds of strings (breaking with your own specification). Even if you replace * with ? your code will still match '12345678' (i.e. not a time string according to your own specification).
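If the goal is to accept only strings that are time strings from start to end, one possible approach is to anchor the pattern with ^ and $. This is a sketch under that assumption: the anchors and the helper name isTime are additions for illustration and are not part of the answer above.

```matlab
% Same pattern as in the answer, wrapped in ^...$ so that the whole
% string must be consumed (the anchors are an assumption of this sketch).
rgx = '^(?<hh>\d{2})(?<mm>\d{2})(?<ss>\d{2})(\.?)(?<ff>(?(4)\d{2}))$';

% isTime is a hypothetical helper: with 'once', regexp returns the start
% index of the match, or an empty array when nothing matches.
isTime = @(str) ~isempty(regexp(str, rgx, 'once'));

candidates = {'123456.00', '123456', '12345678', '123456......'};
for k = 1:numel(candidates)
    fprintf('%-14s valid: %d\n', candidates{k}, isTime(candidates{k}));
end
```

The anchored pattern should accept the first two candidates and reject the last two, since extra digits or stray decimal points can no longer be skipped at the end of the string.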
"Many thanks."
I hope that it helped. Please remember to accept my answer if it resolved your question. Accepting an answer is the easiest way to show your thanks for the help you receive.
Rik
11 April 2019
If you know this format, maybe a regexp isn't the right tool for the job.
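function s = parse_time(str)
% Split 'hhmmss' or 'hhmmss.ff' at the decimal point, then slice by position.
c = regexp(str, '\.', 'split');
s.hh = c{1}(1:2);
s.mm = c{1}(3:4);
s.ss = c{1}(5:6);
if numel(c) > 1
    s.ff = c{2};     % fractional part is present
else
    s.ff = '00';     % default when there is no fractional part
end
end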