Regexp formatting and explanation

Taylor Wait on 16 Nov 2018
Edited: Stephen23 on 17 Nov 2018
I am reading in a text file and extracting data from it, but I need to use a regexp to pull some data from this particular line: '0051-21194_3 Rev 09_P02109461-011_8-1-18'
Specifically, I am trying to get '0051-21194 Rev 09' and the date at the end, '8-1-18'. For the number and revision, I am trying to leave out the _# because it changes: sometimes it's not there, sometimes it is, and sometimes it uses a '-' instead. I am having serious trouble getting regexp to match anything, as it's pretty new to me.
So the actual question is: can someone help me format that properly? And also point me to some good examples explaining regular expressions and formatting, because the MATLAB page hasn't been very helpful to me for actually writing it out.

Answers (1)

Stephen23 on 17 Nov 2018
Edited: Stephen23 on 17 Nov 2018
>> S = '0051-21194_3 Rev 09_P02109461-011_8-1-18';
>> R = '^(\d+-\d+)[^R]{0,4}(Rev\s\d+)\w+-\d+_(\d+-\d+-\d+)$';
>> T = regexp(S,R,'lineanchors','tokens','once');
>> T{:}
ans = 0051-21194
ans = Rev 09
ans = 8-1-18
I made some assumptions, e.g. I added the line anchors (together with the 'lineanchors' option) in case this char vector is a substring of a larger string (i.e. the entire imported file). You can remove them if they are not required. A brief summary of that regular expression:
^(\d+-\d+)[^R]{0,4}(Rev\s\d+)\w+-\d+_(\d+-\d+-\d+)$
^ %-> start of line (or string)
(\d+-\d+) %-> '0051-21194', as token group.
[^R]{0,4} %-> '_3 ' (anything not 'R', <=4 times)
(Rev\s\d+) %-> 'Rev 09', as token group.
\w+-\d+_ %-> '_P02109461-011_'
(\d+-\d+-\d+) %-> '8-1-18', as token group.
$ %-> end of line (or string)
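As a rough sketch of how the pattern above behaves on the variations described in the question, the following applies it to three made-up lines (with '_3', with '-3', and with no suffix) and joins the first two tokens to get the requested '0051-21194 Rev 09'. The variant lines, the multi-line text F, and the variable names are assumptions for illustration only; real files may contain forms that this pattern does not cover.

```matlab
% Pattern from the answer above, unchanged.
R = '^(\d+-\d+)[^R]{0,4}(Rev\s\d+)\w+-\d+_(\d+-\d+-\d+)$';

% Hypothetical variants of the line, based on the description in the question:
% a '_3' suffix, a '-3' suffix, and no suffix at all.
C = {'0051-21194_3 Rev 09_P02109461-011_8-1-18', ...
     '0051-21194-3 Rev 09_P02109461-011_8-1-18', ...
     '0051-21194 Rev 09_P02109461-011_8-1-18'};

partRev = cell(size(C));
dateStr = cell(size(C));
for k = 1:numel(C)
    T = regexp(C{k}, R, 'lineanchors', 'tokens', 'once');
    partRev{k} = [T{1}, ' ', T{2}];   % number and revision joined by one space
    dateStr{k} = T{3};                % trailing date
    fprintf('%s | %s\n', partRev{k}, dateStr{k});
end
% Each iteration prints: 0051-21194 Rev 09 | 8-1-18

% The same pattern on a multi-line char vector (made-up file content).
% With 'lineanchors', ^ and $ match at the start and end of each line.
F = sprintf('header line\n%s\nfooter line', C{1});
U = regexp(F, R, 'lineanchors', 'tokens', 'once');
```

If a line does not match, regexp with 'once' returns an empty cell array, so checking isempty(T) before indexing T{1} would be a sensible guard when looping over real data.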
"And also point me to some good examples explaining regular expressions and formatting, because the MATLAB page hasn't been very helpful to me for actually writing it out."
The best source of information is the documentation:
To learn how to write regular expressions you will have to read this page countless times. Regular expressions are a whole language unto themselves, so it takes lots of time and lots of reading the documentation to learn how to use them. There is no replacement for simply reading the documentation again and again and again... and practice, practice, practice, ...
You might like to try using my FEX submission iregexp, which lets you interactively write a regular expression and shows all of regexp's outputs as you type:
You can find various tutorials and interactive regular expression parsers online and these can be useful to practice with or to see examples... but be aware that every programming language has different regular expression features, so (in case this was not clear yet) the MATLAB documentation is the only definitive source of information on MATLAB regular expressions.
