regexp - match regular expression question

Kenny on 30 Sep 2016
Edited: Kenny on 10 Nov 2016
Hi all,
In the MATLAB help for the function regexp, I'm trying to understand what the vertical bar (i.e. | ) means in the pattern below. The example comes directly from MATLAB's help text, shown after typing 'help regexp'.
The help documentation indicates:
"|" means Match subexpression before or after the "|"
What does that mean exactly? At the moment I'm thinking 'which is it?' I was expecting that a match would be either 'before' or 'after', but not 'before OR after'. And even if it really does mean 'match before OR after', what does that mean in practice? For example, what does "|" actually represent?
Thanks in advance.
str = 'John Davis; Rogers, James';
pat = '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';
n = regexp(str, pat, 'names')
  Comments: 2
Stephen23 on 30 Sep 2016
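The | is alternation. It behaves like an exclusive or in the sense that each match uses exactly one of the alternatives: MATLAB tries them from left to right and takes the first one that succeeds. Here is an example of how it works, tested on a string with four slightly different "words":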
>> regexp('a123z a%%%z a1%3z a__z','a(\d+|%+)z','match')
ans =
'a123z' 'a%%%z'
The pattern matches all sequences that start with a, end with z, and contain XOR(digits,%-symbols) in between, i.e. either only digits or only %-symbols. The third "word" in the string does not match because it contains both digits and %-symbols. The fourth contains only underscores, so it does not match the regex either. Now let's alter the regex and use two |, to give XOR(digits,%-symbols,underscores):
>> regexp('a123z,a%%%z,a1%3z,a__z','a(\d+|%+|_+)z','match')
ans =
'a123z' 'a%%%z' 'a__z'
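Two further details about | are worth checking for yourself: how far it reaches within a pattern, and the order in which alternatives are tried. The following is a minimal sketch, not part of the original posts; the test strings and variable names are made up for illustration.

```matlab
% Scope: without parentheses, | splits the ENTIRE pattern in two.
% 'gr(a|e)y' means "gray" or "grey"; 'gra|ey' means "gra" or "ey".
grouped   = regexp('gray grey', 'gr(a|e)y', 'match');   % {'gray','grey'}
ungrouped = regexp('gray grey', 'gra|ey',   'match');   % {'gra','ey'}

% Order: alternatives are tried left to right, and the first one that
% succeeds at a given position is used, even if a later one is longer.
shortFirst = regexp('abc', 'ab|abc', 'match');          % {'ab'}
longFirst  = regexp('abc', 'abc|ab', 'match');          % {'abc'}
```

This is why the parentheses in 'a(\d+|%+)z' matter: they limit the alternation to the part between the a and the z.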
Bonus: if you want a convenient way to test and experiment with regular expressions, you can try my FEX submission:
Kenny on 30 Sep 2016
Edited: Kenny on 1 Oct 2016
Hi Stephen! Thanks for going out of your way to help me as well. The example that you gave is truly excellent. Thanks very much for showing this. The regexp function is so powerful, and it helps a great deal when you and S.S. add clear, understandable examples. When I first looked at the patterns in the built-in examples, they didn't come with explanations that let readers follow along and understand them. Thanks for mentioning XOR, and for the bonus link too! Best regards, and thanks a lot again. Kenny


Accepted Answer

Star Strider on 30 Sep 2016
Edited: Star Strider on 30 Sep 2016
When I’ve used the ‘|’ (‘or’) operator, I’ve used it to match either of the two (or more) sub-expressions in the expression string. In this instance, if it detects a comma between the two words, it labels the first word as the last name and the second word as the first name. If it does not detect a comma, it does the reverse. The presence or absence of a comma in the target string determines which sub-expression returns the result: a target string with a comma returns an empty value for the sub-expression without a comma, and a target string without a comma returns an empty value for the sub-expression with one.
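To see which alternative handled each name, the pattern from the question can be run as-is and the result inspected. This is a small sketch built around the question's own str and pat; the loop and the variable name names are added here for illustration only.

```matlab
str = 'John Davis; Rogers, James';
pat = '(?<first>\w+)\s+(?<last>\w+)|(?<last>\w+),\s+(?<first>\w+)';

% With the 'names' option, regexp returns a struct array with one
% element per match and one field per named token.
names = regexp(str, pat, 'names');

for k = 1:numel(names)
    fprintf('match %d: first = %s, last = %s\n', ...
            k, names(k).first, names(k).last);
end
% match 1: first = John, last = Davis     (left-hand alternative)
% match 2: first = James, last = Rogers   (right-hand alternative)
```

Both names end up in the same first and last fields even though they are written in opposite orders, because each match is produced by whichever side of the | fits that part of the string.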
If you want to see how this works in practice, try it with only one sub-expression (and without the ‘|’ operator). That’s the easiest (and most instructive) way to see how a particular syntax works.
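The suggestion above, trying each sub-expression on its own, could look like the following sketch. It reuses the string from the question; the variable names a and b are placeholders chosen here.

```matlab
str = 'John Davis; Rogers, James';

% Left-hand alternative only: "first last", separated by whitespace.
a = regexp(str, '(?<first>\w+)\s+(?<last>\w+)', 'names');

% Right-hand alternative only: "last, first", separated by a comma.
b = regexp(str, '(?<last>\w+),\s+(?<first>\w+)', 'names');

fprintf('left only:  %d match, first = %s, last = %s\n', ...
        numel(a), a.first, a.last);
fprintf('right only: %d match, first = %s, last = %s\n', ...
        numel(b), b.first, b.last);
```

Each sub-expression alone finds only one of the two names (the left one finds John Davis, the right one finds Rogers, James). Joining them with | lets a single call pick up both.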
EDIT Clarified an ambiguity in the original.
  Comments: 2
Kenny on 30 Sep 2016
Thanks so much for your help and time, S.S.! That helped me tremendously. Genuinely appreciated.
Star Strider on 30 Sep 2016
As always, my pleasure!
