How to capture an optional expression using regular expressions?

Patrick Mboma on 23 Sep 2015
Commented: Walter Roberson on 25 Sep 2015
Dear all,
I would like to use regular expressions to capture and transform expressions of the form
name
or
name(string,digits)
where name belongs to a list of NAMES and digits is an integer: 1, 2, 3,...
That is, "name" is optionally followed by
  1. an opening parenthesis: (
  2. a string
  3. a comma, then an integer: 1, 2, 3, ...
  4. a closing parenthesis: )
For that purpose I wrote the following regular expression, which does not work:
expr='(\w+)((\w+),(\d+))?'
replace='${convertMe($1,$3,$4)}';
result=regexprep(cellarray,expr,replace)
I have written a convertMe function taking 3 inputs, but only the first input gets through. For the other two, the function receives the literal text $3 and $4 instead of the second string and the digits.
Any suggestions?

Answers (2)

Walter Roberson on 23 Sep 2015
For the longer case:
expr='(\w+)(?:\(\w+),\(d+)\))'
replace='${convertMe($1,$2,$3)}'
For the case with no argument supplied, it is not clear what you would like passed to convertMe or if you want convertMe to be called at all.
  Comments: 2
Patrick Mboma on 23 Sep 2015
Hi Walter, thanks for the answer. It seems to me that what you suggest has some problems. I tried to correct your suggestion as follows:
expr='(\w+)(?:\((\w+),(\d+)\))'
But it still doesn't work. I do capture the first \w+, but not the second one and not the \d+. The second and third arguments received by convertMe are the literal text $2 and $3.
In the short case, I would like to capture only the first \w+. I would be fine if, in that case, convertMe received $2 and $3 as its second and third input arguments.
Walter Roberson on 25 Sep 2015
Odd. I had it working the other day, but now it doesn't.
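One way to narrow this down is to test the pattern on its own with regexp, leaving the dynamic replacement out of the picture. The snippet below is only a sketch: the two sample inputs, the ^ and $ anchors, and the \s* after the comma are assumptions added for illustration and are not part of the expressions posted above.

```matlab
% \( and \) match literal parentheses. The (?: ... )? wrapper makes the whole
% argument list optional without creating a token of its own, so the tokens
% stay numbered 1 (name), 2 (string), 3 (digits).
expr = '^(\w+)(?:\((\w+),\s*(\d+)\))?$';

% Assumed sample inputs, one per form described in the question
longTok  = regexp('name(abc,12)', expr, 'tokens', 'once');
shortTok = regexp('name',         expr, 'tokens', 'once');

disp(longTok)    % tokens captured for the long form
disp(shortTok)   % tokens captured for the short form
```

If the tokens look right here, the pattern itself is fine and whatever is left to debug is in how the replacement string hands $2 and $3 to convertMe.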

Cedric on 24 Sep 2015
Another option is to parse all entries first, and then to rebuild relevant expressions:
entries = {'name1(John, 48)', 'name2', 'name3(Doo)'} ;
tokens = regexp( entries, '([\w\d_-]+)\(?(\w+)?,?\s*(\d+)?', 'tokens', 'once' ) ;
parsed = vertcat( tokens{:} ) ;
With that you get
>> parsed
parsed =
'name1' 'John' '48'
'name2' '' ''
'name3' 'Doo' ''
which is easy to post-process for building whatever you need.
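As a rough illustration of that post-processing step, the sketch below feeds each parsed row to a converter. The anonymous convertMe here is a hypothetical stand-in (the real function was not posted), and it builds its output by concatenation so that empty tokens are simply carried along as empty text.

```matlab
entries = {'name1(John, 48)', 'name2', 'name3(Doo)'};
tokens  = regexp(entries, '([\w\d_-]+)\(?(\w+)?,?\s*(\d+)?', 'tokens', 'once');
parsed  = vertcat(tokens{:});   % one row per entry: name, string, digits

% Hypothetical stand-in for the poster's convertMe: three char inputs in,
% one char output. Replace with the real function.
convertMe = @(name, str, num) [upper(name) '<' str '|' num '>'];

% Call the converter once per row; tokens that did not match arrive as ''.
result = cellfun(convertMe, parsed(:,1), parsed(:,2), parsed(:,3), ...
                 'UniformOutput', false);

disp(result)
```

Because every row goes through the same call, the short form needs no special handling: convertMe just sees empty second and third arguments and can decide for itself what to do with them.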
