Looking for an alternative to regexp.

Bob Thompson on 23 Mar 2021
Edited: Stephen23 on 25 Mar 2021
I'm looking for an alternative way to parse through strings to find bits of information, or for a way to use regexp that doesn't give me nested cells. I'm tired of dealing with the nested cells.
I've got a string that contains node numbers and locations. I would like to capture all of the node numbers, and then put them into a double array. I can identify and extract the numbers with regexp, but any time I use regexp with tokens I end up with cells inside of cells for a reason that I don't entirely understand. Am I doing something to create the extra layer of cells, or is there another command that can parse and extract the information I want?
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');
The nodes variable will contain a 1x5 cell array, where each cell contains a 1x1 cell, which in turn contains the node number as a character vector.
2 Comments
Stephen23 on 24 Mar 2021
Edited: Stephen23 on 25 Mar 2021
Tokens are always returned in a cell array whose size equals the number of tokens (scalar in your case, because you specified only one token). If multiple matches are enabled (the default), then every output is additionally nested in a cell array whose size equals the number of matches, so you get nested cell arrays of tokens.
FYI, if you only need to match the regular expression exactly once, then you can specify the 'once' option and the outputs are not nested in cell arrays. This does not apply to your example, but is useful in other cases.
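As a minimal sketch of the difference, using the string from the question (the variable names allTokens, firstNode, oneToken and nodeNumber are only illustrative), the default call and the 'once' call can be compared like this:

```matlab
% Sample text from the question, repeated so the pattern matches five times.
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);

% Default (all matches): the outer cell has one element per match,
% and each element is itself a cell holding that match's tokens.
allTokens = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');
firstNode = allTokens{1}{1};   % two levels of indexing to reach the text

% With 'once': only the first match is returned, so the outer layer
% is absent and the result is just the cell of tokens for that match.
oneToken = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens','once');
nodeNumber = str2double(oneToken{1});
```

Because 'once' stops at the first match, it only helps when a single match is wanted; for the five matches here the remaining four would be discarded.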
As well as concatenating the output data or using named tokens as the answers below show, you can also use a look-behind assertion and return the matched string (no nested cell arrays), which makes post-processing much simpler:
nodes = regexp(repeatstrings,'(?<=Nearestnodeis)\d+','match')
nodes = 1×5 cell array
{'7736664'} {'7736664'} {'7736664'} {'7736664'} {'7736664'}
vec = str2double(nodes)
vec = 1×5
7736664 7736664 7736664 7736664 7736664
Bob Thompson on 24 Mar 2021
Thanks, I definitely think this is smoother than what I usually attempt.


Answers (2)

Star Strider on 23 Mar 2021
See if adding either:
Out = cell2mat([nodes{:}].')
or:
Out = str2num(cell2mat([nodes{:}].'))
to the posted code provides the desired result.
Note that str2num is not generally recommended (it relies on eval); however, it works when str2double produces an unacceptable result.
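For reference, here is a small sketch of what the [nodes{:}] step does to the nested output from the question. It is one possible variation rather than the code posted above, and the names flat and vec are only illustrative. Concatenating the inner 1x1 cells removes one level of nesting, after which str2double can be applied to the cell array directly:

```matlab
% Reproduce the nested output from the question.
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(\d+)','tokens');

% nodes{:} expands to five 1x1 cells; the square brackets concatenate
% them into a single 1x5 cell array of character vectors.
flat = [nodes{:}];

% str2double accepts a cell array of character vectors and returns
% a numeric array of the same size.
vec = str2double(flat);
```

This keeps the result as a row vector, whereas the cell2mat route first builds a character matrix with one row per match.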

Walter Roberson on 23 Mar 2021
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);
nodes = regexp(repeatstrings,'Nearestnodeis(?<NN>\d+)','names');
str2double({nodes.NN})
ans = 1×5
7736664 7736664 7736664 7736664 7736664
3 Comments
Walter Roberson on 23 Mar 2021
(?<WORD>PATTERN)
creates a named token; whatever is matched by PATTERN gets stored in a struct field named WORD, as text. But even though it is called a "named token", oddly enough to get back the struct, you have to ask for "names" instead of for "tokens".
You get back a struct array, with one entry for each time the overall pattern matches -- in this case one for each time Nearestnodeis is followed by a sequence of digits. So a 1 x 5 struct array in this case, each entry with a field named as indicated, NN. As usual with struct arrays, you can pull out all of the entries using struct expansion inside a {}, creating a cell array of character vectors, and then you can convert them all at once using str2double() on the cell array.
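As a sketch of how this extends to more than one field, the pattern below adds a second named token for the distance that follows each node number. The field name dist, the character class used for the floating-point text, and the variable names are assumptions made for this illustration and are not part of the original answer:

```matlab
% Sample text from the question, repeated five times.
singlestring = 'nxyzs=74xyz[0]:-2.0447000e+010.0000000e+001.8288000e+00Nearestnodeis7736664atadistanceof4.6823094e-03locatedat-2.0451682e+012.2396341e-161.8288000e+00';
repeatstrings = repmat(singlestring,1,5);

% Two named tokens: NN for the node number and dist (hypothetical name)
% for the distance text between 'atadistanceof' and 'locatedat'.
pattern = 'Nearestnodeis(?<NN>\d+)atadistanceof(?<dist>[\d.eE+-]+)locatedat';
hits = regexp(repeatstrings,pattern,'names');

% Struct expansion inside {} gives one cell array per field,
% which str2double converts in a single call.
nodeIDs   = str2double({hits.NN});
distances = str2double({hits.dist});
```

Each match contributes one struct entry, so both fields stay aligned by index without any nested cell arrays.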
Bob Thompson on 24 Mar 2021
Thanks for the explanation. I do like structures better than cells, most of the time.
