How do you do regular expressions at the character level?
Hello all,
I am trying to find words in a text with a set of rules and then extract them. I am looking for words with a certain structure. The words themselves have different lengths and letters.
For example:
term_1 = "TER";
term_2 = "ZTnE";
term_3 = "ZEnP";
...
Since I have a lot of terms, I tried to create a pattern with character-level rules. To do this, I split up the terms and always looked to see which character could occur at which position in the string.
For the simple example above:
1st position:
seg_1 = '[TZ]'
2nd position:
seg_2 = '[ET]'
3rd position:
seg_3 = '[nR]'
4th position:
seg_4 = '[EP]'
seg = seg_1 + seg_2 + seg_3 + seg_4;
result = extract(term_2, seg)
This now works for a term with the same length, but term_1 is not recognised.
Therefore, I have now made the following adjustment and declared the 4th seg as optionalPattern:
seg_4 = optionalPattern("E" | "P");
With this change the extraction works. However, it now also extracts terms that skip an optional position in the middle.
Does anyone have any other ideas on how I can easily and safely include terms of different lengths?
Thank you very much!
Accepted Answer
Walter Roberson
25 Oct 2023
Edited: Walter Roberson
25 Oct 2023
"+" on character vectors is not a pattern operation: it adds the character codes element by element and returns a numeric array.
seg_1 = '[TZ]'
seg_2 = '[ET]'
seg = seg_1 + seg_2
char(seg)
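To make the difference concrete, here is a minimal sketch of what "+" does on character vectors compared with actual concatenation. The variable names `asChar`, `joined`, `asString`, and `tok` are illustrative and not from the thread, and the double-quoted strings assume R2017a or later.

```matlab
% '+' on char vectors adds character codes element-wise. Both operands
% have 4 characters here, so the sizes are compatible.
asChar = '[TZ]' + '[ET]';        % numeric row vector, not text
disp(asChar)                     % 182 153 174 186

% Concatenate char vectors with square brackets instead.
joined = ['[TZ]' '[ET]'];        % '[TZ][ET]'

% With double-quoted strings, '+' appends.
asString = "[TZ]" + "[ET]";      % "[TZ][ET]"

% Either form can then be used as a regular expression.
tok = regexp('ZTnE', joined, 'match', 'once');   % 'ZT'
```

So the segments from the question would need to be either strings (double quotes) or joined with square brackets before being passed to `regexp`.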
Pattern objects, on the other hand, can be concatenated with "+":
p_1 = characterListPattern('TZ')
p_2 = characterListPattern('ET')
p_1 + p_2
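Building on this, one possible way to handle the terms of different length from the question is sketched below. It assumes R2020b or later (for the pattern functions) and assumes that matches should be whole words, which is why `letterBoundary` is added; `pat`, `txt`, and `found` are illustrative names, and the test text is made up from the question's terms plus one non-term ("ZTE").

```matlab
% One characterListPattern per position (character sets from the question).
p_1 = characterListPattern("TZ");
p_2 = characterListPattern("ET");
p_3 = characterListPattern("nR");
p_4 = characterListPattern("EP");

% Only the tail is optional: position 4 may be absent, but no earlier
% position can be skipped.
pat = p_1 + p_2 + p_3 + optionalPattern(p_4);

% Assumption: only whole words should match, so anchor both ends
% to a letter boundary.
pat = letterBoundary + pat + letterBoundary;

txt = "TER ZTnE ZEnP ZTE";
found = extract(txt, pat)        % "ZTE" should not be among the matches
```

If more than one trailing position needs to be optional, nesting keeps positions from being skipped in the middle, for example `p_1 + p_2 + optionalPattern(p_3 + optionalPattern(p_4))`: position 4 can then only match when position 3 has matched.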
Comments: 4
Walter Roberson
29 Oct 2023
Edited: Walter Roberson
31 Oct 2023
In terms of your original patterns, 'ZT E' would require that seg_3 match a space instead of [nR].
To allow a space in place of one of the characters, include a space inside the [] if you are using regexp:
seg_1 = "[ TZ]"
seg_2 = "[ ET]"
seg_3 = "[ nR]"
seg_4 = "[ EP]";
seg = seg_1 + seg_2 + seg_3 + seg_4;
result = regexp(term_2, seg, 'match');
If you want to include whitespace more generally (such as tab), use \s instead of a literal space inside the [], for example "[\sTZ]".
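As a rough sketch of how this could be combined with terms of different length, the example below allows whitespace only at position 3 and makes position 4 optional with "?". The choice of position 3, the candidate list `terms`, and the names `expr` and `isTerm` are assumptions for illustration; each candidate is tested on its own with "^" and "$" anchors.

```matlab
% Position-wise character classes; \s additionally admits any whitespace
% character (space, tab, ...) at that position.
seg_1 = "[TZ]";
seg_2 = "[ET]";
seg_3 = "[\snR]";      % position 3 may be whitespace, as in 'ZT E'
seg_4 = "[EP]";

% "?" makes only the last position optional, so 3-character terms match.
expr = seg_1 + seg_2 + seg_3 + seg_4 + "?";

% Test isolated candidates; "ZTE" is a made-up non-term.
terms = ["TER", "ZTnE", "ZEnP", "ZT E", "ZTE"];
isTerm = false(size(terms));
for k = 1:numel(terms)
    % With 'once', regexp returns the start index of the match or [].
    isTerm(k) = ~isempty(regexp(terms(k), "^" + expr + "$", 'once'));
end
disp(isTerm)
```

In running text, a whitespace-tolerant class can also bridge two adjacent words, so this form is probably best applied to isolated candidate terms rather than to a whole sentence.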