Need help with regexpi expression for multiple variants of the same phrase
I have a question about using regexpi to determine whether certain phrases appear in a text file. The text files were created by multiple individuals, who used slightly different phrasing for the same variable. For example, in a text file containing a gait evaluation, a slow cadence may be recorded as either 'slow cadence' or 'slow stepping'. My original code was as follows:
data=fileread('Test.txt');
A=isempty(regexpi(data,{'slow cadence','slow stepping'}));
However, this version can return a false positive, as it seems to mix and match the strings within the {}. For example, the following code for the same file returns 0 from isempty even though neither phrase matches completely:
data=fileread('Test.txt');
A=isempty(regexpi(data,{'fast cadence','slow stepping'}));
I feel like I am missing a simple command to indicate that A can be 'slow cadence' OR 'slow stepping'. Any help is much appreciated.
Answers (2)
Stephen23
14 Dec 2022
Edited: Stephen23
14 Dec 2022
Use alternation: in the expression below, (cadence|stepping) matches either word after 'slow '. You will probably also find the 'once' option very useful (here I inverted the logical output, because true = contains is usually much simpler to work with than the head-spinning true = does not contain):
str = fileread('Test.txt');
idx = ~isempty(regexpi(str, 'slow (cadence|stepping)','once'))
Using regular expressions requires reading the documentation again and again and again and again and again... it takes quite a while to get proficient and comfortable using them. Also, make sure you read the documentation.
You might also find my interactive tool useful for helping to develop regular expressions:
I should also mention that if you want to use regular expressions, then you need to read the documentation. A lot.
PS: Another approach using the newer CONTAINS and patterns:
pat = regexpPattern('slow (cadence|stepping)');
idx = contains(str,pat, 'ignorecase',true)
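As a rough illustration of why the cell-array version in the question reports a match every time: when the expression is a cell array, regexpi returns a cell array with one result per expression, and isempty of a 1x2 cell is false whatever it contains. The sketch below uses a made-up sample string (an assumption, standing in for the contents of Test.txt) and the variable names are illustrative only.

```matlab
% Hypothetical sample text standing in for the contents of Test.txt.
str = 'Patient walks with Slow Stepping and a narrow base.';

% A cell array of expressions yields one result per expression,
% so hits is a 1x2 cell array.
hits = regexpi(str, {'fast cadence','slow stepping'});

% isempty() on a 1x2 cell is false regardless of what matched.
cellIsEmpty = isempty(hits);

% Testing each expression separately gives one logical per expression.
perPattern = ~cellfun(@isempty, hits);
anyMatch = any(perPattern);

% Single-expression form using alternation, as in the answer above.
altMatch = ~isempty(regexpi(str, 'slow (cadence|stepping)', 'once'));
```

Here perPattern shows which of the two phrases actually occurred, while altMatch collapses the check into a single true/false value.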
Fifteen12
14 Dec 2022
I think you want to use alternation (the | operator) in a regular expression. Try this:
A=isempty(regexpi(data,'(slow cadence|slow stepping)'));
You'll probably want to make the matching more flexible as well, using wildcards to substitute for white space, etc. You can find more here
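One possible way to handle variable white space is the \s+ token, which matches one or more whitespace characters (spaces, tabs, or line breaks). This is only a sketch: the sample text is invented for illustration, and the real files may need a different expression.

```matlab
% Hypothetical sample text with irregular spacing and a line break.
str = sprintf('Gait: SLOW   cadence noted.\nAlso slow\nstepping on turns.');

% \s+ allows any run of whitespace between the two words;
% the alternation accepts either 'cadence' or 'stepping'.
expr = 'slow\s+(cadence|stepping)';

% 'match' returns the matched text; regexpi ignores letter case.
found = regexpi(str, expr, 'match');
tf = ~isempty(found);
```

Inspecting found is a quick way to check which variants the expression actually picked up before reducing the result to a logical value.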