Strfind to contain complex pattern
I have created the following program to split a text into sentences. I want to keep only the sentences that begin with a vowel.
a = 'John played volleyball. I love Anna. Are you there?'
b = strfind(a,'.')
y{1} = a(1:b(1))
for i=2:length(b)
    y{i} = a(b(i-1)+1:b(i))
end
y{i+1} = a((b(i)+1):end)
Accepted Answer
Cedric
3 Jan 2018
Edited: Cedric
3 Jan 2018
Here is an approach:
str = 'John played volleyball. I love Anna. Are you there?' ;
buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;
for k = numel( buffer ) : -1 : 1
    if isempty( buffer{k} ) || ~any( upper(buffer{k}(1)) == 'AEIOUY' )
        buffer(k) = [] ;
    end
end
Running this, you get:
>> buffer
buffer =
1×2 cell array
{'I love Anna'} {'Are you there'}
If you don't understand, evaluate the following expressions one at a time and analyze their output:
strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} )
buffer = strtrim( strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} ))
upper(buffer{1}(1))
upper(buffer{1}(1)) == 'AEIOUY'
any( upper(buffer{1}(1)) == 'AEIOUY' )
upper(buffer{2}(1))
upper(buffer{2}(1)) == 'AEIOUY'
any( upper(buffer{2}(1)) == 'AEIOUY' )
PS: this could also be done using regular expressions, but more classic approaches (like the above) should be understood first.
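For reference, here is a minimal sketch of what a regular-expression version might look like. It is only one possible pattern, not the only way to write it. It assumes that a sentence starts either at the beginning of the text or right after one of the delimiters (.?!), and it treats Y as a vowel, as in the loop above. The variable names tokens and sentences are just illustrative.

```matlab
str = 'John played volleyball. I love Anna. Are you there?' ;

% (?:^|[.!?])  start of the text, or a sentence delimiter (not captured)
% \s*          optional blanks between the delimiter and the sentence
% ( ... )      captured token: a vowel followed by everything up to the
%              next delimiter (the delimiter itself is not included)
tokens = regexp( str, '(?:^|[.!?])\s*([AEIOUYaeiouy][^.!?]*)', 'tokens' ) ;

% REGEXP returns one cell per match, each holding the captured token;
% concatenate them into a flat cell array of sentences.
sentences = [tokens{:}] ;
```

With the example text, sentences should hold 'I love Anna' and 'Are you there'. The pattern is short but harder to read than the loop, which is why the classic approach is worth understanding first.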
PS2: Your first attempt is actually good. You were effectively implementing STRSPLIT, and it works to some extent; it was good training, but it would have to be extended to support multiple delimiters (it only splits on '.'). If you run the expressions above, you will see that STRSPLIT performs the split and outputs a cell array. You may have to use another approach if you need to keep the delimiters (.?!), though (see PS3). STRSPLIT leaves leading and trailing spaces in place, which is a problem if you want to test the first character, hence the call to STRTRIM. The output of STRTRIM is a cell array of "clean" sentences. The loop then removes every entry that does not start with a vowel. It has to run backwards from the end of the array, because deleting an element shifts the indices of all the elements after it.
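If the backwards loop feels awkward, the same filtering can be sketched with logical indexing, which avoids the shifting-index issue because nothing is deleted while iterating. This is an alternative to the loop above, not a replacement for understanding it; the variable names isEmpty and startsWithVowel are just illustrative.

```matlab
str = 'John played volleyball. I love Anna. Are you there?' ;
buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;

% Drop empty pieces first (STRSPLIT returns one after the final '?'),
% so that indexing the first character below cannot fail.
isEmpty = cellfun( @isempty, buffer ) ;
buffer = buffer(~isEmpty) ;

% Build a logical mask: true where the first character is a vowel.
startsWithVowel = cellfun( @(s) any( upper(s(1)) == 'AEIOUY' ), buffer ) ;

% Keep only the flagged sentences in a single indexing operation.
buffer = buffer(startsWithVowel) ;
```

This should leave the same two sentences in buffer as the loop version.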
PS3: If you need to keep the punctuation, replace the call to STRSPLIT with a call to REGEXP, as follows:
buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;
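As a sketch of how this fits with the rest, the vowel test can be applied to the output of REGEXP in the same way as before. The pattern only matches text that ends with a delimiter, so every match is non-empty and no ISEMPTY check is needed here. The variable name keep is just illustrative.

```matlab
str = 'John played volleyball. I love Anna. Are you there?' ;

% Each match runs up to and including the next delimiter.
buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;

% Keep the sentences whose first character is a vowel (Y included).
keep = cellfun( @(s) any( upper(s(1)) == 'AEIOUY' ), buffer ) ;
buffer = buffer(keep) ;
```

With the example text, buffer should hold 'I love Anna.' and 'Are you there?', punctuation included. Note that any trailing text without a final delimiter would not be matched by this pattern.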