Strfind to contain complex pattern
I have written the following program to split a text into sentences. I want to keep only the sentences that begin with a vowel.
a = 'John played volleyball. I love Anna. Are you there?'
b = strfind(a,'.')
y{1} = a(1:b(1))
for i=2:length(b)
   y{i} = a(b(i-1)+1:b(i)) 
end
y{i+1} = a((b(i)+1):end)
Accepted Answer
Cedric on 3 Jan 2018
Edited: Cedric on 3 Jan 2018
      Here is an approach:
 str    = 'John played volleyball. I love Anna. Are you there?' ;
 buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;
 for k = numel( buffer ) : -1 : 1
    if isempty( buffer{k} ) || ~any( upper(buffer{k}(1)) == 'AEIOUY' )
       buffer(k) = [] ;
    end
 end
Running this, you get:
 >> buffer
 buffer =
  1×2 cell array
    {'I love Anna'}    {'Are you there'}
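For comparison, here is a loop-free sketch of the same filter. It is not part of the approach above, just one possible variant: it builds a logical mask with CELLFUN and deletes everything in one step, so no element moves while the array is being scanned. The names `startsWithVowel` and `keep` are made up for this illustration.

```matlab
str    = 'John played volleyball. I love Anna. Are you there?' ;
buffer = strtrim( strsplit( str, {'.', '?', '!'} )) ;

% Hypothetical helper: true if s is non-empty and its first character is a vowel
% (Y is counted as a vowel, as in the loop version).
startsWithVowel = @(s) ~isempty( s ) && any( upper( s(1) ) == 'AEIOUY' ) ;

keep   = cellfun( startsWithVowel, buffer ) ;   % one logical flag per sentence
buffer = buffer(keep) ;                         % keep only the flagged entries
```

The short-circuit `&&` matters here: the empty string that STRSPLIT returns after the final '?' is rejected before `s(1)` is ever evaluated.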
If you don't understand, evaluate the following expressions one at a time and analyze their output:
 strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} )
 buffer = strtrim( strsplit( 'John played volleyball. I love Anna. Are you there?', {'.', '?', '!'} ))
 upper(buffer{1}(1))
 upper(buffer{1}(1)) == 'AEIOUY'
 any( upper(buffer{1}(1)) == 'AEIOUY' )
 upper(buffer{2}(1))
 upper(buffer{2}(1)) == 'AEIOUY'
 any( upper(buffer{2}(1)) == 'AEIOUY' )
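The key step in these expressions is the comparison of a single character with a char array. As a minimal standalone sketch (the variables `c` and `mask` are only for illustration): the scalar on the left is compared with each of the six letters, which yields a 1-by-6 logical array, and ANY reduces it to a single true/false.

```matlab
c    = 'i' ;                       % a single lower-case character
mask = upper( c ) == 'AEIOUY'      % element-wise comparison, true only where 'I' sits
tf   = any( mask )                 % true if at least one comparison matched
```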
PS: this could also be done using regular expressions, but more classic approaches (like the above) should be understood first.
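For reference once the classic version is clear, a regular-expression variant could look roughly like the sketch below. This is one possible way to write it, not the only one; the pattern choices and the names `sentences` and `isVowel` are assumptions made for this example.

```matlab
str       = 'John played volleyball. I love Anna. Are you there?' ;

% Every run of characters that contains no delimiter is one sentence.
sentences = strtrim( regexp( str, '[^.?!]+', 'match' )) ;

% For a cell array input, REGEXP with 'once' returns one cell per sentence,
% holding the start index of the match, or [] when the sentence does not
% begin with a vowel.
isVowel   = ~cellfun( @isempty, regexp( sentences, '^[AEIOUYaeiouy]', 'once' )) ;

buffer    = sentences(isVowel) ;
```

Because the first pattern only matches non-empty runs, there is no trailing empty entry to take care of, unlike with STRSPLIT.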
PS2: your first attempt is actually good. You are in effect implementing STRSPLIT, and it works to some extent; it was good training, but it would have to be extended to support multiple delimiters. If you run the expressions above, you will see that STRSPLIT does the split and outputs a cell array. You may need another approach if you have to keep the delimiters (.?!), though (see PS3). STRSPLIT leaves the leading and trailing spaces in place, which is a problem if you want to test the first character, hence the call to STRTRIM, whose output is a cell array of "clean" sentences. The loop then removes every entry that doesn't start with a vowel. It has to run backwards from the end of the array: deleting an element shifts all later elements down by one, so a forward loop would skip entries and eventually index past the end.
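To illustrate what "extended to support multiple delimiters" could mean for the manual approach, here is a minimal sketch that stays close to the original loop. It swaps STRFIND for ISMEMBER (my choice, not something the question used) so that '.', '?' and '!' are all located at once; the variable `starts` is introduced only for this example.

```matlab
a = 'John played volleyball. I love Anna. Are you there?' ;

b      = find( ismember( a, '.?!' )) ;   % positions of all sentence delimiters
starts = [1, b(1:end-1)+1] ;             % first character of each sentence

y = cell( 1, numel( b )) ;               % pre-allocate one cell per sentence
for i = 1 : numel( b )
   y{i} = strtrim( a(starts(i):b(i)) ) ; % delimiter kept, surrounding spaces removed
end
```

With this input, `y` holds the three sentences including their punctuation. Note that in this sketch any text after the last delimiter is dropped, and the loop simply does not run when the text contains no delimiter.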
PS3: If you need to keep the punctuation, replace the call to STRSPLIT with a call to REGEXP, as follows:
 buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;
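Put together with the vowel test, the punctuation-preserving version might look like the sketch below. The filter is written with CELLFUN here instead of the backward loop, which is just one option; `keep` is a name chosen for this example.

```matlab
str    = 'John played volleyball. I love Anna. Are you there?' ;

% Each match runs up to and including the next '.', '!' or '?'.
buffer = strtrim( regexp( str, '.*?[\.!\?]', 'match' )) ;

% Every match contains at least its punctuation mark, so s(1) always exists.
keep   = cellfun( @(s) any( upper( s(1) ) == 'AEIOUY' ), buffer ) ;
buffer = buffer(keep) ;
```

One thing to keep in mind: with this pattern, text that follows the last punctuation mark is not matched and therefore never appears in `buffer`.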