Problems finding exact match for a string

J M on 23 Aug 2017
Commented: J M on 24 Aug 2017
Hello,
I would like to match a string against a list of other strings (a cell array). My problem is that regexpi and regexp also report a hit when the string occurs only as a substring of a list entry. For example:
GN = {'EC2020'}; When searching a cell array of strings, GN is also found in the string 'EC2020.1'.
I do not want it to match this entry, because that string stands for something different. I simply want an exact match of EC2020, without EC2020.1 counting as a hit. Here is the example in more detail:
GN = {'EC2020'};
List = {'EC1919'; 'EC2020'; 'EC2020.1'};
dList = length(List(:,1));
pds = cell(1,dList);   % preallocate one slot per list entry
j = 1;
for i = 1:dList
    % regexpi matches substrings, so 'EC2020.1' also counts as a hit here
    c = find(~cellfun(@isempty,regexpi(List(i,:),GN)));
    if ~isempty(c)
        pds(j) = List(i,1);
        j = j + 1;
    end
end
Any help would be greatly appreciated!

Accepted Answer

Jan on 24 Aug 2017
Edited: Jan on 24 Aug 2017
What about strcmp:
GN = {'EC2020'};
List = {'EC1919'; 'EC2020'; 'EC2020.1'};
index = find(strcmp(List, GN{1}))
any(strcmp(List, GN{1})) looks much easier than
~isempty(find(~cellfun(@isempty,regexpi(List(i,:),GN))))
Your output pds is preallocated to the same length as List, and it receives the searched string once for each time it is found. This can be done without a loop by:
pds = List(strcmp(List, GN{1}));
Notes:
  • length(List(:,1)) wastes time by creating the vector List(:,1). Prefer: size(List, 1)
  • cellfun('isempty', ...) is faster than cellfun(@isempty, ...)
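The difference between the substring search and the whole-string comparison can be sketched as follows. This is only an illustration: the extra lowercase entry 'ec2020' and the variable names substringHits, exactHits and exactHitsAnyCase are assumptions added here, not part of the original question. strcmpi is shown because the question used the case-insensitive regexpi.

```matlab
GN = {'EC2020'};
% 'ec2020' is an assumed extra entry, added only to show the case handling.
List = {'EC1919'; 'EC2020'; 'EC2020.1'; 'ec2020'};

% regexpi finds the pattern anywhere in each string, ignoring case,
% so 'EC2020.1' and 'ec2020' are both reported as hits.
substringHits = ~cellfun('isempty', regexpi(List, GN{1}));

% strcmp compares whole strings and is case-sensitive.
exactHits = strcmp(List, GN{1});

% strcmpi compares whole strings but ignores case.
exactHitsAnyCase = strcmpi(List, GN{1});
```

Which of strcmp and strcmpi fits depends on whether 'ec2020' should count as the same identifier as 'EC2020'.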
  1 Comment
J M on 24 Aug 2017
It worked and was very simple, thank you!


More Answers (3)

Walter Roberson on 24 Aug 2017
Because of your decimal points, you should be using regexptranslate() to prepare your target strings.
You can force exact matches of complete strings by putting '^' before the prepared string and '$' after it. However, if that is your purpose, you should consider using strcmp as Jan shows.
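A minimal sketch of why the escaping matters when anchoring with '^' and '$'. The entry 'EC2020x1' and the names target, rawPattern, safePattern, rawHits and safeHits are hypothetical and chosen only for illustration.

```matlab
% 'EC2020x1' is an assumed entry, added to show what an unescaped '.' does.
List = {'EC1919'; 'EC2020'; 'EC2020.1'; 'EC2020x1'};
target = 'EC2020.1';

% Unescaped, '.' matches any single character, so 'EC2020x1' is a hit too.
rawPattern = ['^', target, '$'];
rawHits = ~cellfun('isempty', regexp(List, rawPattern, 'once'));

% regexptranslate('escape', ...) turns '.' into '\.', which matches literally.
safePattern = ['^', regexptranslate('escape', target), '$'];
safeHits = ~cellfun('isempty', regexp(List, safePattern, 'once'));
```

With the escaped pattern only 'EC2020.1' itself is matched, which is the same result strcmp(List, target) would give.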
If you are looking for a match with any one of a number of strings, then you should consider using ismember(). If you need something more complicated, then you can consider
temp = regexptranslate('escape', GN);
pattern = [ '^(', strjoin(temp, '|'), ')$' ];
match_information = regexp(List, pattern, 'match');
Or perhaps
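temp = regexptranslate('escape', GN);
pattern = [ '^(?<word>', strjoin(temp, '|'), ')$' ];
% For a cell array input, regexp returns one struct per element of List
match_struct = regexp(List, pattern, 'names');
found = ~cellfun('isempty', match_struct);
cellfun(@(s) s.word, match_struct(found), 'UniformOutput', false)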
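The ismember() suggestion mentioned above could look roughly like this. It is a sketch using the List from the question and a GN with several entries taken from elsewhere in this thread; the output names inList and posInGN are assumptions.

```matlab
List = {'EC1919'; 'EC2020'; 'EC2020.1'};
GN = {'EC2020', 'EC3456', 'EC1919'};

% inList(k) is true when List{k} is exactly equal to one of the strings in GN.
% posInGN(k) is the index of that string in GN, or 0 when there is none.
[inList, posInGN] = ismember(List, GN);

% Keep only the entries of List that have an exact match in GN.
pds = List(inList);
```

Because ismember compares whole strings, no escaping of the decimal points is needed here.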
  1 Comment
J M on 24 Aug 2017
Thank you, I will try this one as well to learn from it. Much appreciated.



John BG on 24 Aug 2017
Hi J M
From the additional information in your comment:
"for a much larger set of info ... The GN variable represents a list of values ... that includes both decimal values and non-decimal values."
For a list of search values like this, a simple way to obtain the exactly matching strings is intersect:
List={'EC1919';'EC2020';'EC2020.1';'EC1919.21';'EC1234'};
GN={'EC2020','EC3456','EC1919'};
A=intersect(List,GN)
A =
2×1 cell array
'EC1919'
'EC2020'
John BG
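Note that intersect returns its result sorted and without duplicates. If the positions in List or the original order matter, something along these lines may help. It is a sketch only: the first two entries of List are swapped relative to the example above so that the ordering difference is visible, and the names iList and B are assumptions.

```matlab
% Assumed variation of the example: 'EC2020' now comes before 'EC1919'.
List = {'EC2020'; 'EC1919'; 'EC2020.1'; 'EC1919.21'; 'EC1234'};
GN = {'EC2020', 'EC3456', 'EC1919'};

% Default behaviour: sorted result; iList holds the positions of the
% matches in List, so that A is the same as List(iList).
[A, iList] = intersect(List, GN);

% The 'stable' option keeps the matches in the order they appear in List.
B = intersect(List, GN, 'stable');
```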
  1 Comment
J M on 24 Aug 2017
I will also try this out. Thanks so much for the reply and suggestion.



Image Analyst on 23 Aug 2017
In addition to the regexp match, check the lengths of the strings. For an exact match, the lengths must be equal.
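One way to read this suggestion, as a sketch only: combine the substring search from the question with a length comparison. The names found, sameLength and pds are used here for illustration, and the approach assumes the search term contains no regular-expression metacharacters.

```matlab
GN = {'EC2020'};
List = {'EC1919'; 'EC2020'; 'EC2020.1'};

% Substring hits from regexpi, as in the loop from the question.
found = ~cellfun('isempty', regexpi(List, GN{1}));

% Keep a hit only when the string is exactly as long as the search term.
sameLength = cellfun('length', List) == numel(GN{1});
pds = List(found & sameLength);
```

A substring hit that has the same length as the search term must cover the whole string, which is why the two tests together give an exact match.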
  1 Comment
J M on 23 Aug 2017
Thanks for the reply. I would, but this is actually for a much larger set of info, which I can't just limit by size. The GN variable represents a list of values (which I've put into a loop) that includes both decimal values and non-decimal values.
