MATLAB Answers

Compare two lists of strings line by line for a match and summarize results.

Jon Thornburg 29 Mar 2021
Edited: dpb 7 Apr 2021 19:31
I have a script that works with the test data files.
It breaks down 2 input files of different sizes into strings, then compares them line by line for a match.
How do I add the results from each iteration to a summary vector?
How do I identify the lines that match?
In the test data the match is str_chk(8) and str_readin(8), which would be line #162 in the results.
For the test data I have been copying and pasting the command window result into an Excel file.
%AutoBOM
clear
%input search list
cd 'C:\Desktop\AutoBOM'
[n,t,r]=xlsread('readin.csv');
%convert it to a string
str_readin=string(r);
str_readin=lower(str_readin);
%input library: list of known words to compare to
[a,b,l]=xlsread('check_list.csv');
%convert it to a string
str_chk=string(l);
str_chk=lower(str_chk);
%Compare line by line for a match
for i=1:numel(str_chk)
    for j=1:numel(str_readin)
        if str_chk(i) == str_readin(j)
            disp('Match')
            % How do I add the result to a new vector?
            % How do I identify the lines that match?
        else
            disp('No')
            %add displayed word to the same vector
        end
    end
end

Accepted Answer

dpb 29 Mar 2021
Edited: dpb 1 Apr 2021
You don't need to loop explicitly; MATLAB has built-in functions that do that for you.
readin=lower(string(textread('readin.csv','%s','delimiter','\n')));
check=lower(string(textread('check_list.csv','%s','delimiter','\n')));
[ia,locb]=ismember(readin,check);
gives you
>> find(ia)
ans =
8
>> readin(ia)
ans =
"joker"
>> check(locb~=0)
ans =
"joker"
>>
So, depending upon what it is you need to return, ia is a logical array marking which elements of the read-in string array were found in the check string array, and locb holds, for each of those elements, the location of its match in the check array (the lowest index if there is more than one match, and 0 where there is none).
See the doc for ismember for the full details on input/output arguments.
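To cover the "summary vector" part of the question, the two outputs can be turned into a per-line summary. This is only a minimal sketch: the sample strings stand in for the contents of the two CSV files, and the names summary and matchLines are made up for illustration.

```matlab
% Hypothetical sample data standing in for readin.csv and check_list.csv
readin = lower(["Batman"; "Joker"; "Robin"]);
check  = lower(["Riddler"; "Joker"; "Penguin"]);

% ia(k) is true when readin(k) occurs in check;
% locb(k) is the lowest matching index in check, or 0 if there is none
[ia, locb] = ismember(readin, check);

% Summary vector: one entry per line of readin
summary = repmat("No", size(readin));
summary(ia) = "Match";

% Line numbers of the matches in each list, plus the matching text
matchLines = table(find(ia), locb(ia), readin(ia), ...
    'VariableNames', {'readinLine', 'checkLine', 'text'});

disp(summary)
disp(matchLines)
```

A table like matchLines could then be written out with writetable instead of copying results from the command window, if that suits the workflow.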
ADDENDUM:
>> readin=[readin;"2"]; % add another element that is duplicated in check
[ia,locb]=ismember(readin,check);
found=readin(ia);
% illustrate what we get...
>> found
found =
2×1 string array
"joker"
"2"
>>
nFound=numel(found);
locsInCheck=cell(nFound,1); % preallocate cell array for locations
for i=1:nFound
locsInCheck(i)={find(found(i)==check)};
end
% show what we gots...
>> locsInCheck
locsInCheck =
1×2 cell array
{[8.00]} {6×1 double}
>>
You've now got each string that was found in the check array, along with every location where it occurs there.
Only the second loop is needed, to find all the locations for each string; the first loop is inside the much more efficient built-in ismember function, which does the hard work.
If you wanted, you could replace the explicit for...end loop above with an arrayfun construct (arrayfun rather than cellfun, because found is a string array, not a cell array)
locsInCheck=arrayfun(@(s) find(s==check),found,'UniformOutput',false);
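For anyone who wants to try the one-line form without the data files, here is a self-contained sketch on made-up data. It uses arrayfun because found is a string array rather than a cell array; the sample values and the name nOccur are assumptions for illustration only.

```matlab
% Hypothetical data: "2" occurs three times in check, "joker" once
check = ["2"; "ace"; "2"; "joker"; "2"];
found = ["joker"; "2"];

% For each found string, list every index in check where it occurs
locsInCheck = arrayfun(@(s) find(s == check), found, 'UniformOutput', false);

% Count the occurrences of each found string
nOccur = cellfun(@numel, locsInCheck);
```

Here locsInCheck is a cell array with one cell per found string, so entries with different numbers of matches can sit side by side.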
Comments: 9
Adam Danz 7 Apr 2021 18:54
I can see how the additional ismember output would be helpful.
This doesn't really avoid loops, but it works for string arrays that aren't too long:
A = ["A" "B" "C" "B"];
B = ["B" "B" "C" "B" "A"];
[row, col] = find(A(:)==B(:).');
arrayfun(@(i){B(col(row==i))}, 1:numel(A))
ans = 1×4 cell array
{["A"]} {["B" "B" "B"]} {["C"]} {["B" "B" "B"]}
% Or, to return the indices into B instead of the strings,
% arrayfun(@(i){col(row==i)}, 1:numel(A))
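The core of the snippet above is the pairwise comparison A(:)==B(:).', which relies on implicit expansion to build a logical matrix with one row per element of A and one column per element of B. A small sketch of what that matrix holds, reusing the same A and B; the names M, nMatches and isFound are made up here.

```matlab
A = ["A" "B" "C" "B"];
B = ["B" "B" "C" "B" "A"];

% M(i,j) is true when A(i) equals B(j); size is numel(A)-by-numel(B)
M = A(:) == B(:).';

% Number of matches in B for each element of A
nMatches = sum(M, 2).';

% Elements of A that occur anywhere in B (same result as ismember(A,B))
isFound = any(M, 2).';
```

Since M has numel(A)*numel(B) elements, memory use grows quickly with the list lengths, which is why this suits string arrays that aren't too long.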

Release

R2018b
