Comparing common strings in two large string arrays imported from Excel (using xlsread)

I have two very large string arrays (one has 2000 rows and the other 7000 rows) that I import using xlsread, and I want to compare them to see whether they have any common elements. The STRR part is always the same string, so all I need to do is compare the numerical part. The entries in the rows of each column are not repeated.
As output, it is enough to have the row numbers from the left and right columns where a common phrase appears. In this case the output could look like this:
1 4 (orange)
17 1 (yellow)
9 10 (blue)
and so on.
Is it possible to do this without a loop?
Comments: 2
madhan ravi on 15 Feb 2019
See if the below satisfies your needs:
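a=str2double(regexp(a,'\d+','match','once')); % numeric part of first column
b=str2double(regexp(b,'\d+','match','once')); % numeric part of second column
[tf,loc]=ismember(a,b); % loc(k) is the row of b that matches a(k), or 0 if none
T=table;
T.a=find(tf) % rows of the first column that have a match
T.b=loc(tf) % corresponding rows of the second column, paired with T.a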


Accepted Answer

dpb on 15 Feb 2019
[c,ia,ib]=intersect(categorical(s1),categorical(s2));
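A minimal sketch of how the index outputs pair up, using made-up sample data. Here s1 and s2 are hypothetical cell arrays of character vectors standing in for the text returned by xlsread; they are not from the original post.

```matlab
% Hypothetical sample data; in practice s1 and s2 would come from xlsread.
s1 = {'STRR 5'; 'STRR 12'; 'STRR 7'; 'STRR 30'};
s2 = {'STRR 30'; 'STRR 8'; 'STRR 5'};

% ia and ib index into s1 and s2 so that s1(ia) and s2(ib) both hold the common values c.
[c, ia, ib] = intersect(categorical(s1), categorical(s2));

% Each row of pairs is [row in s1, row in s2] for one common string.
pairs = [ia, ib];
disp(pairs)
```

In this sample the common strings are 'STRR 5' (row 1 of s1, row 3 of s2) and 'STRR 30' (row 4 of s1, row 1 of s2), so pairs has one row for each of them.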

More Answers (1)

OCDER on 15 Feb 2019
%Generating a demo cell array
C = cell(10000, 2);
for j = 1:size(C, 1) %Loop over rows; numel(C) would count both columns and grow C to 20000 rows
C{j, 1} = sprintf('STRR %d', j);
C{j, 2} = sprintf('STRR %d', j-2);
end
%Use intersect to determine location of matching entities between column 1 and 2
tic
[Matched, Col1, Col2] = intersect(C(:, 1), C(:, 2)); %NOTE: this doesn't work if there are duplicate entries in the column
toc %0.063 sec
%If you plan to do numerical calculations, convert string to number via something like this
tic
Str = strrep(C, 'STRR ', '');%Deletes "STRR " from every cell
Num = cellfun(@(x) sscanf(x, '%f'), Str); %Convert remaining char to double format
[Matched, Col1, Col2] = intersect(Num(:, 1), Num(:, 2)); %NOTE: this doesn't work if there are duplicate entries in the column
toc %0.359 sec
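
The notes in the code above point out that intersect does not handle duplicate entries, since it returns one index per unique value. If duplicates could occur, one possible loop-free alternative is to compare every element of one numeric column against every element of the other. This is only a sketch: x1 and x2 are hypothetical numeric vectors, not data from the original post, and the comparison relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical numeric columns with repeated values (not from the original post).
x1 = [5; 12; 5; 30];
x2 = [30; 5; 8; 30];

% Compare every element of x1 with every element of x2.
% The logical mask is numel(x1)-by-numel(x2).
mask = x1 == x2.';

% Row subscripts index x1, column subscripts index x2.
[row1, row2] = find(mask);
pairs = [row1, row2];
disp(pairs)
```

For columns of 2000 and 7000 rows the mask holds 14 million logical values (about 14 MB), so this trades memory for simplicity.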
