how to find a common part of the strings?

Question

abdul kalam, 11 June 2016

https://kr.mathworks.com/matlabcentral/answers/289308-how-to-find-a-common-part-of-the-strings

Edited: Sindar, 18 July 2018

Hi there!

Could you please help me find the common part of a set of strings?

For example: S1_carbon_avg_air, S1_carbon_err_air, S1_carbon_avg_arg, S1_carbon_err_arg, S1_carbon_avg_nit, S1_carbon_err_nit.

The common string is S1_carbon, and I want to use it as the legend and the title of the graph as well.

Thank you in advance

Abdul kalam

Answer 1

Walter Roberson, 11 June 2016

https://kr.mathworks.com/matlabcentral/answers/289308-how-to-find-a-common-part-of-the-strings#answer_225158

If you have the strings in a cell array of strings, Scell, then

Schar = char(Scell(:));
all_rows_same = all(diff(Schar == 0, 1),1);
common_cols = find(~all_rows_same, 1, 'first');
if isempty(common_cols)
  common_to_use = '?'
else
  common_to_use = Scell{1}(1:common_cols);
end

This finds the longest leading substring common to all entries in the cell array and uses that; however, if no leading substring is common to all of them, it uses '?' just to have something.
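A minimal sketch of the leading-substring idea described above, written out step by step. It is not the snippet as posted: here the `== 0` comparison is applied to the result of `diff` before `all`, the match stops one column before the first difference, and the search is limited to the length of the shortest string. The sample `Scell` (taken from the question) and the names `minlen`, `col_same`, `first_diff`, `n` and `title_str` are assumptions for illustration.

```matlab
% Hypothetical input, using the strings from the question.
Scell = {'S1_carbon_avg_air', 'S1_carbon_err_air', 'S1_carbon_avg_arg', ...
         'S1_carbon_err_arg', 'S1_carbon_avg_nit', 'S1_carbon_err_nit'};

minlen = min(cellfun(@numel, Scell));   % a common prefix cannot exceed the shortest string
Schar  = char(Scell(:));                % one string per row, padded with trailing spaces
Schar  = Schar(:, 1:minlen);            % drop columns that may contain padding

% A column counts as "same" when consecutive rows never differ in it.
col_same   = all(diff(double(Schar), 1, 1) == 0, 1);
first_diff = find(~col_same, 1, 'first');

if isempty(first_diff)
    n = minlen;            % no difference found: the whole shortest string is common
else
    n = first_diff - 1;    % stop just before the first differing column
end

common_to_use = Scell{1}(1:n);
if isempty(common_to_use)
    common_to_use = '?';   % placeholder, as in the answer above
end

% Optional: strip a trailing delimiter so the result reads 'S1_carbon'.
title_str = regexprep(common_to_use, '_+$', '');
```

For the six strings in the question the raw prefix still carries the trailing underscore, which is why the last line trims it before the result is used as a title.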

Comments: 6

dpb, 11 June 2016 (edited by dpb, 11 June 2016)

Do a little debugging/work on your own instead of expecting somebody else to solve all your problems...

Walter made a typo of a misplaced closing parenthesis in his computation of the all_rows_same variable; your mission (should you choose to accept it) is to look at the logic of the problem and see where that might be...

I had written the same logic slightly differently before reading Walter's. Start with the string containing all the titles, in whatever form you have: if it is a cell array, the char cast works, whereas I used a utility function of my own that breaks a string into tokens. It looked like this:

>> s='S1_C2sum_avg_air, S1_C2sum_er_air, S1_C2sum_avg_arg, S1_C2sum_er_arg, S1_C2sum_avg_nit, S1_C2sum_er_nit';
>> t=tokens(s);
>> ix=find(all(diff(t)),1,'first');
>> strTitle=s(1:ix-2)
strTitle =
S1_C2sum

This is exactly the same thought process as Walter's; work through both to see the underlying logic, and then where the aforementioned typo is.

Matthias Brandt, 6 June 2017 (edited by Matthias Brandt, 6 June 2017)

A combination of both answers above gives even simpler code. First we create the cell array that the first answer assumes, with the help of the second:

s='S1_C2sum_avg_air, S1_C2sum_er_air, S1_C2sum_avg_arg, S1_C2sum_er_arg, S1_C2sum_avg_nit, S1_C2sum_er_nit';
Scell=strsplit(s,', ');

The extraction of the common part can then be written in one (even more intuitive) line:

common_to_use = Scell{1}(all(~diff(char(Scell(:))),1))

The string we are after is "the characters that do not differ among all the strings". Be aware that this code can produce false-positive characters if characters beyond the leading ones are also equal across all the strings (as in the example).
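The false-positive caveat can be seen with two of the strings from the question. The sketch below is only an illustration: the two-entry `Scell` and the names `same`, `masked` and `leading` are assumptions, and the dimension argument is passed explicitly so that `all` also works when `diff` returns a single row.

```matlab
% Hypothetical two-entry input of equal length.
Scell = {'S1_carbon_avg_air', 'S1_carbon_err_air'};
Schar = char(Scell(:));

% true for every column in which all rows hold the same character
same = all(diff(double(Schar), 1, 1) == 0, 1);

% Keeping every matching column also picks up the shared '_air' tail.
masked = Scell{1}(same);

% Keeping only the columns before the first mismatch gives the leading part.
leading = Scell{1}(1:find(~same, 1, 'first') - 1);
```

Here `masked` is 'S1_carbon__air' while `leading` is 'S1_carbon_', so the mask-based one-liner is only safe when nothing after the common prefix happens to line up.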

Sindar, 18 July 2018 (edited by Sindar, 18 July 2018)

This finds only the leading match:

common_to_use = Scell{1}(1:find(any(diff(char(Scell(:))),1),1,'first')-1)
if isempty(common_to_use)
   common_to_use='?'
end

Answer 2

dpb, 11 June 2016

https://kr.mathworks.com/matlabcentral/answers/289308-how-to-find-a-common-part-of-the-strings#answer_225152

The most trivial approach: the part up to the second underscore is the same in every string, so use that fact...

ix=strfind(str,'_');       % the underscore locations in string 'str'
titleStr=str(1:ix(2)-1);   % the string up to that location
title(titleStr)            % use it
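A short sketch of how the same underscore positions could supply both the title and the legend entries. The sample `Scell`, the `labels` variable and the random placeholder data are assumptions for illustration, and every string is assumed to share the first string's prefix length. The 'Interpreter','none' option is used because the default TeX interpreter renders an underscore as a subscript.

```matlab
% Hypothetical column headers; replace with the real ones.
Scell = {'S1_carbon_avg_air', 'S1_carbon_err_air', 'S1_carbon_avg_arg'};

str = Scell{1};
ix  = strfind(str, '_');          % underscore locations in the first header
titleStr = str(1:ix(2)-1);        % part before the second underscore

% Remainder of each header, assuming all share the same prefix length.
labels = cellfun(@(s) s(ix(2)+1:end), Scell, 'UniformOutput', false);

plot(rand(10, numel(Scell)));     % placeholder data, one line per header
title(titleStr, 'Interpreter', 'none');   % show underscores literally
legend(labels, 'Interpreter', 'none');
```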

Comments: 2

abdul kalam, 11 June 2016 (edited by abdul kalam, 11 June 2016)

Thanks for your quick reply, dpb.

To elaborate:

I have three matrices of size 10-by-17, and every column has a header. The first column (time) is the same in all of them; columns 2 to 9 are intensities and columns 10 to 17 are the error bars. I want to plot time against (col(i), col(i+8)) as error-bar graphs, so each plot uses six columns (col(i) and col(i+8) from each of the three matrices) together with their column header strings. I have plotted the graph but am stuck at generating the legend and a common name to use as the title.

Sometimes the strings could be s1_carbon_avg_air, s1_carbon_err_air, s2_carbon_avg_arg, s2_carbon_err_arg; in this case the common string is 'carbon'.

Image Analyst, 11 June 2016

Is the "common" part always between underscores? Otherwise you could say that "r" is a common string, since the character r is all over the place.

Answer 3

Gergö Schmidt, 30 May 2017

https://kr.mathworks.com/matlabcentral/answers/289308-how-to-find-a-common-part-of-the-strings#answer_268904

If you have a uniform text/delimiter structure you could use the built-in strsplit and ismember functions like this:

Scell = {'S1_carbon_avg_air', 'S1_carbon_err_air', 'S1_carbon_avg_arg', 'S1_carbon_err_arg', 'S1_carbon_avg_nit', 'S1_carbon_err_nit'};
delimiter = '_';
commonparts = strsplit(Scell{1},delimiter);
for iS = 2:numel(Scell) 
  commonparts(~ismember(commonparts,strsplit(Scell{iS},delimiter))) = [];
end
commonparts = strjoin(commonparts,delimiter)

This will deliver:

commonparts =
S1_carbon

But in case of

Scell = {'S1_carbon_avg_air', 'S1_carbon_err_air'};

it gives

S1_carbon_air

which might be useful for some reason...
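The same token-based idea also covers the case raised in the comments under Answer 2, where the shared part is not at the start of the strings. The sketch below is a variant, not the code above: it swaps the `ismember` filter for the built-in `intersect` with the 'stable' option, which keeps the tokens in the order of the first string. The sample `Scell` is an assumption based on that comment.

```matlab
% Hypothetical input where only the middle token is shared by all strings.
Scell = {'s1_carbon_avg_air', 's1_carbon_err_air', ...
         's2_carbon_avg_arg', 's2_carbon_err_arg'};
delimiter = '_';

commonparts = strsplit(Scell{1}, delimiter);
for iS = 2:numel(Scell)
    % keep only the tokens that also occur in string iS, preserving order
    commonparts = intersect(commonparts, strsplit(Scell{iS}, delimiter), 'stable');
end
commonparts = strjoin(commonparts, delimiter);
```

Like the loop above, this compares tokens regardless of their position, so it shares the behaviour shown for the two-string case.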
