Correctly splitting strings with multiple underscores

Sheikh Omar Bah on 5 Jan 2023
Commented: Sheikh Omar Bah on 6 Jan 2023
I have a cell array of strings that contain several underscores, and I was wondering how to split them correctly. My inputs look like this:
(a) MK_0334_300151IT0005274805/MD_01082027_3_103. I want only 300151IT0005274805.
(b) MK_0334_700720ES01653860_1_103. I want only 700720ES01653860_1.
(c) CAB001_0578_SK09 1100 0000. I want only SK09 1100 0000.
(d) CR_07_C_59163230. I want only C_59163230.
I have tried splitting the data with regexp({my_data}, '_', 'split'); then selecting cell 3 and splitting it again with
regexp({cell 3}, '/MD', 'split');. However, that doesn't work for inputs like (b) and (d).
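The limitation of splitting on every underscore can be reproduced with the four sample strings. This is a minimal sketch, assuming the inputs are stored in a cell array; the variable names S, parts, and third are illustrative and do not come from the original post.

```matlab
% Sample inputs (a)-(d) from the question.
S = {'MK_0334_300151IT0005274805/MD_01082027_3_103', ...
     'MK_0334_700720ES01653860_1_103', ...
     'CAB001_0578_SK09 1100 0000', ...
     'CR_07_C_59163230'};

% Split every string on each underscore; each element of parts is a cell of tokens.
parts = regexp(S, '_', 'split');

% Take the third token of every string.
third = cellfun(@(p) p{3}, parts, 'UniformOutput', false);

% (b) loses its trailing '_1' and (d) is cut down to 'C', because the
% underscore that should be kept has already been used as a delimiter.
disp(third)
```

Splitting the third token on '/MD' afterwards fixes (a), but it cannot restore the pieces that (b) and (d) have already lost.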
  Comments: 4
the cyclist on 5 Jan 2023
The hard part is not writing the MATLAB code. The hard part is understanding what single "rule" or logic produces the result you want.
Tell us the rule.
Adam Danz on 5 Jan 2023
To illustrate what others have mentioned above: why don't you want to return "SK09 1100" in your 3rd example, or "07" in your 4th example?


Accepted Answer

Walter Roberson on 5 Jan 2023
S = {'MK_0334_300151IT0005274805/MD_01082027_3_103', 'MK_0334_700720ES01653860_1_103', 'CAB001_0578_SK09 1100 0000', 'CR_07_C_59163230' }
S = 1×4 cell array
{'MK_0334_300151IT0005274805/MD_01082027_3_103'} {'MK_0334_700720ES01653860_1_103'} {'CAB001_0578_SK09 1100 0000'} {'CR_07_C_59163230'}
regexp(S, '(?<=^[^_]+_[^_]+_)[^/_]+(_[^/_]+)?', 'match', 'once')
ans = 1×4 cell array
{'300151IT0005274805'} {'700720ES01653860_1'} {'SK09 1100 0000'} {'C_59163230'}
The rules used:
  • discard up to and including the second _
  • scan forward from there, stopping just before the first / or the second _ (if present), whichever comes first
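To connect the two rules to the pattern, the expression can be assembled from named pieces. This is only a sketch of one way to read the pattern; the names skipTwoFields, firstRun, optionalRun, and pat are hypothetical and not part of the answer.

```matlab
S = {'MK_0334_300151IT0005274805/MD_01082027_3_103', ...
     'MK_0334_700720ES01653860_1_103', ...
     'CAB001_0578_SK09 1100 0000', ...
     'CR_07_C_59163230'};

% Rule 1: a lookbehind requires that two underscore-terminated fields,
% counted from the start of the string, sit immediately before the match.
skipTwoFields = '(?<=^[^_]+_[^_]+_)';

% Rule 2, first part: take characters up to the next '/' or '_'.
firstRun = '[^/_]+';

% Rule 2, second part: optionally take one more '_' followed by a run
% that again stops at the next '/' or '_'.
optionalRun = '(_[^/_]+)?';

pat = [skipTwoFields, firstRun, optionalRun];

% 'match' returns the matched text; 'once' returns only the first match
% per string, so out is a cell array of character vectors.
out = regexp(S, pat, 'match', 'once');
disp(out)
```

Because the lookbehind consumes no characters, the returned text starts right after the second underscore instead of including the discarded prefix.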
  Comments: 2
the cyclist on 5 Jan 2023
Using the Roberson Clairvoyance Toolbox™ again, I see. Still waiting on the public beta for that.
Sheikh Omar Bah on 6 Jan 2023
Thanks, Walter. That was very helpful.
I just had to add a tweak, because some of the data contained '/' characters that were important to keep.
For example, for MK_0458_002/001/30/210/E/K_1 it was necessary to keep 002/001/30/210/E/K_1.
So I instead use
Split_data = regexp({s}, '(?<=^[^_]+_[^_]+_)[^_]+(_[^/_]+)?', 'match', 'once');
A = regexp(Split_data, '/MD', 'split');
and then select the first cell of each result with a loop.
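The tweak can be sketched end to end without an explicit loop. The following assumes the inputs are held in a cell array s of character vectors (so s is passed directly rather than wrapped as {s}); the names result and alt are illustrative.

```matlab
% Sample inputs from this thread, including the one with meaningful '/'.
s = {'MK_0334_300151IT0005274805/MD_01082027_3_103', ...
     'MK_0334_700720ES01653860_1_103', ...
     'CAB001_0578_SK09 1100 0000', ...
     'CR_07_C_59163230', ...
     'MK_0458_002/001/30/210/E/K_1'};

% Relaxed pattern from the comment: '/' is now allowed in the first run.
Split_data = regexp(s, '(?<=^[^_]+_[^_]+_)[^_]+(_[^/_]+)?', 'match', 'once');

% Splitting a cell array yields one cell of pieces per input string.
A = regexp(Split_data, '/MD', 'split');

% Keep the first piece of each split; cellfun stands in for the loop.
result = cellfun(@(c) c{1}, A, 'UniformOutput', false);

% Possible alternative: delete everything from '/MD' to the end instead.
alt = regexprep(Split_data, '/MD.*$', '');

disp(result)
```

Both variants assume that '/MD' only ever marks the start of the part to drop; an identifier that legitimately contains '/MD' would be truncated.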
