Find continuous file name jump

Tsuwei Tan
Tsuwei Tan on 9 May 2018
Commented: Tsuwei Tan on 9 May 2018
I have roughly a dozen thousand files with names like the following:
SHARK_225054651_41_0547_r001
SHARK_225054651_41_0548_r005
SHARK_225054651_41_0548_r009
...
SHARK_225054651_41_0619_r121
SHARK_225054651_41_0620_r125
...
SHARK_225062101_41_0621_r001
SHARK_225062101_41_0621_r005
SHARK_225062101_41_0622_r009
...
SHARK_225062101_41_0653_r121
SHARK_225062101_41_0654_r125
Each file name ends with _r%%%, where the three digits %%% run 001, 005, ... up to 121, 125. That is thirty-two values in total, in steps of four, all sharing the same SHARK_%%%%%%%%%_%%_%%%% name before _r%%%. Then the next set of files starts over with the same r%%% sequence.
However, the actual files have "jumps": for instance, r005 may be missing between r001 and r009.
Is there a way to read the file names with some loop logic and pick out the missing ones?

Accepted Answer

Stephen23
Stephen23 on 9 May 2018
Edited: Stephen23 on 9 May 2018
C = { % fake data:
'SHARK_225054651_41_0548_r001'
'SHARK_225054651_41_0548_r005'
'SHARK_225054651_41_0548_r009'
'SHARK_225054651_41_0548_r021'
'SHARK_225054651_41_0548_r025'
'SHARK_225062101_41_0621_r005'
'SHARK_225062101_41_0621_r009'
'SHARK_225062101_41_0621_r013'
'SHARK_225062101_41_0621_r025'
};
T = regexp(C,'^(\w+)r(\d{3})$','tokens','once'); % split
A = cellfun(@(c)c(1),T); % 1st token
B = cellfun(@(c)c(2),T); % 2nd token
[U,~,X] = unique(A(:)); % 1st token unique only
V = str2double(B(:)); % 2nd token -> numeric values
G = 1:4:25; % the required values (change this to 1:4:125).
C = accumarray(X,V,[],@(v){setdiff(G,v)});
The outputs of interest to you are:
  • U the unique groups, e.g. 'SHARK_225054651_41_0548_' and 'SHARK_225062101_41_0621_' in my example fake data.
  • C the missing rXXX values.
These outputs are shown here:
>> U{1}
ans = SHARK_225054651_41_0548_
>> C{1}
ans =
13 17
>> U{2}
ans = SHARK_225062101_41_0621_
>> C{2}
ans =
1 17 21
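If full file names are more useful than bare numbers, the values in C can be glued back onto the prefixes in U. The following is only a minimal sketch: U and C are copied from the outputs shown above, and the variable names `names` and `missing` are my own, not part of the original answer.

```matlab
% Example values copied from the outputs shown above.
U = {'SHARK_225054651_41_0548_'; 'SHARK_225062101_41_0621_'};
C = {[13 17]; [1 17 21]};

missing = {};                       % collects the missing file names
for k = 1:numel(U)
    % One name per missing value: prefix + 'r' + zero-padded 3-digit number.
    names = arrayfun(@(n) sprintf('%sr%03d', U{k}, n), C{k}, ...
                     'UniformOutput', false);
    missing = [missing; names(:)];  %#ok<AGROW> small list, growth is fine
end
disp(missing)
```

The `%03d` format restores the leading zeros, so 13 becomes r013 and 1 becomes r001, matching the naming scheme in the question.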
You could easily loop over these, or display them in the command window or your GUI, etc:
>> Z = [U,cellfun(@num2str,C,'uni',0)]';
>> for k = 1:numel(U), fprintf('%s: %s\n',U{k},sprintf('%3d, ',C{k})); end
SHARK_225054651_41_0548_: 13, 17,
SHARK_225062101_41_0621_: 1, 17, 21,
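To run this on real files rather than fake data, the names can be read from disk with dir first. This is a rough sketch under stated assumptions: the .dat extension, the temporary demo folder, the short range 1:4:9, and the output name M (used instead of overwriting C) are all hypothetical choices for illustration, so adjust them to your own files.

```matlab
% Hypothetical setup: a temporary folder holding two empty demo files.
folder = tempname;
mkdir(folder);
demo = {'SHARK_225054651_41_0548_r001.dat', 'SHARK_225054651_41_0548_r009.dat'};
for k = 1:numel(demo)
    fclose(fopen(fullfile(folder, demo{k}), 'w'));
end

% List the matching files and strip the (assumed) .dat extension.
S = dir(fullfile(folder, 'SHARK_*_r*.dat'));
[~, C] = cellfun(@fileparts, {S.name}, 'UniformOutput', false);
C = C(:);

% Same steps as in the answer, applied to the names found on disk.
T = regexp(C, '^(\w+)r(\d{3})$', 'tokens', 'once');    % split prefix / number
A = cellfun(@(c) c{1}, T, 'UniformOutput', false);     % prefix of each name
B = cellfun(@(c) c{2}, T, 'UniformOutput', false);     % 3-digit number (text)
[U, ~, X] = unique(A(:));                              % one group per prefix
V = str2double(B(:));                                  % numbers as doubles
G = 1:4:9;                                             % demo range; use 1:4:125 for real data
M = accumarray(X, V, [], @(v) {setdiff(G, v)});        % missing values per group

rmdir(folder, 's');                                    % remove the demo folder
```

Names that do not match the regular expression give empty tokens and would make the cellfun calls fail, so it may be worth filtering those out before splitting.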
  Comments: 1
Tsuwei Tan
Tsuwei Tan on 9 May 2018
This is great!!! Thank you so much!!!
