How to slice each string in a string array without using for loop

YANAN ZHU on 18 Sep 2018
Edited: Cedric on 19 Sep 2018
For a string array, for example,
celldata =
3×1 cell array
{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}
Is there an array operation (i.e. one that does not use a for loop) to extract the month from each cell and convert the results to a 3×1 numeric matrix, which should be [12;11;09]? I don't want to use a for loop because it was too slow.

Accepted Answer

Cedric on 19 Sep 2018
Edited: Cedric on 19 Sep 2018
If you favor performance over readability/maintainability, you can build an approach around the following:
buffer = vertcat( dates{:} ) ;
months = (buffer(:,6:7) - '00') * [10;1] ;
where dates is a cell array of date strings (celldata in your example).
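To make the two-line trick easier to follow, here is a step-by-step sketch on the three sample dates. It assumes every string has the same length and the fixed yyyy-mm-dd layout, since vertcat needs equally sized rows. The variable name digitVals is introduced here only for illustration, and a scalar '0' is subtracted instead of '00', which gives the same result.

```matlab
% Same sample data as in the question.
dates = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% Stack the char vectors into a 3x10 char matrix.
% This errors if the strings do not all have the same length.
buffer = vertcat(dates{:});

% Columns 6:7 hold the two month digits. Subtracting '0' turns each
% digit character into its numeric value (e.g. '1' - '0' is 1).
digitVals = buffer(:, 6:7) - '0';   % 3x2 double: [1 2; 1 1; 0 9]

% Weight the tens digit by 10 and the units digit by 1.
months = digitVals * [10; 1];       % 3x1 double: [12; 11; 9]
```

Because everything after the concatenation is plain numeric array arithmetic, no string parsing function is called at all, which is likely why this variant comes out fastest in the benchmark.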
Here is the benchmark:
% Build "large" data set.
N = 1e4 ;
dates = repmat( {'2018-12-12'; '2018-11-05'; '2018-09-02'}, N, 1 ) ;
% Basic FOR loop, STR2DOUBLE.
tic ;
n = numel( dates ) ;
months_forStr2double = zeros( n, 1 ) ;
for k = 1 : n
    months_forStr2double(k) = str2double( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2DOUBLE : %.3fs\n', toc ) ;
% Basic FOR loop, STR2NUM.
tic ;
n = numel( dates ) ;
months_forStr2num = zeros( n, 1 ) ;
for k = 1 : n
    months_forStr2num(k) = str2num( dates{k}(6:7) ) ;
end
fprintf( 'Basic FOR, STR2NUM : %.3fs\n', toc ) ;
% Basic FOR loop, SSCANF.
tic ;
n = numel( dates ) ;
months_forScanf = zeros( n, 1 ) ;
for k = 1 : n
    months_forScanf(k) = sscanf( dates{k}(6:7), '%d' ) ;
end
fprintf( 'Basic FOR, SSCANF : %.3fs\n', toc ) ;
% CELLFUN (hidden FOR), SSCANF.
tic ;
months_cellfun = cellfun( @(date) sscanf( date(6:7), '%d' ), dates ) ;
fprintf( 'CELLFUN, SSCANF : %.3fs\n', toc ) ;
% REGEXP
tic ;
months_regexp = str2double( regexp( dates,'(?<=-)\d+(?=-)','match','once' )) ;
fprintf( 'REGEXP : %.3fs\n', toc ) ;
% CELL2MAT, STR2NUM
tic ;
chardata = cell2mat( dates ) ;
months_cell2matStr2num = str2num( chardata(:,6:7) ) ;
fprintf( 'CELL2MAT, STR2NUM : %.3fs\n', toc ) ;
% DATETIME
tic ;
dt = datetime( dates) ;
months_datetime = month( dt ) ;
fprintf( 'DATETIME : %.3fs\n', toc ) ;
% SSCANF
tic ;
months_sscanf = sscanf([dates{:}],'%*4d-%2d-%*2d') ;
fprintf( 'SSCANF : %.3fs\n', toc ) ;
% EXTRACTBETWEEN
tic ;
months_extractBetween = extractBetween( dates, '-', '-' ) ;
months_extractBetween = cellfun( @str2double, months_extractBetween ) ;
fprintf( 'EXTRACTBETWEEN : %.3fs\n', toc ) ;
% Trick.
tic ;
buffer = vertcat( dates{:} ) ;
months_trick = (buffer(:,6:7) - '00') * [10;1] ;
fprintf( 'Trick: %.3fs\n', toc ) ;
% Check
disp( [isequal( months_forStr2num, months_forStr2double ), ...
       isequal( months_forScanf, months_forStr2double ), ...
       isequal( months_cellfun, months_forStr2double ), ...
       isequal( months_regexp, months_forStr2double ), ...
       isequal( months_cell2matStr2num, months_forStr2double ), ...
       isequal( months_datetime, months_forStr2double ), ...
       isequal( months_sscanf, months_forStr2double ), ...
       isequal( months_extractBetween, months_forStr2double ), ...
       isequal( months_trick, months_forStr2double )] ) ;
Output:
Basic FOR, STR2DOUBLE : 0.489s
Basic FOR, STR2NUM : 0.975s
Basic FOR, SSCANF : 0.356s
CELLFUN, SSCANF : 0.550s
REGEXP : 0.673s
CELL2MAT, STR2NUM : 0.015s
DATETIME : 0.201s
SSCANF : 0.023s
EXTRACTBETWEEN : 0.624s
Trick: 0.008s
1 1 1 1 1 1 1 1 1
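Single tic/toc measurements like the ones above can vary from run to run. As a hedged alternative, the sketch below times two of the one-expression approaches with timeit, which calls the function several times and returns a typical run time in seconds. The handle names fSscanf and fRegexp are made up for this example, and no timings are shown because they depend on the machine and MATLAB release.

```matlab
% Same "large" data set as in the benchmark.
dates = repmat({'2018-12-12'; '2018-11-05'; '2018-09-02'}, 1e4, 1);

% Wrap each candidate in a function handle that takes no inputs.
fSscanf = @() sscanf([dates{:}], '%*4d-%2d-%*2d');
fRegexp = @() str2double(regexp(dates, '(?<=-)\d+(?=-)', 'match', 'once'));

% timeit runs each handle repeatedly and reports a typical time.
tSscanf = timeit(fSscanf);
tRegexp = timeit(fRegexp);

fprintf('SSCANF : %.4fs\n', tSscanf);
fprintf('REGEXP : %.4fs\n', tRegexp);
```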
2 Comments
Stephen23 on 19 Sep 2018
Edited: Stephen23 on 19 Sep 2018
sscanf does not require a loop: simply concatenate the char vectors and use an appropriate format string:
C = {'2018-12-12','2018-11-05','2018-09-02'}
sscanf([C{:}],'%*4d-%2d-%*2d')
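For readers unfamiliar with the format string: %*4d reads up to four digits and discards them (the year), each - matches a literal hyphen, %2d reads the two month digits, and %*2d discards the day. sscanf then reuses the format for the rest of the concatenated text, and the explicit field widths are what keep one date from running into the next. As a small variation, not part of the original comment, the sketch below keeps the year as well; the variable name ym is hypothetical.

```matlab
C = {'2018-12-12','2018-11-05','2018-09-02'};

% Read year and month, skip the day. The size argument [2 Inf] fills a
% 2-row matrix column by column, so each column is one [year; month] pair.
ym = sscanf([C{:}], '%4d-%2d-%*2d', [2 Inf]).';

% After the transpose, each row is one date: [year month].
years  = ym(:, 1);   % [2018; 2018; 2018]
months = ym(:, 2);   % [12; 11; 9]
```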
Cedric on 19 Sep 2018
Thanks Stephen, just added this to the benchmark!


More Answers (4)

Christopher Wallace on 19 Sep 2018
chardata = cell2mat(a); % a is the cell array of date strings (celldata in the question)
numdata = str2num(chardata(:,6:7));

Paolo on 18 Sep 2018
Edited: Paolo on 18 Sep 2018
str2double(regexp(celldata,'(?<=-)\d+(?=-)','match','once'))
ans =
12
11
9
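In the pattern above, (?<=-) and (?=-) are a lookbehind and a lookahead, so only the digits sitting between two hyphens are matched. A related sketch, offered as an alternative rather than as part of the original answer, captures all three fields with tokens; the names tok and parts are made up for this example.

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% With 'tokens' and 'once', each cell of tok is a 1x3 cell holding the
% year, month and day substrings of one date.
tok = regexp(celldata, '(\d+)-(\d+)-(\d+)', 'tokens', 'once');

% Stack into a 3x3 cell array and convert all entries in one call.
parts = str2double(vertcat(tok{:}));

months = parts(:, 2);   % [12; 11; 9]
```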

Star Strider on 19 Sep 2018
Using datetime and its functions:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
dt = datetime(celldata);
M = month(dt)
M =
12
11
9
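When the layout of the date strings is known in advance, it can be passed to datetime through the 'InputFormat' option so that datetime does not have to guess it. The sketch below assumes all strings follow yyyy-MM-dd, as in the question; whether this changes the timing noticeably on a given data set is not something measured here.

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% State the layout explicitly: 4-digit year, 2-digit month, 2-digit day.
% Note the capital MM for month (lowercase mm means minutes).
dt = datetime(celldata, 'InputFormat', 'yyyy-MM-dd');

M = month(dt);   % [12; 11; 9]
```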

Akira Agata on 19 Sep 2018
Another possible solution:
celldata = [{'2018-12-12'}
{'2018-11-05'}
{'2018-09-02'}];
M = extractBetween(celldata,'-','-');
M = cellfun(@str2double,M);
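As a possible simplification, str2double accepts a cell array of character vectors directly, so the cellfun wrapper is optional. A minimal sketch under the same assumptions (extractBetween requires R2016b or later):

```matlab
celldata = {'2018-12-12'; '2018-11-05'; '2018-09-02'};

% extractBetween returns a 3x1 cell array of the text between the first
% pair of hyphens; str2double converts the whole cell array in one call.
M = str2double(extractBetween(celldata, '-', '-'));   % [12; 11; 9]
```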
