How to split a huge string array efficiently
Hi everyone,
I'm trying to split a huge string array (~8.5 MB, ~11,500 rows x ~400 columns) efficiently, but I can only do it with a quite slow "for" loop that I haven't been able to eliminate.
The number of columns may change from one file to another, so I cannot fix a single format specification up front and import every file with it.

%% getting data from .txt => really fast
tic
disp('importing file');
a = string(textread([pwd '\test.txt'],'%s','headerlines',1)); %#ok<*DTXTRD>
toc
%% splitting each row into columns by delimiter ";" => slow
tic
disp('splitting each row by ";"');
b = strings(length(a),length(strsplit(a(1),';')));
for k = 1:length(a)
    b(k,:) = strsplit(a(k),';');
end
toc
%% date(str) to datenum => really fast
tic
disp('conv date to datenum');
dat1 = datenum(b(:,1),'yyyy-mm-dd');
toc
%% str to logical => really fast
tic
disp('converting data to logical array')
dat2 = logical(strcmp(b(:,2:end),'1')); %super fast
%dat2 = str2double(b(:,2:end)); %very slow
toc
% disp('converting data to logical array - 2'); %super fast as well
% tic
% dat2 = zeros(size(b));
% dat2(strcmp(b(:,2:end),'1')) = 1;
% toc
Thanks everyone! :)
Source file sample
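For reference, the row-by-row loop above can probably be replaced by a single call to split, which accepts a whole string array. This is only a minimal sketch: the three sample rows are hypothetical stand-ins for the contents of test.txt, and it assumes every row contains the same number of ";" delimiters (split raises an error otherwise).

```matlab
% Hypothetical sample rows standing in for the lines read from test.txt
a = ["2020-07-01;1;0;1"; "2020-07-02;0;0;1"; "2020-07-03;1;1;0"];

% split on an N-by-1 string array returns an N-by-M string array,
% provided every row yields the same number of fields
b = split(a, ";");

% Same conversions as in the question, applied to the split result
dat1 = datenum(b(:,1), 'yyyy-mm-dd');   % dates as serial date numbers
dat2 = b(:,2:end) == "1";               % element-wise comparison gives a logical array
```

Because the field count is taken from the data itself, no per-file format needs to be known in advance.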

3 Comments
Walter Roberson
24 Jul 2020
Why not use readtable()?
I would also point out that textscan() can process character vectors in which the lines are separated by newlines.
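A rough sketch of both suggestions follows. It is not taken from the thread: the temporary file, its header names, and its two data rows are made up for illustration, and the layout (one header line, a date column followed by 0/1 columns) is assumed from the code in the question. For textscan, the format is built from the number of ";" found in the first data row, so the column count does not need to be known beforehand.

```matlab
% Build a small hypothetical sample file (header line + ';'-separated rows)
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'date;s1;s2;s3\n2020-07-01;1;0;1\n2020-07-02;0;0;1\n');
fclose(fid);

% Option 1: readtable works out the number of columns on its own
T = readtable(fname, 'Delimiter', ';', 'ReadVariableNames', false, 'HeaderLines', 1);
flags = T{:, 2:end} == 1;                  % numeric columns to logical

% Option 2: textscan applied to a character vector holding the whole file
txt  = fileread(fname);
nl   = find(txt == char(10), 1);           % end of the header line
body = txt(nl+1:end);                      % everything after the header
firstRow = strtok(body, char(10));         % first data row
ncols = numel(strfind(firstRow, ';')) + 1; % column count taken from the data
fmt  = ['%s' repmat('%f', 1, ncols-1)];    % one text column, the rest numeric
C    = textscan(body, fmt, 'Delimiter', ';');
dat1 = datenum(C{1}, 'yyyy-mm-dd');        % dates as serial date numbers
dat2 = [C{2:end}] == 1;                    % numeric columns to logical

delete(fname);                             % remove the temporary sample file
```

Depending on the MATLAB release, readtable may import the first column as datetime or as text, so the date conversion is only shown for the textscan variant.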
endystrike
24 Jul 2020
endystrike
24 Jul 2020