I have a cell array data like in the picture: each cell contains a cell array of strings.
I am unpacking all the cells this way:
counter=0;
for ind=1:length(data)
    tmp=cell2str(data{ind,1});
    for k=1:size(tmp,1)
        counter=counter+1;
        tmp2=textscan(tmp(k,:),'%s%s%s%s%s%[^\n\r]','Delimiter', ' ');
        for j=1:6
            if isempty(tmp2{j})==0
                Raw(counter,j)=tmp2{j};
            end
        end
        clear tmp2 j
    end
    clear k tmp
end
The result is correct, but is there a better/faster way to do it?
For example using parfor, or other techniques?
Thank you in advance.

9 Comments

dpb on 9 Jan 2019
Attach small sample dataset and what do you want the result to be?
NirE on 9 Jan 2019
Edited: NirE on 9 Jan 2019
I am attaching a small sample of the data: 10 rows, whereas the full data set has 38956 rows.
The length of the cells inside the main cell can differ. I hope that is clear enough.
You can just run the script that I wrote; it works, but very slowly.
Is there a way to use parfor or something else to speed up the process?
Luna on 9 Jan 2019
cell2str does not work for me. Which version are you using?
NirE on 9 Jan 2019
MATLAB R2017b
dpb on 9 Jan 2019
Edited: Stephen23 on 9 Jan 2019
Must be in some toolbox, then; it's not in base R2017b...
Again, what's the desired output? That doesn't seem to make sense reading the code; your format statement has 5 strings but some "records" have many more fields than that...
Well, that's not it either, maybe FEX submittal? A search of online help doesn't find it, either.
Without knowing precisely what the cell2str function returns, it's hard to guess what the result that "works" really is without more effort than I have time to spend...help us help you.
Luna on 9 Jan 2019
What size should your output be? Are you expecting a 64x1 cell array?
NirE on 9 Jan 2019
The cell2str function just returns a string array with as many rows as the cell had lines.
As for the format, it is exactly what I want: 6 parts of different lengths.
I hope this helps.
Jan on 9 Jan 2019
Note: Omit the useless clear commands. They only waste time here.
dpb on 9 Jan 2019
Well, no...helping would be to show us what you really, really want instead of just describing it that we can't reproduce.
Where did you find the function? SHOW us!!!


Accepted Answer

Jan on 9 Jan 2019

Votes: 1

Start with a pre-allocation:
Len = cellfun('prodofsize', data);
Raw = cell(sum(Len), 6);
c = 0;
for ind = 1:numel(data)
    tmp = data{ind};
    for k = 1:numel(tmp)
        c = c + 1;
        tmp2 = strsplit(tmp{k}, ' ');
        for j = 1:numel(tmp2)
            Raw{c, j} = tmp2{j};
        end
    end
end
I cannot open your MAT file currently, so I am guessing what it might contain. I also guessed that cell2str can be avoided by scanning the cell elements directly, and I assume that Raw should be a cell array. All these assumptions can be wrong. If you post a small input as code and the wanted output, less guessing is required.

5 Comments

NirE on 10 Jan 2019
Your code runs about 3 times faster than mine, thanks a lot.
Can you explain what 'prodofsize' does?
Jan on 10 Jan 2019
Edited: Jan on 10 Jan 2019
@Nir Eliezer: 'prodofsize' is explained in the documentation: doc cellfun. It is equivalent to:
Len = cellfun(@numel, data);
but much faster. If you provide a function handle to cellfun, it calls the MATLAB engine for each element of the cell, while the string options such as 'length', 'prodofsize' and 'isclass' access the cell elements directly inside the cellfun core. Although the speed of cellfun might be negligible in your case, it is good programming practice to use the most efficient code.
By the way, 'numel' would be a much nicer name for this option than 'prodofsize'.
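As a quick check of the equivalence described above, the following sketch compares the two call styles on a small made-up cell array. The sample contents and the variable names lenLegacy, lenHandle, tLegacy and tHandle are assumptions for illustration, not the asker's data, and the measured times will depend on the release and the data size.

```matlab
% Hypothetical sample: a cell array whose cells hold cell arrays of char vectors
data = {{'a'; 'b'; 'c'}; {'d'}; {}};

% Legacy string option: evaluated inside cellfun itself
lenLegacy = cellfun('prodofsize', data);

% Function handle: numel is called once per cell
lenHandle = cellfun(@numel, data);

% Both calls should give the same element counts
sameResult = isequal(lenLegacy, lenHandle);

% Optional timing comparison; no particular ratio is claimed here
tLegacy = timeit(@() cellfun('prodofsize', data));
tHandle = timeit(@() cellfun(@numel, data));
fprintf('legacy: %.3g s, handle: %.3g s\n', tLegacy, tHandle);
```

On a cell array this small the difference is unlikely to matter; timing the real 38956-row input would show whether it is worth caring about.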
Maybe this is slightly faster:
Len = cellfun('prodofsize', data);
Raw = cell(sum(Len), 6);
c = 0;
for ind = 1:numel(data)
    tmp = data{ind};
    for k = 1:Len(ind)
        c = c + 1;
        tmp2 = strsplit(tmp{k}, ' ');
        Raw(c, 1:numel(tmp2)) = tmp2;
    end
end
NirE on 21 Jan 2019
Jan, one more question: is there a way to parallelize your piece of code so that I could use parfor?
Jan on 22 Jan 2019
Yes, a parallelization should be very straightforward. Did you try it?
NirE on 22 Jan 2019
I will try it and tell you whether it speeds things up or not.
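One possible way to restructure Jan's loop for parfor is sketched below. The shared counter c cannot be carried across parfor iterations, because the iterations must be independent, so each iteration fills its own block and the blocks are concatenated afterwards. The sample data and the names rec1, rec2, nCol, parts, block and tok are assumptions for illustration. So is the rule of folding surplus tokens into the sixth column, which is meant to mimic the trailing %[^\n\r] field of the original format. parfor needs Parallel Computing Toolbox; without it the loop should simply run serially. Since the work per row is small, whether this is actually faster would have to be measured.

```matlab
% Hypothetical sample input, modelled on the records shown in this thread
rec1 = {'1 EventDataLogNewFile DataEventTime TypeFormattedDate Sun Jan 6 00:00:41 2019'; ...
        '1 EventDataLogNewFile DataEntityName TypeString mc16'};
rec2 = {'4 EventREAD DataReading TypeUnitLessNumber 34729'};
data = {rec1; rec2};

nCol  = 6;                       % assumed number of output columns
parts = cell(numel(data), 1);    % one result block per outer cell

parfor ind = 1:numel(data)
    tmp   = data{ind};
    block = cell(numel(tmp), nCol);
    for k = 1:numel(tmp)
        tok = strsplit(tmp{k}, ' ');
        if numel(tok) > nCol
            % Fold the surplus tokens into the last column (assumed rule)
            tok = [tok(1:nCol-1), {strjoin(tok(nCol:end), ' ')}];
        end
        block(k, 1:numel(tok)) = tok;
    end
    parts{ind} = block;          % sliced output, independent per iteration
end

% Stack the per-cell blocks into one N-by-nCol cell array
Raw = vertcat(parts{:});
```

Collecting the blocks in parts and concatenating once at the end avoids both the shared counter and repeated growth of Raw.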


More Answers (2)

dpb on 9 Jan 2019
Edited: dpb on 9 Jan 2019

Votes: 1

OK, I overlooked the regular expression in the format string that sucks up all of those extra blanks at the end of the odd-man-out records...
To dereference the cell content in each cell requires two levels since textscan isn't cell-string aware. split doesn't cut it here because there's not a unique delimiter that defines the fields desired; hence the above...
You can try the following and see if the lack of preallocation shows up as a performance hit with the size; oftentimes it'll fool you and not be too bad...
fnTS=@(s) textscan(s,'%s%s%s%s%s%[^\n\r]','Delimiter', ' ');
res=[];
for i=1:length(data)
    res=[res;cellfun(fnTS,data{i},'uni',0)];
end
res=cat(1,res{:});
The above yields a 64x6 cell array...
I'd have to think of the bestest way to be able to build the array directly w/o the intermediary second cell array to not be dynamically catenating the output.
ADDENDUM:
res(cellfun(@isempty,res))={''};
>> string(res)
ans =
11×6 string array
"1" "EventDataLogNewFile" "DataEventTime" "TypeSecondsSinceEpoch" "1546725641" ""
"1" "EventDataLogNewFile" "DataEventTime" "TypeFormattedDate" "Sun" "Jan 6 00:00:41 2019"
"1" "EventDataLogNewFile" "DataReportingSubsystem" "TypeString" "datalogger" ""
"1" "EventDataLogNewFile" "DataInstrumentID" "TypeString" "00:01:05:19:CF:30" ""
"1" "EventDataLogNewFile" "DataEntityName" "TypeString" "mc16" ""
"4" "EventREAD" "DataEventTime" "TypeSecondsSinceEpoch" "1546725657" ""
"4" "EventREAD" "DataEventTime" "TypeFormattedDate" "Sun" "Jan 6 00:00:57 2019"
"4" "EventREAD" "DataReportingSubsystem" "TypeString" "pc" ""
"4" "EventREAD" "DataEntityID" "TypeString" "Dev_CLPC_PressureGauge1" ""
"4" "EventREAD" "DataReading" "TypeUnitLessNumber" "34729" ""
"4" "EventREAD" "DataEventDuration" "TypeSec" "0.000686859" ""
>>
(The output above comes from running the for...end loop over only the first two elements of data instead of all of them, for brevity.)
Luna on 9 Jan 2019

Votes: 0

I was assuming the same 64x9 cell. Here is my solution, which gives the same result as Jan's:
cellArray = cellfun(@(x) strsplit(x(:,:),' '), vertcat(data{:}), 'UniformOutput',false);
for i = 1:numel(cellArray)
    for j = 1:numel(cellArray{i})
        raw{i,j} = cellArray{i}{j} ;
    end
end
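The loop above grows raw one element at a time. A preallocated variant of the same idea might look like the sketch below, which sizes the output from the longest token list and copies each row in one assignment. The sample data and the names allLines, tokens and nTok are assumptions for illustration, and like the code above it assumes every inner cell is a column cell array so that vertcat can stack them.

```matlab
% Hypothetical sample: inner cells are column cell arrays of char vectors
data = {{'a b c'; 'd e'}; {'f g h i'}};

% Flatten to one column of lines, then split each line on spaces
allLines = vertcat(data{:});
tokens   = cellfun(@(x) strsplit(x, ' '), allLines, 'UniformOutput', false);

% Preallocate using the longest row, then fill row by row
nTok = cellfun('length', tokens);
raw  = cell(numel(tokens), max(nTok));
for i = 1:numel(tokens)
    raw(i, 1:nTok(i)) = tokens{i};
end
```

Rows with fewer tokens keep empty cells in their trailing columns, the same as in Jan's preallocated version.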


Release: R2017b
Asked: 9 Jan 2019
Commented: 22 Jan 2019
