Cell Array Size and Saving

Sven, 11 July 2018
Commented: Sven, 12 July 2018
Hi,
I wanted to save my fairly complex and large class to a file and got a much larger file size than I expected, so I examined which parts were driving the size. I was quite surprised that simple cell arrays of strings consume a disproportionate amount of space. Is there an easy way to avoid this?
Here is my MWE:
Names50 = cell(50,1);
Names2 = cell(2,1);
for i=1:length(Names2)
    Names2{i} = 'a';
end % for i
for i=1:length(Names50)
    Names50{i} = 'b';
end % for i
When I check the size with a small routine I found, I get quite confusing results:
getSize(Names2) --> 228
getSize(Names50) --> 5700
getSize(Names2{1}) --> 2
The single element is just 2 bytes, while a cell array holding 2 × 2 bytes takes 228 bytes, or even 5700 bytes with 50 rows. Is the overhead of cell arrays really that disproportionately large? Can it somehow be avoided when saving?
Thanks in advance
Best
Sven
P.S.: Code for getSize:
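function [ bytes ] = getSize( variable )
% Size in bytes as reported by whos; for objects, summed over the properties.
props = properties(variable);
if size(props, 1) < 1
    % No properties (not an object): query whos for the variable itself.
    bytes = whos(varname(variable));
    bytes = bytes.bytes;
else % code of Dmitry
    bytes = 0;
    for ii=1:length(props)
        currentProperty = getfield(variable, char(props(ii)));
        s = whos(varname(currentProperty));
        fprintf('Property: %s : %d bytes\n',props{ii},s.bytes)
        bytes = bytes + s.bytes;
    end
end
end

function [ name ] = varname( ~ )
% Name of the variable the caller passed in, as seen in the caller's workspace.
name = inputname(1);
end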

Accepted Answer

Guillaume, 11 July 2018
Yes, there is necessary overhead for cell arrays. Note that whos (which your getSize uses) does not actually show all the memory used by variables.
By necessity a cell array cannot just store the content of the data (your 2 bytes consumed by 'a'). It also needs to store:
  • where that content is actually stored in memory (since the content of the cell array can be anything, the content is not actually stored inside the cell array, just a pointer to the content)
  • the matrix header for that content, which includes:
      • the type of content
      • how many dimensions that content has
      • the length of each dimension of that content
This results in an overhead of 112 bytes per non-empty cell (empty cells only need 8 bytes to store a null pointer).
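As a rough check of that figure against the numbers in the question: 2 × (112 + 2) = 228 and 50 × (112 + 2) = 5700, which matches what getSize reported. The sketch below is not part of the original answer; it derives the per-cell overhead from whos instead of hard-coding 112, on the assumption that the exact value may differ between MATLAB releases. The variable names oneChar, overhead2 and overhead50 are made up for illustration.

```matlab
% Rebuild the cell arrays from the question.
Names2  = repmat({'a'}, 2, 1);
Names50 = repmat({'b'}, 50, 1);
oneChar = 'a';   % payload of a single cell

% whos returns a struct; its bytes field is the reported size.
s2  = whos('Names2');
s50 = whos('Names50');
sc  = whos('oneChar');

% Per-cell overhead = (total bytes / number of cells) - payload bytes.
overhead2  = s2.bytes  / numel(Names2)  - sc.bytes;
overhead50 = s50.bytes / numel(Names50) - sc.bytes;

fprintf('payload per cell: %d bytes\n', sc.bytes);
fprintf('overhead per cell: %d bytes (2 cells), %d bytes (50 cells)\n', ...
    overhead2, overhead50);
```

If the overhead comes out the same for both arrays, the reported size grows linearly with the number of cells, which is the pattern in the question.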
To that you need to add more bytes that whos doesn't show and that are required for every variable in matlab:
  • the type of the variable (i.e. it's a cell array)
  • how many dimensions that variable has
  • the length of each dimension
  Comments: 1
Sven, 12 July 2018
Thank you very much for this detailed answer. I feared there was an overhead, but did not expect it to be that large and for each cell. So I guess there is no workaround.
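One possible workaround, which is not from the answer above and is only a sketch: if every cell holds a short string, the strings can be packed into a single char matrix before saving and unpacked afterwards, so the per-cell header is paid only once. This assumes the strings have no trailing whitespace that must be preserved and are of similar length, since char pads each row with blanks to the longest string and cellstr strips trailing whitespace again. The names packed and restored are hypothetical.

```matlab
Names50 = repmat({'b'}, 50, 1);

% Pack: one char matrix with one row per string
% (rows are blank-padded to the length of the longest string).
packed = char(Names50);

% Unpack: cellstr turns each row back into a cell and
% removes the trailing blanks added by the padding.
restored = cellstr(packed);

sCell = whos('Names50');
sChar = whos('packed');
fprintf('cell array: %d bytes, char matrix: %d bytes\n', ...
    sCell.bytes, sChar.bytes);
```

In a class, the packed form could be what gets written to the file while the cell array is rebuilt on load. Whether that is worth the extra code depends on how much of the file the cell arrays actually account for.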

Release

R2016a
