Cheat for preallocation requirement

조회 수: 2 (최근 30일)
Arda Aksu
Arda Aksu 2021년 6월 10일
댓글: dpb 2021년 6월 11일
Hi, I almost always use preallocation to make the code run faster. But recently I was faced with a problem where I couldn't use preallocation. Even creating a big array and then shrinking at the end was not possible to my use. So I needed a cheat. While trying some workaround solutions I realized preallocation is not such a problem for cell arrays as it is for numeric arrays. Below is an example code. I am wondering the reason behind this. How does Matlab handle memory management for cell arrays with different than numeric arrays?
tic
x=[];
for i=1:1e5
x(end+1,:)=1:10;
end
toc
tic
c={};
for i=1:1e5
c{end+1,1}=1:10;
end
x=zeros(numel(c),10);
for i=1:numel(c)
x(i,:)=c{i};
end
toc
% Elapsed time is 10.064584 seconds.
% Elapsed time is 0.132362 seconds.
  댓글 수: 1
dpb
dpb 2021년 6월 10일
Cell arrays don't have to be contiguous as are native numeric data types.

댓글을 달려면 로그인하십시오.

답변 (1개)

Jan
Jan 2021년 6월 10일
x = [];
for i = 1:1e5
x(end+1, :) = 1:10;
end
Here you allocate sum(1:1e5) * 10 * 8 bytes: 400 GB.
c = {};
for i=1:1e5
c{end+1,1} = 1:10;
end
This allocates sum(1:1e5) * 8 bytes, because you need only one additional pointer per iteration. In addition, the elements of c point all to the same variable, which is shared:
format debug
c{1}
% Structure address = ebf72da0
% m = 1
% n = 10
% pr = f6ba8aa0
c{2}
% Structure address = ebf72da0
% m = 1
% n = 10
% pr = f6ba8aa0 <- the same address!
In the first code, the contents of the array is copied in each iteration, while in the cell method, all cells use the same data pointer, such that much less data have to be copied and less memery is occupied. On the other hand, each cell element has a header of about 100 bytes.
The JIT acceleration can cope with an omitted pre-allocation. This can reduce the dramatic slow down under some circumstances.
By the way, I cannot imagine a situation, where a pre-allocation is not possible. Could you post a minimal working example? At least allocating in steps of 1000 should be possible.
  댓글 수: 9
James Tursa
James Tursa 2021년 6월 11일
"Do you have an explanation for the timings measured in the original question?"
Yes. As Jan pointed out, the first method copies and re-copies the data over and over and over again. The second method doesn't.
dpb
dpb 2021년 6월 11일
Well, yes...the Q? I interpreted as was about insight re: Jan's expectation of 10:1 vis a vis the OP's observed 100:1 ratio. Why on that, I've no klew, starting with I don't know why/how Jan derived his "expected" factor of 10.

댓글을 달려면 로그인하십시오.

카테고리

Help CenterFile Exchange에서 Logical에 대해 자세히 알아보기

태그

제품


릴리스

R2021a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by