Standard deviation takes forever

gujax on 12 Sep 2023
Commented: dpb on 13 Sep 2023
I have a double-precision numeric 3-D array M (converted by fread from uint8) of size 30000 x 500 x 500, and I would like to get the standard deviation along dimension 2.
tic, std(M,0,2); toc has been running for more than 12 hours and still has not finished, whereas mean(M,2) took only 80 seconds.
In a bit more detail:
std(M(:,:,1),0,2) takes 0.3 seconds and std(M(:,:,1:100),0,2) takes 34 seconds, but std(M(:,:,1:500),0,2) reports out of memory.
Similarly, mean(M(:,:,1),2) takes 0.1 seconds, but mean(M(:,:,1:500),2) fails with an 'out of memory' message, while mean(M,2) takes about 80 seconds.
This is all very confusing! Thanks
7 Comments
dpb on 12 Sep 2023
Your original posting says "I have a double precision numeric 3D matrix M of size 30000 x 500 x 500..."
That's what I calculated above: at 8 bytes per double, it takes up 59 GB of storage.
I don't follow what "an accumulation of (500 x 100 x 5) files each 31 KB in size" means.
I think you're going to have to show us specifically what your array is and how it was constructed.
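The storage estimate in this comment can be reproduced with a few lines of arithmetic. The sketch below assumes the 30000 x 500 x 500 size from the original question and compares the footprint of double (8 bytes per element) with uint8 (1 byte per element); the variable names are illustrative only.

```matlab
% Footprint of a 30000 x 500 x 500 array for two element types.
dims = [30000 500 500];
nElements = prod(dims);              % 7.5e9 elements

bytesDouble = nElements * 8;         % double: 8 bytes per element
bytesUint8  = nElements * 1;         % uint8: 1 byte per element

% Report both decimal gigabytes and binary gibibytes.
fprintf('double: %.1f GB (%.1f GiB)\n', bytesDouble/1e9, bytesDouble/2^30);
fprintf('uint8 : %.1f GB (%.1f GiB)\n', bytesUint8/1e9,  bytesUint8/2^30);
```

Under these assumptions the double array needs about 60 GB, which is close to the figure quoted in the comment, while the same data kept as uint8 would need about 7.5 GB.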
gujax on 12 Sep 2023
Edited: gujax on 13 Sep 2023
Ah, got it!
I append a 31 KB chunk of streaming time-series data to one file 100 x 500 x 500 times, instead of writing 5 million separate files.
So that's about 8 GB of data.
But when I read it, I didn't realize that fread converts it to double by default.
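The default conversion mentioned here can be avoided by passing a precision string to fread. The sketch below writes a small temporary file (a hypothetical stand-in for the real data file) and reads it back twice: once with the 'uint8' precision, which returns double, and once with '*uint8', which is shorthand for 'uint8=>uint8' and keeps the data as uint8.

```matlab
% Write a small uint8 test file (hypothetical stand-in for the real data).
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, uint8(0:255), 'uint8');
fclose(fid);

% Source precision 'uint8' alone: values are returned as double.
fid = fopen(fname, 'r');
asDouble = fread(fid, Inf, 'uint8');
fclose(fid);

% '*uint8' (shorthand for 'uint8=>uint8') keeps the native class.
fid = fopen(fname, 'r');
asUint8 = fread(fid, Inf, '*uint8');
fclose(fid);

delete(fname);
disp(class(asDouble));
disp(class(asUint8));
```

Keeping the data as uint8 uses one eighth of the memory of double, at the cost of converting to a floating-point type before calling functions that require it.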


Accepted Answer

gujax on 13 Sep 2023
Calculating the standard deviation takes more memory than calculating the mean. When std is applied to a large double-precision data set, it will likely slow the computer down if memory is limited. That may not be true for the mean.
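One way to keep the memory demand of std bounded, sketched here as an assumption rather than as the method used in this thread, is to process the array in blocks of pages along dimension 3. The standard deviation along dimension 2 is independent for each page, so the blocks can be computed separately and only one block is converted to double at a time. The array sizes and the blockSize value below are small illustrative stand-ins.

```matlab
% Small uint8 stand-in for the real 30000 x 500 x 500 array.
M = uint8(randi([0 255], 300, 50, 20));
[nRows, ~, nPages] = size(M);

blockSize = 5;                        % pages per block (illustrative choice)
S = zeros(nRows, 1, nPages);          % result has size nRows x 1 x nPages

for k = 1:blockSize:nPages
    idx = k:min(k + blockSize - 1, nPages);
    % Convert only the current block to double, then take std along dim 2.
    S(:, 1, idx) = std(double(M(:, :, idx)), 0, 2);
end
```

The block size trades speed against peak memory: larger blocks mean fewer loop iterations but larger temporary arrays.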

More Answers (1)

Steven Lord on 12 Sep 2023
Can you confirm you're using the std function included in MATLAB? What does this command show?
which -all std
/MATLAB/toolbox/matlab/datafun/std.m
/MATLAB/toolbox/matlab/datatypes/tabular/@tabular/std.m          % tabular method
/MATLAB/toolbox/matlab/datatypes/datetime/@datetime/std.m        % datetime method
/MATLAB/toolbox/matlab/datatypes/duration/@duration/std.m        % duration method
/MATLAB/toolbox/matlab/timeseries/@timeseries/std.m              % timeseries method
/MATLAB/toolbox/matlab/bigdata/@tall/std.m                       % tall method
/MATLAB/toolbox/parallel/parallel/@distributed/std.m             % distributed method
9 Comments
gujax on 13 Sep 2023
Edited: gujax on 13 Sep 2023
I think I will mark this issue as resolved, i.e., calculating the standard deviation takes more memory than calculating the mean. When std is applied to a large double-precision data set, it will likely slow the computer down if memory is limited. That may not be true for the mean.
dpb on 13 Sep 2023
The issue you're having must be disk swapping owing to limited real memory... I'm still not positive about just how big your array is. How about
whos M
to tell us precisely what you're processing, and
memory
for the available memory your machine has?
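The two checks requested above can also be run programmatically. The sketch below uses a small stand-in array and captures the output of whos as a struct so that the size, class, and byte count can be inspected. Note that the memory function is only available on Windows, so the call is guarded with ispc.

```matlab
% Small stand-in for the real array.
M = zeros(300, 50, 20);

% With an output argument, whos returns a struct (name, size, bytes, class, ...).
info = whos('M');
fprintf('%s: size %s, class %s, %.3f MiB\n', ...
    info.name, mat2str(info.size), info.class, info.bytes / 2^20);

% memory is only available on Windows, so guard the call.
if ispc
    memory
end
```

Comparing info.bytes with the physical memory reported by memory gives a quick indication of whether the array and the temporaries created by std can fit without swapping.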
It depends on how TMW builds the executable and what processor instructions they assume; unfortunately, it's likely they code to a "lowest common denominator" of what is out there, because they know that not all customers will have the latest CPU technology with the enhanced vector-processing instructions that make use of the built-in vector pipelines in current processors.
I've never tried it myself, but if you have a high-memory graphics card, you could possibly try the GPU stuff...


Release

R2022b
