
Loading a specified column from a MAT-file with matfile is too slow

Yu Li on 3 Oct 2018
Commented: Walter Roberson on 3 Oct 2018
I have a MAT-file containing a matrix of size 5e6-by-50.
I want to write code that loads one specific column into memory, but I found that reading a single column takes nearly as long as reading the whole variable.
Below is the test code:
A=rand(5e6,50);
save A A
f=matfile('A');
tic
tmp=f.A;
toc
tic
tmp=f.A(:,1);
toc
Is there any way to improve the performance?
Thanks!
Yu
  Comments: 2
Walter Roberson on 3 Oct 2018
I notice you did not specifically save with -v7.3, so you might be getting a -v7 file.
When I test on my system with -v7.3, selecting one column comes out roughly 10% faster. Not as good as one might hope, though.
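The file version matters here because, as far as the MathWorks documentation for matfile describes it, efficient partial loading is only available for Version 7.3 MAT-files; with older formats, indexing into a variable still reads the whole variable. The following is a minimal sketch of the same test with the version requested explicitly. The reduced matrix size and the file name `A_v73.mat` are assumptions chosen for illustration, not part of the original test.

```matlab
% Minimal sketch: same comparison as in the question, but with -v7.3
% requested explicitly. Size and file name are illustrative assumptions.
A = rand(1e5, 50);                         % smaller than the 5e6-by-50 matrix in the question
fname = fullfile(tempdir, 'A_v73.mat');    % hypothetical file name
save(fname, 'A', '-v7.3');                 % HDF5-based format that supports partial loading

f = matfile(fname);                        % creates the accessor; A is not read yet

tic
whole = f.A;                               % read the entire variable
tWhole = toc;

tic
col = f.A(:, 1);                           % read only the first column
tCol = toc;

fprintf('whole variable: %.3f s, one column: %.3f s\n', tWhole, tCol);
```

Timings will differ from machine to machine, so the sketch prints them rather than assuming any particular ratio.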
Yu Li on 3 Oct 2018
Yes, I would expect reading one column to take about 2% of the time needed to read the whole variable, since there are 50 columns in total.
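The 2% estimate follows directly from the amount of data involved. A small worked calculation, assuming the matrix holds 8-byte doubles (which is what rand returns), might look like this; it is only a lower bound on the ideal cost, since it ignores fixed overheads and whatever storage layout and compression the file uses.

```matlab
% Worked arithmetic for the expected fraction of data in one column.
nRows = 5e6;
nCols = 50;
bytesPerDouble = 8;                          % rand() returns doubles

wholeBytes  = nRows * nCols * bytesPerDouble;   % bytes in the full matrix
columnBytes = nRows * bytesPerDouble;           % bytes in a single column
fraction    = columnBytes / wholeBytes;         % equals 1/nCols

fprintf('whole: %.0f MB, one column: %.0f MB, fraction: %.1f%%\n', ...
        wholeBytes / 1e6, columnBytes / 1e6, 100 * fraction);
```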


Answers (1)

Walter Roberson on 3 Oct 2018
In this case, you can do much better by using -nocompression
Save -v7.3 -nocompression
Elapsed time is 17.814604 seconds.
Done save -v7.3 -nocompression
Start matfile -v7.3 -nocompression
Elapsed time is 0.016278 seconds.
Done matfile -v7.3 -nocompression
Start recall entire variable -v7.3 -nocompression
Elapsed time is 2.195975 seconds.
Done recall entire variable -v7.3 -nocompression
Start recall one column -v7.3 -nocompression
Elapsed time is 1.089280 seconds.
Done recall one column -v7.3 -nocompression
Save -v7.3
Elapsed time is 58.543461 seconds.
Done save -v7.3
Start matfile -v7.3
Elapsed time is 0.077814 seconds.
Done matfile -v7.3
Start recall entire variable -v7.3
Elapsed time is 10.139135 seconds.
Done recall entire variable -v7.3
Start recall one column -v7.3
Elapsed time is 9.118167 seconds.
Done recall one column -v7.3
Source code:
A=rand(5e6,50);
time_it(A, {'-v7.3' '-nocompression'})
time_it(A, {'-v7.3'})
function time_it(A, saveoptions)
    % Save A with the given options, then time three steps: creating the
    % matfile object, reading the whole variable, and reading one column.
    savedesc = strjoin(saveoptions, ' ');

    fprintf('Save %s\n', savedesc);
    tic
    save('A', 'A', saveoptions{:});
    toc
    fprintf('Done save %s\n', savedesc);

    fprintf('Start matfile %s\n', savedesc);
    tic
    f = matfile('A');            % only creates the accessor; no data is read yet
    toc
    fprintf('Done matfile %s\n', savedesc);

    fprintf('Start recall entire variable %s\n', savedesc);
    tic
    tmp=f.A;                     % full read
    toc
    fprintf('Done recall entire variable %s\n', savedesc);

    fprintf('Start recall one column %s\n', savedesc);
    tic
    tmp=f.A(:,1);                % partial read of the first column
    toc
    fprintf('Done recall one column %s\n', savedesc);
end
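To put the reported timings next to the ideal 1/50 discussed in the question, the ratios can be computed directly. This sketch only re-uses the elapsed times printed in this answer; the variable names are made up for illustration.

```matlab
% Timings copied from the output reported in this answer (seconds).
t = struct( ...
    'nocompression', struct('whole',  2.195975, 'column', 1.089280), ...
    'compressed',    struct('whole', 10.139135, 'column', 9.118167));

% Fraction of the full-read time that a one-column read took.
ratioNoComp = t.nocompression.column / t.nocompression.whole;
ratioComp   = t.compressed.column    / t.compressed.whole;
ideal       = 1 / 50;            % one of 50 columns

fprintf('-v7.3 -nocompression: column/whole = %.3f\n', ratioNoComp);
fprintf('-v7.3 (compressed):   column/whole = %.3f\n', ratioComp);
fprintf('ideal (1/50):         column/whole = %.3f\n', ideal);
```

On these numbers the one-column read takes about half the full-read time without compression and about 90% with compression, so both remain well above the ideal, although every step is several times faster without compression.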
  Comments: 7
Yu Li on 3 Oct 2018
Hi:
Thanks for your test; it gave me a much deeper understanding of this.
I think there must be a bottleneck somewhere, so I will contact MathWorks for further investigation.
Thanks!
Yu
Walter Roberson on 3 Oct 2018
You can refer them to this post and my test code.
One thing they are likely to point out is that the default of compression is intended for "real" data, not for rand(), and that when you read/write with compression, the performance would be expected to vary with how compressible the data is. Thus you should probably run this code with the rand() replaced by load of one of your actual matrices.
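As a rough illustration of the compressibility point, the sketch below saves two stand-in matrices with the default (compressed) -v7.3 format and reports the file size and one-column read time for each. The constant matrix is only a placeholder for "highly compressible" data, and the row count and file name are assumptions; for a meaningful result, replace the stand-ins with one of your own matrices.

```matlab
% Sketch: how data compressibility affects a default -v7.3 save.
% Both data sets are stand-ins; real data will fall somewhere in between.
n = 2e5;                                       % assumed row count, kept small
datasets = struct( ...
    'name', {'rand (nearly incompressible)', 'constant (highly compressible)'}, ...
    'data', {rand(n, 50), ones(n, 50)});

fname = fullfile(tempdir, 'compress_test.mat');   % hypothetical file name
fileBytes = zeros(1, numel(datasets));
colTime   = zeros(1, numel(datasets));

for k = 1:numel(datasets)
    A = datasets(k).data;
    save(fname, 'A', '-v7.3');                 % default: compression enabled
    info = dir(fname);
    fileBytes(k) = info.bytes;                 % size on disk after compression

    f = matfile(fname);
    tic
    col = f.A(:, 1);                           % partial read of one column
    colTime(k) = toc;

    fprintf('%-32s file: %8.1f MB, one column: %.3f s\n', ...
            datasets(k).name, fileBytes(k) / 1e6, colTime(k));
end
```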
