How to efficiently integrate big data without using memory / (How to create big data)
조회 수: 9 (최근 30일)
이전 댓글 표시
- in a study i will produce large arrays.
- Each array will have at least 500 MB size.
- Each array will have the same number of rows.
- the total size of dataset will be approximately 20 GB or over.
- Somehow I have to create a single variable/array which includes all data and size of 20 GB.
matfile seems a good solution. However when the size of file increases, it gets slower. How can i handle this problem?
댓글 수: 9
Walter Roberson
2015년 8월 18일
I wonder if compression is leading to slowdowns? I do not know whether -v7.3 with matfile uses compression; see discussion http://www.mathworks.com/matlabcentral/answers/15521-matlab-function-save-and-v7-3 and http://www.mathworks.com/matlabcentral/answers/137592-compress-only-selected-variables-when-saving-to-mat
채택된 답변
JMP Phillips
2015년 8월 19일
편집: Walter Roberson
2015년 8월 19일
Here are some things you could try:
Use the matfile function, which allows you to access and change variables directly in MAT-files, without loading into memory: http://au.mathworks.com/help/matlab/large-mat-files.html http://au.mathworks.com/help/matlab/ref/matfile.html
Structure your data differently: - if you are representing the data as doubles, maybe you can afford less accuracy e.g. use int32. For example, you can use scaling of 1e4 to represent a double value such as 100.3425 as an integer 1003425.
With MATLAB:
- use 64 bit matlab version
- try disabling compression when saving the files, with the -v6 option
Optimize your PC for your task:
- in task manager, close any unnecessary processes running at the same time, including taskbar junk (adobe update, java update etc)
- disable your anti-virus which might be trying to scan the file and slowing it down
- under task manager, give higher priority to the MATLAB process (see http://www.sevenforums.com/tutorials/83361-priority-level-set-applications-processes.html)
- increase your virtual memory or page file size http://windows.microsoft.com/en-au/windows/change-virtual-memory-size#1TC=windows-7
- defragment your hard drive
- run MATLAB from your local hard drive and not a network drive or external harddrive
- save the .mat file to your local hard drive where it has plenty of space, not a network drive or external harddrive.
- For faster hard drive access, use a Solid State Drive (SSD)
댓글 수: 2
Walter Roberson
2015년 8월 19일
The -v6 option is incompatible with matfile and with objects over 2 Gb.
추가 답변 (0개)
참고 항목
카테고리
Help Center 및 File Exchange에서 Standard File Formats에 대해 자세히 알아보기
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!