
Reading a large .dat file or some parts of it

Amin Rajabi on 24 Feb 2017
Commented: Walter Roberson on 9 Oct 2018
Dear All,
I have a very large DAT file (almost 16 GB). It contains the electricity usage of 8000 customers over about 4 years, recorded every 30 minutes (so it has roughly 8000*4*365*24*2 rows, i.e. about 560 million!).
MS Excel will open this file, but it obviously loads only part of it. From that part I could work out that the format is something like this:
990814, 246745, 0, 2012-07-22 20:00:00, 3.25, 0,0,0,0
which corresponds with:
CUSTOMER_KEY, CALENDAR_KEY, EVENT_KEY, READING_DATETIME, GENERAL_SUPPLY_KWH, CONTROLLED_LOAD_KWH, GROSS_GENERATION_KWH, NET_GENERATION_KWH, OTHER_KWH
My main problem is that when I try to load it into MATLAB, it fails because there is not enough RAM.
I have read about fopen, fread, fscanf, textscan, etc. However, I couldn't figure out whether it is possible to read only a part of this DAT file instead of the whole thing. Is there a command to read, for example, rows 100 to 1000 of this DAT file without loading all of it into memory?
I only need the usage of about 1000 customers for one month.
Thanks in advance for your help.

Accepted Answer

Walter Roberson on 25 Feb 2017
The calling sequence for textscan is:
textscan(SOURCE, FORMAT, COUNT, OPTIONS...)
where SOURCE is either a file identifier or a string, FORMAT is a string, and COUNT is the maximum number of times to apply the FORMAT.
So to read a particular portion of the file, you can use the 'HeaderLines' option to skip everything before that portion, and you can use COUNT to give the number of lines to process.
COUNT is not exactly a number of lines, though: if the file has empty lines then, unless you have chosen your options carefully, an empty line is treated as leading whitespace and is skipped automatically without incrementing the count. It is more accurate to say that, provided there is enough data, the count is the number of rows of data that are returned.
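As a minimal sketch of the approach above, the following reads rows 5 to 10 of a small demo file written in the layout from the question. The file name, the row range, and the format string `'%f %f %f %s %f %f %f %f %f'` are assumptions made for illustration (the timestamp is read as text with `%s`); for the real file you would drop the part that writes the demo data and set `firstRow` and `lastRow` to, say, 100 and 1000.

```matlab
% Build a small demo file in the same layout as the question (20 rows).
demoFile = [tempname '.dat'];   % hypothetical file, for illustration only
fid = fopen(demoFile, 'w');
for k = 1:20
    fprintf(fid, '%d, 246745, 0, 2012-07-22 20:00:00, %.2f, 0,0,0,0\n', k, 0.25*k);
end
fclose(fid);

% One conversion per column; the timestamp column is kept as text.
fmt = '%f %f %f %s %f %f %f %f %f';
firstRow = 5;
lastRow  = 10;

fid = fopen(demoFile, 'r');
% Skip the first (firstRow - 1) lines, then apply the format at most
% (lastRow - firstRow + 1) times.
C = textscan(fid, fmt, lastRow - firstRow + 1, ...
    'Delimiter', ',', 'HeaderLines', firstRow - 1);

% A second call on the same file identifier resumes where the first one
% stopped, so a large file can be processed block by block.
D = textscan(fid, fmt, 3, 'Delimiter', ',');
fclose(fid);
delete(demoFile);

customerKey = C{1};   % CUSTOMER_KEY for rows 5..10
readingTime = C{4};   % READING_DATETIME as a cell array of character vectors
generalKWh  = C{5};   % GENERAL_SUPPLY_KWH for rows 5..10
nextKeys    = D{1};   % CUSTOMER_KEY for rows 11..13
```

Because each call returns at most COUNT rows, memory use is bounded by the block size rather than by the file size. To keep only some customers or one month, each block could be filtered on `customerKey` and `readingTime` before the next block is read.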
  3 Comments
Rahimeh Rouhi on 8 Oct 2018
Dear Walter Roberson, could you please help? I have a similar problem with a big dataset of images stored as .mat files. I used matfile to keep all the data on a hard disk and load only some parts, but it is very slow. What is a better way to load part of the data into the workspace? Would it help to write the data to a text file and load it with the command you mentioned?
Walter Roberson on 9 Oct 2018
How do you store the images inside the .mat file? Cell array? One variable per image? Multidimensional array? Struct array?
