CSV vs MAT files

Sam Da on 26 May 2011
Should I store a very large amount of data as .mat files or as .csv files? Which is (1) more efficient when it comes to reading the data and (2) more compact in terms of file size?
  Comments: 4
Jason Ross on 26 May 2011
Come on, a 2TB drive is $100 nowadays :). Of course, that's not enterprise-class storage, but it's fairly easy to fill up with virtual machine images and HD video (in a DVR, for instance). Or piles of data going into or out of something or other.
Sean de Wolski on 26 May 2011
We take high-resolution XMT scans, do all sorts of fun math on them, and we may fill a TB over the course of a year.
I do enjoy the reward of knowing that it is sometimes faster, and (obviously) takes less space, to recompute something than to store it and load it.

Accepted Answer

Ben Mitch on 26 May 2011
I think the answer to both (1) and (2) is the .mat file. ASCII files (like CSV) have to be converted to and from the binary format used in memory on every write and read, which makes them slow. Moreover, if written with more than about 6 significant figures, each value takes more bytes as text than the 8 bytes of the usual double-precision binary format, so the files are bigger as well. Therefore, you should probably only use CSV if either (a) you need to exchange data with software that can read CSV but cannot read MAT (like Excel) or (b) you want to be able to peruse the data yourself in a text editor or CSV editor.
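If you want to check this on your own machine, here is a rough sketch rather than a proper benchmark. The file names and the matrix size are arbitrary choices of mine, and I use dlmwrite/dlmread because they exist in old releases; on R2019a or later, writematrix/readmatrix would be the more current choice.

```matlab
% Sketch: compare on-disk size and read time of CSV vs MAT for one matrix.
% File names (in tempdir) and the matrix size are arbitrary illustration values.
data = randn(2000, 50);
csvFile = fullfile(tempdir, 'csv_vs_mat_demo.csv');
matFile = fullfile(tempdir, 'csv_vs_mat_demo.mat');

% Write the CSV with 17 significant digits so that doubles survive the round trip.
dlmwrite(csvFile, data, 'delimiter', ',', 'precision', '%.17g');
save(matFile, 'data');

% Time a full read of each file.
tic; fromCsv = dlmread(csvFile, ','); tCsv = toc;
tic; s = load(matFile);               tMat = toc;

% dir reports the file size in bytes.
csvInfo = dir(csvFile);
matInfo = dir(matFile);
fprintf('CSV: %d bytes, read in %.3f s\n', csvInfo.bytes, tCsv);
fprintf('MAT: %d bytes, read in %.3f s\n', matInfo.bytes, tMat);
```

The timings depend on the disk, the release and the data, so treat a single run as an indication only.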
As an aside, if speed is your absolute goal, consider using
save myfile myvariable -v6
because both the save() and load() commands are much quicker if compression is disabled like this (compression was not available in Version 6 of MATLAB). Conversely, if small file size is your goal, use the usual save/load commands.
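A quick way to see the trade-off is to save the same variable both ways and compare. This is only a sketch: the file names are made up, and the data is deliberately very compressible, so the size gap will be smaller for noisy data.

```matlab
% Sketch: same variable saved without compression (-v6) and with it (-v7).
% The data is deliberately repetitive so that compression has a visible effect.
x = repmat((1:1000)', 1, 500);
f6 = fullfile(tempdir, 'demo_v6.mat');
f7 = fullfile(tempdir, 'demo_v7.mat');

tic; save(f6, 'x', '-v6'); t6 = toc;   % uncompressed
tic; save(f7, 'x', '-v7'); t7 = toc;   % compressed

d6 = dir(f6);
d7 = dir(f7);
fprintf('-v6: %d bytes, saved in %.3f s\n', d6.bytes, t6);
fprintf('-v7: %d bytes, saved in %.3f s\n', d7.bytes, t7);
```

Keep in mind that the -v6 and -v7 formats cannot hold a single variable larger than 2 GB; for that you need -v7.3.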
Depending on the data, you might get some additional savings by first casting to a smaller data type. For instance, if you are storing data as double precision but are confident that single precision will be enough, try this:
a = randn(1000, 1);   % double precision, 8 bytes per value
save a1 a
a = single(a);        % single precision, 4 bytes per value
save a2 a
and check the file sizes. Note that a2.mat is smaller because you have thrown away information that you judged to be irrelevant.
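To put numbers on both the saving and the loss, something along these lines should work. It is a sketch that writes to tempdir under file names I picked, and it assumes none of the random values is exactly zero.

```matlab
% Sketch: measure the size saving and the rounding error of a single cast.
aDouble = randn(1000, 1);
aSingle = single(aDouble);
f1 = fullfile(tempdir, 'a1.mat');
f2 = fullfile(tempdir, 'a2.mat');

a = aDouble; save(f1, 'a');
a = aSingle; save(f2, 'a');

d1 = dir(f1);
d2 = dir(f2);
fprintf('double: %d bytes, single: %d bytes\n', d1.bytes, d2.bytes);

% Largest relative error introduced by rounding to single precision.
relErr = max(abs(double(aSingle) - aDouble) ./ abs(aDouble));
fprintf('largest relative rounding error: %.2e\n', relErr);
```

If that error is below what your measurements can resolve anyway, the cast costs you nothing that matters.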
  Comments: 1
Ken Atwell on 13 Nov 2011
Another advantage of a MAT file is random access -- a CSV file does not necessarily have predictable line lengths, so you cannot reliably seek into the middle of the file. When working with big data, the ability to seek and incrementally read is useful. More at:
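As an illustration of that kind of incremental read, here is a minimal sketch using matfile. It assumes R2011b or later (where matfile was introduced) and a file saved with -v7.3; the file name and array size are made up for the example.

```matlab
% Sketch: read a small block of a variable without loading the whole array.
f = fullfile(tempdir, 'demo_partial.mat');
big = reshape(1:200000, 2000, 100);   % stand-in for a much larger array
save(f, 'big', '-v7.3');              % v7.3 files support efficient partial access

m = matfile(f);                       % creates a handle; the array is not loaded
[nRows, nCols] = size(m, 'big');      % dimensions, queried without loading the data
block = m.big(501:510, 1:5);          % reads only this 10-by-5 block from disk
```

With files in the older formats, matfile still works but may have to load the entire variable to serve the request.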

More Answers (3)

Sean de Wolski on 26 May 2011
  • If you're going to be using it outside of MATLAB -> CSV
  • If you're only using it within MATLAB and you're motivated to write/read it directly, then write it to binary using fwrite, and use fread to pull it in. I used to do this, but got lazy and realized it's much easier, even if slower, to use MAT files
  • If you're lazy and want something easy/the ability to store multiple variables at once -> .MAT
My suggestion after years of being angry at reading in a binary file with the wrong dimensions: use .mat files.
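If you do go the fwrite/fread route, one way to avoid the wrong-dimensions problem is to write the dimensions into the file ahead of the data. The header layout below is just something I made up for illustration, not any standard format.

```matlab
% Sketch: raw binary file with a small hypothetical header holding the dimensions.
A = rand(300, 40);
f = fullfile(tempdir, 'demo_raw.bin');

fid = fopen(f, 'w');
fwrite(fid, ndims(A), 'uint32');     % header: number of dimensions
fwrite(fid, size(A), 'uint32');      % header: length of each dimension
fwrite(fid, A, 'double');            % payload, in column-major order
fclose(fid);

fid = fopen(f, 'r');
nd = fread(fid, 1, 'uint32');        % how many dimensions to expect
sz = fread(fid, nd, 'uint32')';      % the dimensions, as a row vector
B  = fread(fid, prod(sz), 'double'); % the payload, as one long column
fclose(fid);
B = reshape(B, sz);                  % restore the original shape
```

This is essentially a tiny part of what a MAT file already does for you, which is why the MAT file ends up being the lazy and safe choice.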

Sam Da on 26 May 2011
I am going to use the .mat files in both MATLAB and R. R does have packages to load .mat files. So, is .mat or .csv preferred in terms of read speed and file size?
I am talking about, let's say, 500 GB of data.
  Comments: 2
Sean de Wolski on 26 May 2011
Do you have 500 GB of RAM? You won't be storing all 500 GB in one MAT file, right? As long as each MAT file stays below roughly half the RAM you have, you should be fine. You can store more than that in one file, but you won't be able to do anything with it once loaded without exceeding your RAM limit.
Walter Roberson on 26 May 2011
You can append to existing .mat files, if you are adding new variables. (You cannot append to an existing variable except by rewriting the whole variable.)
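For example, adding a second variable to an existing file looks roughly like this (the file and variable names are placeholders):

```matlab
% Sketch: add a new variable to an existing MAT file with -append.
f = fullfile(tempdir, 'demo_append.mat');

run1 = rand(100, 3);
save(f, 'run1');                 % creates (or overwrites) the file with one variable

run2 = rand(100, 3);
save(f, 'run2', '-append');      % adds run2; run1 stays in the file

info = whos('-file', f);         % lists the stored variables without loading them
disp({info.name});
```

Note that if you append a variable whose name is already in the file, save replaces the stored copy with the new value.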

Lisa Justin on 18 Oct 2012
I have a CSV file and want to convert it to a MATLAB file. How can I do this?
  Comments: 1
Sean de Wolski on 18 Oct 2012
Lisa, please ask a new question.
