Loading very large CSV files (~20GB)

Sadjad Fakouri Baygi on 17 Mar 2016
Commented: Sadjad Fakouri Baygi on 18 Mar 2016
I have some CSV files that I need to import into MATLAB (preferably in .mat format). I already tried reading each file in 100 pieces with the csvread function, but it is very slow and does not get past piece 60, even though my computer is fairly new. I only need to extract the numeric values, which are separated by commas. I would appreciate any help with getting through this.
Thanks,
Sajad
  Comments: 2
per isakson on 17 Mar 2016
Sadjad Fakouri Baygi on 17 Mar 2016
Thanks, this shortcut worked out.

Accepted Answer

Robert on 17 Mar 2016
You should look into datastore and mapreduce. They were introduced in R2014b and are intended for handling large data sets. The datastore object allows you to read the data in chunks, skip columns, and store the results in a table. Its behavior is somewhat similar to fread or fscanf with a size input; however, the datastore's use of a table allows you to assign a different data type to each column. I use this for data that includes Boolean status bits along with doubles, so that I don't have to store the Booleans as doubles in my data array.
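As a rough sketch of the chunked reading described above (not a tuned solution), the loop below reads a CSV file piece by piece and appends each piece to a MAT-file through matfile, so the whole data set never has to sit in memory. The file names demo.csv and demo.mat, the chunk size, and the small generated file standing in for the 20 GB one are all assumptions for illustration. It also assumes the file has no header row and only numeric columns, and it uses writematrix (R2019a or later) only to create the demo input.

```matlab
% Create a small stand-in for the large file (assumption: numeric, no header).
writematrix(reshape(1:3000, 1000, 3), 'demo.csv');
if exist('demo.mat', 'file'), delete('demo.mat'); end

% Point a datastore at the file; nothing is read yet.
ds = datastore('demo.csv', 'ReadVariableNames', false);
ds.ReadSize = 250;            % rows returned by each call to read (placeholder)
% To skip columns, keep only some of them, for example:
% ds.SelectedVariableNames = {'Var1', 'Var3'};

% matfile writes to disk incrementally (Version 7.3 MAT-file).
m = matfile('demo.mat', 'Writable', true);
nRows = 0;
while hasdata(ds)
    T = read(ds);             % table with up to ReadSize rows
    A = table2array(T);       % valid because every column is numeric
    n = size(A, 1);
    m.data(nRows+1:nRows+n, 1:size(A, 2)) = A;   % append this chunk on disk
    nRows = nRows + n;
end
```

For a real file, a much larger ReadSize is likely to be faster; the best value depends on available memory and the number of columns.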
mapreduce is specifically designed for operating on data sets that don't fit in memory. Rather than attempt to explain it, I will simply suggest you check out the documentation.
docsearch Getting started with mapreduce
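To give a feel for what the documentation describes, here is a minimal sketch of a mapreduce call that sums one column of a CSV file without loading it all at once. The names demo.csv, sumMapper, sumReducer, and the key 'sum' are made up for this illustration, and the tiny generated file only stands in for a large one. Local functions at the end of a script require R2016b or later; in older releases the two functions would go in their own files.

```matlab
% Small stand-in input (assumption: numeric, no header row).
writematrix(reshape(1:3000, 1000, 3), 'demo.csv');

ds = datastore('demo.csv', 'ReadVariableNames', false);
ds.ReadSize = 250;                        % rows handed to each mapper call
ds.SelectedVariableNames = {'Var1'};      % read only the first column

mapreducer(0);                            % run serially in this session
outds = mapreduce(ds, @sumMapper, @sumReducer);
result = readall(outds);                  % table with Key and Value columns
total = result.Value{1};                  % sum of the first column

% Mapper: called once per chunk; emits the partial sum of that chunk.
function sumMapper(data, ~, intermKVStore)
    add(intermKVStore, 'sum', sum(data.Var1));
end

% Reducer: called once per key; adds up all partial sums for that key.
function sumReducer(~, intermValIter, outKVStore)
    total = 0;
    while hasnext(intermValIter)
        total = total + getnext(intermValIter);
    end
    add(outKVStore, 'sum', total);
end
```

The mapper only ever sees one chunk, so memory use is bounded by ReadSize rather than by the file size.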
  Comments: 1
Sadjad Fakouri Baygi on 18 Mar 2016
This solution is more professional, and worked very well.
Thanks,