Reading large CSV files

Sanchit Sharma on 2 Apr 2022
Answered: Esha Chakraborty on 5 Apr 2022
Hello,
I have a large 28 GB .csv file that I am trying to read. I have tried readmatrix() and readtable(). Both functions give me the error below:
Caught "std::exception" Exception message is:
Failed to convert character code.
Could you please suggest a solution?
Thanks
  Comments: 3
Sanchit Sharma on 2 Apr 2022
Edited: Sanchit Sharma on 2 Apr 2022
Yes, I have 512 GB of RAM.
per isakson on 2 Apr 2022
Edited: per isakson on 2 Apr 2022
Probably your file contains some strange characters, e.g. to indicate missing data. The error message points in that direction. One way to find the position in the file that causes the error is
textscan( __________ , 'ReturnOnError',false )
It produces a better error message.
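
As a minimal sketch of that suggestion (the demo file, its two-column layout, and the format string '%f %f' are assumptions made up for illustration, not details of the original 28 GB file), the following writes a small CSV with one bad field and lets textscan report it. With 'ReturnOnError' set to false, textscan throws an error instead of silently returning what it has read so far, and the message typically names the row and field where parsing failed.

```matlab
% Build a tiny, hypothetical CSV with one non-numeric field.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'x,y\n1,2\n3,oops\n5,6\n');
fclose(fid);

% Read it back; adapt the format string and options to the real file.
fid = fopen(fname, 'r');
msg = '';
try
    C = textscan(fid, '%f %f', ...
        'Delimiter', ',', ...
        'HeaderLines', 1, ...
        'ReturnOnError', false);   % throw instead of returning partial data
catch err
    msg = err.message;             % describes where the read went wrong
    fprintf('textscan failed: %s\n', msg);
end
fclose(fid);
delete(fname);
```

For the real file, replace the demo file with your own path and use a format string with one conversion specifier per column.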


Answers (1)

Esha Chakraborty on 5 Apr 2022
Hi Sanchit,
I understand that you are receiving the message 'Failed to convert character code' when attempting to read large CSV files. A possible reason is that the read buffer is too large and too much data is being read at once. Try reducing the amount of data loaded at a time and check whether the error persists.
Here are a few ways to import a large CSV file:
  1. Split the file into smaller sections with any reliable third-party file-splitting software before importing it into MATLAB.
  2. Explore whether the Datastore feature suits your use case. A datastore is an object for reading a single file or a collection of files or data. It acts as a repository for data that has the same structure and formatting. See the Datastore documentation page for more details.
  3. Explore whether MapReduce is an option for your use case. MapReduce is a programming technique for analyzing data sets that do not fit in memory. See the MapReduce documentation page for more details.
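
Option 2 can be sketched roughly as follows. This is only an illustration under assumptions: the demo file, the column names a and b, and the chunk size of 4 rows are invented here, and a real 28 GB file would need a much larger ReadSize and its own column handling. The idea is that tabularTextDatastore reads a fixed number of rows per call to read, so only one chunk is in memory at a time.

```matlab
% Build a small, hypothetical CSV to stand in for the large file.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'a,b\n');
fprintf(fid, '%d,%d\n', [1:10; 11:20]);   % ten rows: "1,11", "2,12", ...
fclose(fid);

% Create the datastore and limit how many rows each read returns.
ds = tabularTextDatastore(fname);
ds.ReadSize = 4;                          % rows per chunk (tiny, for the demo)

% Process the file chunk by chunk, keeping only running totals in memory.
total = 0;
nRows = 0;
while hasdata(ds)
    T = read(ds);                         % table holding the next chunk
    total = total + sum(T.a);
    nRows = nRows + height(T);
end
fprintf('Read %d rows, sum of column a = %d\n', nRows, total);
delete(fname);
```

If a particular chunk triggers the character-conversion error, that narrows down which part of the file holds the problematic bytes.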

Release

R2022a