How can I import only the numbers from .csv files with a text header?

I have hundreds of .csv files and have attached one of them as an example (I had to shorten it because it was bigger than 5 MB). Each of them has 10^6 lines of data.
I want to import those files automatically in my MATLAB code. Importing them one by one is perfectly fine, but so far I have always had to preprocess the data manually in a text editor. The problem is the text in the header of every .csv file. I just want to import the numbers from the second, third and fourth columns and not the text from the header. But even if I specify the columns, I cannot convert the data returned by the datastore into numbers to run the calculations. This is my solution with the preprocessed data:
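pre_data = datastore('Data.csv');   % datastore for the manually preprocessed file
piece = zeros(1,3);                 % placeholder first row; chunks are appended below
while hasdata(pre_data)
    pie = read(pre_data);           % read the next chunk as a table
    pie = pie(:,1:3);               % keep the first three columns
    pie = table2array(pie);         % convert the table to an array
    piece = [piece; pie];           % append the chunk
end
piece = piece(9:10^6+8,:);          % keep rows 9 to 10^6+8 (10^6 rows)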
With "piece", I can now easily run the calculations.
To import the data without preprocessing, I tried "ds.SelectedVariableNames" and replacing "datastore" with "csvread", but neither worked.
Does anyone have advice on how to import such .csv files as an easily processable 1000000x3 double?
  1 Comment
dpb on 15 Dec 2018
Edited: dpb on 15 Dec 2018
Just attach the text of the first few lines of the file (10 is enough) showing the header and the data structure; how many data lines follow the header is immaterial to the solution (as long as you have enough memory to hold the data).
The key question is whether the header structure is the same in every file: is it always the same number of lines, is there a consistent number of blank records (if any) after the header, and so on.
Also, does every file have the same number of variables (columns), and are the records properly delimited where data are missing?
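For reference, a minimal sketch for printing the first ten lines of one file so they can be posted. It assumes the file is named Data.csv, as in the question, and sits in the current folder:

```matlab
% Sketch only: 'Data.csv' is the file name used in the question.
fid = fopen('Data.csv', 'r');
assert(fid ~= -1, 'Could not open Data.csv');
for k = 1:10
    txt = fgetl(fid);          % read one line without the newline
    if ~ischar(txt)            % fgetl returns -1 at end of file
        break
    end
    disp(txt);
end
fclose(fid);
```

Running the same loop on two or three different files is a quick way to see whether the header always has the same number of lines.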


Accepted Answer

Jeremy Hughes on 16 Dec 2018
You should be able to add 'NumHeaderLines',7 to the datastore call and get what you want.
The issue is that this looks a lot like a CSV file exported from Excel. There are a lot of extraneous commas, and that's throwing off all the file format detection.
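As a rough sketch of that suggestion: the file name Data.csv and the choice of the second to fourth columns come from the question, while the assumptions that those columns are detected as numeric and that the extra commas do not shift them are untested here and depend on the actual file.

```matlab
% Sketch only: file name and column positions are taken from the question.
ds = datastore('Data.csv', 'NumHeaderLines', 7);    % skip the 7 header lines
ds.SelectedVariableNames = ds.VariableNames(2:4);   % keep columns 2, 3 and 4
data = table2array(readall(ds));                    % N-by-3 array if the columns are numeric
```

Reading everything with readall avoids growing an array inside a while loop, which should be fine as long as the file fits in memory. If the extra commas produce more variables than expected, inspecting ds.VariableNames before selecting columns shows what was detected.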
  1 Comment
Christoph Müßig on 16 Dec 2018
Thank you all for your ideas and tricks. The solution to add 'NumHeaderLines',7 to the datastore call worked perfectly and solved the problem.

Release

R2018a
