readtable cannot handle double quotation marks very well

Kouichi C. Nakamura on 6 Jan 2021
Edited: Kouichi C. Nakamura on 7 Jan 2021
I have CSV files saved from LibreOffice in which text fields are enclosed in double quotation marks (Format quoted field as text).
When I tried to read one such CSV file, which has two rows, with readtable:
T0 = readtable('file1.csv',...
'Encoding','UTF-8','delimiter',',','ReadVariableNames',true);
readtable failed to read the first row.
I then used the following commands, which read both rows:
opts1 = delimitedTextImportOptions('Encoding','UTF-8','Delimiter',',','DataLines',[2 Inf],'VariableNamesLine',1);
T1 = readtable('file1.csv',opts1);
However, the contents of the resulting table were not quite right:
ans = 2×1 cell
'"optotagging"'
'"behaviour"'
The double quotation marks remained in some columns.
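As a stopgap, leftover quotation marks can be stripped after the import. The sketch below is not the approach this thread settles on; it only shows post-processing with strip on a cell array holding the two values shown above. The variable names col and clean are made up for illustration.

```matlab
% Values copied from the output shown above (quotes still attached).
col = {'"optotagging"'; '"behaviour"'};

% strip removes the given character from both ends of each element.
clean = strip(col, '"');

disp(clean)
```

The same call could be applied to a table column, for example T1.SomeColumn = strip(T1.SomeColumn,'"'), where SomeColumn is a placeholder for a column name in the actual file.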
The 'QuoteRule','remove' option of setvaropts looked promising, but I could not get it to work:
setvaropts(opts1,'QuoteRule','remove')
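One thing worth checking here: setvaropts returns an updated options object rather than modifying its input, so a call whose result is not assigned leaves opts1 unchanged. The sketch below illustrates this. It assumes a two-variable layout ('NumVariables',2 is an assumption, not taken from the original file) and uses 'keep' only so that the change is visible, since 'remove' appears to be the documented default for QuoteRule.

```matlab
% Assumed two-column layout; adjust NumVariables to match the real file.
opts1 = delimitedTextImportOptions('NumVariables',2,'Encoding','UTF-8', ...
    'Delimiter',',','DataLines',[2 Inf],'VariableNamesLine',1);

% Result is discarded here, so opts1 itself is not changed.
setvaropts(opts1,'QuoteRule','keep');
ruleBefore = opts1.VariableOptions(1).QuoteRule;

% Assigning the result back is what updates the options.
opts1 = setvaropts(opts1,'QuoteRule','keep');
ruleAfter = opts1.VariableOptions(1).QuoteRule;

fprintf('before: %s, after: %s\n', ruleBefore, ruleAfter);
```

To apply the intended rule, the assigned form would be opts1 = setvaropts(opts1,'QuoteRule','remove'); followed by another readtable call with the updated opts1. Whether that alone removes the quotes for this particular file is not confirmed here.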
How do I nicely remove double quotation marks in CSVs?

Answers (1)

Kouichi C. Nakamura on 6 Jan 2021
Edited: Kouichi C. Nakamura on 7 Jan 2021
I asked MathWorks about this, and their answer was helpful:
% Almost works for this case, but the first line is detected as a "meta-data" line because it is all string/blank
opts = detectImportOptions('file1.csv','NumHeaderLines',0,'Delimiter',',');
% Work around that issue by starting the data at line 2
opts.DataLines = [2,Inf];
T2 = readtable('file1.csv',opts);
With this code, I can read both rows and remove double quotation marks nicely.
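To try this approach without the original file, the sketch below writes a small quoted CSV to a temporary location and reads it back. The file contents and the names csvFile, T, and vals are made up for illustration; only the detectImportOptions and DataLines calls come from the answer above, and the result may differ between MATLAB releases.

```matlab
% Write a tiny all-text, fully quoted CSV (hypothetical stand-in for file1.csv).
csvFile = [tempname '.csv'];
fid = fopen(csvFile, 'w');
fprintf(fid, '"name","task"\n"a1","optotagging"\n"a2","behaviour"\n');
fclose(fid);

% Same calls as in the answer, pointed at the temporary file.
opts = detectImportOptions(csvFile, 'NumHeaderLines', 0, 'Delimiter', ',');
opts.DataLines = [2 Inf];   % data starts on line 2, after the header line
T = readtable(csvFile, opts);

% Collect every cell as a string to check for leftover quotation marks.
vals = string(table2cell(T));
disp(T)

delete(csvFile);            % clean up the temporary file
```

If any element of vals still contains a double quotation mark, the import options for that column are worth inspecting via opts.VariableOptions.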
According to MathWorks:
> The solution shared, is very specific to your workflow and is an undocumented method which might change without notice.
