How can I get MATLAB to preserve column names as variables when importing a csv file to timetable?

조회 수: 169 (최근 30일)
Hello,
I can't seem to get MATLAB to import a csv file and preserve the column names as variables. The variable names come back as Var1, Var2, etc. Here's one of the approaches I've tried:
opts = detectImportOptions('test.csv',"VariableNamingRule","preserve");
test= readtimetable('test.csv', opts);
I've also left off the options qualifiers, but to no avail. The variable names still come back as Var1, Var2, etc.
Interestingly, the variable names are preserved if I use the Import wizard, but this is not a long-term solution. I have the same problem if readtable is used instead of readtimetable.
Any assistance greatly appreciated! Thanks for reading-
Nina
  댓글 수: 12
Nina Sweeney
Nina Sweeney 2023년 10월 25일
Walter, Star Strider:
Yes, I have no trouble preserving the variable names for the snippet; the problem is that the variable names change to Var1,...,Var14 for the full csv file.
Nina
Nina Sweeney
Nina Sweeney 2023년 10월 25일
I haven't tried datastore yet, but I plan to do so.
Here's the solution, at least for now:
  1. Import the large csv with the Import Tool.
  2. Once in the Import Tool: Import Selection > Generate Script (or Generate Function).
  3. Save and run the script. The script produces the needed options to use the column headers of the csv file as variable names.
  4. The generated code is shown below:
IMPORTFILE Import data from a text file
% NODE202FLIGHTDATAOUT = IMPORTFILE(FILENAME) reads data from text file
% FILENAME for the default selection. Returns the data as a table.
%
% NODE202FLIGHTDATAOUT = IMPORTFILE(FILE, DATALINES) reads data for the
% specified row interval(s) of text file FILENAME. Specify DATALINES as
% a positive scalar integer or a N-by-2 array of positive scalar
% integers for dis-contiguous row intervals.
%
% Example:
% node202flightdataout = importfile("C:\Nina\Aircraft Position\All Logs Update 1\node202_flight_data_out.csv", [2, Inf]);
%
% See also READTABLE.
%
% Auto-generated by MATLAB on 25-Oct-2023 08:07:30
%% Input handling
% If dataLines is not specified, define defaults
if nargin < 2
dataLines = [2, Inf];
end
%% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 15);
% Specify range and delimiter
opts.DataLines = dataLines;
opts.Delimiter = ",";
% Specify column names and types
opts.VariableNames = ["LogSet", "UTCTimeS", "Time", "rxEbno1For257", "rxEbno2For257", "RxLinkRatebps", "TxPowerdBm", "TxLinkRatebps", "Lat", "lon", "Altm", "Headingdeg", "Rolldeg", "Pitchdeg", "VarName15"];
opts.VariableTypes = ["double", "double", "datetime", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double", "string"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "VarName15", "WhitespaceRule", "preserve");
opts = setvaropts(opts, "VarName15", "EmptyFieldRule", "auto");
opts = setvaropts(opts, "Time", "InputFormat", "yyyy-MM-dd'T'HH:mm:ss");
% Import the data
node202flightdataout = readtable(filename, opts);
end

댓글을 달려면 로그인하십시오.

채택된 답변

Nina Sweeney
Nina Sweeney 2023년 10월 26일
I haven't tried datastore yet, but I plan to do so.
Here's the solution, at least for now:
  1. Import the large csv with the Import Tool.
  2. Once in the Import Tool: Import Selection > Generate Script (or Generate Function).
  3. Save and run the script. The script produces the needed options to use the column headers of the csv file as variable names.
  4. The generated code is shown below:
IMPORTFILE Import data from a text file
% NODE202FLIGHTDATAOUT = IMPORTFILE(FILENAME) reads data from text file
% FILENAME for the default selection. Returns the data as a table.
%
% NODE202FLIGHTDATAOUT = IMPORTFILE(FILE, DATALINES) reads data for the
% specified row interval(s) of text file FILENAME. Specify DATALINES as
% a positive scalar integer or a N-by-2 array of positive scalar
% integers for dis-contiguous row intervals.
%
% Example:
% node202flightdataout = importfile("C:\Nina\Aircraft Position\All Logs Update 1\node202_flight_data_out.csv", [2, Inf]);
%
% See also READTABLE.
%
% Auto-generated by MATLAB on 25-Oct-2023 08:07:30
%% Input handling
% If dataLines is not specified, define defaults
if nargin < 2
dataLines = [2, Inf];
end
%% Set up the Import Options and import the data
opts = delimitedTextImportOptions("NumVariables", 15);
% Specify range and delimiter
opts.DataLines = dataLines;
opts.Delimiter = ",";
% Specify column names and types
opts.VariableNames = ["LogSet", "UTCTimeS", "Time", "rxEbno1For257", "rxEbno2For257", "RxLinkRatebps", "TxPowerdBm", "TxLinkRatebps", "Lat", "lon", "Altm", "Headingdeg", "Rolldeg", "Pitchdeg", "VarName15"];
opts.VariableTypes = ["double", "double", "datetime", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double", "string"];
% Specify file level properties
opts.ExtraColumnsRule = "ignore";
opts.EmptyLineRule = "read";
% Specify variable properties
opts = setvaropts(opts, "VarName15", "WhitespaceRule", "preserve");
opts = setvaropts(opts, "VarName15", "EmptyFieldRule", "auto");
opts = setvaropts(opts, "Time", "InputFormat", "yyyy-MM-dd'T'HH:mm:ss");
% Import the data
node202flightdataout = readtable(filename, opts);
end

추가 답변 (2개)

Star Strider
Star Strider 2023년 10월 24일
편집: Star Strider 2023년 10월 25일
I am not certain that 'VariableNamingRule' works as part of the options structure.
Try this instead:
test= readtimetable('test.csv', ,"VariableNamingRule","preserve");
EDIT — (25 Oct 2023 at 01:28)
It seems to work correctly here —
opts = detectImportOptions('test.csv',"VariableNamingRule","preserve");
test = readtimetable('test.csv', opts)
test = 3×13 timetable
Time LogSet UTCTimeS rxEbno1For257 rxEbno2For257 RxLinkRate(bps) TxPower(dBm) TxLinkRate(bps) Lat lon Alt(m) Heading(deg) Roll(deg) Pitch(deg) ____________________ ______ __________ _____________ _____________ _______________ ____________ _______________ ______ _______ ______ ____________ _________ __________ 11-Aug-2023 21:02:29 0 1.6918e+09 1.3 9.4 76250 2.2 20250 38.292 -76.419 29 125.15 -0.86387 1.8717 11-Aug-2023 21:02:30 0 1.6918e+09 -1.7 9 76250 3.5 20250 38.292 -76.419 29 124.9 -0.87145 1.9309 11-Aug-2023 21:02:31 0 1.6918e+09 -6.8 8.6 76250 2.9 20250 38.292 -76.419 29 124.24 -0.89849 1.9858
EDIT — (25 Oct 2023 at 11:45)
@Nina Sweeney — See if the same problem occurs with datastore. It also has a VariableNamingRule option. It is designed for large arrays, so it may work with your file.
.
  댓글 수: 2
Star Strider
Star Strider 2023년 10월 25일
I tried that recently and it threw an error, although I don’t remember the context. Using it as a name-value pair (copied and pasted, so spelling or typos were not an issue) in readtable worked.

댓글을 달려면 로그인하십시오.


Walter Roberson
Walter Roberson 2023년 10월 24일
Works for me, provided that the number of header variables does not exceed the number of detected columns.
When I accidentally added extra numeric columns in the data, I did get trailing Var* variables being generated.
All of this requires that detectImportOptions was able to figure out which row had your variable names. If that goes wrong, then Sure, it might end up generating Var* variables.
filename = 'test.csv';
opts = detectImportOptions(filename,"VariableNamingRule","preserve");
%for this PARTICULAR test file, the datetime format is not recognized
opts = setvartype(opts,1,'datetime');
opts = setvaropts(opts,1,'InputFormat', 'MMM d uuuu HH:mm');
test = readtable('test.csv',opts)
test = 2×7 table
first second third one fourth %increase month to date last ____________________ ______ _________ ______ _________ _____________ ____ 03-Jan-2001 18:03:00 1 2 3 4 5 6 17-Aug-2022 05:19:00 9 8 7 6 5 4

카테고리

Help CenterFile Exchange에서 Data Preprocessing에 대해 자세히 알아보기

제품


릴리스

R2023a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by