readtable error : read all columns into a single column

조회 수: 16 (최근 30일)
Guohua
Guohua 2023년 11월 9일
댓글: Guohua 2023년 11월 10일
I am using readtable to read a *.txt file into table
data = readtable('test.txt', 'Format','auto')
the data file have 10 columns, but in the loaded data table, all columns are cancatecated into a single column, any ideas how to fix this?
thank you.
  댓글 수: 4
dpb
dpb 2023년 11월 9일
You forgot to attach the data file so we have to guess...
First, try
data = readtable('test.txt');
with no additional parameters; generally the input scanner is pretty competent about finding the delimiter on its own.
If that fails, then telling it what the delimiter is will be next; the image you attached is from Excel where you imported it, it is NOT the text file; the Q? is whether Excel was able to detect the delimiter on its own or did you have to use the whizard to separate columns -- if the text file is fixed-width, that is generally the case and would be one likely cause for your result in MATLAB.
Guohua
Guohua 2023년 11월 9일
Please see the attached .txt file, adding "ExpectedNumVariables" won't solve the issue.
I also tried "Format"
readtable(filepath,'Format','%s%s%s%f%f%f%f%f%f%f%f%f%f')
This wont work either.

댓글을 달려면 로그인하십시오.

답변 (2개)

Cris LaPierre
Cris LaPierre 2023년 11월 9일
It looks like MATLAB is not correctly identifying the delimitar, which is a tab. You can manually specify that.
Some fo the datatypes are also incorrectly selected. You can use theImport Tool to ineratively set the import parameters and then generate a script.
You could also do that manually.
opts = delimitedTextImportOptions("NumVariables",13);
opts.Delimiter = "\t";
opts.VariableNamesLine = 1;
opts.DataLines = [2,inf];
opts = setvartype(opts,["datetime", "string", "string", "double", "double", "double", "double", "double", "double", "double", "double", "double", "double"]);
data = readtable("data_20091130.txt",opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
data = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 "03216N103" <missing> NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 30-Nov-2009 "016275109" <missing> NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 30-Nov-2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002034957" <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 30-Nov-2009 <missing> <missing> NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 30-Nov-2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 30-Nov-2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 30-Nov-2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 30-Nov-2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 30-Nov-2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 30-Nov-2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 30-Nov-2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386
  댓글 수: 1
Cris LaPierre
Cris LaPierre 2023년 11월 9일
Using your second approach also works, but with some additional specifications.
readtable("data_20091130.txt","Format",'%{MM/dd/yyyy}D%s%s%f%f%f%f%f%f%f%f%f%f',...
'Delimiter','\t','TextType','string')
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
ans = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly __________ ___________ _________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 11/30/2009 "03216N103" "" NaN NaN -0.26873 NaN NaN 1 NaN NaN NaN NaN 11/30/2009 "016275109" "" NaN NaN 0.028927 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "860341106" "B298752" NaN NaN -0.19644 NaN NaN 0 NaN NaN NaN NaN 11/30/2009 "816074405" "B01RQH8" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002034957" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "00846U101" "2520153" 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 NaN 7 NaN 11/30/2009 "" "" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "004403101" "2005650" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "043436104" "2855855" 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 NaN 10 NaN 11/30/2009 "013104104" "2012467" NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 11/30/2009 "002824100" "2002305" -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 NaN 10 0.49791 11/30/2009 "00081T108" "B0G7SZ5" -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 NaN 3 NaN 11/30/2009 "G0450A105" "2740542" -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 NaN 5 NaN 11/30/2009 "025195405" "B064919" -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 NaN 1 NaN 11/30/2009 "008190100" "2034524" -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 NaN 9 NaN 11/30/2009 "032654105" "2032067" 1.3453 -0.11685 0.14799 2.7512 0 1 0 NaN 9 0.65386

댓글을 달려면 로그인하십시오.


Voss
Voss 2023년 11월 9일
filename = 'data_20091130.txt';
% detect import options, specifying tab as delimiter:
opts = detectImportOptions(filename,'Delimiter','\t');
% set variable 2 and 3's data to be interpreted as character vectors:
opts = setvartype(opts,opts.VariableNames([2 3]),{'char','char'});
% read the file:
T = readtable(filename,opts)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T = 939×13 table
Date x_cusip x_sedol ModelComposite_USHISTORY GrowthComposite_USHISTORY QualityComposite_USHISTORY SentimentComposite_USHISTORY ValuationComposite_USHISTORY GrowthOrientedIndustriesForBacktest EquityDistressedFlag_USHistory DividendGrowthProb DistressedCompositeGlobal_Rank_9 DIV_GR_PROB_LogitGlobal_Monthly ___________ _____________ ___________ ________________________ _________________________ __________________________ ____________________________ ____________________________ ___________________________________ ______________________________ __________________ ________________________________ _______________________________ 30-Nov-2009 {'03216N103'} {0×0 char } NaN NaN -0.26873 NaN NaN 1 NaN {0×0 char} NaN NaN 30-Nov-2009 {'016275109'} {0×0 char } NaN NaN 0.028927 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'860341106'} {'B298752'} NaN NaN -0.19644 NaN NaN 0 NaN {0×0 char} NaN NaN 30-Nov-2009 {'816074405'} {'B01RQH8'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002034957'} {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'00846U101'} {'2520153'} 0.025038 1.3957 -1.3517 0.052121 -1.4895 1 0 {0×0 char} 7 NaN 30-Nov-2009 {0×0 char } {0×0 char } NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'004403101'} {'2005650'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'043436104'} {'2855855'} 0.097163 -0.91015 0.72884 0.43454 1.1715 1 0 {0×0 char} 10 NaN 30-Nov-2009 {'013104104'} {'2012467'} NaN NaN NaN NaN NaN NaN NaN {0×0 char} NaN NaN 30-Nov-2009 {'002824100'} {'2002305'} -0.17297 0.83537 0.02089 -0.04318 -0.27188 0 0 {0×0 char} 10 0.49791 30-Nov-2009 {'00081T108'} {'B0G7SZ5'} -0.57832 -1.5548 -0.4272 -0.62305 -0.11965 0 0 {0×0 char} 3 NaN 30-Nov-2009 {'G0450A105'} {'2740542'} -0.40851 0.65944 0.38131 -1.0877 -0.058327 0 0 {0×0 char} 5 NaN 30-Nov-2009 {'025195405'} {'B064919'} -0.33473 -1.2372 0.94208 -2.3694 0.79164 0 1 {0×0 char} 1 NaN 30-Nov-2009 {'008190100'} {'2034524'} -0.068901 0.29572 -0.19788 -0.19018 0.45785 1 0 {0×0 char} 9 NaN 30-Nov-2009 {'032654105'} {'2032067'} 1.3453 -0.11685 0.14799 2.7512 0 1 0 {0×0 char} 9 0.65386

카테고리

Help CenterFile Exchange에서 MATLAB Report Generator에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by