Strange behavior from readtable
이 질문을 팔로우합니다.
- 팔로우하는 게시물 피드에서 업데이트를 확인할 수 있습니다.
- 정보 수신 기본 설정에 따라 이메일을 받을 수 있습니다.
오류 발생
페이지가 변경되었기 때문에 동작을 완료할 수 없습니다. 업데이트된 상태를 보려면 페이지를 다시 불러오십시오.
이전 댓글 표시
0 개 추천
I have some text files that I'm reading in with readtable. For most of the files, everything seems to work fine (example1.txt). Strings come through as strings, and numbers come through as numbers. But for some reason, I'm getting strange results when trying to read in example2.txt. The last 2 columns....which are all numbers....are coming through as strings. I'm thinking there has to be some kind of strange format in the example2.txt file, but I've scanned both files, and don't see anything that would be causing this. My call is simply data = readtable(filename). Obviously I'll try reading in everything as strings to try to fix this, but was just curious as to what could be causing this.
댓글 수: 1
Frank
2023년 6월 27일
Oops....my call is data = readtable(filename,'NumHeaderLines',2,'TreatAsMissing','-999')
채택된 답변
While READTABLE's automagic file format detection is great, the more a file deviates (missing data, lines filled with asterisks) the more help you will have to give it. For this use DETECTIMPORTOPTIONS, SETVARTYPE, etc.
Once it is set up correctly you do not need to repeat this for each file: you can reuse the options object for all files.
F1 = 'example1.txt';
F2 = 'example2.txt';
Ob = detectImportOptions(F1, 'FileType','fixedwidth', 'Range',1, 'TreatAsMissing','-999');
Ob = setvartype(Ob, 'double');
Ob = setvartype(Ob, {'Latitude___','Longitude___'},'char');
Ob = setvartype(Ob, {'FlightTimeSinceLaunch_mins_secs_','UTCTime_hrs_mins_secs_'},'duration');
Ob = setvaropts(Ob, 'FlightTimeSinceLaunch_mins_secs_', 'InputFormat','mm:ss');
T1 = readtable(F1,Ob)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T1 = 7063×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ ___________ ____________ _____________ _______________ _____________ ______________ ________________
-08:31 NaN NaN {0×0 char} {0×0 char} 1023.6 NaN NaN NaN NaN
-08:30 NaN NaN {0×0 char} {0×0 char} 1023.6 28.7 NaN NaN NaN
-08:29 NaN NaN {0×0 char} {0×0 char} 1023.6 28.7 NaN NaN NaN
-08:28 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 74.9 NaN NaN
-08:27 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:26 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:25 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:24 NaN NaN {0×0 char} {0×0 char} 1023.6 NaN NaN NaN NaN
-08:23 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.2 NaN NaN
-08:22 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.2 NaN NaN
-08:21 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75.1 NaN NaN
-08:20 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:19 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:18 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:17 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
-08:16 NaN NaN {0×0 char} {0×0 char} 1023.6 28.6 75 NaN NaN
T1{270:279,2}
ans = 10×1 duration array
NaN
NaN
NaN
NaN
14:41:01
14:41:02
14:41:03
14:41:04
14:41:05
14:41:06
T2 = readtable(F2,Ob)
Warning: Column headers from the file were modified to make them valid MATLAB identifiers before creating variable names for the table. The original column headers are saved in the VariableDescriptions property.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
Set 'VariableNamingRule' to 'preserve' to use the original column headers as table variable names.
T2 = 7136×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ _______________ ________________ _____________ _______________ _____________ ______________ ________________
-12:57 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:56 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:55 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:54 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 NaN NaN NaN
-12:53 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.1 NaN NaN
-12:52 NaN NaN {0×0 char } {0×0 char } 1021.8 NaN NaN NaN NaN
-12:51 NaN NaN {0×0 char } {0×0 char } 1021.8 NaN NaN NaN NaN
-12:50 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:49 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:48 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:47 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:46 NaN NaN {0×0 char } {0×0 char } 1021.8 26.7 74.2 NaN NaN
-12:45 10:02:23 -18.3 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:44 10:02:24 -17.7 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:43 10:02:25 -18.4 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
-12:42 10:02:26 -18.9 {'28°28'32.9"'} {'-80°33'29.7"'} 1021.8 26.7 74.2 NaN NaN
Note that you can use CONVERTVARS to efficiently convert the degree+minute+second text into numeric vectors (or scalars), e.g.:
T2 = convertvars(T2,@iscell,@myfun)
T2 = 7136×10 table
FlightTimeSinceLaunch_mins_secs_ UTCTime_hrs_mins_secs_ Altitud___ Latitude___ Longitude___ Pressure_hPa_ Temperature__C_ Humidity__RH_ WindSpeed_m_s_ WindDirection___
________________________________ ______________________ __________ ___________________ ___________________ _____________ _______________ _____________ ______________ ________________
-12:57 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:56 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:55 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:54 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 NaN NaN NaN
-12:53 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.1 NaN NaN
-12:52 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 NaN NaN NaN NaN
-12:51 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 NaN NaN NaN NaN
-12:50 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:49 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:48 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:47 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:46 NaN NaN NaN NaN NaN NaN NaN NaN 1021.8 26.7 74.2 NaN NaN
-12:45 10:02:23 -18.3 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:44 10:02:24 -17.7 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:43 10:02:25 -18.4 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
-12:42 10:02:26 -18.9 28 28 32.9 -80 33 29.7 1021.8 26.7 74.2 NaN NaN
Basic matrix multiplication could also be very useful to combine the degrees+minutes+seconds into degrees:
T2.Longitude___ * [1;1/60;1/3600]
ans = 7136×1
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
NaN
function V = myfun(C);
V = nan(numel(C),3);
V(~cellfun(@isempty,C),:) = sscanf([C{:}],'%f°%f''%f"',[3,Inf]).';
end
Thinks out loud: it would be interesting if DURATION accepted a wider range of formats, then degree+minutes+seconds could be natively imported as a duration type...
댓글 수: 3
Frank
2023년 6월 28일
Note to all, I erred saying example2.txt was the problem file......it's in fact example1.txt which is the problem file. Sorry about that. Also note that it's really the last 5 columns I'm interested in. I don't really care about the date or lat/lon data.
Thanks for the feedback Stephen. It's interesting when I do 0b.VariableTypes on example1, the result is...
'char' 'char' 'char' 'char' 'char' 'double' 'double' 'double' 'char' 'char'
while the results for example2 are...
'char' 'duration' 'double' 'char' 'char' 'double' 'double' 'double' 'double' 'double'
Randomly selecting several other files, they have the same result as example2......which is what I expect.
I guess example1.txt just has something strange embedded in it that is throwing the read off. I'll either have to toss that file, or build a different read function.
" It's interesting when I do 0b.VariableTypes on example1, the result is..."
That should not be the result after the STEVARTYPE calls. Please check after the variable types and options have been set. That is rather the whole point of many answer, I would be very surprised if it did not work.
As I wrote in my answer, you can reuse the same options object for all files. It should not change.
Frank
2023년 6월 28일
Yes.....I'm dumb.....sorry about that. I was so fixated on the fact that example1.txt was different than the others, that I didn't realize that I can specify what those data types should be. Thanks Stephen, your suggestion did the job. I actually made it simple by doing...
Ob = detectImportOptions(F2, 'FileType','fixedwidth', 'Range',1, 'TreatAsMissing','-999');
and then...
data = readtable('example1.txt',Ob);
Thanks all for the help!!
추가 답변 (0개)
카테고리
도움말 센터 및 File Exchange에서 Loops and Conditional Statements에 대해 자세히 알아보기
태그
참고 항목
웹사이트 선택
번역된 콘텐츠를 보고 지역별 이벤트와 혜택을 살펴보려면 웹사이트를 선택하십시오. 현재 계신 지역에 따라 다음 웹사이트를 권장합니다:
또한 다음 목록에서 웹사이트를 선택하실 수도 있습니다.
사이트 성능 최적화 방법
최고의 사이트 성능을 위해 중국 사이트(중국어 또는 영어)를 선택하십시오. 현재 계신 지역에서는 다른 국가의 MathWorks 사이트 방문이 최적화되지 않았습니다.
미주
- América Latina (Español)
- Canada (English)
- United States (English)
유럽
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
