Textscan import string data from .txt file
Linus Dock
13 Nov 2021
Hi!
When I'm using textscan to read my data I get all the data but it's not quite organized the way I would like.
I will attach a sample .txt file.
I'm using this code to import the data:
%Imports all .txt files according to the user's input time and converts them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
filename = sprintf('%s.txt',w(h,:)); %add .txt to year and month
fileID = fopen(filename); %open filename to create fileID
Data{h} = textscan(fileID,'%s','delimiter','\n'); %read all characters in fileID
fclose(fileID); %close fileID
end
What I would like to achieve is a string starting with METAR ESXX and with varying ending (for ex. Q1011 or R08/750135 or other).
I've tried using different delimiters but I get more or less the same result with the different delimiters.
As far as I can tell, the problem appears when the data is not delimited by a newline, but I can't find the right solution to get it working.
In a previous version of my code I was using fread but I understand that textscan is better to use. Is that correct?
Do you have any suggestions to what could be changed?
Thanks!
This is a sample of the result of Data.
'M04/M06 Q1020
METAR ESKN 160020Z 31003KT 0300 R08/P2000N R26/1100N BCFG NSC'
'M04/M05 Q1010 R08/750135
METAR ESKN 160050Z 31003KT 5000 BR FEW064 M04/M04 Q1011 R08/750135
METAR ESKN 160120Z 31003KT CAVOK M03/M03 Q1011 R08/750135
METAR ESKN 160150Z VRB01KT 9999 FEW003 BKN061 M03/M03 Q1011'
'R08/750135
METAR ESKN 160220Z 32004KT 9999 SCT042 BKN055 BKN066 M02/M02 Q1011'
'R08/750135
METAR ESKN 160250Z 28003KT 9999 SCT003 BKN036 BKN057 M02/M02 Q1012'
'R08/750135
METAR ESKN 160320Z VRB02KT 9999 BKN002 M02/M02 Q1012 R08/750135
METAR ESKN 160350Z 33004KT 9999 BKN002 M01/M01 Q1012 R08/750135
METAR ESKN 160420Z VRB01KT 9999 BKN002 M01/M01 Q1012 R08/750135
METAR ESKN 160450Z 00000KT 4000 BR SCT003 M02/M02 Q1013 R08/710195
METAR ESKN 160520Z 30003KT 0300 R08/P2000N R26/0750U BCFG FEW003'
'SCT072 M03/M03 Q1013 R26/710195
METAR ESKN 160550Z VRB03KT 9000 SCT066 M03/M03 Q1013 R26/710195
METAR ESKN 160620Z 29003KT 9999 FEW002 BKN068 M02/M02 Q1013'
'R26/710195
METAR ESKN 160650Z 31003KT 9999 FEW002 SCT068 M00/M00 Q1014'
Accepted Answer
dpb
13 Nov 2021
Read the file as is and then clean it up instead...
d=readcell('202103.txt','Delimiter',newline); % read a cellstr array
i1=find(~startsWith(d,'METAR'))-1; % locate first of line pairs
for i=1:numel(i1) % and merge those by pair
d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[]; % then eliminate the second
Sanity check...
>> all(startsWith(d,'METAR'))
ans =
logical
1
>>
Comments: 9
dpb
13 Nov 2021
I already did...and it does. -- Presuming the broken lines are not more than two long.
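The pairwise merge in the answer assumes a record is split over at most two lines. If longer runs of continuation lines can occur, one possible generalization is to locate every line that starts with METAR and join everything up to the next such line. This is only a sketch: the sample data is adapted from the thread (the third record is split over three lines on purpose), the variable names `starts`, `stops` and `merged` are made up for illustration, and it assumes that any fragment before the first METAR line can be discarded.

```matlab
% Sample lines adapted from the thread; the last record is deliberately
% split over three lines to exercise the general case.
d = {'METAR ESKN 160020Z 31003KT 0300 R08/P2000N R26/1100N BCFG NSC'
     'M04/M05 Q1010 R08/750135'
     'METAR ESKN 160050Z 31003KT 5000 BR FEW064 M04/M04 Q1011 R08/750135'
     'METAR ESKN 160520Z 30003KT 0300 R08/P2000N R26/0750U BCFG FEW003'
     'SCT072 M03/M03'
     'Q1013 R26/710195'};

isStart = startsWith(d, 'METAR');          % true where a new record begins
starts  = find(isStart);                   % first line of each record
stops   = [starts(2:end) - 1; numel(d)];   % last line of each record

merged = cell(numel(starts), 1);
for k = 1:numel(starts)
    % Join all lines of record k with single spaces (strjoin wants a row)
    merged{k} = strjoin(strtrim(d(starts(k):stops(k))).', ' ');
end
```

Because each record is delimited by its start index rather than by "the line before a non-METAR line", the number of continuation lines per record does not matter.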
Linus Dock
16 Nov 2021
Hello again!
Thanks for your help!
I can't get the readcell function to work with my version of MATLAB, R2018b.
I tried incorporating your suggestion into my code, but I'm afraid I can't get it to function properly.
%Imports all .txt files according to the user's input time and converts them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
filename = sprintf('%s.txt',w(h,:)); %add .txt to year and month
fileID = fopen(filename); %open filename to create fileID
Data{h} = textscan(fileID,'%s','delimiter','\n'); %read all characters in fileID
fclose(fileID); %close fileID
end
d=Data{:}{1};
%d=readcell('202103.txt','Delimiter',newline); % read a cellstr array
i1=find(~startsWith(d,'METAR'))-1; % locate first of line pairs
for i=1:numel(i1) % and merge those by pair
d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[]; % then eliminate the second
d is now just a 1x1 cell with the following content:
'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00 Q1030 R21/09//95
METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'
The code seems to separate the groups, judging by the return symbol in front of the METAR group below. But how do I get the output as separate cells, each containing one METAR line?
{'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00 Q1030 R21/09//95↵METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00' } {'Q1030 R21/09//95↵METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00' } {'Q1030 R21/09//95↵METAR ESGG 010120Z 21007KT 0150 R03/0500N R21/0500N FG VV002 01/00' } {'Q1030 R21/09//95↵METAR ESGG 010150Z 19007KT 0100 R03/0500N R21/0450N FG VV002 01/00'
dpb
16 Nov 2021
Oh. Unfortunately for you, readcell was introduced in R2019a.
I had difficulty with textscan, too...the input file contains \r at the end of each METAR line and \n after the short lines. That seemed to confuse all the past ways I've used to return records as cellstr inside textscan.
My usual fallback in such cases is to return to the venerable (but deprecated) textread, but it also failed with a (new to me) buffer overflow because it, too, apparently became confused by the disparate terminators.
So, before reverting to fgetl and a loop (which isn't all that bad, actually, just a little more code to write, but less than your above loop), I tried the simple expedient of
>> d=importdata('202103.txt');
>> whos d
  Name          Size               Bytes  Class    Attributes
  d         77796x1             15658840  cell
>> d(1:6)
ans =
6×1 cell array
{'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00'}
{' Q1030 R21/09//95' }
{'METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00'}
{' Q1030 R21/09//95' }
{'METAR ESGG 010120Z 21007KT 0150 R03/0500N R21/0500N FG VV002 01/00'}
{' Q1030 R21/09//95' }
>>
and joy ensues.
Now the previous join trick should work as expected.
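For releases without readcell, another way to sidestep the mixed \r and \n terminators described above is to read the whole file as one character vector and split it only at line breaks that are followed by METAR. The sketch below is not from the thread: it builds a small temporary file that imitates the terminators dpb describes, and it assumes every record begins with the literal token METAR at the start of a line.

```matlab
% Build a small test file with the mixed terminators described in the thread:
% \r after each long METAR line, \n after each short continuation line.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, 'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00\r');
fprintf(fid, '     Q1030 R21/09//95\n');
fprintf(fid, 'METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00\r');
fprintf(fid, '     Q1030 R21/09//95\n');
fclose(fid);

txt = fileread(fname);   % entire file as one char vector, terminators intact

% Split only at line breaks that are immediately followed by 'METAR'
recs = regexp(strtrim(txt), '\s*[\r\n]+\s*(?=METAR)', 'split').';

% Collapse the remaining line breaks and runs of blanks inside each record
recs = regexprep(recs, '\s+', ' ');

delete(fname);           % remove the temporary test file
```

Splitting on the lookahead means the kind of terminator no longer matters, so no separate join step is needed afterwards.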
dpb
16 Nov 2021
Edited: dpb
16 Nov 2021
ADDENDUM
%Imports all .txt files according to the user's input time and converts them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
filename = sprintf('%s.txt',w(h,:));
d=importdata(filename);
i1=find(~startsWith(d,'METAR'))-1;
for i=1:numel(i1)
d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[];
% Now do your business on this file BEFORE going to the next one
% That could include (and I would recommend) writing it out in clean form
% either as new text file or replacing the original (be very careful to
% have backups first if trying that) or SAVEing as .mat files.
...
Data{h}=d; % will save into your large array (braces: store the cellstr as the content of cell h)
end
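The comment in the loop above recommends writing each file out in clean form before moving on. A minimal sketch of that step follows; the records and the output names (`outname`, `matname`) are hypothetical stand-ins, and it writes to new files rather than overwriting the originals.

```matlab
% Hypothetical cleaned records, standing in for the merged cell array d
d = {'METAR ESGG 010020Z 19007KT 0150 R03/0600N R21/0550N FG VV003 01/00 Q1030 R21/09//95'
     'METAR ESGG 010050Z 20007KT 0150 R03/0550N R21/0550N FG VV002 01/00 Q1030 R21/09//95'};

base    = tempname;              % hypothetical location; use your own folder
outname = [base '_clean.txt'];   % new text file, never the original
matname = [base '_clean.mat'];

% Option 1: one record per line in a new text file
fid = fopen(outname, 'w');
assert(fid ~= -1, 'Could not open %s for writing', outname);
fprintf(fid, '%s\n', d{:});      % the format is reused for every record
fclose(fid);

% Option 2: save the cell array itself as a .mat file
save(matname, 'd');

% Read both back to confirm the round trip
fromTxt = regexp(strtrim(fileread(outname)), '\n', 'split').';
fromMat = load(matname);

delete(outname);
delete(matname);
```

Once the clean files exist, later runs can load them directly and skip the merge step.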
dpb
16 Nov 2021
Edited: dpb
16 Nov 2021
ADDENDUM SECOND
If for some reason, importdata also has a problem...
%Imports all .txt files according to the user's input time and converts them into strings
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
filename = sprintf('%s.txt',w(h,:));
fid=fopen(filename);
% replacement to read file with low-level fgetl()
d={};
i=0;
while ~feof(fid)
i=i+1;
d(i,1)={fgetl(fid)};
end
fclose(fid);
% end alternate code here...
i1=find(~startsWith(d,'METAR'))-1;
for i=1:numel(i1)
d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[];
% Now do your business on this file BEFORE going to the next one
% That could include (and I would recommend) writing it out in clean form
% either as new text file or replacing the original (be very careful to
% have backups first if trying that) or SAVEing as .mat files.
...
Data{h}=d; % will save into your large array (braces: store the cellstr as the content of cell h)
end
dpb
16 Nov 2021
ADDENDUM THIRD
>> frewind(fid),clear d;d={};tic;i=0;while ~feof(fid),i=i+1;d(i,1)={fgetl(fid)};end,toc
Elapsed time is 0.476456 seconds.
>> whos d
  Name          Size               Bytes  Class    Attributes
  d         77796x1             15658840  cell
>>
It's not too bad to use fgetl into empty cell array as far as timing goes...
>> tic,d=importdata('202103.txt');toc
Elapsed time is 0.173747 seconds.
>>
but importdata wins hands down so use it if at all possible...
Linus Dock
17 Nov 2021
Awesome! There was joy!
Importdata did the trick.
This is what worked for me
Data = cell(1, numfiles); %Preallocate empty cell
for h = 1:numfiles
filename = sprintf('%s.txt',w(h,:));
d=importdata(filename);
i1=find(~startsWith(d,'METAR'))-1;
for i=1:numel(i1)
d(i1(i))=join(d(i1(i):i1(i)+1));
end
d(i1+1)=[]; % then eliminate the second
Data{h}=d; % will save into your large array
end
Thanks a lot!
dpb
17 Nov 2021
I was sure it would... :)
Glad to help; sorry, I didn't know you were on an earlier release initially...