How to remove header in .txt file while retaining format of data

Sarah on 30 Sep 2011
Commented: Walter Roberson on 31 Mar 2016
Hello everyone,
I have a .txt file with 12 lines of a header that is useless information to me. Is there a way to remove the header so that I am left with only data?
dlmread does not work because I have dates to read, csvread is producing an error, and textscan seems to provide the data in an altered format.
This is the code I am using to retrieve the data so far:
A = textscan(FID,'%s','headerLines',12)
This is what the data looks like when it is output:
'3298602123321'
'1'
'2011-09-20'
'01:00:00'
'10.8554401397705'
'3298602123321'
'1'
'2011-09-20'
'01:15:00'
'10.8555603027344'
This is what I need the data to look like:
3298602123321 1 2011-09-20 01:00:00 10.8554401397705
3298602123321 1 2011-09-20 01:15:00 10.8555603027344
Any help will be greatly appreciated, thanks for taking the time to consider it. Thanks in advance for any posts!
~Sarah (:
Comments: 1
Sarah on 30 Sep 2011
What I need the data to look like is the way that it is in the .txt file before accessing it. I only want the header gone and the data to remain unchanged.
Thanks again!


Answers (3)

Fangjun Jiang on 30 Sep 2011
To read the data directly from the text file and get the format you want, you can do the following. (If the file still contains the header, add 'HeaderLines',12 to the textscan call.)
FID=fopen('test.txt','rt');
A = textscan(FID,'%f %d %s %s %f');
fclose(FID);
format long
Sensor_ID=A{1}
Point_ID=A{2}
SampleDate=A{3}
SampleTime=A{4}
SampleValue=A{5}
Or you can use reshape()
A={'3298602123321'
'1'
'2011-09-20'
'01:00:00'
'10.8554401397705'
'3298602123321'
'1'
'2011-09-20'
'01:15:00'
'10.8555603027344'}
B=reshape(A,5,[])';
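One possible reason reshape() fails on the output of the original textscan call is that textscan returns a 1x1 cell whose single element holds the N-by-1 cell array of tokens, so reshape needs A{1} rather than A. The sketch below assumes that is the cause and uses a hand-built stand-in for the textscan output; the variable names tokens and SampleValue are only illustrative.

```matlab
% Stand-in for what textscan(FID,'%s','headerLines',12) returns:
% a 1x1 cell whose only element is an N-by-1 cell array of tokens.
A = {{'3298602123321'; '1'; '2011-09-20'; '01:00:00'; '10.8554401397705'; ...
      '3298602123321'; '1'; '2011-09-20'; '01:15:00'; '10.8555603027344'}};

tokens = A{1};                 % unwrap the inner N-by-1 cell array
B = reshape(tokens, 5, []).';  % 5 tokens per record, one record per row

% Convert a numeric column; dates and times stay as strings.
SampleValue = str2double(B(:,5));
```

This only works if every record has exactly five whitespace-separated tokens; a missing field would shift all later rows.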
Comments: 4
Sarah on 6 Oct 2011
Thanks for the updated code. However, the format long option is returning empty matrices, and I cannot think of a reason for this. The reshape option is still returning an error as well. Thanks for all your help. (:
Fangjun Jiang on 6 Oct 2011
Edited: Walter Roberson on 31 Mar 2016
What do you mean? Copy the four lines below to test.txt and run the code.
3298602123321 1 2011-09-20 01:00:00 10.8554401397705
3298602123321 1 2011-09-20 01:15:00 10.8555603027344
3298602123321 1 2011-09-20 01:30:00 10.8555603027344
3298602123321 1 2011-09-20 01:45:00 10.8560495376587
I got:
Sensor_ID =
1.0e+012 *
3.298602123321000
3.298602123321000
3.298602123321000
3.298602123321000
Point_ID =
1
1
1
1
SampleDate =
'2011-09-20'
'2011-09-20'
'2011-09-20'
'2011-09-20'
SampleTime =
'01:00:00'
'01:15:00'
'01:30:00'
'01:45:00'
SampleValue =
10.855440139770501
10.855560302734400
10.855560302734400
10.856049537658700



Walter Roberson on 30 Sep 2011
Corrected:
A = textscan(FID,'%[^\n]','headerLines',12);
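If the goal is a copy of the file with only the header removed, the lines read this way can be written straight back out. This is a rough sketch, not a tested solution for the original file: the file names, the placeholder header text, and the variable dataLines are made up so the example is self-contained.

```matlab
% Build a small sample file: 12 placeholder header lines plus two data rows.
% The file names and header text are invented for this example.
inName  = fullfile(tempdir, 'with_header.txt');
outName = fullfile(tempdir, 'no_header.txt');
fid = fopen(inName, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the header and read every remaining line as one string.
fid = fopen(inName, 'rt');
A = textscan(fid, '%[^\n]', 'HeaderLines', 12);
fclose(fid);
dataLines = A{1};                  % N-by-1 cell array, one line per cell

% Write the data lines back out, one per line, without the header.
fid = fopen(outName, 'wt');
fprintf(fid, '%s\n', dataLines{:});
fclose(fid);
```

Each element of dataLines is a whole line of text, so nothing is converted to numbers and the digits are not reformatted.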
Comments: 4
Sarah on 6 Oct 2011
Thank you Walter for the update, the returned information is formatted well line by line. However, I am having trouble getting the information to be parsed as 5 separate variables while they retain the correlations with the other variables. I don't know if I am explaining myself well, but my goal is to be able to access the information either independently or as a group. Thanks for all your help! :)
Walter Roberson on 6 Oct 2011
This contradicts what you wrote earlier,
"What I need the data to look like is the way that it is in the .txt file before accessing it. I only want the header gone and the data to remain unchanged."
The code I supplied skips the header and reads in everything else as strings *exactly* the same way, space for space, character for character, as appears in the file.
If you want the data in the file split up into different variables, then you need to tell us what data type you want for each column. You also need to understand that, unless you want to split into strings only, the whitespace (blanks) might change, which would leave data that is *not* "the way it is in the text file before accessing it".
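For the case where typed columns are wanted, here is a minimal sketch. It assumes the column types from the other answer in this thread (double, integer, two strings, double) and uses a made-up sample file with placeholder header lines. The columns stay aligned row by row, so one index picks the same record out of every variable.

```matlab
% Sample file: 12 placeholder header lines plus two data rows (invented here).
fname = fullfile(tempdir, 'typed_sample.txt');
fid = fopen(fname, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the header and parse each column with its own type.
fid = fopen(fname, 'rt');
C = textscan(fid, '%f %d %s %s %f', 'HeaderLines', 12);
fclose(fid);
[Sensor_ID, Point_ID, SampleDate, SampleTime, SampleValue] = deal(C{:});

% Access one record across all columns with a single row index.
k = 2;
fprintf('%.0f %d %s %s %.13f\n', Sensor_ID(k), Point_ID(k), ...
        SampleDate{k}, SampleTime{k}, SampleValue(k));

% Optional: combine date and time strings into serial date numbers.
t = datenum(strcat(SampleDate, {' '}, SampleTime), 'yyyy-mm-dd HH:MM:SS');
```

The numeric columns are stored as binary values here, so printing them back may not reproduce the exact digits that appear in the file.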



lpetley on 31 Mar 2016
When using an ASCII file with several lines of headers, the best approach is to use the importdata() function. You can tell it how many lines of the file comprise the header, and it will load subsequent data in a numerical format.
Comments: 1
Walter Roberson on 31 Mar 2016
That would not meet the requirement the user had for "the data to remain unchanged." Also, the data is not all in numeric format: there are two fields which contain time strings. importdata() would return a scalar struct that would have to be examined for its 'data' field and its 'textdata' field, and it would be necessary to figure out how textimport handles such things when the text fields might occur in the middle of a line. It is not clear that that is "best".
It would make more sense to use readtable() from R2013b onward, especially from R2014b onward (when it gained datetime handling), as that does not change the order of fields and is well defined as to how the various field types are handled.
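A readtable() call along those lines might look like the sketch below. The sample file, the column types, and the variable names are assumptions carried over from the earlier answer in this thread, not something the original file confirms.

```matlab
% Sample file: 12 placeholder header lines plus two data rows (invented here).
fname = fullfile(tempdir, 'table_sample.txt');
fid = fopen(fname, 'wt');
for k = 1:12
    fprintf(fid, 'header line %d\n', k);
end
fprintf(fid, '3298602123321 1 2011-09-20 01:00:00 10.8554401397705\n');
fprintf(fid, '3298602123321 1 2011-09-20 01:15:00 10.8555603027344\n');
fclose(fid);

% Skip the header and read the space-delimited columns into a table.
T = readtable(fname, 'HeaderLines', 12, 'ReadVariableNames', false, ...
              'Delimiter', ' ', 'Format', '%f %d %s %s %f');

% Illustrative column names; the real header may suggest different ones.
T.Properties.VariableNames = {'Sensor_ID', 'Point_ID', 'SampleDate', ...
                              'SampleTime', 'SampleValue'};

% One row holds a whole record; one variable holds a whole column.
secondRecord = T(2, :);
allValues    = T.SampleValue;
```

The date and time columns are kept as strings in this sketch; converting them to datetime is a separate step.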

