I would like to read in a text file that contains a header and a footer, where the number of header/footer rows and the number of data rows to read can vary.
Here is an example of a row of data I would like to read from the text file.
All rows start with ARR++ and are delimited with ':'. I know I most likely need to use fopen / fprintf / fgetl / textscan, but I am hoping someone can help set this up.
One other thing about the rows of data I would like to read in: they contain date information like 2013320133. Ideally I would like to read only the first 5 digits of that date and split the year and quarter into separate columns -- 20133 --> 2013 3
Here is a better example of the type of file I would like to read in. I am only interested in the ARR++ lines. I would be interested to have only the first 5 digits of 2013420134. Thanks a lot.
UNA:+.? '
UNB+UNC:140305:1444++'
UNH+:2:1:E6'
BGM+74'
NA+Z02+'
NAD+M+50'
ND+MS+C2'
STS+3+7'
DM+242:20144:203'
GI+AR3'
GS+1:::-'
ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:133234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:132234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123334.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:1232134.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123324.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123234.323:A:N'
UNT+16+'
UNZ+1+I3800'

5 Comments

Image Analyst on 5 Apr 2014
You'll learn it better if you set it up yourself. You may also find the function strfind() useful for ignoring lines without ARR++ in them.
dpb on 5 Apr 2014
An actual exact copy of a short segment of a file would help more than a paraphrased one--can't tell what's editorial and what's data as posted.
One useful feature in textscan is the 'commentstyle' optional argument--that'll probably allow you to account for the variable header length if it is delineated as shown w/ the ";" by using them as matching pairs.
The question is what does the actual dataline look like--are these actual lines or header/format lines w/ the ;start/;end messages?
Or, how large a file? If not large, it would be trivial to fgetl a line at a time and find the 'ARR++' strings and then just parse them. The line-at-a-time inefficiency for the i/o isn't generally that bad if files aren't quite large...
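A minimal sketch of the line-at-a-time approach described above, assuming the file has one segment per line. The temporary file name and the variable names (`sampleFile`, `arrLines`) are made up for illustration; the segment text is copied from the sample in the question.

```matlab
% Write a small sample file so the sketch is self-contained.
sampleFile = [tempname '.txt'];   % hypothetical temporary file
fid = fopen(sampleFile, 'w');
fprintf(fid, '%s\n', 'UNH+:2:1:E6''', 'BGM+74''', ...
    'ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N''', ...
    'ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N''', ...
    'UNT+16+''');
fclose(fid);

% Read one line at a time and keep only the lines that start with ARR++.
arrLines = {};
fid = fopen(sampleFile, 'r');
tline = fgetl(fid);
while ischar(tline)                  % fgetl returns -1 (not char) at end of file
    tline = strtrim(tline);          % tolerate leading/trailing blanks
    if strncmp(tline, 'ARR++', 5)
        arrLines{end+1, 1} = tline;  % grow the list of data lines
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(sampleFile);                  % clean up the temporary file

disp(arrLines)
```

Because header and footer lines never match the `ARR++` prefix, their number does not matter. Using `strncmp` instead of `strfind` only accepts the prefix at the start of the line, which is an assumption about the file layout.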
Jeff on 5 Apr 2014
Here is a better example of the type of file I would like to read in. I am only interested in the ARR++ lines. I would be interested to have only the first 5 digits of 2013420134. Thanks a lot.
UNA:+.? '
UNB+UNC:140305:1444++'
UNH+:2:1:E6'
BGM+74'
NA+Z02+'
NAD+M+50'
ND+MS+C2'
STS+3+7'
DM+242:20144:203'
GI+AR3'
GS+1:::-'
ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:133234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:132234.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123334.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:1232134.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123324.323:A:N'
ARR++Q:S:C:A:1N:2013420134:708:123234.323:A:N'
UNT+16+'
UNZ+1+I3800'
Image Analyst on 5 Apr 2014
In the past 6 hours, have you at least given fgetl() or textscan() a try? Or do you really, really need us to do it 100% for you?
Jeff on 5 Apr 2014
Edited: Jeff on 5 Apr 2014
Image, are you OK today? Listen, if you are annoyed by beginner questions, then don't bother posting anything.


Accepted Answer

Jeff on 6 Apr 2014

Votes: 1

fid = fopen('mydata.txt', 'r');
tline = fgetl(fid);
k = 1;
while ischar(tline)
    disp(tline)
    findrows = strfind(tline, 'ARR++');
    if ~isempty(findrows)
        Data{k,:} = tline(:);
        k = k+1;
    end
    tline = fgetl(fid);
end
fclose(fid);
filename = 'mydata_1.txt';
% this function converts the cell array back to a text file. found here: cell array to text file
cell2text(filename,Data);
% open file
fid = fopen(filename, 'r');
% figure out how many columns are there
firstline = fgetl(fid);
ncol = 1 + sum(firstline == ':');
% reset to beginning of file ('bof'; an origin of 0 means the current position)
fseek(fid,0,'bof');
% read data
data = textscan(fid,repmat('%s',1,ncol),'Delimiter',':','CollectOutput',1);
data = data{:,:};
data = data(:,[3:7 9:12 14 16]);
[m,n] = size(data);
% keep only the first 5 digits of the date column
for i = 1:m
    data{i,9} = data{i,9}(1:5);
end
% close file
fclose(fid);
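A possible variation on the code above, as a sketch only: textscan also accepts a character vector as its first input, so the intermediate file and the File Exchange cell2text step could in principle be skipped. Here `Data` is assumed to hold each ARR++ line as a row character vector, and `joined` is a made-up name.

```matlab
% Two ARR++ lines copied from the sample in the question.
Data = {'ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N'''; ...
        'ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'''};

% Join the lines into one newline-separated character vector.
joined = sprintf('%s\n', Data{:});

% Count the columns from the first line, as in the answer above.
ncol = 1 + sum(Data{1} == ':');

% Parse the text directly instead of reopening a file.
data = textscan(joined, repmat('%s', 1, ncol), ...
    'Delimiter', ':', 'CollectOutput', 1);
data = data{1};     % nRows-by-ncol cell array of character vectors

disp(data(:, 6))    % the 10-digit date column in this sample
```

The column indices used in the answer (`[3:7 9:12 14 16]`) refer to a wider file than this 10-column sample, so they are not reproduced here.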

More Answers (2)

Image Analyst on 5 Apr 2014

Votes: 0

OK Jeff I did it for you. It just took a couple of minutes. I copied the data you gave to a test.dat file. Then I wrote code to read it in using fgetl() and search for lines that start with "ARR++Q:S:C:A:1N:" based on code I got in the help for fgetl. Then I extracted the 5 numerical characters from the string and converted it to a double number. Here is the code for you:
fid = fopen('test.dat');
tline = fgetl(fid);
k = 1; % Counter for lines that are valid.
while ischar(tline)
    disp(tline)
    colonLocation = strfind(tline, 'ARR++Q:S:C:A:1N:');
    if ~isempty(colonLocation)
        subString = tline(17:21);
        output(k) = str2double(subString);
        k = k + 1;
    end
    tline = fgetl(fid);
end
fclose(fid);
% Print output to command window:
output
Results in the command window:
output =
20134 20134 20134 20134 20134 20134 20134 20134
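The question also asked for the year and quarter in separate columns. Since each value is a 5-digit number of the form YYYYQ, one way to sketch that is integer arithmetic. The names `yearPart`, `quarterPart` and `yearQuarter` are made up, and the value 20133 is taken from the example in the question.

```matlab
% Example values of the form YYYYQ (year followed by a quarter digit).
output = [20134 20134 20133];

yearPart    = floor(output / 10);   % drop the last digit  -> 2013
quarterPart = mod(output, 10);      % keep the last digit  -> 4 or 3

% One row per value: first column year, second column quarter.
yearQuarter = [yearPart(:), quarterPart(:)];
disp(yearQuarter)
```

This assumes the quarter is always a single digit, which holds for the dates shown in the thread.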

4 Comments

Jeff on 5 Apr 2014
Image, thank you very much; I promise that I am reviewing the code to learn. Much appreciated.
Jeff on 5 Apr 2014
Edited: Jeff on 5 Apr 2014
Image, I think I caused some confusion around the extraction of the year and quarter. I would like to have only the ARR++ rows returned (the entire row), and I was wondering how to exclude parts of the text -- i.e. keep only the first 5 digits of the year/quarter, as you did.
Also, I would like to have each column in a cell array. The file is delimited with ':'.
Just add a line
outputStrings{k} = tline;
to save the entire line also.
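Once the whole lines are saved, splitting them into columns on ':' could look roughly like the sketch below. It assumes `outputStrings` holds the saved lines, that every line has the same number of fields, and that the trailing `'` terminator is not wanted; the names `columns` and `thisLine` are made up.

```matlab
% Two saved lines, copied from the sample in the question.
outputStrings = {'ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N''', ...
                 'ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N'''};

nLines = numel(outputStrings);
nCols  = numel(regexp(outputStrings{1}, ':', 'split'));
columns = cell(nLines, nCols);          % one row per line, one cell per field

for k = 1:nLines
    % Remove the trailing ' segment terminator, then split on the colons.
    thisLine = regexprep(outputStrings{k}, '''\s*$', '');
    columns(k, :) = regexp(thisLine, ':', 'split');
end

disp(columns)
```

Every field stays a character vector; numeric columns such as the eighth one could be converted afterwards with str2double.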
Jeff on 6 Apr 2014
Edited: Jeff on 6 Apr 2014
Hi again Image,
I have to admit I wasn't successful using your code; I'm not sure why, but I don't get any output returned.
Anyway, you forced me to keep playing around, and 1 day later :) I think I figured out a nice solution. My goal (for no particular reason other than learning) was to create a generic function that trims the header and footer off any file, takes the remaining rows, and then does a "text-to-columns" split of the data, keeping only the columns needed and replacing the 10-digit date with only its first 5 digits.
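As a rough sketch of that goal (not the code posted in this thread), the steps can be written as a short script driven by a few parameters. The names `prefix`, `keepCols` and `dateCol` and their values are hypothetical, and the sample text stands in for what fileread would return for a real file.

```matlab
% Hypothetical parameters (none of these names come from the thread).
prefix   = 'ARR++';     % rows to keep; everything else is header/footer
keepCols = [1 6 8];     % columns to keep after splitting on ':'
dateCol  = 6;           % column holding the 10-digit date

% In practice rawText could come from fileread('mydata.txt'); a short
% in-memory sample (copied from the question) keeps the sketch runnable.
rawText = sprintf('%s\n', 'BGM+74''', ...
    'ARR++Q:S:C:A:1N:2013420134:708:1234.323:A:N''', ...
    'ARR++Q:S:C:A:1N:2013420134:708:12234.323:A:N''', ...
    'UNT+16+''');

% 1) Split into lines and keep only those starting with the prefix.
allLines  = strtrim(regexp(rawText, '\r?\n', 'split'));
dataLines = allLines(strncmp(allLines, prefix, numel(prefix)));

% 2) "Text-to-columns": split every kept line on ':'.
%    Assumes at least one matching line and equal field counts.
fields = regexp(dataLines, ':', 'split');
fields = vertcat(fields{:});            % nRows-by-nCols cell array

% 3) Replace the 10-digit date with its first 5 digits.
fields(:, dateCol) = cellfun(@(s) s(1:5), fields(:, dateCol), ...
    'UniformOutput', false);

% 4) Keep only the requested columns.
result = fields(:, keepCols);
disp(result)
```

The last field of each line still carries the `'` terminator in this sketch; it is simply not among the kept columns here.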


Jeff on 6 Apr 2014
Edited: Jeff on 6 Apr 2014

Votes: 0

fid = fopen('mydata.txt', 'r');
tline = fgetl(fid);
k = 1;
while ischar(tline)
    disp(tline)
    findrows = strfind(tline, 'ARR++');
    if ~isempty(findrows)
        Data{k,:} = tline(:);
        k = k+1;
    end
    tline = fgetl(fid);
end
fclose(fid);
filename = 'mydata_1.txt';
% this function converts the cell array back to a text file; found here: cell array to text
cell2text(filename,Data);
% open file
fid = fopen(filename, 'r');
% figure out how many columns are there
firstline = fgetl(fid);
ncol = 1 + sum(firstline == ':');
% reset to beginning of file ('bof'; an origin of 0 means the current position)
fseek(fid,0,'bof');
% read data
data = textscan(fid,repmat('%s',1,ncol),'Delimiter',':','CollectOutput',1);
data = data{:,:};
data = data(:,[3:7 9:12 14 16]);
[m,n] = size(data);
% keep only the first 5 digits of the date column
for i = 1:m
    data{i,9} = data{i,9}(1:5);
end
% close file
fclose(fid);

Asked: 5 Apr 2014
Edited: 6 Apr 2014
