Problem only reading in select data

Hello all,
I am trying to read this data file into MATLAB, but I am having trouble grabbing only the data I want. The file is formatted as follows:
*Sale Item Price Profit
1200 00213 12.21 3.26*
Date Salesperson Cost Sold At Net Money
1/10/11 12 13.45 16.45 3
1/14/11 14 3.98 3.48 -0.5
1/24/11 03 4.60 14.60 10
*Sale Item Price Profit
65 01452 13.78 6.12*
Date Salesperson Cost Sold At Net Money
1/04/11 11 20.10 40.10 20
1/06/11 11 20.11 16.11 4
*Sale Item Price Profit*
...
And so on.
I only want MATLAB to read in the data within the asterisks. Any thoughts on how to do this?
Thanks

4 Comments

Matt Tearle on 6 Apr 2011
Just to clarify: the asterisks are actually in the file?
Zach on 6 Apr 2011
The asterisks are not in the file; I put them in simply to show you exactly which pieces of data I need to read in.
Matt Tearle on 6 Apr 2011
(To clarify the clarification: or are you looking to read data in any block with a certain headerline? ie "Sale Item Price Profit")
Zach on 6 Apr 2011
If I'm following you correctly, my answer is that I wish to read only the data associated with Sale, Item, Price, Profit.


Accepted Answer

Matt Tearle on 6 Apr 2011

1 vote

On the off-chance Walter's approach doesn't work (eg there are more than two block formats in the file), here's a more brute-force approach:
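fid = fopen('asterisk.txt','rt');
data = [];                      % accumulates one row per data line found in the "Sale" blocks
while ~feof(fid)
    thisline = fgetl(fid);      % read one line of text (fgetl returns -1 at end of file)
    % Header test: case-insensitive match on the first 4 characters of the line
    if ischar(thisline) && strncmpi('sale',thisline,4)
        % Read numeric rows until textscan reaches text it cannot parse (the next headerline)
        thisdata = textscan(fid,'%f %f %f %f','collectoutput',true);
        data = [data;thisdata{1}];
    end
end
fclose(fid);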
You can modify the if statement to match whatever specific pattern you want.
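As a rough sketch of that kind of modification (not from the thread): if other blocks could also begin with "sale", one option is to compare the whole line against the header text instead of only its first four characters. The `wanted` string, the `lines` cell array, and the 'Sales Tax Summary' line are made up for illustration; in the real loop, `thisline` would come from `fgetl(fid)`.

```matlab
% Assumed exact header text for the blocks we want to keep.
wanted = 'Sale Item Price Profit';

% Stand-ins for lines returned by fgetl; the third one is hypothetical.
lines = {'Sale Item Price Profit', ...
         'Date Salesperson Cost Sold At Net Money', ...
         'Sales Tax Summary'};

isHeader = false(1, numel(lines));
for k = 1:numel(lines)
    thisline = lines{k};
    % ischar guards against the -1 that fgetl returns at end of file;
    % deblank drops trailing whitespace before the case-insensitive compare.
    isHeader(k) = ischar(thisline) && strcmpi(deblank(thisline), wanted);
end
```

Only the first line passes this test, whereas a four-character `strncmpi` check would also accept the third.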

8 Comments

Walter Roberson on 6 Apr 2011
Nasty. That relies on the property of textscan() that it falls out of textscan() when the next available data does not match the first format element. With the information given, specifying that you only wanted to repeat the format once would avoid that problem -- but then you might as well use fscanf() instead of textscan()
Matt Tearle on 6 Apr 2011
I don't understand the objection. What do you mean by "specifying that you only wanted to repeat the format once"? I agree that you could parse line-by-line, but I'm assuming
1) you want to read all blocks that start with a headerline "Sale Item Price Profit"
2) you don't know a priori how many lines are in each of those blocks
3) every block in the file starts with a headerline
4) as I said above, there are multiple block formats, not just the two shown
Under those assumptions, I don't see why you shouldn't read each "Sale Item Price Profit" block with textscan, knowing that it will stop at the next headerline.
Zach on 6 Apr 2011
Well, I also learned that MATLAB 6.5 doesn't have textscan as a built-in function.
Walter Roberson on 6 Apr 2011
Matt, we weren't shown any examples of there being more than one line of data in a Sale block, so to match what was shown a textscan() repeat count of 1 could be used without depending upon textscan to "back up" when it figures out something is unparsable.
But that doesn't help Zach, who doesn't have textscan() and thus should probably be using fscanf()
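For readers who do have textscan(), here is a minimal sketch of the repeat-count idea, assuming each Sale block holds exactly one data row as in the sample shown. The temporary file and its contents (copied from the question) exist only to make the example self-contained; `fname` and `row` are illustrative names.

```matlab
% Write a small sample file, with contents taken from the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'Sale Item Price Profit\n1200 00213 12.21 3.26\n');
fprintf(fid, 'Date Salesperson Cost Sold At Net Money\n1/10/11 12 13.45 16.45 3\n');
fprintf(fid, 'Sale Item Price Profit\n65 01452 13.78 6.12\n');
fclose(fid);

fid = fopen(fname, 'rt');
data = [];
while ~feof(fid)
    thisline = fgetl(fid);
    if ischar(thisline) && strncmpi('sale', thisline, 4)
        % Third argument 1: apply the format exactly once, i.e. read one row,
        % so textscan never has to run into the next headerline.
        row = textscan(fid, '%f %f %f %f', 1, 'CollectOutput', true);
        data = [data; row{1}];
    end
end
fclose(fid);
delete(fname);
```

If a Sale block can contain more than one row, this would silently drop the extra rows, which is the trade-off against the open-ended textscan call.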
Zach on 6 Apr 2011
Is it even possible to parse through data with varying blocks using fscanf? Also, I know that a field can be skipped by putting an asterisk in its format specifier, but will fscanf be able to handle the string that we were passing in earlier?
Walter Roberson on 6 Apr 2011
In Matt's code example, replace the lines
thisdata = textscan(fid,'%f %f %f %f','collectoutput',true);
data = [data;thisdata{1}];
with
thisdata = fscanf(fid, '%f%f%f%f');
data = [data;thisdata];
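One thing to watch with this substitution: called without a size argument, fscanf returns everything it reads as a single column vector, so appending it directly stacks the values into one long column instead of four-column rows. A possible variation, offered as a sketch and not as what was posted, asks fscanf for a 4-by-k matrix and transposes it. The sample file below reuses data from the question; `fname` and `vals` are illustrative names.

```matlab
% Write a small sample file, with contents taken from the question.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'Sale Item Price Profit\n1200 00213 12.21 3.26\n');
fprintf(fid, 'Date Salesperson Cost Sold At Net Money\n1/10/11 12 13.45 16.45 3\n');
fprintf(fid, 'Sale Item Price Profit\n65 01452 13.78 6.12\n');
fclose(fid);

fid = fopen(fname, 'rt');
data = [];
while ~feof(fid)
    thisline = fgetl(fid);
    if ischar(thisline) && strncmpi('sale', thisline, 4)
        % Read numbers until the first non-numeric text, filling a 4-by-k
        % matrix column by column; each column is one row of the file.
        vals = fscanf(fid, '%f', [4 Inf]);
        data = [data; vals'];   % transpose so each file row becomes a matrix row
    end
end
fclose(fid);
delete(fname);
```

With the sample above, `data` ends up with one row per Sale block and four columns (Sale, Item, Price, Profit).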
Zach on 6 Apr 2011
Thank you all for your help. If it isn't too much trouble, I have one final question to help my understanding: what exactly does the thisline portion do, and what does the 4 represent in the strncmpi function?
Matt Tearle on 6 Apr 2011
Walter, that makes sense. Thanks for the non-textscan version.
Zach, fgetl reads a single line of text. Then strncmpi compares the first 4 characters of that string with the string 'sale' (that's what the 4 does). You can adapt this if, for example, you had other blocks that also started with "sale" (but then had something else after).
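A few quick checks illustrate what the third argument does. The strings are stand-ins for lines read by fgetl, and the 'Salesperson 12' line is hypothetical.

```matlab
% First 4 characters are 'Sale'; case is ignored, so this matches.
a = strncmpi('sale', 'Sale Item Price Profit', 4);

% First 4 characters are 'Date', so no match, even though 'Sale' appears later.
b = strncmpi('sale', 'Date Salesperson Cost Sold At Net Money', 4);

% A hypothetical line that also starts with 'Sale' matches too...
c = strncmpi('sale', 'Salesperson 12', 4);

% ...unless more characters are compared: 'sale item' vs 'Salespers' differ.
d = strncmpi('sale item', 'Salesperson 12', 9);
```

Here `a` and `c` are true while `b` and `d` are false, so lengthening the pattern and the count is one way to tell similar headers apart.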


More Answers (1)

Walter Roberson on 6 Apr 2011

1 vote

textread() with 'CommentStyle', {'Date', 'Profit'}

5 Comments

Matt Tearle on 6 Apr 2011
Grah! Scooped by Walter Quickdraw Roberson while I was fiddling about with clarifications. Anyway, yes:
fid = fopen('asterisk.txt','rt');
data = textscan(fid,'%f %f %f %f','CommentStyle', {'Date', 'Profit'},'headerlines',1);
fclose(fid);
Zach on 6 Apr 2011
I just tried applying this solution and unfortunately I got an error telling me that "Comment style must be a string". I am confused because I thought that is what {'Date','Profit'} was.
Matt Tearle on 6 Apr 2011
Can you cut/paste the exact code you used?
Walter Roberson on 6 Apr 2011
Zach: Which version of MATLAB are you using? Using a cell array of a pair of strings has been supported since at least 2007b, but there was probably a time when it wasn't supported.
Matt: You snooze, you loze! ;-)
Zach on 6 Apr 2011
Sorry, I went out to lunch. I am using MATLAB 6.5, so it probably wasn't supported in this version. I will try to use Matt's code listed below.


Asked: 6 Apr 2011
