Scanning a text file for bits and pieces of information

Hi.
I'm new to MATLAB and a bit stuck on where to begin coding a program that opens a text file, finds specific strings, and extracts information from them. Here's my general outline and an example of the text format I'm trying to read.
Example format (the wanted information has asterisks on either side of it):
<solution solution_id="telemetry" name="SITE_FRAME" add_date="2004-01-26T12:04:23Z" index1="1">
<reference_frame name="SITE_FRAME" index1=" *0* "/>
<offset x=" *-0.0* " y=" *0.0* " z=" *0.0* "/>
<orientation s="1.0" v1="0.0" v2="0.0" v3="0.0"/>
</solution>
<alias>
<old index1="0" index2="0" index3="1" index4="59"/>
<new index1="1"/>
</alias>
<solution solution_id="telemetry" name="SITE_FRAME" add_date="2004-01-26T12:04:25Z" index1="2">
<reference_frame name="SITE_FRAME" index1=" *1* "/>
<offset x=" *-0.0* " y=" *0.0* " z=" *0.0* "/>
<orientation s="1.0" v1="0.0" v2="0.0" v3="0.0"/>
</solution>
<alias>
<old index1="1" index2="0" index3="0" index4="13"/>
<new index1="2"/>
Outline:
Open the file 'mer_site' (the file that contains this format of information)
Look for instances of * information,
i.e. the parts that say 'name="SITE_FRAME" index1="*"/>'
Export the * information into a vector (total size [1x158])
Look for instances of * information,
i.e. the parts that say '<offset x="*" y="*" z="*"/>'
Export the * information into an array (total size [3x158])
Close the file
So far, all I know is that I need
fid=fopen('mer_site')
something with textscan
something about putting the results from textscan into a cell
closing the file
I'm not sure which arguments I need for textscan because the information is mixed.
I'd be super grateful if anyone could help with this!!!
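The outline above (open, scan, collect, close) can be sketched line by line with fgetl and sscanf instead of textscan. This is only a sketch under assumptions: the real file has no asterisk markers, each element sits on its own line, and the attributes appear in exactly the order shown in the excerpt. The sample text and the temporary file are hypothetical stand-ins for 'mer_site'.

```matlab
% Hypothetical sample standing in for 'mer_site' (asterisk markers removed).
sample = sprintf('%s\n', ...
    '<reference_frame name="SITE_FRAME" index1="0"/>', ...
    '<offset x="-0.0" y="0.5" z="1.5"/>', ...
    '<orientation s="1.0" v1="0.0" v2="0.0" v3="0.0"/>', ...
    '<reference_frame name="SITE_FRAME" index1="1"/>', ...
    '<offset x="2.0" y="-3.0" z="4.0"/>');
filespec = [tempname '.txt'];
fid = fopen(filespec, 'w');
fprintf(fid, '%s', sample);
fclose(fid);

% Open the file and walk through it one line at a time.
fid = fopen(filespec, 'r');
assert(fid ~= -1, 'Could not open %s', filespec);
idx = zeros(1, 0);   % becomes 1-by-N
xyz = zeros(3, 0);   % becomes 3-by-N
tline = fgetl(fid);
while ischar(tline)              % fgetl returns -1 at end of file
    tline = strtrim(tline);
    % sscanf returns [] when the literal text in the format does not match.
    k = sscanf(tline, '<reference_frame name="SITE_FRAME" index1="%d"/>');
    v = sscanf(tline, '<offset x="%f" y="%f" z="%f"/>');
    if numel(k) == 1
        idx(end+1) = k;
    end
    if numel(v) == 3
        xyz(:, end+1) = v;
    end
    tline = fgetl(fid);
end
fclose(fid);
delete(filespec);
```

Because the format strings are literal, this approach breaks as soon as the attribute order or spacing changes, so it suits only files known to be uniform.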

Comments: 9

Would you be able to use an xml parser on this?
I tried using one, and maybe it's my beginner-level use of MATLAB, but nothing seems to be in the structure?
per isakson on 20 Jul 2015
Edited: per isakson on 20 Jul 2015
You show an excerpt of a file and we guess that it is an xml-file. We are probably mistaken.
I originally got it as an svf file and I just saved it as a .txt file so I could upload it
That file does not appear to be in Simple File Verification (SFV) format. Perhaps it has something to do with State Fusion Vector?
I found an "issue" with mer1_master.
<solution solution_id="telemetry" name="SITE_FRAME" add_date="2013-12-03T17:01:12Z" index1="182">
<reference_frame name="SITE_FRAME" index1="181"/>
....
<solution add_date="2014-02-16T19:02:11Z" index1="183" name="SITE_FRAME" solution_id="telemetry">
<reference_frame index1="182" name="SITE_FRAME"/>
....
The order of the "items" (attributes) changes after index 183. I guess that is not significant in an xml-file.
The directory above says that the files are indeed XML files.
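If the files really are XML, a DOM parser does not care about attribute order. Below is a minimal sketch using MATLAB's xmlread, assuming the complete file is well-formed with a single root element (the excerpt alone does not show one). The `<root>` wrapper, the sample values, and the temporary file are hypothetical and only there to make the sketch self-contained.

```matlab
% Hypothetical well-formed sample; the second block swaps the attribute order.
xmlText = [ ...
    '<root>' ...
    '<solution solution_id="telemetry" name="SITE_FRAME" index1="1">' ...
    '<reference_frame name="SITE_FRAME" index1="0"/>' ...
    '<offset x="-0.0" y="0.0" z="0.0"/>' ...
    '</solution>' ...
    '<solution index1="2" name="SITE_FRAME" solution_id="telemetry">' ...
    '<reference_frame index1="1" name="SITE_FRAME"/>' ...
    '<offset x="12.5" y="4.25" z="-0.75"/>' ...
    '</solution>' ...
    '</root>'];
filespec = [tempname '.xml'];
fid = fopen(filespec, 'w');
fprintf(fid, '%s', xmlText);
fclose(fid);

doc  = xmlread(filespec);                  % returns a Java DOM document
sols = doc.getElementsByTagName('solution');
n    = sols.getLength();
index1 = nan(1, n);
xyz    = nan(3, n);
for k = 1:n
    sol = sols.item(k-1);                  % DOM node lists are zero-based
    ref = sol.getElementsByTagName('reference_frame').item(0);
    off = sol.getElementsByTagName('offset').item(0);
    if isempty(ref) || isempty(off)
        continue                           % leave NaN for incomplete blocks
    end
    % getAttribute returns a Java string; convert it before parsing.
    index1(k) = str2double(char(ref.getAttribute('index1')));
    xyz(:, k) = [ str2double(char(off.getAttribute('x')))
                  str2double(char(off.getAttribute('y')))
                  str2double(char(off.getAttribute('z'))) ];
end
delete(filespec);
```

Attributes are looked up by name here, so the reordering seen after index 183 would make no difference.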


Accepted Answer

per isakson on 20 Jul 2015
Edited: per isakson on 20 Jul 2015
I'm surprised that xml2struct by Wouter Falkena failed with your file. Did it throw any error or warning message?
Instead of trying it myself, I did an exercise with regular expressions.
>> out = cssm('c:\m\cssm\mer1_master.txt')
out =
1x191 struct array with fields:
index1
x
y
z
>> out(5)
ans =
index1: 4
x: 12.3513
y: 4.1437
z: -0.8949
>> out(185)
ans =
index1: 184
x: -438.7025
y: -0.5040
z: -10.6850
>>
where
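function out = cssm( filespec )
    % Read the whole file into one character vector.
    str = fileread( filespec );
    % Grab the text between each <solution ...> and </solution> pair.
    xpr = '(?<=<solution).+?(?=</solution>)';
    cac = regexp( str, xpr, 'match' );
    % Preallocate one struct per solution block; index1 starts as NaN.
    out = struct( 'index1', num2cell(nan(1,length(cac)))...
                , 'x',[], 'y',[], 'z',[] );
    % Named tokens capture index1 of <reference_frame> and x, y, z of <offset>.
    % The '.*' on either side of index1 accepts the attributes in either order.
    xpr = cat( 2 ...
        , '<reference_frame'                ...
        , '.*'                              ...
        , ' index1="(?<index1>\d+)"'        ...
        , '.*'                              ...
        , '/>'                              ...
        , '\s*'                             ...
        , '<offset'                         ...
        , ' x="(?<x>[\-\d\.]+)"'            ...
        , ' y="(?<y>[\-\d\.]+)"'            ...
        , ' z="(?<z>[\-\d\.]+)"'            ...
        , '/>' );
    % Parse each block and convert the captured text to numbers.
    for jj = 1 : length( cac )
        sas = regexp( cac{jj}, xpr, 'names' );
        out(jj).index1 = str2double( sas.index1 );
        out(jj).x = str2double( sas.x );
        out(jj).y = str2double( sas.y );
        out(jj).z = str2double( sas.z );
    end
end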
Caveat: This function is based on reverse engineering of a single file and was tested only with that same file. It may fail in the future with some other file.
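The question asked for a [1xN] vector of index1 values and a [3xN] array of offsets, while cssm returns a struct array. A minimal sketch of the conversion follows, assuming every element of out was filled in; an element whose x, y, or z is still [] would silently shorten the corresponding row. The two-element out below is a stand-in built from the out(5) and out(185) values printed above.

```matlab
% Stand-in for the struct array returned by cssm (values from the session above).
out = struct( 'index1', {4, 184} ...
            , 'x', {12.3513, -438.7025} ...
            , 'y', {4.1437, -0.5040} ...
            , 'z', {-0.8949, -10.6850} );

% [out.field] expands to a comma-separated list and concatenates it.
index1 = [out.index1];                   % 1-by-N row vector
xyz    = [ [out.x]; [out.y]; [out.z] ];  % 3-by-N array, rows are x, y, z
```

With the real file, N would be the number of solution blocks that cssm found.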

Comments: 1

This worked!!! Thank you so much! I'm super psyched and will give you credit if I ever publish the results of my data or reference this program. Thanks! I tried using something to process xml files and got an extremely confusing structure that I was trying to sift through, bit by bit. This helped so much!


Asked: 20 Jul 2015
Commented: 21 Jul 2015