How to parse values from .txt/XML file?

I'm trying to read and store numbers that are sandwiched between some text and that repeat throughout the file. The file looks something like this:
<Info>
<Date>2022-11-01T05:36</date>
<Time>05:36</Time>
<Cost>101.30</Cost>
</Info>
<Info>
<Date>2022-11-01T13:22</date>
<Cost>107.50</Cost>
</Info>
<Info>
<Date>2022-11-01T17:05</date>
<Cost>203.73</Cost>
</Info>
And so on; this pattern repeats.
What I'm trying to do is parse out the date and cost for each entry so that I can eventually put them into an Excel spreadsheet.
The issue I'm having is that I can't get it to read the values, and I'm not sure how to store the variables.
I started by just trying to extract the cost. I tried this:
fid = fopen('CostFunc.txt');
tline = fgetl(fid);
lineCounter = 1;
while ischar(tline)   % fgetl returns -1 (not char) at end of file
    if contains(tline, '<Cost>', 'IgnoreCase', true)
        disp(tline)
    end
    tline = fgetl(fid);
    lineCounter = lineCounter + 1;
end
fclose(fid);
It shows me each cost line, but it doesn't store anything, so I can't do anything with the values (for example, find the average cost or write them to Excel).
I have no clue how to handle the date/time.
Any assistance is appreciated!

Comments: 2

Walter Roberson on 20 Nov 2022
try the new readstruct()
MathandPhysics on 20 Nov 2022
I get an error stating "Unrecognized function or variable 'readstruct'."
I think this is because I am using R2019b, so I can't use this function (and unfortunately I am not able to change the version of MATLAB!)


Answers (1)

Walter Roberson on 20 Nov 2022


S = fileread('CostFunc.txt');
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");
Costs = str2double({parts.Cost});
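As a reading of the code above: `fileread` loads the whole file as one character vector, the two named tokens in the `regexp` pattern capture the text following each `Date>` and the next `Cost>`, and `datetime`/`str2double` convert the captured text. The sketch below continues from there to cover the storing, averaging, and Excel steps asked about in the question. The sample text is embedded in place of the `fileread` call so the snippet runs on its own, and the output file name `CostSummary.xlsx` is only an illustrative assumption.

```matlab
% Sample text copied from the question, standing in for fileread('CostFunc.txt')
S = strjoin({ ...
    '<Info>', ...
    '<Date>2022-11-01T05:36</date>', ...
    '<Time>05:36</Time>', ...
    '<Cost>101.30</Cost>', ...
    '</Info>', ...
    '<Info>', ...
    '<Date>2022-11-01T13:22</date>', ...
    '<Cost>107.50</Cost>', ...
    '</Info>', ...
    '<Info>', ...
    '<Date>2022-11-01T17:05</date>', ...
    '<Cost>203.73</Cost>', ...
    '</Info>'}, newline);

% Same pattern and conversions as in the answer
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', 'uuuu-MM-dd''T''HH:mm');
Costs = str2double({parts.Cost});

% Store the values as columns of a table so they can be reused later
T = table(Dates(:), Costs(:), 'VariableNames', {'Date', 'Cost'});

% Example calculation on the stored values
avgCost = mean(T.Cost);

% Write the table to a spreadsheet (file name is a hypothetical choice)
writetable(T, 'CostSummary.xlsx');
```

Keeping the values in a table rather than in separate variables means each date stays paired with its cost, and `writetable` picks the spreadsheet format from the `.xlsx` extension.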

Comments: 3

MathandPhysics on 7 Mar 2023
OK, I was finally able to sit down with this again and ended up with the following error:
"Error using datetime. Unable to convert the text to datetime using the format 'uuuu-MM-dd'T'HH:mm'."
I tried changing the 'uuuu' to 'yyyy' and to 'yy', and ended up with the exact same error.
MathandPhysics on 7 Mar 2023
I also tried it with HH:mm:ss to see if that made a difference. No change; same error.
@MathandPhysics: Seems to work using a text file made by copying and pasting the text from the question.
S = fileread('CostFunc.txt');
parts = regexp(S, '(?<=Date>)(?<Date>[^<]+).*?(?<=Cost>)(?<Cost>[^<]+)', 'names');
Dates = datetime({parts.Date}, 'InputFormat', "uuuu-MM-dd'T'HH:mm");
Costs = str2double({parts.Cost});
Dates,Costs
Dates = 1×3 datetime array
01-Nov-2022 05:36:00 01-Nov-2022 13:22:00 01-Nov-2022 17:05:00
Costs = 1×3
101.3000 107.5000 203.7300
Maybe you can upload the actual file you have, using the paperclip button.
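Until the actual file is available, the cause of the error can only be guessed at. One way to narrow it down might be to look at the captured strings themselves before converting them. The sketch below is only an illustration: `rawDates` holds made-up strings standing in for `{parts.Date}`, with a trailing space and a seconds field added as assumed examples of how a real file could differ from the sample in the question.

```matlab
% Hypothetical captured strings standing in for {parts.Date} from the real file;
% the second has a trailing space and the third includes seconds.
rawDates = {'2022-11-01T05:36', '2022-11-01T13:22 ', '2022-11-01T17:05:09'};

cleaned = strtrim(rawDates);       % drop leading/trailing whitespace
lens = cellfun(@numel, cleaned);   % differing lengths hint at mixed formats
disp(unique(lens));

% Try the minutes-only format first, then fall back to one with seconds.
formats = {'uuuu-MM-dd''T''HH:mm', 'uuuu-MM-dd''T''HH:mm:ss'};
Dates = NaT(size(cleaned));
for k = 1:numel(cleaned)
    for f = 1:numel(formats)
        try
            Dates(k) = datetime(cleaned{k}, 'InputFormat', formats{f});
            break   % this format matched; stop trying others
        catch
            % this format did not match; try the next one
        end
    end
end

% Any entries still NaT matched neither format and need a closer look
disp(cleaned(isnat(Dates)));
```

If every string has the same length and still fails to convert, printing one of them with `double(cleaned{1})` shows the character codes, which can reveal non-printing characters.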



Release: R2019b

Asked: 20 Nov 2022

Commented: 7 Mar 2023
