Extracting numbers with decimal places from the body of a text file and assigning them to a variable

Hi,
I have a text file that I have read into MATLAB as a character array. The body of the file contains free text, but I am only after specific variables.
I want to extract specific values from the text and assign each one to its own variable.
For example, my text looks something like the following:
Header with text and comments
other text that I am not interested in, etc.
AAA = 18.457
BBB = 34.6
CCC = 4
I would like my results to be a series of variables:
AAA = 18.457
BBB = 34.6
CCC = 4
I could then perform operations on these variables.
I tried using the following:
fid = fopen('file','r');
text = textscan(fid,'%s','Delimiter','','endofline','');
text = text{1}{1};
fid = fclose(fid);
expression = 'AAA = (\d+)';
AAA = regexp(text,expression,'tokens');
However, this only returned "18" rather than my desired "18.457" (it stops at the decimal point). Is there a way to extract a number that may or may not have decimal places?
Ideally, it also would not be sensitive to the exact number of spaces after the variable name, i.e. it should just need "AAA" rather than "AAA ".
Is there a way to use MATLAB to achieve what I want?

Accepted Answer

Stephen23 on 1 Jan 2021
Edited: Stephen23 on 1 Jan 2021
%str = fileread(..) % <- simpler way to import the file data.
str = sprintf('%s\n','Header with text and comments','other text that I am not interested in, etc.','AAA = 18.457','BBB = 34.6','CCC = 4')
str =
    'Header with text and comments
     other text that I am not interested in, etc.
     AAA = 18.457
     BBB = 34.6
     CCC = 4
     '
rgx = '^\s*(\w+)\s*=\s*(\d+\.?\d*)';
tkn = regexp(str,rgx,'tokens','lineanchors');
tkn = vertcat(tkn{:}).';
tkn(2,:) = num2cell(str2double(tkn(2,:)));
out = struct(tkn{:})
out = struct with fields:
    AAA: 18.4570
    BBB: 34.6000
    CCC: 4
out.AAA
ans = 18.4570
Personally I would use a different approach: open the file, read the header lines using fgetl, then import the data using textscan. It would probably be easier than messing about with matching number formats (i.e. don't reinvent the wheel).
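A minimal sketch of the fgetl/textscan approach suggested above. The file content, the temporary file name, and the count of two header lines are assumptions taken from the example in the question, not from a real data file; adjust them to suit the actual file.

```matlab
% Write a small example file (hypothetical content matching the question).
fnm = [tempname,'.txt'];
fid = fopen(fnm,'wt');
fprintf(fid,'%s\n','Header with text and comments', ...
    'other text that I am not interested in, etc.', ...
    'AAA = 18.457','BBB = 34.6','CCC = 4');
fclose(fid);

% Read it back: skip the header lines, then parse the "name = value" pairs.
fid = fopen(fnm,'rt');
for k = 1:2                      % assumed number of header lines
    fgetl(fid);
end
C = textscan(fid,'%s%f','Delimiter','=');
fclose(fid);
delete(fnm);

names = strtrim(C{1});           % remove any spaces left around the names
out = cell2struct(num2cell(C{2}),names,1);
```

Here textscan does the number parsing, so decimal places, signs, and exponents are handled without writing a number-matching pattern by hand.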
4 Comments
James Browne on 2 Jan 2021
Thanks, I made that work with my code.
I added in "\-?" so the token is now "(\-?\d+\.?\d*)" because I also wanted to include negative numbers as possible outputs.
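The sign-aware pattern can be checked with a short sketch like the one below. The sample string is made up for illustration (it mixes a negative value, no spaces around "=", and extra spaces); the rest follows the accepted answer.

```matlab
% Hypothetical sample text: negative value, no spaces, and extra spaces.
str = sprintf('%s\n','Header text','AAA = -18.457','BBB=34.6','  CCC   =   4');

% -? allows an optional leading minus sign; \s* tolerates any spacing.
rgx = '^\s*(\w+)\s*=\s*(-?\d+\.?\d*)';
tkn = regexp(str,rgx,'tokens','lineanchors');
tkn = vertcat(tkn{:}).';                      % 2xN cell: names; number text
tkn(2,:) = num2cell(str2double(tkn(2,:)));    % convert number text to double
out = struct(tkn{:});
```

The header line is skipped because it contains no "=" after its first word, so it never matches the pattern.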
Instead of pulling out individual variables from the structure (i.e. with out.AAA), is it possible to make each field in the structure into a variable with its own name?
Stephen23 on 2 Jan 2021
Edited: Stephen23 on 2 Jan 2021
"is it possible to make each variable in the structure array into a variable along with its name?"
Possible yes, but only if you want to force yourself into writing slow, complex, obfuscated, buggy code that is difficult to debug:
There are so many reasons why that is a fragile, bad approach to writing your code. For example, consider what your code would do if the header name happens to be the same name as any existing variable: it would simply overwrite that variable without any warning. Such bad code design allows for all sorts of latent bugs that are difficult to track down because they depend on specific data... ugh.
If you know the headers/variables in advance then by all means allocate them explicitly:
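A minimal sketch of explicit allocation, assuming the structure `out` from the answer above and the three field names from the example file. The struct literal is only a stand-in for the parsed result, and `total` is a hypothetical follow-on calculation.

```matlab
% Stand-in for the structure produced by the parsing code.
out = struct('AAA',18.457,'BBB',34.6,'CCC',4);

% Each variable is named explicitly in the code, so nothing is
% created or overwritten behind your back.
AAA = out.AAA;
BBB = out.BBB;
CCC = out.CCC;

total = AAA + BBB + CCC;
```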
If you do NOT know the headers in advance then magically creating variables from them would be a fragile, buggy, ugly approach: how would you even know what header had been imported? (trivially easy to do with the structure, quite tricky to do with randomly named variables in a workspace)
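As a sketch of why the structure is easier to work with when the headers are unknown: the imported names can be listed with fieldnames, tested with isfield, and accessed through dynamic field names. The struct literal below is a stand-in for the parsed result, and `doubled` is a hypothetical calculation.

```matlab
% Stand-in for the structure produced by the parsing code.
out = struct('AAA',18.457,'BBB',34.6,'CCC',4);

% List every header that was imported, with its value.
names = fieldnames(out);
for k = 1:numel(names)
    fprintf('%s = %g\n', names{k}, out.(names{k}));
end

% Use a value only if that header actually exists in the file.
if isfield(out,'BBB')
    doubled = 2*out.BBB;
end
```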
