Extract numbers from mixed string

Question

K E on 19 Jul 2012

Votes: 4

Direct link to this question:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string

Edited: Angkur Shaikeea on 21 Oct 2021

I have a file containing header lines like the following,

Test setup: MaxDistance = 60 m, Rate = 1.000, Permitted Error = 50
Operator Note:  Air Temperature=20 C, Wind Speed 16.375m/s, Altitude 5km (Cloudy)

For a given parameter such as MaxDistance or Wind Speed, I would like to extract its numerical value. This is tricky because sometimes there is an equal sign, space, or units, and sometimes there is not, because different operators enter their notes differently (lesson: next time enforce consistency).

How would I extract the following: All numerical characters (ignoring spaces and equal signs but keeping decimal points) that appear after the string representing the parameter name. Stop when a letter or punctuation mark is reached. In the case of 'MaxDistance', I would obtain 60. In the case of Wind Speed, I would obtain 16.375.

Comments: 2

Albert Yam on 19 Jul 2012

Edited: John Kelly on 26 Feb 2015

What have you tried?

Jianming She on 17 Jun 2020

Edited: Jianming She on 18 Jun 2020

This seems a more general way:

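function numArray = extractNumFromStr(str)
% Replace separators that may sit directly against a number.
str1 = regexprep(str,'[,;=]', ' ');
% Keep only characters that can belong to a number, then collapse non-digit runs.
str2 = regexprep(regexprep(str1,'[^- 0-9.eE(,)/]',''), ' \D* ',' ');
% Drop stray decimal points and exponent letters left next to whitespace.
str3 = regexprep(str2, {'\.\s','\E\s','\e\s','\s\E','\s\e'},' ');
% Convert the remaining space-separated tokens to a numeric row vector.
numArray = str2num(str3);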

Example:

a = 'alpha=-3.5,beta=1e-2. but gamma = -34.1'
numArray = extractNumFromStr(a)

numArray =
   -3.5000    0.0100  -34.1000


Answer 1 (Accepted Answer)

Jan on 19 Jul 2012

Votes: 19

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_54038

Edited: Jan on 19 Jul 2012

First import the file into a string, e.g. with fileread. You then get something like this (if not, please explain all the necessary details):

Str = ['Test setup: MaxDistance = 60 m, Rate = 1.000, ', ...
       'Permitted Error = 50 Operator Note:  Air Temperature=20 C, ', ...
       'Wind Speed 16.375m/s, Altitude 5km (Cloudy)'];

Now remove all equal signs:

Str(strfind(Str, '=')) = [];

Finally you can get the values:

Key   = 'MaxDistance';
Index = strfind(Str, Key);
Value = sscanf(Str(Index(1) + length(Key):end), '%g', 1);

"Index(1)" handles multiple occurrences of the key by using the first one.

Comments: 3

Jan on 19 Jul 2012

Removing the = is clear, I think. Then STRFIND looks for the wanted string. Afterwards SSCANF extracts the first number behind this string. Here "behind" means the position where the string is found plus the number of characters the string has.
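The steps described above can be sketched end to end as follows. This is only an illustration under stated assumptions: the key 'Wind Speed' is picked as an example, strrep stands in for the index-based deletion of '=', and returning NaN when the key or the number is missing is a choice made here, not something from the answer.

```matlab
Str = ['Test setup: MaxDistance = 60 m, Rate = 1.000, ', ...
       'Permitted Error = 50 Operator Note:  Air Temperature=20 C, ', ...
       'Wind Speed 16.375m/s, Altitude 5km (Cloudy)'];
Key = 'Wind Speed';                  % example key, chosen for this sketch

Clean = strrep(Str, '=', '');        % same effect as deleting every '='
Index = strfind(Clean, Key);         % start positions of all occurrences

if isempty(Index)
    Value = NaN;                     % assumption: a missing key yields NaN
else
    Tail  = Clean(Index(1) + length(Key):end);   % text right after the key
    Value = sscanf(Tail, '%g', 1);   % %g skips leading whitespace
    if isempty(Value)
        Value = NaN;                 % key found, but no number follows it
    end
end
```

Because sscanf stops at the first character that does not fit '%g', the trailing units ("m/s") do not need to be stripped first.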

Lorenzo on 30 Oct 2013

This works great! Just a quick question, Jan: what if you want to find all occurrences of a numeric value between two strings? For instance, say you want the numeric values found between MaxDistance and Altitude in the original example (i.e. 60, 1.000, 50, etc.). How can you achieve that?

I tried this:

Key1 = 'MaxDistance';
Key2 = 'Altitude';
Index1 = strfind(file, Key1);
Index2 = strfind(file, Key2);
Value = sscanf(file(Index1:Index2), '%g',1);

but I still get nothing but the first value. Also, I don't know a priori how many numbers can be encountered between the two strings.

Thanks!

Lorenzo
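One possible way to approach this question, offered as a sketch rather than as an answer from the thread: sscanf with a count of 1 reads at most one value, and it stops at the first text that does not fit the format, so a pattern match over the segment between the two keys may be easier. The names Segment, Tokens and Values are made up for this example, and the sketch assumes both keys exist and that Key2 appears after Key1.

```matlab
Str = ['Test setup: MaxDistance = 60 m, Rate = 1.000, ', ...
       'Permitted Error = 50 Operator Note:  Air Temperature=20 C, ', ...
       'Wind Speed 16.375m/s, Altitude 5km (Cloudy)'];
Key1 = 'MaxDistance';
Key2 = 'Altitude';

Index1 = strfind(Str, Key1);
Index2 = strfind(Str, Key2);

% Text strictly between the end of Key1 and the start of Key2.
Segment = Str(Index1(1) + length(Key1) : Index2(1) - 1);

% 'match' returns every unsigned number as text, however many there are.
Tokens = regexp(Segment, '\d+\.?\d*', 'match');

% str2double converts the whole cell array in one call.
Values = str2double(Tokens);
```

For the sample header this gives the five values 60, 1, 50, 20 and 16.375.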


Answer 2

Stephan Koehler on 7 Jun 2017

Votes: 6

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_269941

Here is a one-line answer:

str2num( regexprep( Str, {'\D*([\d\.]+\d)[^\d]*', '[^\d\.]*'}, {'$1 ', ' '} ) )

Comments: 2

Alexandre THIBEAULT on 27 Jan 2021

Best answer

Marco A. Acevedo Z. on 2 Apr 2021

Hi, good answer, but how can the - sign be included (if present)? Thanks.


Answer 3

Freddy on 19 Jul 2012

Votes: 2

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_54057

Maybe a little too late, but I would also like to present my ("regexp training") solution. :)

A = regexp(Str,'(?<Keyword>(?:\w+\s*\w+))\s*=?\s*(?<Value>\d+\.?\d*)','names');
s = struct();
for i = A, 
  s.(genvarname(i.('Keyword'))) = str2double(i.('Value'));
end

Comments: 1

Albert Yam on 19 Jul 2012

Edited: Albert Yam on 19 Jul 2012

It took me a long time to understand what you are doing. That's cool though.

How does it skip over 'Operator Note:' ?

Edit: Never mind, I get it. The pattern has nothing that matches ':'. The '(?:\w' has nothing to do with a ':' in the string; it groups the token for 'up to two words'.
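To make the grouping point concrete, here is a small sketch that runs the same pattern on a shortened header. The sample string is an assumption made for this example, and matlab.lang.makeValidName (available in newer releases) is swapped in for genvarname; it is not part of the original answer.

```matlab
Str = 'MaxDistance = 60 m, Wind Speed 16.375m/s';   % shortened sample header

% (?<Keyword>...) and (?<Value>...) are named tokens; (?:...) only groups
% the "one or two words" part and does not create a token of its own.
A = regexp(Str, ...
    '(?<Keyword>(?:\w+\s*\w+))\s*=?\s*(?<Value>\d+\.?\d*)', 'names');

s = struct();
for k = 1:numel(A)
    % Turn the keyword text into a legal struct field name.
    field = matlab.lang.makeValidName(A(k).Keyword);
    s.(field) = str2double(A(k).Value);
end
```

Here A is a 1-by-2 struct array: the first element holds 'MaxDistance' and '60', the second 'Wind Speed' and '16.375'. The lone 'm' after 60 is skipped because the keyword part needs at least two word characters followed by a number.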


Answer 4

Albert Yam on 19 Jul 2012

Votes: 1

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_54043

This is how I went about it, with all steps included, even the errors.

teststr = 'Test setup: MaxDistance = 60 m, Rate = 1.000, Permitted Error = 50 Operator Note:  Air Temperature=20 C, Wind Speed 16.375m/s, Altitude 5km (Cloudy)';
regexp(teststr,[\d])
regexp(teststr,['\d'])
regexp(teststr,['\d'],'match')
regexp(teststr,['\d+'],'match')
regexp(teststr,['\d+.?'],'match')
regexp(teststr,['\d+\.?'],'match')
regexp(teststr,['\d+\.?\d?'],'match')
regexp(teststr,['\d+\.?\d+?'],'match')
regexp(teststr,['\d+\.?\d*?'],'match')
regexp(teststr,['\d+\.?\d?'],'match')
regexp(teststr,['\d+\.?\d*'],'match')

Comments: 6

G on 7 Nov 2013

Edited: G on 13 Nov 2013

Better:

regexp(teststr,'\d+\.?\d*|-\d+\.?\d*|\.?\d+|-\.?\d+','match')

or

regexp(teststr,'-?\d+\.?\d*|-?\d*\.?\d+','match')

The -.34e-004 case remains unhandled!
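A possible extension for that case, given as a hedged sketch: the pattern below adds an optional sign, a leading-dot alternative and an optional exponent. The sample string and the variable names are assumptions made for this example, and the pattern will also pick up an 'e' or 'E' that directly follows digits in ordinary text, so it is not safe for every header.

```matlab
Str = 'a = -.34e-004, b = 1e-2, c = 16.375, d = -7';   % hypothetical sample

% Optional sign, then digits with an optional fraction or a bare fraction,
% then an optional exponent such as e-004 or E+3.
Pattern = '[-+]?(\d+\.?\d*|\.\d+)([eE][-+]?\d+)?';

Tokens = regexp(Str, Pattern, 'match');   % full matches, groups are ignored
Values = str2double(Tokens);
```

With this sample, Tokens contains '-.34e-004', '1e-2', '16.375' and '-7'.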

Angkur Shaikeea on 21 Oct 2021

Edited: Angkur Shaikeea on 21 Oct 2021

I need to extract

0.00000 0.00000 0.00000
0.00000 1.00000 0.00000
1.00000 0.00000 0.00000

from a text file containing

.............................................
Nodal positions:
0.00000 0.00000 0.00000
0.00000 1.00000 0.00000
1.00000 0.00000 0.00000
Nodal positions:
0.00000 0.00000 0.00000
0.00000 1.00000 0.00000
1.00000 0.00000 0.00000
Nodal positions:
0.00000 0.00000 0.00000
0.00000 1.00000 0.00000
1.00000 0.00000 0.00000

Any help using regexp?
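One way this might be done without regexp, sketched under assumptions: the file content is inlined here as the hypothetical variable Txt (in practice it could come from fileread), every block is assumed to hold three numbers per row, and strsplit plus sscanf are used in place of a regular expression.

```matlab
% Hypothetical file content, inlined so the sketch runs on its own.
Txt = sprintf(['Nodal positions:\n0.00000 0.00000 0.00000\n', ...
               '0.00000 1.00000 0.00000\n1.00000 0.00000 0.00000\n', ...
               'Nodal positions:\n0.00000 0.00000 0.00000\n', ...
               '0.00000 1.00000 0.00000\n1.00000 0.00000 0.00000\n']);

Parts = strsplit(Txt, 'Nodal positions:');
Parts = Parts(2:end);              % drop whatever precedes the first header

Nodes = cell(size(Parts));
for k = 1:numel(Parts)
    % sscanf fills column by column, so read 3-by-N and transpose.
    M = sscanf(Parts{k}, '%f', [3 Inf]);
    Nodes{k} = M.';
end
```

Each cell of Nodes then holds one 3-column matrix of positions; Nodes{1} is the first block.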


Answer 5

C.J. Harris on 19 Jul 2012

Votes: 0

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_54047


In order to extract a certain value:

Str = ['Test setup: MaxDistance = 60 m, Rate = 1.000, ', ...
       'Permitted Error = 50 Operator Note:  Air Temperature=20 C, ', ...
       'Wind Speed 16.375m/s, Altitude 5km (Cloudy)'];
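matchWord = 'Air Temperature';
[a,b]  = regexp(Str,'\d+(\.\d+)?');        % start and end index of every number
keyPos = strfind(Str,matchWord);           % may hold several occurrences
strPos = find(a > keyPos(1),1,'first');    % first number after the first occurrence
nValue = str2double(Str(a(strPos):b(strPos)));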

Comments: 0

Answer 6

Dahai Xue on 10 Mar 2016

Votes: 0

Direct link to this answer:

https://kr.mathworks.com/matlabcentral/answers/44049-extract-numbers-from-mixed-string#answer_213144

Edited: KSSV on 25 Jan 2021

C.J. Harris, I put your regexp into a function that extracts all numbers. I am having a hard time finding an array operation that can use 'a' and 'b' without the loop. Hopefully somebody has ideas. Of course, it is not difficult to add more parameters or options to find "certain" numbers with preceding or following landmark strings.

function nums = regExtractNums(str) 
[a,b] = regexp(str, '\d+(\.\d+)?'); 
nums = zeros(length(a),1); 
for k = 1:length(a) 
    nums(k) = str2double(str(a(k):b(k))); 
end
end
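A loop-free variant might look like the sketch below. Instead of the start and end indices it asks regexp for the matched text directly, which sidesteps the indexing problem rather than solving it. The sample string and the variable name tokens are assumptions for this example.

```matlab
str    = 'MaxDistance = 60 m, Rate = 1.000, Wind Speed 16.375m/s';
tokens = regexp(str, '\d+(\.\d+)?', 'match');   % cell array of number strings
nums   = str2double(tokens(:));                 % column vector, no explicit loop
```

str2double accepts a cell array of character vectors and returns a numeric array of the same size, so tokens(:) yields the same column shape as the loop version.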

Comments: 0
