Extracting values from string using regexp
Hello,
I have a string array that contains a list of equations. I would like to read these equations and use them in calculations, but I could not find a good way to do so without the 'eval' function (which I want to avoid for efficiency). Instead, I'd like to extract the values using 'regexp' and reconstruct the equation, but I am struggling to set up the regular expression.
Here are example equations:
k1 = "6.8e-9*Te^0.67*exp(-4.4/(Te+0.5))"
k2 = "6.8e-9*exp(-4/Te)"
k3 = "1.2e7"
These equations follow the general form A * Te^n * exp(B/(Te+C)). I would like to extract the values of A, n, B, and C and store them in a matrix like [A, n, B, C], with one row per equation. So in this case, I would like to have the following result:
k_value = [6.8e-9, 0.67, -4.4, 0.5; 6.8e-9, 0, -4, 0; 1.2e7, 0, 0, 0]
Once I have these values as a matrix, I can evaluate the original equations (given a scalar value of Te) like this:
k = k_value(:,1) .* Te.^(k_value(:,2)) .* exp(k_value(:,3) ./ (Te + k_value(:,4)))
How can I use 'regexp' (or other method) to construct 'k_value' as above?
Thank you for your time!
Comments: 1
Extracting just the numbers is easy and efficient using regular expressions:
str = {'6.8e-9*Te^0.67*exp(-4.4/(Te+0.5))','6.8e-9*exp(-4/Te)','1.2e7'};
rgx = '[-+]?\d+\.?\d*([eE][-+]?\d+)?';
out = regexp(str,rgx,'match');
out{:}
The hard part is knowing which part of the expression they come from, which the accepted answer does not do.
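To illustrate why position is the hard part, here is a small sketch that reuses the strings and pattern from this comment and counts how many numbers each expression yields. The variable name cnt is introduced here only for illustration.

```matlab
% Same input strings and number pattern as in the comment above.
str = {'6.8e-9*Te^0.67*exp(-4.4/(Te+0.5))','6.8e-9*exp(-4/Te)','1.2e7'};
rgx = '[-+]?\d+\.?\d*([eE][-+]?\d+)?';
out = regexp(str,rgx,'match');
% Count the numbers found in each expression.
cnt = cellfun(@numel,out)   % 4 2 1
```

The second expression yields only two numbers, and nothing in the 'match' output says whether they are A and B or A and n.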
Accepted Answer
More Answers (1)
The problem is not extracting the numbers (which is easy) but knowing which coefficient each extracted number belongs to, which is not trivial when parts of the expression can be missing entirely. It can be done with regular expressions using optional grouping parentheses: these return empty strings when their content is not matched, so we can keep track of exactly which values were matched:
% regular expression:
rgd = '\d+\.?\d*';
rge = '([eE][-+]?\d+)?';
rgx = ['^([-+]?NX)\*?(Te\^)?(?(2)[-+]?N)\*?(exp\()?',...
'(?(4)[-+]?N)(?(5)/\(?Te)?(?(6)[-+]N)?\)?\)?$'];
rgx = strrep(rgx,'N',rgd);
rgx = strrep(rgx,'X',rge);
% your input data:
str = {'6.8e-9*Te^0.67*exp(-4.4/(Te+0.5))','6.8e-9*exp(-4/Te)','1.2e7'};
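% one row of tokens per input string (unmatched optional tokens are empty):
tkn = regexp(str,rgx,'tokens','once');
tkn = vertcat(tkn{:})
format short g
% odd token columns hold the numbers A, n, B, C; even columns hold the markers:
mat = str2double(tkn(:,1:2:7));
% empty tokens convert to NaN, i.e. that term is missing, so use zero:
mat(isnan(mat)) = 0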
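Once the coefficients are in a matrix, the evaluation described in the question can be sketched as below. This is only a sketch under stated assumptions: mat is written out by hand from the values the question expects (rather than taken from the regexp output), Te is assumed to be a scalar, and the value Te = 2 is arbitrary. Note the element-wise ./ in the exponent.

```matlab
% Coefficients [A, n, B, C], one row per expression (values from the question).
mat = [6.8e-9, 0.67, -4.4, 0.5; 6.8e-9, 0, -4, 0; 1.2e7, 0, 0, 0];
Te = 2; % hypothetical scalar value, chosen only for illustration
% A .* Te.^n .* exp(B ./ (Te + C)), evaluated for all rows at once.
k = mat(:,1) .* Te.^mat(:,2) .* exp(mat(:,3) ./ (Te + mat(:,4)));
% Reference values typed out directly from the original expressions.
ref = [6.8e-9*Te^0.67*exp(-4.4/(Te+0.5)); 6.8e-9*exp(-4/Te); 1.2e7];
% Largest relative difference between the two evaluations.
err = max(abs(k - ref) ./ ref)
```

One caveat of the zero-filled form: for a row with B = 0 and C = 0, the exponent becomes 0/0 = NaN when Te is exactly zero, so that case may need separate handling.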
Note that this regular expression does not check for syntactic correctness: it can also match strings that are not syntactically valid expressions, so it relies on your a priori knowledge of the input strings. I also had to guess at the permitted syntaxes, which you have not yet formally defined.
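A related point worth checking is the opposite case, where a string does not match at all. With the 'once' option, regexp returns an empty cell for such a string, and vertcat then silently drops that row, so the rows of the result no longer line up with the input. A minimal sketch of a check, using a deliberately simplified, hypothetical pattern and made-up strings (not the pattern above):

```matlab
% Hypothetical simplified pattern: one signed number, optionally followed by *Te.
rgx = '^([-+]?\d+\.?\d*)(\*Te)?$';
% Made-up inputs; the last one does not fit the pattern.
str = {'1.5*Te','2','Te*1.5'};
tkn = regexp(str,rgx,'tokens','once');
% Flag inputs with no match before concatenating the tokens.
bad = cellfun(@isempty,tkn)
% Concatenate only the rows that matched, so row indices stay traceable.
good = vertcat(tkn{~bad});
```

Here bad marks the inputs that would otherwise vanish from the concatenated result, and find(~bad) gives the original index of each row of good.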