To extract the last substring from strings
Hello everybody,
I would like to extract the last substring of each string, i.e. the number just before 'EP'.
I first split each string with the strsplit function and then picked out the last substring in a for-loop.
I assume this can be done without a for-loop.
Could you please show me how to write the code without the for-loop?
load input.mat
temp = cellfun(@(x) strsplit(x, {' ','EP'}), input, 'UniformOutput', false); % split the string
for i=1:length(temp)
num(i,:) = str2double(temp{i}(end-1)); % fill the last right string in cells
end
Accepted Answer
Dyuman Joshi
2 Sep 2022
Edited: Dyuman Joshi
2 Sep 2022
You can use extractBetween, starting one character after the last space and ending at the suffix 'EP', which is common to all of the strings:
load input.mat
num=cellfun(@(x) str2double(extractBetween(x,find(x==' ',1,'last')+1,'EP')), input)
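To make the one-liner easier to follow, here is a minimal sketch that walks through the same steps on a single made-up string. The contents of input.mat are not shown in this thread, so the sample text (two words followed by a number and 'EP') is an assumption.

```matlab
% Hypothetical sample string; the real data in input.mat may differ.
x = 'aa bb 1.5EP';

% Index of the last space character in the string.
k = find(x == ' ', 1, 'last');

% Text starting one character after the last space and ending before 'EP'.
% For a char vector, extractBetween returns a 1x1 cell array.
txt = extractBetween(x, k + 1, 'EP');

% str2double accepts a cell array of character vectors.
v = str2double(txt);
```

The cellfun call in the answer simply applies these three steps to every cell of the array.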
More Answers (5)
Stephen23
2 Sep 2022
Avoid slow CELLFUN, STR2NUM, REGEXP, etc.
One SSCANF call is probably the most efficient approach by far, as well as being very simple:
S = load('input.mat');
C = S.input
V = sscanf([C{:}],'%*s %*s%fEP')
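A minimal sketch of how that format string is consumed, using two made-up strings. The sample data and its layout (two words, then a number followed by 'EP') are assumptions inferred from the format string, not the actual contents of input.mat.

```matlab
% Hypothetical sample data standing in for S.input.
C = {'aa bb 1.5EP'; 'cc dd 20EP'};

% Join all strings into one character vector: 'aa bb 1.5EPcc dd 20EP'.
txt = [C{:}];

% %*s  reads one word and discards it (the * suppresses the output)
% ' '  matches the whitespace between the words
% %*s  reads and discards the second word
% %f   skips leading whitespace and reads the number
% EP   matches the literal suffix
% The format is reapplied until the text is exhausted, so one call
% returns one number per original string, as a column vector.
V = sscanf(txt, '%*s %*s%fEP');
```

Because the strings are concatenated without a separator, this relies on every string following the same layout; a string with a different number of words would shift the parsing for everything after it.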
Comments: 2
Dyuman Joshi
2 Sep 2022
Stephen, how do you know that cellfun, str2num, regexp etc are slower functions? And are there any resources where I can read more about this?
Stephen23
2 Sep 2022
Edited: Stephen23
2 Sep 2022
"how do you know that cellfun, str2num, regexp etc are slower functions?"
- many years of reading this and other forums, learning from the combined knowledge of many users.
- many years of writing unit tests for my own code (i.e. making modifications and comparing).
- reading the documentation, to know what features functions have and how to use them.
- knowledge about functions, e.g. STR2NUM calls EVAL inside (so is not optimised by the JIT engine), and CELLFUN by design must call a function handle repeatedly (slower than a loop).
Let's compare the answers given so far on this thread:
S = load('input.mat');
C = repmat(S.input,1e3,1); % bigger array -> easier to compare
timeit(@()funAtsushiUeno(C)) % REGEXP, CELLFUN, STR2NUM
timeit(@()funDyumanJoshi(C)) % CELLFUN, EXTRACTBETWEEN, STR2DOUBLE
timeit(@()funChunru(C)) % REGEXP, CELLFUN, STR2NUM
timeit(@()funImageAnalyst(C)) % loop and indexing
timeit(@()funKSSV(C)) % REGEXP, STRREP, STR2DOUBLE
timeit(@()funS23(C)) % SSCANF
So, my function is more than eight times faster than the next fastest function (from KSSV), as well as being the simplest. And I had a fair idea that would be the case, even before writing this test code.
"And are there any resources where I can read more about this?"
If you want to learn how to use MATLAB efficiently, my advice is to read this forum a lot. And when I write "a lot", I don't mean "just a little bit". And not just new threads: there are some really important topics that have been discussed in some old yet canonical threads on this forum.
V1 = funAtsushiUeno(C);
V2 = funDyumanJoshi(C);
V3 = funChunru(C);
V4 = funImageAnalyst(C);
V5 = funKSSV(C);
V6 = funS23(C);
isequal(V1(:), V2(:), V3(:), V4(:), V5(:), V6(:)) % checking the function outputs:
function num = funAtsushiUeno(C)
num = regexp(C,'([\d+-e.]+)EP\s*$','tokens');
num = [num{:}];
num = [num{:}];
num = cellfun(@str2num, num);
end
function num = funDyumanJoshi(C)
num=cellfun(@(x) str2double(extractBetween(x,find(x==' ',1,'last')+1,'EP')), C);
end
function y = funChunru(C)
s = regexp(C, '\s([\d\.]*)EP$', 'tokens');
y = cellfun(@(s) str2num(s{1}{1}), s);
end
function numbers = funImageAnalyst(data)
rows = numel(data);
numbers = zeros(rows, 1);
for row = 1 : rows
words = strsplit(data{row});
lastWord = words{end};
numbers(row) = str2double(lastWord(1:end-2));
end
end
function s = funKSSV(C)
expression = '\d+\.?\d*EP';
s = regexp(C,expression,'match') ;
s = strrep([s{:}]','EP','') ;
s = str2double(s);
end
function V = funS23(C)
V = sscanf([C{:}],'%*s %*s%fEP');
end
Chunru
2 Sep 2022
load(websave("input.mat", "https://www.mathworks.com/matlabcentral/answers/uploaded_files/1114425/input.mat"))
input
s = regexp(input, '\s([\d\.]*)EP$', 'tokens');
y = cellfun(@(s) str2num(s{1}{1}), s)
Atsushi Ueno
2 Sep 2022
Moved: Image Analyst
2 Sep 2022
This is a bit of a challenge because the numbers stored in the cell array have varying lengths.
I would use the regexp function.
load input.mat
num = regexp(input,'([\d+-e.]+)EP\s*$','tokens');
num = [num{:}];
num = [num{:}];
num = cellfun(@str2num, num)
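As a possible variant (not the code posted above), the 'once' option of regexp returns a single token per string, which removes one level of cell nesting, and str2double converts the whole cell array in one call without cellfun. The sample strings and the simplified pattern below are assumptions for illustration.

```matlab
% Hypothetical sample data; the real strings in input.mat may differ.
C = {'aa bb 1.5EP'; 'cc dd 20EP'};

% Capture the run of non-whitespace characters directly before a
% trailing 'EP'. With 'once', each cell of tok is a 1x1 cell holding
% the captured text.
tok = regexp(C, '(\S+)EP\s*$', 'tokens', 'once');

% Flatten to a cell array of character vectors and convert in one call.
% str2double returns NaN for text it cannot parse and does not use EVAL.
num = str2double([tok{:}]).';
```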
Image Analyst
2 Sep 2022
Here's one way. Simple, straightforward, and intuitive. There may be more compact but cryptic methods though.
s = load('input.mat')
data = s.input; % DON'T call your variables input, which is a built-in function!
rows = numel(data)
numbers = zeros(rows, 1);
for row = 1 : rows
words = strsplit(data{row});
lastWord = words{end};
numbers(row) = str2double(lastWord(1:end-2));
end
KSSV
2 Sep 2022
expression = '\d+\.?\d*EP';
s = regexp(input,expression,'match') ;
s = strrep([s{:}]','EP','') ;
s = str2double(s)