finding a numeric pattern in a vector

Saikrishna
Saikrishna on 19 Apr 2023
Commented: Saikrishna on 21 Apr 2023
Hi all,
I have a numeric vector and I am trying to find a pattern (with one missing number) in it.
Example:
My numeric vector: vec = [5 6 1 2 3 3 4 5 6 1 2 6 3 5 4 2 3 11 2 31 3 4 5 1 2 6 31 11 2 5]
Pattern: pat = [3 4 * 1 2]
I know a solution for the case where one or more numbers may be missing, for example: [start end] = regexp(char(vec),char( [3 4 *? 1 2]),'start','end') gives the start and end points of the patterns (3 4 5 6 1 2) and (3 4 5 1 2) in the vector. But I am looking only for (3 4 5 1 2), with exactly one missing number.

Accepted Answer

Stephen23
Stephen23 on 20 Apr 2023
Edited: Stephen23 on 20 Apr 2023
Your basic concept is okay. You need to select an appropriate character match and quantifier. Note that the asterisk is actually a quantifier, as is the question mark (depending on context).
You also have not taken into account any characters that need to be escaped, e.g. char(36) == '$'.
Assuming only integers between 0 and 65535, here is a robust approach (no fiddling around counting characters):
V = [5,6,1,2,3,3,4,5,6,1,2,6,3,5,4,2,3,11,2,31,3,4,5,1,2,6,31,11,2,5];
F = @(n)regexptranslate('escape',char(n));
R = sprintf('%s.%s',F([3,4]),F([1,2]));
[X,Y] = regexp(F(V),R)
X = 21
Y = 25
V(X:Y)
ans = 1×5
3 4 5 1 2
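To illustrate the point about picking the right quantifier, here is a small sketch (not part of the original answer) that runs three variants against the same vector. The names esc, head and tail are made up for this example. Note that it searches char(V) directly: the text being searched is not interpreted as an expression, so only the pattern pieces are escaped here. That is an assumption about what is intended; for this particular data it gives the same indices as above.

```matlab
V = [5,6,1,2,3,3,4,5,6,1,2,6,3,5,4,2,3,11,2,31,3,4,5,1,2,6,31,11,2,5];

% Hypothetical helper: escape only the pattern pieces, not the searched text.
esc  = @(n) regexptranslate('escape', char(n));
head = esc([3,4]);
tail = esc([1,2]);

% '.' matches exactly one character, i.e. exactly one vector element.
[s1, e1] = regexp(char(V), [head '.' tail]);      % s1 = 21,     e1 = 25

% '.*?' matches zero or more characters (lazy), so longer gaps match too.
[sN, eN] = regexp(char(V), [head '.*?' tail]);    % sN = [6 21], eN = [11 25]

% '.{2}' matches exactly two characters, i.e. a gap of two elements.
[s2, e2] = regexp(char(V), [head '.{2}' tail]);   % s2 = 6,      e2 = 11
```

Because each vector element becomes exactly one character, the start and end values are already indices into V.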
  1 Comment
Saikrishna
Saikrishna on 21 Apr 2023
Thank you @Stephen23, this is what I was looking for.


More Answers (1)

Les Beckham
Les Beckham on 19 Apr 2023
Edited: Les Beckham on 19 Apr 2023
Note that I added an additional test at the end of vec to make sure this handles a multi-digit number in the middle position of the pattern ([3 4 10 1 2]).
vec = [5 6 1 2 3 3 4 5 6 1 2 6 3 5 4 2 3 11 2 31 3 4 5 1 2 6 31 11 2 5 3 4 10 1 2];
str = num2str(vec);
pat = ['3\s+4\s+\d+\s+1\s+2'];
result = regexp(str, pat, 'match')
result = 1×2 cell array
{'3 4 5 1 2'} {'3 4 10 1 2'}
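One thing to watch with the num2str approach: as written, the pattern can also match inside longer numbers, e.g. starting at the 3 of 13 or ending at the 2 of 21. A minimal sketch of one way to guard against that, assuming non-negative integers, is to anchor the expression with the word-boundary operators \< and \>. The test vector and the names loose and strict are made up for illustration.

```matlab
% Hypothetical test vector: a decoy (13 4 5 1 21) followed by a real match.
vec = [13 4 5 1 21 3 4 5 1 2];
str = num2str(vec);

% Without anchors, the decoy matches from the '3' of 13 to the '2' of 21.
loose  = regexp(str, '3\s+4\s+\d+\s+1\s+2', 'match');

% \< and \> require the match to start and end on whole numbers.
strict = regexp(str, '\<3\s+4\s+\d+\s+1\s+2\>', 'match');

numel(loose)    % 2 matches, one of them spurious
numel(strict)   % 1 match
```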
  2 Comments
Saikrishna
Saikrishna on 20 Apr 2023
Thank you for your reply. Do you know how to get the indices of the matching pattern?
Walter Roberson
Walter Roberson on 20 Apr 2023
vec = [5 6 1 2 3 3 4 5 6 1 2 6 3 5 4 2 3 11 2 31 3 4 5 1 2 6 31 11 2 5 3 4 10 1 2];
str = num2str(vec);
pat = ['3\s+4\s+\d+\s+1\s+2'];
result = regexp(str, pat)
would return the indices of the starting points inside the character vector str. That is a bit of a problem, because you would have to convert character vector indices to array indices, allowing for the possibility that not all entries have the same width (if they all had the same width, the calculation would be straightforward).
One way to get them to all have the same width is to use something like
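digits_needed = length(num2str(max(vec)));   % width of the widest entry
fmt = sprintf('%%%dd', digits_needed);       % e.g. '%2d': right-aligned, fixed width
str = strjoin(compose(fmt, vec), ' ');       % strjoin returns a char vector, so regexp returns a numeric array
pat = '3\s+4\s+\d+\s+1\s+2';
str_locations = regexp(str, pat);
vec_indices = floor((str_locations - 1) / (digits_needed + 1)) + 1   % floor: a match may start partway into a padded field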
or something close to that
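Another possibility, sketched here under the assumption that vec holds only non-negative integers, is to skip the fixed-width arithmetic and instead look up each match start among the start positions of all the numbers in str. The names tokStart, hit and vecIdx are made up for this example, and the \< and \> anchors are an addition so that a match can only begin at the start of a number.

```matlab
vec = [5 6 1 2 3 3 4 5 6 1 2 6 3 5 4 2 3 11 2 31 3 4 5 1 2 6 31 11 2 5 3 4 10 1 2];
str = num2str(vec);

% Start position in str of every number; entry k belongs to vec(k).
tokStart = regexp(str, '\d+');

% Start positions of the pattern, anchored to whole numbers.
hit = regexp(str, '\<3\s+4\s+\d+\s+1\s+2\>');

% Map character positions back to vector indices.
[~, vecIdx] = ismember(hit, tokStart)   % vecIdx = [21 31]
```

This works regardless of how many digits each entry has, since it never relies on the spacing that num2str happens to produce.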


Release

R2022b
