Find the range of duplicates in a sorted element

Annie on 30 Apr 2025
Answered: Thorsten on 30 Apr 2025
So let's say I have a vector
a = [6 2 2 5]
I sort it with the sort function, and now:
a = [2 2 5 6]
How do I find the range of the duplicate number (2)? That is, I want it to tell me where the duplicates start (element 1) and where they end (element 2).
And if I have [2 2 5 5 6]
It should tell me that the copies are in 1-2 and 3-4

Accepted Answer

John D'Errico on 30 Apr 2025
Edited: John D'Errico on 30 Apr 2025
I'll create a longer vector, with a few duplicates.
V0 = randi(8,[1,15])
V0 = 1×15
7 1 2 7 1 2 7 5 1 5 6 7 2 5 3
V = sort(V0)
V = 1×15
1 1 1 2 2 2 3 5 5 5 6 7 7 7 7
Now you want to know where the dups live, in the sorted vector. Just find the first and last elements of any dups. The trick is an old one that uses diff, and then a search for a specific pattern.
dV = diff(V) > 0
dV = 1×14 logical array
0 0 1 0 0 1 1 0 0 1 1 0 0 0
Hmm. That might be useful. Where a duplicate lives, we see a zero, since diff computes the difference between consecutive elements, and equal neighbors differ by zero. So we just need to find each block of zeros, and the first and last zero in that block. This means we can use a trick that employs strfind. Yes, I know, it's not a string. Or is it? strfind just looks for a desired pattern in a vector.
What does this tell us?
startloc = strfind([1,dV,1],[1 0])
startloc = 1×4
1 4 8 12
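Do you see what I did? Prepending a 1 at the beginning is a way to find a block of zeros that starts at the very beginning. Appending a 1 at the end allows us to find a block of zeros that runs to the very end. Because of the leading 1, each match position in the padded vector is also the index in V where that block of duplicates starts.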
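To see why the padding matters, here is a small sketch on the sorted vector from the question. The variable names a, d, unpadded, and padded are only for illustration, and the explicit double() conversion is a cautious assumption so that strfind receives a plain numeric vector.

```matlab
a = [2 2 5 6];                        % sorted vector from the question
d = double(diff(a) > 0);              % [0 1 1]: the zero marks the duplicate pair
unpadded = strfind(d, [1 0]);         % empty: no 1 precedes the leading zero
padded   = strfind([1 d 1], [1 0]);   % 1: the block starting at element 1 is found
```

Without the leading 1, a block of duplicates at the very start of the vector has nothing for the [1 0] pattern to match against, so it is silently missed.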
Now, how about this?
endloc = strfind([1,dV,1],[0 1])
endloc = 1×4
3 6 10 15
Do you see how that worked? It identified the duplicate blocks in V: startloc holds the first index of each block, and endloc holds the last.
blocklength = endloc - startloc + 1
blocklength = 1×4
3 3 3 4
Using diff and strfind together like this is a useful trick, well worth remembering. Don't forget to pad with those ones at each end, though.
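As a rough sketch, the same steps can be applied to the second vector from the question. The input ordering and the names firstIdx, lastIdx, ranges, and values are made up for this example and are not part of the answer above.

```matlab
a = [5 2 6 2 5];                      % unsorted input (hypothetical ordering)
b = sort(a);                          % [2 2 5 5 6]
d = diff(b) > 0;                      % 0 wherever two neighbours are equal
padded = [1 d 1];                     % pad so blocks touching either end are detected
firstIdx = strfind(padded, [1 0]);    % first index of each block of duplicates
lastIdx  = strfind(padded, [0 1]);    % last index of each block of duplicates
ranges = [firstIdx(:) lastIdx(:)];    % one row per block: [first last]
values = b(firstIdx);                 % the duplicated value in each block
```

Here ranges comes out as [1 2; 3 4] and values as [2 5], so the 2s occupy elements 1-2 and the 5s occupy elements 3-4. The unique value 6 does not appear, because a single element produces no zero in d.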

More Answers (1)

Thorsten on 30 Apr 2025
If you are just looking for pairs, you can use
b = sort(a);
startloc = find(diff(b) == 0);
endloc = startloc + 1;
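The restriction to pairs matters. As a small illustration with a made-up vector, a value that occurs three times is reported as two overlapping pairs rather than as one block:

```matlab
b = [1 1 1 4];                     % sorted, with one value repeated three times
startloc = find(diff(b) == 0);     % [1 2]
endloc = startloc + 1;             % [2 3]
```

So the result reads as the pairs 1-2 and 2-3 instead of the single range 1-3. For runs longer than two, an approach that locates whole blocks of zeros in diff is the safer choice.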
