Finding duplicates without using the unique function

I'm struggling to write a user-defined function that detects duplicates within a matrix. This is what I have so far:
function bmatch = testing(data)
edges = min(data):max(data);
[counts,values] = histcounts(data, edges);
if values(counts>=2)
bmatch = 1;
else
bmatch = 0;
end
However, this doesn't detect duplicates or report the number of duplicates in a given matrix, and I don't understand why.

Comments: 4

Please describe what you mean by 'doesn't work'. How do you know it doesn't work? Is there an error message?
It can't detect duplicates within a matrix.
Give an example of what you want to catch. Since you can always convert a matrix into a vector, the input being a matrix is irrelevant.
Something like this:
A = [ 1 2 3 4 5 6 7 8 9 10]
match_a = testing(A) % should return bmatch as 0 and make a matrix [0]
B = [ 1 1 2 3 4 5 6 5 9]
match_b = testing(B) % should return bmatch as 1 and make a matrix [1 1 5 5]


Answers (1)

Walter Roberson on 2 May 2023
Moved: Walter Roberson on 2 May 2023
Watch carefully:
data = [ 1 2 3 4 5 6 7 8 9 10]
data = 1×10
1 2 3 4 5 6 7 8 9 10
edges = min(data):max(data)
edges = 1×10
1 2 3 4 5 6 7 8 9 10
[counts,values] = histcounts(data, edges)
counts = 1×9
1 1 1 1 1 1 1 1 2
values = 1×10
1 2 3 4 5 6 7 8 9 10
Notice that the final count is 2 and that the vector of counts has one fewer element than edges. Read carefully about what happens in the edge cases for histcounts.
Your code also has problems if the values are not all integers, if there are non-finite values, or if one of the values is much larger than the others. For example, your code should be able to handle testing([-1e40 1e40]) without difficulty, but it will run out of memory, because min(data):max(data) would need about 2e40 elements.
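To see the edge case in isolation, here is a minimal sketch, assuming integer-valued, finite data with a small range (the variable names countsShared and countsOwnBin are made up for illustration). histcounts treats the last bin as closed on both sides, so appending one extra edge gives every integer its own bin:

```matlab
data = [1 2 3 4 5 6 7 8 9 10];

% Original edges: the last bin is [9, 10], so it collects both 9 and 10.
countsShared = histcounts(data, min(data):max(data));

% One extra edge: the bins are [1,2), [2,3), ..., [9,10), [10,11].
% Assumption: data is integer-valued and finite, with a small range.
countsOwnBin = histcounts(data, min(data):max(data)+1);

disp(countsShared(end))        % 2, although 9 and 10 each occur only once
disp(any(countsOwnBin >= 2))   % 0, no integer occurs twice
```

This only addresses the last-bin problem; it does nothing for non-integer values or very large ranges.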

Comments: 3

if values(counts>=2)
You create a logical index of the locations where the counts are >= 2. Then you use that logical index to select certain elements of values. Finally, you apply if to that list of values.
There may be zero, one, or several places where counts>=2, so values(counts>=2) could be empty (no duplicates detected), a scalar (a single duplicate detected), or a vector (several duplicates detected).
When you apply if to something, MATLAB considers the condition to be true only when the thing is non-empty and all of the values being tested are non-zero. It does not matter what the numeric values are (well, except for NaN, which cannot be converted to a logical value): if every element is non-zero, then the condition is true, and if the thing is empty or if there is even one value that is zero, the condition is false.
So what you are testing with if values(counts>=2) is whether the places that have duplicates are all non-zero. If there are no duplicates, or if there is a duplicate where values == 0, then the condition fails; if there is at least one duplicate and none of the duplicated values is 0, then the condition is considered true.
I suspect that is not what you intended to test.
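The rule for if on empty and non-scalar conditions can be checked directly. A small sketch (the variable names conditions, taken, and hasDuplicate are illustrative, not from the original code):

```matlab
% Try an empty array, a scalar, an all-non-zero vector, and a vector with a zero.
conditions = {[], 5, [1 5], [0 5]};
taken = false(size(conditions));
for k = 1:numel(conditions)
    if conditions{k}
        taken(k) = true;   % reached only if non-empty and all elements non-zero
    end
end
disp(taken)   % 0 1 1 0

% If the intent was "is any count at least 2", testing the counts
% themselves avoids any dependence on what the values happen to be.
counts = [1 2 1];
hasDuplicate = any(counts >= 2);
```

Whether any(counts >= 2) matches the intended test is an assumption about what the original code was meant to do.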
You really need to be thinking more about what the code should do if there are elements that are not integers.
Hint: if you sort the elements, then when there are no duplicates, no two adjacent elements are equal, but when there are duplicates, there will be places where adjacent elements are equal.
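One possible way to turn the sorting hint into code is sketched below. It is not necessarily what the answer had in mind; it relies on exact equality (so floating-point values that differ by rounding are not treated as duplicates, and NaN never equals NaN), and the names sorted, isRepeat, inRun, and dupes are made up for illustration:

```matlab
data = [1 1 2 3 4 5 6 5 9];

sorted = sort(data(:));            % flatten to a column, then sort
isRepeat = diff(sorted) == 0;      % true where an element equals its predecessor
bmatch = any(isRepeat);            % true if at least one duplicate exists

% Mark every element that takes part in a run of equal values.
inRun = [isRepeat; false] | [false; isRepeat];
dupes = sorted(inRun).';           % row vector of the duplicated entries

disp(bmatch)   % 1
disp(dupes)    % 1 1 5 5
```

For input without duplicates, such as 1:10, bmatch is false and dupes is empty rather than [0]. No range of edges is built, so non-integer values and widely separated values such as [-1e40 1e40] pose no memory problem.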


Release: R2023a

Asked: 1 May 2023

Commented: 2 May 2023
