Finding Duplicate Values per Column

Question

2 votes

Greetings, suppose Column A has these values - 7 18 27 42 65 49 54 65 78 82 87 98

Is there a way to compare the values (row by row) and search for duplicates? (I'm using MATLAB R2010b.) I don't want the duplicated values to be removed.

Thanks.


Answer 1 (Accepted Answer)

Jan, 22 October 2011

10 votes

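A = [7 18 27 42 65 49 54 65 78 82 87 98];
% n(k): how often the k-th unique value occurs; bin: which unique value each element of A maps to
[n, bin] = histc(A, unique(A));
% Positions (within unique(A)) of the values that occur more than once
multiple = find(n > 1);
% Indices into A of every element whose value is repeated
index    = find(ismember(bin, multiple));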

Now the values A(index) appear multiple times.
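On newer releases, where histc is documented as not recommended, the same bookkeeping can be sketched with unique and accumarray. This is only an illustration under that assumption, not part of the original answer; the names u, j, n, isDup and dupValues are made up for the example.

```matlab
A = [7 18 27 42 65 49 54 65 78 82 87 98];

% u: sorted unique values; j: for each element of A, its position in u
[u, ~, j] = unique(A);

% Count how many times each unique value occurs
n = accumarray(j(:), 1);

% Mark every element of A whose value occurs more than once
isDup = n(j) > 1;
index = find(isDup(:).');   % positions in A of the repeated values
dupValues = u(n > 1);       % the repeated values themselves
```

Nothing is removed from A here; index only points at the repeated entries.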

Comments: 2

Teik Yee, 19 November 2012

thx

Faheem Ud Din, 16 May 2017

thank u sir G


Answer 2

the cyclist, 22 October 2011

6 votes

Here's a slightly different way:

X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X,uniqueX);
indexToRepeatedValue = (countOfX~=1);
repeatedValues = uniqueX(indexToRepeatedValue)
numberOfAppearancesOfRepeatedValues = countOfX(indexToRepeatedValue)
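Since the question asks where the duplicates sit rather than only which values repeat, the result above can be mapped back to positions in X with ismember. A minimal sketch; isRepeated and positions are names introduced for this example. One caveat worth keeping in mind: if X holds only a single distinct value, uniqueX is a scalar and hist would interpret it as a number of bins rather than a bin center.

```matlab
X = [1 2 3 4 5 5 5 1];
uniqueX = unique(X);
countOfX = hist(X, uniqueX);            % one bin centered on each unique value
repeatedValues = uniqueX(countOfX ~= 1);

% Logical mask over X: true where the element is one of the repeated values
isRepeated = ismember(X, repeatedValues);
positions = find(isRepeated);           % indices into X
```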

Comments: 4 (2 earlier comments not shown)

Jan, 19 November 2012

@Yowh: It is unlikely that Harold listens to comments after 8 months.

Anurag Pujari, 25 March 2016 (edited 25 March 2016)

Accurate. What an excellent piece of code.


Answer 3

Hannes Greim, 19 January 2013

0 votes

You can use "tabulate" for cell arrays.
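A rough sketch of what that could look like, assuming the Statistics Toolbox (which provides tabulate) is installed. The sample data and the names names, tbl, counts and repeatedNames are hypothetical.

```matlab
% Assumption: the Statistics Toolbox is available (tabulate lives there)
names = {'a'; 'b'; 'a'; 'c'; 'b'; 'a'};   % hypothetical cell array of strings

% For a cell array of strings, tabulate returns a cell array with
% one row per distinct value: {value, count, percent}
tbl = tabulate(names);

counts = cell2mat(tbl(:, 2));             % counts as a numeric column
repeatedNames = tbl(counts > 1, 1);       % values that occur more than once
```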

Comments: 1

Clemens, 19 March 2014 (edited 19 March 2014)

This works only in versions after 2012, with the introduction of tables.

Answer 4

Wesley Allen, 9 February 2018 (edited 9 February 2018)

0 votes

Duplicate Finding with Tolerance

To find duplicates within a tolerance (e.g., for non-integers), I use the following:

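A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;
% Values of A that are distinct within the tolerance
uniqueA = uniquetol(A,TOL);
% Compare every element of A (rows) against every unique value (columns)
duplicateBool = abs(repmat(A,size(uniqueA.'))-repmat(uniqueA.',size(A))) < max(abs(uniqueA))*TOL;
% Number of elements of A matching each unique value
duplicateCount = sum(duplicateBool).';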

Just like with the cyclist's answer, if you want to isolate only the values that have more than one instance:

iDuplicate = (duplicateCount ~= 1);
repeatedValues = uniqueA(iDuplicate);
numberOfAppearancesOfRepeatedValues = duplicateCount(iDuplicate);
repeatedBool = duplicateBool(:,iDuplicate);

Using the Results

The unique values are in uniqueA:

>> uniqueA
uniqueA =
      1.3130
      2.2500
      2.4000
      3.7000

The quantity of each unique value is in duplicateCount:

>> duplicateCount
duplicateCount =
       2
       3
       2
       1

To get the indices of A corresponding to the n-th unique value, uniqueA(n):

>> n = 2;
>> uniqueA(n)
ans =
      2.2500
>> duplicateIndex = find(duplicateBool(:,n))
duplicateIndex =
       5
       6
       7
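As a possible shortcut, the third output of uniquetol already says which unique value each element of A was grouped with, so the counts and indices can be read off without building the full comparison matrix. This is a sketch rather than the method used above; ic, counts and idxOfSecond are names introduced here, and the grouping is whatever uniquetol decides for the given tolerance.

```matlab
A = [1.313;2.4;2.400000001;1.31299999999;2.25;2.25;2.25000000001;3.7];
TOL = 1e-5;

% ic(k) is the row of uniqueA that A(k) was grouped with
[uniqueA, ~, ic] = uniquetol(A, TOL);

% Number of elements of A in each tolerance group
counts = accumarray(ic, 1);

% Indices of A belonging to the 2nd unique value
idxOfSecond = find(ic == 2);
```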


Answer 5

Fernando Meo, 13 August 2018

0 votes

Here is another answer (a one-liner).

If AA is a 2D matrix and you wish to find the rows that contain duplicate values across their columns:

RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]));

Example

AA = [6   7  11  6; 7  11  4  8; 11  15  1  10; 15  4  14  12; 
18  13  18  8; 12  13  18  1; 3  14  6  18];
>> RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
  1   5
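The same row check can also be sketched without arrayfun: sort each row, and any duplicate then shows up as a zero difference between neighbours. This is an alternative illustration, not the one-liner above; sortedRows and hasDup are names made up for it.

```matlab
AA = [6 7 11 6; 7 11 4 8; 11 15 1 10; 15 4 14 12; ...
      18 13 18 8; 12 13 18 1; 3 14 6 18];

% Sort within each row so that equal values become adjacent
sortedRows = sort(AA, 2);

% A zero difference between neighbours means the row has a repeated value
hasDup = any(diff(sortedRows, 1, 2) == 0, 2);

RowsWhichHaveDuplicates = find(hasDup).';
```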

If your values are real numbers, a tolerance can be set by rounding them with the MATLAB "round" function to the number of decimal places you wish to compare.

AA = round(rand(10)*10,1); % First decimal place
AA =
0000    2.0000    0.4000    6.7000    9.4000    0.6000    8.3000    3.1000    1.0000    3.0000
1000    7.5000    0.6000    6.0000    0.7000    3.1000    0.3000    4.2000    9.0000    3.7000
5000    8.9000    5.0000    3.4000    7.2000    6.6000    8.4000    9.3000    9.0000    7.6000
6000    1.0000    4.1000    4.0000    8.3000    4.6000    2.6000    0.6000    0.8000    3.1000
6000    5.2000    2.2000    3.9000    7.3000    0.2000    6.6000    8.2000    5.2000    9.6000
2000    6.0000    4.3000    7.0000    5.1000    6.9000    6.7000    6.4000    2.8000    2.1000
2000    9.8000    9.5000    1.4000    5.2000    4.1000    2.6000    8.2000    8.8000    7.3000
3000    6.7000    2.0000    3.8000    7.6000    5.7000    3.3000    3.3000    6.7000    2.5000
2000    8.5000    7.1000    2.2000    6.3000    9.9000    2.5000    9.5000    1.2000    8.9000
9000    1.7000    7.8000    4.1000    0.7000    8.6000    7.1000    9.1000    3.7000    7.1000
RowsWhichHaveDuplicates = find(arrayfun(@(i) (~isequal(length(unique(AA(i,:))),size(AA,2))), [1:size(AA,1)]))
RowsWhichHaveDuplicates =
   8    10

Hope this helps
