find repeating numbers

Richard, 8 May 2012
Edited: Andrei Bobrov, 19 Oct 2017
What is the best method to find the numbers in an array that repeat most frequently? For example, say that I have a matrix of values:
a = 1;
b = 30;
data = a + (b-a).*rand(1000,5);
I would like to find the five numbers which are repeated most in the matrix. They do not need to be in different columns/rows but simply the five most repeated numbers. How would I do this?

Answers (3)

Oleg Komarov, 8 May 2012
unNum = unique(data(:));
[n,bin] = histc(data(:),unNum);
With your example you are not likely to find any repetition, because rand returns non-integer values (you should use randi instead).
The first output, n, counts the repetitions of each unique value, while the second, bin, gives the position in unNum of each element of data(:).
You can then sort n in decreasing order, take the sorting index, and apply it to unNum to retrieve the most repeated numbers.
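The point about rand versus randi can be checked directly. A minimal sketch, assuming the bounds from the question (the variable names dataReal and dataInt are illustrative, not from the original post):

```matlab
a = 1;
b = 30;
dataReal = a + (b-a).*rand(1000,5);   % continuous values: exact repeats are very unlikely
dataInt  = randi([a b],1000,5);       % integers in [a, b]: at most b-a+1 distinct values

nReal = numel(unique(dataReal(:)));   % number of distinct values drawn with rand
nInt  = numel(unique(dataInt(:)));    % never more than b-a+1 = 30
fprintf('distinct values: rand %d, randi %d\n', nReal, nInt);
```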
An example:
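a = 0;
b = 100;
data = randi([a b],1000,5);       % random integers in [a, b], so values repeat
unNum = unique(data(:));          % sorted unique values
[n,bin] = histc(data(:),unNum);   % n(k) = number of occurrences of unNum(k)
[srt,idx] = sort(n,'descend');    % idx orders the counts from most to least frequent
unNum(idx(1:5))                   % index with idx, not with the sorted counts srt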
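To see how the counts and the sort index fit together, the same steps can be traced on a small matrix whose counts are easy to verify by hand. This is only a sketch; the 3-by-3 matrix below is made up for illustration:

```matlab
data = [7 3 7; 3 7 9; 5 3 7];     % 7 occurs 4 times, 3 occurs 3 times
unNum = unique(data(:));          % [3; 5; 7; 9]
n = histc(data(:),unNum);         % [3; 1; 4; 1], one count per unique value
[srt,idx] = sort(n,'descend');    % srt holds counts, idx holds positions in unNum
top2 = unNum(idx(1:2))            % [7; 3]
top2counts = srt(1:2)             % [4; 3]
```

Note that srt contains the counts themselves, so only idx is meaningful as an index into unNum.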

Andrei Bobrov, 8 May 2012
Edited: Andrei Bobrov, 19 Oct 2017
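An approach using accumarray:
a = 0;
b = 100;
data = randi([a b],1000,5);
[unNum,~,n] = unique(data(:));   % n maps each element to its position in unNum
c = accumarray(n,1);             % c(k) = number of occurrences of unNum(k)
[srt,idx] = sort(c,'descend');
unNum(idx(1:5))                  % five most frequent values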
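When the data contain fewer than five distinct values, idx(1:5) raises an indexing error. A hedged sketch of the same accumarray approach with that guard added, and with the counts returned alongside the values (the variable k and the sample vector are assumptions for illustration):

```matlab
x = [2 2 2 8 8 1];               % made-up sample with only three distinct values
[unNum,~,j] = unique(x(:));      % j(i) is the position of x(i) in unNum
c = accumarray(j,1);             % c(k) counts occurrences of unNum(k)
[srt,idx] = sort(c,'descend');
k = min(5,numel(unNum));         % do not ask for more values than exist
topValues = unNum(idx(1:k))      % [2; 8; 1]
topCounts = srt(1:k)             % [3; 2; 1]
```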

dipanka tanu sarmah, 19 Oct 2017
Edited: Stephen23, 19 Oct 2017
function y = repeated(x)
for i=1:length(x)
if length(find(x==i)) >=2
y{i,1}=x(i);
y{i,2}=length(find(x==i))
end
end
end
You can use this to find the repeated values and their frequencies.
  Comments: 1
Stephen23, 19 Oct 2017
Edited: Stephen23, 19 Oct 2017
This code is extremely unreliable, and should be avoided. In particular:
  • This code does not find the values with the highest frequency, because it compares the array values with internally generated values (the loop index i). If any value inside x is not equal to one of 1:length(x) then this algorithm will not work. For example:
>> repeated([3,-2,-2]) % fails, no output
>> repeated([6,7,7]) % fails, no output
  • The output of length depends on the size of the input array (it returns the largest dimension), so this code gives different results depending on the size and shape of x.
  • length(find(...)) is slow and unnecessary. Simpler and faster would be to simply use nnz(...).
  • The slow operations length(find(...)) are duplicated, whereas they should be simply called once.
  • Bad code alignment is how beginners hide basic errors in their code. Code should be aligned using the MATLAB editor's default settings. Code can be aligned by simply selecting all code in the editor, then pressing ctrl+i.
Using histc (or the newer equivalent) is simpler and much more reliable than copying this buggy code.
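As a rough sketch of what the loop seems intended to do (report each value that occurs at least twice, together with its count), the comparison can be made against the unique values of x rather than the loop index. The variable names and sample input below are illustrative, and unique with accumarray stands in here for histc:

```matlab
x = [3,-2,-2,6,7,7,7];                  % made-up input, including a negative value
[vals,~,j] = unique(x(:));              % vals = [-2; 3; 6; 7]
counts = accumarray(j,1);               % [2; 1; 1; 3]
isRep = counts >= 2;                    % keep only values that occur more than once
result = [vals(isRep), counts(isRep)]   % [-2 2; 7 3]

n7 = nnz(x == 7)                        % 3, without length(find(...))
```

Each row of result is a repeated value followed by its count, and the outcome does not depend on the values happening to lie in 1:length(x).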
