Determining quantities of different integers in a large data set

I am currently working with a large data set, specifically 500 integers. The values are randomly generated between two specified values, x1 and x2. There is not a large range between x1 and x2 and therefore the same values are repeated a number of times. What I am trying to do is to come up with a way of determining how often each integer is repeated.
To clarify: suppose x1 = 5 and x2 = 10, and 500 integers are randomly generated between these values. Among the generated integers, how many 5's were generated, how many 6's, how many 7's, and so on?
I am attempting to find the quickest way of doing this without having lines and lines of code.

Comments: 3

Hello A^2
The histcounts function will probably get the job done. It lets you set bin edges and/or widths, and you might have to experiment around a little to get the result you want.
Image Analyst on 25 Nov 2017
Edited: Image Analyst on 25 Nov 2017
David, put this in the Answers section so you might get credit for it. histogram() is also a related function. By the way, it's funny how he thinks 500 elements is "large".
Hi all, thanks for the input. I'm new to working with Matlab so for me 500 is large but I'm sure for you guys it's probably not even close :)


Accepted Answer

John D'Errico on 25 Nov 2017
Edited: John D'Errico on 25 Nov 2017
I'd normally recommend either sparse or accumarray to do the counts. Accumarray is arguably best here, because there is no need for a sparse result.
The only question would be if your limits x1 and x2 are VERY large numbers. Then most of the elements of the array will be zero.
x1 = 10;
x2 = 20;
X = randi([x1,x2],500,1);
counts = accumarray (X(:),1,[],@sum)
counts =
0
0
0
0
0
0
0
0
0
40
39
35
44
56
39
55
53
41
51
47
Since you know that the result will be zero below x1, just extract the counts you expect to see.
counts = counts(x1:x2)
counts =
40
39
35
44
56
39
55
53
41
51
47
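If x1 is very large, one way to avoid the leading zeros altogether is to shift the values before calling accumarray, so that x1 maps to subscript 1. The following is only a sketch of that idea; the bounds are made-up illustrative values, and the names idx, values and result are not from the original answer.

```matlab
x1 = 1000000;                  % hypothetical large lower bound
x2 = 1000010;                  % hypothetical upper bound
X = randi([x1,x2],500,1);

idx = X - x1 + 1;              % shift so that x1 maps to subscript 1
counts = accumarray(idx,1,[x2-x1+1,1]);   % one row per integer in x1:x2

values = (x1:x2).';            % the integer that each row of counts refers to
result = [values, counts]      % column 1: value, column 2: how often it occurs
```

Fixing the output size to [x2-x1+1, 1] also guarantees a row for x2 even if it happens not to occur in the sample.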
Or use sparse, which creates the matrix as a sparse one.
sparse(X,1,1,x2,1)
ans =
(10,1) 40
(11,1) 39
(12,1) 35
(13,1) 44
(14,1) 56
(15,1) 39
(16,1) 55
(17,1) 53
(18,1) 41
(19,1) 51
(20,1) 47
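Or, you can use histcounts (or the older histc), but you need to be careful! With histcounts, the second argument is a vector of bin edges, and the last bin includes both of its edges. If you get sloppy and just do the obvious, the values 19 and 20 fall into the same final bin, so you see that the last bin has too many counts in it.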
histcounts(X,x1:x2)
ans =
40 39 35 44 56 39 55 53 41 98
You need to extend the edges by one, so that x2 gets a bin of its own, to make it work properly:
histcounts(X,x1:x2+1)
ans =
40 39 35 44 56 39 55 53 41 51 47
Note that histcounts is a good choice if x1 is a very large number. Then the accumarray solution would generate an array with a huge number of zero elements at the start.
So the best solution must be based on the problem. This is often the case.
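As a quick sanity check, the three approaches above can be run on the same data and compared. This is a minimal sketch; the variable names cAccum, cSparse, cHist and allAgree are chosen here for illustration.

```matlab
x1 = 10;
x2 = 20;
X = randi([x1,x2],500,1);

cAccum = accumarray(X,1,[x2,1]);        % counts for subscripts 1..x2
cAccum = cAccum(x1:x2);                 % keep only the rows for x1..x2

cSparse = full(sparse(X,1,1,x2,1));     % sparse adds up repeated subscripts
cSparse = cSparse(x1:x2);

cHist = histcounts(X,x1:x2+1).';        % edges extended by one; row -> column

allAgree = isequal(cAccum,cSparse,cHist)   % true if all three give the same counts
```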

Comments: 2

If all the elements in X are integer values, and the bounds of that array are not too far apart, you could tell the histogram or histcounts functions to use the 'integers' BinMethod.
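A minimal sketch of that suggestion follows. The 'integers' method places the bin edges at half-integers, so the integer belonging to each bin can be recovered from the edges. Note that, as far as I understand it, the bins only span the observed range of X, so a value between x1 and x2 that never occurs at either end of the range gets no bin. The name values is introduced here for illustration.

```matlab
x1 = 10;
x2 = 20;
X = randi([x1,x2],500,1);

% One bin per integer, with edges at ..., 9.5, 10.5, 11.5, ...
[counts,edges] = histcounts(X,'BinMethod','integers');

values = edges(1:end-1) + 0.5;   % the integer at the center of each bin
[values; counts]                 % row 1: value, row 2: how often it occurs
```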
+1
Great review of multiple methods.


Asked: 25 Nov 2017
Commented: 7 Oct 2019
