Using "unique" to identify unique values AND number of occurrences of each unique value

Question

George, 19 Sep 2024


https://kr.mathworks.com/matlabcentral/answers/2153925-using-unique-to-identify-unique-values-and-number-of-occurrences-of-each-unique-value

Commented: Steven Lord, 19 Sep 2024

Below are the first rows of a table:
head(hits)
         ID          res1     score
    _____________    ____    _______
    AGAP001076-RD    282     0.67229
    AGAP001076-RD    285     0.75292
    AGAP001076-RD    286     0.66957
    AGAP001076-RD    296     0.51694
    AGAP001076-RD    298     0.51655
    AGAP001076-RD    310     0.54564
    AGAP001076-RD    314     0.74495
    AGAP010077-RA    349     0.52136
Using "unique" I can obtain the unique IDs. I would also like to obtain the number of occurrences of each unique ID, e.g. AGAP001076-RD 6
Thank you for your attention.


Answer 1 (Accepted Answer)

Steven Lord, 19 Sep 2024


https://kr.mathworks.com/matlabcentral/answers/2153925-using-unique-to-identify-unique-values-and-number-of-occurrences-of-each-unique-value#answer_1519500


Use the groupcounts function.

A = {'AGAP001076-RD' 282 0.67229
'AGAP001076-RD' 285 0.75292
'AGAP001076-RD' 286 0.66957
'AGAP001076-RD' 296 0.51694
'AGAP001076-RD' 298 0.51655
'AGAP001076-RD' 310 0.54564
'AGAP001076-RD' 314 0.74495
'AGAP010077-RA' 349 0.52136};
[counts, groupID] = groupcounts(A(:, 1))
counts = 2×1
     7
     1
groupID = 2x1 cell array
    {'AGAP001076-RD'}
    {'AGAP010077-RA'}

Comments: 3

Paul, 19 Sep 2024

Check the linked doc page for groupcounts to see how to call it with a table input.

Steven Lord, 19 Sep 2024

A = {'AGAP001076-RD' 282 0.67229
'AGAP001076-RD' 285 0.75292
'AGAP001076-RD' 286 0.66957
'AGAP001076-RD' 296 0.51694
'AGAP001076-RD' 298 0.51655
'AGAP001076-RD' 310 0.54564
'AGAP001076-RD' 314 0.74495
'AGAP010077-RA' 349 0.52136};
T = cell2table(A)
T = 8x3 table
           A1            A2       A3   
    _________________    ___    _______

    {'AGAP001076-RD'}    282    0.67229
    {'AGAP001076-RD'}    285    0.75292
    {'AGAP001076-RD'}    286    0.66957
    {'AGAP001076-RD'}    296    0.51694
    {'AGAP001076-RD'}    298    0.51655
    {'AGAP001076-RD'}    310    0.54564
    {'AGAP001076-RD'}    314    0.74495
    {'AGAP010077-RA'}    349    0.52136

If your data is in a table array like the one I created above, you just have to tell groupcounts which variable(s) in the table is/are the grouping variable(s).

countsAndID = groupcounts(T, 'A1')
countsAndID = 2x3 table
           A1            GroupCount    Percent
    _________________    __________    _______

    {'AGAP001076-RD'}        7          87.5  
    {'AGAP010077-RA'}        1          12.5  
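
Applied to the table from the question, the same call might look like the sketch below. The table hits is rebuilt here from the eight rows shown in the question, so the variable names ID, res1, and score are taken from that head output, and the counts only reflect those eight rows rather than the asker's full data.

```matlab
% Rebuild the eight rows shown in the question (an assumption: the real
% hits table is larger and already exists in the asker's workspace)
ID = [repmat({'AGAP001076-RD'}, 7, 1); {'AGAP010077-RA'}];
res1 = [282; 285; 286; 296; 298; 310; 314; 349];
score = [0.67229; 0.75292; 0.66957; 0.51694; 0.51655; 0.54564; 0.74495; 0.52136];
hits = table(ID, res1, score);

% Count the rows for each unique ID, using ID as the grouping variable
idCounts = groupcounts(hits, "ID");

% Print each ID followed by its count, in the format asked for in the question
for k = 1:height(idCounts)
    fprintf('%s %d\n', idCounts.ID{k}, idCounts.GroupCount(k));
end
```

With these eight rows the loop prints AGAP001076-RD 7 and AGAP010077-RA 1.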

You can use multiple grouping variables as well. Let's make some data with duplicate rows and replace the values in A2 with ones more likely to cause a collision in the combination of the grouping variables A1 and A2.

T2 = T(randi(height(T), 20, 1), :);
T2.A2 = randi(5, 20, 1)
T2 = 20x3 table
           A1            A2      A3   
    _________________    __    _______

    {'AGAP001076-RD'}    2     0.51655
    {'AGAP001076-RD'}    4     0.74495
    {'AGAP001076-RD'}    1     0.75292
    {'AGAP001076-RD'}    4     0.51655
    {'AGAP001076-RD'}    5     0.54564
    {'AGAP001076-RD'}    5     0.66957
    {'AGAP001076-RD'}    5     0.51694
    {'AGAP010077-RA'}    2     0.52136
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    3     0.75292
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    4     0.74495
    {'AGAP001076-RD'}    4     0.51655
    {'AGAP001076-RD'}    4     0.51694
    {'AGAP001076-RD'}    4     0.51694
    {'AGAP001076-RD'}    2     0.67229
countsAndID = groupcounts(T2, ["A1", "A2"])
countsAndID = 6x4 table
           A1            A2    GroupCount    Percent
    _________________    __    __________    _______

    {'AGAP001076-RD'}    1         4           20   
    {'AGAP001076-RD'}    2         2           10   
    {'AGAP001076-RD'}    3         1            5   
    {'AGAP001076-RD'}    4         8           40   
    {'AGAP001076-RD'}    5         4           20   
    {'AGAP010077-RA'}    2         1            5   

Let's check. How many rows of T2 have the same A1 and A2 values as the first row of the countsAndID table?

matchesForFirstRowA1 = matches(T2.A1, countsAndID{1, "A1"});
matchesForFirstRowA2 = T2.A2 == countsAndID{1, "A2"};
result = T2(matchesForFirstRowA1 & matchesForFirstRowA2, :)
result = 4x3 table
           A1            A2      A3   
    _________________    __    _______

    {'AGAP001076-RD'}    1     0.75292
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    1     0.51655

Does that match the count that groupcounts returned in that first row of countsAndID?

isequal(height(result), countsAndID{1, "GroupCount"})
ans = logical
   1
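
The same check can be repeated for every row of the grouped result rather than only the first. The following is a sketch under stated assumptions: T3 is a small hand-made table that stands in for the random T2 so the outcome is reproducible, and its values are chosen purely for illustration.

```matlab
% Small fixed table standing in for T2 (illustrative values only)
A1 = {'a'; 'a'; 'b'; 'a'; 'b'; 'a'};
A2 = [1; 1; 2; 2; 2; 1];
T3 = table(A1, A2);

% Count rows for each combination of A1 and A2
counts = groupcounts(T3, ["A1", "A2"]);

% For each group, count the matching rows by hand and compare with GroupCount
allMatch = true;
for k = 1:height(counts)
    sameA1 = matches(T3.A1, counts{k, "A1"});
    sameA2 = T3.A2 == counts{k, "A2"};
    allMatch = allMatch && (nnz(sameA1 & sameA2) == counts{k, "GroupCount"});
end
disp(allMatch)
```

If allMatch is true, every GroupCount agrees with a direct count of the matching rows.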


Answer 2

Animesh, 19 Sep 2024


https://kr.mathworks.com/matlabcentral/answers/2153925-using-unique-to-identify-unique-values-and-number-of-occurrences-of-each-unique-value#answer_1519490


Hi @George,

In MATLAB, you can use the "unique" function along with the "histcounts" function to find the number of occurrences of each unique ID in your table. Here's how you can do it:

% Assume 'hits' is your table
% Extract the 'ID' column from the table
ids = hits.ID;
% Find unique IDs and their indices
[uniqueIDs, ~, idx] = unique(ids);
% Count the occurrences of each unique ID
occurrences = histcounts(idx, 1:max(idx)+1);
% Display the results
for i = 1:length(uniqueIDs)
    fprintf('%s %d\n', uniqueIDs{i}, occurrences(i));
end

You can refer to the following MathWorks documentation for more information on the "histcounts" function:

https://www.mathworks.com/help/releases/R2024a/matlab/ref/histcounts.html
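
As a possible alternative to histcounts, the index vector returned by unique can also be passed to accumarray, which adds a 1 into the slot of the matching unique ID for every row. This is a minimal sketch: ids stands in for hits.ID and is filled from the eight rows shown in the question, assuming the IDs are stored as a cell array of character vectors.

```matlab
% Stand-in for hits.ID, using the eight rows from the question
ids = [repmat({'AGAP001076-RD'}, 7, 1); {'AGAP010077-RA'}];

% idx(k) is the position of ids{k} within uniqueIDs
[uniqueIDs, ~, idx] = unique(ids);

% For every row, add 1 to the slot of its unique ID
occurrences = accumarray(idx, 1);

% Display each unique ID with its count
for k = 1:numel(uniqueIDs)
    fprintf('%s %d\n', uniqueIDs{k}, occurrences(k));
end
```

Because accumarray works directly on the integer indices, no bin edges need to be constructed.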
