Using "unique" to identify unique values AND number of occurrences of each unique value


Below are the first entries of a table:
head(hits)
ID res1 score
_____________ ____ _______
AGAP001076-RD 282 0.67229
AGAP001076-RD 285 0.75292
AGAP001076-RD 286 0.66957
AGAP001076-RD 296 0.51694
AGAP001076-RD 298 0.51655
AGAP001076-RD 310 0.54564
AGAP001076-RD 314 0.74495
AGAP010077-RA 349 0.52136
Using "unique" I can obtain the unique IDs. I would also like to obtain the number of occurrences of each unique ID, e.g., AGAP001076-RD 7 for the rows shown above.
Thank you for your attention.

Accepted Answer

Steven Lord on 19 Sep 2024
Use the groupcounts function.
A = {'AGAP001076-RD' 282 0.67229
'AGAP001076-RD' 285 0.75292
'AGAP001076-RD' 286 0.66957
'AGAP001076-RD' 296 0.51694
'AGAP001076-RD' 298 0.51655
'AGAP001076-RD' 310 0.54564
'AGAP001076-RD' 314 0.74495
'AGAP010077-RA' 349 0.52136};
[counts, groupID] = groupcounts(A(:, 1))
counts = 2×1
     7
     1
groupID = 2x1 cell array
    {'AGAP001076-RD'}
    {'AGAP010077-RA'}
Comments: 3
Paul on 19 Sep 2024
Check the linked doc page for groupcounts to see how to call it with a table input.
Steven Lord on 19 Sep 2024
A = {'AGAP001076-RD' 282 0.67229
'AGAP001076-RD' 285 0.75292
'AGAP001076-RD' 286 0.66957
'AGAP001076-RD' 296 0.51694
'AGAP001076-RD' 298 0.51655
'AGAP001076-RD' 310 0.54564
'AGAP001076-RD' 314 0.74495
'AGAP010077-RA' 349 0.52136};
T = cell2table(A)
T = 8x3 table
           A1             A2       A3
    _________________    ___    _______
    {'AGAP001076-RD'}    282    0.67229
    {'AGAP001076-RD'}    285    0.75292
    {'AGAP001076-RD'}    286    0.66957
    {'AGAP001076-RD'}    296    0.51694
    {'AGAP001076-RD'}    298    0.51655
    {'AGAP001076-RD'}    310    0.54564
    {'AGAP001076-RD'}    314    0.74495
    {'AGAP010077-RA'}    349    0.52136
If your data is in a table array like the one I created above, you just have to tell groupcounts which variable(s) in the table is/are the grouping variable(s).
countsAndID = groupcounts(T, 'A1')
countsAndID = 2x3 table
           A1            GroupCount    Percent
    _________________    __________    _______
    {'AGAP001076-RD'}        7          87.5
    {'AGAP010077-RA'}        1          12.5
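Applied to the table from the question, the same call might look like the sketch below. It assumes the table is named hits with variables ID, res1, and score, as the head(hits) display suggests; the table is rebuilt here from the eight rows shown so that the snippet runs on its own.

```matlab
% Rebuild the eight rows shown in the question. The variable names
% ID, res1 and score are assumed from the head(hits) display.
ID = {'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; ...
      'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP010077-RA'};
res1 = [282; 285; 286; 296; 298; 310; 314; 349];
score = [0.67229; 0.75292; 0.66957; 0.51694; 0.51655; 0.54564; 0.74495; 0.52136];
hits = table(ID, res1, score);

% Group by the ID variable: one output row per unique ID,
% with GroupCount and Percent columns.
idCounts = groupcounts(hits, "ID");
disp(idCounts)
```

Here idCounts.ID holds the unique IDs and idCounts.GroupCount holds the number of rows for each one.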
You can use multiple grouping variables as well. Let's make some data with duplicate rows and replace the values in A2 with ones more likely to cause a collision in the combination of the grouping variables A1 and A2.
T2 = T(randi(height(T), 20, 1), :);
T2.A2 = randi(5, 20, 1)
T2 = 20x3 table
           A1            A2      A3
    _________________    __    _______
    {'AGAP001076-RD'}    2     0.51655
    {'AGAP001076-RD'}    4     0.74495
    {'AGAP001076-RD'}    1     0.75292
    {'AGAP001076-RD'}    4     0.51655
    {'AGAP001076-RD'}    5     0.54564
    {'AGAP001076-RD'}    5     0.66957
    {'AGAP001076-RD'}    5     0.51694
    {'AGAP010077-RA'}    2     0.52136
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    3     0.75292
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    4     0.74495
    {'AGAP001076-RD'}    4     0.51655
    {'AGAP001076-RD'}    4     0.51694
    {'AGAP001076-RD'}    4     0.51694
    {'AGAP001076-RD'}    2     0.67229
countsAndID = groupcounts(T2, ["A1", "A2"])
countsAndID = 6x4 table
           A1            A2    GroupCount    Percent
    _________________    __    __________    _______
    {'AGAP001076-RD'}    1         4           20
    {'AGAP001076-RD'}    2         2           10
    {'AGAP001076-RD'}    3         1            5
    {'AGAP001076-RD'}    4         8           40
    {'AGAP001076-RD'}    5         4           20
    {'AGAP010077-RA'}    2         1            5
Let's check. How many rows of T2 have the same A1 and A2 values as the first row of the countsAndID table?
matchesForFirstRowA1 = matches(T2.A1, countsAndID{1, "A1"});
matchesForFirstRowA2 = T2.A2 == countsAndID{1, "A2"};
result = T2(matchesForFirstRowA1 & matchesForFirstRowA2, :)
result = 4x3 table
           A1            A2      A3
    _________________    __    _______
    {'AGAP001076-RD'}    1     0.75292
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    1     0.67229
    {'AGAP001076-RD'}    1     0.51655
Does that match the count that groupcounts returned in that first row of countsAndID?
isequal(height(result), countsAndID{1, "GroupCount"})
ans = logical
1
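The same check can be repeated for every group rather than only the first row. The sketch below is one way to do that; it uses a small hypothetical table (not the random T2 above) so that it runs on its own, and the names allMatch and totalMatches are illustrative.

```matlab
% Small hypothetical table with two grouping variables.
A1 = {'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP010077-RA'; ...
      'AGAP001076-RD'; 'AGAP010077-RA'; 'AGAP001076-RD'};
A2 = [1; 2; 2; 1; 2; 1];
T2 = table(A1, A2);

countsAndID = groupcounts(T2, ["A1", "A2"]);

% For each group, count the matching rows of T2 by hand and
% compare with the GroupCount that groupcounts reported.
allMatch = true;
for k = 1:height(countsAndID)
    sameA1 = matches(T2.A1, countsAndID{k, "A1"});
    sameA2 = T2.A2 == countsAndID{k, "A2"};
    allMatch = allMatch && (nnz(sameA1 & sameA2) == countsAndID{k, "GroupCount"});
end

% Every row belongs to exactly one group, so the counts sum to height(T2).
totalMatches = sum(countsAndID.GroupCount) == height(T2);
```

If both allMatch and totalMatches are true, every reported count agrees with a direct count of the rows.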


More Answers (1)

Animesh on 19 Sep 2024
In MATLAB, you can use the "unique" function along with the "histcounts" function to find the number of occurrences of each unique ID in your table. Here's how you can do it:
% Assume 'hits' is your table
% Extract the 'ID' column from the table
ids = hits.ID;
% Find unique IDs and their indices
[uniqueIDs, ~, idx] = unique(ids);
% Count the occurrences of each unique ID
occurrences = histcounts(idx, 1:max(idx)+1);
% Display the results
for i = 1:numel(uniqueIDs)
    fprintf('%s %d\n', uniqueIDs{i}, occurrences(i));
end
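A variation on the approach above, offered only as a sketch: the index vector returned by "unique" can also be tallied with accumarray, and the result collected in a table instead of being printed. The ids variable here is a hypothetical stand-in for hits.ID, and the output names ID and Count are illustrative.

```matlab
% Hypothetical stand-in for hits.ID (a cell array of character vectors).
ids = {'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; ...
       'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP001076-RD'; 'AGAP010077-RA'};

% idx maps each element of ids to its position in uniqueIDs.
[uniqueIDs, ~, idx] = unique(ids);

% Add 1 for every occurrence of each index value.
occurrences = accumarray(idx, 1);

% Collect the unique IDs and their counts in a table.
result = table(uniqueIDs, occurrences, 'VariableNames', {'ID', 'Count'});
disp(result)
```

Because idx contains only the integers 1 through numel(uniqueIDs), accumarray returns one count per unique ID without any bin edges to specify.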
You can refer to the MathWorks documentation for more information on the "histcounts" function.

Release

R2024a
