How can I get unique entries and their counts and place back into the table?

Rookie Programmer, 18 April 2025
Edited: dpb, 19 April 2025
When running the code given below I get the error:
[uniqueEntries, ~, entryGroupIndices] = unique(x);
Error: Unsupported use of the '=' operator. To compare values for equality, use '=='. To specify name-value arguments, check that name is a valid identifier with no surrounding quotes.
I think it is due to (x) not being defined or not existing.
% Sample data: create a table
data = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
{'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
'VariableNames', {'Fruits', 'Var2'});
% Group the data by 'Fruits' and collect Var2 entries
summaryTable = groupsummary(data, 'Fruits', @(x) {x.Var2}, 'IncludeEmptyGroups', true);
% Create a function to count unique entries and their occurrences
countUniqueEntries = @(x) {
% Get unique entries and their counts
[uniqueEntries, ~, entryGroupIndices] = unique(x);
entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
% Create a table with unique entries and their counts
table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
% Apply the function to each group using cellfun
countTables = cellfun(countUniqueEntries, summaryTable.GroupCount, 'UniformOutput', false);
% Create the final result table
resultTable = table(summaryTable.Fruits, countTables, 'VariableNames', {'Fruits', 'Counts'});
% Display the results
disp('Unique Fruits and Their Counts:');
disp(resultTable);
The output should look something like this:
    Fruits        Counts
    ________    ___________
    'apple'     [3x2 table]
    'banana'    [2x2 table]
    'kiwi'      [1x2 table]
    'orange'    [1x2 table]
I would love to get the results without having to loop.
It would also be helpful if I can sort the counts in the counts table 'descending'. Thank you for the help.

Answers (2)

Stephen23, 18 April 2025
Edited: Stephen23, 18 April 2025
"I think it is due to (x) not being defined or not existing."
No, it is because you invented some syntax when defining the anonymous function here:
countUniqueEntries = @(x) {
% Get unique entries and their counts
[uniqueEntries, ~, entryGroupIndices] = unique(x);
entryCounts = histcounts(entryGroupIndices, 'BinMethod', 'integers');
% Create a table with unique entries and their counts
table(uniqueEntries, entryCounts', 'VariableNames', {'UniqueEntries', 'Counts'})
};
Curly braces define a cell array. Inside that cell array you called various functions (which is allowed inside curly braces) and attempted to assign their outputs to variables (which is definitely not allowed inside curly braces). It is not valid syntax to perform assignment inside the cell array operator (nor, for that matter, inside any other operators):
{x=sqrt(2)} % this is invalid syntax
Unsupported use of the '=' operator. To compare values for equality, use '=='. To specify name-value arguments, check that name is a valid identifier with no surrounding quotes.
Your attempt to use an anonymous function like that will not work. Write a normal function in an M-file; then you can make as many variable assignments as you wish.
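A minimal sketch of such a function follows, assuming it is saved as countUniqueEntries.m on the MATLAB path. The file name, the use of accumarray in place of histcounts, and the descending sort are illustrative choices here, not something taken from the original code.

```matlab
function out = countUniqueEntries(x)
% COUNTUNIQUEENTRIES  Tabulate the unique entries of a cell array of
% character vectors together with how often each one occurs.
% (Hypothetical helper; assumes x is a cellstr such as {'yes'; 'no'; 'yes'}.)

% unique returns the sorted unique entries and, as third output, the
% index of the unique entry that each element of x maps to.
[uniqueEntries, ~, idx] = unique(x(:));

% Count how many elements map to each unique entry.
entryCounts = accumarray(idx, 1);

% Assemble the result and sort it so the largest count comes first.
out = table(uniqueEntries, entryCounts, ...
    'VariableNames', {'UniqueEntries', 'Counts'});
out = sortrows(out, 'Counts', 'descend');
end
```

With that file in place, one loop-free way to apply it per fruit is [G, fruits] = findgroups(data.Fruits); followed by countTables = splitapply(@(v) {countUniqueEntries(v)}, data.Var2, G); the wrapping in curly braces is needed because splitapply expects one value per group.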
I doubt that using nested tables like that will make processing your data easier: https://xyproblem.info/
Comments: 2
Stephen23, 18 April 2025
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
{'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
'VariableNames', {'Fruits', 'Var2'})
T = 7x2 table
      Fruits       Var2
    __________    _______
    {'apple' }    {'yes'}
    {'banana'}    {'no' }
    {'apple' }    {'yes'}
    {'orange'}    {'yes'}
    {'banana'}    {'no' }
    {'kiwi'  }    {'yes'}
    {'apple' }    {'yes'}
U = groupsummary(T,'Fruits')
U = 4x2 table
      Fruits      GroupCount
    __________    __________
    {'apple' }        3
    {'banana'}        2
    {'kiwi'  }        1
    {'orange'}        1
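If the goal is the count of each Var2 value within each fruit, a flat table may serve as well as nested ones. The sketch below assumes that grouping by both variables at once is acceptable for the real data; it is one possible reading of the question, not a confirmed requirement.

```matlab
% Same sample data as above.
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    {'yes'; 'no'; 'yes'; 'yes'; 'no'; 'yes'; 'yes'}, ...
    'VariableNames', {'Fruits', 'Var2'});

% One row per (Fruits, Var2) combination, with its number of occurrences.
U = groupsummary(T, {'Fruits', 'Var2'});

% Largest counts first.
U = sortrows(U, 'GroupCount', 'descend');
disp(U)
```

Each row of U then carries the fruit, the Var2 value, and the count, so no per-group sub-tables are required.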
Walter Roberson, 19 April 2025
To be more explicit:
@(x) { CODE } is not used to define a code block. @(X) { CODE } is used to define a cell array of expressions. The individual expressions must return (possibly empty) values, and must not be assignment statements or control statements.
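As a small illustration of that distinction, the following is valid because every element inside the braces is an expression that returns a value. The variable names are made up for the example.

```matlab
x = {'yes'; 'no'; 'yes'};

% Valid: a cell array built from two expressions, with no assignments inside.
f = @(x) {unique(x), numel(x)};
c = f(x);

% c{1} holds the unique entries, c{2} the number of elements in x.
disp(c{1})
disp(c{2})
```

Note that only the first output of each call is available this way; capturing the third output of unique, as the original code tried to do, needs a regular function.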



dpb, 19 April 2025
Edited: dpb, 19 April 2025
Carrying on with @Stephen23's illustration of groupsummary...
"...also be helpful if I can sort the counts in the counts table 'descending'."
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
'VariableNames', {'Fruits'});
T=addvars(T,~matches(T.Fruits,'banana'),'NewVariableNames',{'Round'});
T=convertvars(T,{'Fruits'},'categorical');
U=sortrows(groupsummary(T,'Fruits',@(x)all(x),{'Round'}),'GroupCount','descend');
U=renamevars(U,{'fun1_Round'},{'Round'}) % fix up the annoying funN_ prefix, which can't be suppressed
U = 4x3 table
    Fruits    GroupCount    Round
    ______    __________    _____
    apple         3         true
    banana        2         false
    kiwi          1         true
    orange        1         true
% alternative is to munge the variable names directly...
%U.Properties.VariableNames=strrep(U.Properties.VariableNames,'fun1_','');
% general alternative: a pattern can automate more than one prefix
% (replace accepts pattern objects; strrep does not)
%pat="fun"+digitsPattern+"_";
%U.Properties.VariableNames=replace(U.Properties.VariableNames,pat,'');
Since the actual logic for determining the logical state is unstated, I took a guess as to why 'banana' is different...
Note that to bring other variables along in the summary, one has to be able to reduce them to one statistic per group, which all does above for the characteristic variable. As noted, it would be nice if groupsummary also had the option to set 'OutputVariableNames', as rowfun does.
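For comparison, a sketch of the rowfun route mentioned above, where the output name can be set directly. It reuses the guessed 'Round' logic from this answer, so the same assumption about why 'banana' differs applies.

```matlab
% Rebuild the sample table with the guessed logical variable.
T = table({'apple'; 'banana'; 'apple'; 'orange'; 'banana'; 'kiwi'; 'apple'}, ...
    'VariableNames', {'Fruits'});
T = addvars(T, ~matches(T.Fruits, 'banana'), 'NewVariableNames', {'Round'});
T = convertvars(T, {'Fruits'}, 'categorical');

% With 'GroupingVariables', rowfun calls the function once per group and
% adds a GroupCount column; 'OutputVariableNames' avoids any renaming step.
V = rowfun(@(r) all(r), T, ...
    'InputVariables', 'Round', ...
    'GroupingVariables', 'Fruits', ...
    'OutputVariableNames', 'Round');

% Largest groups first.
V = sortrows(V, 'GroupCount', 'descend');
disp(V)
```

The trade-off is that rowfun needs the input variables listed explicitly, whereas groupsummary picks the data variables and only the naming has to be patched afterwards.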
