Count occurrences of a string in a single cell array (how many times a string appears)

I have a single cell array containing strings, as shown below:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
I am trying to get the output as two cell arrays, as shown:
xx = {'computer', 'car', 'bus', 'tree'}
occ = {'2', '2','1','1'}
Your suggestions and ideas are highly appreciated. Thanks in advance.

Accepted Answer

Azzi Abdelmalek on 12 Feb 2014
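xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
% Unique strings, in order of first appearance
a = unique(xx, 'stable')
% Count of each unique string, returned as a cell array of numbers
b = cellfun(@(x) sum(ismember(xx, x)), a, 'UniformOutput', false)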
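
The question asks for the counts as strings ('2', '2', '1', '1'), while the code above returns them as numbers inside a cell array. A minimal sketch of one way to get the string form, assuming the same `xx`; the use of `strcmp` and `num2str` here is an illustrative choice, not part of the original answer:

```matlab
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};

% Unique strings, in order of first appearance
a = unique(xx, 'stable');

% Count each unique string and convert the count to a char vector
occ = cellfun(@(x) num2str(sum(strcmp(xx, x))), a, 'UniformOutput', false);

disp(a)
disp(occ)
```

If numeric counts are acceptable, dropping the `num2str` call and setting 'UniformOutput' to true returns a plain numeric vector instead.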

More Answers (4)

Jos (10584) on 12 Feb 2014
A faster and more direct method of counting uses the third output of UNIQUE:
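XX = {'computer', 'car', 'computer', 'bus', 'tree', 'car'}
% J maps each element of XX to its position in uniqueXX (sorted order)
[uniqueXX, ~, J] = unique(XX)
% Count how many times each index 1..numel(uniqueXX) occurs in J
occ = histc(J, 1:numel(uniqueXX))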
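
MathWorks documentation marks `histc` as not recommended and points to `histcounts` instead. A sketch of the same counting step with `histcounts`, assuming the same `XX`; note that `histcounts` takes bin edges, so one extra edge is needed to close the last bin:

```matlab
XX = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};

% uniqueXX is sorted alphabetically; J maps each element of XX into it
[uniqueXX, ~, J] = unique(XX);

% Edges 1, 2, ..., n+1 give one bin per unique string;
% the last bin includes both of its edges
occ = histcounts(J, 1:numel(uniqueXX) + 1);

disp(uniqueXX)
disp(occ)
```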
  6 Comments
Adam Danz on 29 Aug 2020
Edited: Adam Danz on 29 Aug 2020
For the carsmall data used in the other comparisons, histc was actually 1.33x faster than histcounts in R2019b and 1.22x faster in R2020a (MATLAB Online). On both machines I repeated the 10000-rep analysis 3 times, and the final results were all within +/-0.02 of what's reported.
The difference between those numbers and your results may have to do with first-time costs if you're just measuring the execution once with tic/toc.
I like your dedication to optimization! 😎
Bruno Luong on 29 Aug 2020
Edited: Bruno Luong on 29 Aug 2020
No first-time cost, I assure you. I posted just one result and snippet for simplicity, but I ran tic/toc in a loop and within a function, on 2 different computers (Windows 8.1 and Windows 10, both with R2020a).
The conclusion on my side does not change.
Yeah, I'm kind of obsessed with MATLAB speed, and I can't hide it.
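
For anyone who wants to repeat this kind of comparison, here is a rough sketch using `timeit`, which calls the function several times and so reduces first-time effects. The test data is a synthetic stand-in (an assumption, not the carsmall data discussed above), and the timings will differ by machine and release, so no numbers are shown:

```matlab
% Synthetic input (hypothetical): the question's strings repeated 1000 times
xx = repmat({'computer', 'car', 'computer', 'bus', 'tree', 'car'}, 1, 1000);
[u, ~, J] = unique(xx);
n = numel(u);

% timeit runs each handle multiple times and returns a typical time in seconds
tHistc      = timeit(@() histc(J, 1:n));
tHistcounts = timeit(@() histcounts(J, 1:n + 1));
tAccum      = timeit(@() accumarray(J(:), 1, [n 1]));

fprintf('histc:      %.3g s\n', tHistc);
fprintf('histcounts: %.3g s\n', tHistcounts);
fprintf('accumarray: %.3g s\n', tAccum);
```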



MSchoenhart on 27 Sep 2018
Edited: Adam Danz on 29 Aug 2020
A very fast and simple vectorized method is to use categorical arrays (available since R2013b). "countcats" also uses histc in the background, but the code looks much cleaner:
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
c = categorical(xx);
categories(c)
countcats(c)
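
By default, `categorical` sorts the categories alphabetically, so the result above is ordered bus, car, computer, tree. A sketch of one way to keep the order of first appearance and collect the result in a table, assuming the same `xx`; the table and its variable names `Item` and `Count` are illustrative additions:

```matlab
xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};

% Passing a value set fixes the category order to order of first appearance
c = categorical(xx, unique(xx, 'stable'));

names  = categories(c);   % cell array of category names (column)
counts = countcats(c);    % number of elements in each category

% Combine names and counts into one table (column names are hypothetical)
T = table(names, counts(:), 'VariableNames', {'Item', 'Count'});
disp(T)
```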
  2 Comments
Adam Danz on 29 Aug 2020
Edited: Adam Danz on 29 Aug 2020
*Edited question to format code
Nice solution!
Giuseppe Degan Di Dieco on 27 Apr 2021
Dear MSchoenhart,
thanks for your solution, it helped me too.
Best!



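Bruno Luong on 29 Aug 2020
Edited: Bruno Luong on 29 Aug 2020
% i maps each element of xx to its position in yy (order of first appearance)
[yy,~,i] = unique(xx,'stable');
% Add 1 into count(i(k)) for every element k, giving one count per unique string
count = accumarray(i(:),1,[numel(yy),1]);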

Girish Chandra on 12 Feb 2017
Edited: Adam Danz on 29 Aug 2020
Without using the histc function, you can do it in the following way:
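xx = {'computer', 'car', 'computer', 'bus', 'tree', 'car'};
U = unique(xx)              % unique strings (sorted)
A = zeros(1, numel(U));     % one counter per unique string
for i = 1:numel(U)
    for j = 1:numel(xx)
        % Increment the counter whenever element j matches unique string i
        if strcmp(U{i}, xx{j})
            A(i) = A(i) + 1;
        end
    end
end
A                           % display the final counts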
  3 Comments
Jon Adsersen on 8 Apr 2020
Based on the answer by Jos, a function that works for both numerical and string arrays could be formulated:
function [rep_values, N_rep, ind_rep] = f_reapeated_elements(A)
% Find repeated elements in A (numeric array or cell array of strings).
% Outputs:
%   rep_values - repeated values in A (occurring 2 or more times)
%   N_rep      - number of repetitions of the values given in "rep_values"
%   ind_rep    - indices in A of repeated values (occurring 2 or more times)
[un, ~, ind_un] = unique(A);
N_A = histc(ind_un, 1:numel(un));
rep_values = un(N_A > 1);
N_rep = N_A(N_A > 1);
ind_cell = cell(1, numel(rep_values));
A_list = 1:numel(A);
for k = 1:numel(rep_values)
    if isnumeric(rep_values)
        % Index into A_list so each result is a row vector, even when A is
        % a column; otherwise the concatenation below can fail
        ind_cell{k} = A_list(A(:) == rep_values(k));
    else
        log_ind = strcmp(A, rep_values(k));
        ind_cell{k} = A_list(log_ind);
    end
end
ind_rep = unique([ind_cell{:}]);

