counting strings in cell array, is there a faster solution

Scragmore, 27 Jan 2012
Commented: Manduna Watson, 30 Jun 2014
Hi all.
I have two cell arrays, test1 (240000x1) and test2 (160000x1). Each cell of test1 contains a string of varying length, 1-20 characters. test2 is the list of unique entries from test1.
I wish to count the number of occurrences in test1 of each unique string in test2.
Example strings in test1 and test2:
test1 = {'ayooy'; 'ayta'; 'a'; 'aa'; 'aatl'; 'aatla'; ......};
test2 = {'a'; 'aa'; 'aaa'; 'aaaa'; 'aaaaa'; 'aaaac'; .......};
My code:
for ii = 1:length(test2)
    b = ismember(test1, test2(ii,1));   % logical mask of rows in test1 equal to this string
    test2{ii,2}(1,1) = sum(b);          % store the count in column 2 of test2
end
Is there a way to speed this up, or an alternative method that is faster? I know I am running a lookup of 160k * 240k comparisons, roughly 38,400 million.
Thanks for your time.
AD

Accepted Answer

Walter Roberson, 27 Jan 2012
When you construct test2, use a different form of unique:
[test2, ua, ub] = unique(test1);
After that, the counts are:
test2counts = histc(ub, 1:length(test2));
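To make the idea concrete, here is a minimal sketch on a tiny stand-in for test1 (the six strings are made up for illustration, not the poster's data). The third output of unique gives, for every row of test1, the index of its string in test2, so counting how often each index appears gives the per-string counts in one pass. This sketch swaps in accumarray for the counting step as an assumption on my part; histc as written above does the same job, though newer MATLAB documentation marks histc as not recommended.

```matlab
% Small stand-in for the 240000x1 cell array from the question (hypothetical data).
test1 = {'ayooy'; 'a'; 'aa'; 'a'; 'ayooy'; 'a'};

% test2 holds the sorted unique strings; ub maps every row of test1
% to the index of its string in test2.
[test2, ~, ub] = unique(test1);

% Count how often each index occurs. accumarray is used here as an
% alternative to histc; both give one count per unique string.
test2counts = accumarray(ub(:), 1);

% Show each unique string next to its count.
for k = 1:numel(test2)
    fprintf('%s: %d\n', test2{k}, test2counts(k));
end
```

For this toy input, test2 comes back as {'a'; 'aa'; 'ayooy'} and test2counts as [3; 1; 2]. The work is one sort inside unique plus one linear counting pass, rather than one ismember scan of test1 per unique string.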
Comments: 2
Scragmore, 28 Jan 2012
Thanks for highlighting the additional outputs of unique and introducing me to a new function, histc. Worked a treat, super fast compared to what I was originally doing.
Cheers,
AD
Manduna Watson, 30 Jun 2014
Thank you, this really helped me to identify and count repeated characters in my data set.
