Fast method to count unique variables
I have an n-by-m array populated with the positive integers 1:m. Each row contains each integer exactly once, so there are no repeats within a row.
For example:
2,1,3,4
3,1,2,4
1,2,3,4
2,1,4,3
I'd like a fast method of creating arrays that list, for each integer, all the column positions in which it appears. For example:
One.positions = [1,2]
Two.positions = [1,2,3]
Three.positions = [1,3,4]
Four.positions = [3,4]
I'm doing this for arrays that are sometimes of size [1e+6,30]. Currently I'm using a loop that moves column-wise with accumarray; if the number exists in that column, the column index is appended to an array.
Original Code:
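% m = array containing sequences of integers
total = size(m,2);
A = (1:total);
intel_pos = cell(total,1);
for i = 1:total          % integer being searched for
    for j = 1:total      % column being examined
        % accumarray counts each value in column j; A(...) lists the values present
        if sum(A(accumarray(m(:,j),1) > 0) == i) == 1
            intel_pos{i} = [intel_pos{i};j];
        end
    end
end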
There are obvious areas for improvement here.
Accepted Answer
David Goodmanson
3 May 2017
Edited: David Goodmanson
5 May 2017
Revised answer. For a 7e6 x 30 matrix this takes about 2.7 sec on my PC.
m8 = uint8(m);
u1 = uint8(1);
ncol = size(m8,2);
nmax = ncol; % in your case. otherwise nmax = max(max(m))
A = zeros(nmax,ncol);
locations = cell(nmax,1);
% create matrix with j,k element = 1 if integer j is found in column k, 0 otherwise
for k = 1:ncol
    A(m8(:,k),k) = u1;
end
for k = 1:nmax
    locations{k} = find(A(k,:));
end
This is based on the principle that if you set an element to 1, it doesn't matter if you have already set it to 1 a thousand times.
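To see the idea on a small case, here is a minimal sketch that applies the same presence-matrix approach to the 4-by-4 example from the question. It uses a logical matrix named present rather than the numeric A above; that is my own substitution for illustration, and I have not measured whether it changes the timing.

```matlab
% Demo data from the question: each row is a permutation of 1:4
m = [2 1 3 4; 3 1 2 4; 1 2 3 4; 2 1 4 3];
ncol = size(m, 2);
nmax = ncol;                      % assumes the values lie in 1:ncol

% present(j,k) is true if integer j occurs anywhere in column k
present = false(nmax, ncol);
for k = 1:ncol
    % repeated row indices simply overwrite true with true
    present(m(:, k), k) = true;
end

% Read off, for each integer, the columns in which it was seen
locations = cell(nmax, 1);
for j = 1:nmax
    locations{j} = find(present(j, :));
end
```

For this data, locations{1} through locations{4} come out as [1 2], [1 2 3], [1 3 4] and [3 4], which matches the positions listed in the question.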
Comments: 2
David Goodmanson
8 May 2017
You're welcome, it was a good problem. I realized that I forgot to make A a uint8 matrix, so it is inconsistent to use u1 instead of 1 in the for loop. Actually it does not seem to make much difference speed-wise whether A is double and its elements are set to 1, or A is uint8 and its elements are set to u1.
More Answers (2)
Guillaume
3 May 2017
No idea if it's faster than your method, but this does not need a loop:
m = [2 1 3 4;1 2 3 4;1 2 4 3;2 1 4 3]; %demo data
colindices = repmat(1:size(m, 2), size(m, 1), 1);
out = accumarray(m(:), colindices(:), [], @(x) {unique(x)})
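As a quick sanity check, the same call can be run on the example matrix from the question rather than the demo data above. This is only a sketch for verifying the output shape: accumarray groups the column indices by the value found in m, and unique then removes the duplicates within each group, so each cell holds a sorted column vector.

```matlab
% Example matrix from the question
m = [2 1 3 4; 3 1 2 4; 1 2 3 4; 2 1 4 3];

% colindices(i,j) = j, so every element of m is paired with its column number
colindices = repmat(1:size(m, 2), size(m, 1), 1);

% Group column numbers by the integer stored in m, keeping each column once
out = accumarray(m(:), colindices(:), [], @(x) {unique(x)});

% Each cell is a column vector; transpose to compare with the row vectors
% written in the question
one_positions = out{1}.';
three_positions = out{3}.';
```

Here one_positions is [1 2] and three_positions is [1 3 4], as in the question. Note that every element of m is passed through accumarray, so for a 1e6-by-30 input the grouping step handles 3e7 values.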
Comments: 2
Sean de Wolski
3 May 2017
Loops are not typically slower. This was true 15 years ago, but with advances in the MATLAB execution engine and JIT accelerator they are now at parity with vectorized operations much of the time.
Sean de Wolski
3 May 2017
Edited: Sean de Wolski
4 May 2017
I'd suspect a simple loop over columns with ismember would be very fast.
EDIT from Comment Clarification
tic
ncol = size(m,2);
nrow = size(m,1);
present = cell(ncol,1);
for ii = 1:ncol
    present{ii} = unique(ceil(find(m==ii)./nrow));
end
toc
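The ceil(find(...)./nrow) step in the loop above works because find returns column-major linear indices, so dividing by the number of rows and rounding up recovers the column number. A small sketch on the question's example, with ind2sub added purely as a cross-check (the variable names here are my own):

```matlab
% Example matrix from the question
m = [2 1 3 4; 3 1 2 4; 1 2 3 4; 2 1 4 3];
nrow = size(m, 1);

% Linear (column-major) indices of every occurrence of the value 3
idx = find(m == 3);

% Column recovered arithmetically, as in the loop above
colsCeil = ceil(idx ./ nrow);

% Same columns obtained from ind2sub, for comparison
[~, colsSub] = ind2sub(size(m), idx);

% Unique columns containing the value 3, as a row vector
cols = unique(colsCeil).';
```

Both colsCeil and colsSub come out as [1; 3; 3; 4], and cols is [1 3 4].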
OLD
[~,m] = sort(rand(1e6,30),2);
tic
ncol = size(m,2);
present = cell(ncol,1);
for ii = 1:ncol
    present{ii} = find(ismember(1:ncol,m(:,ii)));
end
toc
This is taking 0.9s on my laptop.
Comments: 6
David Goodmanson
5 May 2017
Hi Matthew, I have done a revised answer, which is about three times faster than the code listed above.