Index to elements not listed in numeric index?

Andrew Landau on 25 Nov 2018
Commented: Andrew Landau on 25 Nov 2018
Some functions return lists of indices, such as unique and ismember. Let's say I want to index to every element that isn't listed:
A = [1 1 2 2 3 3];
[uA, idxuA] = unique(A); % uA = [1 2 3], idxuA = [1 3 5]
idxDuplicates = true(length(A),1);
idxDuplicates(idxuA) = false;
duplicatesInA = A(idxDuplicates);
But that doesn't seem very efficient, and it would be nice to do something like:
duplicatesInA = A(~idxuA);
I really have two questions for the MATLAB/coding experts:
(1) Is there an efficient and direct way to use '~' with a list of indices?
(2) Is it worth it to optimize this or should I just deal with the extra few lines of code?
2 Comments
Rik on 25 Nov 2018
I don't really consider myself to be an expert, but I'll still add my thoughts on this:
  1. Not that I know of. If it were a logical vector this would indeed be the way to do it, but since linear indices are returned, this might be the only way.
  2. Longer code can actually be more optimal, and more readable. That being said, as long as you are aware where the bottlenecks of your code are, you are miles ahead of many users. Unless your function is doing this millions of times in a loop, I don't think it is worth the extra effort to optimize this particular issue.
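To illustrate the first point, here is a small sketch using the `A` and `idxuA` from the question. It shows why `~` cannot be applied directly to a list of linear indices: negating a numeric array gives a logical array of the same size (false wherever the value is nonzero), not the complement of the index set. The names `notIdx`, `wrong`, `mask`, and `right` are made up for this illustration.

```matlab
A = [1 1 2 2 3 3];
[~, idxuA] = unique(A);   % linear indices of the first occurrences: 1, 3, 5

notIdx = ~idxuA;          % logical array, all false because every index is nonzero
wrong  = A(notIdx);       % an all-false mask selects nothing, so this is empty

mask = true(size(A));     % logical mask with one entry per element of A
mask(idxuA) = false;      % clear the listed positions
right = A(mask);          % elements not listed in idxuA: 1 2 3
```

So `~` only does what the question hopes for once the indices have been turned into a logical mask, which is exactly the extra two lines.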
Stephen23 on 25 Nov 2018
setdiff does the job quite easily.

Accepted Answer

Andrew Landau on 25 Nov 2018
Edited: Andrew Landau on 25 Nov 2018
Thanks everyone. I was looking for the function Matt J suggested, setdiff. However, I did a little profiling to check speeds: making a true array and setting the indexed elements to false is faster than setdiff by about an order of magnitude. So right you are, Rik: the longer code is the faster one in this case.
Here's the code I used if you want to test it:
% Set up some random data for testing
% ** the result was robust to changing N and K
N = 10000;
K = 500;
data = randn(N,1);
idx = randperm(N,K);
% if anyone has a better way to preallocate cell arrays please tell me!
P = 1000;
timing = cell(1,2);
timing = cellfun(@(c) zeros(P,1), timing, 'uni', 0);
for p = 1:P
    % Fastest by order of magnitude
    tic
    i1 = true(1,N); % define boolean array
    i1(idx) = false; % set all elements from index to false
    d11 = data(i1); % keep everything that wasn't in the index
    timing{1}(p) = toc;
    % Ten times slower
    tic
    i2 = setdiff(1:N,idx); % Get index of everything from 1:N not in idx
    d12 = data(i2); % setdiff(1:N,idx) as argument to data() had comparable timing
    timing{2}(p) = toc;
end
avgtime = cellfun(@mean, timing, 'uni', 1);
fprintf('Boolean array: %.2fµs -- Setdiff: %.2fµs -- Ratio: %.2f\n', avgtime(1)*1000000, avgtime(2)*1000000, avgtime(2)/avgtime(1));
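As an alternative to the tic/toc loop, MATLAB's `timeit` function calls a function handle repeatedly and returns a typical run time, which tends to be less noisy for operations this short. The following is only a sketch with the same `N` and `K`. The helper names `viaMask` and `viaSetdiff` are made up for this example, local functions in a script need R2016b or later, and the measured ratio will vary with machine and release.

```matlab
N = 10000;
K = 500;
data = randn(N,1);
idx = randperm(N,K);

% timeit expects a handle to a function that takes no inputs
tMask    = timeit(@() viaMask(data, idx));
tSetdiff = timeit(@() viaSetdiff(data, idx));
fprintf('Boolean array: %.2f us -- Setdiff: %.2f us -- Ratio: %.2f\n', ...
    tMask*1e6, tSetdiff*1e6, tSetdiff/tMask);

% Hypothetical helper: complement via a logical mask
function out = viaMask(data, idx)
    keep = true(size(data));   % keep everything to start with
    keep(idx) = false;         % drop the listed positions
    out = data(keep);
end

% Hypothetical helper: complement via setdiff
function out = viaSetdiff(data, idx)
    out = data(setdiff(1:numel(data), idx));
end
```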

More Answers (2)

Matt J on 25 Nov 2018
Edited: Matt J on 25 Nov 2018
Your way is probably the most efficient, but an alternative with shorter syntax is,
duplicatesInA = A( setdiff(1:numel(A), idxuA) );
1 Comment
Andrew Landau on 25 Nov 2018
Yeah, the boolean array is 10x faster. Thanks for your input though!

Matt J on 25 Nov 2018
Edited: Matt J on 25 Nov 2018
"Is it worth it to optimize this or should I just deal with the extra few lines of code?"
There's never a reason to deal with extra lines of code if it's an operation that you do often. That's what M-file functions are for.
function Ac = complement(A,idx)
% Return the elements of A whose linear indices are not listed in idx
Ic = true(numel(A),1);
Ic(idx) = false;
Ac = A(Ic);
end
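A short usage sketch for a helper of this kind, applied to the example from the question. The function is redefined here under the hypothetical name `complementOf` so that the snippet runs on its own as a script (local functions in scripts need R2016b or later). It assumes `idx` holds valid linear indices into `A`.

```matlab
A = [1 1 2 2 3 3];
[~, idxuA] = unique(A);                   % first occurrences: 1, 3, 5
duplicatesInA = complementOf(A, idxuA);   % elements of A not listed in idxuA

% Hypothetical stand-alone version of the helper
function Ac = complementOf(A, idx)
    Ic = true(size(A));   % start by keeping everything
    Ic(idx) = false;      % drop the listed positions
    Ac = A(Ic);           % return what remains
end
```

Saved as its own file on the MATLAB path, a function like this turns the three-line idiom into a one-line call wherever it is needed.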
1 Comment
Andrew Landau on 25 Nov 2018
right on, this is going in my library. Thanks Matt

Release: R2017a