Speed performance: Find all y-vector entries that have the same value in an x-vector of equal length.

Question

Zwithouta 2018년 4월 19일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/396100-speed-performance-find-all-y-vector-entries-that-have-the-same-value-in-an-x-vector-of-equal-length

댓글: Zwithouta 2018년 4월 20일

Hi Guys,

imagine you have an x-vector

   x = [1 1 2 2 1 1];

as well as an y-vector of equal length.

    y = [1 2 3 4 5 6]

And you want to find out, which values the y-vector has at those indices where the x-vector has the value 1, as well as the y-vector values at those indices where the x-vector has the value 2. So that the solution for this example would be

solution = {[1 2 5 6], [3 4]}

That is the first cell of 'solution' contains all y-values, that belong to the first unique x-value (1) & the second cell of it contains all y-values, that belong to the second unique x-value (2).

My ideas on how to implement this for vectors of any length are given below.

Does anybody has an idea, how to avoid the for-loop in approach A???

x = randi(10, 1, 100000);
y = randi(5, 1, 100000);

% Approach A:
tic 
xVals = unique(x); % get unique x values
tf = x == xVals';  % get true-or-false logical array (each row specifies where the n-th value of xVals occurs in x).
yVals = cell(1, size(tf,1)); % preallocate yVals
for i = 1:size(tf,1)
    yVals(i) = {y(tf(i,:))}; % Assign all y-values that belong to one xVal-entry to one cell.
end
toc

% Approach B: 
tic
[xs, id] = sort(x); % xsorted
ys = y(id); % sort y the same way x was sorted
[xVals, ia] = unique(xs);
% Divide the ys-vector into sections whose corresponding xs-indices have the same value in the xs vector.
yVals = mat2cell(ys, [1], [diff(ia)', length(ys)-sum(diff(ia))]); 
toc

% Runtimes:
%  x = randi(10, 1, 1000000);
%  y = randi(5, 1, 1000000);
%  A: Elapsed time is 0.100635 seconds.
%  B: Elapsed time is 0.149284 seconds.

%  x = randi(100, 1, 1000000);
%  y = randi(5, 1, 1000000);
%  A: Elapsed time is 0.886396 seconds.
%  B: Elapsed time is 0.139918 seconds.

% Comparison
% A is faster than B if there are few different x-values (about 10-20). 
% The number of different x-values has high impact on speed of A.
% A is faster than B if x & y-vector are extremely long (> 100000).
% The vector length has low impact on speed of B.

댓글 수: 7
이전 댓글 5개 표시이전 댓글 5개 숨기기

Star Strider 2018년 4월 20일

My pleasure!

I was surprised that accumarray was slower than the loop, even though it makes for neater code.

I didn’t test ‘Approach B’, since ‘Approach A’ (chosen randomly) was significantly faster than my accumarray call.

Zwithouta 2018년 4월 20일

편집: Zwithouta 2018년 4월 20일

MATLAB Online에서 열기

Enjoy it with caution! Approach A really suffers terribly from increasing the number of unique-values in the x-vector, since it increases the number of needed for-loop iterations. Approach B in contrast shows a really stable performance.

y = randi(5, 1, 1000000);

x = randi(10, 1, 1000000);
Elapsed time is 0.108344 seconds. % A
Elapsed time is 0.195844 seconds. % B

x = randi(100, 1, 1000000);
Elapsed time is 0.877760 seconds. % A
Elapsed time is 0.182726 seconds. % B

x = randi(1000, 1, 1000000);
Elapsed time is 11.379839 seconds.% A
Elapsed time is 0.196772 seconds. % B

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

Paul Shoemaker 2018년 4월 20일

0
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/396100-speed-performance-find-all-y-vector-entries-that-have-the-same-value-in-an-x-vector-of-equal-length#answer_316300

MATLAB Online에서 열기

If you're trying to avoid a FOR loop you can use arrayfun, although I'm not sure it improves speed.

It would look something like this:

x = [1 1 2 2 1 1];
y = [1 2 3 4 5 6];
solution = arrayfun(@(z) y(z==x),unique(x),'uniformoutput',false);

Paul Shoemaker

http://www.matlabinvesting.com

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

Zwithouta 2018년 4월 20일

Thanks for sharing your thoughts on that topic!

Your approach is the fastest for long x-vectors so far, but suffers similar to approach A from high numbers of unique x-values.

댓글을 달려면 로그인하십시오.

Speed performance: Find all y-vector entries that have the same value in an x-vector of equal length.

댓글 수: 7
이전 댓글 5개 표시이전 댓글 5개 숨기기

답변 (1개)

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기

참고 항목

카테고리

태그

제품

Community Treasure Hunt

Speed performance: Find all y-vector entries that have the same value in an x-vector of equal length.

댓글 수: 7 이전 댓글 5개 표시이전 댓글 5개 숨기기

답변 (1개)

댓글 수: 1 이전 댓글 -1개 표시이전 댓글 -1개 숨기기

참고 항목

카테고리

태그

제품

Community Treasure Hunt

댓글 수: 7
이전 댓글 5개 표시이전 댓글 5개 숨기기

댓글 수: 1
이전 댓글 -1개 표시이전 댓글 -1개 숨기기