Using subrange makes array operations slow

Question

0 개 추천

This is a question about the performance of code which I would naively expect to be much better. I define two random arrays:

a = randn(1, 10000);
b = randn(10000, 1);

Then try to multiply them together in two ways:

timeit(@() a*b)
ans =
   4.3080e-06
timeit(@() a(1:10000)*b(1:10000))
ans =
   1.5634e-05

You can see that when I use a sub-range of the arrays, multiplication gets about 3.6x slower, even in the case when the subrange is the whole array.

I could imagine that there is some overhead in parsing the subrange part of the command; but when the vectors become longer, the "overhead" grows:

a = randn(1, 1000000);
b = randn(1000000, 1);
timeit(@() a*b)
ans =
    0.0010
timeit(@() a(1:1000000)*b(1:1000000))
ans =
    0.0062

So, I have the two-fold question:

1) As a matter of curiosity, why is the performance so much worse when I use sub-ranges?

2) When we wish to perform read-only operations on sub-ranges, what can we do (besides copying the array) to keep performance reasonable?

Thanks, Clayton

PS. Version information: MATLAB Version: 8.3.0.532 (R2014a) Operating System: Microsoft Windows 7 Ultimate Version 6.1 (Build 7601: Service Pack 1) Java Version: Java 1.7.0_11-b21 with Oracle Corporation Java HotSpot™ 64-Bit Server VM mixed mode

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Follow Question

Answer 1

Matt J 2014년 12월 18일

편집: Matt J 2014년 12월 19일

MATLAB Online에서 열기

0 개 추천

1) As a matter of curiosity, why is the performance so much worse when I use sub-ranges?

When you read a sub-array on the right hand side of a statement, a temporary variable holding a copy of the sub-array is made internally. E.g., the command

c=sum(a(1:N))

creates a temporary variable to hold [a(1),a(2),...,a(N)] occupying its own block of memory (separate from the memory occupied by the complete vector a) which is then passed to the sum() command. This mechanism makes it so that the sum() command doesn't have to know anything about what larger array a(1:N) belongs to in order to do its work. It just handles it like any other input variable. However, there is time and memory overhead for the copying, of course.

For left-hand side vectorized assignment, e.g.,

a(1:N)=1

copies are not made.

2) When we wish to perform read-only operations on sub-ranges, what can we do (besides copying the array) to keep performance reasonable?

Use a MEX file. In C-code, you can do all the in-place computations you want.

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

Matt J 2014년 12월 19일

편집: Matt J 2014년 12월 20일

MATLAB Online에서 열기

I also seem to remember that in early versions of MATLAB, in the expression

c=sum(a(1:N))

not only would the data [a(1),a(2),...,a(N)] get copied, but also the expression 1:N would internally generate a temporary index vector N elements long. I.e., it was equivalent to doing the following

      indices=1:N;
      c=sum(a(indices));

which of course has further time and memory overhead due to the creation of "indices".

댓글을 달려면 로그인하십시오.

Using subrange makes array operations slow

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

태그

Community Treasure Hunt

Using subrange makes array operations slow

댓글 수: 0 이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 1 이전 댓글 -1개 표시 이전 댓글 -1개 숨기기

추가 답변 (0개)

카테고리

제품

태그

참고 항목

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시 이전 댓글 -2개 숨기기

댓글 수: 1
이전 댓글 -1개 표시 이전 댓글 -1개 숨기기