How can I make this code run faster with a sparse matrix

Question

Esegboria Osarhemen 2019년 3월 2일

0
링크

이 질문에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/447828-how-can-i-make-this-code-run-faster-with-a-sparse-matrix

댓글: John D'Errico 2019년 3월 2일

채택된 답변: John D'Errico

Assuming I have the following code

x=[1:30164675]

y=x

N=length(x)

z=sparse(N,N)

for i=1:N

for j=1:N

if x(i)~=y(j)

z(i,j)=1

end

z

For my real 'x' data, I have 30164675 values, alot of which a equal, so I should have more zero entries in the matrix except from those on the diagonal. But when I run this code, it takes too long. Am I using the sparse matrix correctly or is there any way to use vectorization and a sparse matrix to make this code faster?

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글을 달려면 로그인하십시오.

이 질문에 답변하려면 로그인하십시오.

Answer 1

John D'Errico 2019년 3월 2일

1
링크

이 답변에 대한 바로 가기 링크

https://kr.mathworks.com/matlabcentral/answers/447828-how-can-i-make-this-code-run-faster-with-a-sparse-matrix#answer_363462

편집: John D'Errico 2019년 3월 2일

How can I say this? You are using sparse in the very worst way possible. Well, you are.

Not only that, but your code will dump HUGE amounts of stuff into the command window at every iteration. So you very, very much need to learn when to terminate your lines of code with a semi-colon.

A simple rule about sparse is you never build such a matrix one element at a time. That forces MATLAB to stuff one new element into the matrix each time through the loop. It is not so inefficient if the matrix is a full one, since MATLAB does not need to shuffle elements around in memory at ach iteration. But a sparse matrix is stored differently, not storing the zeros. So single element insertion into a sparse matrix is terribly slow. NEVER do it that way.

One caveat is that if you used spalloc instead of sparse, and inserted the elements into the array in the proper way, you could reduce the time somewhat. But that would requires you to have a very good knowledge of sparse storage, and it would still be inefficient.

Instead, you need to create a list of the non-zero elements of the sparse matrix in advance., THEN call sparse ONCE with that entire list. (Read the help for sparse. Then read it again. I recall being confused the first time I read it.)

As far as how to find all of the elements in your list that are duplicates, I'd probably start with unique or ismember. The simplest way of course to create the matrix you have is this:

X = randi(10000,[1,20000]); % a random vector
Xs = sparse(X);
A = Xs' == Xs;
nnz(A)
ans =
       60114
       
whos A
  Name          Size                Bytes  Class      Attributes
  A         20000x20000            880062  logical    sparse    

So I started with a set of 20000 integers, no larger than 10000. That tells me there will be 2 duplicates on average for each. As you can see, it creates a sparse logical matrix of the desired size.

min(sum(A,1))
ans =
   (1,1)        1
max(sum(A,1))
ans =
   (1,1)        9

The above code will work as long as you have release R2016b or later. Earlier releases would do it like this:

A = bsxfun(@eq,Xs',Xs);

The trick is to make the vector X sparse, so then the result will also be sparse. This works even though the vector is not truly a sparse one at all, but the final result needs to be so.

Could I have built that matrix using unique or ismember? Probably. But the direct test Xs'==Xs is the solution I would use. And I would NEVER, EVER solve the problem as you tried.

댓글 수: 2
없음 표시없음 숨기기

Esegboria Osarhemen 2019년 3월 2일

Thank you very much, I get it now

John D'Errico 2019년 3월 2일

Using sparse matrices forces you to think a different way about how to build them. The first time I did (in fortran, circa 1990, I think) I had a tough time. It took too long to build them, or so I thought, because I was not building them correctly.

So you need to use tools like spdiags and sparse to build them in one call. Once you get used to doing so, you will find sparse matrices can be hugely valuable.

댓글을 달려면 로그인하십시오.

How can I make this code run faster with a sparse matrix

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 2
없음 표시없음 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

Community Treasure Hunt

How can I make this code run faster with a sparse matrix

댓글 수: 0 이전 댓글 -2개 표시이전 댓글 -2개 숨기기

채택된 답변

댓글 수: 2 없음 표시없음 숨기기

추가 답변 (0개)

참고 항목

카테고리

태그

Community Treasure Hunt

댓글 수: 0
이전 댓글 -2개 표시이전 댓글 -2개 숨기기

댓글 수: 2
없음 표시없음 숨기기