I wish to create m = 10^5 sparse matrices of size n by n, say n = 10^4. I have been using
A = cell(m, 1);
for i = 1:m
row = ...; col = ...; val = ...; % here ... means some certain assignment in column vectors
A{i} = sparse(row, col, val, n, n);
end
But it is too slow. So I tried to use the types ndSparse (https://www.mathworks.com/matlabcentral/fileexchange/29832-n-dimensional-sparse-arrays) and sptensor (https://www.sandia.gov/~tgkolda/TensorToolbox/index-2.6.html). They do the job fast by creating m matrices all at once in 3d (n*n*m). It requires concatenating index and value vectors, where the speed is acceptable. However, I then need individual matrices for some operations that do NOT work on types ndSparse and sptensor. For example,
[R, p] = chol(A(:, :, i));
does not work. If I convert the object to Matlab sparse type as
[R, p] = chol(sparse(A(:, :, i)));
then it is even slower than creating A one by one in the for loop. Considering that Matlab does not support multidimensional sparse arrays (so I cannot reshape the abovementioned types into Matlab sparse tensor), how can I speed up creating m sparse matrices? Thank you!

댓글 수: 1

Steven Lord
Steven Lord 2018년 9월 13일
How are you planning to use those 1e5 sparse matrices later in your code? I want to see if it is possible to reduce the number of sparse matrices you need to create while still achieving your ultimate goal by using a different approach or algorithm.

댓글을 달려면 로그인하십시오.

 채택된 답변

Matt J
Matt J 2018년 9월 13일
편집: Matt J 2018년 9월 13일

0 개 추천

Once you have A in ndSparse form, you can then split it into the cell array form you were originally trying to get using mat2cell:
Ar=sparse( reshape(A,n,m*n) );
Acell=mat2cell(Ar,n,ones(1,m)*n);
and then
[R, p] = chol(Acell{i});

댓글 수: 9

Melody
Melody 2018년 9월 13일
Thank you Matt! It seems that mat2cell is more expensive than assigning the matrices one by one in each cell -- the former takes about three times the elapsed time of the later...
Matt J
Matt J 2018년 9월 13일
편집: Matt J 2018년 9월 13일
What if you do something like this instead
Ar=sparse2d( reshape(A,n,m*n) );
for i=0:m-1
[R,p]=chol(A(:,1+n*i:n+n*i);
end
Matt J
Matt J 2018년 9월 13일
편집: Matt J 2018년 9월 13일
For me, the loop below completes in about 2.5 sec. That's an indication that you should be able to extract the submatrices from the wide-matrix Ar pretty fast - definitely faster than the time it takes you to do other stuff like Cholesky decompositions. What speeds are you looking for?
m=1e5; n=1e4;
A=sprand(n,m*n,1e-7);
tic;
for i=0:m-1
z=A(:,1+n*i:n+n*i);
end
toc;
Thanks for your suggestions! Here is my result:
for i = 1:m
A0{i} = sparse(row{i}, col{i}, val{i}, n, n); % 208.06 seconds
end
A = ndSparse.build([row, col], val); % 0.11 seconds
Ar = sparse(reshape(A, n, n*m)); % 8.05 seconds
A1 = mat2cell(Ar, n, n*ones(1, m)); % 616.71 seconds
for i = 1:m
A2{i} = Ar(:, (1 + n*(i - 1)):(n*i)); % 191.38 seconds
end
It seems that building A2 takes slightly less time that building A0. My m is 56000 and my n is 20000. There are more operations I wish to do than just chol, which demands separating the matrices.
Matt J
Matt J 2018년 9월 14일
편집: Matt J 2018년 9월 14일
It seems that building A2 takes slightly less time that building A0.
Did you preallocate A2? I can't see why it would be that slow, unless it's a memory allocation issue.
Do you really need all the separate sub-matrices in memory at the same time? That's a lot of extra (non-contiguous) memory consumption.
Yes, I have set
A2 = cell(m, 1);
beforehand. Is there other ways to preallocate memory? Maybe my computer is just slow. :(
Matt J
Matt J 2018년 9월 14일
편집: Matt J 2018년 9월 14일
Again, do you really need them all simultaneously? And do you really need them all in separate cells? It seems redundant to have the same data in two places at the same time (Ar and A2).
I've been trying to think of a way to not use them simultaneously but haven't found a solution. However I found a way to work around building big sparse matrix (thanks for reminding me that even empty big sparse matrix can take up a lot of memory):
[C, ~, ic] = unique([row{i}; col{i}]);
len_C = length(C);
len_ic2 = length(ic)/2;
Ai = sparse(ic, [ic((len_ic2 + 1):end); ic(1:len_ic2)], val{i}, len_C, len_C);
This works for all my operations. However, the "unique" function in the first line above takes even more time than before... Is there any way to make it faster?
Melody
Melody 2018년 9월 16일
편집: Melody 2018년 9월 16일
I came up with a way to not keep all matrices at the same time, and codes as in the 3rd comment in this answer seem to save 50% time for the large data. Thank you!
May I know what the proper way to include "ndSparse.m" in my software is? Is it enough to include "license.txt" and cite the webpage "https://www.mathworks.com/matlabcentral/fileexchange/29832-n-dimensional-sparse-arrays" in my document?

댓글을 달려면 로그인하십시오.

추가 답변 (2개)

Matt J
Matt J 2018년 9월 13일
편집: Matt J 2018년 9월 13일

0 개 추천

It might also be a good idea, instead of constructing a 3D sparse array or a cell array of separate matrices, to instead create a big block diagonal matrix, where each n x n matrix is one of the diagonal blocks. That way you can do the entire Cholesky decomposition in a single call to CHOL.

댓글 수: 4

Melody
Melody 2018년 9월 13일
Thank you, but it does not apply to my application. Is it possible to efficiently incorporate CHOL as a method of ndSparse?
Matt J
Matt J 2018년 9월 13일
편집: Matt J 2018년 9월 13일
How does it "not apply" to your application?
Making CHOL a method of ndSparse will not address the problem because the bottleneck would still be in accessing the sub-matrices A(:,:,i). Even with normal 2D sparse matrices, you are seeing that that is problem.
Matt J
Matt J 2018년 9월 13일
What are the densities of these matrices nnz(A)/numel(A) ?
Melody
Melody 2018년 9월 13일
Very sparse -- less then 10 nonzero elements for each matrix.

댓글을 달려면 로그인하십시오.

Christine Tobler
Christine Tobler 2018년 9월 13일

0 개 추천

The sparse function will often be faster if the second input, col, is sorted in ascending order. If you can cheaply construct col in a way that this is the case, that should help a bit with performance.

댓글 수: 1

Melody
Melody 2018년 9월 14일
Thank you for your advice! It may probably work well with denser matrices, but my matrices only have less than 10 nonzero entries each, so sorting them then creating the sparse matrix actually takes more time. :(

댓글을 달려면 로그인하십시오.

카테고리

도움말 센터File Exchange에서 Creating and Concatenating Matrices에 대해 자세히 알아보기

제품

릴리스

R2015a

질문:

2018년 9월 13일

편집:

2018년 9월 16일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by