MATLAB Answers

Multiplication of large matrices

Nikan Fakhari 22 Dec 2020
Commented: Jan 22 Dec 2020
Hi there,
I have two matrices that are relatively large (each one is 94,340 x 2240) and I need to multiply them together.
I understand that this might take some time, but I feel like it's taking longer than usual. I have run it for 30 minutes and the code is still running.
Do you have any suggestions for this?
Thank you,
Nikan

  6 Comments (3 older comments not shown)

Nikan Fakhari 22 Dec 2020
This is the code:
transpose_IQ_Casorati = transpose(IQ_Casorati);
UMatrix_Raw = IQ_Casorati * transpose_IQ_Casorati;
IQ_Casorati has a size of 94340 x 2240.
So, to answer your question, @Walter: this is exactly what I am doing, but I have a hard drive with 1 TB of storage.
Nikan
Bruno Luong 22 Dec 2020
Think again: Do you need to explicitly compute the product?
What do you do with it later? For example, if you only want to multiply the product by a vector y, then just keep X (of dimension 94340 x 2240). The computation
M = X * X'; % runs out of RAM
z = M * y
can be carried out by
z = X * (X'*y)
The second formula does not require storing M, and it is much faster.
So think again, do you really need M?
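As a minimal sketch of the regrouping above, the following compares the explicit and implicit routes on a small stand-in matrix. The sizes, the random real-valued data, and the names z1, z2 and relErr are assumptions for illustration only; they are not taken from the original post.

```matlab
% Small stand-in sizes (hypothetical) so the explicit product fits in memory.
rng(0);                     % fixed seed for reproducibility
X = rand(500, 20);          % plays the role of the 94340 x 2240 matrix
y = rand(500, 1);

% Explicit route: forms the full 500 x 500 matrix M first.
M  = X * X';
z1 = M * y;

% Implicit route: two matrix-vector products, M is never stored.
z2 = X * (X' * y);

% The two results agree up to floating-point rounding.
relErr = norm(z1 - z2) / norm(z1);
fprintf('relative difference: %.2e\n', relErr);
```

At the original sizes, the implicit route only ever holds X and a few vectors, whereas the explicit route needs the full 94340 x 94340 matrix.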
Walter Roberson 22 Dec 2020
You are exceeding your primary memory, and MATLAB is having to write to swap space. That is a minimum of 10 times slower than keeping everything in primary memory, worse if you are writing to hard drive, less bad if you are writing to SSD. SSD would make a big difference for performance for this purpose, but you would still have to expect that it would be slow.
If you can avoid doing the multiplication, you should do so. For example if your computation really only needs one row (or column) of the result at a time and does not need to keep storing the result of the multiplication afterwards, then it could potentially save a lot of time to loop doing partial products, as the processing for each partial product might stay within main memory.
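The partial-product idea could look roughly like the sketch below. It assumes, purely for illustration, that the later computation only needs the row sums of the product; the small matrix size, blockSize, and the names slab and rowSums are hypothetical.

```matlab
rng(0);
X = rand(1000, 30);          % hypothetical stand-in for the 94340 x 2240 matrix
n = size(X, 1);
blockSize = 128;             % rows of the product computed per pass (tunable)

rowSums = zeros(n, 1);       % example of a quantity derived from each slab
for first = 1:blockSize:n
    last = min(first + blockSize - 1, n);
    slab = X(first:last, :) * X';        % rows first..last of X*X'
    rowSums(first:last) = sum(slab, 2);  % use the slab; it is overwritten next pass
end
```

With real double data at the original size, a slab of 128 rows would take about 97 MB (128 x 94340 x 8 bytes), which stays comfortably in main memory.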


Answers (2)

James Tursa 22 Dec 2020
Edited: James Tursa 22 Dec 2020
Do NOT transpose your matrix explicitly before the multiply. That will only cause a deep copy, double the memory footprint, and slow things down. Instead let the MATLAB parser see the transpose operation as part of the multiply and call a symmetric BLAS routine to do the operation without explicitly transposing first. Faster and less memory used. E.g.,
UMatrix_Raw = IQ_Casorati*IQ_Casorati.';
That being said, you are doing the worst case and trying to generate a 94340 x 94340 result. The slow computation time is very likely because of paging out to the hard drive. You need to rethink your problem and how you are going about solving it.

  0 Comments



Jan 22 Dec 2020
A short test:
x = rand(2240, 94340); % 1.6 GB RAM
tic
y1 = x * x';
toc
tic
xt = x';
y2 = x * xt;
toc
On my i5 mobile chip in R2018b: 15.1 sec versus 28.5 sec. This shows that James' point about avoiding the explicit transpose has a remarkable effect.
With the original input data [94340 x 2240], the output would need 71.2 GB of RAM, which is more than my laptop has. It might exceed the memory of your computer as well.
You did not answer the question of whether the matrix is sparse, i.e. whether only some elements differ from zero. If it is full, remember that the output of X*X' is symmetric, so half of the 71.2 GB is redundant. If you have, e.g., 64 GB of RAM, then using this fact can allow you to run the code efficiently. Unfortunately there are no built-in methods to exploit the symmetry of matrices, so you would have to implement this on your own, at least the matrix multiplication for your code.
@all readers: Is there a standard library to handle full or sparse symmetric matrices efficiently?
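One way to exploit the symmetry by hand is to compute and keep only the blocks on or above the diagonal. The sketch below is a hypothetical layout on small stand-in data, not a built-in facility or a standard library; blockSize, the cell array U, and the index names are assumptions for illustration.

```matlab
rng(0);
X = rand(600, 25);           % hypothetical stand-in data
n = size(X, 1);
blockSize = 200;             % assumed to be chosen so that blocks fit in memory
starts = 1:blockSize:n;
nb = numel(starts);

U = cell(nb, nb);            % only entries with i <= j are filled
for i = 1:nb
    ri = starts(i):min(starts(i) + blockSize - 1, n);
    for j = i:nb
        rj = starts(j):min(starts(j) + blockSize - 1, n);
        U{i, j} = X(ri, :) * X(rj, :)';   % block (i,j) of X*X'
    end
end
% Block (j,i) with j > i is never stored; it is recovered as U{i,j}' on demand.
```

This stores roughly half of the full product, plus the diagonal blocks. Any later code then has to work block by block, since the full matrix is never assembled.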

  2 Comments

Bruno Luong 22 Dec 2020
For the full matrix the size is 66.3 GB:
>> (94340*94340*8)/(1024^3)
ans =
66.3104
For a sparse matrix, it would not help, since the product will likely not be sparse. Quite the opposite: it would need about twice the size, because a dense result is stored inefficiently in sparse format.
Jan 22 Dec 2020
Correct, Bruno. My "71.2" uses the metric (decimal) gigabyte, your "66.3" the binary gibibyte.
If the matrix A is a sparse block diagonal matrix, A*A' is also. But this is not the general case.
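The two figures can be reproduced with the short calculation below. It assumes real double-precision elements at 8 bytes each; complex doubles would need 16 bytes per element.

```matlab
n = 94340;
bytes = n^2 * 8;                              % 8 bytes per real double element
fprintf('%.1f GB  (10^9 bytes)\n', bytes / 1e9);
fprintf('%.1f GiB (2^30 bytes)\n', bytes / 2^30);
```

This prints 71.2 GB and 66.3 GiB: the same byte count expressed in two different units.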
