Matrix inversion differences between versions

Adam Ward on 1 Jun 2020
Commented: Bjorn Gustavsson on 2 Jun 2020
I have been working with some relatively large arrays (approx. 32000x32000, with a sparsity of >0.99) as part of a modelling project. I first wrote a version of my code in 2018a, and taking the inverse of my matrix, A, and multiplying this with a vector, b, worked fine using the "A\b" command.
I got a new laptop today and finally got around to updating to version 2020a. The same matrix inversion now doesn't work - I get the warning that the "matrix is singular to working precision". After this warning has been displayed, the operation "A\b" returns an array comprised entirely of NaN.
I tried running the exact same code on my desktop to check (using 2018a), and it worked as normal. I then installed version 2018a on my new laptop, and again the code ran fine. So I am quite sure that it is to do with some difference between 2018a and 2020a.
Does anybody know anything about this difference between the two versions? I would like to be using the most up to date release of MATLAB, but it seems that my code won't run. Any insight would be much appreciated.
**Note: I have not provided my code here as it is part of an academic study and it would not be appropriate for me to release it.

Accepted Answer

Bjorn Gustavsson on 1 Jun 2020
Have you tried looking at the smallest singular values? You can calculate those with svds. If they are small or very small, your calculation A\b is not necessarily all that meaningful.
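A minimal sketch of such a check, assuming the sparse system matrix A from the question (the number of singular values requested and the tolerance comparison are just illustrative):
s_large = svds(A, 1);              % largest singular value
s_small = svds(A, 6, 'smallest');  % a few of the smallest singular values
disp(s_small)
% If the smallest singular values are at or below eps*s_large, A is
% numerically singular and A\b will not be meaningful without regularization.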
In my fields we are now required to give open access to our data.
5 Comments
Adam Ward on 1 Jun 2020
Thank you for the detailed response, it is very helpful.
Bjorn Gustavsson on 2 Jun 2020
To follow up on John's comment: I should have made that a bit clearer. However, it is still just the beginning of the next chapter in the practical application of linear inverse-problem theory. There are two fundamental facts that are often neglected in introductory linear algebra courses:
  1. There will always be some kind of noise in the data, b in this case; even if there is no measurement noise, the A/D conversion will only have a limited number of bits' worth of accuracy.
  2. For the case where we have a discretized version of a continuous unknown, x in my answer, and discrete data (for example tomographic reconstruction problems), the problem is underdetermined as soon as we have our discrete data, and the "measurement" or "theory" matrix has an infinite condition number. If we hide that by under-discretizing the model space, i.e. the dimension of x, we have implicitly constrained/regularized the solution, but this does not change the fundamental character of the problem.
That we need to take a few steps beyond the condition number is obvious if we look at two modified versions of John's example. Let us have two data vectors:
b1 = [0.99876
0.99885
0.99496
0.99317
1.0027
1.0222
6.6246e-14];
and
b1p = [7.0761e-14
8.738e-14
9.8103e-14
1.2058e-13
8.1381e-14
9.5989e-14
7.4862e-14];
Both are produced from the same array of unknowns:
x = ones(7,1);
but with two different theory-matrices:
M7 = diag([ones(1,6),1e-13]);
M7e = diag([ones(1,6)*1e-13,1e-13]);
The first has a condition number of 1e13, the second a condition number of 1. If we have normally distributed measurement noise with widths that are the maximum of 1% and 4/2^16 (just to model something with finite A/D precision):
b1 = M7*ones(7,1);
b1 = b1 + [0.01*randn(6,1);0.25*randn(1)*1e-13];
b1p = M7e*ones(7,1);
b1p = b1p + 0.25*randn(7,1).*b1p;
Our estimates of x will be far better for the first problem (x1 = M7\b1) than for the second (x1p = M7e\b1p), even though the condition numbers point in the other direction. However, the last component of x1 will have a much larger variance than the first six.
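One way to see this numerically, using the variables defined above (the error measure is just an illustrative choice):
x1  = M7\b1;    % good estimate, except the last component is noisy
x1p = M7e\b1p;  % every component carries roughly 25% noise
fprintf('cond(M7)  = %8.3g, error in first 6 components: %g\n', cond(M7),  norm(x1(1:6)-1)/sqrt(6));
fprintf('cond(M7e) = %8.3g, error in all 7 components:   %g\n', cond(M7e), norm(x1p-1)/sqrt(7));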
For the case where we do not under-discretize, the theory matrix will now be singular, i.e. it will have some number of non-zero singular values, let's say p. Then the condition number is infinite, but this is both expected and irrelevant, since the problem was underdetermined the moment we got our finite set of measurements. To get an estimate of our unknown we can naively try the general (Moore-Penrose) SVD inverse:
x = V_p*inv(S_p)*U_p.'*b_obs;
where U_p contains the p columns of U corresponding to the non-zero singular values, and similarly for S_p and V_p. But since our observations, b_obs, are the sum of the ideal measurements and some unavoidable noise:
b_obs = b_ideal + n
we see that with
x = V*inv(S)*U.'*b_ideal + V*inv(S)*U.'*n;
the components of U.'*n that correspond to the really small singular values will be hugely amplified, and here the amplified noise will, in general, spread to most components of x (in contrast to the previous, nicely diagonal problem). Instead of using the "raw" inverse of S we can resort to either a further truncation of S (and U and V), or a damped version (Tikhonov inverse) where we use
alpha = 1; % Or some other suitable value
iZ_alpha = diag(diag(S_p)./(diag(S_p).^2 + alpha));
x = V_p*iZ_alpha*U_p.'*b_obs;
From here we can continue and calculate the covariance matrix for x and the resolution matrix, but not at this moment.
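For concreteness, a minimal sketch of how U_p, S_p and V_p can be formed and used for both the truncated and the damped inverse; the theory matrix M, the data vector b_obs and the cut-off tolerance are illustrative assumptions, not part of the original example:
[U, S, V] = svd(M);                        % full SVD of the theory matrix
p    = sum(diag(S) > 1e-10*S(1,1));        % number of "significant" singular values
U_p  = U(:,1:p);  S_p = S(1:p,1:p);  V_p = V(:,1:p);
x_trunc  = V_p*(S_p\(U_p.'*b_obs));        % truncated (Moore-Penrose) solution
alpha    = 1;                              % damping parameter
iZ_alpha = diag(diag(S_p)./(diag(S_p).^2 + alpha));
x_damped = V_p*iZ_alpha*(U_p.'*b_obs);     % Tikhonov-damped solution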
If the theory matrix is too big for an SVD decomposition we have to resort to iterative methods and hope that they behave well; at least some do.
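As an illustration only, and assuming the sparse matrix A and data vector b from the original question, MATLAB's lsqr is one such iterative solver (the tolerance and iteration limit are assumptions that would need tuning for the actual problem):
tol   = 1e-8;    % illustrative tolerance
maxit = 500;     % illustrative iteration limit
[x_it, flag, relres, iter] = lsqr(A, b, tol, maxit);
fprintf('lsqr flag %d, relative residual %g after %g iterations\n', flag, relres, iter);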
