Geology 5670/6670 Inverse Theory 6 Feb 2015 © A.R. Lowry 2015
Read for Mon 9 Feb: Menke Ch 5 (89-114)

Last time: The Generalized Inverse; Damped LS

The Generalized Inverse uses Singular Value Decomposition to recast the problem in terms of the $p$ non-zero eigenvalues & eigenvectors of $\mathbf{G}$. The resulting singular value decomposition of $\mathbf{G}$ is
$$\mathbf{G} = \mathbf{U}_p \boldsymbol{\Lambda}_p \mathbf{V}_p^T, \qquad p \le \min(N, M),$$
with pseudoinverse
$$\mathbf{G}^{-g} = \mathbf{V}_p \boldsymbol{\Lambda}_p^{-1} \mathbf{U}_p^T,$$
which minimizes both $\mathbf{e}^T\mathbf{e}$ and $\mathbf{m}^T\mathbf{m}$. Solution variance can be reduced by setting small $\lambda_i = 0$ (especially if $\lambda_i < \varepsilon$!). This leads to a fundamental trade-off between solution variance and model resolution…
So we have a tradeoff between resolution and variance:

[Figure: trade-off curve of solution variance vs. model irresolution as $p$ decreases.]

This tradeoff (degraded model resolution is required to get reduced solution variance) is an inherent limitation of all inverse problems…
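For concreteness, here is a minimal numpy sketch of the truncated-SVD generalized inverse; the matrix G, data d, and cutoff eps below are all made-up illustrative choices, not values from the lecture:

```python
import numpy as np

# Hypothetical ill-conditioned forward operator G (N=6 data, M=4 parameters)
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 4)) @ np.diag([1.0, 0.5, 0.1, 1e-4])
d = rng.standard_normal(6)

# Singular value decomposition: G = U_p @ diag(lam) @ Vt_p
U, lam, Vt = np.linalg.svd(G, full_matrices=False)

# Generalized inverse, truncating singular values below a cutoff eps:
# zeroing small lam_i reduces solution variance at the cost of resolution.
eps = 1e-2
keep = lam > eps
G_pinv = Vt[keep].T @ np.diag(1.0 / lam[keep]) @ U[:, keep].T

m_est = G_pinv @ d   # minimum-length least-squares solution
print("kept", keep.sum(), "of", lam.size, "singular values")
print("m_est =", m_est)
```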
Damped Least Squares (Menke §)

Suppose we have an over-determined problem that is ill-conditioned (i.e., $\lambda_M/\lambda_1 \ll 1$), so the determinant of $\mathbf{G}^T\mathbf{G}$ is close to zero. Can we reduce solution variance without throwing away parameters?

Idea: Combine a minimization of $\mathbf{e}^T\mathbf{e}$ and $\mathbf{m}^T\mathbf{m}$ for the over-determined (least-squares) problem! Define a new objective function that combines residual length & solution length:
$$\Phi(\mathbf{m}) = \mathbf{e}^T\mathbf{e} + \varepsilon^2\,\mathbf{m}^T\mathbf{m}.$$
To minimize, set
$$\frac{\partial \Phi}{\partial m_q} = 0 \quad \forall\, q.$$
Recall $\mathbf{e} = \mathbf{d} - \mathbf{Gm}$, so setting $\partial\Phi/\partial\mathbf{m} = \mathbf{0}$ gives:
$$\mathbf{G}^T\mathbf{G}\,\mathbf{m} + \varepsilon^2\mathbf{m} = \mathbf{G}^T\mathbf{d},$$
or:
$$\left(\mathbf{G}^T\mathbf{G} + \varepsilon^2\mathbf{I}\right)\mathbf{m} = \mathbf{G}^T\mathbf{d}.$$
Thus, the pseudoinverse for damped least squares (DLS) is:
$$\mathbf{G}^{-g} = \left(\mathbf{G}^T\mathbf{G} + \varepsilon^2\mathbf{I}\right)^{-1}\mathbf{G}^T.$$
The condition number for OLS is $\lambda_1^2/\lambda_M^2$. Identity: If the eigenvalues of $\mathbf{A}$ are $\lambda_i$, the eigenvalues of $\mathbf{A} + k\mathbf{I}$ are $\lambda_i + k$. So the condition number for DLS is
$$\frac{\lambda_1^2 + \varepsilon^2}{\lambda_M^2 + \varepsilon^2} < \frac{\lambda_1^2}{\lambda_M^2}.$$
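A minimal numpy sketch of the DLS normal equations and the condition-number identity above; G, d, and eps2 are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((8, 3)) @ np.diag([1.0, 0.3, 1e-3])  # ill-conditioned
d = rng.standard_normal(8)
eps2 = 1e-2                                   # damping parameter epsilon^2

GtG = G.T @ G
m_dls = np.linalg.solve(GtG + eps2 * np.eye(3), G.T @ d)

# Eigenvalues of G^T G are the squared singular values lam_i^2 of G;
# adding eps^2 * I shifts each eigenvalue by eps^2, improving conditioning.
lam2 = np.linalg.svd(G, compute_uv=False) ** 2
print("cond(OLS) =", lam2.max() / lam2.min())
print("cond(DLS) =", (lam2.max() + eps2) / (lam2.min() + eps2))
```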
The covariance matrix for DLS (assuming $\mathbf{C}_d = \sigma^2\mathbf{I}$) gives:
$$\mathbf{C}_m = \sigma^2 \left(\mathbf{G}^T\mathbf{G} + \varepsilon^2\mathbf{I}\right)^{-1}\mathbf{G}^T\mathbf{G}\left(\mathbf{G}^T\mathbf{G} + \varepsilon^2\mathbf{I}\right)^{-1}.$$
As compared to OLS:
$$\mathbf{C}_m = \sigma^2 \left(\mathbf{G}^T\mathbf{G}\right)^{-1}.$$
The resolution matrix is now
$$\mathbf{R} = \mathbf{G}^{-g}\mathbf{G} = \left(\mathbf{G}^T\mathbf{G} + \varepsilon^2\mathbf{I}\right)^{-1}\mathbf{G}^T\mathbf{G} = \mathbf{V}\,\mathbf{F}\,\mathbf{V}^T,$$
where:
$$F_{ii} = \frac{\lambda_i^2}{\lambda_i^2 + \varepsilon^2}.$$
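Continuing the numerical sketch (the same illustrative G as above, repeated so the block is self-contained; sigma2 and eps2 are assumed values):

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.standard_normal((8, 3)) @ np.diag([1.0, 0.3, 1e-3])
sigma2, eps2 = 1.0, 1e-2

GtG = G.T @ G
A = np.linalg.inv(GtG + eps2 * np.eye(3))

C_dls = sigma2 * A @ GtG @ A          # DLS model covariance
C_ols = sigma2 * np.linalg.inv(GtG)   # OLS covariance (huge when ill-conditioned)
R = A @ GtG                           # resolution matrix; R -> I as eps2 -> 0

print("diag C_ols:", np.diag(C_ols))
print("diag C_dls:", np.diag(C_dls))
print("diag R    :", np.diag(R))
```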
So the resolution/variance trade-off curve is similar to that for the generalized inverse:

[Figure: trade-off curve of solution variance vs. model irresolution for increasing $\varepsilon^2$.]

with the important difference that the dependence on parameters for which the solution is ill-conditioned is tapered instead of sharply cut off.
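The taper vs. cut-off distinction can be seen directly in the weights each method applies to the singular values (a small illustrative computation; the λ values and eps are made up):

```python
import numpy as np

lam = np.array([2.0, 1.0, 0.3, 0.05, 0.01])   # hypothetical singular values
eps = 0.1

# Truncated SVD (generalized inverse): weight 1 if lam_i > eps, else 0.
w_tsvd = (lam > eps).astype(float)

# DLS: filter factors taper smoothly as lam_i^2 / (lam_i^2 + eps^2).
w_dls = lam**2 / (lam**2 + eps**2)

for l, wt, wd in zip(lam, w_tsvd, w_dls):
    print(f"lam={l:5.2f}  truncated={wt:4.1f}  damped={wd:6.3f}")
```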
How do we choose an "optimal" $\varepsilon^2$?

[Figure: trade-off curve of solution variance vs. model irresolution for increasing $\varepsilon^2$, with the minimum-length point marked.]

Can minimize the length of the (variance, irresolution) vector, i.e., choose the point on the trade-off curve closest to the origin (the "minimum length" point in the figure). Can use a Bayesian statistical criterion (we'll get to this later), in which case $\varepsilon^2$ should be $\sigma_d^2/\sigma_m^2$.
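A sketch of the minimum-length criterion, scanning $\varepsilon^2$ over a grid and picking the trade-off point nearest the origin. The variance and irresolution measures (traces), the normalization, the scan range, and the toy G are all my assumed choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.standard_normal((10, 4)) @ np.diag([1.0, 0.5, 0.1, 0.01])
sigma2 = 1.0
I = np.eye(4)
GtG = G.T @ G

eps2_grid = np.logspace(-6, 1, 50)
var, irres = [], []
for eps2 in eps2_grid:
    A = np.linalg.inv(GtG + eps2 * I)
    var.append(sigma2 * np.trace(A @ GtG @ A))   # size of solution variance
    irres.append(np.trace(I - A @ GtG))          # size of model irresolution

# Normalize each axis, then pick the point closest to the origin.
v = np.array(var) / max(var)
r = np.array(irres) / max(irres)
best = np.argmin(np.hypot(v, r))
print("minimum-length eps^2 ~", eps2_grid[best])
```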
Maximum Likelihood (Menke §)

Suppose we have data $\mathbf{d}$ with known probability density function (pdf) $f(\mathbf{d};\mathbf{m})$. Here we are assuming the data $\mathbf{d}$ depend much more strongly on parameters $\mathbf{m}$ than on any other possible parameters…

The probability $P$ that a random variable $X$ lies on $x_1 \le X \le x_2$ is
$$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx,$$
so the probability of making observations within $\pm\Delta d$ of those we actually measured is
$$P = \int_{d_1-\Delta d}^{d_1+\Delta d} \cdots \int_{d_N-\Delta d}^{d_N+\Delta d} f(\mathbf{d}';\mathbf{m})\,d\mathbf{d}'.$$
We assume $\Delta d$ very small, so that
$$P \approx f(\mathbf{d};\mathbf{m})\,(2\Delta d)^N.$$
Method: Find the member of a family of distributions $f(\mathbf{d}|\mathbf{m})$ which maximizes the probability of "getting the $\mathbf{d}$ that we got" from among all possible $\mathbf{m}$. The likelihood function $L$ of $\mathbf{m}$ given $\mathbf{d}$ is
$$L(\mathbf{m}|\mathbf{d}) = f(\mathbf{d}|\mathbf{m}),$$
and we want to maximize $L$ as a function of $\mathbf{m}$:
$$\frac{\partial L}{\partial m_i} = 0 \quad \forall\, i.$$

Case 1: Assume jointly normal, zero-mean errors. Recall that
$$f(\mathbf{d}|\mathbf{m}) = \frac{1}{(2\pi)^{N/2}\,|\mathbf{C}_d|^{1/2}} \exp\!\left[-\frac{1}{2}\left(\mathbf{d}-\mathbf{Gm}\right)^T \mathbf{C}_d^{-1} \left(\mathbf{d}-\mathbf{Gm}\right)\right].$$
Then maximizing $L$ amounts to minimizing the argument of the exponential, $(\mathbf{d}-\mathbf{Gm})^T \mathbf{C}_d^{-1} (\mathbf{d}-\mathbf{Gm})$, i.e., weighted least squares.
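A minimal numerical check of Case 1 (synthetic straight-line data; the true model and noise level are made-up values): with $\mathbf{C}_d = \sigma^2\mathbf{I}$, maximizing the Gaussian likelihood is the same as solving the least-squares normal equations.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50
G = np.column_stack([np.ones(N), np.linspace(0, 1, N)])  # line fit: d = a + b*x
m_true = np.array([1.0, 2.0])
sigma = 0.1
d = G @ m_true + sigma * rng.standard_normal(N)

# With C_d = sigma^2 I, maximizing the Gaussian likelihood minimizes
# (d - Gm)^T (d - Gm): the normal equations give the ML estimate directly.
m_ml = np.linalg.solve(G.T @ G, G.T @ d)
print("m_true =", m_true)
print("m_ml   =", m_ml)
```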