INVERSE MODELING TECHNIQUES Daniel J. Jacob
GENERAL APPROACH FOR COMPLEX SYSTEM ANALYSIS
Construct a mathematical “forward” model describing the system as a function of a limited number of state variables (state vector x).
Assemble a priori knowledge of x.
Use the model to relate the state variables to the observables y.
Use the observations to improve knowledge of x. Solution: a posteriori estimate.
Feed back on the result: improve the observation system, improve the model, improve the a priori.
EXAMPLE: INVERSE ESTIMATE OF SURFACE CO2 FLUXES (x) FROM ATMOSPHERIC MIXING RATIO MEASUREMENTS
Bottom-up constraint: fuel consumption data, ecosystem model and data, and ocean model and data give x_a (a priori knowledge of fluxes).
Forward model: Chemical Transport Model (CTM), which solves the continuity equation with the surface fluxes (state vector x) as boundary conditions.
Observation vector y: measurements from aircraft, towers, satellites…
Top-down constraint on x: the observations y, related to x through the CTM; the Jacobian matrix describes the CTM.
Optimal a posteriori estimate: fit x to the top-down and bottom-up constraints.
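BAYES’ THEOREM: FOUNDATION FOR INVERSE MODELS
P(x) = probability distribution function (pdf) of x
P(y|x) = pdf of y given x
$$P(x|y) = \frac{P(y|x)\,P(x)}{P(y)}$$
Here P(x|y) is the a posteriori pdf, P(y|x) is the observation pdf, P(x) is the a priori pdf, and P(y) is a normalizing factor (unimportant).
Maximum a posteriori (MAP) is the solution to $\nabla_x P(x|y) = 0$.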
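SIMPLE LINEAR INVERSE PROBLEM FOR A SCALAR
Consider a single measurement used to quantify a single source.
Source: x = (fuel burned) × (emission factor), with a priori bottom-up estimate $x_a \pm \sigma_a$.
Monitoring site measures concentration y. Forward model gives y = kx.
“Observational error” $\sigma_\varepsilon$ combines instrument and forward model errors: $y = kx \pm \sigma_\varepsilon$.
Bayes’ theorem (Gaussian errors): $$P(x|y) \propto P(y|x)\,P(x) \propto \exp\left[-\frac{(y-kx)^2}{2\sigma_\varepsilon^2}\right] \exp\left[-\frac{(x-x_a)^2}{2\sigma_a^2}\right]$$
Max of P(x|y) is given by the minimum of the cost function $$J(x) = \frac{(x-x_a)^2}{\sigma_a^2} + \frac{(y-kx)^2}{\sigma_\varepsilon^2}$$
Solution: $\hat{x} = x_a + g\,(y - kx_a)$, where $g = \dfrac{k\sigma_a^2}{k^2\sigma_a^2 + \sigma_\varepsilon^2}$ is a gain factor.
Let x be the true value, so that $y = kx + \varepsilon$: $\hat{x} = ax + (1-a)\,x_a + g\varepsilon$, where $a = gk$ is an averaging kernel.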
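The scalar solution can be checked numerically. The sketch below is a minimal illustration, assuming Gaussian errors and the standard expressions for the gain factor g = kσ_a²/(k²σ_a² + σ_ε²), the averaging kernel a = gk, and the a posteriori error 1/σ̂² = 1/σ_a² + k²/σ_ε². The function name `scalar_inversion` and all numbers are hypothetical and are not taken from the slides.

```python
def scalar_inversion(y, k, x_a, sigma_a, sigma_eps):
    """MAP estimate for the scalar linear problem y = k*x + error (Gaussian errors assumed)."""
    # Gain factor: how strongly the model-observation mismatch corrects the prior.
    g = k * sigma_a**2 / (k**2 * sigma_a**2 + sigma_eps**2)
    # A posteriori estimate: prior plus gain times (observation - forward model of prior).
    x_hat = x_a + g * (y - k * x_a)
    # Averaging kernel: sensitivity of the estimate to the true state.
    a = g * k
    # A posteriori error from 1/sigma_hat^2 = 1/sigma_a^2 + k^2/sigma_eps^2.
    sigma_hat = (1.0 / sigma_a**2 + k**2 / sigma_eps**2) ** -0.5
    return x_hat, g, a, sigma_hat


# Hypothetical numbers: prior 100 +/- 50, k = 2, observation 260 +/- 20.
x_hat, g, a, sigma_hat = scalar_inversion(
    y=260.0, k=2.0, x_a=100.0, sigma_a=50.0, sigma_eps=20.0
)
print(x_hat, g, a, sigma_hat)
```

An averaging kernel close to 1 means the estimate is driven mostly by the observation; a value close to 0 means it stays near the a priori.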
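GENERALIZATION: CONSTRAINING n SOURCES WITH m OBSERVATIONS
Forward model: $$y_i = \sum_{j=1}^{n} k_{ij}\,x_j + \varepsilon_i, \quad i = 1, \dots, m$$
A cost function defined as $$J(\mathbf{x}) = \sum_{j=1}^{n} \frac{(x_j - x_{a,j})^2}{\sigma_{a,j}^2} + \sum_{i=1}^{m} \frac{\left(y_i - \sum_{j} k_{ij}x_j\right)^2}{\sigma_{\varepsilon,i}^2}$$ is generally not adequate because it does not account for correlation between sources or between observations.
Need to go to vector-matrix formalism: $\mathbf{y} = \mathbf{K}\mathbf{x} + \boldsymbol{\varepsilon}$, with Jacobian matrix $\mathbf{K}$ (elements $k_{ij}$) and error covariance matrices $\mathbf{S}_a = E[(\mathbf{x}-\mathbf{x}_a)(\mathbf{x}-\mathbf{x}_a)^T]$ and $\mathbf{S}_\varepsilon = E[\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}^T]$, leading to formulation of the cost function: $$J(\mathbf{x}) = (\mathbf{x}-\mathbf{x}_a)^T \mathbf{S}_a^{-1} (\mathbf{x}-\mathbf{x}_a) + (\mathbf{y}-\mathbf{K}\mathbf{x})^T \mathbf{S}_\varepsilon^{-1} (\mathbf{y}-\mathbf{K}\mathbf{x})$$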
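VECTOR-MATRIX REPRESENTATION OF LINEAR INVERSE PROBLEM
Optimal a posteriori solution (retrieval):
  Scalar problem: $\hat{x} = x_a + g\,(y - kx_a)$
  Vector-matrix problem: $\hat{\mathbf{x}} = \mathbf{x}_a + \mathbf{G}(\mathbf{y} - \mathbf{K}\mathbf{x}_a)$
Gain factor:
  Scalar problem: $g = \dfrac{k\sigma_a^2}{k^2\sigma_a^2 + \sigma_\varepsilon^2}$
  Vector-matrix problem: $\mathbf{G} = \mathbf{S}_a\mathbf{K}^T(\mathbf{K}\mathbf{S}_a\mathbf{K}^T + \mathbf{S}_\varepsilon)^{-1}$
A posteriori error:
  Scalar problem: $\dfrac{1}{\hat{\sigma}^2} = \dfrac{1}{\sigma_a^2} + \dfrac{k^2}{\sigma_\varepsilon^2}$
  Vector-matrix problem: $\hat{\mathbf{S}}^{-1} = \mathbf{K}^T\mathbf{S}_\varepsilon^{-1}\mathbf{K} + \mathbf{S}_a^{-1}$
Averaging kernel:
  Scalar problem: $a = gk$
  Vector-matrix problem: $\mathbf{A} = \mathbf{G}\mathbf{K}$
Jacobian matrix $\mathbf{K} = \partial\mathbf{y}/\partial\mathbf{x}$: sensitivity of observations to true state (fwd model)
Gain matrix $\mathbf{G} = \partial\hat{\mathbf{x}}/\partial\mathbf{y}$: sensitivity of retrieval to observations
Averaging kernel matrix $\mathbf{A} = \partial\hat{\mathbf{x}}/\partial\mathbf{x}$: sensitivity of retrieval to true state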
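The vector-matrix retrieval can be sketched in a few lines of NumPy. This is a minimal illustration under the linear Gaussian assumptions, using the standard expressions G = S_a Kᵀ (K S_a Kᵀ + S_ε)⁻¹, x̂ = x_a + G(y − K x_a), Ŝ⁻¹ = Kᵀ S_ε⁻¹ K + S_a⁻¹ and A = GK. The function name `linear_inversion` and the synthetic two-source, three-observation problem are hypothetical.

```python
import numpy as np


def linear_inversion(y, K, x_a, S_a, S_eps):
    """Analytical MAP solution of the linear Gaussian inverse problem."""
    # Gain matrix: sensitivity of the retrieval to the observations.
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    # Retrieval: prior corrected by the gained model-observation mismatch.
    x_hat = x_a + G @ (y - K @ x_a)
    # Averaging kernel matrix: sensitivity of the retrieval to the true state.
    A = G @ K
    # A posteriori error covariance.
    S_hat = np.linalg.inv(K.T @ np.linalg.inv(S_eps) @ K + np.linalg.inv(S_a))
    return x_hat, G, A, S_hat


# Hypothetical synthetic problem: 2 sources, 3 observations.
rng = np.random.default_rng(0)
K = rng.normal(size=(3, 2))            # made-up Jacobian
x_true = np.array([2.0, 1.0])          # made-up "true" sources
x_a = np.array([1.5, 1.5])             # a priori estimate
S_a = np.diag([0.5**2, 0.5**2])        # a priori error covariance
S_eps = 0.1**2 * np.eye(3)             # observational error covariance
y = K @ x_true + rng.normal(scale=0.1, size=3)

x_hat, G, A, S_hat = linear_inversion(y, K, x_a, S_a, S_eps)
print(x_hat)
print(np.sqrt(np.diag(S_hat)))
```

For large problems, explicit matrix inverses would normally be replaced by linear solves; they are kept here so that the code mirrors the formulas term by term.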
APPLICATION OF BAYESIAN INVERSION TO SATELLITE RETRIEVALS
Here y is the vector of wavelength-dependent radiances (radiance spectrum); x is the state vector of concentrations; the forward model y = Kx is the radiative transfer model.
[Figure: illustrative MOPITT averaging kernel matrix]
tr(A) gives the number of pieces of information in the profile (1-2 for MOPITT).
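The trace of the averaging kernel matrix is straightforward to compute once K and the error covariances are specified. The sketch below assumes A = GK with G = S_a Kᵀ (K S_a Kᵀ + S_ε)⁻¹. The five-level profile, the two broad weighting functions, and the error covariances are invented for illustration and do not represent the actual MOPITT instrument.

```python
import numpy as np


def degrees_of_freedom(K, S_a, S_eps):
    """Return tr(A) and the averaging kernel matrix A = G K."""
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    A = G @ K
    return np.trace(A), A


# Hypothetical setup: 5 vertical levels observed through 2 broad weighting functions.
levels = np.arange(5)
K = np.array([
    [1.0, 0.8, 0.5, 0.2, 0.1],   # channel weighted toward the lower levels
    [0.1, 0.3, 0.6, 0.9, 1.0],   # channel weighted toward the upper levels
])
# A priori errors correlated between neighboring levels (made-up length scale).
S_a = 0.25 * np.exp(-np.abs(levels[:, None] - levels[None, :]) / 2.0)
S_eps = 0.01 * np.eye(2)

dofs, A = degrees_of_freedom(K, S_a, S_eps)
print(dofs)
```

With two measurements, tr(A) cannot exceed 2 however many levels the state vector has, which is why a retrieved profile can contain far fewer independent pieces of information than reported levels.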
INVERSE ANALYSIS OF MOPITT AND TRACE-P DATA TO CONSTRAIN ASIAN SOURCES OF CO
Bottom-up emissions (customized for TRACE-P): fossil and biofuel; daily biomass burning (satellite fire counts). Streets et al. [2003], Heald et al. [2003a]
Forward model: GEOS-CHEM Chemical Transport Model (CTM)
Top-down constraints: TRACE-P CO data (G.W. Sachse) and MOPITT CO
Inverse analysis leads to OPTIMIZATION OF SOURCES
[Flowchart; other arrow labels: validation, chemical forecasts]
COMPARE TRACE-P OBSERVATIONS WITH CTM RESULTS USING A PRIORI SOURCES
Model is low in the boundary layer north of 30°N: suggests Chinese source is low.
Model is high in the free troposphere south of 30°N: suggests biomass burning source is high.
Assume that the Relative Residual Error (RRE) after bias is removed describes the observational error variance (20-30%).
Assume that the difference between successive GEOS-CHEM CO forecasts during TRACE-P ($t_0$ + 48 h and $t_0$ + 24 h) describes the covariant error structure (“NMC method”).
Palmer et al. [2003], Jones et al. [2003]
CHARACTERIZING THE OBSERVATIONAL ERROR COVARIANCE MATRIX FOR MOPITT CO COLUMNS
Diagonal elements (error variances) obtained by the relative residual error (RRE) method.
Add covariant structure from the NMC method.
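One way to combine the two ingredients is to scale a correlation matrix by the error standard deviations. The sketch below is an assumption about how that could be done, not the procedure documented in the cited studies: variances come from a relative error applied to the model columns, and the correlation structure is taken from the sample correlation of forecast differences. The function name, the input layout, and all numbers are hypothetical.

```python
import numpy as np


def observational_covariance(y_model, rre, forecast_diffs):
    """Build S_eps from relative error variances and a forecast-difference correlation.

    y_model        : model values at the m observation locations
    rre            : relative residual error (e.g. 0.25 for 25%)
    forecast_diffs : array (n_samples, m) of differences between successive forecasts
    """
    # Error standard deviation at each location from the relative error.
    sigma = rre * y_model
    # Correlation between locations estimated from the forecast differences.
    corr = np.corrcoef(forecast_diffs, rowvar=False)
    # Covariance = sigma_i * sigma_j * corr_ij.
    return np.outer(sigma, sigma) * corr


# Hypothetical inputs: 3 observation locations, 50 forecast-difference samples.
rng = np.random.default_rng(0)
y_model = np.array([100.0, 120.0, 90.0])
forecast_diffs = rng.normal(size=(50, 3))
S_eps = observational_covariance(y_model, 0.25, forecast_diffs)
print(S_eps)
```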
SELECTING THE STATE VECTOR OF CO SOURCES
Start from a possible 16-element vector. Try separating fuel vs. biomass burning sources in 4 regions.
Do singular value decomposition of the normalized Jacobian: the n singular values > 1 identify modes for which the obs system gives useful constraints.
MOPITT: n = 10, 11-component state vector
  Don’t separate fuel from biomass burning sources
  Merge Korea and Japan
TRACE-P: n = 4, 6-component state vector
  Don’t separate fuel from biomass burning sources
  Merge Korea, Japan, N. China
  Merge central and western China
  Merge India and Indonesia w/ rest of world
Assume that the spatial distribution within each region and the temporal variation are known (“hard” constraints).
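The singular value test can be sketched as follows. The slide does not define the normalization, so this example assumes the common prewhitened form K̃ = S_ε^(-1/2) K S_a^(1/2), in which a singular value above 1 means the observations constrain that mode better than the a priori does. The helper names and the small test Jacobian are hypothetical.

```python
import numpy as np


def sym_power(S, p):
    """Power p of a symmetric positive-definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    return (V * w**p) @ V.T


def count_constrained_modes(K, S_a, S_eps):
    """Count singular values > 1 of the (assumed) normalized Jacobian."""
    K_tilde = sym_power(S_eps, -0.5) @ K @ sym_power(S_a, 0.5)
    singular_values = np.linalg.svd(K_tilde, compute_uv=False)
    return int(np.sum(singular_values > 1.0)), singular_values


# Hypothetical 2-source problem: the first source is well observed, the second barely.
K = np.array([[3.0, 0.0],
              [0.0, 0.1]])
n_modes, singular_values = count_constrained_modes(K, np.eye(2), np.eye(2))
print(n_modes, singular_values)
```

In this toy case only one singular value exceeds 1, which would argue for merging the poorly observed source with another element of the state vector rather than retrieving it separately.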
COMPARATIVE INVERSE ANALYSIS OF ASIAN CO SOURCES USING DAILY MOPITT AND TRACE-P DATA
CO observations from spring 2001, GEOS-CHEM CTM as forward model. Heald et al. [2004]
[Figure panels: TRACE-P aircraft CO, 4 degrees of freedom; MOPITT CO columns, 10 degrees of freedom; additional label: (from validation)]
MOPITT and TRACE-P both show an underestimate of anthropogenic emissions (40% for China, likely due to under-reporting of industrial coal use).
MOPITT and TRACE-P both show an overestimate of biomass burning emissions in southeast Asia; very low values from TRACE-P could reflect transport bias.
MOPITT has higher information content than TRACE-P because it observes source regions and Indian outflow.
MOPITT information degrades if data are averaged weekly or monthly.
Ensemble modeling of MOPITT data indicates 10-40% uncertainty on retrieved sources.
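BASIC KALMAN FILTER TO OPTIMIZE TEMPORAL VARIATION OF SOURCES
Consider a vector of observations at discrete times, $\mathbf{y}_t$, to be used in a sequential manner to optimize the time-evolving source $\mathbf{x}_t$.
Assume that we have previously obtained a best estimate $\hat{\mathbf{x}}_{t-1}$ with error covariance $\hat{\mathbf{S}}_{t-1}$ at time t-1.
Assume a source model $\mathbf{x}_t = \mathbf{M}\mathbf{x}_{t-1} + \boldsymbol{\eta}$ for the evolution of x from t-1 to t, which gives the a priori value $\mathbf{x}_{a,t} = \mathbf{M}\hat{\mathbf{x}}_{t-1}$ of x at time t with an error $\mathbf{S}_{a,t} = \mathbf{M}\hat{\mathbf{S}}_{t-1}\mathbf{M}^T + \mathbf{S}_\eta$.
Run the forward model from t-1 to t, optimize $\mathbf{x}_t$ using the observations $\mathbf{y}_t$.
Repeat the process for t+1.
Filter can also be run backward.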
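The sequential update can be sketched as one function applied at each time step. This is a minimal illustration assuming a linear forward model y_t = K x_t + ε, the source model x_t = M x_{t-1} + η with error covariance S_η, and the same analytical update as in the static linear problem. The persistence model (M = I), the Jacobian, and all numbers are hypothetical.

```python
import numpy as np


def kalman_step(x_prev, S_prev, y_t, K, M, S_eta, S_eps):
    """One forecast-and-update cycle of a basic Kalman filter."""
    # Forecast: the source model gives the a priori state and its error at time t.
    x_a = M @ x_prev
    S_a = M @ S_prev @ M.T + S_eta
    # Update: optimize x_t using the observations y_t.
    G = S_a @ K.T @ np.linalg.inv(K @ S_a @ K.T + S_eps)
    x_hat = x_a + G @ (y_t - K @ x_a)
    S_hat = (np.eye(len(x_a)) - G @ K) @ S_a
    return x_hat, S_hat


# Hypothetical setup: 2 sources, persistence source model, 2 observations per step.
rng = np.random.default_rng(1)
M = np.eye(2)
K = np.array([[1.0, 0.5],
              [0.2, 1.0]])
S_eta = 0.01 * np.eye(2)
S_eps = 0.25 * np.eye(2)
x_true = np.array([3.0, 1.0])

x_hat = np.array([1.0, 1.0])       # initial best estimate
S_hat = 4.0 * np.eye(2)            # initial error covariance
S_initial = S_hat.copy()
for t in range(20):
    y_t = K @ x_true + rng.normal(scale=0.5, size=2)
    x_hat, S_hat = kalman_step(x_hat, S_hat, y_t, K, M, S_eta, S_eps)

print(x_hat)
print(np.sqrt(np.diag(S_hat)))
```

The a posteriori error does not shrink to zero as more steps are processed, because the source model error S_η is added back at every forecast.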
ITERATIVE SOURCE OPTIMIZATION IN 4-D VAR
[Figure: successive iterates $x_0$, $x_1$, $x_2$, $x_3$ converging toward the minimum of the cost function J]
Estimate the minimum of J with a numerical method involving Lagrange multipliers and the model adjoint, rather than analytically.
Advantage: computationally efficient for large state vectors.
Problem: errors on x are not characterized.
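The idea of descending the cost function with adjoint-computed gradients can be illustrated on a small linear problem. The sketch below is not a 4-D Var system: it assumes a linear forward model whose adjoint is simply Kᵀ, and it substitutes plain steepest descent with an exact line search for the minimizers used in practice. The cost function is J(x) = (x − x_a)ᵀ S_a⁻¹ (x − x_a) + (y − Kx)ᵀ S_ε⁻¹ (y − Kx); the function name and all numbers are hypothetical.

```python
import numpy as np


def minimize_cost(y, K, x_a, S_a_inv, S_eps_inv, n_iter=500, tol=1e-10):
    """Steepest descent on J(x) using only forward (K) and adjoint (K^T) products."""
    x = x_a.copy()
    for _ in range(n_iter):
        # Gradient of J: forward model run, then adjoint applied to the weighted mismatch.
        grad = 2.0 * (S_a_inv @ (x - x_a) - K.T @ (S_eps_inv @ (y - K @ x)))
        if np.linalg.norm(grad) < tol:
            break
        # Hessian-vector product, again built from forward and adjoint operations only.
        H_grad = 2.0 * (S_a_inv @ grad + K.T @ (S_eps_inv @ (K @ grad)))
        # Exact line search step for a quadratic cost function.
        alpha = (grad @ grad) / (grad @ H_grad)
        x = x - alpha * grad
    return x


# Hypothetical problem: 3 sources, 4 noise-free observations.
K = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5],
              [0.5, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
x_a = np.array([1.0, 1.0, 1.0])
y = K @ np.array([1.5, 0.8, 1.2])
S_a_inv = np.eye(3)
S_eps_inv = np.eye(4) / 0.25

x_iter = minimize_cost(y, K, x_a, S_a_inv, S_eps_inv)

# Analytical solution for comparison (feasible only because the problem is tiny).
x_exact = x_a + np.linalg.inv(S_a_inv + K.T @ S_eps_inv @ K) @ K.T @ S_eps_inv @ (y - K @ x_a)
print(x_iter, x_exact)
```

The iteration never forms or inverts a matrix of the size of the state vector squared, which is the source of the efficiency for large state vectors; it also never produces the a posteriori error covariance, consistent with the drawback noted on the slide.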
APPLICATION OF 4-D VAR TO SYNTHETIC CO2 FLUX INVERSION (D. Baker, NCAR)
4-D VAR allows optimization of the surface flux field on the native grid resolution of the forward model.