Examples of Four-Dimensional Data Assimilation in Oceanography Ibrahim Hoteit University of Maryland October 3, 2007 I would like to thank you all for coming, and thank you also for giving me the opportunity to present my work in data assimilation here. It is something special to talk about data assimilation in Maryland. I entitled my presentation “Examples of Four-Dimensional Data Assimilation in Oceanography” because I wanted to give you an overview of my work in data assimilation, including the techniques and applications. The applications are all related to ocean problems, but the techniques I will be discussing can be applied equally well in meteorology, ecology, or any other field with high-dimensional models.
Outline 4D Data Assimilation: 4D-VAR and Kalman Filtering; Application to Oceanography. Examples in Oceanography: 4D-VAR Assimilation (Tropical Pacific, San Diego, …); Filtering Methods (Mediterranean Sea, Coupled models, Nonlinear filtering ...). Discussion and New Applications. Here is the outline of my talk. I will start with a brief overview of four-dimensional data assimilation problems, mainly describing the four-dimensional variational problem and the Kalman filter and discussing issues with their use in oceanography. This will help me to explain the techniques and the applications that I will present in two parts. The first part is on the four-dimensional variational assimilation systems we are developing at Scripps; I will mainly present a Tropical Pacific assimilation system and a coastal system for San Diego. The second part is on my work on filtering methods, including applications to coupled physical/biological models in the Mediterranean Sea and new work on nonlinear filtering. I will finish the talk with a general discussion and talk about some new applications. I would like to note here that the talk is about techniques and applications and is not about comparing 4D-VAR and the filtering approach.
Data Assimilation Goal: Estimate the state of a dynamical system. Information: Imperfect dynamical model: state vector x_k, model error eta_k, transition operator M from time k to time k+1. Sparse observations: observation vector y_k, observation error epsilon_k, observational operator H. A priori knowledge: background state x^b and its uncertainties. As you know, data assimilation mainly consists of estimating the state of a dynamical system. What information can we use for our estimation problem? First, a dynamical model which describes the physics of the system and allows us to evolve the state of the system in time. In general our knowledge of the physical system is not perfect and many errors are present; in oceanography, for example, many parameters are poorly known, and the initial state, the open boundaries, and the atmospheric forcing are also not perfect. We usually represent such a system as follows, with x_k the state vector at time k, eta the model error, which is generally assumed to be Gaussian with zero mean and covariance matrix Q, and M the transition operator that moves the state forward in time from time k to time k+1. Another source of information comes from the observations, but they are generally sparse in time and space, especially in oceanography. The relation between the observations and the state can be represented with this equation, where y_k contains all the observations at time k, epsilon denotes the associated observation errors, and H is the observational operator that describes the relation between the observations and the state. We often also have some a priori knowledge of the state, which usually includes an a priori estimate of the state and the associated uncertainties. This can be a forecast or a climatology. We use the terminology background state, and this explains the notation b.
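The slide equations did not survive extraction. The block below is a reconstruction from the speaker notes only, not the original slide: the subscripts on M, H, Q and R, and the use of R for the observation error covariance and B for the background error covariance (both named on the next slide), are assumptions.

```latex
\begin{aligned}
x_{k+1} &= M_{k,k+1}(x_k) + \eta_k, & \eta_k &\sim \mathcal{N}(0, Q_k) && \text{(imperfect model)}\\
y_k &= H_k(x_k) + \varepsilon_k, & \varepsilon_k &\sim \mathcal{N}(0, R_k) && \text{(sparse observations)}\\
x_0 &\approx x^b, & &\text{background error covariance } B && \text{(a priori knowledge)}
\end{aligned}
```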
Data Assimilation Data assimilation: Use all available information to determine the best possible estimate of the system state. Observations show the real trajectory to the model. Model dynamically interpolates the observations. 3D assimilation: Determine an estimate of the state at a given time, given an observation, by minimizing a cost function J. 4D assimilation: Determine the state given previous as well as future observations. 4D-VAR and Kalman Filtering. So this is the information we have, and data assimilation makes use of all of it to determine the best possible estimate of the system state. It is as if the observations guide the model toward the real trajectory and the model dynamically interpolates the observations in space and time. People started by treating this problem at a given time: every time they have an observation, they use it to improve their a priori estimate. This was called three-dimensional data assimilation, as the time dimension was not really included, and the problem was formulated as follows: look for the state that is close to the observations and not too far from the a priori knowledge. A cost function J is then defined, where the first term constrains the distance from the observations and the second the distance from the background, and the problem is to find the x that minimizes J. The inverses of R and B play the role of weights: a large uncertainty on the observation or on the background means a small weight in the optimization. Because of the continuous progress in our computing capabilities, people are starting to use more sophisticated assimilation techniques which can use previous as well as future observations for the estimation, which means more information at a given time. That is why they are referred to as four-dimensional data assimilation techniques, and they are very often classified in two categories: variational methods and Kalman filtering methods.
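As a minimal sketch of the three-dimensional problem described in the notes, the Python example below writes the cost function J with its observation and background terms and, assuming a linear observation operator, uses the standard closed-form minimizer. The function names and the toy numbers are illustrative assumptions, not taken from the talk.

```python
import numpy as np

def cost_3dvar(x, xb, y, H, B, R):
    """J(x) = 1/2 (y - Hx)^T R^-1 (y - Hx) + 1/2 (x - xb)^T B^-1 (x - xb)."""
    dy = y - H @ x
    dx = x - xb
    return 0.5 * dy @ np.linalg.solve(R, dy) + 0.5 * dx @ np.linalg.solve(B, dx)

def grad_3dvar(x, xb, y, H, B, R):
    """Gradient of J with respect to x (H assumed linear)."""
    return -H.T @ np.linalg.solve(R, y - H @ x) + np.linalg.solve(B, x - xb)

def analysis_3dvar(xb, y, H, B, R):
    """Closed-form minimizer of J when H is linear."""
    S = H @ B @ H.T + R
    return xb + B @ H.T @ np.linalg.solve(S, y - H @ xb)

# Toy setup (assumed values): three state variables, only the first is observed.
xb = np.array([1.0, 2.0, 3.0])      # background state
B = np.eye(3)                       # background error covariance
H = np.array([[1.0, 0.0, 0.0]])     # observation operator
R = np.array([[1.0]])               # observation error covariance
y = np.array([3.0])                 # observation

xa = analysis_3dvar(xb, y, H, B, R)
grad_at_xa = grad_3dvar(xa, xb, y, H, B, R)
```

With equal background and observation variances, the observed component of xa lands halfway between the background value and the observation, which illustrates how the inverses of R and B act as weights.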
4D-VAR Approach Optimal Control: Look for the model trajectory that best fits the observations by adjusting a set of “control variables”. Minimize the cost function J with the model as constraint: c is the control vector and may include any model parameter (IC, OB, bulk coefficients, etc.) … and model errors. Use a gradient descent algorithm to minimize J. Most efficient way to compute the gradients is to run the adjoint model backward in time. The variational approach consists of looking for the model trajectory that best fits the data by adjusting a set of well-chosen parameters. We refer to these parameters as control variables, and the problem is a direct generalization of the three-dimensional problem. The model state is now constrained by the data over a given period, and we look for the control vector c that minimizes the cost function J with the dynamical model as constraint. The control vector may include any model parameter, and even model errors. But including model errors might seriously complicate the problem for high-dimensional systems such as those in oceanography. As it is not possible to solve this problem analytically, it is generally solved using a gradient descent optimization algorithm. The most efficient way to determine the gradients of the cost function with respect to the control is to use the so-called adjoint method, which basically consists of running the adjoint of the model backward in time.
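The adjoint idea can be sketched on a toy problem. The Python example below assumes a linear model whose only control is the initial condition, computes the gradient of J with one backward sweep using the transpose of the model, checks it against finite differences, and takes a few fixed-step descent iterations. All names, the rotation model and the numbers are illustrative assumptions; the systems in the talk use the MITgcm adjoint and a far more capable optimizer.

```python
import numpy as np

def forward(x0, M, n_steps):
    """Integrate the linear model x_{k+1} = M x_k and return the trajectory."""
    traj = [x0]
    for _ in range(n_steps):
        traj.append(M @ traj[-1])
    return traj

def cost(x0, xb, Binv, M, H, Rinv, obs):
    """Background term plus observation misfits summed over the window."""
    J = 0.5 * (x0 - xb) @ Binv @ (x0 - xb)
    for xk, yk in zip(forward(x0, M, len(obs) - 1), obs):
        d = H @ xk - yk
        J += 0.5 * d @ Rinv @ d
    return J

def gradient_adjoint(x0, xb, Binv, M, H, Rinv, obs):
    """Gradient of J w.r.t. x0 from one backward (adjoint) sweep."""
    traj = forward(x0, M, len(obs) - 1)
    lam = np.zeros_like(x0)
    # Backward in time: propagate with M^T and add the misfit forcing at each step.
    for xk, yk in zip(reversed(traj), reversed(obs)):
        lam = M.T @ lam + H.T @ Rinv @ (H @ xk - yk)
    return Binv @ (x0 - xb) + lam

# Toy setup (assumed): 2-variable rotation model, first variable observed.
rng = np.random.default_rng(0)
theta = 0.3
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
H = np.array([[1.0, 0.0]])
Binv = np.eye(2)
Rinv = np.array([[4.0]])
x_true = np.array([1.0, -0.5])
obs = [H @ xk + 0.1 * rng.standard_normal(1) for xk in forward(x_true, M, 5)]
xb = np.zeros(2)
x0 = np.array([0.2, 0.1])
args = (xb, Binv, M, H, Rinv, obs)

g_adj = gradient_adjoint(x0, *args)
eps = 1e-6
g_fd = np.array([(cost(x0 + eps * e, *args) - cost(x0 - eps * e, *args)) / (2 * eps)
                 for e in np.eye(2)])

# A few fixed-step gradient descent iterations on the control.
x_opt = x0.copy()
for _ in range(200):
    x_opt = x_opt - 0.02 * gradient_adjoint(x_opt, *args)
J0, J_opt = cost(x0, *args), cost(x_opt, *args)
```

The adjoint sweep costs about one extra model integration regardless of the size of the control vector, whereas the finite-difference check needs two integrations per control variable, which is why the adjoint is the practical route for large controls.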
Kalman Filtering Approach Bayesian estimation: Determine the pdf of the state given the observations up to the estimation time. Minimum Variance (MV) estimate (minimum error on average). Maximum a posteriori (MAP) estimate (most likely). Kalman filter (KF): provides the MV (and MAP) estimate for linear-Gaussian systems. The Kalman filter is based on Bayesian estimation theory, which consists of determining the probability density function (pdf) of the state at a given time given previous observations up to the estimation time. I won't be talking about the smoothing approach today. Once the pdf is determined, we can compute different estimates of the state, such as the minimum variance estimate, which has the minimum error on average, or the maximum a posteriori estimate, which is the most likely state given the observations. For a linear system with Gaussian errors, the solution of the Bayesian estimation problem is given by the Kalman filter, which proceeds as a succession of two steps.
The Kalman Filter (KF) Algorithm Initialization Step. Forecast Step (model): Forecast state, Forecast error covariance. Analysis Step (observation): Kalman gain, Analysis state, Analysis error covariance. We start from an initialization, which requires an estimate of the state and the associated error covariance matrix (the famous background). Then, when we have an estimate of the state at a given time, we use the model to make the forecast and to compute the associated error covariance matrix. Every time we have an observation, we correct the forecast with this formula and we also update the associated error covariance matrix.
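The two steps can be sketched with the standard textbook Kalman filter equations for a linear model and linear observation operator. This Python example is a minimal illustration; the function names and the two-variable toy system are assumptions, and the slide's own formulas were lost in extraction.

```python
import numpy as np

def kf_forecast(xa, Pa, M, Q):
    """Forecast step: propagate the state and its error covariance with the model."""
    xf = M @ xa
    Pf = M @ Pa @ M.T + Q
    return xf, Pf

def kf_analysis(xf, Pf, y, H, R):
    """Analysis step: correct the forecast with a new observation."""
    S = H @ Pf @ H.T + R                      # innovation covariance
    K = Pf @ H.T @ np.linalg.inv(S)           # Kalman gain
    xa = xf + K @ (y - H @ xf)                # analysis state
    Pa = (np.eye(len(xf)) - K @ H) @ Pf       # analysis error covariance
    return xa, Pa, K

# Toy setup (assumed): position/velocity state, position observed.
x0 = np.array([0.0, 1.0])                     # initial (background) state
P0 = np.eye(2)                                # initial error covariance
M = np.array([[1.0, 1.0], [0.0, 1.0]])        # transition operator
Q = 0.01 * np.eye(2)                          # model error covariance
H = np.array([[1.0, 0.0]])                    # observation operator
R = np.array([[0.5]])                         # observation error covariance
y = np.array([1.2])                           # observation

xf, Pf = kf_forecast(x0, P0, M, Q)
xa, Pa, K = kf_analysis(xf, Pf, y, H, R)
```

In this toy case the analysis moves the forecast toward the observation and reduces the total error variance (the trace of the covariance), including on the unobserved velocity through the cross-covariance in Pf.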
Application to Oceanography 4D-VAR and the Kalman filter lead to the same estimate at the end of the assimilation window when the system is linear, Gaussian and perfect. Nonlinear system: 4D-VAR cost function is non-convex, hence multiple minima; linearizing the system gives the suboptimal Extended KF (EKF). System dimension ~ 10^8: 4D-VAR control vector is huge; KF error covariance matrices are prohibitive. Error statistics: Poorly known. Non-Gaussian: KF is still the MV among linear estimators. Both approaches lead to the same solution at the end of the assimilation window for a linear, Gaussian and perfect system. The application of these techniques in oceanography is not very easy. First, the ocean system is nonlinear. For the variational approach this means that the cost function is non-convex, which also means several local minima, which seriously complicates the convergence of gradient-based optimization algorithms. A nonlinear system also means that the Kalman filter is no longer optimal, although one can linearize the system before applying the filter, which leads to the popular but suboptimal extended Kalman filter. A more visible problem is the size of the system, which is 10^8 and even 10^9. For the 4D-VAR problem this means a control vector of huge dimension and very few iterations because of the very heavy computational burden, and for the Kalman filter it means covariance matrices of the size of the system, which can never be manipulated. A less visible but equally important problem is that the error covariance matrices are very poorly known because of the lack of observations and their huge dimensions. These weighting matrices need to be accurately specified because they play an important role in the final solution. And when the errors are non-Gaussian, the KF is only optimal among linear estimators.
4D Variational Assimilation ECCO 1° Global Assimilation System. Eddy-Permitting 4D-VAR Assimilation. ECCO Assimilation Efforts at SIO: Tropical Pacific, San Diego, … Now that you have an idea of the four-dimensional assimilation problem in oceanography, I will first describe the variational assimilation systems we are working on at Scripps. We work closely with the ECCO group at MIT and JPL, and now in Hamburg, especially Armin Köhl, Detlef Stammer and Patrick Heimbach. I will start with a few words on the ECCO 1 degree global assimilation system, then I will discuss 4D-VAR assimilation with an eddy-permitting ocean model, before I show the assimilation results from two assimilation systems: one in the Tropical Pacific and one in San Diego. In collaboration with the ECCO group, especially Armin Köhl*, Detlef Stammer*, Patrick Heimbach** *Universität Hamburg/Germany, **MIT/USA
ECCO 1° Global Assimilation System Model: MITGCM (TAF-compiler enabled); NCEP forcing and Levitus initial conditions. Data: Altimetry (daily): SLA TOPEX, ERS; SST (monthly): TMI and Reynolds; Profiles (monthly): XBTs, TAO, Drifters, SSS, ...; Climatology (Levitus S/T) and Geoid (GRACE mission). Assimilation scheme: 4D-VAR with control of the initial conditions and the atmospheric forcing (with diagonal weights!!!). ECCO reanalysis: 1° global ocean state and atmospheric forcing from 1992 to 2004, …and from 1952 to 2001 (Stammer et al. …). The ECCO assimilation system is a 4D-VAR system and is based on the MIT general circulation model and its adjoint, which can be automatically generated using the TAF compiler. The model is forced with NCEP forcing and uses the Levitus climatology for initial conditions. In ECCO, they assimilate all kinds of data: daily altimetry, and monthly sea surface temperature and temperature and salinity profiles. They also constrain their solution to the Levitus temperature and salinity climatology. They assimilate these data into the MIT model using the 4D-VAR approach while controlling the initial conditions and the atmospheric forcing with diagonal weights, which means that they do not have any dynamical or smoothing constraints on the controls. Using this system, ECCO estimates are now available from 1992 to 2004. Another 50-year assimilation run from 1952 to 2001 is now under way. I note here that the ECCO system used to be run at Scripps before Detlef Stammer moved back to Germany.
ECCO Solution Fit Equatorial Undercurrent (EUC): Johnson, ECCO. The ECCO system seems to work pretty well. As an example, I will show you two figures comparing the ECCO mean SST and SSH with the mean of the observations. ECCO SST is quite close to the data. ECCO SSH exhibits all the large-scale SSH variability, but small-scale structures are often missing, and this is probably because of the low resolution of this system. At Scripps they had another consortium called CORC, funded by NOAA, to study the Tropical Pacific circulation using several data sets that have been collected by Scripps scientists and were assimilated in ECCO. When they looked at the ECCO solution in the Tropical Pacific, they found that many features were not well represented. For instance, the Equatorial Undercurrent (EUC) was very weak, and you can see that from this figure comparing the mean ECCO zonal velocity at the equator with Johnson's estimate. The shape of the ECCO EUC looks correct, but it is too weak. Other currents were also not well estimated. The tropical instability waves were not there. So they decided to implement an eddy-permitting ECCO assimilation system for the Tropical Pacific … and that's when I came to the US.
ECCO Tropical Pacific Configuration MITGCM Tropical Pacific Regional: 26°S to 26°N, 1/3°, 50 layers, ECCO O.B. Data: TOPEX, TMI SST, TAO, XBT, CTD, ARGO, Drifters; all at roughly daily frequency. Climatology: Levitus T and S, Reynolds SST and GRACE. Control: Initial conditions, 2-daily forcing, and weekly O.B. Smoothing: Smooth control fields using a Laplacian in the horizontal and first derivatives in the vertical and in time. First guess: Levitus (I.C.), NCEP (forcing), ECCO (O.B.). OB = (U,V,S,T). So we set up a 1/3 degree MIT model for the Tropical Pacific between 26 south and 26 north, with open boundaries in the south, in the north and in the Indonesian throughflow. Then we generated the adjoint model and we assimilated all the data we found in the Tropical Pacific, but this time at a higher temporal frequency than ECCO, and we assimilated velocities as well. We adjusted the initial conditions, the forcing fields, and the OB. We also added smoothing constraints on the control variables by penalizing any lack of smoothness. We used Levitus as first guess for the initial conditions, NCEP for the atmospheric forcing and ECCO state estimates for the open boundaries.
Eddy-Permitting 4D-VAR Assimilation The variables of the adjoint model exponentially increase in time. Typical behavior for the adjoint of a nonlinear chaotic model. Indicate unpredictable events and multiple local minima. Correct gradients but wrong sensitivities. Invalidate the use of a gradient-based optimization algorithm. Assimilate over short periods (2 months) where the adjoint is stable. Replace the original unstable adjoint with the adjoint of a tangent linear model which has been modified to be stable (Köhl et al., Tellus-2002). Exponentially increasing gradients were filtered out using larger viscosity and diffusivity terms in the adjoint model. We found that the variables of the adjoint model exponentially increase in time, which is typical behavior for the adjoint of a nonlinear chaotic model. This indicates unpredictable events and multiple tightly packed local minima in the cost function. The gradients we obtain from such a system are correct, but they do not represent the sensitivity to finite perturbations, which means that they are not useful for a gradient-based optimization algorithm. There are two ways to avoid this problem. Either we simply assimilate over short windows, two months in our case, where the linear assumption holds, but this means we will lose the information from future data and the system will not have enough time to adjust the heat and salinity fluxes and the open boundaries. Or we remove the unpredictable features from the adjoint by replacing it with the adjoint of a model that has been modified to remain stable … The gradients calculated by the "simplified adjoint" do not "see" the secondary minima and approximate the gradients of the envelope of the cost function. This definitely means that the features we are removing will not be efficiently controlled by the system, but this is the price to pay in order to be able to apply the optimization over long enough assimilation windows.
In our system, we used increased viscosity and diffusivity terms in the adjoint model as they set the degree of small-scale instabilities in the forward model.
HFL gradients after 45 days with increasing viscosities Visc = 1e11 & Diff = 4e2; 10*Visc & 10*Diff; 20*Visc & 20*Diff; 30*Visc & 30*Diff. I plotted here the gradients of the cost function with respect to the heat flux after 45 days as we obtained them from the original adjoint model, and from the adjoint model using 10, 20 and 30 times the viscosity and diffusivity terms. I chose day 45 because after a few days you will only see a few spots of high values, like the one you can see here. These spots were completely removed in the modified adjoint while the large scales are still well represented, and the larger the viscosity and diffusivity terms, the smoother the adjoint.
Initial temperature gradients after 1 year (2000) 10*Visc & 10*Diff; 20*Visc & 20*Diff. This is the same figure but for the gradients with respect to the initial temperature, so it is the adjoint output at the end of the assimilation window, which is one year in this run, and you can see how these waves were removed while the large scales are still there. Now I will show you some results of an assimilation run over a one-year period in 2000.
Data Cost Function Terms Here I plotted, for the assimilated data terms, the cost function from the assimilation run after 100 iterations (in blue) and from the reference run without assimilation, and you can see that we were able to significantly improve the model fit to all data sets.
Control Cost Function Terms This is the same figure but for the control terms this time, and here you can see that most of the adjustments were made to the wind stress.
Fit to Data This figure compares the mean SST and the mean zonal velocity on the equator from the reference and assimilation runs with TMI data and the Johnson analysis. The fit seems to be very good.
Assimilation Solution (weekly field end of August) This figure makes a similar comparison but for a given week, for TMI SST and for AVISO sea level anomalies, just to show you that the solution at a given time is also pretty reliable. Note that the small-scale features near the boundaries are not well reproduced, probably because of the low horizontal resolution of the model and also because of the use of higher viscosity and diffusivity terms in the adjoint run.
What Next … Fit is quite good and assimilation solution is reasonable. Extend assimilation period over several years. Add new controls to enhance the controllability of the system and reduce errors in the controls. Improve control constraints … The fit is quite good and the solution looks reasonable. We will extend the assimilation window to provide an analysis of the Tropical Pacific state over several years. We are planning to add new controls to enhance the controllability of the system in the interior domain and to improve the constraints on the control variables, mainly by adding dynamical constraints. We also need to look at the impact of the simplified adjoint on the control of the tropical instability waves. Some references: Hoteit et al. (QJRMS-2006), Hoteit et al. (JAOT-2007), Hoteit et al. (???-2007)
Other MITGCM Assimilation Efforts at SIO 1/10° CalCOFI 4D-VAR assimilation system. Predicting the loop current in the Gulf of Mexico … San Diego high frequency CODAR assimilation: Assimilate hourly HF radar data and other data. Adjoint effectiveness at small scale. Information content of surface velocity data. MITGCM with 1 km resolution and 40 layers. Control: I.C., hourly forcing and O.B. First guess: one profile T, S and TAU (no U, V, S/H-FLUX). Preliminary results: 1 week, no tides. We also have three other similar four-dimensional variational assimilation systems: one in the California Current system, very similar to the one I just presented; a new system we recently started working on for the prediction of the loop current in the Gulf of Mexico; and a coastal model for the San Diego area, for which I will quickly show an assimilation experiment. In this application we assimilate high-frequency radar radial velocity data every hour using a 1 km MITGCM with 40 layers and its adjoint, and we control the initial conditions as well as the atmospheric forcing and open boundaries on an hourly basis. As inputs we had only one single profile of salinity and temperature, and wind data. I will show you the results of a one-week assimilation experiment.
Model Domain and Radar Coverage Time evolution of the normalized radar cost Here I plotted the coverage of the three radars we have in San Diego. Of course we are not observing the currents over land; I should have removed that … In this figure I plotted the daily RMS fit to the data over the observed area from the reference run and from the assimilation run after different numbers of iterations, and the assimilation was able to improve the model over the whole assimilation period.
Assimilation Solution: SSH / (U,V) & Wind Adj. Finally, this figure shows the assimilation solution for the surface velocities and the sea surface height, where you can see that the estimated currents are mainly geostrophic, and in the right-hand figure I plotted the adjustments to the winds. You can see that the adjustments are concentrated over the observed area and are weaker toward the end of the assimilation period. More iterations and stronger constraints on the control would probably spread the information better over the non-observed area.
What Next … Assimilation over longer periods. Include tidal forcing. Coupling with an atmospheric model. Nesting into the CalCOFI model. Keeping this observation in mind, we will develop a new approximate solution of the optimal nonlinear filter which has a Kalman-type correction and a particle-type correction. So here is the outline of my presentation. First, a very brief description of the optimal filter. Next I will present the Kernel Particle Kalman filter, then a variant of this filter with low-rank covariance matrices for oceanic and atmospheric problems. And I will also show some numerical results.
Filtering Methods Low-Rank Extended/Ensemble Kalman Filtering. SEEK/SEIK Filters. Application to Mediterranean Sea. Kalman Filtering for Coupled Models. Particle Kalman Filtering. I will now present some examples of assimilation systems based on the Kalman filter. I will start with a brief overview of low-rank Kalman filters, then I will describe the so-called SEEK and SEIK filters, which we use to assimilate data into regional and coastal physical and biological models of the Mediterranean Sea. I will discuss the problem of Kalman filtering with coupled models, and I will finish with a few words on a new direction I am working on with Dinh-Tuan Pham, which I refer to as Particle Kalman filtering. In collaboration with D.-T. Pham*, G. Triantafyllou**, G. Korres** *CNRS/France, **HCMR/Greece
Low-rank Extended/Ensemble Kalman Filtering Reduced-order Kalman filters: Project the state on a low-dimensional subspace L; Kalman correction along the directions of L. Reduced error subspace Kalman filters: the error covariance has low rank. Ensemble Kalman filters: Monte Carlo approach to the error covariances. Correction along directions that evolve in time. To use the Kalman filter for data assimilation in a high-dimensional system, as in meteorology and oceanography, one has to find a way to reduce the size of the filter error covariance matrices. There are mainly three approaches that people use. The oldest and simplest one is to project the state of the system on a low-dimensional subspace L and then apply the Kalman filter in the reduced subspace. This means that the covariance matrices of the filter will be parameterized in the reduced subspace and will be of low rank, and the Kalman correction will only be applied in the directions of L, which are usually called the correction directions of the filter. This is the reduced-order extended Kalman filter. I have to note that this is possible because of the red spectrum that governs the variability of the ocean and the atmosphere. Another approach consists of assuming that the estimation error is in a reduced subspace, and this can be achieved by assuming that the filter error covariance matrices have low rank. I highlighted k in red to show you that the correction directions evolve in time to follow changes in the model dynamics. People refer to this approach as the reduced error subspace approach. There is also the popular ensemble Kalman approach, which uses an ensemble of perturbations to represent the covariance matrices of the filter. And there is a whole collection of ensemble Kalman filters now …
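To make the Monte Carlo representation concrete, the short Python sketch below builds the sample covariance of a small ensemble and checks that its rank is limited by the ensemble size, which is what makes ensemble filters low-rank filters in practice. The sizes and random members are illustrative assumptions.

```python
import numpy as np

# Assumed toy sizes: a 20-variable state represented by 5 ensemble members.
rng = np.random.default_rng(1)
n_state, n_members = 20, 5
ensemble = rng.standard_normal((n_state, n_members))   # one member per column

# Scaled anomalies about the ensemble mean.
mean = ensemble.mean(axis=1, keepdims=True)
anomalies = (ensemble - mean) / np.sqrt(n_members - 1)

# Sample covariance; in a real system this n x n matrix is never formed,
# only the n x N anomaly matrix is stored.
P_ens = anomalies @ anomalies.T

# Removing the mean costs one degree of freedom, so the rank is at most N - 1.
rank = np.linalg.matrix_rank(anomalies)
```

The filter correction can therefore only act along the few directions spanned by the anomalies, in the same way that a reduced-order filter only corrects along its chosen directions.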
Singular Evolutive Extended Kalman (SEEK) Filters Low-rank (r) error subspace Kalman filters: P = L U L^T; Forecast; Analysis. A “collection” of SEEK filters: SEEK: Extended variant; SFEK: Fixed variant; SEIK: Ensemble variant with (r+1) members only! (~ETKF). Inflation and Localization. Among these Kalman filters, I use the singular evolutive extended Kalman filters, or the SEEK filters. These are reduced error subspace Kalman filters in which the Kalman filter covariance matrices are always decomposed as L U L^T. Only L and U are needed for the filter's algorithm; P does not show up in it. The filter operates as a succession of two steps: a forecast step to compute the forecast and to update L, and an analysis step to correct the forecast with the new observation after updating U. Several variants of the SEEK filter were introduced. They mainly differ in the way they update the correction directions L. In the standard SEEK filter we use the tangent linear model, so it is a simplified extended Kalman filter. In the so-called SFEK filter (F for fixed) the correction directions remain invariant. In this case the computational cost of the filter is greatly reduced, because the update equation of L is by far the most expensive operation in the filter, as it requires r+1 integrations of the model, where r is the rank of the filter. There is also the SEIK filter, which is the ensemble variant of the SEEK filter and only requires ensembles of r+1 members. This filter is basically very close to the ETKF, except that it uses a stochastic approach to sample the analysis members. And of course in all these filters we use inflation and localization of the covariance matrices.
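A minimal sketch of an analysis step that works only with L and U is given below in Python, assuming a linear observation operator. It follows the usual low-rank form in which the inverse of U is updated and the gain is built from L, and it checks the result against the full Kalman gain computed from P = L U L^T. The function name, the toy dimensions and the optional `rho` argument (a forgetting-factor style inflation, which may not be the same as the inflation factor quoted in the talk) are assumptions for illustration.

```python
import numpy as np

def seek_analysis(xf, L, U, y, H, R, rho=1.0):
    """Low-rank analysis using only L (n x r) and U (r x r); P is never formed.

    rho is a hypothetical forgetting factor in (0, 1]; rho = 1 means no inflation.
    """
    HL = H @ L                                   # correction directions in observation space
    Rinv = np.linalg.inv(R)
    Ua_inv = rho * np.linalg.inv(U) + HL.T @ Rinv @ HL
    Ua = np.linalg.inv(Ua_inv)                   # updated r x r matrix
    K = L @ Ua @ HL.T @ Rinv                     # gain acts along the columns of L
    xa = xf + K @ (y - H @ xf)
    return xa, Ua, K

# Toy setup (assumed): 6 state variables, rank 2, 3 observations.
rng = np.random.default_rng(2)
n, r, p = 6, 2, 3
L = rng.standard_normal((n, r))
U = np.diag([2.0, 0.5])
H = rng.standard_normal((p, n))
R = 0.1 * np.eye(p)
xf = rng.standard_normal(n)
y = rng.standard_normal(p)

xa, Ua, K = seek_analysis(xf, L, U, y, H, R)

# Reference: full Kalman gain and analysis covariance with Pf = L U L^T.
Pf = L @ U @ L.T
K_full = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)
Pa_full = (np.eye(n) - K_full @ H) @ Pf
```

All matrix inversions in the low-rank form are of size r or of the observation error covariance, never of the state dimension, which is the point of keeping the covariance in factored form.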
The Work Package WP12 in MFSTEP EU project between several European institutes. Assimilate physical & biological observations into coupled ecosystem models of the Mediterranean Sea: Develop coupled physical-biological model for regional and coastal areas of the Mediterranean Sea. Implement Kalman filtering techniques with the physical and biological model … Investigate the capacity of surface observations (SSH, CHL) to improve the behavior of the coupled system. I will now show you two examples of application of these filters in the European MFSTEP project. The tasks of the work package we are involved in consist of developing coupled regional and coastal bio-physical models of the Mediterranean Sea, then using Kalman filters to assimilate available observations into these models. Once the system is successfully implemented, different questions can be considered.
The Coupled POM – BFM Model One-way coupled: Ecology does not affect the physics. The coupled ecosystem model is composed of the Princeton Ocean Model (POM), which is a primitive equation ocean model, and the Biogeochemical Flux Model (BFM). BFM represents the new generation of the ERSEM model. It is a generic, highly complex model with 88 variables. Different organisms are separated according to their trophic level: producers, consumers, and decomposers. As it is implemented here, only the physics affects the biology. In the advection-diffusion-reaction equation of the biology, we use the velocity fields and the eddy diffusivity and viscosity parameters. Temperature, salinity and light are also used in some specific routines.
A Model Snapshot 1/10° Eastern Mediterranean configuration, 25 layers. Elevation and Mean Velocity. Mean CHL integrated 1-120m. Here is a snapshot from a 1/10 degree eastern Mediterranean configuration with 25 sigma layers. This is just to show you how closely the physics and the ecology can be related. For instance, the elevation map shows two cyclones: one is the Rhodes cyclone and the other is west of Crete. Both produce a local CHL maximum, as they bring up nutrient-rich waters from the deep ocean. There is also another CHL maximum in the Aegean close to Turkey caused by upwelling.
Assimilation into POM Model = 1/10° Mediterranean configuration with 25 layers. Observations = Altimetry, SST, T & S profiles, Argo data, and XBTs on a weekly basis. SEIK Filter with rank 50 (51 members). Initialization = EOFs computed from 3-day outputs of a 3-year model integration. Inflation factor = 0.5. Localization = 400 km. First I will quickly show you assimilation results with POM. It is a 1/10 degree configuration of the whole Mediterranean Sea. We assimilated satellite SSH and SST, T & S profiles, Argo data, and XBTs on a weekly basis using the SEIK filter with rank 50. We tried invariant correction directions but we were not very happy with the results. The initial correction directions are EOFs computed from a sample of model outputs. The inflation factor is 0.5 and the localization radius is 400 km.
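The EOF initialization can be sketched as a singular value decomposition of the anomalies of a set of model snapshots. The Python example below is a minimal illustration on synthetic snapshots; the function name `eof_init`, the sizes and the synthetic data are assumptions, and a real multivariate application would typically also normalize the different variables before the decomposition.

```python
import numpy as np

def eof_init(snapshots, r):
    """Return mean state, r leading EOFs (columns), their variances, explained fraction.

    snapshots has shape (n_state, n_snapshots), one model output per column.
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    anomalies = snapshots - mean
    modes, s, _ = np.linalg.svd(anomalies, full_matrices=False)
    variances = s**2 / (snapshots.shape[1] - 1)
    explained = variances[:r].sum() / variances.sum()
    return mean[:, 0], modes[:, :r], np.diag(variances[:r]), explained

# Synthetic snapshots (assumed): two dominant patterns plus weak noise.
rng = np.random.default_rng(3)
n_state, n_snap = 10, 40
patterns, _ = np.linalg.qr(rng.standard_normal((n_state, 2)))
amps = rng.standard_normal((2, n_snap)) * np.array([[3.0], [1.0]])
snapshots = patterns @ amps + 0.01 * rng.standard_normal((n_state, n_snap))

xb, L0, U0, explained = eof_init(snapshots, r=2)
```

Here `xb`, `L0` and `U0` would play the roles of the initial state, the initial correction directions and the initial reduced covariance of a low-rank filter, with `explained` indicating how much of the sampled variability the retained modes capture.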
Assimilation into POM SSH RMS Misfits Time series: Free-Run, Forecast, Analysis. Maps: Mean Free-Run RMS Error, Mean Forecast RMS Error, Mean Analysis RMS Error. Obs Error = 3 cm. Here I plotted the time evolution of the RMS SSH misfit for the model free run without assimilation in black, and from the filter for the forecast in blue and for the analysis in red. The filter improves the model behavior and brings the estimation error of the state below the specified observational error of 3 cm. On the right I plotted the spatial distribution of the RMS averaged over time, just to show you that the filter analysis was efficient all over the model domain.
Salinity RMS Error Time Series FerryBox data at Rhone River. SSH 07/12/2005: SATELLITE SSH, FREE RUN, FORECAST, ANALYSIS. Here I plotted the time evolution of the RMS misfit between the model salinity and FerryBox measurements along a commercial ferry route. I am sorry for this figure because it is not very clear, but the message is that the filter was also able to improve the salinity estimates over this localized path. On the right side I plotted a snapshot of an SSH field at the beginning of December as it results from the model and the filter, compared with the AVISO data, just to show you that most of the currents were well captured by the forecast and the analysis.
Assimilation into BFM Model = 1/10° Eastern Mediterranean with 25 layers with perfect physics Observations = SeaWiFS CHL every 8 days in 1999 SFEK Filter = SEEK with invariant correction subspace Correction subspace = 25 EOFs computed from 2-day outputs of a one-year model integration Inflation factor = 0.3 Localization = 200 km Now I will show the results of another application assimilating SeaWiFS CHL data into a 1/10 degree configuration of the BFM model in the Eastern Mediterranean. Here the physical model is assumed perfect. In this system, we used the SFEK filter with 25 invariant correction directions to reduce computational time. We used this filter because it provided quite good behavior at an acceptable computational cost. These directions were obtained by applying a multivariate EOF analysis on a sample of model outputs. The inflation factor was set to 0.4 and the localization to 200 km.
Assimilation into BFM CHL RMS Misfits [Figure labels: Free-Run, Forecast, Analysis] This figure plots the RMS of the model-data misfit. The left panel shows the time evolution for the model free run, the filter forecast, and the analysis. The right panel shows the spatial distribution of the time-averaged RMS. You can see that the filter was always able to improve the model fit to the data in space and time, except over the bloom period. We expect that evolving the correction directions will help here.
CHL Cross-Section at 34°N Ph Cross-Section at 28°E Here I plotted, in depth, the misfit between the Medatlas climatology and the reference solution (top panels) and the assimilation solution (bottom panels), for chlorophyll at 34 degrees north and phosphate at 28 degrees east. These are independent estimates, and you can see that the assimilation has a mainly positive impact at depth and on non-observed variables. In this system we assumed perfect physics, and we know that this is not true. To improve this system we definitely need to force the biology with good physics, and to do that we should use our best possible estimates of the physics, which means we need to simultaneously assimilate physical and biological data into the coupled model.
Kalman Filtering for Coupled Models Physical System Ecological System MAP: Direct maximization of the joint conditional density standard Kalman filter estimation problem Joint approach: strong coupling and same filter (rank) !!! We first write the estimation problem for this coupled system, where the state of the ecology depends on the state of the physics. The MAP estimator can be obtained by maximizing the conditional probability density of the joint physical/biological state given the joint physical/biological observations. Direct optimization leads to a Kalman filter acting on the joint state vector and assimilating physical and biological data. This is known as the joint approach. Its first problem is that it produces strong coupling between the two subsystems, which might limit the fit to these different data sets. Another problem is the heavy computational burden, because in this approach we have to use the same rank and the same evolution for the correction directions in both subsystems.
Dual Approach Decompose the joint density into marginal densities Compute MAP estimators from each marginal density Separate optimization leads to two Kalman filters … Different degrees of simplification and ranks for each filter significant cost reduction Same from the joint or the dual approach The physical filter assimilates and Another way to solve this problem is to use the so-called dual approach. This approach was originally used by Nelson to estimate the state and the parameters of a given model. Here we generalize it to the estimation of the states of two observed coupled models. The idea is to decompose the conditional density that we want to optimize into a product of two marginal densities. Then by maximizing the two densities separately we end up with two Kalman filters, one acting on the physics and one on the ecology. The two filters operate separately but not independently. This allows us to apply a different filter and a different rank to each subsystem, which means a significant reduction in computational cost and more degrees of freedom to fit the data.
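The decomposition described in the notes can be written out as follows. The symbols are assumptions introduced here because the slide's own notation did not survive: $x^p$ and $x^e$ for the physical and ecological states, $y^p$ and $y^e$ for the physical and biological observations. This is only the generic product rule for conditional densities, not necessarily the exact form shown on the slide.

```latex
% Assumed notation: x^p, x^e = physical / ecological state,
%                   y^p, y^e = physical / biological observations.
\begin{align}
p\left(x^p, x^e \mid y^p, y^e\right)
  &= p\left(x^p \mid y^p, y^e\right)\,
     p\left(x^e \mid x^p, y^p, y^e\right), \\
\hat{x}^p &= \arg\max_{x^p}\; p\left(x^p \mid y^p, y^e\right), \\
\hat{x}^e &= \arg\max_{x^e}\; p\left(x^e \mid \hat{x}^p, y^p, y^e\right).
\end{align}
```

Only the second factor involves the ecological state, so each factor can be handled by its own Kalman filter, with the ecological filter driven by the physical estimate.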
Twin-Experiments 1/10° Eastern Mediterranean (25 layers) RRMS for state vectors Joint: SEEK rank-50 Dual: SEEK rank-50 for physics, SFEK rank-20 for biology [Figure panels: Physics, Biology; curves: Ref, Dual, Joint] I will show very quickly the result of a twin experiment with these two approaches with the 1/10 degree eastern Mediterranean model. In the joint approach we used the standard SEEK filter with rank 50. For the dual approach, we used the SEEK filter with rank 50 for the physics and the SFEK filter with rank 20 for the ecology. Here I plotted the RRMS for the whole state vector, for the physics in the left panel and for the ecology in the right panel. Of course, every variable was normalized by the spatial average of its standard deviation. The model without assimilation could not reduce the initial error, and both the joint and the dual approaches significantly improve the model behavior. For the physics, the dual approach is slightly better, probably because the physical filter has more degrees of freedom to fit the data. For the ecology, the joint approach is better, but this can be expected because of the less sophisticated filter we used in the dual approach.
What Next … Joint/Dual Kalman filtering with real data State/Parameter Kalman estimation Better account for model errors Some references Hoteit et al. (JMS-2003), Triantafyllou et al. (JMS-2004), Hoteit et al. (NPG-2005), Hoteit et al. (AG-2005), (Hoteit et al., 2006), Korres et al. (OS-2007) What is next … We need to test the joint and the dual approach with real data. Several poorly known parameters in the ecological model need to be estimated, and we need to include these parameters in the estimation process. We need to better account for the model error; simple inflation is certainly not enough. But to do this we need to have some information about the model error. A plausible approach would be to parameterize it and then include it in the estimation process. I am also interested in improving the dynamical balance of the Kalman analysis step.
Nonlinear Filtering - Motivations The EnKF is “semi-optimal”; its analysis step is linear The optimal solution can be obtained from the optimal nonlinear filter which provides the state pdf given previous data Particle filter (PF) approximates the state pdf by a mixture of Dirac functions but suffers from the collapse (degeneracy) of its particles (the analysis step only updates the weights) Surprisingly, recent results suggest that the EnKF is more stable than the PF for small ensembles because the Kalman correction attenuates the collapse of the ensemble I will finish this presentation with a few words about a new approach that we are working on that might allow an efficient implementation of the optimal nonlinear filter for meteorological and oceanic problems. First, what are the motivations for this work? As you know, the ensemble Kalman filter is only semi-optimal, in the sense that its analysis step is linear and based on a Gaussian assumption. When the model is nonlinear, we can determine the state pdf given the observations using the optimal nonlinear filter, but the implementation of this filter can be quite expensive even for a system with only a few dimensions. The so-called particle filter is a discrete solution of the optimal nonlinear filter and makes use of a mixture of Dirac functions to approximate the pdf of the filter. The problem is that this filter requires a large number of particles because it suffers strongly from the collapse of its particles, as the filter only updates the weights and lets the particles propagate with the model. Recently, a few studies suggested that the ensemble Kalman filter can be more stable than the particle filter when small ensembles are used, because the Kalman correction of the particles attenuates the risk of degeneracy.
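The collapse of the particle weights can be seen in a small sketch. It assumes a scalar state observed directly with Gaussian observation error, and it uses the effective sample size $1/\sum_i w_i^2$ as a degeneracy diagnostic; both choices, and the function names, are illustrative assumptions rather than anything specified in the talk.

```python
import numpy as np

def update_weights(weights, particles, y, obs_std):
    """Particle-filter analysis: reweight particles by the Gaussian likelihood of y.

    The particles themselves are left untouched; only the weights change.
    """
    log_lik = -0.5 * ((y - particles) / obs_std) ** 2
    log_w = np.log(weights) + log_lik
    log_w -= log_w.max()          # guard against underflow before exponentiating
    w = np.exp(log_w)
    return w / w.sum()

def effective_sample_size(weights):
    """Equals the ensemble size for uniform weights, 1 when one particle dominates."""
    return 1.0 / np.sum(weights**2)

rng = np.random.default_rng(1)
n = 50
particles = rng.standard_normal(n)       # scalar state drawn from a N(0, 1) prior
weights = np.full(n, 1.0 / n)
ess_before = effective_sample_size(weights)

# An accurate observation far in the tail of the prior (assumed values).
weights = update_weights(weights, particles, y=2.5, obs_std=0.1)
ess_after = effective_sample_size(weights)
print(ess_before, ess_after)
```

After the update most of the weight sits on the few particles closest to the observation, which is the degeneracy the slide refers to.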
The Particle Kalman Filter (PKF) The PKF uses a kernel estimator to approximate the pdfs of the nonlinear filter by a mixture of Gaussian densities The state pdfs can always be approximated by a mixture of Gaussian densities of the same form: Analysis Step: Kalman-type: EKF analysis to update and Particle-type: weight update (but using instead of ) Forecast Step: EKF forecast step to propagate and Resampling Step: … I will present a new filter that generalizes the correction step of the EnKF to nonlinear systems. The idea is to follow the kernel method to approximate the state pdfs by a mixture of Gaussian densities of the following form. Then, under some assumptions, one can show that the pdfs of the optimal nonlinear filter can always be approximated by mixtures of Gaussian densities of the same form. This leads to a filter which operates as follows. For the analysis, there are two steps: one Kalman analysis to update the particles and the associated error covariance matrices, and one particle-type correction to update the weights. Then there is a forecast step to propagate the particles in time. A resampling step, which I will not talk about here, might sometimes be needed.
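A minimal sketch of such an analysis step is given below, under simplifying assumptions that are not from the talk: the observation operator is a fixed linear matrix `H` (so no EKF linearization is needed), the mixture is stored as arrays of means, covariances, and weights, and the weights are updated with the Gaussian likelihood of the innovation using the innovation covariance `H P H^T + R`. This is the standard Gaussian-sum update and should be read as an illustration, not as the author's implementation.

```python
import numpy as np

def pkf_analysis(means, covs, weights, y, H, R):
    """Gaussian-mixture analysis: Kalman update per component, then reweight."""
    new_means, new_covs, log_w = [], [], []
    for x, P, w in zip(means, covs, weights):
        S = H @ P @ H.T + R                    # innovation covariance
        K = np.linalg.solve(S, H @ P).T        # gain P H^T S^-1 (P and S symmetric)
        d = y - H @ x                          # innovation
        new_means.append(x + K @ d)            # Kalman-type correction of the particle
        new_covs.append((np.eye(len(x)) - K @ H) @ P)
        _, logdet = np.linalg.slogdet(S)
        # Particle-type correction: Gaussian likelihood of the innovation.
        log_w.append(np.log(w) - 0.5 * (d @ np.linalg.solve(S, d) + logdet))
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())
    return np.array(new_means), np.array(new_covs), w / w.sum()

# Toy two-component mixture in a 2-variable state; only the first variable is observed.
means = np.array([[-1.0, 0.0], [2.0, 0.0]])
covs = np.array([0.5 * np.eye(2), 0.5 * np.eye(2)])
weights = np.array([0.5, 0.5])
H = np.array([[1.0, 0.0]])
R = np.array([[0.1]])
y = np.array([1.8])

new_means, new_covs, new_weights = pkf_analysis(means, covs, weights, y, H, R)
print(new_means[:, 0], new_weights)
```

With a single component this reduces to the usual Kalman analysis, and with vanishing component covariances it reduces to the weight-only update of the particle filter.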
Particle Kalman Filtering in Oceanography It is an ensemble of extended Kalman filters with weights!! Particle Kalman Filtering requires simplification of the particles' error covariance matrices The EnKF can be derived as a simplified PKF Hoteit et al. (MWR-2007) successfully tested one low-rank PKF with twin experiments What Next … Derive and test several simplified variants of the PKF Assess the relevance of a nonlinear analysis step: comparison with the EnKF Assimilation of real data … So basically it is an ensemble of extended Kalman filters with weights; that's why I refer to it as a particle Kalman filter. But it is already very difficult to implement one Kalman filter, and here I am talking about an ensemble of Kalman filters. So approximations are inevitable. The simplest variant leads to the ensemble Kalman filter. We can also imagine several other variants through other simplifications and/or parameterizations of the filter error covariance matrices, and we actually did that already in a recent work. Our next steps will be to develop and test other variants of the PKF while considering the ensemble Kalman filter as the reference to beat. And again, this approach needs to be tested with real data.
Discussion and New Applications Advanced 4D data assimilation methods can now be applied to complex oceanic and atmospheric problems More work is still needed for the estimation of the error covariance matrices, the assimilation into coupled models, and the implementation of the optimal nonlinear filter New Applications: ENSO prediction using neural models and Kalman filters Hurricane reconstruction using 4D-VAR ocean assimilation! Ensemble sensitivities and 4D-VAR Optimization of glider trajectories in the Gulf of Mexico … And this brings me to the conclusion. We now have enough tools to apply advanced four-dimensional data assimilation techniques to complex oceanic and atmospheric problems, and we should do that. More work is still needed for the estimation of the model error and the assimilation into coupled models, and why not try to improve the behavior of the filters by directly approximating the optimal nonlinear filter rather than approximating the linear Kalman filter. And a few words about some new applications I am working on: a low-order ENSO assimilation system with Sergey Frolov from OSU and Armin Köhl from Germany. This month we started a new application of the 4D-VAR MITGCM to see if we can reconstruct hurricanes from ocean data, and what we can reconstruct. This work is in collaboration with Sarah Zedler from Texas A&M. I am discussing with Jeffrey Anderson ideas to use the so-called ensemble sensitivities in a 4D-VAR setup. We are also thinking about optimizing glider trajectories in order to improve the prediction of the Loop Current. And this will end my talk. Thank you.
4D-VAR or (Ensemble) Kalman Filter? EnKF Easier to understand More portable (easier to implement?) No low-rank deficiency Support different degrees of simplifications Easier to incorporate a complex background covariance matrix Low-rank estimates of the error cov. matrices (better forecast!) Dynamically consistent solution Still room for improvement … I highlighted the most important advantages in red because, for me, one should make the decision based on these two statements. Ensemble methods will always be rank deficient, and 4D-VAR methods have more or less reached their peak and it is difficult to improve them further. So which one is better? It is your call … 4D-VAR or EnKF? …
Sensitivity to first guess (25 Iterations) Results of assimilation starting the optimization from the NCEP or QSCAT winds. QSCAT winds improve the starting cost function, but after 30 iterations, the cost function stabilized at about the same level for either starting point. Mean adjustments to TAUU starting the optimization from NCEP or QSCAT winds. The TAUU adjustments seem to converge toward similar solutions.
Comparison with TAO-Array RMS RMS Zonal Velocity (m/s) 1/6;39 1/3;39 RMS Meridional Velocity (m/s) Mean SST (left panel) and mean zonal velocity on the equator (right panel) as they result from the reference and assimilation runs compared with TMI data and Johnson analysis, respectively. Overall, model/data consistency is improved, except that the core of the EUC becomes … 1;39 1;23
San Diego HF Radar Currents Assimilation Assimilate hourly HF radar data and other data Goals: Adjoint effectiveness at small scale Information content of surface velocity data Dispersion of larvae, nutrients, and pollutants MITGCM with 1 km resolution and 40 layers Control: I.C., hourly forcing, and O.B. First guess: one profile, no U and V, and no forcing! Preliminary results: 1 week, no tides In this application we assimilate high-frequency radar radial velocity data every hour using the MITGCM and its adjoint, and we control the initial conditions as well as the atmospheric forcing and open boundaries on an hourly basis. We used only one single profile of salinity and temperature to initialize the reference run, and we ran it without forcing for 1 week.
Cost Function terms The adjoint method greatly improves the model consistency with the HF radar velocity data. Overall, the total cost function is reduced by almost 80%. Adjustment terms for the control variables are zero at the start of the assimilation. The contribution of these terms to the total cost function is not significant after assimilation.
Assimilation Solution: U & V Comparison between reference run solution and assimilation solution for the surface components of the horizontal velocity (cm/s) at day 5. The effects of the zero boundary conditions in the reference run are clear.
Assimilation Solution Cross-section at 32.5°N of the assimilation solution for temperature, salinity, zonal and meridional velocities at day 5. The density differences have not reached into the deeper ocean, although some currents are visible.
Gulf of Mexico Loop current prediction Observations HF radar, Gliders, ADCP, … Adjoint effectiveness … 1/10° with 50 layers Ctrl: I.C. (S,T), daily forcing, and weekly O.B. Proof of concept assimilating SSH, Levitus and Reynolds … Keeping this observation in mind, we will develop a new approximate solution of the optimal nonlinear filter which has a Kalman-type correction and a particle-type correction. So here is the outline of my presentation. First, a very brief description of the optimal filter. Next I will present the Kernel Particle Kalman filter, then a variant of this filter with low-rank covariance matrices for oceanic and atmospheric problems. And I will also show some numerical results.
What Next … Ensemble forecasting Ensemble Kalman filtering Optimization of observation systems
Twin-Experiments Setup Model = 1/10° Eastern Mediterranean with 25 layers 1996 2000 05/03/02 05/03/02 Spin up EOFs REF – OBS 4 years 2 years 3 months Pseudo-obs: SSH and CHL surface data every 3 days Initialization: start from mean state of the 2-year run Free-run: run without assimilation starting from mean state Evaluation: RMS misfit relative to the misfit from mean state I will finally show you preliminary results of simultaneous assimilation of SSH and CHL data in a twin-experiments setup. We used the 1/10 degree eastern Mediterranean configuration. First we performed a model spin-up for 4 years. Then we ran the model for two years to generate a historical ensemble of model outputs from which we computed a set of EOFs. An additional run for 3 months between March and June was finally performed to generate the reference states for the twin experiments. These states are used to generate the pseudo-observations and to validate the filter's performance. We assumed SSH and CHL were observed every 5 grid points, every 2 days. We start the filter from the mean state of the 2-year run. It is like assuming that the only error in the model is due to initial conditions. We also performed a free run starting from the filter initial conditions, to compare the filter behavior with the model run without assimilation. I will show the results of the filter in terms of the RMS misfit relative to the RMS misfit when there is no assimilation and the mean state is considered as the best estimate.
Low-rank Extended/Ensemble Kalman Filtering Reduced-order Kalman filters: Project x on a low-dim subspace Analysis along the directions of Reduced error subspace Kalman filters: has low-rank Ensemble Kalman filters: Monte Carlo approach to If Pa or Pf is of low rank, then the other one has the same rank [Figure: schematic of the EnKF and SPF cycles; labels: Model, Data, Resampling]
Low-Rank Deficiency Issues Error covariance matrices are underestimated Few degrees of freedom to fit the data Amplification by an inflation factor Localization of the covariance matrix (using Schur product) In the equation of U, I dropped the term representing the model error and replaced it by a simple amplification factor. The amplification factor is also used to account for other sources of errors in the filter.
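The two remedies listed on this slide can be sketched on a toy ensemble. The code assumes a one-dimensional grid, a multiplicative inflation of the ensemble anomalies by a factor greater than one, and a Gaussian taper for the Schur (element-wise) product; the taper shape, the inflation convention, and the function names are assumptions made for illustration. In particular, how the factors of 0.3 to 0.5 quoted in the talk map onto this convention is not stated in the source.

```python
import numpy as np

def inflate(anomalies, factor):
    """Multiplicative inflation: the sample covariance grows by factor**2."""
    return factor * anomalies

def localize(cov, positions, radius):
    """Schur (element-wise) product of cov with a distance-based taper."""
    dist = np.abs(positions[:, None] - positions[None, :])
    taper = np.exp(-0.5 * (dist / radius) ** 2)   # Gaussian taper (assumed shape)
    return cov * taper

rng = np.random.default_rng(2)
n_grid, n_members = 40, 10
ensemble = rng.standard_normal((n_grid, n_members))
anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
cov = anomalies @ anomalies.T / (n_members - 1)

inflated = inflate(anomalies, 1.1)
cov_inflated = inflated @ inflated.T / (n_members - 1)

positions = np.arange(n_grid, dtype=float)        # grid-point indices as coordinates
cov_loc = localize(cov_inflated, positions, radius=2.0)

rank_before = np.linalg.matrix_rank(cov_inflated)  # at most n_members - 1
rank_after = np.linalg.matrix_rank(cov_loc)        # raised by the Schur product
print(rank_before, rank_after)
```

Inflation leaves the rank unchanged and only enlarges the spread, whereas the Schur product damps long-range sample correlations and raises the rank of the covariance matrix, giving the analysis more degrees of freedom to fit the data.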
Joint Approach Direct maximization of the joint conditional density standard Kalman filter estimation problem acting on and assimilating Issues: Strong coupling and same filter (rank) We can solve this optimization problem directly by considering the joint state vector x, which contains the state vectors of the physics and the ecology, and simultaneously assimilating the physical and ecological data. So we end up with one single Kalman filter acting on the joint state vector of the two sub-models. The problem with this approach is that the coupling between the two systems can be very strong, so that we might not have enough degrees of freedom to fit the data. In our case we will also be forced to use the same rank for both sub-systems and the same update equation for the correction directions, which can be quite expensive to implement.
Dual Approach – Some Facts Only the second marginal density depends on , this means same from the joint or the dual approach does not depend on , more in line with the one-way coupling of the system The physical filter assimilates both and : assimilation of guarantees consistency between the two subsystems The ecological filter assimilates only , but it is forced with the solution of the physical filter The linearization of the observational operator in the physical filter is a complex operation because of the dependency of the ecology on , it was neglected in this preliminary application Now some remarks about the dual approach. Only the first marginal density depends on the ecological state, so the dual and the joint solutions are identical for the ecology in the case of a linear Gaussian model. The estimate of the physical state does not depend on the estimate of the ecological state, which is more in line with the one-way coupling of the system and leaves more degrees of freedom to fit the physical observations. The physical filter assimilates both physical and biological data. This guarantees consistency between the solutions of the two filters. The ecological filter assimilates only the biological data, but is forced with the estimate of the physical state. I should mention here that there is a complex linearization in the dual approach that is due to the dependency of the ecological state on the physical state. In this application we just neglect it.
The Optimal Nonlinear Filter Like the Kalman filter, it operates as a succession of forecast and analysis steps to update the state pdf: Forecast Step: Integrate the analysis pdf with the model Analysis Step: Correct the predictive pdf with the new data Particle Filter approximates the state pdf by a mixture of Dirac functions but suffers from degeneracy. The filter operates recursively as a succession of a prediction step and an analysis step. The prediction pdf is obtained by integrating the most recent analysis pdf with the model dynamics. The transition density used in this integration is Gaussian because of the Gaussian assumption on the model error. When a new observation is available, we recover the analysis pdf from the predictive pdf using Bayes' rule. The analysis density is obtained by multiplying the predictive density by the observation density and then normalizing by a constant to ensure a probability density. As I said before, the particle filter uses a mixture of Dirac functions to approximate the state pdf. Here we will use a mixture of Gaussian densities.
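The two steps can be written in their standard textbook form as follows. The state $x_k$, the observation $y_k$, the operators $M$ and $H$, and the model error covariance $Q$ follow the notation introduced at the start of the talk; the observation error covariance $R$ and the shorthand $y_{1:k}$ for all data up to time $k$ are assumptions added here, and the slide's own equations may have been laid out differently.

```latex
% Forecast step: integrate the analysis pdf with the model (Gaussian model error).
\begin{align}
p\left(x_k \mid y_{1:k-1}\right)
  &= \int \mathcal{N}\!\left(x_k;\, M(x_{k-1}),\, Q\right)\,
     p\left(x_{k-1} \mid y_{1:k-1}\right)\, dx_{k-1} \\
% Analysis step: Bayes' rule with a Gaussian observation density.
p\left(x_k \mid y_{1:k}\right)
  &= \frac{\mathcal{N}\!\left(y_k;\, H(x_k),\, R\right)\,
           p\left(x_k \mid y_{1:k-1}\right)}
          {\int \mathcal{N}\!\left(y_k;\, H(u),\, R\right)\,
           p\left(u \mid y_{1:k-1}\right)\, du}
\end{align}
```

The denominator in the analysis step is the normalizing constant mentioned in the notes.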
New Directions/Applications New Applications ENSO prediction using surrogate models and Kalman filters Hurricane reconstruction using 4DVAR ocean assimilation! Ensemble sensitivities and 4DVAR Other Interests Optimal Observations Estimate Model and Observational Errors Estimate Background Covariance Matrices in 4DVAR Study the behavior of the different 4DVAR methods with highly nonlinear models