Methods for dealing with spurious covariances arising from small samples in ensemble data assimilation. Jeff Whitaker, jeffrey.s.whitaker@noaa.gov, NOAA Earth System Research Lab, Boulder, CO.


Methods for dealing with spurious covariances arising from small samples in ensemble data assimilation
Jeff Whitaker, jeffrey.s.whitaker@noaa.gov, NOAA Earth System Research Lab, Boulder.
Outline: What is ensemble data assimilation? What are the consequences of sampling error? Covariance localization. Alternatives to covariance localization.

Ensemble data assimilation
Parallel forecast and analysis cycles. Background errors are estimated from sample covariances and therefore depend on the weather situation. Ensemble forecasting is well established in NWP, but analysis schemes are still mostly deterministic (only a single background forecast is evolved). Use the dynamically evolved ensemble to estimate the 'errors of the day' (examples follow).

Ensemble Kalman Filter
The ensemble mean is updated via the Kalman filter equations. H is the operator that takes the model state vector and converts it to predicted observations.
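The transcript does not reproduce the slide's equations; in standard notation, the mean update is

    \bar{x}^a = \bar{x}^b + K\,(y - H\bar{x}^b), \qquad K = P^b H^T \left(H P^b H^T + R\right)^{-1}

where y is the observation vector and R is the observation-error covariance.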

Ensemble Kalman Filter
k ensemble members from a forecast model; ensemble (sample) mean. Instead of propagating the covariance matrix, the individual samples used to construct the covariance are evolved individually (essentially a 'square-root' formulation).

Ensemble Kalman Filter
k ensemble members from a forecast model; ensemble (sample) mean; background-error (sample) covariance. You don't actually need to compute the entire matrix (it would not fit in memory).
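For reference, the background-error sample covariance from k members is

    P^b = \frac{1}{k-1} \sum_{i=1}^{k} (x_i^b - \bar{x}^b)(x_i^b - \bar{x}^b)^T

and the update only ever needs P^b H^T and H P^b H^T, which can be formed directly from the k ensemble perturbations without ever building the full n-by-n matrix.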

Ensemble Kalman Filter
k ensemble members from a forecast model; ensemble (sample) mean; background-error (sample) covariance; analysis-error covariance. The update of the 'perturbations' (deviations from the mean) is computed so that the analysis-error covariance is what you expect from the KF equations.
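One widely used square-root form (the serial EnSRF of Whitaker and Hamill, 2002; the slide itself does not name a scheme) updates the perturbations with a reduced gain. For a single scalar observation,

    x'^a = (I - \tilde{K} H)\, x'^b, \qquad \tilde{K} = \left(1 + \sqrt{\frac{R}{H P^b H^T + R}}\right)^{-1} K

which yields exactly P^a = (I - KH)P^b without perturbing the observations.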

Consequences of Sampling Error
Top: mean SLP and the sample covariance between SLP everywhere and a point in East Asia, from a 25-member ensemble. Second: the same for a 400-member ensemble. Third: a taper function that is 1 at the ob location and decreases to zero several thousand km away. Bottom: the 25-member covariance multiplied by the taper function (it looks much more like the 400-member covariance).
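Tapers of this kind are typically the Gaspari and Cohn (1999) fifth-order piecewise rational function; which specific taper the slide uses is an assumption. A minimal NumPy sketch:

    import numpy as np

    def gaspari_cohn(dist, c):
        """Gaspari-Cohn (1999) 5th-order compactly supported taper.
        dist: separation distance(s); c: half-width (taper is 0 beyond 2*c)."""
        r = np.atleast_1d(np.abs(np.asarray(dist, dtype=float)) / c)
        taper = np.zeros_like(r)
        inner = r <= 1.0
        taper[inner] = ((((-0.25*r[inner] + 0.5)*r[inner] + 0.625)*r[inner]
                         - 5.0/3.0)*r[inner]**2 + 1.0)
        outer = (r > 1.0) & (r < 2.0)
        taper[outer] = (((((r[outer]/12.0 - 0.5)*r[outer] + 0.625)*r[outer]
                          + 5.0/3.0)*r[outer] - 5.0)*r[outer]
                        + 4.0 - 2.0/(3.0*r[outer]))
        return taper

    # Localized covariance = element-wise (Schur) product with the sample covariance:
    # Pb_loc = gaspari_cohn(pairwise_dist, c) * Pb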

Mis-specification of background-error covariance
Sampling error produces large errors in the sample covariance 'far' from the observation location, resulting in inappropriate updates to the state vector. A simple example of how errors in Pb affect the state update in the KF: a 2-D state vector (x1 and x2), with a single ob of x1 only (x2 unobserved). The heavy line is the prior covariance (marginal distributions on the axes). The dot on the x1 axis is the value of the ob; the light solid line is the marginal distribution for the ob. The dashed line is the posterior covariance. Note that the true background x1 is uncorrelated with x2. Underestimating the covariance causes the ob not to be used enough, leaving the posterior too similar to the prior. Overestimating the correlation between state variables causes the state vector to be incremented too much in that direction: the posterior variance is too small in the x2 direction, and the x2 mean is biased.
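A minimal numerical sketch of this two-variable example (the covariance and error values are invented for illustration):

    import numpy as np

    # True background errors of x1 and x2 are uncorrelated, but the sampled
    # Pb contains a spurious x1-x2 correlation of 0.8.
    Pb = np.array([[1.0, 0.8],
                   [0.8, 1.0]])
    H = np.array([[1.0, 0.0]])   # only x1 is observed
    R = np.array([[0.5]])        # observation-error variance

    K = Pb @ H.T @ np.linalg.inv(H @ Pb @ H.T + R)   # Kalman gain
    Pa = (np.eye(2) - K @ H) @ Pb                     # posterior covariance

    # K[1] is nonzero, so the unobserved x2 is incremented toward the ob
    # (biasing its mean), and Pa[1, 1] < 1: its variance is wrongly reduced.
    print(K.ravel(), Pa[1, 1])   # ~[0.67, 0.53], ~0.57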

Effect of localization in a simplified GCM (1)
A 2-layer PE model on a sphere, with 46 observations over the globe. A few years ago we wrote a paper looking at the effect of covariance localization in a simplified GCM (no model error), varying the ensemble size and the severity of the localization. For small ensembles, if not enough localization is used, the filter diverges; with more ensemble members, less localization is needed. There is a 'sweet spot' that minimizes ensemble-mean error for a given ensemble size (and observation network).

Effect of localization in a simplified GCM (2)
Rank histograms show what happens to the ensemble as the filter length scale is varied. There is a trend toward too much population at the extreme ranks as the length scale is increased (and eventually filter divergence). When the filter length scale is too short, there is not enough population at the extreme ranks (too much variance, because the ensemble is not corrected enough far away from the ob).

Effect of localization in a simplified GCM (3)
Eigen-analysis of the prior ensemble (before localization): the spectrum is too steep for small ensembles (assume the 400-member ensemble represents 'truth'). Applying localization to the 25-member ensemble flattens the spectrum, adding new directions (increasing the rank of Pb); in the limit of a delta-function taper, the spectrum is flat. Without localization, the ensemble is updated in too small a subspace. With localization, updates depend on distance from the ob (more degrees of freedom).

Covariance localization increases the rank of Pb
If the ensemble has k members, then Pb describes nonzero uncertainty only in a k-dimensional subspace, and the analysis is only adjusted in that subspace. If the system is high-dimensionally unstable (if it has more than k positive Lyapunov exponents), forecast errors will grow in directions not accounted for by the ensemble, and these errors will not be corrected by the analysis. Sampling error manifests itself directly in the form of spurious long-range covariances. Alternately, one can think of the sampling error as a manifestation of rank deficiency in the ensemble (when k < the number of degrees of freedom in the dynamical model). The filter can't correct the missing directions, so errors grow while ensemble variance shrinks, leading to filter divergence.
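A quick NumPy demonstration of the rank argument (the dimensions and the simple triangular taper are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 200, 25                           # state dimension, ensemble size
    X = rng.standard_normal((n, k))
    X -= X.mean(axis=1, keepdims=True)       # remove the ensemble mean
    Pb = X @ X.T / (k - 1)                   # sample covariance, rank <= k-1

    d = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
    taper = np.maximum(0.0, 1.0 - d / 50.0)  # simple compactly supported taper
    Pb_loc = taper * Pb                      # Schur-product localization

    print(np.linalg.matrix_rank(Pb),         # 24
          np.linalg.matrix_rank(Pb_loc))     # close to n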

Alternative to localization
Localizing covariances works because it increases the dimensionality, so one can instead compute updates in local regions where the error dynamics evolve in a lower-dimensional subspace (< k): the LETKF (Hunt et al., 2007). This interpretation leads naturally to another way of solving the problem: instead of updating the entire state vector at once, update localized pieces whose error dynamics can be described by the small ensemble.

Two EnKF approaches
Serial approach: for each observation, update each model variable, tapering the influence of the observation to zero at a specified distance (used in NCAR's DART). Local approach: update each model variable one at a time, using all observations within a specified radius and increasing R with the distance between the observation and the model variable; we use this approach since it scales well on massively parallel computers. The serial approach makes the most sense when you have few obs and a large state vector; the local approach makes more sense when you have lots of obs, because the problem is then much easier to parallelize and scales well on an MP system. Mathematically there are differences between the two (the local approach involves some approximations), but practical experience has shown little or no difference in accuracy.
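In the local approach, 'increasing R with distance' typically means dividing each observation-error variance by a taper value that shrinks with distance,

    R_i \;\rightarrow\; R_i / \rho(d_i), \qquad 0 \le \rho \le 1

so that as \rho(d_i) goes to zero the observation error becomes effectively infinite and the ob is ignored.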

Outstanding issues
Both methods assume a priori that the covariance is maximized at the observation location, which is problematic for non-local and time-lagged obs. Both methods are flow-independent (they assume the same degree of locality in every situation), and localization can destroy balance. In short, both methods have deficiencies: the localization is the same all the time, and geostrophic balance is not maintained.

Localization and Balance
Analysis of a single zonal wind observation, using idealized nondivergent and geostrophically balanced covariances. Imbalance can be controlled by time-filtering the first-guess forecast. Mid-latitude geophysical flows are nearly in geostrophic balance at the scales of cyclonic storms and fronts; that is, the wind field is proportional to the gradient of the geopotential height field. Localization can disrupt this balance. Here is an example of the increment associated with a single wind ob, where Pb is in exact geostrophic balance (the ageostrophic wind increments are zero). Applying localization to an ensemble sampled from that perfect Pb produces ageostrophic winds. In practice, this means gravity waves will be excited during the forecast cycle. Houtekamer showed that a measure of this gravity-wave activity decreases as the localization is relaxed. In practice, the gravity waves tend not to interact very much with the balanced flow, so they can be filtered out of the forecast.

Flow Dependent Localization (Hodyss and Bishop, QJRMS)
[Figure: 'Stable flow error correlations' and 'Unstable flow error correlations' panels, distance in km; ensembles give flow-dependent but noisy correlations.]
Recently, some new algorithms have been proposed to make localization flow-dependent. When the flow is stable, one can expect the atmospheric background-error correlations to be larger in scale (the errors themselves are larger scale). When the flow is unstable (e.g. lots of small-scale turbulence and convection), the errors will also be smaller in scale, and there is more sampling error for a finite ensemble in the unstable situation.

Flow Dependent Localization
[Figure: stable-flow and unstable-flow error correlations with a fixed moderation function; distance in km.]
Current ensemble DA techniques reduce noise by multiplying the ensemble correlation function by a fixed moderation function (green line). The resulting correlations (blue line) are too thin when the true correlation is broad and too noisy when the true correlation is thin. If the same localization is used for both cases, we may unduly sharpen the correlations in the stable case while not removing enough of the noise in the unstable case. What is needed is 'adaptivity': today's fixed moderation functions limit it.

Flow Dependent Localization
[Figure: stable-flow and unstable-flow error correlations with SENCORP moderation functions; distance in km.]
Smoothed ENsemble Correlations Raised to a Power (SENCORP) provides flow-adaptive moderation functions. In this scheme, the tapering function is broad in the stable case and much narrower in the unstable case, resulting in better use of the observations in both situations. SENCORP moderation functions adapt.

"SENCORP" Recipe
1. Smooth Pb to give P1b.
2. Take the element-wise cube of P1b to give P2b.
3. Form the normalized matrix product of P2b with itself to give P3b.
4. Use the element-wise square of P3b to compute K.
Here is a recipe for how their adaptive scheme works; the true covariance is shown at top right (temporal separation, two peaks).
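A schematic NumPy transcription of the four steps (the smoothing operator and the normalization details are assumptions here; see Bishop and Hodyss's papers for the real scheme):

    import numpy as np

    def sencorp_moderation(Pb, S):
        """Schematic SENCORP moderation function.
        Pb: raw n x n ensemble covariance; S: an assumed n x n smoothing operator."""
        P1 = S @ Pb @ S.T              # 1) spatially smooth Pb
        P2 = P1 ** 3                   # 2) element-wise cube
        P3 = P2 @ P2.T                 # 3) matrix product of P2 with itself,
        d = np.sqrt(np.diag(P3))
        P3 = P3 / np.outer(d, d)       #    normalized to unit diagonal
        return P3 ** 2                 # 4) element-wise square -> moderation fn

    # The gain K is then computed from the Schur product of this moderation
    # function with the raw ensemble covariance.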

Hierarchical Ensemble Filter
Proposed by Jeff Anderson (NCAR). Evolve K coupled N-member ensemble filters, and use the differences between their sample covariances to design a situation-dependent localization function. Performance asymptotes to that of an optimally localized N-member ensemble (not K*N).
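A sketch of the key calculation as I understand Anderson's (2007) scheme (an assumption, not taken from the slide): for each state-variable/observation pair, each of the K groups produces a sample regression coefficient b_k, and the localization weight alpha minimizes the expected squared difference between alpha*b_i and b_j over all group pairs:

    import numpy as np

    def regression_confidence_factor(b):
        """Weight alpha minimizing sum over i != j of (alpha*b[i] - b[j])**2,
        given K group regression coefficients b; used as a situation-dependent
        localization factor in the hierarchical filter."""
        b = np.asarray(b, dtype=float)
        num = b.sum()**2 - (b**2).sum()      # sum_{i != j} b_i * b_j
        den = (len(b) - 1) * (b**2).sum()    # sum_{i != j} b_i**2
        return num / den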

Conclusions
Localization (tapering the impact of observations with distance from the analysis grid point) makes ensemble data assimilation feasible with large NWP models. Both model error and localization make filter performance suboptimal; right now model error is the bigger problem, but improvements in localization are needed.