Observation Informed Generalized Hybrid Error Covariance Models


Observation Informed Generalized Hybrid Error Covariance Models
Workshop on Sensitivity Analysis and Data Assimilation in Meteorology and Oceanography, 1-6 July 2018, Aveiro, Portugal
Elizabeth Satterfield(1), Daniel Hodyss(1), David D. Kuhl(2), Craig H. Bishop(1)
(1) Naval Research Laboratory, Monterey, California, USA
(2) Naval Research Laboratory, Washington, DC, USA

Introduction
In a chaotic system like the atmosphere, the true error covariance of short-term forecasts is highly flow dependent. Ensembles can capture such flow dependence; however, limited ensemble size and imperfections in the methods used to generate initial-condition and model-error perturbations mean that ensemble covariances are inevitably inaccurate. Hybrid error covariance models, which combine flow-dependent (localized, ensemble-based) and quasi-static (climatological) error covariance estimates, have been shown to yield superior data assimilation performance.
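
As a concrete illustration of the hybrid idea described above, the following minimal sketch (assuming NumPy; all names are hypothetical, not from the original system) blends a localized ensemble covariance with a static covariance using a single scalar weight:

```python
import numpy as np

def hybrid_covariance(P_ens, B_static, w_e, localization=None):
    """Blend a flow-dependent ensemble covariance with a static covariance.

    P_ens        : (n, n) ensemble-based covariance estimate
    B_static     : (n, n) quasi-static (climatological) covariance
    w_e          : scalar weight on the (localized) ensemble part, 0 <= w_e <= 1
    localization : optional (n, n) localization matrix applied via a Schur product
    """
    if localization is not None:
        P_ens = P_ens * localization          # element-wise (Schur) product
    return w_e * P_ens + (1.0 - w_e) * B_static
```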

Introduction
A primary aim of this work is to investigate the extent to which optimal hybridization parameters can be derived from an archive of (observation-minus-forecast, ensemble-variance) pairs generated by a single long run of a hybrid data assimilation scheme. Such an approach is attractive because of its simplicity and flexibility, benefits from the use of observational information, and has the potential to greatly reduce the amount of time required to tune hybrid error covariance models.

Regression Based Model for the True Prior Variance
[Figure: binned sample variance versus true variance with a linear regression fit; the hybrid covariance combines a flow-dependent (ensemble) term and a static (climatological) term.]
Bishop and Satterfield (2013) describe the distribution of true error variances given an imperfect ensemble estimate. Here, we generalize this idea by estimating the mean of this conditional density using regression techniques. Although, in practice, the true error variance is unknown, it can be estimated as the difference between the innovation variance and the observation error variance, where observation error variance estimates are obtained using the method of Desroziers et al. (2005), among others.
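
A minimal sketch, under the stated assumption that the true prior error variance can be approximated as innovation variance minus observation-error variance, of how binned (ensemble-variance, estimated-true-variance) pairs might be built from an archive of observation-minus-forecast values (function and variable names are hypothetical):

```python
import numpy as np

def binned_true_variance(omf, ens_var, obs_err_var, n_bins=20):
    """Estimate true forecast-error variance in bins of ensemble variance.

    omf         : observation-minus-forecast (innovation) values in observation space
    ens_var     : ensemble variance interpolated to the same observation locations
    obs_err_var : observation-error variance (e.g. from a Desroziers-type diagnostic)

    Returns bin-mean ensemble variance and the binned estimate of the true
    prior error variance, var(o - f) - sigma_o^2.
    """
    omf, ens_var = np.asarray(omf), np.asarray(ens_var)
    edges = np.linspace(ens_var.min(), ens_var.max(), n_bins + 1)
    idx = np.clip(np.digitize(ens_var, edges) - 1, 0, n_bins - 1)

    x_bin, y_bin = [], []
    for b in range(n_bins):
        sel = idx == b
        if sel.sum() < 2:
            continue                                  # skip nearly empty bins
        x_bin.append(ens_var[sel].mean())
        y_bin.append(np.var(omf[sel]) - obs_err_var)  # innovation variance minus R
    return np.array(x_bin), np.array(y_bin)
```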

Regression Based Model for the True Prior Variance
A traditional linear model would take the form

    E[σt² | σe²] = β₀ + β₁ σe²,

where σe² is the ensemble variance and σt² is the true prior error variance. A nonlinear relationship between the ensemble variance and the true error variance implies a more general form,

    E[σt² | σe²] = Σₙ βₙ (σe²)ⁿ,

where, in the multivariate form, the nth power is taken of the corresponding element of the diagonal matrix of ensemble variances, D.
[Figure: binned sample variance versus true variance with a linear regression fit. Slide annotations: σt² ~ Γ⁻¹(αt=3, βt=2); σt² ~ a + bu, u ~ U(0,1).]
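
The regression itself can be illustrated with a short sketch using NumPy's polynomial fitting; the binned values below are purely illustrative placeholders, not results from the experiments described here:

```python
import numpy as np

# x_bin, y_bin: binned (ensemble variance, estimated true variance) pairs,
# e.g. from the binning sketch above. Values here are illustrative only.
x_bin = np.array([0.2, 0.5, 1.0, 1.8, 3.0])
y_bin = np.array([0.4, 0.6, 1.1, 2.4, 4.9])

lin_coeffs = np.polyfit(x_bin, y_bin, deg=1)   # sigma_t^2 ~ b0 + b1 * sigma_e^2
cub_coeffs = np.polyfit(x_bin, y_bin, deg=3)   # sigma_t^2 ~ sum_n b_n (sigma_e^2)^n

def predicted_true_variance(ens_var, coeffs):
    """Map an ensemble variance to the regression estimate of the true variance."""
    return np.polyval(coeffs, ens_var)
```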

Defining the Kalman Gain
The average error covariance in a Kalman state estimate is

    Pa = (I − GH) Pf (I − GH)ᵀ + G R Gᵀ.

We perform the standard operations to derive the Kalman gain, G, that minimizes the posterior error: take the derivative of (the trace of) Pa with respect to G, set the result equal to zero, and solve for G:

    G = Pf Hᵀ (H Pf Hᵀ + R)⁻¹.

It was not previously known that this was the right gain to minimize the posterior mean squared error when the prior variance is uncertain. We have now proved that it is for Gaussian priors, and for non-Gaussian priors when only sampling error in the ensemble variance is accounted for. If the prior is Gaussian, sampling error in the true prior mean can be accounted for by inflating the expected true prior variance, given an ensemble variance, by a factor of (1 + 1/N). For non-Gaussian priors the situation is more complicated, since sampling errors in the mean and variance are correlated. The gain that minimizes the posterior variance uses the expected true prior variance given an ensemble variance.
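
A sketch of this gain calculation, assuming a prior covariance whose variances have been filled with regression-predicted expected true variances; the correlation structure C, dimensions, and numerical values below are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

def kalman_gain(P_f, H, R):
    """Gain minimizing the posterior error variance: G = Pf H^T (H Pf H^T + R)^(-1)."""
    S = H @ P_f @ H.T + R                 # innovation covariance (symmetric)
    return np.linalg.solve(S, H @ P_f).T  # equals Pf H^T S^(-1)

# Illustrative usage: prior variances taken from the regression model.
n, p = 4, 2
C = np.eye(n)                             # assumed prior correlation structure
sigma2 = np.array([0.8, 1.2, 0.5, 2.0])   # regression-predicted prior variances
P_f = np.sqrt(sigma2)[:, None] * C * np.sqrt(sigma2)[None, :]
H = np.zeros((p, n)); H[0, 0] = H[1, 2] = 1.0   # observe variables 0 and 2
R = 0.5 * np.eye(p)                       # observation-error covariance
G = kalman_gain(P_f, H, R)
```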

Lorenz '96 Model Experiments
We assess the regression-based models of hybrid error covariance using an implementation of the perturbed-observations form of the Ensemble Kalman Filter (EnKF) on a 10-variable version of the Lorenz '96 model.
Gaspari-Cohn localization is applied to the ensemble-based covariance matrix.
Every grid point is observed at each time step.
Multiplicative variance inflation is used to ensure that the ensemble variance approximately equals the mean squared error.
A static error covariance matrix is generated from a 100,000 time step run of the fully ensemble-based system.
[Figure: binned sample variance versus true error variance (one-to-one line shown), with linear and cubic fits, plotted against the inflated ensemble variance for Ne=5 and Ne=10.]
Cubic regression gives a better fit than linear for the 5-member case. For larger ensemble sizes, errors in the linear approximation are not as large.
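
For reference, a minimal sketch of the Lorenz '96 dynamics used in such experiments (tendency plus a fourth-order Runge-Kutta step); the 10-variable size follows the setup described above, while the time step and forcing value shown here are assumptions:

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """dx_i/dt = (x_{i+1} - x_{i-2}) x_{i-1} - x_i + F, with cyclic indexing."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt=0.05, F=8.0):
    """One fourth-order Runge-Kutta step of the Lorenz '96 model."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# 10-variable configuration; small perturbation of the resting state triggers chaos.
x = 8.0 * np.ones(10)
x[0] += 0.01
for _ in range(1000):
    x = rk4_step(x)
```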

Lorenz '96 Results: Tuning versus Regression Based Models
[Figure: RMSE as a function of ensemble weighting (we) for Ne=5, together with the RMSE of the linear regression (magenta) and cubic regression (red) based hybrids. Lines indicate the mean of iterations 4-9 and error bars show the 95% confidence interval.]
All experiments use an entirely ensemble-based static covariance. Linear regression underestimates the optimal ensemble weight, we=0.7, found by "brute force" tuning; for the Ne=5 brute-force tuning experiment, the minimum RMSE is obtained at alpha=0.6. Because the spread-skill relationship is curved, the ensemble weights given by the linear model are underestimated in this case. The cubic regression based hybrid (shown in red) gives a lower RMSE than any of the linear models, as would be expected from the spread-skill plots.
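
The "brute force" tuning referred to here amounts to sweeping the ensemble weight and rerunning the cycling experiment for each value; a hedged sketch, where run_hybrid_da is a hypothetical stand-in for the experiment driver that returns the mean RMSE for a given weight:

```python
import numpy as np

def brute_force_tune(run_hybrid_da, weights=np.arange(0.0, 1.01, 0.1)):
    """Test every plausible ensemble weight and return the one with the lowest RMSE.

    run_hybrid_da : callable mapping an ensemble weight w_e to the mean analysis
                    (or forecast) RMSE of a cycling hybrid DA experiment.
    """
    rmse = np.array([run_hybrid_da(w) for w in weights])
    best = int(np.argmin(rmse))
    return weights[best], rmse
```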

Lorenz '96 Results: Influence of Model Error and Observation Density
Model error is included by changing the forcing in the nature run from F=8 to F=10. Brute force tuning now finds the minimum posterior error at we=0.6, consistent with a less accurate ensemble, and little difference is seen between the cubic and linear fits.
When only 50% of the grid points are observed, brute force tuning finds the minimum posterior error (calculated over all points) at we=0.7, and the cubic and linear fits converge.
In all the cases considered in this section, the cubic fit performed as well as or better than the optimal weights found through brute force tuning. The degree to which the cubic fit outperforms the best linear model depends on the curvature of the spread-skill relationship, which in turn depends on the performance of the ensemble. Together, these results indicate that there may be a range of ensemble performance in which a higher order polynomial fit is preferable.
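
A small configuration sketch of these two sensitivity experiments, under the assumptions that the nature run is forced with F=10 while the forecast model keeps F=8, and that "50% of the grid points" means every other variable (the exact observation layout is an assumption):

```python
import numpy as np

n = 10
F_truth, F_model = 10.0, 8.0      # model error: nature run and forecast model forced differently

# Observation operator selecting every other grid point (50% coverage).
obs_idx = np.arange(0, n, 2)
H = np.zeros((obs_idx.size, n))
H[np.arange(obs_idx.size), obs_idx] = 1.0
```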

NAVGEM Results: Tuning versus Regression Based Models
[Figure, top row: percent improvement over the static covariance (using ECMWF as verification) from brute force tuning with alpha = 0.25, 0.50, 0.75, and 1.00; red indicates improvement. Bottom rows: linear and cubic regression analyses with the prior variance estimated from the Desroziers method, at 400 hPa and 600 hPa.]
Operationally, we currently use alpha=0.25 at all levels.
We took some care in defining the observation error, using only RS92-type radiosondes and restricting the analysis to levels where:
- there was reasonable agreement between the H-L and Desroziers methods;
- there were more than 600 points per bin;
- the ratios of assumed to estimated R and B fell between 0.5 and 3.5.
We are encouraged by the rough agreement between the regression and brute force tuning results. This gives us a viable option for weights that vary with region, season, and altitude. We will further consider:
- how many regional patches are needed;
- temporal variation;
- how best to estimate the errors in the prior;
and continue with testing.
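
The Desroziers diagnostic used above for the observation-error variance can be sketched as follows for a single observation type and level; the function name is hypothetical, and the inputs are observation-space background and analysis values:

```python
import numpy as np

def desroziers_obs_error_variance(y, Hxb, Hxa):
    """Scalar Desroziers et al. (2005) diagnostic for observation-error variance.

    y   : observations
    Hxb : background (forecast) mapped to observation space
    Hxa : analysis mapped to observation space

    Uses E[(y - Hxa)(y - Hxb)] ~ sigma_o^2 for a single observation type and level.
    """
    d_ob = np.asarray(y) - np.asarray(Hxb)   # background innovations
    d_oa = np.asarray(y) - np.asarray(Hxa)   # analysis residuals
    return float(np.mean(d_oa * d_ob))
```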

Main Conclusions
In this study we built on the findings of Bishop and Satterfield (2013) to show that the expected true prior variance, given an ensemble sample variance, can be modeled with regression. Theory was developed to show that, when only the sampling error in the ensemble variance is considered, the gain that minimizes the posterior error variance uses the expected true prior variance given an ensemble sample variance. We demonstrated that, after a single run of a fully ensemble based data assimilation scheme, one can use regression to obtain a model of the optimal hybrid variance.
*The extended control variable form (Lorenc 2003) and the Hamill and Snyder (2000) form (as in NAVGEM) can be shown to be equivalent (Wang 2007).
*Further work is needed to implement the approach in a serial filter.

Main Conclusions
For the idealized univariate data assimilation and multivariate cycling ensemble data assimilation considered here, we found that, when the relationship between the ensemble variance and the true error variance is linear, linear regression closely approximates the optimal weights found through the simple but computationally expensive process of testing every plausible combination of weights.
For the case in which the relationship between the ensemble variance and the true error variance is nonlinear, we introduced a hybrid model defined by higher order polynomial regression and demonstrated that such a scheme outperforms any plausible linear model. The degree to which higher order polynomial regression outperforms the best linear model depends on the performance of the ensemble: fitting a higher degree polynomial matters most when the ensemble variance more accurately tracks the true error variance.

Future Work
The focus of this work has been on hybrid formulations that form a hybridized error covariance matrix (e.g., Hamill and Snyder, 2000) and on filters that assimilate all available observations at once; additional work would be needed to implement the approach in serial filters. The theory presented here has only been applied to variances, and additional work is needed to account for correlations, although a similar methodology could be applied. These issues, as well as the application of this theory to the Navy Global Environmental Model (NAVGEM), are the subject of future work.

References
Bishop, C. H., and E. A. Satterfield, 2013: Hidden Error Variance Theory. Part I: Exposition and Analytic Model. Mon. Wea. Rev., 141, 1454-1468.
Bishop, C. H., E. A. Satterfield, and K. T. Shanley, 2013: Hidden Error Variance Theory. Part II: An Instrument That Reveals Hidden Error Variance Distributions from Ensemble Forecasts and Observations. Mon. Wea. Rev., 141, 1469-1483.
Desroziers, G., L. Berre, B. Chapnik, and P. Poli, 2005: Diagnosis of observation, background and analysis-error statistics in observation space. Quart. J. Roy. Meteor. Soc., 131, 3385-3396.
Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three dimensions. Quart. J. Roy. Meteor. Soc., 125, 723-757.
Hamill, T. M., and C. Snyder, 2000: A Hybrid Ensemble Kalman Filter-3D Variational Analysis Scheme. Mon. Wea. Rev., 128, 2905-2919.
Lorenz, E. N., 2005: Designing Chaotic Models. J. Atmos. Sci., 62, 1574-1587.
Satterfield, E., D. Hodyss, D. D. Kuhl, and C. H. Bishop: On the Likely Utility of Hybrid Weights Optimized for Variances in Hybrid Error Covariance Models. Mon. Wea. Rev., in review.