Parameter estimation: To what extent can data assimilation techniques correctly uncover stochasticity? Jim Hansen MIT, EAPS (with lots of help from Cecile Penland and Greg Lawson)

Presentation transcript:

Parameter estimation: To what extent can data assimilation techniques correctly uncover stochasticity? Jim Hansen MIT, EAPS (with lots of help from Cecile Penland and Greg Lawson)

Indistinguishable?

Accounting for vs. reducing model inadequacy
Accounting for model inadequacy
– “If you can show me how I can make better forecasts using chicken bones and voodoo dolls, then I’m going to use them!” (Harold Brooks, NSSL)
– initial conditions (Q-term)
– forecasts (MM, stochastic, MOS, forecast 4d-Var)
Reducing model inadequacy
– Making changes to our model so that it becomes a better representation of the true system
– parametric error
– structural error

Reducing model inadequacy
Reducing model inadequacy is best framed as an off-line, or “reanalysis”, activity
– The process of attempting to identify model inadequacy tends to make both initial conditions and forecasts worse
– The aim is to quantify how the model is wrong, fix it, and then worry about data assimilation and forecasting

A proposed approach
1. Use data assimilation tools to alter model parameters to better fit observations
2. Identify relationships between fit parameters and prognostic variables (a parametric MOS)
3. Change the model to reflect those relationships
4. Repeat
5. When all relationships have been uncovered, the history of fit parameter values provides a distribution from which to (carefully) draw for the purpose of stochastic parameterizations

Use data assimilation tools to alter model parameters to better fit observations
– Augment the control vector with the unknown parameters
– Augmentation removes the nonlinearity from the observation operator and inserts it into the specification of the control vector
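One common way to write the augmentation is sketched below. The symbols are illustrative assumptions, not taken from the slides: x is the model state, α the vector of unknown parameters, f the model tendency, H_x the operator that maps the state to the observations, and ε the observation error. The sketch also assumes the parameters simply persist between analysis times.

```latex
\mathbf{z} = \begin{pmatrix} \mathbf{x} \\ \boldsymbol{\alpha} \end{pmatrix},
\qquad
\frac{d\mathbf{z}}{dt} = \begin{pmatrix} f(\mathbf{x}, \boldsymbol{\alpha}) \\ \mathbf{0} \end{pmatrix},
\qquad
\mathbf{y} = \begin{pmatrix} \mathbf{H}_x & \mathbf{0} \end{pmatrix} \mathbf{z} + \boldsymbol{\epsilon}
```

In this form the observation operator only selects components of z, so it stays linear in the control vector. The parameters are never observed directly; they can only be adjusted through their covariance with the observed state.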

Augmented control vector sample covariance
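A minimal sketch of how the augmented sample covariance can carry information from observed state variables to unobserved parameters. It uses a stochastic (perturbed-observation) EnKF update as a stand-in; the slides do not specify the scheme, and the function name, argument names, and the choice to observe selected state components directly are all assumptions made for illustration.

```python
import numpy as np

def augmented_enkf_update(states, params, obs, obs_idx, obs_var, rng):
    """Perturbed-observation EnKF update of an augmented ensemble (illustrative).

    states  : (n_members, n_state) ensemble of prognostic variables
    params  : (n_members, n_param) ensemble of uncertain parameters
    obs     : (n_obs,) observed values of the state components listed in obs_idx
    obs_var : scalar observation error variance
    """
    n_members, n_state = states.shape
    z = np.hstack([states, params])                 # augmented control vector per member
    anomalies = z - z.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)   # augmented sample covariance

    # Linear observation operator: picks out observed state components only,
    # so the parameter columns of H are all zero.
    H = np.zeros((len(obs_idx), z.shape[1]))
    H[np.arange(len(obs_idx)), obs_idx] = 1.0
    R = obs_var * np.eye(len(obs_idx))

    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)    # Kalman gain
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=(n_members, len(obs_idx)))
    z_new = z + (perturbed - z @ H.T) @ K.T
    return z_new[:, :n_state], z_new[:, n_state:]
```

The parameter rows of the gain come entirely from the state-parameter block of P, which is why a state-dependent (ensemble) covariance matters for parameter estimation.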

Parametric error example: L63 [Equations: system equations; model equations]

Importance of a state-dependent background error covariance [Figure: parameter vs. time; panels: 4d-Var with static covariance, ensemble 4d-Var]

Structural error example: Lorenz ’96 [Equations: system; model]

Regressing parameter vs. prognostic variable [Figure: parameter vs. x(1), with the fitted relationship]
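The regression step can be sketched as an ordinary least-squares fit of the time series of estimated parameter values against the prognostic variable. The polynomial form, the function name, and the default degree are assumptions; the slides do not say which functional form was fitted.

```python
import numpy as np

def fit_parameter_relationship(x1, param, degree=1):
    """Least-squares polynomial fit of estimated parameter values against x(1).

    x1     : (n_times,) values of the prognostic variable at the analysis times
    param  : (n_times,) parameter values estimated by data assimilation
    Returns the polynomial coefficients (highest power first) and the
    standard deviation of the residual the fit leaves unexplained.
    """
    coeffs = np.polyfit(x1, param, degree)
    residual = param - np.polyval(coeffs, x1)
    return coeffs, residual.std()
```

A large residual after the fit would suggest that further relationships remain to be uncovered before the leftover variability is treated as a distribution to draw from.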

Alter model equations with new information [Equations: system; original model; new model]

Example: Lorenz ’96 Model II

[Equations: system; model]

Regressing parameter vs. prognostic variable [Figure: parameter vs. x(1), with the fitted relationship]

[Equations: system; original model; new model]

SDE crash course
The type of calculus used to integrate the stochastic bits of SDEs matters
– Stratonovich calculus: noise process is continuous (typical assumption for geophysical fluid flows)
– Ito calculus: noise process is discrete (like DA!)
SDEs can be tricky (and expensive) to integrate
– used stochastic RK4 (Hansen and Penland, 2005)
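A small illustration of why the choice of calculus matters, using the scalar test equation dX = aX dW rather than any system from the talk. The Euler-Maruyama scheme converges to the Ito solution and the stochastic Heun scheme to the Stratonovich solution. Neither is the stochastic RK4 scheme cited above; they are swapped in here because they are short. The function name and default values are assumptions.

```python
import numpy as np

def simulate_means(a=1.0, t_end=1.0, n_steps=200, n_paths=20000, seed=0):
    """Integrate dX = a X dW from X(0) = 1 with two schemes; return ensemble means."""
    rng = np.random.default_rng(seed)
    dt = t_end / n_steps
    x_ito = np.ones(n_paths)
    x_strat = np.ones(n_paths)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # Euler-Maruyama: noise coefficient evaluated at the start of the step (Ito).
        x_ito = x_ito + a * x_ito * dw
        # Stochastic Heun: coefficient averaged over predictor and start (Stratonovich).
        predictor = x_strat + a * x_strat * dw
        x_strat = x_strat + 0.5 * a * (x_strat + predictor) * dw
    return x_ito.mean(), x_strat.mean()

ito_mean, strat_mean = simulate_means()
```

Both schemes see the same random increments, yet the Ito mean stays near 1 while the Stratonovich mean grows toward exp(a² t / 2). Interpreting one kind of noise as the other therefore changes the mean behaviour of the system, not only its spread.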

What if the system really is stochastic? System is an SDE; model is an ODE

Can DA uncover the correct form of the stochasticity? NO [Figure: parameter vs. time; panels: ensemble 4d-Var, EnKF]

Why can’t DA uncover the correct form of the stochasticity?
Stochasticity operating at different time-scales
– SDE has infinitesimal time-scale
– ODE with DA has 6-hourly time-scale
System is using Stratonovich calculus, DA is using Ito calculus
All leads to a danger of misinterpretation
Model error!

How should we use this information for forecasting?
1. Deterministic model using constant, tuned parameter value ( )
2. Stochastic model using mean and standard deviation of tuned parameter value ( )
3. Deterministic, multi-model ensemble with parameters drawn from ( )
4. Deterministic model where parameter varies in the same manner as it was estimated ( )
Option 1: Tuned deterministic

How should we use this information for forecasting? Option 2: Incorrect SDE

How should we use this information for forecasting? Option 3: Multi-model, where the draw from the parameter distribution is held constant over the entire forecast period.

How should we use this information for forecasting? Option 4: Hybrid, where a new draw from the parameter distribution is made every 6 model hours.
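The four options differ mainly in how often a new parameter value is drawn during a forecast. The sketch below makes that explicit for a single ensemble member. It assumes a Gaussian distribution built from the mean and standard deviation of the tuned parameter history, and the function and strategy names are hypothetical. Option 2 is simplified to an independent draw at every model step; a proper SDE integration would handle the noise increment inside the integration scheme.

```python
import numpy as np

def parameter_schedule(strategy, mean, std, n_steps, steps_per_draw, rng):
    """Return the parameter value used at each model step of one forecast member."""
    if strategy == "tuned":          # option 1: constant tuned value
        return np.full(n_steps, mean)
    if strategy == "stochastic":     # option 2: new draw at every model step
        return rng.normal(mean, std, size=n_steps)
    if strategy == "multi_model":    # option 3: one draw held for the whole forecast
        return np.full(n_steps, rng.normal(mean, std))
    if strategy == "hybrid":         # option 4: new draw every steps_per_draw steps
        n_draws = -(-n_steps // steps_per_draw)   # ceiling division
        draws = rng.normal(mean, std, size=n_draws)
        return np.repeat(draws, steps_per_draw)[:n_steps]
    raise ValueError(f"unknown strategy: {strategy}")
```

With `steps_per_draw` set to the number of model steps in 6 model hours, the hybrid schedule changes the parameter at the same interval at which it was estimated.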

Median of ensemble mean forecast distributions [Figure: normalized RMSE vs. forecast lead (model days); curves: tuned deterministic, incorrect SDE, multi-model, hybrid, perfect]

Must assess probabilistically! [Figure: std(err/ens_std) vs. forecast lead (model days); curves: tuned deterministic, incorrect SDE, multi-model, hybrid, perfect]
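One plausible reading of the std(err/ens_std) diagnostic is sketched below: the error of the ensemble mean is divided by the ensemble standard deviation for each forecast case, and the standard deviation of that ratio is taken over cases. The exact definition used for the figure is not given in the transcript, so the function name and array layout are assumptions.

```python
import numpy as np

def normalized_error_std(ensembles, truth):
    """Standard deviation over cases of (ensemble-mean error / ensemble spread).

    ensembles : (n_cases, n_members) forecasts of one scalar quantity
    truth     : (n_cases,) verifying values
    """
    err = ensembles.mean(axis=1) - truth
    ens_std = ensembles.std(axis=1, ddof=1)
    return np.std(err / ens_std)
```

Values close to 1 indicate that the ensemble spread is consistent with the size of the errors. Values well above 1 indicate an under-dispersive ensemble, which an RMSE curve alone would not reveal.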

Relative (to perfect) entropy [Figure: relative entropy vs. forecast lead (model days); curves: multi-model, hybrid]

What if we use a stochastic model? System is an SDE; model is an SDE

Now can DA uncover the correct form of the stochasticity? NO [Figure: parameter mean vs. time; parameter std vs. time]

What’s the problem this time?
– Wrong trajectory of random numbers
– Model error!

SDE forecast errors

What does it all mean?
Deterministic model DA approaches alone are not enough to uncover the correct form of stochasticity
– Implies that we cannot attach physical significance to tuned parameter values or distributions
Our efforts to reduce model inadequacy ultimately lead to a sensible way to account for model inadequacy
Synoptic time-scale, Ito-like stochasticity via parameter estimation does a great job accounting for model inadequacy during forecasting

The future(?) of data assimilation
– Model error issues
– Nonlinearity
– New disciplines: e.g. paleo, climate
– Improved image
– DA is part of a larger problem

The future(?) of data assimilation
Nonlinearity
– Implementing nonlinear approaches
– Extend minimum error variance approaches a bit more into the nonlinear regime
Feature-based non-Gaussianity

The future(?) of data assimilation
Improved image
– DA has a bad/boring reputation
– Ensemble methods bringing DA to the masses
University research can be quasi-operational
Reasonable DA now where none before

ATMOS

COLLEGE

The future(?) of data assimilation
DA is part of a larger problem
– The future of DA is not independent of the future of observations, ensemble forecasting, verification, calibration, etc.
Ensemble forecasting
Targeting
Increasing ensemble forecast size at low cost
Ensemble synoptic analysis

Transformed Lag Ensemble Forecasting (TLEF) Ensemble size is increased by using ensemble-based data assimilation techniques to transform (scale and rotate) old forecasts using new observations. [Figure: schematic along a time axis]

Hakim and Torn WRF, 100 ensemble members, surface pressure obs

Hakim and Torn

Ensembles make PV inversion fun and easy! Note, no worries about balance assumptions or boundary conditions Approach defined by Hakim and Torn

The future(?) of data assimilation
– Model error issues
– Nonlinearity
– New disciplines: e.g. paleo, climate
– Sales/Marketing
– DA is part of a larger problem