Error statistics in data assimilation

Error statistics in data assimilation
“All models are wrong …” (George Box)
“All models are wrong and all observations are inaccurate” (a data assimilator)
Ross Bannister, NCEO, University of Reading, UK, r.n.bannister@reading.ac.uk

Distinction between ‘errors’ and ‘error statistics’
When people say ‘errors’ they sometimes (but not always) mean ‘error statistics’.
Error: the difference between some estimated/measured quantity and its true value, e.g. ε_est = x_est − x_true or ε_y = y − y_true. Errors are unknown and unknowable quantities.
Error statistics: some useful measure of the possible values that ε could have, e.g. a PDF. Error statistics are knowable, although often difficult to determine – even in the Gaussian case. Here, error statistics = second moment (i.e. assume PDFs are Gaussian and unbiased, <ε> = 0), e.g. the second moment of ε, <ε²> (called a variance), or <ε²>^1/2 = σ (standard deviation). If only the variance is known, then the PDF is approximated as a Gaussian, P(ε) ∝ exp(−ε²/2<ε²>).
[Figure: a Gaussian PDF of ε with standard deviation σ = √<ε²>.]
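To make the distinction concrete, here is a minimal sketch (my own illustration, not from the slides; the true value, error standard deviation and sample size are arbitrary assumptions). It draws synthetic errors, estimates the second moment, and evaluates the Gaussian PDF implied by that variance alone.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Synthetic "true" value and noisy estimates of it (values are arbitrary assumptions).
x_true = 10.0
sigma_true = 1.5                       # true error standard deviation (unknown in practice)
x_est = x_true + sigma_true * rng.standard_normal(10_000)

# Individual errors are only computable here because we know x_true;
# in a real system they are unknown and unknowable.
eps = x_est - x_true

# Error statistics: second moment (variance) and standard deviation.
var_eps = np.mean(eps**2)              # <eps^2>, assuming unbiased errors (<eps> = 0)
sigma = np.sqrt(var_eps)

# Gaussian approximation to the PDF implied by the variance alone:
# P(eps) ∝ exp(-eps^2 / (2 <eps^2>)).
def gaussian_pdf(e, variance):
    return np.exp(-e**2 / (2.0 * variance)) / np.sqrt(2.0 * np.pi * variance)

print(f"estimated sigma = {sigma:.3f} (true value {sigma_true})")
print(f"PDF at eps = 0: {gaussian_pdf(0.0, var_eps):.3f}")
```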

This talk ...
A. What quantities should be assigned error statistics in data assimilation?
B. How are error statistics important in data assimilation?
C. ‘Observation’ and ‘state’ vectors.
D. ‘Inner’ and ‘outer’ products.
E. Forms of (Gaussian) error covariances.
F. Link between Bayes’ Theorem and the variational cost function.
G. Link between the variational cost function and the ‘BLUE’ formula.
H. Example with a single observation.
I. Forecast error covariance statistics.

A. What quantities should be assigned error statistics in data assimilation?
All data that are being fitted to:
Observations (real-world data).
Prior data for the system’s state.
Prior data for any unknown parameters.
(The prior data are the information available about the system before observations are considered.)
Data that have been fitted:
Data assimilation-fitted data (the analysis, i.e. a posteriori error statistics).

B. How are error statistics important in data assimilation?
1. Error statistics give a measure of confidence in data.
[Figure: three panels comparing no assimilation, assimilation with large observation errors, and assimilation with small observation errors.]

B. How are error statistics important in data assimilation?
2. Error statistics of prior data imply relationships between variables.
[Figure: time series of x1 and x2 comparing the background forecast (no assimilation) with the analysis forecast (consistent with prior and observation errors).]
x1 and x2 cannot be varied independently by the assimilation here because of the shape of the prior joint PDF. Known relationships between variables are often exploited to gain knowledge of the complicated nature of the prior error statistics (e.g. changes in pressure are associated with changes in wind in the mid-latitude atmosphere – geostrophic balance).
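A minimal two-variable sketch of this idea (my own illustration; the variances and the 0.8 correlation are assumed values, standing in for a relationship such as geostrophic balance between pressure and wind): with a correlated prior covariance, knowledge of the error in x1 constrains the likely error in x2.

```python
import numpy as np

# Assumed 2x2 prior (background) error covariance for variables x1 and x2.
sigma1, sigma2, rho = 1.0, 2.0, 0.8        # standard deviations and correlation (illustrative)
B = np.array([[sigma1**2,             rho * sigma1 * sigma2],
              [rho * sigma1 * sigma2, sigma2**2            ]])

# Because of the off-diagonal term, the prior joint PDF is "tilted":
# the conditional mean of eps2 given eps1 is not zero.
eps1 = 1.0                                  # suppose the x1 background error is +1.0
cond_mean_eps2 = B[1, 0] / B[0, 0] * eps1
print(f"E[eps2 | eps1 = {eps1}] = {cond_mean_eps2:.2f}")   # non-zero => x1 and x2 are linked

# So an assimilation increment to x1 implies an increment to x2 as well;
# the two variables cannot be adjusted independently.
```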

C. ‘Observation’ and ‘state’ vectors A B C D E F G H I C. ‘Observation’ and ‘state’ vectors y = x = The observation vector – comprising each observation made. There are p observations. The structure of the state vector for the example of meteorological fields (u, v, θ, p, q are meteorological 3-D fields; λ, φ and ℓ are longitude, latitude and vertical level). There are n elements in total.

D. ‘Inner’ and ‘outer’ products A B C D E F G H I D. ‘Inner’ and ‘outer’ products The inner product (‘scalar’ product) between two vectors gives a scalar The outer product between two vectors gives a matrix When ε is a vector of errors, <εεT> is a (symmetric) error covariance matrix

E. Forms of (Gaussian) error covariances A B C D E F G H I E. Forms of (Gaussian) error covariances σ= √<ε2> The one-variable case The many variable case <x>

F. Link between Bayes’ Theorem and the variational cost function
Bayes’ theorem links the following:
the PDF of the observations (given the truth),
the PDF of the prior information (the background state),
the PDF of the state (given the observations – this is the objective of data assimilation).
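The equations on this slide are not reproduced in the transcript; for reference, the standard Gaussian form of this link (using B and R for the background and observation error covariances and x_b for the background state) is:

```latex
% Bayes' theorem for the state x given observations y:
p(\mathbf{x} \mid \mathbf{y}) \propto
    p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x}).

% With Gaussian observation and background errors,
%   p(y|x) ∝ exp[ -1/2 (y - h(x))^T R^{-1} (y - h(x)) ],
%   p(x)   ∝ exp[ -1/2 (x - x_b)^T B^{-1} (x - x_b) ],
% taking -ln of the posterior (and dropping constants) gives the cost function:
J(\mathbf{x}) =
  \tfrac{1}{2} (\mathbf{x} - \mathbf{x}_{\mathrm{b}})^{\mathrm{T}}
  \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_{\mathrm{b}})
  + \tfrac{1}{2} \bigl(\mathbf{y} - h(\mathbf{x})\bigr)^{\mathrm{T}}
  \mathbf{R}^{-1} \bigl(\mathbf{y} - h(\mathbf{x})\bigr),
\qquad
\mathbf{x}_{\mathrm{a}} = \arg\min_{\mathbf{x}} J(\mathbf{x}).
```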

G. Link between the variational cost function and the ‘BLUE’ formula
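Again the slide's equations are not in the transcript; the standard result it refers to, for a linear(ised) observation operator H, follows from setting the gradient of the cost function to zero:

```latex
% Gradient of the cost function for linear h(x) = H x:
\nabla J(\mathbf{x}) =
  \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}_{\mathrm{b}})
  - \mathbf{H}^{\mathrm{T}} \mathbf{R}^{-1} (\mathbf{y} - \mathbf{H}\mathbf{x}) = 0.

% Solving for the analysis x_a gives the BLUE (optimal interpolation) form:
\mathbf{x}_{\mathrm{a}} = \mathbf{x}_{\mathrm{b}}
  + \mathbf{K} \bigl(\mathbf{y} - \mathbf{H}\mathbf{x}_{\mathrm{b}}\bigr),
\qquad
\mathbf{K} = \mathbf{B}\mathbf{H}^{\mathrm{T}}
  \bigl(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm{T}} + \mathbf{R}\bigr)^{-1}.
```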

H. Example with a single observation A B C D E F G H I H. Example with a single observation Analysis increment of the assimilation of a direct observation of one variable. Obs of atmospheric pressure →

I. Forecast error covariance statistics
In data assimilation, prior information often comes from a forecast. Forecast error covariance statistics (P_f) specify how the forecast might be in error: ε_f = x_f − x_true, P_f = <ε_f ε_fᵀ>.
How could P_f be estimated for use in data assimilation?
Analysis of innovations (*).
Differences of varying-length forecasts.
Monte-Carlo method (*).
Forecast time lags.
Problems with the above methods.
(The methods marked (*) are expanded on the following slides.) A climatological average forecast error covariance matrix is called B.

I.1 Analysis of innovations A B C D E F G H I I.1 Analysis of innovations We don’t know the truth, but we do have observations of the truth with known error statistics. Definition of observation error : y = ytrue + εy = h(xtrue) + εy Definition of forecast error : xtrue = xf – εf Eliminate xtrue : y = h(xf – εf) + εy ≈ h(xf ) - Hεf + εy ‘Innovation’ : y - h(xf ) ≈ εy - Hεf LHS (known), RHS(unknown) Take pairs of in-situ obs whose errors are uncorrelated (for variable v1, posn r and v2, r+Δr) y(v1,r) - xf (v1,r) ≈ εy(v1,r) - εf(v1,r) y(v2,r +Δr) - xf (v2,r +Δr) ≈ εy(v2,r +Δr) - εf(v2,r +Δr) Covariances <[y(v1,r) - xf (v1,r)] [y(v2,r +Δr) - xf (v2,r +Δr)]> = <[εy(v1,r) - εf(v1,r)] [εy(v2,r +Δr) - εf(v2,r +Δr)]> = <εy(v1,r) εy(v2,r +Δr)> - <εy(v1,r) εf(v2,r +Δr)> - <εf(v1,r ) εy(v2,r +Δr)> + <εf(v1,r) εf(v2,r +Δr)> ↑ ↑ ↑ ↑ Obs error covariance Zero (obs and forecast errors Forecast error covariance between (v1, r) and (v2, r+Δr) uncorrelated) between (v1, r) and (v2, r+Δr) zero unless v1=v2 and Δr=0 (one particular matrix element of Pf or B) <> average over available observations and sample population of forecasts

I.2 Monte-Carlo method (ensembles) A B C D E F G H I I.2 Monte-Carlo method (ensembles) N members of an ensemble of analyses. Leads to N members of an ensemble of forecasts. The ensemble must capture the errors contributing to forecast errors. Initial condition errors (forecast/observation/assimilation errors from previous data assimilation). Model formulation errors (finite resolution, unknown parameters, …). Unknown forcing. Can be used to estimate the forecast error covariance matrix, e.g. Pf ≈ < (x-<x>) (x-<x>) T > = 1/(N-1) ∑i=1,N (xi - <x>) (xi - <x>)T Problem: for some applications N << n. n elements of the state vector (in Meteorology can be 107). N ensemble members (typically 102). Consequence – when Pf acts on a vector, the result is forced to lie in the sub-space spanned by the N ensemble members. Ensembles t x

Comments and questions?