BioSS reading group, Adam Butler, 21 June 2006. Allen & Stott (2003), Estimating signal amplitudes in optimal fingerprinting, part I: theory. Climate Dynamics, 21,


1: Introduction. Optimal fingerprinting is a family of statistical methods for climate change detection and attribution. It attempts to assess the extent to which spatial and temporal patterns in observed climate data are related to corresponding patterns in the output of climate models. It assumes that climate variability is independent of externally forced signals of climate change.

“attribution of observed climate change to a given combination of human activity and natural influences… requires careful assessment of multiple lines of evidence to demonstrate, within a pre-specified margin of error, that the observed changes are: unlikely to be due entirely to natural variability; consistent with the estimated responses to the given combination of anthropogenic and natural forcing; and not consistent with alternative explanations of recent climate change that exclude important elements of the given combination of forcings.”

The current paper. Optimal fingerprinting is just a particular take on multiple regression. The current paper attempts to deal with one element of climate model uncertainty. It does this by replacing ordinary least squares (OLS) with total least squares (TLS), a standard approach to “errors-in-variables” problems.

Model uncertainty. A+S define sampling uncertainty to be “the variability in the model-simulated response which would be observed if the ensemble of simulations were repeated with an identical model and forcing and different initial conditions…”. They argue that this limited definition is difficult to generalise in practice...

Avoiding model uncertainty. Restrict attention to mid-21st-century estimates: the signal-to-noise ratio is by then so high that inter-ensemble variation is unimportant. Use a purely correlative approach. Use a noise-free model, such as an energy balance model, to simulate the response pattern. Use a large number of ensemble runs.

Problems. Standard optimal fingerprinting uses OLS, whose estimates can be severely biased towards zero when there are errors in the explanatory variables. This bias is particularly problematic when estimating the upper limits of uncertainty intervals (Fig. 1).
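For intuition, the direction of this bias can be seen in the simplest single-regressor case. The expression below is the textbook attenuation result, not a formula taken from the slides; the notation is an assumption: \sigma_x^2 is the variance of the noise-free signal, \sigma_u^2 is the variance of independent noise contaminating the regressor, and the noise in y is independent of both.

```latex
\hat{\beta}_{\mathrm{OLS}} \;\xrightarrow{\;p\;}\; \beta \, \frac{\sigma_x^2}{\sigma_x^2 + \sigma_u^2}
```

The factor is below one whenever \sigma_u^2 > 0, so the estimate, and with it the upper end of any interval built around it, is pulled towards zero.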

2.1: Optimal fingerprinting. Basic model: [equation missing]. “Pre-whitening”: find a matrix P such that [equation missing]. The rank of P is typically [much] smaller than the length of y.
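The equations on this slide did not survive in the transcript. A standard way of writing the regression model and the pre-whitening condition is sketched below; the symbols u (noise), C_N (noise covariance) and \kappa (rank of P) are assumed notation and may differ from the slide.

```latex
\begin{aligned}
y &= X\beta + u, \qquad C_N \equiv \mathrm{E}\!\left(u\,u^{\mathsf T}\right), \\
\mathrm{E}\!\left(P\,u\,u^{\mathsf T} P^{\mathsf T}\right) &= P\,C_N\,P^{\mathsf T} = I_{\kappa}.
\end{aligned}
```

Here y holds the observations, the columns of X hold the model-simulated response patterns, and P has \kappa rows, with \kappa typically much smaller than the length of y.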

Minimise [equation missing]. The pre-whitened noise is IID, so the solution is [equation missing] (ordinary least squares). Compute confidence intervals based on standard asymptotic distributions…
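A minimal numerical sketch of pre-whitening followed by OLS is given below. It assumes that P is built from the leading eigenvectors of a sample noise covariance, each scaled by one over the square root of its eigenvalue; the function names and the truncation argument `rank` are hypothetical, introduced only for illustration.

```python
import numpy as np

def prewhitening_operator(noise_samples, rank):
    """Build P from noise samples (rows = independent noise segments).

    P projects onto the `rank` leading eigenvectors of the sample noise
    covariance C and rescales them so that P @ C @ P.T is the identity.
    """
    cov = np.cov(noise_samples, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # ascending eigenvalues
    lead = np.argsort(eigvals)[::-1][:rank]           # indices of the largest ones
    return (eigvecs[:, lead] / np.sqrt(eigvals[lead])).T   # shape (rank, n)

def prewhitened_ols(X, y, P):
    """OLS on pre-whitened data: minimise ||P y - P X beta||^2 over beta."""
    PX, Py = P @ X, P @ y
    beta, *_ = np.linalg.lstsq(PX, Py, rcond=None)
    return beta
```

Using `np.linalg.lstsq` rather than forming the normal equations explicitly is a numerical convenience; it returns the same minimiser when P X has full column rank.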

2.2: Noise variance unknown. Ignoring uncertainty in the estimated noise properties can lead to “artificial skill”. Solution: base the uncertainty analysis on sets of noise realisations that are statistically independent of those used to estimate P. Such realisations can be obtained from segments of a control run of a climate model. Elements are not mutually independent…

3: Errors in variables. Extended model: [equation missing]. Pre-whitening: [equation missing]. Seek to solve [equation missing] (Fig. 2).

3.1: Total least squares: estimation of β. Seek to minimise: [equation missing]
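The errors-in-variables model and the quantity being minimised were lost from the transcript. One consistent way to write them is sketched below. The notation is assumed rather than recovered: U is the noise in the model-simulated patterns, u_0 is the noise in the observations, and the columns of Z are taken to be scaled so that their noise has equal variance.

```latex
\begin{aligned}
P y &= P\,(X - U)\,\beta + P\,u_0, \\
Z &\equiv [\,P X,\; P y\,], \qquad v \equiv \begin{pmatrix} \beta \\ -1 \end{pmatrix}, \\
s^2 &= \min_{\tilde v}\; \tilde v^{\mathsf T} Z^{\mathsf T} Z\, \tilde v
\quad \text{subject to} \quad \tilde v^{\mathsf T} \tilde v = 1 .
\end{aligned}
```

With this convention the noise-free version of Z satisfies Z v = 0, so the task is to find the unit vector most nearly orthogonal to every row of the noisy Z.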

The solution to the corresponding eigenequation takes s^2 to be the smallest eigenvalue of Z^T Z and takes the solution vector to be the corresponding eigenvector. Use a singular value decomposition. Can show that [equation missing].
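The eigenvector recipe can be sketched with a singular value decomposition. The example assumes X and y have already been pre-whitened and scaled to equal noise variance, and that Z = [X, y] has at least as many rows as columns; the function name `tls_fit` is hypothetical.

```python
import numpy as np

def tls_fit(X, y):
    """Total least squares fit of y on X via the SVD of Z = [X, y].

    Returns (beta, s2), where s2 is the smallest eigenvalue of Z^T Z and
    beta comes from the matching right singular vector, rescaled so that
    its last element equals -1.
    """
    Z = np.column_stack([X, y])
    _, sing, Vt = np.linalg.svd(Z, full_matrices=False)
    v = Vt[-1]                  # right singular vector of the smallest singular value
    s2 = sing[-1] ** 2          # equals the smallest eigenvalue of Z^T Z
    beta = -v[:-1] / v[-1]      # v is proportional to [beta, -1]
    return beta, s2
```

Working from the SVD of Z avoids forming Z^T Z explicitly, which is generally better conditioned numerically.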

“…in geometric terms minimising s^2 is equivalent to finding the m-dimensional plane in an m′-dimensional space which minimises the sum squared perpendicular distance from the plane to the k points defined by the rows of Z…” (Adcock, 1878)

3.2: Total least squares: unknown noise variance. If the same runs are used to derive P and to construct CIs about estimates of β, then uncertainty will again be underestimated. As in standard optimal fingerprinting, uncertainty in the noise variance can be accounted for by using a set of independent control runs…

3.3: Open-ended confidence intervals. The β quantify the ratio of the observed to the model-simulated responses. In TLS we estimate the angle of the slope relating observations to model response. Highly asymmetric confidence intervals can result when transforming back to the β scale via tan(angle); intervals can even contain infinity.
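A small sketch of why the back-transformation produces asymmetric or open-ended intervals, for a single signal. It assumes a symmetric interval on the angle scale and an estimated angle strictly between -pi/2 and pi/2; the function name is hypothetical, and reporting an infinite bound when the angle interval reaches pi/2 is a simplification of the wrap-around that actually occurs there.

```python
import math

def beta_interval_from_angle(theta_hat, half_width):
    """Map the angle interval theta_hat +/- half_width (radians) to beta = tan(theta).

    Returns (lower, estimate, upper). A bound is reported as infinite when the
    angle interval reaches +/- pi/2, where tan(theta) diverges.
    """
    lo, hi = theta_hat - half_width, theta_hat + half_width
    lower = math.tan(lo) if lo > -math.pi / 2 else -math.inf
    upper = math.tan(hi) if hi < math.pi / 2 else math.inf
    return lower, math.tan(theta_hat), upper
```

For an angle near pi/4 the upper arm of the β interval is already longer than the lower arm, and as the angle interval approaches pi/2 the upper bound grows without limit.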

4: Application to a chaotic system. The non-linear system of Palmer & Lorenz, which corresponds to low-order deterministic chaos: [equations missing]
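The system's equations did not survive in the transcript. The sketch below assumes the commonly cited form of the forced Lorenz (1963) system used by Palmer (1999), in which a constant forcing of amplitude f0 and direction theta is added to the X and Y equations. The parameter defaults are the classical Lorenz values, and the fixed-step RK4 integrator is an illustrative choice rather than the authors' setup.

```python
import numpy as np

def forced_lorenz_rhs(state, f0, theta, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) equations with a constant forcing
    of amplitude f0 acting in direction theta of the X-Y plane (assumed form)."""
    x, y, z = state
    return np.array([
        -sigma * x + sigma * y + f0 * np.cos(theta),
        -x * z + r * x - y + f0 * np.sin(theta),
        x * y - b * z,
    ])

def integrate(state0, f0, theta, dt=0.01, n_steps=5000):
    """Integrate with classical fourth-order Runge-Kutta; returns an
    array of shape (n_steps + 1, 3) holding the trajectory."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for i in range(n_steps):
        s = traj[i]
        k1 = forced_lorenz_rhs(s, f0, theta)
        k2 = forced_lorenz_rhs(s + 0.5 * dt * k1, f0, theta)
        k3 = forced_lorenz_rhs(s + 0.5 * dt * k2, f0, theta)
        k4 = forced_lorenz_rhs(s + dt * k3, f0, theta)
        traj[i + 1] = s + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return traj
```

Setting f0 to zero recovers the unforced system, which is the analogue of a control run; averaging the trajectory over windows of different lengths gives the different aggregation levels.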

Some properties of the Palmer model. Radically different properties at different aggregation levels (Figs. 3 & 4). The sign of the response in the X direction depends on the amplitude of the forcing (Fig. 5). Variability at fine resolution changes due to forcing with a plausible amplitude, but variability at coarser resolution does not…

A+S choose this system because: it is a plausible model of true climate (“…Palmer (1999) observed that climate change is a nonlinear system which could also be thought of as a change in the occupancy statistics of certain preferred ‘weather regimes’ in response to external forcing…”); and optimal fingerprinting may be expected to have problems with the nonlinearity.

Use the Palmer model to simulate: 1) pseudo-observations y under a linear increase in forcing from 0 to 5 units; 2) spatio-temporal response patterns X for a set of ensemble runs; 3) the level of internal variability, using an unforced control run from the model.

Investigate the performance of OLS and TLS for different numbers of ensembles and different averaging periods (50 Ld or 500 Ld). Figure 6: look at the (true) hypothesis β = 1. OLS consistently underestimates the observed response amplitude for a small number of ensembles.
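The qualitative behaviour can be reproduced with a much simpler toy experiment than the Palmer model. The sketch below assumes a single linear signal, unit-variance Gaussian noise in y, and a regressor formed as an ensemble mean whose noise variance is 1/n_ens. Before applying TLS the regressor is scaled by sqrt(n_ens) so that both columns have equal noise variance. All names and settings are hypothetical and are not those of the paper's Figure 6.

```python
import numpy as np

def ols_slope(x, y):
    """OLS slope through the origin."""
    return float(x @ y / (x @ x))

def tls_slope(x, y, n_ens):
    """TLS slope after scaling x by sqrt(n_ens) to equalise noise variances."""
    Z = np.column_stack([np.sqrt(n_ens) * x, y])
    v = np.linalg.svd(Z, full_matrices=False)[2][-1]
    return float(-np.sqrt(n_ens) * v[0] / v[1])   # undo the scaling of x

def simulate(n_ens, n_trials=2000, k=50, beta=1.0, seed=0):
    """Return arrays of OLS and TLS slope estimates over repeated trials."""
    rng = np.random.default_rng(seed)
    signal = np.linspace(0.0, 3.0, k)              # noise-free response pattern
    ols, tls = np.empty(n_trials), np.empty(n_trials)
    for t in range(n_trials):
        y = beta * signal + rng.normal(size=k)                # pseudo-observations
        x = signal + rng.normal(size=k) / np.sqrt(n_ens)      # ensemble-mean pattern
        ols[t] = ols_slope(x, y)
        tls[t] = tls_slope(x, y, n_ens)
    return ols, tls
```

Comparing `simulate(1)` with `simulate(16)` shows the OLS estimates moving towards the true value as the ensemble grows, while the TLS estimates stay centred near it at the cost of a wider spread.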

5: Discussion. TLS is promoted as an approach to attribution problems when few ensembles are available. It is most relevant for low signal-to-noise ratios. It is linear: it relies on the assumption that forcing does not change the level of climate variability. It performs well relative to optimal fingerprinting (OF) with OLS in simulations under deterministic chaos.