Verification of clustering properties of extreme daily temperatures in winter and summer using the extremal index in five downscaled climate models
José A. LÓPEZ, Climatological Techniques Unit, AEMET, Spain
EMMS & ECAM 2011, Berlin

Outline
Methodology
– Extremal Index θ: definition, example
– Estimation of θ
– Declustering procedure
– Bootstrapping technique for C.I.
– Data, deviation index
Verification for Dec-Jan lowest daily temperatures
– Observed θ values
– Statistics of verification of θ for AR4 models
– Some results including AR3 models
Verification for Jul-Aug highest daily temperatures
– ....
– ...

The Extremal Index: definition
The Extremal Index θ is a statistical measure of the clustering of extremes in a stationary series. It varies between 0 and 1, with 1 corresponding to absence of clustering (Poisson process).
Formal definition: let X(i), i = 1, ..., n, be a stationary series of random variables with cdf F (and F* = 1 - F), and define M(n) = max(X(i): 1 ≤ i ≤ n). The process X(i) has extremal index θ ∈ [0, 1] if for each τ > 0 there is a sequence of thresholds u(n) such that, as n → ∞,
a) n F*(u(n)) → τ (the mean number of exceedances tends to τ)
b) P(M(n) ≤ u(n)) → exp(-θ τ)
If θ = 1 the exceedances of progressively higher thresholds u(n) occur independently, i.e. they form a Poisson process (this is the case for independent r.v. X(i)).
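
As a short worked check of the definition, consider the independent case mentioned on this slide. For i.i.d. X(i), condition a) by itself already gives the limit in b) with θ = 1 (the notation follows the slide; the `align*` environment assumes the amsmath package):

```latex
\begin{align*}
P\bigl(M(n) \le u(n)\bigr)
  &= F\bigl(u(n)\bigr)^{n}
   = \Bigl(1 - F^{*}\bigl(u(n)\bigr)\Bigr)^{n} \\
  &= \Bigl(1 - \frac{\tau + o(1)}{n}\Bigr)^{n}
   \longrightarrow e^{-\tau} = e^{-\theta\tau}
   \quad \text{with } \theta = 1 .
\end{align*}
```

For a dependent series with θ < 1 the limit exp(-θ τ) is larger than exp(-τ): the maximum stays below the threshold more often, because the same mean number of exceedances is concentrated in fewer clusters.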

The Extremal Index: interexceedance times
The extremal index is the proportion of interexceedance times that may be regarded as intercluster times. This fact is used for declustering.

The Extremal Index: simulation with an ARMAX process
The ARMAX process is defined by:
X(i) = max(α X(i-1), (1 - α) Z(i)), 0 ≤ α < 1,
where the Z's are independent standard Fréchet variables, i.e. P(Z < x) = exp(-1/x).
This process has extremal index θ = 1 - α.
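
A minimal sketch of how such a series could be simulated, assuming the usual max-autoregressive recursion X(i) = max(α X(i-1), (1 - α) Z(i)), which is consistent with the extremal index θ = 1 - α stated on the slide. The function name `simulate_armax` and its parameters are illustrative and not taken from the presentation.

```python
import math
import random


def simulate_armax(n, alpha, seed=0):
    """Simulate n values of X(i) = max(alpha * X(i-1), (1 - alpha) * Z(i)).

    The Z(i) are independent standard Frechet variables,
    P(Z < x) = exp(-1/x). The recursion is an assumed (standard) form.
    """
    rng = random.Random(seed)

    def frechet():
        # Inverse-CDF sampling: U = exp(-1/Z)  =>  Z = -1 / log(U)
        u = rng.random()
        while u == 0.0:  # avoid log(0)
            u = rng.random()
        return -1.0 / math.log(u)

    # Start from the stationary law, which is itself standard Frechet
    x = frechet()
    series = []
    for _ in range(n):
        x = max(alpha * x, (1.0 - alpha) * frechet())
        series.append(x)
    return series


if __name__ == "__main__":
    # theta = 1 - alpha, so alpha = 0.5 corresponds to theta = 0.5
    sample = simulate_armax(10, alpha=0.5)
    print([round(v, 2) for v in sample])
```

With α = 0 the recursion reduces to independent Fréchet draws (θ = 1). Larger α makes a large value decay only geometrically over the following steps, which is what produces clusters of exceedances.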

The Extremal Index: simulation for θ = 1 (Poisson)

The Extremal Index: simulation for θ = 0.8

The Extremal Index: simulation for θ = 0.5

The Extremal Index: simulation for θ = 0.2

Estimation of the Extremal Index
If the T(i) are the successive times between exceedances of the high threshold u, the Extremal Index is estimated from them following Ferro, C.A.T. and Segers, J., "Inference for clusters of extreme values", J. R. Statist. Soc. B (2003), 65, Part 2.
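
The estimator formula itself did not survive in this transcript. The sketch below assumes it is the "intervals estimator" from the cited paper, computed from the interexceedance times alone. The function names are illustrative.

```python
def interexceedance_times(series, threshold, below=False):
    """Gaps (in time steps) between successive exceedances of `threshold`.

    With below=True, values under the threshold count as exceedances
    (the winter, low-temperature case).
    """
    if below:
        hits = [i for i, v in enumerate(series) if v < threshold]
    else:
        hits = [i for i, v in enumerate(series) if v > threshold]
    return [b - a for a, b in zip(hits, hits[1:])]


def extremal_index(times):
    """Intervals estimator of theta (assumed form, see lead-in).

    `times` holds the interexceedance times T(i); its length is the
    number of exceedances minus one.
    """
    n = len(times)
    if n == 0:
        raise ValueError("need at least two exceedances")
    if max(times) <= 2:
        num = 2.0 * sum(times) ** 2
        den = n * sum(t * t for t in times)
    else:
        # Bias-corrected version used when some gap exceeds 2
        num = 2.0 * sum(t - 1 for t in times) ** 2
        den = n * sum((t - 1) * (t - 2) for t in times)
    return min(1.0, num / den)  # theta cannot exceed 1


if __name__ == "__main__":
    gaps = [1, 1, 1, 10, 1, 1, 20]  # two long gaps, five short ones
    print(round(extremal_index(gaps), 3))
```

Many short gaps mixed with a few long ones pull the estimate below 1, while evenly spaced exceedances give an estimate capped at 1.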

Declustering procedure
Objective: define the clusters in a series of exceedances.
The times between exceedances are classified as inter-cluster times or intra-cluster times (belonging to the same cluster) according to their length.
The criterion used is "objective" and simple: it depends only on the Extremal Index θ. More specifically, the longest θN inter-exceedance times are classed as inter-cluster times, and the rest as intra-cluster times.
Between two successive inter-cluster times there is a set (which may be empty) of intra-cluster times.
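
The classification described on this slide can be sketched as follows. Two details are assumptions of this sketch, not statements from the slide: θN is rounded down to an integer count, and ties between equal gaps are broken by order of occurrence. The function name `decluster` is illustrative.

```python
def decluster(times, theta):
    """Split interexceedance times into inter- and intra-cluster times.

    The k = floor(theta * N) longest gaps are treated as inter-cluster
    times (rounding rule assumed). Returns the inter-cluster times in
    order of occurrence and the list of sets of intra-cluster times,
    one set (possibly empty) per cluster.
    """
    n = len(times)
    k = min(n, int(theta * n))
    # Indices of the k longest gaps; sorted() is stable, so ties keep
    # their original order
    order = sorted(range(n), key=lambda i: times[i], reverse=True)
    inter_idx = set(order[:k])

    inter_times = []
    intra_sets = [[]]  # gaps inside the first cluster
    for i, t in enumerate(times):
        if i in inter_idx:
            inter_times.append(t)
            intra_sets.append([])  # a new cluster starts after this gap
        else:
            intra_sets[-1].append(t)
    return inter_times, intra_sets


if __name__ == "__main__":
    inter, intra = decluster([1, 1, 1, 10, 1, 1, 20], theta=0.3)
    print(inter)  # the two longest gaps
    print(intra)  # three clusters, the last with no internal gaps
```

There is always one more set of intra-cluster times than there are inter-cluster times, since the inter-cluster gaps separate consecutive clusters.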

Bootstrapping technique
In order to build confidence intervals for the θ of a series, a "bootstrapping" technique was used:
a) Sample with replacement successively from the set of inter-cluster times, and then from the set of sets of intra-cluster times, to build a fictitious process
b) Compute the θ of this fictitious process
c) Repeat the above steps the desired number of times to build the confidence interval
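
Steps a) to c) can be sketched as below. Several choices here are assumptions rather than details given in the presentation: the intervals estimator for θ, rounding θN down when declustering, interleaving one resampled cluster between resampled inter-cluster gaps, and a simple percentile interval. All function names are hypothetical.

```python
import random


def theta_hat(times):
    """Intervals estimator of theta (assumed form)."""
    n = len(times)
    if max(times) <= 2:
        num, den = 2.0 * sum(times) ** 2, n * sum(t * t for t in times)
    else:
        num = 2.0 * sum(t - 1 for t in times) ** 2
        den = n * sum((t - 1) * (t - 2) for t in times)
    return min(1.0, num / den)


def split(times, theta):
    """Longest floor(theta * N) gaps become inter-cluster times."""
    n = len(times)
    order = sorted(range(n), key=lambda i: times[i], reverse=True)
    inter_idx = set(order[:int(theta * n)])
    inter, intra = [], [[]]
    for i, t in enumerate(times):
        if i in inter_idx:
            inter.append(t)
            intra.append([])
        else:
            intra[-1].append(t)
    return inter, intra


def bootstrap_theta_ci(times, n_boot=1000, alpha=0.05, seed=0):
    """Return (theta, lower, upper) with a percentile bootstrap interval."""
    rng = random.Random(seed)
    theta = theta_hat(times)
    inter, intra = split(times, theta)
    if not inter:
        raise ValueError("no inter-cluster times to resample")
    estimates = []
    for _ in range(n_boot):
        # a) fictitious process: cluster, gap, cluster, gap, ..., cluster
        fake = list(rng.choice(intra))
        for _ in range(len(inter)):
            fake.append(rng.choice(inter))
            fake.extend(rng.choice(intra))
        # b) theta of the fictitious process
        estimates.append(theta_hat(fake))
    # c) percentile interval from the sorted bootstrap estimates
    estimates.sort()
    lower = estimates[int(alpha / 2 * (n_boot - 1))]
    upper = estimates[int((1 - alpha / 2) * (n_boot - 1))]
    return theta, lower, upper


if __name__ == "__main__":
    gaps = [1, 1, 1, 10, 1, 1, 20, 1, 15, 1, 1, 30]
    print(bootstrap_theta_ci(gaps, n_boot=1000))
```

Resampling whole sets of intra-cluster times, rather than individual gaps, keeps the internal structure of each cluster intact in the fictitious process.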

Data and models used
Period:
Data used: observed and downscaled daily temperature at 16 observatories in Spain
AR4 models: cccma-cgcm3 (CA), gfdl-cm2 (US), inmcm3 (RU), mpi-echam5 (AL), mri-cgcm2 (JA)
AR3 models: ECHAM4, HadAM3, CGCM2
The statistical downscaling technique was analog-based.

Verification of the Extremal Index in extreme temperature for downscaled climate models
The thresholds used to build the exceedances (on 15-day moving windows):
– 90th percentile for Jul-Aug
– 10th percentile for Dec-Jan (in this case the values below the threshold are taken)
In order to assess the differences in θ between observations and downscaled data a deviation index was used, in which the medians and the IQR were computed from 1000 bootstrap samples.
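
The slide does not say how the 15-day moving windows were applied. One plausible reading, sketched here purely as an assumption, pools for each day of the season all values within 7 days on either side, across all years, and takes the percentile of that pool. Truncating the window at the season edges and interpolating the percentile linearly are also choices of this sketch, and the function names are hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0..100) with linear interpolation."""
    s = sorted(values)
    pos = q / 100.0 * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def moving_window_thresholds(years, q, half_width=7):
    """Day-by-day thresholds from a centred moving window.

    `years` is a list of seasons, each a list of daily temperatures of
    equal length. half_width=7 gives a 15-day window, truncated at the
    start and end of the season (an assumption).
    """
    n_days = len(years[0])
    thresholds = []
    for d in range(n_days):
        a = max(0, d - half_width)
        b = min(n_days, d + half_width + 1)
        pooled = [v for season in years for v in season[a:b]]
        thresholds.append(percentile(pooled, q))
    return thresholds


if __name__ == "__main__":
    # Three identical synthetic seasons with a linear warming ramp
    seasons = [[float(d) for d in range(62)] for _ in range(3)]
    thr = moving_window_thresholds(seasons, q=90)
    print(thr[30])
```

A day-dependent threshold of this kind removes the seasonal cycle from the definition of an extreme, so that exceedances are not concentrated in the warmest (or coldest) part of the two-month season.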

Dec-Jan (occurrences below the 10th percentile of daily temperature)

Observed values of θ Dec-Jan (in percent)
Median = 37, Max = 57, Min = 23

Observed values of θ Dec-Jan: values above (1) and below (-1) the median

Observed values of θ Dec-Jan: spatial distribution
Lowest values of θ (more clustering) in the NE and interior
Highest values of θ (less clustering) in the western half

Verification of θ for AR4 downscaled models Dec-Jan
Histogram of absolute deviation index of θ (on the y-axis number of observatories, on the x-axis accumulated frequencies)
Average absolute deviation index: CA (1.3), US (2.3), RU (1.3), AL (0.8), JA (1.2)
Average deviation index: CA (0.9), US (2.3), RU (0.2), AL (-0.3), JA (-0.2)

Verification of θ for AR4 downscaled models Dec-Jan: leading models
At each observatory, the downscaled model that leads the others in terms of absolute deviation index is shown (in no case does it lead by more than 1.0).

Verification of θ for downscaled models in Dec-Jan: leading models, AR4 + 3 AR3 models
Average deviation index AR3: EC (-2.3), HA (-1.5), CG (-0.5)
Average deviation index AR4: CA (0.9), US (2.3), RU (0.2), AL (-0.3), JA (-0.2)
Four of the AR4 models show little or moderate global bias in θ, whereas among the AR3 models only one shows little bias (the rest show more clustering).

Jul-Aug (occurrences above the 90th percentile of daily temperature)

Observed values of θ Jul-Aug (in percent)
Median = 48, Max = 81, Min = 36

Observed values of θ Jul-Aug: values above (1) and below (-1) the median

Observed values of θ Jul-Aug: spatial distribution
It is more difficult than in the Dec-Jan case to discern spatial patterns of the θ index.
The northern coast and the observatories on the Iberian mountain range show above-average θ values (less clustering).
The contrary (more clustering) happens at the NE extreme (Catalonia).

Verification of θ for AR4 downscaled models Jul-Aug
Histogram of absolute deviation index of θ (on the y-axis number of observatories, on the x-axis accumulated frequencies)
Average absolute deviation index: CA (1.6), US (1.8), RU (3.0), AL (1.6), JA (1.3)
Average deviation index: CA (-1.4), US (-1.6), RU (-2.9), AL (-1.2), JA (-0.5)

Verification of θ for AR4 downscaled models Jul-Aug
All the downscaled AR4 models show a bias towards excessive clustering in Jul-Aug (the Japanese model only slightly), though less than the three AR3 models.

Verification of θ for AR4 downscaled models Jul-Aug: leading models
At each observatory, the downscaled model that leads the others in terms of absolute deviation index is shown (with an asterisk when the difference from the others is > 1.0).

Verification of θ for downscaled models in Jul-Aug: leading models, AR4 + 3 AR3 models
Average deviation index AR3: EC (-2.9), HA (-4.1), CG (-2.4)
Average deviation index AR4: CA (-1.4), US (-1.6), RU (-2.9), AL (-1.2), JA (-0.5)
There is a clear decrease in the amount of bias (excess clustering) in the AR4 models with respect to AR3.

END