COMPUTATIONAL RESEARCH DIVISION
You need how many runs?!
Michael F. Wehner, Lawrence Berkeley National Laboratory



How many runs should we make?
• The answer to this question has always been: as many as you can afford.
• A more quantitative reply is possible if the question is more specific:
  • How many realizations are necessary to know the mean value of a field to within a specified tolerance and statistical certainty?
  • Or: how many realizations are necessary to know that differences between models are statistically significant?

How many runs?
• Q. What is the minimum number of realizations (n) required to estimate the model mean output within a specified tolerance (E) and statistical confidence (α)?
• A. For a Gaussian distributed random variable: [equation not preserved in the transcript]
  • s² = sample variance, σ² = population variance
  • N = number of available realizations
  • Z and χ are properties of the Gaussian function and α
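The slide's own equation is missing from the transcript. As a rough illustration only, the textbook sample-size estimate for the mean of a Gaussian variable, n = (Z s / E)², can be sketched as below. Whether the talk used exactly this form (for example, with a further correction for the uncertainty in s) is an assumption, and the function name `required_runs` is hypothetical.

```python
import math
from statistics import NormalDist

def required_runs(sample_std, tolerance, confidence=0.95):
    """Smallest n with Z * s / sqrt(n) <= E (textbook Gaussian estimate)."""
    # Two-sided critical value Z for the requested confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sample_std / tolerance) ** 2)

# Hypothetical inputs: s = 1.0 K, E = 0.5 K, 95% confidence.
print(required_runs(sample_std=1.0, tolerance=0.5))
```

Because n scales as 1/E², halving the tolerance roughly quadruples the number of runs under this estimate.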

How many runs?
• The number of runs required depends on:
  • Which fields are deemed important.
  • How well defined they need to be: statistical certainty, tolerance.
  • What scale is needed: temporal, spatial.
  • The internal variability of the model.

Chickens and Eggs
• But how can we use this formula to estimate ensemble size before we perform the integrations?
• How do we estimate the ensemble variance?
  • If N = 20 and α = 95%, then 0.58s² < σ² < 2.1s².
• The answer lies in postulating ergodicity of the climate system.
  • For example, the modeled system is considered ergodic if the inter-realization variance of the mean of each decade from an ensemble of transient runs is statistically identical to the variance of the decadal mean from a long stationary control run.
• If the model is ergodic, we can use the control run sample variance estimate in the equation for n.
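The bounds quoted for N = 20 are consistent with the standard chi-squared confidence interval for a Gaussian variance, (N−1)s²/χ²_upper < σ² < (N−1)s²/χ²_lower. The sketch below assumes that is how they were obtained. It uses only the standard library, so the chi-squared quantile is found by bisection on a series for the CDF; `scipy.stats.chi2.ppf` would be the usual shortcut. All function names are hypothetical.

```python
import math

def chi2_cdf(x, dof):
    """Chi-squared CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, half_x = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= half_x / (a + k)
        total += term
        k += 1
    return total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))

def chi2_quantile(p, dof):
    """Invert the CDF by bisection on a bracket wide enough for typical p."""
    lo, hi = 0.0, dof + 20.0 * math.sqrt(2.0 * dof) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def variance_ci_factors(n_runs, confidence=0.95):
    """Return (a, b) such that a * s^2 < sigma^2 < b * s^2."""
    dof = n_runs - 1
    tail = (1.0 - confidence) / 2.0
    return (dof / chi2_quantile(1.0 - tail, dof),
            dof / chi2_quantile(tail, dof))

for n in (9, 20, 50):
    low, high = variance_ci_factors(n)
    print(n, round(low, 2), round(high, 2))
```

The interval narrows slowly with N, which is why a small ensemble gives only a loose handle on the variance that the sample-size estimate depends on.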

Is the modeled climate ergodic on decadal time scales?
• Nine transient runs (20c3m)
  • N = 9; 0.45s² < σ² < 3.7s²
• 500 years of control run 2 (picntrl)
  • N = 50; 0.69s² < σ² < 1.5s²
• F-test at 90% confidence
  • No significant difference
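A two-sided F-test for equal variances compares the ratio of the two sample variances against the F distribution with (N₁−1, N₂−1) degrees of freedom. The sketch below is one way to carry that out with the standard library alone. It integrates the incomplete beta function with Simpson's rule, which is an approximation that assumes both degrees of freedom are at least 2; `scipy.stats.f.cdf` would normally replace it. The function names and the variance values are hypothetical, since the slide reports only the outcome.

```python
import math

def _beta_integral(x, a, b, steps=2000):
    """Simpson's rule for the integral of t^(a-1) * (1-t)^(b-1) over [0, x]."""
    h = x / steps
    f = lambda t: t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)
    total = f(0.0) + f(x)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0

def f_cdf(f_value, dof1, dof2):
    """Approximate CDF of the F distribution; assumes dof1 >= 2 and dof2 >= 2."""
    a, b = dof1 / 2.0, dof2 / 2.0
    x = dof1 * f_value / (dof1 * f_value + dof2)
    return _beta_integral(x, a, b) / _beta_integral(1.0, a, b)

def variances_differ(var1, n1, var2, n2, confidence=0.90):
    """Two-sided F-test: True if the variance ratio is significant."""
    cdf = f_cdf(var1 / var2, n1 - 1, n2 - 1)
    p_value = 2.0 * min(cdf, 1.0 - cdf)
    return p_value < 1.0 - confidence

# Hypothetical sample variances for 9 transient decades and 50 control decades.
print(variances_differ(var1=0.012, n1=9, var2=0.010, n2=50))
```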

Is the modeled climate ergodic?
• Decadal mean annual surface air temperature
• E = 0.5 K, α = 95%
• Centered pattern correlation = 0.95
[Figure panel label: Control Run]

Is the modeled climate ergodic?
• Decadal mean annual precipitation
• E = 10% of the mean value, α = 95%
• Centered pattern correlation = 0.96
[Figure panel label: Control Run]
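A centered pattern correlation is the correlation between two maps after each map's spatial mean has been removed. The sketch below assumes the means and sums are area-weighted by the cosine of latitude, which is a common convention but is not stated on the slides. The function name and the three-point "maps" are hypothetical.

```python
import math

def centered_pattern_correlation(field_a, field_b, weights):
    """Weighted correlation of two maps after removing each map's weighted mean."""
    total_w = sum(weights)
    mean_a = sum(w * a for w, a in zip(weights, field_a)) / total_w
    mean_b = sum(w * b for w, b in zip(weights, field_b)) / total_w
    cov = sum(w * (a - mean_a) * (b - mean_b)
              for w, a, b in zip(weights, field_a, field_b))
    var_a = sum(w * (a - mean_a) ** 2 for w, a in zip(weights, field_a))
    var_b = sum(w * (b - mean_b) ** 2 for w, b in zip(weights, field_b))
    return cov / math.sqrt(var_a * var_b)

# Hypothetical flattened maps at latitudes 0, 30 and 60 degrees.
lats = [0.0, 30.0, 60.0]
weights = [math.cos(math.radians(lat)) for lat in lats]
print(centered_pattern_correlation([1.0, 2.0, 4.0], [1.5, 2.0, 5.0], weights))
```

A value near 1 means the two maps share the same spatial pattern even if their overall magnitudes differ, so it complements rather than replaces a comparison of the variances themselves.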

Strong seasonal dependence
• Decadal mean seasonal surface air temperature
• E = 0.5 K, α = 95%
[Figure panel labels: DJF, JJA]

Strong seasonal dependence
• Decadal mean seasonal precipitation
• E = 10% of the mean value, α = 95%
[Figure panel labels: DJF, JJA]

What about interannual time scales?
• Pretty hopeless to determine an annual or seasonal mean at these accuracies for single gridpoints.
• Either relax the accuracy or spatially average.

What about interannual time scales?

Other considerations
• Double these estimates to evaluate differences between scenarios to the same accuracy.
• Extreme events
  • ~10 runs to estimate the 20-year return value of annual daily extrema
• Pair control runs with transient runs
  • A clever way to account for drift and/or to initialize.
  • Doubles the number of runs.
• The variability of the new model may be different from that of the current model.
  • PCM variability is considerably larger than that of CCSM3.0.
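The doubling rule for scenario differences follows from a short variance argument. The derivation below is a sketch that assumes the two ensembles are independent, have the same variance σ² and the same size n, and that the single-scenario estimate has the textbook form n = (Zσ/E)²; none of these assumptions is spelled out on the slide.

```latex
\operatorname{Var}\!\left(\bar{x}_A - \bar{x}_B\right)
  = \frac{\sigma^2}{n} + \frac{\sigma^2}{n} = \frac{2\sigma^2}{n},
\qquad
Z\sqrt{\frac{2\sigma^2}{n}} \le E
\;\Longrightarrow\;
n \ge 2\left(\frac{Z\sigma}{E}\right)^{2}.
```

Each scenario therefore needs about twice as many runs as a single mean would at the same tolerance E and confidence.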

How many runs?
• In the absence of a clearly defined set of specifications:
• The final answer …

• Remains
• As many as you can!