
Higher-order Confidence Intervals for Stochastic Programming using Bootstrapping
Cosmin G. Petra
Joint work with Mihai Anitescu
Mathematics and Computer Science Division, Argonne National Laboratory
INFORMS Annual Meeting 2012

Outline
- Confidence intervals
- Motivation: statistical inference for the stochastic optimization of the power grid
- Our statistical estimator for the optimal value
- Bootstrapping
- Second-order bootstrapped confidence intervals
- Numerical example

Confidence intervals (CIs) for a statistic
- We want an interval [L, U] in which the statistic of interest resides with high probability.
- This requires knowledge of the probability distribution of the statistic.
- Example: confidence intervals for the mean of a Gaussian (normal) random variable.
(Figure: the normal distribution, also called the Gaussian or "bell curve" distribution. Image source: Wikipedia.)
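For the Gaussian example above, the equal-tailed 95% CI for the mean with known variance has the familiar closed form (a textbook reminder, not transcribed from the slide):

```latex
% Equal-tailed (1-alpha) CI for the mean of N(mu, sigma^2), known sigma,
% from N i.i.d. observations with sample mean \bar{X}_N:
\[
  \left[\, \bar{X}_N - z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{N}},\;
           \bar{X}_N + z_{1-\alpha/2}\,\frac{\sigma}{\sqrt{N}} \,\right],
  \qquad z_{0.975} \approx 1.96 .
\]
```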

Approximating CIs
- In many cases the distribution function is not known.
- Such intervals are then approximated based on the central limit theorem (CLT).
- Normal approximation for the equal-tailed 95% CI: replace the unknown mean and standard deviation by their sample counterparts, giving sample mean ± 1.96 × (sample standard deviation)/√N (a code sketch follows below).
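A minimal sketch of this normal-approximation CI in Python; the data and sample size below are made up for illustration and are not from the talk:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=3.0, size=30)   # illustrative, non-Gaussian sample

alpha = 0.05
z = stats.norm.ppf(1 - alpha / 2)              # ~1.96 for a 95% CI
mean = x.mean()
se = x.std(ddof=1) / np.sqrt(len(x))           # estimated standard error of the mean

lower, upper = mean - z * se, mean + z * se
print(f"95% normal-approximation CI: [{lower:.3f}, {upper:.3f}]")
```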

Optimal value in stochastic programming
- Stochastic programming (SP) problem and its sample average approximation (SAA).
- Properties of the SAA optimal value:
  - monotonically shrinking negative bias,
  - consistency,
  - arbitrarily slow convergence,
  - non-normal bias.
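For reference, a standard form of the SP problem, its SAA, and the well-known downward-bias and consistency properties of the SAA optimal value (the textbook statement; the slide's exact notation is not preserved in this transcript):

```latex
\[
  v^{*} \;=\; \min_{x \in X} \; \mathbb{E}_{\xi}\!\left[F(x,\xi)\right],
  \qquad
  \hat{v}_{N} \;=\; \min_{x \in X} \; \frac{1}{N}\sum_{i=1}^{N} F(x,\xi_{i}),
\]
\[
  \mathbb{E}\!\left[\hat{v}_{N}\right] \;\le\; \mathbb{E}\!\left[\hat{v}_{N+1}\right] \;\le\; v^{*},
  \qquad
  \hat{v}_{N} \xrightarrow{\;\mathrm{a.s.}\;} v^{*} \ \text{ as } N \to \infty .
\]
```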

Stochastic unit commitment with wind power
- Wind forecast: WRF (Weather Research and Forecasting) model
  - real-time, grid-nested 24h simulation;
  - 30 samples require 1h on 500 CPUs.
(Figure: wind farm and thermal generator. Slide courtesy of V. Zavala & E. Constantinescu.)

The specifics of stochastic optimization of energy systems
(Diagram: continuous uncertainty → sampling → discrete SAA problem → statistical inference.)
- Sampling the uncertainty is expensive.
- Only a small number of samples are available.

Standard methodology for stochastic programming – Linderoth, Shapiro, Wright (2004)
- Lower bound: CI for the expected SAA optimal value, built from M batches of N samples each.
- Upper bound: CI for the expected cost of a candidate solution (obtained similarly).
- Needs a relatively large number of samples (2MN).
- Correctness of a CI: order k if the coverage error relative to the nominal level is O(N^{-k/2}).
- These CIs are first-order correct and therefore unreliable for a small number of samples (a sketch of the lower-bound computation follows below).
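A sketch of the batch-means lower-bound CI from this methodology. It assumes a user-supplied `solve_saa(batch)` routine that returns the SAA optimal value for one batch of samples; the routine name and the use of a Student-t quantile over M batch optima are illustrative assumptions, not transcribed from the slides:

```python
import numpy as np
from scipy import stats

def lower_bound_ci(solve_saa, sample_batches, alpha=0.05):
    """Normal/t-approximation CI for E[v_hat_N], which lower-bounds the true
    optimal value, built from M independent batches of N samples each."""
    v = np.array([solve_saa(batch) for batch in sample_batches])  # M SAA optimal values
    m = len(v)
    mean = v.mean()
    se = v.std(ddof=1) / np.sqrt(m)
    t = stats.t.ppf(1 - alpha / 2, df=m - 1)
    return mean - t * se, mean + t * se
```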

Our approach for SP with small samples
1. A novel estimator
2. Bootstrapping
- The estimator converges one order faster than the SAA optimal value, except on a set whose measure converges exponentially to 0.
- This allows the construction of reliable CIs in the small-sample situation.
- Bootstrap CIs are second-order correct.
Reference: M. Anitescu, C. Petra, "Higher-Order Confidence Intervals for Stochastic Programming using Bootstrapping", submitted to Math. Prog.

The estimator
- L is the Lagrangian of the SP problem and J is the Jacobian of the constraints.
- The estimator is evaluated at the solution of the SAA problem, obtained using N samples.
- Intended for nonlinear recourse terms.
- Theorem 1 (Anitescu & Petra): under some regularity and smoothness conditions, the estimator attains the faster convergence rate claimed on the previous slide. Proof: based on the theory of large deviations.
- CIs constructed for the estimator are based on a second batch of N samples.
- A total of 2N samples is needed when using bootstrapping.

Bootstrapping – a textbook example
- Setting: the US population is known in 1920; the 1930 population is known only for a sample of 49 cities.
- Want: 1. an estimate of the 1930 US population; 2. CIs for this estimate.
- Solution:
  1. 1930 population ≈ 1920 population × mean of the city-wise 1930/1920 ratios;
  2. the CI needs the distribution of the mean ratio, but there are not enough samples → bootstrapping (see the sketch after this slide):
     - resample the existing samples (with replacement);
     - for each resample compute the mean;
     - this yields the bootstrap distribution;
     - build CIs based on the bootstrap distribution.
(Figure: histogram of the ratio of the 1930 and 1920 populations for N=49 US cities.)
- The "bootstrapped" distribution is clearly not Gaussian; bootstrap CIs outperform normal CIs.
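A minimal sketch of this resampling scheme; the ratio data below are synthetic stand-ins for the 49 city ratios, not the actual census numbers:

```python
import numpy as np

rng = np.random.default_rng(1)
ratios = rng.lognormal(mean=0.15, sigma=0.2, size=49)  # stand-in for 1930/1920 city ratios

B = 5000
boot_means = np.empty(B)
for b in range(B):
    resample = rng.choice(ratios, size=ratios.size, replace=True)  # sample with replacement
    boot_means[b] = resample.mean()

# Equal-tailed percentile bootstrap CI for the mean ratio
lower, upper = np.percentile(boot_means, [2.5, 97.5])
print(f"mean ratio: {ratios.mean():.3f}, 95% bootstrap CI: [{lower:.3f}, {upper:.3f}]")
```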

The methodology of bootstrapping
- BCa (bias-corrected and accelerated) confidence intervals:
  - second-order correct;
  - the method of choice when an accurate estimate of the variance is not available (a SciPy-based sketch follows below).
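SciPy provides BCa intervals out of the box; a short usage sketch on synthetic data (the data, resample count, and seed are arbitrary choices, not from the talk):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
data = rng.lognormal(mean=0.15, sigma=0.2, size=49)   # illustrative sample

res = stats.bootstrap((data,), np.mean,
                      confidence_level=0.95,
                      n_resamples=5000,
                      method="BCa",
                      random_state=rng)
print("BCa 95% CI:", res.confidence_interval)
```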

What does bootstrapping do?
- Edgeworth expansions for cdfs.
- Bootstrapping accounts also for the second term in the expansion.
- The quantiles are also second-order correct (Cornish–Fisher inverse expansions).
- (Some) bootstrapped CIs are second-order correct.
Reference: Peter Hall, "The Bootstrap and Edgeworth Expansion", 1994.
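The leading terms of the Edgeworth expansion referred to above, in the standard form found in Hall's book (a reminder, not copied from the slide):

```latex
% Edgeworth expansion of the cdf of the standardized sample mean
% S_N = sqrt(N) (\bar{X}_N - \mu)/\sigma; p_1 is a polynomial whose
% coefficients involve the skewness of the underlying distribution.
\[
  \Pr\!\left(S_N \le x\right)
  \;=\; \Phi(x) \;+\; N^{-1/2}\, p_1(x)\,\phi(x) \;+\; O\!\left(N^{-1}\right).
\]
% The bootstrap reproduces the N^{-1/2} term, so its approximation error is
% O(N^{-1}) instead of the O(N^{-1/2}) error of the plain normal approximation.
```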

Bootstrapping the estimator
Theorem 2 (Anitescu & Petra): let [L, U] be a second-order bootstrapping confidence interval for the estimator. Then, for any …

Numerical order of correctness
(Figure: the observed orders of correctness for the compared intervals are 0.32, 0.82, 1.14, and 2.11.)
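One common way to obtain such an observed order is a log-log fit of the empirical coverage error against the sample size; a hedged sketch of that post-processing step, with placeholder coverage numbers rather than the talk's actual results:

```python
import numpy as np

# Placeholder data: sample sizes and empirical coverage of a nominal 95% CI
N = np.array([25, 50, 100, 200, 400])
coverage = np.array([0.905, 0.925, 0.938, 0.944, 0.947])

# If the coverage error behaves like C * N^(-k/2), then k (the order of
# correctness) is minus twice the slope of log(error) versus log(N).
error = np.abs(coverage - 0.95)
slope, intercept = np.polyfit(np.log(N), np.log(error), deg=1)
order = -2.0 * slope
print(f"observed order of correctness: {order:.2f}")
```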

Coverage for a small number of samples

Concluding remarks and future work
- Proposed and analyzed a novel statistical estimator for the optimal value of nonlinear stochastic optimization problems.
- Almost second-order correct confidence intervals are obtained using bootstrapping.
- Theoretical properties confirmed by numerical testing.
- Some assumptions are rather strict and can/should be relaxed.
- Parallelization of the CI computations is needed for large problems.

Thank you for your attention! Questions?

Bootstrapping - theory
- Edgeworth expansions for pdfs.
- Bootstrapping also accounts for the second term in the expansion.
- Cornish–Fisher expansion for quantiles (obtained by inverting the Edgeworth expansion).
- Bootstrapped quantiles possess a similar expansion.
- But the two expansions agree up to the N^{-1/2} term, so the bootstrap quantile error is of order N^{-1}.
- Hence (some) bootstrap CIs are second-order correct (Hall's book is very detailed on this).
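The Cornish–Fisher form referred to above, in the standard textbook notation (p_{11} is a polynomial determined by the skewness; this is not transcribed from the slide):

```latex
% Cornish–Fisher expansion of the alpha-quantile w_alpha of the standardized
% sample mean, obtained by inverting the Edgeworth expansion:
\[
  w_{\alpha} \;=\; z_{\alpha} \;+\; N^{-1/2}\, p_{11}(z_{\alpha}) \;+\; O\!\left(N^{-1}\right).
\]
% The bootstrap quantile admits the same expansion with p_{11} replaced by an
% estimate that differs from it by O_p(N^{-1/2}), so the bootstrap quantile
% error is O_p(N^{-1}), one order better than the normal approximation.
```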