Statistics for HEP Roger Barlow Manchester University Lecture 4: Confidence Intervals.

Slides:



Advertisements
Similar presentations
C) between 18 and 27. D) between 27 and 50.
Advertisements

Confidence Intervals Chapter 19. Rate your confidence Name Mr. Holloways age within 10 years? within 5 years? within 1 year? Shooting a basketball.
Quantitative Methods Topic 5 Probability Distributions
Estimating Population Values
Research Methods in Politics Chapter 14 1 Research Methods in Politics 14 Understanding Inferential Statistics.
Copyright © by Holt, Rinehart and Winston. All Rights Reserved. Objectives Use z-scores to find percentiles. Thinking Skill: Explicitly assess information.
§ 6.6 Rational Equations.
EuroCondens SGB E.
Copyright © 2014 by McGraw-Hill Higher Education. All rights reserved.
Addition and Subtraction Equations
BUS 220: ELEMENTARY STATISTICS
Statistical inference Ian Jolliffe University of Aberdeen CLIPS module 3.4b.
Copyright © 2010 Pearson Education, Inc. Slide
Confidence Intervals Chapter 9.
Multiple-choice example
7.1 confidence Intervals for the Mean When SD is Known
Chapter 25 Paired Samples and Blocks
St. Edward’s University
Chapter 8 Interval Estimation
Samples The means of these samples
Frequency Tables and Stem-and-Leaf Plots 1-3
1 Confidence Interval for Population Mean The case when the population standard deviation is unknown (the more common case).
Carl Steinhauer Consultant
Experimental Measurements and their Uncertainties
Confidence Intervals Objectives: Students should know how to calculate a standard error, given a sample mean, standard deviation, and sample size Students.
Confidence Intervals for a Mean when you have a “large” sample…
Chapter 4 Inference About Process Quality
Populations & Samples Objectives:
3/2003 Rev 1 I.2.12 – slide 1 of 18 Part I Review of Fundamentals Module 2Basic Physics and Mathematics Used in Radiation Protection Session 12Statistics.
The Sample Mean. A Formula: page 6
Statistics Review – Part I
Statistical Sampling.
Hours Listening To Music In A Week! David Burgueño, Nestor Garcia, Rodrigo Martinez.
Marks out of 100 Mrs Smith’s Class Median Lower Quartile Upper Quartile Minimum Maximum.
Healey Chapter 7 Estimation Procedures
LESSON 4-1 Preparing a Chart of Accounts
 With replacement or without replacement?  Draw conclusions about a population based on data about a sample.  Ask questions about a number which describe.
Confidence intervals for means and proportions FETP India
Before Between After.
Standard Deviation and Z score
Benjamin Banneker Charter Academy of Technology Making AYP Benjamin Banneker Charter Academy of Technology Making AYP.
Multiplying Up. Category 1 4  10 4  5 4  4 56 ÷ 4.
Copyright © 2011 Pearson Education, Inc. Putting Statistics to Work.
Ch. 8: Confidence Interval Estimation
Confidence Intervals with Proportions
Confidence Intervals. Rate your confidence Name my age within 10 years? within 5 years? within 1 year? Shooting a basketball at a wading pool,
Introduction to Inference
Statistics and Data (Algebraic)
Static Equilibrium; Elasticity and Fracture
Lial/Hungerford/Holcomb/Mullins: Mathematics with Applications 11e Finite Mathematics with Applications 11e Copyright ©2015 Pearson Education, Inc. All.
CHAPTER 20: Inference About a Population Proportion
CHAPTER 14: Confidence Intervals: The Basics
January Structure of the book Section 1 (Ch 1 – 10) Basic concepts and techniques Section 2 (Ch 11 – 15): Inference for quantitative outcomes Section.
Slide Slide 1 Copyright © 2007 Pearson Education, Inc Publishing as Pearson Addison-Wesley. Section 7-2 Estimating a Population Proportion Created by Erin.
Chart Deception Main Source: How to Lie with Charts, by Gerald E. Jones Dr. Michael R. Hyman, NMSU.
Jump to first page STATISTICAL INFERENCE Statistical Inference uses sample data and statistical procedures to: n Estimate population parameters; or n Test.
Personal review of 2006 Roger Barlow. Manchester Christmas Meeting 2006 Roger Barlow Review Committee 266 ISR e + e -   f 0  KK  Structure at 2200.
1 Confidence Intervals for Means. 2 When the sample size n< 30 case1-1. the underlying distribution is normal with known variance case1-2. the underlying.
ENGR 610 Applied Statistics Fall Week 5 Marshall University CITE Jack Smith.
Normal Distribution.
Mystery 1Mystery 2Mystery 3.
Lecture 4 Confidence Intervals. Lecture Summary Last lecture, we talked about summary statistics and how “good” they were in estimating the parameters.
Chapter 7 Estimation. Chapter 7 ESTIMATION What if it is impossible or impractical to use a large sample? Apply the Student ’ s t distribution.
Introductory Statistics: Exploring the World through Data, 1e
F - Ratio Table Degrees of Freedom for the Factor
Chapter 10 Inferences on Two Samples
Tutorial 9 Suppose that a random sample of size 10 is drawn from a normal distribution with mean 10 and variance 4. Find the following probabilities:
Weighted Interval Scheduling
Weighted Interval Scheduling
Statistics for HEP Roger Barlow Manchester University
Presentation transcript:

Statistics for HEP Roger Barlow Manchester University Lecture 4: Confidence Intervals

Slide 2 The Straightforward Example Apples of different weights Need to describe the distribution  = 68g  = 17 g All weights between 24 and 167 g (Tolerance) 90% lie between 50 and 100 g 94% are less than 100 g 96% are more than 50 g Confidence level statements

Slide 3 Confidence Levels Can quote at any level (68%, 95%, 99%…) Upper or lower or twosided (x<U x<L L<x<U) Two-sided has further choice (central, shortest…) U L U’

Slide 4 The Frequentist Twist Particles of the same weight Distribution spread by measurement errors What can we say about M?  = 68  = 17 “M 55” or CL These are each always true or always false Solution: Refer to ensemble of statements

Slide 5 Frequentist CL in detail You have a meter: no bias, Gaussian error 0.1. For a value X T it gives a value X M according to a Gaussian Distribution X M is within 0.1 of X T 68% of the time X T is within 0.1 of X M 68% of the time Can state X M -0.1<X T <X M CL

Slide 6 Confidence Belts For more complicated distributions it isn’t quite so easy But the principle is the same Construct Horizontally Read Vertically Measured X True X

Slide 7 Coverage 95% confidence (or “x>L” or “x<U”) This statement belongs to an ensemble of similar statements of which at least* 95% are true 95% is the coverage This is a statement about U and L, not about x. *Maybe more. Overcoverage EG composite hypotheses. Meter with resolution  0.1 Think about: the difference between a 90% upper limit and the upper limit of a 90% central interval.

Slide 8 Discrete Distributions CL belt edges become steps True continuous p or Measured discrete N May be unable to select (say) 5% region Play safe. Gives overcoverage Binomial: see tables Poisson Given 2 events, if the true mean is 6.3 (or more) then the chance of getting a fluctuation this low (or lower) is only 5% (or less)

Slide 9 Problems for Frequentists Weigh object+container with some Gaussian precision Get reading R R-  < M+C < R+ R-C-  < M < R-C+ E.g. C=50, R=141,  =10 E.g. C=50, R=55,  =10 -5 < M < E.g. C=50, R=31,  = < M < Poisson: Signal + Background Background mean 2.50 Detect 3 events: Total < 95% Detect 0 events Total < 95% Signal < 95% These statements are OK. We are allowed to get 32% / 5% wrong. But they are stupid

Slide 10 Bayes to the rescue Standard (Gaussian) measurement No prior knowledge of true value No prior knowledge of measurement result P(Data|Theory) is Gaussian P(Theory|Data) is Gaussian Interpret this with Probability statements in any way you please x true Gives same limits as Frequentist method for simple Gaussian

Slide 11 Bayesian Confidence Intervals (contd) Observe (say) 2 events P( ;2)  P(2; )=e - 2 Normalise and interpret If you know background mean is 1.7, then you know >1.7 Multiply, normalise and interpret 2

Slide 12 Bayes: words of caution Taking prior in as flat is not justified Can argue for prior flat in ln or 1/  or whatever Good practice to try a couple of priors to see if it matters

Slide 13 Feldman-Cousins Unified Method Physicists are human Ideal Physicist 1.Choose Strategy 2.Examine data 3.Quote result Real Physicist 1.Examine data 2.Choose Strategy 3.Quote Result Example: You have a background of 3.2 Observe 5 events? Quote one-sided upper limit ( Observe 25 events? Quote two-sided limits

Slide 14 “Flip-Flopping” S N Allowed S N 1 sided 2 sided S N This is not a true confidence belt! Coverage varies.

Slide 15 Solution: Construct belt that does the flip-flopping S N For 90% CL For every S select set of N-values in belt Total probability must sum to 90% (or more): there are many strategies for doing this Crow & Gardner strategy (almost right): Select N-values with highest probability  shortest interval

Slide 16 Better Strategy N is Poisson from S+B B known, may be large E.g. B=9.2,S=0 and N=1 P=.1% - not in C-G band But any S>0 will be worse To construct band for a given S: For all N: Find P(N;S+B) and P best =P(N;N) if (N>B) else P(N;B) Rank on P/P best Accept N into band until  P(N;S+B)  90% Fair comparison of P is with best P for this N Either at S=N-B or S=0

Slide 17 Feldman and Cousins Summary Makes us more honest (a bit) Avoids forbidden regions in a Frequentist way Not easy to calculate Has to be done separately for each value of B Can lead to 2-tailed limits where you don’t want to claim a discovery Weird effects for N=0; larger B gives lower (=better) upper limit

Slide 18 Maximum Likelihood and Confidence Levels ML estimator (large N) has variance given by MVB At peak For large N Ln L is a parabola (L is a Gaussian) a Ln L Falls by ½ at Falls by 2 at Read off 68%, 95% confidence regions

Slide 19 MVB example N Gaussian measurements: estimate  Ln L given by Differentiate twice wrt  Take expectation value – but it’s a constant Invert and negate:

Slide 20 Another MVB example N Gaussian measurements: estimate  Ln L still given by Differentiate twice wrt  Take expectation value =  2  i Gives Invert and negate:

Slide 21 ML for small N ln L is not a parabola Argue: we could (invariance) transform to some a’ for which it is a parabola We could/should then get limits on a’ using standard L max -½ technique These would translate to limits on a These limits would be at the values of a for which L= L max -½ So just do it directly a a’

Slide 22 Multidimensional ML L is multidimensional Gaussian a b For 2-d 39.3% lies within 1  i.e. within region bounded by L=L max -½ For 68% need L=L max Construct region(s) to taste using numbers from integrated  2 distribution

Slide 23 Confidence Intervals Descriptive Frequentist –Feldman-Cousins technique Bayesian Maximum Likelihood –Standard –Asymmetric –Multidimensional