Probability, Statistics and Errors in High Energy Physics Wen-Chen Chang Institute of Physics, Academia Sinica 章文箴 中央研究院 物理研究所

Outline
- Errors
- Probability distributions: Binomial, Poisson, Gaussian
- Confidence Level
- Monte Carlo Method

Why do we do experiments?
1. Parameter determination: determine the numerical value of some physical quantity.
2. Hypothesis testing: test whether a particular theory is consistent with our data.

Why estimate errors? We are concerned not only with the answer but also with its accuracy. For example, the speed of light is 2.998 × 10⁸ m/sec. Compare three possible measurements:
- (3.09 ± 0.15) × 10⁸: consistent with the true value within the quoted error.
- (3.09 ± 0.01) × 10⁸: inconsistent, so something is wrong with the measurement or its error estimate.
- (3.09 ± 2) × 10⁸: consistent, but too imprecise to be useful.

Sources of Errors. Random (statistical) error: the inability of any measuring device to give infinitely accurate answers. Systematic error: an uncertainty that reflects a bias in the measurement rather than random scatter (discussed on the next slide).

Systematic Errors
"Systematic effects is a general category which includes effects such as background, scanning efficiency, energy resolution, angle resolution, variation of counter efficiency with beam position and energy, dead time, etc. The uncertainty in the estimation of such a systematic effect is called a systematic error." (Orear)
"Systematic error: reproducible inaccuracy introduced by faulty equipment, calibration, or technique." (Bevington)
Error = mistake? Error = uncertainty?

Experimental Examples
- Energy in a calorimeter: E = aD + b, with a and b determined by a calibration experiment.
- Branching ratio: B = N/(ε N_T), with the efficiency ε found from Monte Carlo studies.
- Steel rule calibrated at 15 °C but used in a warm lab: if not spotted, this is a mistake; if the temperature is measured, it is not a problem; if the temperature is not measured, guess the uncertainty. Repeating measurements doesn't help.

The Binomial: a random process with exactly two possible outcomes which occur with fixed probabilities. For n trials with individual success probability p, the number of successes r has
Mean: μ = <r> = Σ r P(r) = np
Variance: V = σ² = <r²> − <r>² = np(1−p) = npq, with q ≡ 1−p
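The following is an illustrative sketch, not part of the original slides: a numerical check that the binomial mean and variance come out as np and np(1−p), using arbitrarily chosen values of n and p.

```python
# Illustrative check (not from the slides): binomial mean and variance.
import numpy as np
from scipy.stats import binom

n, p = 10, 0.2            # example values, chosen arbitrarily
r = np.arange(n + 1)
P = binom.pmf(r, n, p)    # P(r) = C(n, r) p^r (1-p)^(n-r)

mean = np.sum(r * P)                # <r>
var = np.sum(r**2 * P) - mean**2    # <r^2> - <r>^2

print(mean, n * p)            # both 2.0
print(var, n * p * (1 - p))   # both 1.6
```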

Binomial Examples (plots of binomial distributions for n = 5, 10, 20, 50 and p = 0.1, 0.2, 0.5, 0.8)

Poisson: 'events in a continuum'. The probability of observing r independent events in a time interval t, when the counting rate is ν and the expected number of events in the interval is λ = νt:
P(r) = e^(−λ) λ^r / r!
Mean: μ = <r> = Σ r P(r) = λ
Variance: V = σ² = <r²> − <r>² = λ

More about Poisson. The binomial distribution approaches the Poisson distribution as n increases (with np held fixed). The mean value of r for a variable with a Poisson distribution is λ, and so is the variance. This is the basis of the well-known n ± √n formula that applies to statistical errors in many situations involving the counting of independent events during a fixed interval. As λ → ∞, the Poisson distribution tends to a Gaussian one.
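As a small illustration (my addition, with arbitrarily chosen parameters), both limits can be checked numerically:

```python
# Illustrative sketch: binomial -> Poisson as n grows with np fixed,
# and Poisson -> Gaussian as lambda grows.
import numpy as np
from scipy.stats import binom, poisson, norm

lam = 5.0
r = np.arange(0, 16)
for n in (10, 100, 1000):
    p = lam / n
    # maximum difference between binomial and Poisson probabilities shrinks with n
    print(n, np.max(np.abs(binom.pmf(r, n, p) - poisson.pmf(r, lam))))

lam = 100.0
k = np.arange(50, 151)
# for large lambda the Poisson pmf is close to a Gaussian of mean and variance lambda
print(np.max(np.abs(poisson.pmf(k, lam) - norm.pdf(k, loc=lam, scale=np.sqrt(lam)))))
```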

Poisson Examples (plots of Poisson distributions for λ = 0.5, 1.0, 2.0, 5.0, 10, 25)

Examples of Poisson-distributed quantities:
- The number of particles detected by a counter in a time t, in a situation where the particle flux φ and the detector are independent of time, and where the counter dead-time τ is such that φτ << 1.
- The number of interactions produced in a thin target when an intense pulse of N beam particles is incident on it.
- The number of entries in a given bin of a histogram when the data are accumulated over a fixed time interval.

Binomial and Poisson: from an exam paper. A student is standing by the road, hoping to hitch a lift. Cars pass according to a Poisson distribution with a mean frequency of 1 per minute. The probability of an individual car giving a lift is 1%. Calculate the probability that the student is still waiting for a lift
(a) after 60 cars have passed,
(b) after 1 hour.
Answers: (a) 0.99⁶⁰ = 0.5472 (binomial with r = 0); (b) e⁻⁰·⁶ × 0.6⁰/0! = 0.5488 (Poisson with mean 60 × 0.01 = 0.6).
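A quick numerical check of both answers (a sketch added here, not part of the exam paper or the slide):

```python
# (a) After 60 cars: each car independently refuses with probability 0.99.
# (b) After 1 hour: the number of cars is Poisson with mean 60; each gives a lift
#     with probability 0.01, so offered lifts are Poisson with mean 0.6 and
#     P(no lift) = exp(-0.6).
import math

p_a = 0.99 ** 60
p_b = math.exp(-60 * 0.01)
print(round(p_a, 4), round(p_b, 4))   # 0.5472 0.5488
```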

Gaussian (Normal) Probability Density
P(x) = (1/(σ√(2π))) exp(−(x−μ)²/(2σ²))
Mean: <x> = ∫ x P(x) dx = μ
Variance: V = σ² = <x²> − <x>²

Different Gaussians? There's only one! The general form is obtained from the unit Gaussian by a normalisation factor (if required), a location change μ, and a width scaling factor σ. The curve falls to 1/e of its peak value at x = μ ± √2 σ.

Probability Contents
68.27% within ±1σ, 95.45% within ±2σ, 99.73% within ±3σ
90% within ±1.645σ, 95% within ±1.960σ, 99% within ±2.576σ, 99.9% within ±3.290σ
These numbers apply to Gaussians and only Gaussians. Other distributions have equivalent values which you could use if you wanted.
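These coverage numbers can be reproduced from the Gaussian cumulative distribution; a short check (my addition):

```python
# Illustrative check of the Gaussian probability contents quoted above.
from scipy.stats import norm

for k in (1, 2, 3):
    # probability contained within +/- k sigma
    print(k, norm.cdf(k) - norm.cdf(-k))      # 0.6827, 0.9545, 0.9973

for cl in (0.90, 0.95, 0.99, 0.999):
    # half-width (in units of sigma) of the central interval with probability cl
    print(cl, norm.ppf(0.5 + cl / 2))         # 1.645, 1.960, 2.576, 3.291
```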

Central Limit Theorem, or: why is the Gaussian "normal"? If a variable x is produced by the convolution of variables x₁, x₂, …, x_N, then:
I) <x> = μ₁ + μ₂ + … + μ_N
II) V(x) = V₁ + V₂ + … + V_N
III) P(x) becomes Gaussian for large N
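A minimal numerical illustration of the theorem (my addition): the sum of N uniform variables has the stated mean and variance, and its distribution is already close to Gaussian for modest N.

```python
# Illustrative CLT demo: the sum of N uniform(0,1) variables has mean N*0.5,
# variance N/12, and an increasingly Gaussian shape as N grows.
import numpy as np

rng = np.random.default_rng(1)
N, n_samples = 12, 100_000
x = rng.uniform(0.0, 1.0, size=(n_samples, N)).sum(axis=1)

print(x.mean(), N * 0.5)     # ~6.0
print(x.var(), N / 12.0)     # ~1.0
```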

Multidimensional Gaussian
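The formula itself appeared only as a graphic on this slide; the standard form, written out here as a reconstruction (x and μ are n-component vectors and V is the covariance matrix), is:

```latex
% Multidimensional Gaussian: mean vector mu, covariance matrix V
P(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}}
  \exp\!\left[-\tfrac{1}{2}\,(\mathbf{x}-\boldsymbol{\mu})^{T} V^{-1} (\mathbf{x}-\boldsymbol{\mu})\right]
```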

Chi squared: the sum of squared discrepancies between measurements and predictions, each scaled by its expected error. Equivalently, it is what remains when all but one dimension of a multi-dimensional Gaussian is integrated out.
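Written out (the expression on the slide did not survive the transcription), the standard definition for N measurements yᵢ with errors σᵢ and prediction f(xᵢ; a) is:

```latex
% Chi-squared for N measurements y_i with errors sigma_i and prediction f(x_i; a)
\chi^{2} = \sum_{i=1}^{N} \frac{\left(y_i - f(x_i; a)\right)^{2}}{\sigma_i^{2}}
```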

About Estimation
Probability calculus goes from theory to data: given these distribution parameters, what can we say about the data?
Statistical inference goes from data to theory: given this data, what can we say about the properties, parameters or correctness of the distribution functions?

What is an estimator? An estimator (written with a hat) is a function of the data whose value, the estimate, is intended as a meaningful guess for the value of the parameter. (from PDG)

Minimum Variance Bound: what is a good estimator? A perfect estimator is:
- Consistent: â → a as N → ∞.
- Unbiased: <â> = a.
- Efficient: its variance V(â) attains the minimum variance bound.
One often has to work with less-than-perfect estimators.
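The bound itself was shown only graphically on the slide; for an unbiased estimator the standard (Cramér-Rao) form, quoted here as a reconstruction, is:

```latex
% Minimum Variance Bound for an unbiased estimator \hat{a}
V(\hat{a}) \;\ge\;
  \frac{1}{\left\langle \left(\frac{\partial \ln L}{\partial a}\right)^{2} \right\rangle}
  \;=\;
  \frac{-1}{\left\langle \frac{\partial^{2} \ln L}{\partial a^{2}} \right\rangle}
```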

The Likelihood Function
Take a set of data {x₁, x₂, x₃, …, x_N}; each x may be multidimensional (never mind). The probability depends on some parameter a, which may also be multidimensional (never mind). The total probability (density) is
P(x₁;a) P(x₂;a) P(x₃;a) … P(x_N;a) = L(x₁, x₂, x₃, …, x_N; a),
the likelihood.

Maximum Likelihood Estimation
Given data {x₁, x₂, x₃, …, x_N}, estimate a by maximising the likelihood L(x₁, x₂, x₃, …, x_N; a). In practice one usually maximises ln L, as it is easier to calculate and handle: just add the ln P(xᵢ). ML has lots of nice properties. (Figure: ln L versus a, with the maximum at the estimate â.)
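A minimal sketch of the procedure (my own example, not from the lecture): exponential decay times with unknown lifetime τ, for which the ML estimate is known analytically to be the sample mean. The negative log-likelihood is minimised numerically and the analytic answer recovered.

```python
# Illustrative ML fit: exponential decay P(t; tau) = (1/tau) exp(-t/tau).
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
t = rng.exponential(scale=2.0, size=1000)   # simulated data, true tau = 2

def neg_log_likelihood(tau):
    # -ln L = sum over events of [ln(tau) + t_i / tau]
    return np.sum(np.log(tau) + t / tau)

result = minimize_scalar(neg_log_likelihood, bounds=(0.1, 10.0), method="bounded")
print(result.x, t.mean())   # ML estimate equals the sample mean
```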

Properties of ML estimation
- It is consistent (no big deal).
- It is biased for small N; may need to worry.
- It is efficient for large N: it saturates the Minimum Variance Bound.
- It is invariant: if you switch to using u(a), then û = u(â).

More about ML. It is not "right", just sensible. It does not give the "most likely value of a"; it gives the value of a for which this data is most likely. Numerical methods are often needed: maximisation/minimisation in more than one variable is not easy. Use MINUIT, but remember the minus sign (MINUIT minimises, so feed it −ln L).

ML does not give goodness-of-fit. ML will not complain if your assumed P(x;a) is rubbish; the value of L by itself tells you nothing. Example: fitting P(x) = a₁x + a₀ to flat data gives a₁ = 0, i.e. a constant P, and L = a₀^N, just like the value you get from fitting a constant in the first place, so L cannot flag a poor model.

Least Squares
Measurements of y at various x with errors σ and prediction f(x;a). Each measurement has probability
P(yᵢ) ∝ exp[−(yᵢ − f(xᵢ;a))²/(2σᵢ²)],
so
ln L = −½ Σᵢ (yᵢ − f(xᵢ;a))²/σᵢ² + constant = −½ χ² + constant.
To maximise ln L, minimise χ². So ML "proves" Least Squares. But what "proves" ML? Nothing.

Least Squares: the really nice thing. You should get χ² ≈ 1 per data point. Minimising χ² makes it smaller: the effect is roughly 1 unit of χ² for each variable adjusted (the dimensionality of the multi-dimensional Gaussian is decreased by 1). Number of degrees of freedom = N data points − N parameters. This provides a "goodness of agreement" figure which allows a credibility check.
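An illustrative straight-line χ² fit (my addition; the data points and errors below are invented):

```python
# Illustrative least-squares fit of y = a*x + b to points with known errors sigma.
import numpy as np
from scipy.optimize import curve_fit

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # invented measurements
sigma = np.full_like(y, 0.2)               # invented errors

def f(x, a, b):
    return a * x + b

popt, pcov = curve_fit(f, x, y, sigma=sigma, absolute_sigma=True)
chi2 = np.sum(((y - f(x, *popt)) / sigma) ** 2)
ndf = len(x) - len(popt)                    # N data points - N parameters
print(popt, np.sqrt(np.diag(pcov)), chi2 / ndf)
```

With well-estimated errors, χ²/ndf should come out near 1, which is exactly the credibility check described above.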

Chi Squared Results
A large χ² comes from: 1. bad measurements, 2. bad theory, 3. underestimated errors, 4. bad luck.
A small χ² comes from: 1. overestimated errors, 2. good luck.

Fitting Histograms. Often the {xᵢ} are put into bins; the data are then the bin contents {n_j}, with n_j given by a Poisson distribution of mean f(x_j) = P(x_j) Δx. Four techniques: Full ML, Binned ML, Proper χ², Simple χ².

What you maximise/minimise: Full ML, Binned ML, Proper χ², Simple χ² (the corresponding formulas are sketched below).
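The formulas on this slide appeared only as images; the standard forms (following the usual conventions, e.g. Barlow, so treat this as a reconstruction rather than the slide's own notation), for bin contents n_j and model prediction f_j = f(x_j; a), are:

```latex
% Standard objective functions for fitting a histogram; reconstruction, not the original slide.
\begin{align*}
\text{Full ML:}   &\quad \ln L = \sum_i \ln P(x_i; a) \quad \text{(unbinned, sum over events)}\\
\text{Binned ML:} &\quad \ln L = \sum_j \left[\, n_j \ln f_j - f_j \,\right] \quad \text{(Poisson likelihood per bin)}\\
\text{Proper } \chi^2\!: &\quad \chi^2 = \sum_j \frac{(n_j - f_j)^2}{f_j}\\
\text{Simple } \chi^2\!: &\quad \chi^2 = \sum_j \frac{(n_j - f_j)^2}{n_j}
\end{align*}
```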

Confidence Level: Meaning of Error Estimates. How often do we expect to include the true fixed value of our parameter, P₀, within our quoted range p ± Δp, for a repeated series of experiments? For the actual value P₀, the probability that a measurement will give us an answer in a specific range of p is given by the area under the relevant part of the Gaussian curve. A conventional choice of this probability is 68%.

The Straightforward Example. Apples of different weights; we need to describe the distribution: μ = 68 g, σ = 17 g. Confidence level statements:
- All weights lie between 24 and 167 g (tolerance).
- 90% lie between 50 and 100 g.
- 94% are less than 100 g.
- 96% are more than 50 g.

Confidence Levels can be quoted at any level (68%, 95%, 99%, …), as upper or lower limits or two-sided intervals (x < U, x > L, L < x < U). A two-sided interval involves a further choice (central, shortest, …).

Maximum Likelihood and Confidence Levels. The ML estimator (for large N) has a variance given by the MVB, evaluated at the peak. For large N, ln L is a parabola (L is a Gaussian): ln L falls by 1/2 at â ± σ and by 2 at â ± 2σ, so the 68% and 95% confidence regions can be read off directly.

Monte Carlo Calculations The Monte Carlo approach provides a method of solving probability theory problems in situations where the necessary integrals are too difficult to perform. Crucial element: random number generator.
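A minimal illustration of the idea (my own sketch, not the example from the following slide): estimating an integral, here the area of a quarter circle and hence π, by throwing uniform random numbers and counting.

```python
# Illustrative Monte Carlo integration: estimate pi from the fraction of
# uniformly thrown points that land inside the unit quarter circle.
import numpy as np

rng = np.random.default_rng(3)
n = 1_000_000
x = rng.uniform(0.0, 1.0, n)
y = rng.uniform(0.0, 1.0, n)

inside = (x**2 + y**2 < 1.0)
pi_estimate = 4.0 * inside.mean()
# statistical error from binomial counting: 4 * sqrt(p(1-p)/n)
error = 4.0 * np.sqrt(inside.mean() * (1.0 - inside.mean()) / n)
print(pi_estimate, "+/-", error)
```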

An Example

References
- Roger Barlow, lectures and notes on statistics in HEP.
- Louis Lyons, "Statistics for Nuclear and Particle Physicists", Cambridge University Press.
- Particle Data Group, Review of Particle Physics (statistics review).