1 Practical Statistics for Physicists CERN Latin American School March 2015 Louis Lyons Imperial College and Oxford CMS expt at LHC

Slides:



Advertisements
Similar presentations
Experimental Measurements and their Uncertainties
Advertisements

DO’S AND DONT’S WITH LIKELIHOODS Louis Lyons Oxford (CDF)
CHAPTER 13: Binomial Distributions
1 χ 2 and Goodness of Fit Louis Lyons IC and Oxford SLAC Lecture 2’, Sept 2008.
1 Practical Statistics for Physicists Sardinia Lectures Oct 2008 Louis Lyons Imperial College and Oxford CDF experiment at FNAL CMS expt at LHC
1 χ 2 and Goodness of Fit Louis Lyons Oxford Mexico, November 2006 Lecture 3.
Chapter 1 Probability Theory (i) : One Random Variable
1 Practical Statistics for Physicists LBL January 2008 Louis Lyons Oxford
1 Do’s and Dont’s with L ikelihoods Louis Lyons Oxford CDF Mexico, November 2006.
1 Practical Statistics for Physicists Dresden March 2010 Louis Lyons Imperial College and Oxford CDF experiment at FNAL CMS expt at LHC
Level 1 Laboratories University of Surrey, Physics Dept, Level 1 Labs, Oct 2007 Handling & Propagation of Errors : A simple approach 1.
1 Do’s and Dont’s with L ikelihoods Louis Lyons Oxford CDF CERN, October 2006.
Introduction to experimental errors
1 Practical Statistics for Physicists SLUO Lectures February 2007 Louis Lyons Oxford
1 Practical Statistics for Particle Physicists CERN Academic Lectures October 2006 Louis Lyons Oxford
Probability and Probability Distributions
Lec 6, Ch.5, pp90-105: Statistics (Objectives) Understand basic principles of statistics through reading these pages, especially… Know well about the normal.
C82MCP Diploma Statistics School of Psychology University of Nottingham 1 Overview Parameters and Statistics Probabilities The Binomial Probability Test.
Copyright (c) Bani K. Mallick1 STAT 651 Lecture #15.
Lecture 6: Descriptive Statistics: Probability, Distribution, Univariate Data.
1 Practical Statistics for Physicists Stockholm Lectures Sept 2008 Louis Lyons Imperial College and Oxford CDF experiment at FNAL CMS expt at LHC
The Binomial Distribution. Introduction # correct TallyFrequencyP(experiment)P(theory) Mix the cards, select one & guess the type. Repeat 3 times.
TOPLHCWG. Introduction The ATLAS+CMS combination of single-top production cross-section measurements in the t channel was performed using the BLUE (Best.
Problem A newly married couple plans to have four children and would like to have three girls and a boy. What are the chances (probability) their desire.
Physics 114: Lecture 15 Probability Tests & Linear Fitting Dale E. Gary NJIT Physics Department.
Chapter 5 Sampling Distributions
PBG 650 Advanced Plant Breeding
1 Probability and Statistics  What is probability?  What is statistics?
Probability theory 2 Tron Anders Moger September 13th 2006.
June 10, 2008Stat Lecture 9 - Proportions1 Introduction to Inference Sampling Distributions for Counts and Proportions Statistics Lecture 9.
Binomial Distributions Calculating the Probability of Success.
Theory of Probability Statistics for Business and Economics.
Sullivan – Fundamentals of Statistics – 2 nd Edition – Chapter 11 Section 1 – Slide 1 of 34 Chapter 11 Section 1 Random Variables.
 A probability function is a function which assigns probabilities to the values of a random variable.  Individual probability values may be denoted by.
OPIM 5103-Lecture #3 Jose M. Cruz Assistant Professor.
1 Methods of Experimental Particle Physics Alexei Safonov Lecture #23.
BINOMIALDISTRIBUTION AND ITS APPLICATION. Binomial Distribution  The binomial probability density function –f(x) = n C x p x q n-x for x=0,1,2,3…,n for.
The Practice of Statistics, 5th Edition Starnes, Tabor, Yates, Moore Bedford Freeman Worth Publishers CHAPTER 6 Random Variables 6.3 Binomial and Geometric.
Lab 3b: Distribution of the mean
Statistics Lectures: Questions for discussion Louis Lyons Imperial College and Oxford CERN Summer Students July
June 11, 2008Stat Lecture 10 - Review1 Midterm review Chapters 1-5 Statistics Lecture 10.
1 2 nd Pre-Lab Quiz 3 rd Pre-Lab Quiz 4 th Pre-Lab Quiz.
Physics 270. o Experiment with one die o m – frequency of a given result (say rolling a 4) o m/n – relative frequency of this result o All throws are.
Conditional Probability Mass Function. Introduction P[A|B] is the probability of an event A, giving that we know that some other event B has occurred.
BAYES and FREQUENTISM: The Return of an Old Controversy 1 Louis Lyons Imperial College and Oxford University CERN Summer Students July 2014.
G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.
Chapter 5 Joint Probability Distributions and Random Samples  Jointly Distributed Random Variables.2 - Expected Values, Covariance, and Correlation.3.
R.Kass/F02 P416 Lecture 1 1 Lecture 1 Probability and Statistics Introduction: l The understanding of many physical phenomena depend on statistical and.
Business Statistics, A First Course (4e) © 2006 Prentice-Hall, Inc. Chap 5-1 Chapter 5 Some Important Discrete Probability Distributions Business Statistics,
1 Practical Statistics for Physicists CERN Summer Students July 2014 Louis Lyons Imperial College and Oxford CMS expt at LHC
R. Kass/Sp07P416/Lecture 71 More on Least Squares Fit (LSQF) In Lec 5, we discussed how we can fit our data points to a linear function (straight line)
THE NORMAL DISTRIBUTION
Probability Distributions ( 확률분포 ) Chapter 5. 2 모든 가능한 ( 확률 ) 변수의 값에 대해 확률을 할당하는 체계 X 가 1, 2, …, 6 의 값을 가진다면 이 6 개 변수 값에 확률을 할당하는 함수 Definition.
Stats 242.3(02) Statistical Theory and Methodology.
Practical Statistics for Physicists
CHAPTER 6 Random Variables
Statistical Modelling
Lecture 1 Probability and Statistics
χ2 and Goodness of Fit & Likelihood for Parameters
Probability and Statistics for Particle Physics
Practical Statistics for Physicists
Practical Statistics for Physicists
BAYES and FREQUENTISM: The Return of an Old Controversy
Stat 31, Section 1, Last Time Sampling Distributions
Discrete Probability Distributions
Chapter 5 Joint Probability Distributions and Random Samples
Probability & Statistics Probability Theory Mathematical Probability Models Event Relationships Distributions of Random Variables Continuous Random.
Chapter 5 Sampling Distributions
Lecture 2 Binomial and Poisson Probability Distributions
Presentation transcript:

1 Practical Statistics for Physicists CERN Latin American School March 2015 Louis Lyons Imperial College and Oxford CMS expt at LHC

2 Books Statistics for Nuclear and Particle Physicists Cambridge University Press, 1986 Available from CUP Errata in these lectures

3 Topics 1) Introduction 2) Bayes and Frequentism 3) χ 2 and L ikelihoods 4) Higgs: Example of Search for New Physics Time for discussion

4 Introductory remarks What is Statistics? Probability and Statistics Why uncertainties? Random and systematic uncertainties Combining uncertainties Combining experiments Binomial, Poisson and Gaussian distributions

5 What do we do with Statistics? Parameter Determination (best value and range) e.g. Mass of Higgs = 80  2 Goodness of Fit Does data agree with our theory? Hypothesis Testing Does data prefer Theory 1 to Theory 2? (Decision Making What experiment shall I do next?) Why bother? HEP is expensive and time-consuming so Worth investing effort in statistical analysis  better information from data

6 Probability and Statistics Example: Dice Given P(5) = 1/6, what is Given 20 5’s in 100 trials, P(20 5’s in 100 trials)? what is P(5)? And its unceretainty? If unbiassed, what is Given 60 evens in 100 trials, P(n evens in 100 trials)? is it unbiassed? Or is P(evens) =2/3? THEORY  DATA DATA  THEORY

7 Probability and Statistics Example: Dice Given P(5) = 1/6, what is Given 20 5’s in 100 trials, P(20 5’s in 100 trials)? what is P(5)? And its uncertainty? Parameter Determination If unbiassed, what is Given 60 evens in 100 trials, P(n evens in 100 trials)? is it unbiassed? Goodness of Fit Or is P(evens) =2/3? Hypothesis Testing N.B. Parameter values not sensible if goodness of fit is poor/bad

8 Why do we need uncertainties? Affects conclusion about our result e.g. Result / Theory = If ± 0.050, data compatible with theory If ± 0.005, data incompatible with theory If ± 0.7, need better experiment Historical experiment at Harwell testing General Relativity

9 Random + Systematic Uncertainties Random/Statistical: Limited accuracy, Poisson counts Spread of answers on repetition (Method of estimating) Systematics: May cause shift, but not spread e.g. Pendulum g = 4π 2 L/  2,  = T/n Statistical uncertainties: T, L Systematics: T, L Calibrate: Systematic  Statistical More systematics: Formula for undamped, small amplitude, rigid, simple pendulum Might want to correct to g at sea level: Different correction formulae Ratio of g at different locations: Possible systematics might cancel. Correlations relevant

10 Presenting result Quote result as g ± σ stat ± σ syst Or combine uncertainties in quadrature  g ± σ Other extreme: Show all systematic contributions separately Useful for assessing correlations with other measurements Needed for using: improved outside information, combining results using measurements to calculate something else.

11 Combining uncertainties z = x - y δz = δx – δy [1] Why σ z 2 = σ x 2 + σ y 2 ? [2]

12 Combining uncertainties z = x - y δz = δx – δy [1] Why σ z 2 = σ x 2 + σ y 2 ? [2] 1)[1] is for specific δx, δy Could be so on average ? N.B. Mneumonic, not proof 2) σ z 2 = δz 2 = δx 2 + δy 2 – 2 δx δy = σ x 2 + σ y 2 provided…………..

13 3) Averaging is good for you: N measurements x i ± σ [1] x i ± σ or [2] x i ± σ/ √N ? 4) Tossing a coin: Score 0 for tails, 2 for heads (1 ± 1 ) After 100 tosses, [1] 100 ± 100 or [2] 100 ± 10 ? Prob(0 or 200) = (1/2) 99 ~ Compare age of Universe ~ seconds (Prob ~ 1% for 100 people doing it 1/μsec since Big Bang)

14 Rules for different functions 1)Linear: z = k 1 x 1 + k 2 x 2 + ……. σ z = k 1 σ 1 & k 2 σ 2 & means “combine in quadrature” N.B. Fractional errors NOT relevant e.g. z = x – y z = your height x = position of head wrt moon y = position of feet wrt moon x and y measured to 0.1% z could be -30 miles

15 Rules for different functions 2) Products and quotients z = x α y β ……. σ z /z = α σ x /x & β σ y /y Useful for x 2, xy, x/√y,…….

16 3) Anything else: z = z(x 1, x 2, …..) σ z = ∂z/∂x 1 σ 1 & ∂z/∂x 2 σ 2 & ……. OR numerically: z 0 = z(x 1, x 2, x 3 ….) z 1 = z(x 1 +σ 1, x 2, x 3 ….) z 2 = z(x 1, x 2 + σ 2, x 3 ….) σ z = (z 1 -z 0 ) & (z 2 -z 0 ) & …. N.B. All formulae approx (except 1)) – assumes small uncertainties

17 N.B. Better to combine data! BEWARE 100±10 2±1? 1±1 or 50.5±5?

18 Difference between averaging and adding Isolated island with conservative inhabitants How many married people ? Number of married men = 100 ± 5 K Number of married women = 80 ± 30 K Total = 180 ± 30 K Wtd average = 99 ± 5 K CONTRAST Total = 198 ± 10 K GENERAL POINT: Adding (uncontroversial) theoretical input can improve precision of answer Compare “kinematic fitting”

19 Binomial Distribution Fixed N independent trials, each with same prob of success p What is prob of s successes? e.g. Throw dice 100 times. Success = ‘6’. What is prob of 0, 1,…. 49, 50, 51,… 99, 100 successes? Effic of track reconstrn = 98%. For 500 tracks, prob that 490, 491, , 500 reconstructed. Ang dist is cos θ? Prob of 52/70 events with cosθ > 0 ? (More interesting is statistics question)

20 P s = N! p s (1-p) N-s, as is obvious (N-s)! s! Expected number of successes = ΣsP s = Np, as is obvious Variance of no. of successes = Np(1-p) Variance ~ Np, for p~0 ~ N(1-p) for p~1 NOT Np in general. NOT s ±√s e.g. 100 trials, 99 successes, NOT 99 ± 10

21 Statistics: Estimate p and σ p from s (and N) p = s/N σ p 2 = 1/N s/N (1 – s/N) If s = 0, p = 0 ± 0 ? If s = 1, p = 1.0 ± 0 ? Limiting cases: ● p = const, N  ∞: Binomial  Gaussian μ = Np, σ 2 = Np(1-p) ● N  ∞, p  0, Np = const: Binomial  Poisson μ = Np, σ 2 = Np {N.B. Gaussian continuous and extends to -∞}

22 Binomial Distributions

23 Poisson Distribution Prob of n independent events occurring in time t when rate is r (constant) e.g. events in bin of histogram NOT Radioactive decay for t ~ τ Limit of Binomial (N  ∞, p  0, Np  μ) P n = e -r t (r t) n /n! = e - μ μ n /n! (μ = r t) = r t = μ (No surprise!) σ 2 n = μ “n ±√n” BEWARE 0 ± 0 ? μ  ∞: Poisson  Gaussian, with mean = μ, variance =μ Important for χ 2

24 For your thought Poisson P n = e - μ μ n /n! P 0 = e – μ P 1 = μ e –μ P 2 = μ 2 /2 e -μ For small μ, P 1 ~ μ, P 2 ~ μ 2 /2 If probability of 1 rare event ~ μ, why isn’t probability of 2 events ~ μ 2 ?

25 Approximately Gaussian Poisson Distributions

26 Gaussian or Normal Significance of σ i) RMS of Gaussian = σ (hence factor of 2 in definition of Gaussian) ii) At x = μ±σ, y = y max /√e ~0.606 y max (i.e. σ = half-width at ‘half’-height) iii) Fractional area within μ±σ = 68% iv) Height at max = 1/( σ√2 π)

Area in tail(s) of Gaussian

28 Relevant for Goodness of Fit

29