χ² and Goodness of Fit. Louis Lyons, Oxford. Mexico, November 2006, Lecture 3.

Presentation transcript:

1 χ² and Goodness of Fit. Louis Lyons, Oxford. Mexico, November 2006. Lecture 3

2 Least squares best fit; résumé of straight line; correlated errors; errors in x and in y; goodness of fit with χ²; errors of first and second kind; kinematic fitting; toy example; THE paradox

3

4

5

6

7 Straight Line Fit. N.B. L.S.B.F. passes through (⟨x⟩, ⟨y⟩)

8 Error on intercept and gradient. That is why track parameters are specified at the track ‘centre’
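The point about the track ‘centre’ can be illustrated with a minimal sketch of a weighted least-squares straight-line fit y = a + b x. The function name `fit_line` and the example numbers are illustrative assumptions, not taken from the slides. The fit is done twice: once with the original x values and once with x measured from its weighted mean, where the covariance between intercept and gradient vanishes and the intercept error is smallest.

```python
def fit_line(x, y, sigma):
    """Weighted least-squares fit of y = a + b*x.

    Returns (a, b, var_a, var_b, cov_ab) using the standard
    closed-form sums with weights 1/sigma_i^2.
    """
    w = [1.0 / s ** 2 for s in sigma]
    S = sum(w)
    Sx = sum(wi * xi for wi, xi in zip(w, x))
    Sy = sum(wi * yi for wi, yi in zip(w, y))
    Sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    D = S * Sxx - Sx ** 2
    a = (Sxx * Sy - Sx * Sxy) / D
    b = (S * Sxy - Sx * Sy) / D
    return a, b, Sxx / D, S / D, -Sx / D


# Hypothetical data: five points far from the origin in x.
x = [10.0, 11.0, 12.0, 13.0, 14.0]
y = [21.1, 22.8, 25.2, 26.9, 29.0]
sigma = [0.5] * 5

a, b, var_a, var_b, cov_ab = fit_line(x, y, sigma)

# Shift the x origin to the weighted mean (the 'centre' of the data).
w = [1.0 / s ** 2 for s in sigma]
x_mean = sum(wi * xi for wi, xi in zip(w, x)) / sum(w)
x_c = [xi - x_mean for xi in x]
a_c, b_c, var_a_c, var_b_c, cov_c = fit_line(x_c, y, sigma)

print("origin at x=0   : var(a)=%.3f cov(a,b)=%.3f" % (var_a, cov_ab))
print("origin at centre: var(a)=%.3f cov(a,b)=%.3f" % (var_a_c, cov_c))
```

The gradient and its error are unchanged by the shift; only the intercept, its error and the correlation depend on where the origin is placed.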

9 [Figure: axis labels a, b and x, y] See Lecture 1

10 If no errors specified on y_i (!)
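The slide's formulae did not survive in this transcript. One common prescription when no errors are given, sketched here as an assumption rather than as the lecture's own method, is to fit with equal weights and estimate a common σ from the scatter of the points about the fitted line, σ² ≈ Σ(residual²)/(N − 2). The function name `fit_unweighted` and the data are hypothetical.

```python
def fit_unweighted(x, y):
    """Equal-weight straight-line fit y = a + b*x.

    Returns (a, b, sigma2), where sigma2 is the common variance on y
    estimated from the residuals with N - 2 degrees of freedom.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    resid2 = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    return a, b, resid2 / (n - 2)


a, b, sigma2 = fit_unweighted([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
print(a, b, sigma2)
```

Because σ is derived from the fit itself, the resulting χ² can no longer be used to judge whether a straight line is a good description of the data.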

11 Asymptotically

12 Measurements with correlated errors e.g. systematics?
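For correlated measurements the sum of squares generalises to S = Σ (y_i − μ) V⁻¹_ij (y_j − μ), with V the covariance matrix. A minimal sketch for two measurements of the same quantity sharing a common systematic follows; the numbers and the helper names (`invert_2x2`, `chi2`, `best_mu`) are illustrative assumptions.

```python
def invert_2x2(v):
    """Invert a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = v
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def chi2(y, mu, v_inv):
    """S = sum_ij (y_i - mu) V^-1_ij (y_j - mu)."""
    r = [yi - mu for yi in y]
    return sum(r[i] * v_inv[i][j] * r[j] for i in range(2) for j in range(2))


def best_mu(y, v_inv):
    """Value of mu minimising S, and its variance."""
    den = sum(v_inv[i][j] for i in range(2) for j in range(2))
    num = sum(v_inv[i][j] * y[j] for i in range(2) for j in range(2))
    return num / den, 1.0 / den


# Hypothetical inputs: statistical errors 1.0 each, common systematic 0.5.
y = [10.0, 12.0]
stat, syst = [1.0, 1.0], 0.5
V = [[stat[0] ** 2 + syst ** 2, syst ** 2],
     [syst ** 2, stat[1] ** 2 + syst ** 2]]
V_inv = invert_2x2(V)

mu, var_mu = best_mu(y, V_inv)
print("mu = %.3f +- %.3f, S_min = %.3f" % (mu, var_mu ** 0.5, chi2(y, mu, V_inv)))
```

In this example the variance of the combination is 0.75: the statistical part averages down to 0.5, while the fully correlated systematic contributes its full 0.25.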

13 STRAIGHT LINE: Errors on x and on y
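The formula on this slide is not preserved in the transcript. A widely used approach, shown here as an assumption, is the 'effective variance' sum S(a, b) = Σ (y_i − a − b x_i)² / (σ_y,i² + b² σ_x,i²). For fixed b the best a is a weighted mean, so the sketch below reduces the problem to a one-dimensional search over b. The golden-section search, the bracket and the data are illustrative choices.

```python
def s_of_b(b, x, y, sx, sy):
    """Return (S, a) at gradient b, with a chosen to minimise S."""
    w = [1.0 / (syi ** 2 + b * b * sxi ** 2) for sxi, syi in zip(sx, sy)]
    a = sum(wi * (yi - b * xi) for wi, xi, yi in zip(w, x, y)) / sum(w)
    s = sum(wi * (yi - a - b * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return s, a


def golden_min(f, lo, hi, tol=1e-8):
    """Golden-section search; assumes f has a single minimum in [lo, hi]."""
    g = (5 ** 0.5 - 1) / 2
    while hi - lo > tol:
        c = hi - g * (hi - lo)
        d = lo + g * (hi - lo)
        if f(c) < f(d):
            hi = d
        else:
            lo = c
    return 0.5 * (lo + hi)


# Hypothetical data with errors on both coordinates.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
sx = [0.1] * 4
sy = [0.2] * 4

b_hat = golden_min(lambda b: s_of_b(b, x, y, sx, sy)[0], 0.0, 4.0)
s_min, a_hat = s_of_b(b_hat, x, y, sx, sy)
print("a = %.3f, b = %.3f, S_min = %.3f" % (a_hat, b_hat, s_min))
```

The bracket [0, 4] must contain the true gradient; for real data a coarse scan over b is a sensible first step before refining.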

14

15

16

17 ‘Goodness of Fit’ by parameter testing? 1 + (b/a) cos²θ. Is b/a = 0? ‘Distribution testing’ is better

18 Goodness of Fit. Works asymptotically; otherwise MC
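When bin contents are too small for the asymptotic χ² distribution to be trusted, the p-value can be estimated from toy Monte Carlo experiments. The sketch below assumes a histogram with fixed expected contents (no fitted parameters) and Poisson fluctuations in each bin. The function names and the numbers are hypothetical.

```python
import math
import random


def pearson_chi2(obs, exp):
    """Pearson chi-squared: sum over bins of (obs - exp)^2 / exp."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


def poisson(rng, mean):
    """Draw one Poisson variate (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def mc_p_value(obs, exp, n_toys=20000, seed=1):
    """Fraction of toy histograms with chi-squared >= the observed one."""
    rng = random.Random(seed)
    s_obs = pearson_chi2(obs, exp)
    n_ge = 0
    for _ in range(n_toys):
        toy = [poisson(rng, e) for e in exp]
        if pearson_chi2(toy, exp) >= s_obs:
            n_ge += 1
    return n_ge / n_toys


expected = [1.5, 2.0, 2.5, 2.0, 1.5]
observed = [4, 1, 2, 0, 3]
print("chi2 =", pearson_chi2(observed, expected))
print("MC p-value =", mc_p_value(observed, expected))
```

If parameters are fitted to the data, each toy should be refitted in the same way before its χ² is computed.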

19

20

21 χ² with ν degrees of freedom? ν = data – free parameters? Why asymptotic (apart from Poisson → Gaussian)? a) Fit flattish histogram with y = N {cos(x – x_0)}, x_0 = free param. b) Neutrino oscillations: almost degenerate parameters. y ~ 1 – A sin²(1.27 Δm² L/E): 2 parameters. For small Δm², y ~ 1 – A (1.27 Δm² L/E)²: 1 parameter
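The near-degeneracy in case b) can be checked numerically. In the sketch below (parameter values are illustrative assumptions, with L/E in km/GeV and Δm² in eV²), two parameter sets with the same product A (Δm²)² give almost identical predictions, so the data effectively constrain one parameter rather than two.

```python
import math


def survival(A, dm2, l_over_e):
    """y = 1 - A sin^2(1.27 dm2 L/E)."""
    return 1.0 - A * math.sin(1.27 * dm2 * l_over_e) ** 2


# Two hypothetical parameter sets with equal A * dm2^2 = 1e-6.
set_1 = (1.00, 1.0e-3)
set_2 = (0.25, 2.0e-3)

max_diff = max(
    abs(survival(*set_1, l_over_e=x) - survival(*set_2, l_over_e=x))
    for x in range(1, 11)
)
print("largest difference in y over L/E = 1..10:", max_diff)
```

With effectively one free parameter instead of two, the number of degrees of freedom is no longer simply 'data minus free parameters'.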

22

23 Goodness of Fit: Kolmogorov-Smirnov. Compares data and model cumulative plots. Uses largest discrepancy between distributions. Model can be analytic or MC sample. Uses individual data points. Not so sensitive to deviations in tails (so variants of K-S exist). Not readily extendible to more dimensions. Distribution-free conversion to p; depends on n (but not when free parameters involved – needs MC)
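A minimal sketch of the K-S statistic for an analytic model follows. It computes the largest distance D between the empirical cumulative distribution of the data and the model CDF, then converts it to a p-value with the large-n Kolmogorov series Q(λ) = 2 Σ (−1)^(k−1) exp(−2k²λ²), λ = √n D. The function names are assumptions, and the conversion is valid only when no parameters have been fitted to the same data.

```python
import math


def ks_statistic(sample, cdf):
    """Largest |empirical CDF - model CDF|, checked on both sides of each step."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_p_asymptotic(d, n, terms=100):
    """Asymptotic (large n) p-value from the Kolmogorov series."""
    lam = math.sqrt(n) * d
    if lam < 0.2:
        return 1.0  # series converges poorly here; p is 1 to high accuracy
    s = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
            for k in range(1, terms + 1))
    return max(0.0, min(1.0, 2.0 * s))


# Hypothetical data tested against a uniform distribution on [0, 1].
data = [0.05, 0.12, 0.21, 0.33, 0.38, 0.47, 0.52, 0.70, 0.81, 0.93]
d = ks_statistic(data, lambda x: min(1.0, max(0.0, x)))
print("D = %.3f, asymptotic p = %.3f" % (d, ks_p_asymptotic(d, len(data))))
```

For small n, or when the model has been fitted to the data, the p-value should instead come from Monte Carlo, as the slide notes.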

24 Goodness of fit: ‘Energy’ test. Assign +ve charge to data; -ve charge to M.C. Calculate ‘electrostatic energy E’ of charges. If distributions agree, E ~ 0. If distributions don’t overlap, E is positive. Assess significance of magnitude of E by MC. [Figure: axes v1, v2] N.B. 1) Works in many dimensions. 2) Needs metric for each variable (make variances similar?). 3) E ~ Σ q_i q_j f(Δr = |r_i – r_j|), f = 1/(Δr + ε) or –ln(Δr + ε). Performance insensitive to choice of small ε. See Aslan and Zech’s paper at:
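A minimal sketch of the energy statistic with the logarithmic weight f = –ln(Δr + ε) follows. Each data point carries charge +1/n and each MC point –1/m; this normalisation, the function names and the use of a permutation test to assess significance are assumptions made for illustration (the slide says only that significance is assessed by MC).

```python
import math
import random


def distance(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def energy(data, mc, eps=1e-3):
    """E = sum over pairs of q_i q_j f(dr), with f = -ln(dr + eps)."""
    def f(p, q):
        return -math.log(distance(p, q) + eps)

    n, m = len(data), len(mc)
    e_dd = sum(f(data[i], data[j]) for i in range(n) for j in range(i + 1, n)) / n ** 2
    e_mm = sum(f(mc[i], mc[j]) for i in range(m) for j in range(i + 1, m)) / m ** 2
    e_dm = sum(f(d, s) for d in data for s in mc) / (n * m)
    return e_dd + e_mm - e_dm


def permutation_p_value(data, mc, n_perm=200, seed=1):
    """Fraction of random relabellings of the pooled points with E >= observed."""
    rng = random.Random(seed)
    e_obs = energy(data, mc)
    pooled = list(data) + list(mc)
    n = len(data)
    n_ge = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy(pooled[:n], pooled[n:]) >= e_obs:
            n_ge += 1
    return e_obs, n_ge / n_perm


rng = random.Random(0)
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(30)]
mc = [(rng.gauss(0.5, 1), rng.gauss(0, 1)) for _ in range(30)]
print(permutation_p_value(data, mc))
```

The variables should be scaled to comparable variances before distances are computed, in line with point 2) on the slide.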

25 Wrong Decisions

26

27 Goodness of Fit = Pattern Recognition = Find hits that belong to track. Parameter Determination = Estimate track parameters (and error matrix)

28

29 Kinematic Fitting: Why do it?

30 Kinematic Fitting: Why do it?

31

32 Toy example of Kinematic Fit
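The lecture's own toy example is not preserved in this transcript. As a stand-in, the sketch below uses a hypothetical case with the same structure as a kinematic fit: three measured angles of a triangle, with uncorrelated errors, constrained to sum to 180°. Minimising χ² subject to the constraint (via a Lagrange multiplier) shifts each measurement in proportion to its variance and reduces the fitted errors.

```python
def fit_with_sum_constraint(meas, sigma, total):
    """Least-squares adjustment of uncorrelated measurements so they sum to total.

    Returns (fitted values, fitted errors, chi2). The chi2 has one
    degree of freedom because there is one constraint.
    """
    var = [s ** 2 for s in sigma]
    var_sum = sum(var)
    excess = sum(meas) - total
    fitted = [m - v * excess / var_sum for m, v in zip(meas, var)]
    fitted_sigma = [(v - v * v / var_sum) ** 0.5 for v in var]
    chi2 = excess ** 2 / var_sum
    return fitted, fitted_sigma, chi2


# Hypothetical measurements in degrees, each with error 1 degree.
angles, errors, chi2 = fit_with_sum_constraint([62.0, 58.0, 63.0], [1.0, 1.0, 1.0], 180.0)
print(angles, errors, chi2)
```

The same two benefits appear in a real kinematic fit: improved precision on the fitted quantities, and a χ² that tests whether the event is consistent with the assumed constraints.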

33

34 PARADOX. Histogram with 100 bins, fit with 1 parameter. S_min: χ² with NDF = 99 (expected χ² = 99 ± 14). For our data, S_min(p_0) = 90. Is p_1 acceptable if S(p_1) = 115? 1) YES. Very acceptable χ² probability. 2) NO. σ_p from S(p_0 + σ_p) = S_min + 1 = 91. But S(p_1) – S(p_0) = 25, so p_1 is 5σ away from best value
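The numbers quoted in the two answers can be reproduced with a few lines of arithmetic. The sketch assumes the usual Gaussian approximations: a χ² with ν degrees of freedom has mean ν and standard deviation √(2ν), and for a parabolic S(p) a parameter value lies √ΔS standard deviations from the minimum.

```python
import math

ndf = 99
s_min = 90.0   # S at the best-fit value p0
s_p1 = 115.0   # S at the alternative value p1

# Answer 1: compare the absolute value of S with the chi-squared(99) distribution.
expected_spread = math.sqrt(2 * ndf)          # about 14
pull_gof = (s_p1 - ndf) / expected_spread     # about 1.1 sigma above expectation

# Answer 2: compare the increase in S with the Delta S = 1 rule for one parameter.
n_sigma_param = math.sqrt(s_p1 - s_min)       # 5 sigma from the best value

print("expected chi2 = %d +- %.0f" % (ndf, expected_spread))
print("S(p1) lies %.2f sigma above the expected chi2" % pull_gof)
print("p1 lies %.0f sigma from p0" % n_sigma_param)
```

The two answers use different quantities: the first judges the absolute size of S against 99 degrees of freedom, the second judges the change in S against a single degree of freedom.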

35

36 Next time: Bayes and Frequentism: the return of an old controversy. The ideologies, with examples; upper limits; Feldman and Cousins; summary