Bayesian Statistics in Analysis
Harrison B. Prosper, Florida State University
Workshop on Top Physics: from the Tevatron to the LHC, Grenoble, October 19, 2007
Outline
- Introduction
- Inference
- Model Selection
- Summary
Introduction
Blaise Pascal (1670), Thomas Bayes (1763), Pierre Simon de Laplace (1812)
Introduction
Let P(A) and P(B) be probabilities assigned to statements, or events, A and B, and let P(AB) be the probability assigned to the joint statement AB. The conditional probability of A given B is then defined by

P(A|B) = P(AB) / P(B).

P(A) is the probability of A without the restriction specified by B; P(A|B) is the probability of A when we restrict to the conditions specified by statement B.
Introduction
From P(AB) = P(A|B) P(B) = P(B|A) P(A) we deduce immediately Bayes' theorem:

P(A|B) = P(B|A) P(A) / P(B).

Bayesian statistics is the application of Bayes' theorem to problems of inference.
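As a concrete illustration, Bayes' theorem can be checked numerically from a joint probability table; the numbers below are invented, only the algebra matters:

```python
# Illustrative joint probabilities for two binary statements A and B
# (made-up values; any valid joint distribution would do).
P_AB    = 0.20   # P(A and B)
P_AnotB = 0.10   # P(A and not B)
P_nAB   = 0.30   # P(not A and B)
P_nAnB  = 0.40   # P(not A and not B)

P_A = P_AB + P_AnotB            # marginal probability of A
P_B = P_AB + P_nAB              # marginal probability of B

P_A_given_B = P_AB / P_B        # conditional probability, by definition
P_B_given_A = P_AB / P_A

# Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B)
bayes = P_B_given_A * P_A / P_B
print(P_A_given_B, bayes)       # both ~ 0.4
```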
Inference
Inference
The Bayesian approach to inference is conceptually simple and always the same:

Compute Pr(Data|Model).
Compute Pr(Model|Data) = Pr(Data|Model) Pr(Model) / Pr(Data).

Pr(Model) is called the prior: the probability assigned to the Model irrespective of the Data. Pr(Data|Model) is called the likelihood. Pr(Model|Data) is called the posterior probability.
Inference
In practice, inference is done using the continuous form of Bayes' theorem:

p(θ|D) = ∫ p(D|θ, ν) π(θ, ν) dν / p(D),

where p(θ|D) is the posterior density, π(θ, ν) the prior density, p(D|θ, ν) the likelihood, and the integral over ν is the marginalization. Here θ are the parameters of interest and ν denotes all other parameters of the problem, which are referred to as nuisance parameters.
Example – 1
Model: the datum is an observed count N, with likelihood

P(N|s, b) = Poisson(N; s + b) = (s + b)^N exp(-(s + b)) / N!,

where s is the mean signal count and b is the mean background count. Task: infer s, given N and the prior information.
Example – 1
Apply Bayes' theorem:

p(s, b|N) = P(N|s, b) π(s, b) / P(N)    (posterior = likelihood × prior / evidence).

Here π(s, b) is the prior density for s and b, which encodes our prior knowledge of the signal and background means. The encoding is often difficult and can be controversial.
Example – 1
First factor the prior: π(s, b) = π(b|s) π(s). Define the marginal likelihood

P(N|s) = ∫ P(N|s, b) π(b|s) db

and write the posterior density for the signal as

p(s|N) = P(N|s) π(s) / P(N).
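This marginalization can be sketched numerically on a grid. The numbers below are toy inputs: an observed count N, a flat prior on s, and a Gamma prior on b standing in for π(b) (as derived from a Monte Carlo experiment with B events and scale factor k):

```python
import numpy as np
from math import factorial

# Toy inputs, assumed for illustration
N, B, k = 10, 8, 0.5

s = np.linspace(0.0, 30.0, 601)           # signal-mean grid
b = np.linspace(1e-6, 30.0, 601)          # background-mean grid
S, Bb = np.meshgrid(s, b, indexing="ij")

likelihood = (S + Bb)**N * np.exp(-(S + Bb)) / factorial(N)        # P(N|s,b)
prior_b = Bb**B * np.exp(-Bb / k) / (k**(B + 1) * factorial(B))    # Gamma(B+1, k)

# Marginalize over b (flat prior on s), then normalize on the grid
posterior_s = (likelihood * prior_b).sum(axis=1)
posterior_s /= posterior_s.sum() * (s[1] - s[0])

mean_s = np.sum(s * posterior_s) * (s[1] - s[0])
print(f"posterior mean of s ~ {mean_s:.2f}")
```

The grid sum over b is exactly the marginal likelihood P(N|s) evaluated pointwise; any quadrature or Monte Carlo scheme would serve the same purpose.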
Example – 1: The Background Prior Density
Suppose that the background has been estimated from a Monte Carlo simulation of the background process, yielding B events that pass the cuts. Assume that the probability for the count B is given by P(B|λ) = Poisson(B; λ), where λ is the (unknown) mean count of the Monte Carlo sample. We can infer the value of λ by applying Bayes' theorem to the Monte Carlo background experiment:

p(λ|B) = P(B|λ) π(λ) / P(B).
Example – 1: The Background Prior Density
Assuming a flat prior, π(λ) = constant, we find

p(λ|B) = Gamma(λ; B + 1, 1) = λ^B exp(-λ)/B!.

Often the mean background count b in the real experiment is related to the mean count λ in the Monte Carlo experiment linearly, b = kλ, where k is an accurately known scale factor, for example, the ratio of the data to Monte Carlo integrated luminosities. The background can then be estimated as follows.
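This posterior can be checked numerically: the normalized density λ^B exp(-λ)/B! integrates to one and has mean B + 1 (a toy value of B is assumed below):

```python
import numpy as np
from math import factorial

B = 8                                          # illustrative Monte Carlo count
lam = np.linspace(1e-6, 60.0, 600001)
post = lam**B * np.exp(-lam) / factorial(B)    # Gamma(lambda; B+1, 1)

dlam = lam[1] - lam[0]
norm = post.sum() * dlam                       # should be ~ 1
mean = (lam * post).sum() * dlam               # should be ~ B + 1 = 9
print(norm, mean)
```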
Example – 1: The Background Prior Density
The posterior density p(λ|B) now serves as the prior density for the background b in the real experiment. With the change of variable b = kλ,

π(b) = p(λ|B)/k, with λ = b/k,

so that b has a Gamma(B + 1, k) density. We can then write the marginal likelihood as

P(N|s) = ∫ P(N|s, b) π(b) db.
Example – 1
The calculation of the marginal likelihood yields

P(N|s) = exp(-s) Σ_{r=0}^{N} (s^r/r!) C(N - r + B, B) k^{N-r} / (1 + k)^{N-r+B+1},

where C(n, m) is the binomial coefficient.
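With a Gamma(B + 1, k) background prior, the marginal likelihood reduces to the closed form P(N|s) = exp(-s) Σ_r (s^r/r!) C(N-r+B, B) k^(N-r)/(1+k)^(N-r+B+1), a standard result. The sketch below cross-checks that sum against brute-force integration over b (toy values of N, s, B, k assumed):

```python
import numpy as np
from math import comb, exp, factorial

def marginal_likelihood(N, s, B, k):
    """Closed-form P(N|s) after integrating out b with the Gamma(B+1, k) prior."""
    return exp(-s) * sum(
        s**r / factorial(r)
        * comb(N - r + B, B)
        * k**(N - r) / (1.0 + k)**(N - r + B + 1)
        for r in range(N + 1)
    )

def marginal_likelihood_numeric(N, s, B, k):
    """Brute-force integral of Poisson(N; s+b) x Gamma(b; B+1, k) over b."""
    b = np.linspace(1e-9, 100.0, 1000001)
    poisson = (s + b)**N * np.exp(-(s + b)) / factorial(N)
    prior = b**B * np.exp(-b / k) / (k**(B + 1) * factorial(B))
    return float(np.sum(poisson * prior) * (b[1] - b[0]))

ana = marginal_likelihood(5, 2.0, 3, 0.5)
num = marginal_likelihood_numeric(5, 2.0, 3, 0.5)
print(ana, num)   # the two should agree closely
```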
Example – 2: Top Mass – Run I
The data d = (d_1, …, d_K) are partitioned into K bins and modeled by a sum of N sources of strengths p = (p_1, …, p_N). The numbers A_kn are the source distributions for model M, so the likelihood is

P(d|p, M) = Π_{k=1}^{K} Poisson(d_k; Σ_{n=1}^{N} A_kn p_n).

Each M corresponds to a different top signal + background model, and Bayes' theorem gives the posterior probability of each model from its prior, likelihood, and the marginalization over p:

P(M|d) ∝ π(M) ∫ P(d|p, M) π(p|M) dp.
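A toy sketch of such a model scan, with invented templates: each candidate model M predicts a binned spectrum (common background plus a signal peaking in a different bin), and the posterior P(M|d) is computed from the Poisson likelihood of the observed counts under a flat prior over models. Source strengths are held fixed here rather than marginalized, to keep the sketch short:

```python
import numpy as np

def log_likelihood(d, mu):
    """log Poisson likelihood, up to a model-independent constant (the d! term)."""
    return float(np.sum(d * np.log(mu) - mu))

# Hypothetical templates for illustration: a common flat background plus a
# signal peaking in a different bin for each candidate "top mass" model M.
background = np.full(7, 10.0)
signals = [np.array(sig, dtype=float) for sig in (
    [0, 20, 5, 0, 0, 0, 0],
    [0, 0, 5, 20, 5, 0, 0],
    [0, 0, 0, 0, 5, 20, 0],
)]
models = [background + sig for sig in signals]

data = models[1].copy()        # pseudo-data set to the expectation of model 1

logL = np.array([log_likelihood(data, mu) for mu in models])
post = np.exp(logL - logL.max())
post /= post.sum()             # flat prior P(M) over the three models
print(post)                    # posterior probability peaks at model 1
```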
Example – 2: Top Mass – Run I
[Figure: probability of model M, P(M|d), versus top quark mass (GeV/c²), peaking near the measured mass.]
Results: m_top = 173.5 ± 4.5 GeV, s = 33 ± 8 events, b = 50.8 ± 8.3 events.
To Bin Or Not To Bin
Binned – Pros:
- Likelihood can be modeled accurately
- Bins with low counts can be handled exactly
- Statistical uncertainties handled exactly
Binned – Cons:
- Information loss can be severe
- Suffers from the curse of dimensionality
To Bin Or Not To Bin
Binned likelihoods do work! (December 8, 2006)
To Bin Or Not To Bin
Un-binned – Pros:
- No loss of information (in principle)
Un-binned – Cons:
- Can be difficult to model the likelihood accurately; requires fitting (either parametric or KDE)
- Error in the likelihood grows approximately linearly with the sample size, so at the LHC, large sample sizes could become an issue
Un-binned Likelihood Functions
Start with the standard binned likelihood over K bins,

L(σ) = Π_{k=1}^{K} Poisson(N_k; a_k σ + b_k),

where N_k is the observed count in bin k, a_k σ is the mean signal count (a_k being the effective luminosity in bin k and σ the cross section), and b_k is the mean background count.
Un-binned Likelihood Functions
Make the bins smaller and smaller; in the limit the likelihood becomes

L(σ) = exp(-(Aσ + B)) Π_{i=1}^{K} [a(x_i) σ + b(x_i)],

where K is now the number of events, a(x) and b(x) are the effective luminosity and background densities, respectively, and A = ∫ a(x) dx and B = ∫ b(x) dx are their integrals.
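This limit can be checked numerically. Below, a(x) and b(x) are invented densities on [0, 1] and the event positions are fixed by hand; for a finely binned Poisson likelihood, likelihood *differences* in ln L (which are all the posterior uses) agree with the un-binned expression -(Aσ + B) + Σ_i ln[a(x_i)σ + b(x_i)], since the bin-width constants cancel:

```python
import numpy as np

# Invented ingredients for illustration
A, B = 50.0, 100.0                        # integrals of a(x), b(x) over [0, 1]
a = lambda x: A * 2.0 * x                 # effective luminosity density
b = lambda x: B * np.ones_like(x)         # flat background density
events = np.array([0.12, 0.31, 0.55, 0.74, 0.93])   # fixed event positions

def lnL_unbinned(sigma):
    return -(A * sigma + B) + float(np.sum(np.log(a(events) * sigma + b(events))))

def lnL_binned(sigma, nbins=100000):
    edges = np.linspace(0.0, 1.0, nbins + 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = 1.0 / nbins
    mu = (a(centers) * sigma + b(centers)) * width   # mean count per bin
    counts, _ = np.histogram(events, bins=edges)
    return float(np.sum(counts * np.log(mu) - mu))   # ln N_k! vanishes for 0/1 counts

# Bin-width constants cancel in differences of ln L between two sigma values
d_unbinned = lnL_unbinned(1.5) - lnL_unbinned(0.5)
d_binned = lnL_binned(1.5) - lnL_binned(0.5)
print(d_unbinned, d_binned)               # agree to high accuracy
```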
Un-binned Likelihood Functions
The un-binned likelihood function is an example of a marked Poisson likelihood. Each event is marked by the discriminating variable x_i, which could be multi-dimensional. The various methods for measuring the top cross section and mass differ in the choice of discriminating variables x.
Un-binned Likelihood Functions
Note: since the functions a(x) and b(x) have to be modeled, they depend on sets of modeling parameters α and β, respectively. Therefore, in general, the un-binned likelihood function is

L(σ, α, β) = exp(-[A(α)σ + B(β)]) Π_{i=1}^{K} [a(x_i, α) σ + b(x_i, β)],

which must be combined with a prior density π(σ, α, β) to compute the posterior density for the cross section:

p(σ|data) ∝ ∫ L(σ, α, β) π(σ, α, β) dα dβ.
Computing the Un-binned Likelihood Function
If we write s(x) = a(x)σ and S = Aσ, we can re-write the un-binned likelihood function as

L = exp(-(S + B)) Π_{i=1}^{K} [s(x_i) + b(x_i)].

Since a likelihood function is defined only to within a scaling by a parameter-independent quantity, we are free to scale it by, for example, the observed distribution d(x):

L ∝ exp(-(S + B)) Π_{i=1}^{K} [s(x_i) + b(x_i)] / d(x_i).
Computing the Un-binned Likelihood Function
One way to approximate the ratio [s(x) + b(x)]/d(x) is with a neural network function trained on an admixture of data, signal, and background in the ratio 2:1:1. If the training can be done accurately enough, the network will approximate

n(x) = [s(x) + b(x)] / [s(x) + b(x) + d(x)],

in which case we can then write

[s(x) + b(x)] / d(x) = n(x) / [1 - n(x)].
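The algebra behind this substitution can be verified directly with invented densities standing in for s(x), b(x), and d(x): if n(x) = (s + b)/(s + b + d), then n/(1 - n) reproduces (s + b)/d exactly, which is what a perfectly trained network would deliver:

```python
import numpy as np

x = np.linspace(0.01, 0.99, 9)
# Invented toy densities (any positive functions will do)
s = 5.0 * np.exp(-0.5 * ((x - 0.5) / 0.1)**2)   # peaked "signal" density
b = 10.0 * (1.0 - 0.5 * x)                       # sloped "background" density
d = s + b + 1.0                                  # "data" density; need not equal s + b

n = (s + b) / (s + b + d)    # the target of the 2:1:1 network training
ratio_from_n = n / (1.0 - n)
ratio_direct = (s + b) / d
print(np.allclose(ratio_from_n, ratio_direct))   # True
```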
Model Selection
Model Selection
Model selection can also be addressed using Bayes' theorem. It requires computing the posterior probability of each model M,

P(M|D) = P(D|M) P(M) / P(D),

where the evidence for model M is defined by

P(D|M) = ∫ P(D|θ, M) π(θ|M) dθ.
Model Selection
The posterior odds factor into a Bayes factor times the prior odds:

P(M|D)/P(N|D) = [P(D|M)/P(D|N)] × [P(M)/P(N)].

The Bayes factor, B_MN = P(D|M)/P(D|N), or any one-to-one function thereof, can be used to choose between two competing models M and N, e.g., signal + background versus background only. However, one must be careful to use proper priors.
Model Selection – Example
Consider the following two prototypical models, with known signal and background means:

Model 1: P(N|1) = Poisson(N; s + b)
Model 2: P(N|2) = Poisson(N; b)

The Bayes factor for these models is given by

B_12 = P(N|1)/P(N|2) = exp(-s) (1 + s/b)^N.
Model Selection – Example: Calibration of Bayes Factors
Consider the Kullback–Leibler divergence

K(1‖2) = Σ_N P(N|1) ln [P(N|1)/P(N|2)].

For the simple Poisson models with known signal and background, it is easy to show that

K(1‖2) = (s + b) ln(1 + s/b) - s ≈ s²/(2b) for s ≪ b.

That is, roughly speaking, for s ≪ b, √(2 K(1‖2)) ≈ s/√b, and likewise √(2 ln B_12) ≈ s/√b when evaluated at the expected count: the familiar significance measure.
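A numeric sketch of this calibration, with toy values of s and b: evaluate the exact ln B_12 at the expected count N = s + b and compare √(2 ln B_12) with the s/√b approximation:

```python
from math import log, sqrt

s, b = 5.0, 100.0                  # toy signal and background means, s << b
N = s + b                          # expected count under model 1 (signal + background)

# ln B_12 = ln[Poisson(N; s+b) / Poisson(N; b)] = -s + N ln(1 + s/b)
lnB = -s + N * log(1.0 + s / b)

print(sqrt(2.0 * lnB), s / sqrt(b))   # close for s << b
```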
Summary
Bayesian statistics is a well-founded and general framework for thinking about and solving analysis problems, including: analysis design, modeling uncertainty, parameter estimation, interval estimation (limit setting), model selection, signal/background discrimination, etc. It is well worth learning how to think this way!