Bayesian Within the Gates: A View From Particle Physics


1 Bayesian Within the Gates: A View From Particle Physics
Harrison B. Prosper, Florida State University, SAMSI, 24 January 2006

2 Outline
Measuring Zero as Precisely as Possible!
Signal/Background Discrimination: 1-D Example, 14-D Example
Some Open Issues
Summary

3 Measuring Zero!
Diamonds may not be forever: neutron <-> anti-neutron transitions, CRISP Experiment (1982-1985), Institut Laue-Langevin, Grenoble, France.
Method: fire a gas of cold neutrons onto a graphite foil and look for annihilation of the anti-neutron component.

4 Measuring Zero!
Count the number of signal + background events, N. Independently, suppress the putative signal and count background events, B.
Results: N = 3, B = 7

5 Measuring Zero!
Classic 2-parameter counting experiment:
N ~ Poisson(s + b)
B ~ Poisson(b)
Wanted: a statement of the form s < u at 90% CL.

6 Measuring Zero!
In 1984, no exact solution existed in the particle physics literature! But surely it must have been solved by statisticians. Alas, from Kendall and Stuart I learnt that calculating exact confidence intervals is "a matter of very considerable difficulty".

7 Measuring Zero!
Exact in what way? Over the ensemble of statements of the form s ∈ [0, u), at least 90% of them should be true, whatever the true value of the signal s AND whatever the true value of the background parameter b. Blame Neyman (1937).

8 "Keep it simple, but no simpler" (Albert Einstein)

9 The Gate (1984)
Solution:
p(N, B | s, b) = Poisson(N; s + b) Poisson(B; b)   (the likelihood)
p(s, b) = uniform(s, b)   (the prior)
Compute the posterior density
p(s, b | N, B) = p(N, B | s, b) p(s, b) / p(N, B)
and marginalize over b:
p(s | N, B) = ∫ p(s, b | N, B) db
This reasoning was compelling to me then, and is much more so now!
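A minimal numerical sketch of this calculation (my own illustration, not the original analysis code): put flat priors on s and b over finite grids, marginalize the background, and read off a 90% credible upper limit on s. Grid ranges and step sizes are arbitrary choices made here.

    import numpy as np
    from scipy.stats import poisson

    N, B = 3, 7                        # observed counts
    s = np.linspace(0.0, 20.0, 2001)   # signal grid (range chosen for illustration)
    b = np.linspace(0.0, 30.0, 3001)   # background grid (range chosen for illustration)

    S, Bg = np.meshgrid(s, b, indexing="ij")
    # Joint likelihood p(N, B | s, b) = Poisson(N; s + b) * Poisson(B; b)
    like = poisson.pmf(N, S + Bg) * poisson.pmf(B, Bg)

    post = np.trapz(like, b, axis=1)   # flat priors: marginalize over b
    post /= np.trapz(post, s)          # normalized p(s | N, B)

    cdf = np.cumsum(post) * (s[1] - s[0])
    print("90% upper limit on s:", s[np.searchsorted(cdf, 0.90)])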

10 Particle Physics Data
proton + anti-proton -> positron (e+) + neutrino (ν) + 4 jets (Jet1, Jet2, Jet3, Jet4)
This event "lives" in 17 dimensions (three measured variables for the positron and for each of the four jets, plus two for the neutrino).

11 Particle Physics Data
CDF/DZero discovery of the top quark (1995). DZero: 17-D -> 2-D.
[Plot: data in red, signal in green, backgrounds in blue and magenta.]

12 But that was then, and now is now!
Today we have 2 GHz laptops with 2 GB of memory! It is fun to deploy huge, sometimes unreliable, computational resources, that is, brains, to reduce the dimensionality of data. But perhaps it is now feasible to work directly in the original high-dimensional space, using hardware!

13 Signal/Background Discrimination
The optimal solution is to compute
p(S|x) = p(x|S) p(S) / [p(x|S) p(S) + p(x|B) p(B)]
Every signal/background discrimination method is ultimately an algorithm to approximate this solution, or a mapping thereof. Therefore, if a method is already at the Bayes limit, no other method, however sophisticated, can do better!
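A toy 1-D illustration (my own construction, not from the talk) of this Bayes discriminant when the class densities are known exactly; the Gaussian densities and equal priors below are assumptions for the example.

    import numpy as np
    from scipy.stats import norm

    p_S, p_B = 0.5, 0.5                     # assumed equal prior class probabilities
    x = np.linspace(-4.0, 6.0, 501)

    px_S = norm.pdf(x, loc=2.0, scale=1.0)  # p(x|S), assumed signal density
    px_B = norm.pdf(x, loc=0.0, scale=1.0)  # p(x|B), assumed background density

    pS_x = px_S * p_S / (px_S * p_S + px_B * p_B)  # Bayes discriminant p(S|x)
    # Any classifier trained on samples from these densities can at best
    # reproduce pS_x (or a one-to-one mapping of it).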

14 Signal/Background Discrimination
Given D = {x, y}, with x = {x1, ..., xN} and y = {y1, ..., yN}, a set of N training examples, infer a discriminant function f(x, w) with parameters w:
p(w|x, y) = p(x, y|w) p(w) / p(x, y)
= p(y|x, w) p(x|w) p(w) / [p(y|x) p(x)]
= p(y|x, w) p(w) / p(y|x), assuming p(x|w) -> p(x)

15 Signal/Background Discrimination
A typical likelihood for classification:
p(y|x, w) = Π_i f(xi, w)^yi [1 - f(xi, w)]^(1 - yi)
where yi = 0 for background events and yi = 1 for signal events.
If f(x, w) is flexible enough, then maximizing p(y|x, w) with respect to w yields f = p(S|x), asymptotically.
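A minimal sketch (my own) of this classification log-likelihood, written for a generic discriminant; f and w are placeholders for whatever function family one chooses.

    import numpy as np

    def log_likelihood(f, w, x, y, eps=1e-12):
        """log p(y|x, w) = sum_i [ y_i log f(x_i, w) + (1 - y_i) log(1 - f(x_i, w)) ]"""
        p = np.clip(f(x, w), eps, 1.0 - eps)   # keep the logs finite
        return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))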

16 Signal/Background Discrimination
However, in a full Bayesian calculation one usually averages with respect to the posterior density:
y(x) = ∫ f(x, w) p(w|D) dw
Questions:
1. Do suitably flexible functions f(x, w) exist?
2. Is there a feasible way to do the integral?
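A small sketch (assuming we already have posterior samples of the weights, for example the last M points of a Markov chain) of the Monte Carlo estimate of this integral: y(x) ≈ (1/M) Σ_k f(x, w_k).

    import numpy as np

    def posterior_average(f, weight_samples, x):
        """Monte Carlo estimate of y(x) = ∫ f(x, w) p(w|D) dw."""
        return np.mean([f(x, w) for w in weight_samples], axis=0)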

17 Answer 1: Hilbert’s 13th Problem!
Prove that the following is impossible:
y(x, y, z) = F( A(x), B(y), C(z) )
In 1957, Kolmogorov proved the contrary conjecture:
y(x1, ..., xn) = F( f1(x1), ..., fn(xn) )
I'll call such functions F Kolmogorov functions.

18 Kolmogorov Functions
[Diagram: a feed-forward network n(x, w) with inputs x1, x2, hidden-layer parameters (u, a), and output parameters (v, b).]
A neural network is an example of a Kolmogorov function, that is, a function capable of approximating arbitrary mappings f: R^N -> U. The parameters w = (u, a, v, b) are called weights.
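A minimal sketch of such a network (my own code, matching the diagram's labels rather than the speaker's implementation): one hidden layer of tanh units feeding a sigmoid output, with weights w = (u, a, v, b).

    import numpy as np

    def network(x, w):
        """n(x, w) for a single hidden layer; x has shape (n_events, n_inputs)."""
        u, a, v, b = w                    # u: input->hidden weights, a: hidden biases
        h = np.tanh(x @ u + a)            # hidden-layer activations
        return 1.0 / (1.0 + np.exp(-(h @ v + b)))   # output in (0, 1), read as p(S|x)

    # Example with two inputs (x1, x2) and three hidden units:
    rng = np.random.default_rng(0)
    w = (rng.normal(size=(2, 3)), np.zeros(3), rng.normal(size=3), 0.0)
    print(network(rng.normal(size=(5, 2)), w))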

19 Answer 2: Use Hybrid MCMC
Computational method: generate a Markov chain (MC) of N points {w} drawn from the posterior density p(w|D) and average over the last M points. Each point corresponds to a network.
Software: Flexible Bayesian Modeling by Radford Neal.
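Neal's Flexible Bayesian Modeling package samples the weights with Hamiltonian ("hybrid") Monte Carlo; the sketch below substitutes a much simpler random-walk Metropolis sampler, just to make the workflow concrete: draw a chain of networks from p(w|D), keep the last few, and average them. The (1, 15, 1) shape, the Gaussian toy data, the unit Gaussian prior on the weights, and all tuning constants are illustrative choices of mine.

    import numpy as np

    H = 15                                     # hidden units, as in the 1-D example

    def net(x, w):
        """(1, H, 1) network with a flat weight vector w of length 3*H + 1."""
        u, a, v, b = w[:H], w[H:2*H], w[2*H:3*H], w[3*H]
        h = np.tanh(np.outer(x, u) + a)
        return 1.0 / (1.0 + np.exp(-(h @ v + b)))

    def log_post(w, x, y, eps=1e-12):
        p = np.clip(net(x, w), eps, 1 - eps)
        return (np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))  # Bernoulli likelihood
                - 0.5 * np.sum(w**2))                             # unit Gaussian prior

    def metropolis(x, y, n_steps=5000, step=0.05, seed=1):
        rng = np.random.default_rng(seed)
        w = rng.normal(scale=0.1, size=3*H + 1)
        lp = log_post(w, x, y)
        chain = []
        for _ in range(n_steps):
            w_new = w + rng.normal(scale=step, size=w.size)
            lp_new = log_post(w_new, x, y)
            if np.log(rng.uniform()) < lp_new - lp:               # accept/reject
                w, lp = w_new, lp_new
            chain.append(w.copy())
        return np.array(chain)

    # Toy "signal" and "background" samples, then average the last 20 networks.
    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(1, 1, 250), rng.normal(-1, 1, 250)])
    y = np.concatenate([np.ones(250), np.zeros(250)])
    chain = metropolis(x, y)
    grid = np.linspace(-4, 4, 101)
    y_of_x = np.mean([net(grid, w) for w in chain[-20:]], axis=0)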

20 A 1-D Example
Signal: p + pbar -> t q b
Background: p + pbar -> W b b
NN model class: (1, 15, 1)
MCMC: 500 tqb + Wbb events; use the last 20 networks in a MC chain of 500.
[Plot: distributions of the variable x for Wbb and tqb events.]

21 A 1-D Example
[Plot: dots show p(S|x) = HS/(HS + HB), where HS and HB are 1-D histograms of signal and background; thin curves show individual networks n(x, wk); the black curve is the average <n(x, w)> versus x.]
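A small sketch (my own, not the talk's code) of how such "dots" can be computed: histogram the signal and background samples and take HS/(HS + HB) bin by bin; xs and xb below stand for the signal and background samples, and the binning is an arbitrary choice.

    import numpy as np

    def binned_pSx(xs, xb, bins=40, hist_range=(-4.0, 4.0)):
        hs, edges = np.histogram(xs, bins=bins, range=hist_range)
        hb, _     = np.histogram(xb, bins=bins, range=hist_range)
        centers = 0.5 * (edges[:-1] + edges[1:])
        total = hs + hb
        p = np.where(total > 0, hs / np.maximum(total, 1), np.nan)  # empty bins -> NaN
        return centers, p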

22 A 14-D Example (Finding Susy!)
[Plot: transverse momentum spectra; the signal is the black curve.]
Signal/noise ~ 1/100,000

23 A 14-D Example (Finding Susy!)
[Plot: missing transverse momentum spectrum, caused by the escape of neutrinos and Susy particles.]
Variable count: 4 × (ET, η, φ) + (ET, φ) = 14

24 A 14-D Example (Finding Susy!)
Signal: 250 p + pbar -> top + anti-top (MC) events
Background: 250 p + pbar -> gluino gluino (MC) events
NN model class: (14, 40, 1), a 641-D parameter space!
MCMC: use the last 100 networks in a Markov chain of 10,000, skipping every 20.
[The slide also shows the likelihood and the prior.]
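For reference, the 641 follows from the usual weight-and-bias count for a (14, 40, 1) network (my counting, assuming one bias per hidden unit and one for the output):

    14 × 40 + 40 + 40 × 1 + 1 = 560 + 40 + 40 + 1 = 641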

25 But does it Work?
Signal/noise can reach 1/1 with an acceptable signal strength.

26 But does it Work?
Let d(x) = N p(x|S) + N p(x|B) be the density of the data, containing 2N events, assuming, for simplicity, p(S) = p(B). A properly trained classifier y(x) approximates
p(S|x) = p(x|S) / [p(x|S) + p(x|B)]
Therefore, if the signal and background events are weighted with y(x), we should recover the signal density.
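A small numerical check of this argument (my construction, using 1-D Gaussian toys rather than the talk's physics samples): weight every event by the exact p(S|x) and compare the weighted distribution of all events with the true signal distribution.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    N = 100_000
    sig = rng.normal(2.0, 1.0, N)          # signal sample ~ p(x|S)
    bkg = rng.normal(0.0, 1.0, N)          # background sample ~ p(x|B)
    data = np.concatenate([sig, bkg])      # 2N events, p(S) = p(B) = 1/2

    def pSx(x):                            # exact p(S|x) for these toy densities
        ps, pb = norm.pdf(x, 2.0, 1.0), norm.pdf(x, 0.0, 1.0)
        return ps / (ps + pb)

    bins = np.linspace(-4, 6, 51)
    h_weighted, _ = np.histogram(data, bins=bins, weights=pSx(data))
    h_signal, _   = np.histogram(sig, bins=bins)

    # In well-populated bins the weighted histogram tracks the signal histogram
    # to within statistical fluctuations.
    mask = h_signal > 500
    print(np.max(np.abs(h_weighted[mask] / h_signal[mask] - 1.0)))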

27 But does it Work?
Amazingly well!

28 Some Open Issues
Why does this insane function p(w1, ..., w641 | x1, ..., x500) behave so well? 641 parameters > 500 events!
How should one verify that an n-D (n ~ 14) swarm of simulated background events matches the n-D swarm of observed events (in the background region)?
How should one verify that y(x) is indeed a reasonable approximation to the Bayes discriminant p(S|x)?

29 Summary
Bayesian methods have been, and are being, used with considerable success by particle physicists. Happily, the frequentist/Bayesian Cold War is abating!
The application of Bayesian methods to highly flexible functions, e.g., neural networks, is very promising and should be broadly applicable.
Needed: a powerful way to compare high-dimensional swarms of points. Agree, or not agree, that is the question!

