Presentation is loading. Please wait.

Presentation is loading. Please wait.

G. Cowan Statistical Methods in Particle Physics1 Statistical Methods in Particle Physics Day 3: Multivariate Methods (II) 清华大学高能物理研究中心 2010 年 4 月 12—16.

Similar presentations


Presentation on theme: "G. Cowan Statistical Methods in Particle Physics1 Statistical Methods in Particle Physics Day 3: Multivariate Methods (II) 清华大学高能物理研究中心 2010 年 4 月 12—16."— Presentation transcript:

1 G. Cowan Statistical Methods in Particle Physics1 Statistical Methods in Particle Physics Day 3: Multivariate Methods (II) 清华大学高能物理研究中心 2010 年 4 月 12—16 日 Glen Cowan Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan

2 G. Cowan Statistical Methods in Particle Physics2 Outline of lectures Day #1: Introduction Review of probability and Monte Carlo Review of statistics: parameter estimation Day #2: Multivariate methods (I) Event selection as a statistical test Cut-based, linear discriminant, neural networks Day #3: Multivariate methods (II) More multivariate classifiers: BDT, SVM,... Day #4: Significance tests for discovery and limits Including systematics using profile likelihood Day #5: Bayesian methods Bayesian parameter estimation and model selection

3 G. Cowan Statistical Methods in Particle Physics3 Day #3: outline Nonparametric probability density estimation methods histograms kernel density estimation, K nearest neighbour Boosted Decision Trees Support Vector Machines

4 G. Cowan Statistical Methods in Particle Physics4 Review of the problem Suppose we have events from Monte Carlo of two types, signal and background (each instance of x is multivariate): Goal is to use these training events to adjust the parameters of a test statistic (“classifier”) y(x), that we can then use on real data to distinguish between the two types. Yesterday we saw linear classifiers, neural networks and started talking about probability density estimation methods, where we find the classifier from the ratio of estimated pdfs.

5 G. Cowan Statistical Methods in Particle Physics5

6 G. Cowan Statistical Methods in Particle Physics6

7 G. Cowan Statistical Methods in Particle Physics7

8 G. Cowan Statistical Methods in Particle Physics8

9 G. Cowan Statistical Methods in Particle Physics9

10 G. Cowan Statistical Methods in Particle Physics10

11 G. Cowan Statistical Methods in Particle Physics11

12 G. Cowan Statistical Methods in Particle Physics12

13 G. Cowan Statistical Methods in Particle Physics13

14 G. Cowan Statistical Methods in Particle Physics14

15 G. Cowan Statistical Methods in Particle Physics15

16 G. Cowan Statistical Methods in Particle Physics16

17 G. Cowan Statistical Methods in Particle Physics17

18 G. Cowan Statistical Methods in Particle Physics18

19 G. Cowan Statistical Methods in Particle Physics19

20 G. Cowan Statistical Methods in Particle Physics20

21 G. Cowan Statistical Methods in Particle Physics21

22 G. Cowan Statistical Methods in Particle Physics22

23 G. Cowan Statistical Methods in Particle Physics23

24 G. Cowan Statistical Methods in Particle Physics24

25 G. Cowan Statistical Methods in Particle Physics25 Boosted decision trees

26 G. Cowan Statistical Methods in Particle Physicspage 26 Particle i.d. in MiniBooNE Detector is a 12-m diameter tank of mineral oil exposed to a beam of neutrinos and viewed by 1520 photomultiplier tubes: H.J. Yang, MiniBooNE PID, DNP06 Search for  to e oscillations required particle i.d. using information from the PMTs.

27 G. Cowan Statistical Methods in Particle Physicspage 27 Decision trees Out of all the input variables, find the one for which with a single cut gives best improvement in signal purity: Example by MiniBooNE experiment, B. Roe et al., NIM 543 (2005) 577 where w i. is the weight of the ith event. Resulting nodes classified as either signal/background. Iterate until stop criterion reached based on e.g. purity or minimum number of events in a node. The set of cuts defines the decision boundary.

28 G. Cowan Statistical Methods in Particle Physicspage 28

29 G. Cowan Statistical Methods in Particle Physicspage 29

30 G. Cowan Statistical Methods in Particle Physicspage 30

31 G. Cowan Statistical Methods in Particle Physicspage 31

32 G. Cowan Statistical Methods in Particle Physicspage 32 BDT example from MiniBooNE ~200 input variables for each event ( interaction producing e,  or  Each individual tree is relatively weak, with a misclassification error rate ~ 0.4 – 0.45 B. Roe et al., NIM 543 (2005) 577

33 G. Cowan Statistical Methods in Particle Physicspage 33 Monitoring overtraining From MiniBooNE example: Performance stable after a few hundred trees.

34 G. Cowan Statistical Methods in Particle Physicspage 34 Monitoring overtraining (2)

35 G. Cowan Statistical Methods in Particle Physicspage 35 Comparison of boosting algorithms A number of boosting algorithms on the market; differ in the update rule for the weights.

36 G. Cowan Statistical Methods in Particle Physicspage 36

37 G. Cowan Statistical Methods in Particle Physicspage 37 Single top quark production (CDF/D0) Top quark discovered in pairs, but SM predicts single top production. Use many inputs based on jet properties, particle i.d.,... signal (blue + green) Pair-produced tops are now a background process.

38 G. Cowan Statistical Methods in Particle Physicspage 38 Different classifiers for single top Also Naive Bayes and various approximations to likelihood ratio,.... Final combined result is statistically significant (>5  level) but not easy to understand classifier outputs.

39 G. Cowan Statistical Methods in Particle Physicspage 39

40 G. Cowan Statistical Methods in Particle Physicspage 40

41 G. Cowan Statistical Methods in Particle Physicspage 41

42 G. Cowan Statistical Methods in Particle Physicspage 42

43 G. Cowan Statistical Methods in Particle Physicspage 43

44 G. Cowan Statistical Methods in Particle Physicspage 44

45 G. Cowan Statistical Methods in Particle Physicspage 45

46 G. Cowan Statistical Methods in Particle Physicspage 46

47 G. Cowan Statistical Methods in Particle Physicspage 47

48 G. Cowan Statistical Methods in Particle Physicspage 48

49 G. Cowan Statistical Methods in Particle Physicspage 49

50 G. Cowan Statistical Methods in Particle Physicspage 50

51 G. Cowan Statistical Methods in Particle Physicspage 51 Using an SVM To use an SVM the user must as a minimum choose a kernel function (e.g. Gaussian) any free parameters in the kernel (e.g. the  of the Gaussian) a cost parameter C (plays role of regularization parameter) The training is relatively straightforward because, in contrast to neural networks, the function to be minimized has a single global minimum. Furthermore evaluating the classifier only requires that one retain and sum over the support vectors, a relatively small number of points. The advantages/disadvantages and rationale behind the choices above is not always clear to the particle physicist -- help needed here.

52 G. Cowan Statistical Methods in Particle Physicspage 52 SVM in particle physics SVMs are very popular in the Machine Learning community but have yet to find wide application in HEP. Here is an early example from a CDF top quark anlaysis (A. Vaiciulis, contribution to PHYSTAT02). signal eff.

53 G. Cowan Statistical Methods in Particle Physicspage 53 Summary on multivariate methods Particle physics has used several multivariate methods for many years: linear (Fisher) discriminant neural networks naive Bayes and has in the last several years started to use a few more k-nearest neighbour boosted decision trees support vector machines The emphasis is often on controlling systematic uncertainties between the modeled training data and Nature to avoid false discovery. Although many classifier outputs are "black boxes", a discovery at 5  significance with a sophisticated (opaque) method will win the competition if backed up by, say, 4  evidence from a cut-based method.

54 Quotes I like “If you believe in something you don't understand, you suffer,...” – Stevie Wonder “Keep it simple. As simple as possible. Not any simpler.” – A. Einstein G. Cowan page 54Statistical Methods in Particle Physics

55 G. Cowan Statistical Methods in Particle Physicspage 55 Extra slides

56 G. Cowan Statistical Methods in Particle Physicspage 56

57 G. Cowan Statistical Methods in Particle Physicspage 57

58 G. Cowan Statistical Methods in Particle Physicspage 58

59 G. Cowan Statistical Methods in Particle Physicspage 59


Download ppt "G. Cowan Statistical Methods in Particle Physics1 Statistical Methods in Particle Physics Day 3: Multivariate Methods (II) 清华大学高能物理研究中心 2010 年 4 月 12—16."

Similar presentations


Ads by Google