Computing and Statistical Data Analysis Stat 5: Multivariate Methods
London Postgraduate Lectures on Particle Physics; University of London MSci course PH4515
Glen Cowan, Physics Department, Royal Holloway, University of London
Course web page:
Finding an optimal decision boundary
In particle physics one usually starts by making simple "cuts" on individual variables, xi < ci and xj < cj, which select a rectangular region in the (xi, xj) plane to separate the hypotheses H0 and H1. Later one may try some other type of decision boundary, i.e., a more general curve separating the H0 and H1 regions.
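As a rough sketch of the difference between the two approaches (the cut values, weight vector and toy samples below are purely illustrative, not taken from the lectures):

```python
import numpy as np

def cut_based_selection(x, c1=1.0, c2=2.0):
    """Accept an event if x1 < c1 and x2 < c2 (rectangular region)."""
    x1, x2 = x[:, 0], x[:, 1]
    return (x1 < c1) & (x2 < c2)

def linear_boundary_selection(x, w=np.array([-1.0, -0.5]), t_cut=0.0):
    """Accept an event if a linear test statistic t = w . x exceeds t_cut;
    the boundary t(x) = t_cut is a straight line in the (x1, x2) plane."""
    return x @ w > t_cut

# toy events: two overlapping Gaussian classes (illustrative only)
rng = np.random.default_rng(0)
h0 = rng.normal([2.0, 3.0], 1.0, size=(1000, 2))   # background-like events
h1 = rng.normal([0.0, 1.0], 1.0, size=(1000, 2))   # signal-like events

eff_sig = cut_based_selection(h1).mean()   # signal efficiency of the cuts
eff_bkg = cut_based_selection(h0).mean()   # background efficiency of the cuts
print(f"cuts: signal eff = {eff_sig:.2f}, background eff = {eff_bkg:.2f}")
```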
Multivariate methods
Many new (and some old) methods:
Fisher discriminant
Neural networks
Kernel density methods
Support Vector Machines
Decision trees
Boosting
Bagging
New software for HEP, e.g.:
TMVA: Höcker, Stelzer, Tegenfeldt, Voss, Voss, physics/
StatPatternRecognition: I. Narsky, physics/
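For concreteness, a minimal NumPy sketch of the first method on the list, the Fisher discriminant, might look as follows (a generic illustration, not the TMVA or StatPatternRecognition implementation; all names and toy data are made up):

```python
import numpy as np

def fisher_coefficients(x_sig, x_bkg):
    """Fisher discriminant: w proportional to W^-1 (mu_sig - mu_bkg),
    where W is the sum of the within-class covariance matrices."""
    mu_s, mu_b = x_sig.mean(axis=0), x_bkg.mean(axis=0)
    w_matrix = np.cov(x_sig, rowvar=False) + np.cov(x_bkg, rowvar=False)
    return np.linalg.solve(w_matrix, mu_s - mu_b)

def fisher_statistic(x, w):
    """Linear test statistic t(x) = w . x; classify by cutting on t."""
    return x @ w

# toy usage with Gaussian samples (illustrative only)
rng = np.random.default_rng(1)
x_sig = rng.normal([1.0, 0.5, 0.0], 1.0, size=(5000, 3))
x_bkg = rng.normal([0.0, 0.0, 0.5], 1.0, size=(5000, 3))
w = fisher_coefficients(x_sig, x_bkg)
t_sig = fisher_statistic(x_sig, w)   # tends to larger values for signal
t_bkg = fisher_statistic(x_bkg, w)
```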
Overtraining
If the decision boundary is too flexible, it will conform too closely to the training points → overtraining. Monitor by applying the classifier to an independent validation sample and comparing its performance on the training sample and on the validation sample.
Choose the classifier that minimizes the error function for the validation sample.
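A minimal sketch of that monitoring procedure, here using scikit-learn decision trees of increasing depth as a stand-in for an increasingly flexible decision boundary (the classifier choice and toy data are illustrative, not what the lectures use):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(2)

def make_sample(n):
    """Toy two-class sample: overlapping Gaussians (illustrative only)."""
    x = np.vstack([rng.normal(0.0, 1.0, (n, 2)), rng.normal(1.0, 1.0, (n, 2))])
    y = np.concatenate([np.zeros(n), np.ones(n)])
    return x, y

x_train, y_train = make_sample(2000)   # training sample
x_valid, y_valid = make_sample(2000)   # independent validation sample

best_depth, best_err = None, np.inf
for depth in range(1, 21):             # increasing flexibility of the boundary
    clf = DecisionTreeClassifier(max_depth=depth).fit(x_train, y_train)
    err_train = np.mean(clf.predict(x_train) != y_train)
    err_valid = np.mean(clf.predict(x_valid) != y_valid)
    # err_train keeps falling with depth; err_valid turns back up once
    # the boundary starts conforming to the training points (overtraining)
    if err_valid < best_err:
        best_depth, best_err = depth, err_valid

print(f"chosen max_depth = {best_depth}, validation error = {best_err:.3f}")
```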
Neural network example from LEP II
Signal: e+e- → W+W- (often 4 well separated hadron jets)
Background: e+e- → qqgg (4 less well separated hadron jets)
Input variables are based on jet structure, event shape, ...; none by itself gives much separation.
Neural network output: (Garrido, Juste and Martinez, ALEPH)
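The ALEPH network itself is not reproduced here; a generic sketch of the idea, several weakly separating inputs combined by a small feed-forward network into a single output variable, could look like this (all variables and settings are illustrative):

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)

# toy stand-ins for jet-structure / event-shape variables:
# each variable on its own separates the two classes only weakly
n = 5000
x_sig = rng.normal(0.2, 1.0, size=(n, 6))
x_bkg = rng.normal(0.0, 1.0, size=(n, 6))
x = np.vstack([x_sig, x_bkg])
y = np.concatenate([np.ones(n), np.zeros(n)])

# single hidden layer; the output is one number per event in [0, 1]
nn = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500).fit(x, y)
nn_output = nn.predict_proba(x)[:, 1]   # cut on this to select signal-like events
```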
Kernel-based PDE (KDE, Parzen window)
Consider d dimensions and N training events x1, ..., xN. Estimate f(x) as a sum of kernel functions centred on the training events,
f(x) ≈ (1/N) Σi (1/h^d) K((x − xi)/h),
where h is the bandwidth (smoothing parameter) and K is the kernel. Use e.g. the Gaussian kernel K(u) = (2π)^(−d/2) exp(−|u|²/2). One needs to sum N terms to evaluate the function (slow); faster algorithms only count events in the vicinity of x (k-nearest neighbor, range search).
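A brute-force NumPy sketch of the Gaussian-kernel estimate above (summing over all N training events; names and toy data are illustrative):

```python
import numpy as np

def kde_estimate(x, x_train, h):
    """Parzen-window estimate of f(x): average of Gaussian kernels of
    bandwidth h centred on the N training events (d-dimensional)."""
    n, d = x_train.shape
    diff = x_train - x                          # (N, d) differences to the point x
    norm = (2.0 * np.pi) ** (d / 2.0) * h ** d  # Gaussian kernel normalization
    weights = np.exp(-0.5 * np.sum(diff ** 2, axis=1) / h ** 2)
    return weights.sum() / (n * norm)           # O(N) per evaluation point

# toy usage: estimate a 2-d standard-normal density at the origin
rng = np.random.default_rng(4)
x_train = rng.normal(0.0, 1.0, size=(10000, 2))
print(kde_estimate(np.array([0.0, 0.0]), x_train, h=0.2))  # ≈ 1/(2π) ≈ 0.16
```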
Find these on the next homework assignment.