1
Selected Topics in Particle Physics. Avner Soffer, Spring 2007. Lecture 4
2
Simplest variable combination: diagonal cut
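A diagonal cut accepts an event when a weighted sum of two variables passes a threshold, i.e. a straight-line boundary in the (v_1, v_2) plane. A minimal sketch of the idea (the toy samples, weights, and threshold are invented for illustration, not taken from the lecture):

```python
import numpy as np

# Toy samples: two weakly separating, Gaussian-distributed variables.
rng = np.random.default_rng(0)
signal     = rng.normal(loc=[1.0, 1.0], scale=0.8, size=(1000, 2))
background = rng.normal(loc=[0.0, 0.0], scale=0.8, size=(1000, 2))

def diagonal_cut(events, w=(1.0, 1.0), c=1.0):
    """Accept an event if w1*v1 + w2*v2 > c, a diagonal line in the (v1, v2) plane."""
    return events @ np.asarray(w) > c

sig_eff = diagonal_cut(signal).mean()      # fraction of signal kept
bkg_eff = diagonal_cut(background).mean()  # fraction of background kept
print(f"signal efficiency {sig_eff:.2f}, background efficiency {bkg_eff:.2f}")
```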
3
Combining variables
Many variables that weakly separate signal from background
Often correlated distributions
Complicated to deal with or to use in a fit
Easiest to combine into one simple variable
Fisher discriminant (constructed in the following slides)
4
Neural networks
[Figure: neural-network output distributions for signal MC vs. background MC (BB MC, continuum MC, and combined BB & qq background MC)]
5
Input variables for neural net
[Figure: signal, BB background, and cc+uds distributions of the input variables: Legendre Fisher, log(Δz), cos θ_T, log(K-D DOCA), and lepton tagging (BtgElectronTag & BtgMuonTag)]
6
Uncorrelated, (approximately) Gaussian-distributed variables
"Gaussian-distributed" means the distribution of v is P(v) = exp[-(v - ⟨v⟩)² / 2σ²] / √(2πσ²)
How to combine the information?
Option 1: V = v_1 + v_2
Option 2: V = v_1 - v_2
Option 3: V = α_1 v_1 + α_2 v_2
What are the best weights α_i?
How about α_i = (⟨v_i^s⟩ - ⟨v_i^b⟩) = difference between the signal & background means
[Figure: signal and background distributions of v_1 and v_2]
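In code, Option 3 with these mean-difference weights is a two-liner. This is a sketch with hypothetical helper names (mean_difference_weights, combine), assuming NumPy arrays with one event per row:

```python
import numpy as np

def mean_difference_weights(sig, bkg):
    """alpha_i = <v_i^s> - <v_i^b>: weight each variable by its S-B mean difference."""
    return sig.mean(axis=0) - bkg.mean(axis=0)

def combine(events, alpha):
    """V = sum_i alpha_i * v_i, evaluated for every event (rows of `events`)."""
    return events @ alpha
```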
7
Incorporating spreads
In v_1, ⟨v_1^s⟩ - ⟨v_1^b⟩ > ⟨v_2^s⟩ - ⟨v_2^b⟩, but v_2 has a smaller spread and more actual separation between S and B
α_i = (⟨v_i^s⟩ - ⟨v_i^b⟩) / ((σ_i^s)² + (σ_i^b)²)
where (σ_i^s)² = ⟨(v_i^s - ⟨v_i^s⟩)²⟩ = Σ_e (v_ie^s - ⟨v_i^s⟩)² / N is the RMS spread in the v_i distribution of a pure signal sample (σ_i^b is defined similarly)
You may be familiar with the form ⟨(v - ⟨v⟩)²⟩ = ⟨v²⟩ - ⟨v⟩² = σ²
[Figure: signal and background distributions of v_1 and v_2]
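The spread-normalized weights translate directly into code. A sketch under the same assumptions as above (hypothetical helper name, one event per row):

```python
import numpy as np

def spread_normalized_weights(sig, bkg):
    """alpha_i = (<v_i^s> - <v_i^b>) / ((sigma_i^s)^2 + (sigma_i^b)^2).

    A variable with a small spread relative to its mean difference (like v2
    on the slide) now receives a larger weight.
    """
    dmu = sig.mean(axis=0) - bkg.mean(axis=0)
    return dmu / (sig.var(axis=0) + bkg.var(axis=0))
```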
8
Linearly correlated, Gaussian-distributed variables
Linear correlation: ⟨v_1⟩ = a_0 + c v_2, while (σ_1)² is independent of v_2
α_i = (⟨v_i^s⟩ - ⟨v_i^b⟩) / ((σ_i^s)² + (σ_i^b)²) doesn't account for the correlation
Recall (σ_i^s)² = ⟨(v_i^s - ⟨v_i^s⟩)²⟩
Replace it with the covariance matrix C_ij^s = ⟨(v_i^s - ⟨v_i^s⟩)(v_j^s - ⟨v_j^s⟩)⟩
α_i = Σ_j (⟨v_j^s⟩ - ⟨v_j^b⟩) [(C^s + C^b)⁻¹]_ij, using the inverse of the sum of the S+B covariance matrices
Fisher discriminant: F = Σ_i α_i v_i
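A minimal sketch of the Fisher coefficients in NumPy; the toy means and covariance are invented, and solving the linear system stands in for the explicit matrix inverse on the slide:

```python
import numpy as np

rng = np.random.default_rng(1)
cov = [[1.0, 0.6], [0.6, 1.0]]   # linearly correlated toy variables
sig = rng.multivariate_normal(mean=[1.0, 0.5], cov=cov, size=5000)
bkg = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=5000)

def fisher_coefficients(sig, bkg):
    """alpha = (C^s + C^b)^(-1) (<v^s> - <v^b>), the Fisher weights from the slide."""
    dmu = sig.mean(axis=0) - bkg.mean(axis=0)
    cov_sum = np.cov(sig, rowvar=False) + np.cov(bkg, rowvar=False)
    return np.linalg.solve(cov_sum, dmu)  # solve instead of an explicit inverse

alpha = fisher_coefficients(sig, bkg)
F_sig, F_bkg = sig @ alpha, bkg @ alpha   # F = sum_i alpha_i v_i, per event
```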
9
Fisher discriminant properties
Best S-B separation for a linearly correlated set of Gaussian-distributed variables
Non-Gaussianness of v is usually not a problem
There must be a mean difference: |⟨v^s⟩ - ⟨v^b⟩| ≠ 0 (take the absolute value)
Need to calculate the α_i coefficients using (correctly simulated) Monte Carlo (MC) signal and background samples
Should validate using control samples (true for any discriminant)
10
More properties
F is more Gaussian than its inputs (virtual calorimeter example)
Central limit theorem:
–If x_j (j = 1…n) are independent random variables with means ⟨x_j⟩ and variances σ_j², then for large n the sum Σ_j x_j is a Gaussian-distributed variable with mean Σ_j ⟨x_j⟩ and variance Σ_j σ_j²
F can usually be fit with 2 Gaussians or a bifurcated Gaussian
A cut on F corresponds to an (n-1)-dimensional plane cut through the n-dimensional variable space
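A quick numerical illustration of the central-limit behavior (my own toy, not from the lecture): summing even strongly skewed inputs pushes the skewness of the sum toward the Gaussian value of zero.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=(100_000, 12))  # 12 skewed inputs per "event"
s = x.sum(axis=1)                                   # analogue of F as a sum of inputs
skew = ((s - s.mean())**3).mean() / s.std()**3      # 0 for a true Gaussian
print(f"skewness of the sum: {skew:.2f} (a single exponential has skewness 2.0)")
```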
11
Nonlinear correlations
Linear methods (Fisher) are not optimal for such cases
May fail altogether if there is no S-B mean difference
12
Artificial neural networks
"Complex nonlinearity"
Each neuron
–takes many inputs
–outputs a response function value
The output of each neuron serves as input for the others
Neurons divided among layers for efficiency
The weight w_ij^l between neuron i in layer l and neuron j in layer l+1 is calculated using a MC "training sample"
13
Response functions
Neuron output = ρ(inputs, weights) = α(κ(inputs, weights))
The response function ρ is the composition of a synapse function κ, which combines the inputs, and an activation function α
14
Common usage
κ = weighted sum in the hidden & output layers
α = linear in the output layer
α = tanh in the hidden layer
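Putting the pieces together, a forward pass with these choices (κ = weighted sum everywhere, tanh activation in the hidden layer, linear output) is a few lines of NumPy. The layer sizes and random weights below are placeholders of my own:

```python
import numpy as np

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer: weighted sum -> tanh (hidden), weighted sum -> linear (output)."""
    hidden = np.tanh(x @ W1 + b1)
    return hidden @ W2 + b2

rng = np.random.default_rng(3)
n_in, n_hidden = 4, 8                                # arbitrary sizes for this sketch
W1 = 0.1 * rng.normal(size=(n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = 0.1 * rng.normal(size=(n_hidden, 1));    b2 = np.zeros(1)
y = mlp_forward(rng.normal(size=(10, n_in)), W1, b1, W2, b2)  # 10 toy events
```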
15
Training (calculating weights)
Event a (a = 1…N) has input variable vector x_a = (x_1 … x_nvar)
For each event, calculate the deviation of the network output y_a from the desired value ŷ_a (0 for background, 1 for signal)
Calculate the error function E = Σ_a ½ (y_a - ŷ_a)² for random (initial) values w of the weights
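In code the error function is a one-liner, assuming the quadratic form above with ŷ_a = 0 for background and 1 for signal:

```python
import numpy as np

def error_function(y_out, y_target):
    """E = sum_a 1/2 (y_a - y_hat_a)^2 over the training events."""
    return 0.5 * np.sum((np.asarray(y_out) - np.asarray(y_target))**2)
```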
16
… Training
Change the weights so as to cause the steepest decline in E: w → w - η ∂E/∂w
"Online learning": remove the sum over events and update the weights after each event
–Requires a randomized training sample
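A sketch of online steepest descent for the one-hidden-layer network from the earlier sketch, assuming the quadratic error function; the helper name and learning rate eta are my own, and the gradients follow from the chain rule (backpropagation):

```python
import numpy as np

def train_online(events, targets, W1, b1, W2, b2, eta=0.01, epochs=10, rng=None):
    """Update the weights after every event ("online learning") by steepest descent."""
    if rng is None:
        rng = np.random.default_rng(0)
    for _ in range(epochs):
        for a in rng.permutation(len(events)):    # randomized training sample
            x, t = events[a], targets[a]          # t = 0 (background) or 1 (signal)
            h = np.tanh(x @ W1 + b1)              # hidden-layer response
            y = h @ W2 + b2                       # linear output
            d_out = y - t                         # dE/dy for E = 1/2 (y - t)^2
            d_hid = (W2 @ d_out) * (1.0 - h**2)   # back-propagate; tanh' = 1 - tanh^2
            W2 -= eta * np.outer(h, d_out);  b2 -= eta * d_out
            W1 -= eta * np.outer(x, d_hid);  b1 -= eta * d_hid
    return W1, b1, W2, b2
```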
17
What architecture to use?
Weierstrass theorem: for a multilayer perceptron, 1 hidden layer is sufficient to approximate a continuous correlation function to any precision, if the number of neurons in the layer is high enough
Alternatively: several hidden layers with fewer neurons may converge faster and be more stable
Instability problems:
–output distribution changes with different samples
18
What variables to use?
Improvement with added variables: [formula shown on slide]
Importance of variable i: [formula shown on slide]
19
More info
A cut on a NN output = a non-linear slice through the n-dimensional variable space
The NN output shape can be (approximately) Gaussianized: q → q' = tanh[(q - ½(q_max + q_min)) / (½(q_max - q_min))]
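A direct transcription of that transform; estimating q_max and q_min from the sample itself is my assumption:

```python
import numpy as np

def gaussianize(q):
    """q -> q' = tanh[(q - (qmax+qmin)/2) / ((qmax-qmin)/2)], as on the slide."""
    q = np.asarray(q, dtype=float)
    mid  = 0.5 * (q.max() + q.min())
    half = 0.5 * (q.max() - q.min())
    return np.tanh((q - mid) / half)
```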