Published by Georgiana Loraine Lee. Modified over 9 years ago.
1
9/9/2003 PHYSTAT2003
Application of Adaptive Mixtures and Fractal Dimension Analysis
Adaptive Mixtures
KDELM
Fractal Dimension
Sang-Joon Lee (Rice University)
2
Adaptive Mixtures*
Combines the strengths of Kernel Estimation and Finite Mixtures while avoiding their weaknesses.
Kernel Estimation: robust, but needs intensive CPU power.
Finite Mixtures: faster to compute, but makes strong assumptions about the underlying density and the initial state.
The algorithm determines the number of kernels. For a new data point $x_i$, a kernel is added only when the “Mahalanobis” distance is greater than a pre-defined threshold $T_c$.
* Priebe, Carey (1994), “Adaptive Mixtures”, JASA, 89, 796-806
3
Update/Creation in Adaptive Mixtures
Update Rule: [equation not preserved in the transcript]
Create Rule: [equation not preserved in the transcript]
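Because the slide's update and create equations did not survive, the following is only a minimal 1-D sketch of the general idea, assuming a common recursive form of the update (responsibility-weighted corrections to weight, mean and variance) and a create step that places a new kernel at the data point with weight 1/n. The function names, the default threshold and the initial width `sigma0` are illustrative assumptions, not values from the talk.

```python
import math


def normal_pdf(x, mean, sigma):
    """Normal probability density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def adaptive_mixture(data, threshold=2.0, sigma0=1.0):
    """One pass over data; returns a list of [weight, mean, variance] kernels.

    threshold plays the role of T_c and sigma0 is the width given to a newly
    created kernel (both are assumed values for illustration).
    """
    kernels = [[1.0, data[0], sigma0 ** 2]]  # the first point seeds the first kernel
    for n, x in enumerate(data[1:], start=2):
        # 1-D Mahalanobis distance from x to every existing kernel
        dists = [abs(x - m) / math.sqrt(v) for _, m, v in kernels]
        if min(dists) > threshold:
            # Create: shrink old weights, add a kernel centred on x
            for k in kernels:
                k[0] *= 1.0 - 1.0 / n
            kernels.append([1.0 / n, x, sigma0 ** 2])
        else:
            # Update: recursive corrections weighted by each kernel's responsibility
            dens = [w * normal_pdf(x, m, math.sqrt(v)) for w, m, v in kernels]
            total = sum(dens)
            for k, d in zip(kernels, dens):
                tau = d / total
                w_old, m_old, v_old = k
                k[0] = w_old + (tau - w_old) / n
                gain = tau / (n * k[0])
                k[1] = m_old + gain * (x - m_old)
                k[2] = v_old + gain * ((x - m_old) ** 2 - v_old)
    return kernels
```

With two well-separated clusters, for example `adaptive_mixture([0.0, 0.1, -0.1, 10.0, 10.1, 9.9])`, the sketch ends with one kernel per cluster and weights that sum to one.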
4
Performance of Adaptive Mixtures
5
Conclusion in Adaptive Mixtures
For the 1-D Gaussian example, Adaptive Mixtures “over-fit”.
Consistency is poor in the 1-D exponential example.
A better iteration algorithm is needed to prevent “over-fit”.
6
KDELM (Kernel Density Estimation with Likelihood Maximization)
Add a kernel only when it results in a better fit.
The goodness of fit is estimated by comparing the minus log likelihood (MINUIT is used for the minimization): [equation and symbol definitions not preserved in the transcript]. The kernel function is a normal probability density function with a given mean and standard deviation.
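The acceptance test described here can be illustrated with a small sketch. It assumes the density is a weighted sum of normal kernels and that "better fit" means a strictly lower minus log likelihood; the function names are hypothetical, and the MINUIT minimization of the kernel parameters used in the talk is not reproduced (the parameters are simply passed in).

```python
import math


def normal_pdf(x, mean, sigma):
    """Normal probability density with the given mean and standard deviation."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def minus_log_likelihood(data, kernels):
    """-ln L for a mixture given as (weight, mean, sigma) tuples.

    The weights are assumed to sum to one.
    """
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, s) for w, m, s in kernels)
        total -= math.log(max(density, 1e-300))  # floor guards against log(0)
    return total


def accept_new_kernel(data, current, candidate):
    """True if the candidate mixture (one more kernel) lowers -ln L."""
    return minus_log_likelihood(data, candidate) < minus_log_likelihood(data, current)
```

For bimodal data such as `[-0.1, 0.0, 0.1, 4.9, 5.0, 5.1]`, a two-kernel candidate centred on the two clusters is accepted over a single broad kernel. A real implementation would first refit all parameters before making the comparison.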
7
Performance of KDELM
8
Performance of KDELM (2)
Discrimination of Tau signal events from generic QCD backgrounds
Discriminant Function: [equation not preserved in the transcript]
At 50% efficiency, S/B = 26.32
9
Conclusions in KDELM
KDELM is robust.
KDELM is fast in computation.
KDELM gives good background rejection in Tau lepton identification.
A new algorithm may be needed for a better fit to an extreme distribution such as the 1-D exponential.
10
Fractal Dimension
Fractal dimension, also called capacity dimension, is defined by MathWorld through the power law $n(\epsilon) \propto \epsilon^{-D}$, where $n(\epsilon)$ is the minimum number of open sets of diameter $\epsilon$ needed to cover the set.
Fractal dimension quantifies the increase in structural definition that magnification yields.
11
Mandelbrot's Example
Consider measuring the length of a coastline (an example given by Mandelbrot). Using a meterstick, you might get a good estimate of the length, yet using a centimeter stick (and more time, of course) you can get an even better measurement.
Fractal dimension quantifies this increase in detail that occurs by magnifying or, in this case, by switching rulers.
12
Calculation of Fractal Dimension
There are several techniques, yet all involve estimating the dimension from the slope of a log-log plot of the power law relationship given on the earlier slide:
Box Counting Technique
Radial Covering Method
Fourier Estimator
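The step from the power law to a slope can be written out as follows. This is a short worked derivation, assuming the power law takes the MathWorld form $n(\epsilon) \propto \epsilon^{-D}$; the constant $c$ and the two scales $\epsilon_1$, $\epsilon_2$ are introduced only for illustration.

```latex
\begin{align}
n(\epsilon) &\propto \epsilon^{-D} \\
\log n(\epsilon) &= D \, \log(1/\epsilon) + c \\
D &\approx \frac{\log n(\epsilon_2) - \log n(\epsilon_1)}
               {\log(1/\epsilon_2) - \log(1/\epsilon_1)}
\end{align}
```

In practice the slope is taken from a least-squares fit over many scales and not from two points, since the counts at any single scale are noisy for finite data.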
13
Box Counting Technique*
Grids (boxes) of varying lengths are placed over the data set.
A count of how many boxes contain data points is made for the power law plot.
The dimension is derived from a least-squares fit of the slope.
* The specific implementation used was coded by John Sarraille and Peter DiFalco of CSU, who based their algorithm on "A Fast Algorithm To Determine Fractal Dimensions By Box Counting", by Liebovitch and Toth.
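The three steps above can be sketched naively as follows. This is not the Sarraille and DiFalco implementation or the Liebovitch and Toth fast algorithm, only a straightforward illustration with hypothetical function names, assuming the data have been scaled to a common range and that the box sizes are chosen by the user.

```python
import math


def box_count(points, box_size):
    """Number of grid boxes of side box_size that contain at least one point."""
    occupied = {tuple(math.floor(c / box_size) for c in p) for p in points}
    return len(occupied)


def box_counting_dimension(points, box_sizes):
    """Least-squares slope of log N(eps) versus log(1/eps)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

As a sanity check, points along a diagonal line give a dimension close to 1 and points filling a square grid give a dimension close to 2. With small samples, such as the event counts used in this study, the estimate depends strongly on the range of box sizes chosen.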
14
Box Counting Technique (2)
Goal: to find the combination of variables that would help create a clearer distinction between signal and background.
ttbar MC data with 13 different variables was used. There were 96 background events and 158 signal events.
Fractal dimension was calculated for pairs of variables, for varying combinations of signal and background events, to see whether the fractal dimension value really helps indicate signal or background.
15
Results
Fractal dimension was calculated for the full signal sample, the full background sample, and a mixed sample composed of 96 events from each.
Of the 78 distinct combinations, 37 appear to show some significance in indicating signal or background.
For these 37 combinations, the fractal dimension of the mixed sample lies between the values for the pure samples.
16
Results (2)
Fractal dimension was also calculated for mixtures of 96 events with varying proportions of signal and background (0%-100%, 25%-75%, 50%-50%, 75%-25%, 100%-0%).
15 of the combinations continue to interpolate across the mixes.
Many others appear to reach a maximum or minimum fractal dimension value at the 50-50 mixture. Even so, the mixtures with more signal (75-25, 100-0) have similar fractal dimension values to each other, as do those with more background (25-75, 0-100).
17
Difference of Fractal Dimension
Signal-background fractal dimension differences in 2-D
13 variables => possible number of variable pairs = $(13^2 - 13)/2 = 78$
A difference of 0.4 is significant compared to a typical 2-D fractal dimension of 1.0.
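The pair count can be checked by enumerating the unordered pairs directly. The variable indices below are placeholders, since the names of the 13 variables are not given in the transcript.

```python
from itertools import combinations

n_variables = 13  # number of variables in the ttbar MC sample

# Every unordered pair of distinct variables, identified by index
pairs = list(combinations(range(n_variables), 2))

# Same count as the closed form (n^2 - n) / 2
assert len(pairs) == (n_variables ** 2 - n_variables) // 2
print(len(pairs))  # 78
```

Each pair would then be passed to the 2-D fractal dimension calculation once for the signal sample and once for the background sample, and the two values compared.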
18
Conclusions in Fractal Dimension
Fractal dimension appears to be useful as a discriminating feature for some combinations of variables.
The next step would be to take the pairs of variables that show promise and use them as features for one of the classifier techniques (kernel density estimation, decision tree, etc.).
19
Acknowledgements
I would like to thank:
Professor Paul Padley (Rice University)
Professor David Scott (Rice University)
Professor Bruce Knuteson (MIT)
Bradley Chase (Rice University)