Overview of Non-Parametric Probability Density Estimation Methods
Sherry Towers
State University of New York at Stony Brook
S.Towers
All kernel PDF estimation methods (PDEs) are developed from a simple idea: if a data point lies in a region where the signal MC is tightly clustered and the background MC is loosely clustered, the point is likely to be signal.
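The clustering idea can be illustrated with a toy sketch. This is not the method from these slides: it simply counts the fraction of each MC sample that falls within a fixed radius of a point and compares the two. The function name `local_signal_fraction`, the radius, the equal-sized samples, and the Gaussian toy samples are all assumptions made for illustration.

```python
import numpy as np

def local_signal_fraction(point, signal_mc, bkgnd_mc, radius=0.5):
    """Toy measure of how signal-like a point is (hypothetical helper).

    signal_mc and bkgnd_mc are arrays of shape (n_points, n_dims).
    """
    point = np.asarray(point, dtype=float)
    # Fraction of each MC sample lying within `radius` of the point.
    f_sig = np.mean(np.linalg.norm(signal_mc - point, axis=1) < radius)
    f_bkg = np.mean(np.linalg.norm(bkgnd_mc - point, axis=1) < radius)
    total = f_sig + f_bkg
    # With no MC nearby, the toy measure carries no information.
    return f_sig / total if total > 0 else 0.5

rng = np.random.default_rng(1)
signal_mc = rng.normal(loc=0.0, scale=0.5, size=(5000, 2))  # tightly clustered
bkgnd_mc = rng.normal(loc=0.0, scale=3.0, size=(5000, 2))   # loosely spread

near = local_signal_fraction([0.0, 0.0], signal_mc, bkgnd_mc)  # where signal clusters
far = local_signal_fraction([4.0, 4.0], signal_mc, bkgnd_mc)   # background only
```

A point near the signal cluster gets a value close to 1, and a point far from it gets a value close to 0. Kernel methods replace the hard-edged counting region with a smooth kernel.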
To estimate a PDF, PDEs use the idea that any continuous function can be modelled by a sum of some “kernel” function.
Gaussian kernels are a good choice for particle physics.
So, a PDF can be estimated by a sum of multi-dimensional Gaussians centred about the MC generated points.
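A minimal sketch of such a sum of Gaussians, assuming every kernel shares one covariance matrix taken from the full MC sample and scaled by a smoothing factor `h`. The function name `static_kernel_pde` and the default for `h` (Scott's rule of thumb) are assumptions, not choices stated in the slides.

```python
import numpy as np

def static_kernel_pde(x, mc, h=None):
    """Estimate the PDF at points x as a mean of Gaussians centred on MC points.

    x  : array of shape (m, d), points where the PDF is evaluated
    mc : array of shape (n, d), MC generated points
    h  : smoothing factor applied to the full-sample covariance (assumed)
    """
    x = np.asarray(x, dtype=float)
    mc = np.asarray(mc, dtype=float)
    n, d = mc.shape
    if h is None:
        # Assumed default: Scott's rule-of-thumb factor.
        h = n ** (-1.0 / (d + 4))
    # One covariance matrix for all kernels, from the entire sample.
    cov = np.atleast_2d(np.cov(mc, rowvar=False)) * h ** 2
    inv_cov = np.linalg.inv(cov)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    diff = x[:, None, :] - mc[None, :, :]  # shape (m, n, d)
    # Squared Mahalanobis distance from each x to each MC point.
    maha = np.einsum("mnd,de,mne->mn", diff, inv_cov, diff)
    # Each Gaussian integrates to 1, so the mean over MC points does too.
    return norm * np.exp(-0.5 * maha).mean(axis=1)

rng = np.random.default_rng(0)
mc = rng.normal(size=(2000, 2))  # toy MC sample: 2-D unit Gaussian
density_at_origin = static_kernel_pde(np.zeros((1, 2)), mc)[0]
```

For this toy sample the true density at the origin is 1/(2π) ≈ 0.159. The estimate comes out somewhat lower because the kernels smooth the peak.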
The best form of Gaussian kernel is a matter of debate:
The static-kernel PDE method uses a kernel with a covariance matrix obtained from the entire sample.
The Gaussian Expansion Method (GEM) uses an adaptive kernel: the covariance matrix used for the Gaussian at each MC point comes from the “local” covariance matrix.
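A rough sketch of an adaptive kernel follows. The slides do not say how the “local” covariance is defined, so this sketch assumes it is the covariance of the `k` nearest MC neighbours of each point. That definition, the value of `k`, and the function names are all assumptions and should not be read as the GEM prescription.

```python
import numpy as np

def local_covariances(mc, k=40):
    """Covariance of the k nearest neighbours of each MC point (assumed definition)."""
    mc = np.asarray(mc, dtype=float)
    n, d = mc.shape
    # Pairwise squared Euclidean distances, shape (n, n).
    d2 = ((mc[:, None, :] - mc[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k nearest points; each point is its own nearest neighbour.
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    covs = np.empty((n, d, d))
    for i in range(n):
        covs[i] = np.atleast_2d(np.cov(mc[idx[i]], rowvar=False))
    return covs

def adaptive_kernel_pde(x, mc, covs):
    """Mean of Gaussians centred on MC points, each with its own covariance."""
    x = np.asarray(x, dtype=float)
    mc = np.asarray(mc, dtype=float)
    d = mc.shape[1]
    inv = np.linalg.inv(covs)  # batched inverse, shape (n, d, d)
    norm = 1.0 / np.sqrt((2.0 * np.pi) ** d * np.linalg.det(covs))  # shape (n,)
    diff = x[:, None, :] - mc[None, :, :]  # shape (m, n, d)
    maha = np.einsum("mnd,nde,mne->mn", diff, inv, diff)
    return (norm * np.exp(-0.5 * maha)).mean(axis=1)

rng = np.random.default_rng(2)
mc = rng.normal(size=(500, 2))  # toy MC sample: 2-D unit Gaussian
covs = local_covariances(mc, k=40)
density_at_origin = adaptive_kernel_pde(np.zeros((1, 2)), mc, covs)[0]
```

The loop over MC points in `local_covariances` is the extra cost relative to a static kernel, which computes a single covariance matrix once.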
GEM vs Static-Kernel PDE
GEM gives an unbiased estimate of the PDF, but is slower to use because the local covariance must be calculated for each MC point.
Static-kernel PDE methods have smaller variance and are faster to use, but yield biased estimates of the PDF.
Comparison of GEM and static-kernel PDE:
PDE vs Neural Networks
Both PDEs and neural networks can take into account non-linear correlations in parameter space.
Both methods are, in principle, equally powerful.
For the most part, they perform similarly in an “average” analysis.
PDE vs Neural Networks
But PDEs have far fewer parameters, and the algorithm is more intuitive in nature (easier to understand).
Plus, the PDE estimate of the PDF can be visually examined:
PDEs vs Neural Nets…
There are some problems that are particularly well suited to PDEs:
Summary
PDE methods are as powerful as neural networks, and offer an interesting alternative.
They have very few parameters, are easy to use and easy to understand, and yield an unbinned estimate of the PDF that the user can examine in the multidimensional parameter space!