Proposal for a general-purpose unfolding framework in ROOT
Tim Adye, Rutherford Appleton Laboratory
BaBar Statistics Working Group
BaBar Collaboration Meeting, 8th December 2004
Outline
  What is unfolding, and why might you want to do it?
  Overview of a few techniques
    Regularised unfolding
    Iterative method
  Idea for a ROOT package (but not much code yet!)
  References
Unfolding
  In other fields known as “deconvolution” or “unsmearing”.
  Given a “true” PDF, μ, that is corrupted by detector effects, described by a response function, R, we measure a distribution ν.
  In terms of histograms, the measured bin contents are ν_i = Σ_j R_{ij} μ_j.
  This may involve
    1. inefficiencies: lost events
    2. smearing: events moving between bins (off-diagonal R_{ij})
  With infinite statistics, it would be possible to recover the original PDF by inverting the response matrix: μ = R^{-1} ν.
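The forward ("folding") step, ν_i = Σ_j R_{ij} μ_j, can be sketched in plain C++ without ROOT. The names `fold` and `Matrix` are illustrative assumptions, not part of any existing package. R[i][j] is assumed to be the probability that an event from true bin j is measured in bin i, so a column that sums to less than 1 represents inefficiency.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Fold a true histogram mu through a response matrix R:
//   nu[i] = sum_j R[i][j] * mu[j]
// R[i][j] is assumed to be the probability that an event generated in
// true bin j is reconstructed in measured bin i.
std::vector<double> fold(const Matrix& R, const std::vector<double>& mu) {
    std::vector<double> nu(R.size(), 0.0);
    for (std::size_t i = 0; i < R.size(); ++i) {
        assert(R[i].size() == mu.size());  // each row must span all true bins
        for (std::size_t j = 0; j < mu.size(); ++j) {
            nu[i] += R[i][j] * mu[j];
        }
    }
    return nu;
}
```

The off-diagonal elements move events between bins, and any shortfall in a column sum removes events altogether, which are the two effects listed on this slide.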
Not so simple…
  Unfortunately, if there are statistical fluctuations between bins, this information is destroyed.
    Since R washes out statistical fluctuations, R^{-1} cannot distinguish between wildly fluctuating and smooth PDFs.
    Obtain large negative correlations between adjacent bins.
    Large fluctuations in reconstructed bin contents.
  Need some procedure to remove wildly fluctuating solutions:
    1. Give added weight to “smoother” solutions.
    2. Solve for μ iteratively, starting with a reasonable guess, and truncate the iteration before it gets out of hand.
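A two-bin worked example makes the amplification concrete. The symmetric smearing fraction ε is an illustrative assumption and is not taken from the talk.

```latex
R = \begin{pmatrix} 1-\epsilon & \epsilon \\ \epsilon & 1-\epsilon \end{pmatrix},
\qquad
R^{-1} = \frac{1}{1-2\epsilon}
         \begin{pmatrix} 1-\epsilon & -\epsilon \\ -\epsilon & 1-\epsilon \end{pmatrix}

% A statistical fluctuation \delta in the first measured bin propagates to
\delta\mu = R^{-1} \begin{pmatrix} \delta \\ 0 \end{pmatrix}
          = \frac{\delta}{1-2\epsilon}
            \begin{pmatrix} 1-\epsilon \\ -\epsilon \end{pmatrix}

% Example: \epsilon = 0.4 gives \delta\mu = (3\delta,\, -2\delta)
```

For ε = 0.4 a fluctuation of δ in one measured bin becomes +3δ in the corresponding unfolded bin and −2δ in its neighbour, which illustrates both the enlarged fluctuations and the negative correlation between adjacent bins.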
What happens if you don’t smooth [figure]
So why don’t we always do this?
  If the true PDF and resolution function can be parameterised, then an ML fit is usually more convenient.
    Directly returns parameters of interest.
    Does not require binning.
  If the response function doesn’t include smearing (i.e. it’s diagonal), then apply a bin-by-bin efficiency correction directly.
  If the result is just needed for comparison (e.g. with MC), could apply the response function to MC.
    Simpler than un-applying the response to data.
When to use unfolding
  Use unfolding to recover the theoretical distribution where
    there is no a-priori parameterisation,
    this is needed for the result and not just for comparison with MC, and
    there is significant bin-to-bin migration of events.
Where could we use unfolding?
  Traditionally used to extract structure functions.
  Widely used outside particle physics for image reconstruction.
  Dalitz plots
    Cross-feed between bins due to misreconstruction.
  “True” decay momentum distributions
    Theory is at parton level, but we measure hadrons.
    Correct for hadronisation as well as detector effects.
  Maybe could use smoothing for standard ML fits?
1. Regularised Unfolding
  Use ML to fit smeared bin contents to measured data, but include a regularisation function, S, with a regularisation parameter, α, that controls the degree of smoothness.
    Various criteria are used to select α, e.g. minimise the mean squared error.
  Various choices of regularisation function, S, are used:
    Tikhonov regularisation: minimise curvature, for some definition of curvature.
      RUN by Volker Blobel
      GURU by Andreas Höcker and Vakhtang Kartvelishvili, using Singular Value Decomposition
    Maximum entropy
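The regularised fit is commonly written as the maximisation of a combination of the log-likelihood and the regularisation function. The forms below follow the notation of G. Cowan's Statistical Data Analysis and are an assumption about what is intended here, since the formulae themselves are not preserved in this text. M denotes the number of true bins and μ_tot the sum of the μ_i.

```latex
% Function to maximise with respect to the true bin contents \mu
\Phi(\mu) = \alpha \ln L(\mu) + S(\mu)

% Tikhonov regularisation: S is minus the summed squared second difference
S(\mu) = -\sum_{i=1}^{M-2} \left( -\mu_i + 2\mu_{i+1} - \mu_{i+2} \right)^2

% Maximum entropy: S is the entropy of the normalised bin contents
S(\mu) = H(\mu) = -\sum_{i=1}^{M} \frac{\mu_i}{\mu_{\mathrm{tot}}}
                  \ln \frac{\mu_i}{\mu_{\mathrm{tot}}}
```

In this convention a smaller α gives more weight to S and hence a smoother solution, while a very large α recovers the unregularised ML fit.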
2. Iterative method
  Uses Bayes’ theorem to invert the response and, using an initial set of probabilities, p_i (e.g. flat), obtain an improved estimate of the true bin contents.
  Repeating with new p_i from these new bin contents converges quite rapidly.
  Truncating the iteration prevents us seeing the bad effects of statistical fluctuations.
  Fergus Wilson has implemented this method in ROOT/C++.
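A minimal sketch of an iterative Bayesian update of this kind is given below, in plain C++ without ROOT. It is not Fergus Wilson's implementation; the function name `unfoldIterative`, the `Matrix` alias, and the convention that R[i][j] is the probability for true bin j to be measured in bin i are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Iterative Bayesian unfolding sketch.
// R[i][j]: probability that an event from true bin j is measured in bin i.
// measured[i]: observed bin contents. The prior starts flat.
std::vector<double> unfoldIterative(const Matrix& R,
                                    const std::vector<double>& measured,
                                    int nIterations) {
    const std::size_t nMeas = R.size();
    assert(nMeas > 0 && measured.size() == nMeas);
    const std::size_t nTrue = R[0].size();
    assert(nTrue > 0);

    // Efficiency of each true bin: chance of being measured in any bin.
    std::vector<double> eff(nTrue, 0.0);
    for (std::size_t i = 0; i < nMeas; ++i) {
        assert(R[i].size() == nTrue);
        for (std::size_t j = 0; j < nTrue; ++j) eff[j] += R[i][j];
    }

    std::vector<double> prior(nTrue, 1.0 / nTrue);  // flat initial p_j
    std::vector<double> estimate(nTrue, 0.0);

    for (int iter = 0; iter < nIterations; ++iter) {
        estimate.assign(nTrue, 0.0);
        for (std::size_t i = 0; i < nMeas; ++i) {
            double norm = 0.0;
            for (std::size_t k = 0; k < nTrue; ++k) norm += R[i][k] * prior[k];
            if (norm <= 0.0) continue;  // no true bin feeds this measured bin
            for (std::size_t j = 0; j < nTrue; ++j) {
                // Bayes' theorem: P(true j | measured i)
                const double posterior = R[i][j] * prior[j] / norm;
                estimate[j] += measured[i] * posterior;
            }
        }
        double total = 0.0;
        for (std::size_t j = 0; j < nTrue; ++j) {
            if (eff[j] > 0.0) estimate[j] /= eff[j];  // efficiency correction
            total += estimate[j];
        }
        if (total <= 0.0) break;  // nothing measured, nothing to update
        // The new estimate, normalised, becomes the prior for the next pass.
        for (std::size_t j = 0; j < nTrue; ++j) prior[j] = estimate[j] / total;
    }
    return estimate;
}
```

The `nIterations` argument plays the role of the regularisation: a small number of iterations keeps the result close to the smooth prior, while a large number approaches the unregularised solution together with its fluctuations.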
A ROOT framework
  It would be nice if these different methods could be made available as ROOT/C++ classes, à la RooFit.
  Could have a common interface to specify
    unfolding method and parameters
    response matrix (or it could calculate it from MC samples)
    measured histogram
  and return the reconstructed truth histogram.
  Handle 1D and 2D at least.
A ROOT framework (2)
  Should also handle “simple” cases
    correction factors
    direct inversion of the response matrix
  making it easy to see whether full unfolding is required.
  Could also be useful outside BaBar!
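One way such a common interface might look is sketched below in plain C++, with `std::vector` standing in for ROOT histograms and matrices. Every name here (`Unfolder`, `BinByBinUnfolder`, `Histogram`, `Matrix`) is hypothetical and only illustrates the idea; none is taken from an existing package. The concrete class covers the "correction factors" case under the simplifying assumption that the factor for bin i is 1/R_ii.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Histogram = std::vector<double>;
using Matrix = std::vector<std::vector<double>>;

// Hypothetical common interface: every method takes a response matrix and
// a measured histogram and returns the reconstructed truth histogram.
class Unfolder {
public:
    virtual ~Unfolder() = default;
    virtual Histogram unfold(const Matrix& response,
                             const Histogram& measured) const = 0;
};

// "Simple" case: bin-by-bin correction factors, appropriate only when
// smearing is negligible. Only the diagonal of the response is used,
// interpreted as the efficiency of each bin (an assumption of this sketch).
class BinByBinUnfolder : public Unfolder {
public:
    Histogram unfold(const Matrix& response,
                     const Histogram& measured) const override {
        assert(response.size() == measured.size());
        Histogram truth(measured.size(), 0.0);
        for (std::size_t i = 0; i < measured.size(); ++i) {
            assert(response[i].size() == measured.size());
            const double efficiency = response[i][i];
            // Bins with zero efficiency are left at zero.
            if (efficiency > 0.0) truth[i] = measured[i] / efficiency;
        }
        return truth;
    }
};
```

Regularised and iterative methods would be further subclasses carrying their own parameters, so that calling code can switch method without changing how the response matrix and measured histogram are supplied.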
Progress so far
  I have played around with Fergus’ code.
    Can now be used in interactive ROOT.
    Having a few problems right now extending beyond the simple example.
  Need to understand more options before designing an interface.
  Also started to look at Andreas’ GURU package.
  Is this a good idea?
    Or would I better spend my time on yet another tweak to the bookkeeping system?
  I am still just learning, so pointers, suggestions, and ideas are most welcome…
References
  G. Cowan, A Survey of Unfolding Methods for Particle Physics, Proc. Advanced Statistical Techniques in Particle Physics, Durham (2002)
  G. Cowan, Statistical Data Analysis, Oxford University Press (1998)
  R. Barlow, SLUO Lectures on Numerical Methods in HEP (2000), Lecture 9: Unfolding, www-group.slac.stanford.edu/sluo/Lectures/Stat_Lectures.html
  V. Blobel, Unfolding Methods in High Energy Physics, DESY (1984); also CERN
  A. Höcker and V. Kartvelishvili, SVD Approach to Data Unfolding, NIM A 372 (1996) 469