RooUnfold unfolding framework and algorithms

Slides:



Advertisements
Similar presentations
Modeling of Data. Basic Bayes theorem Bayes theorem relates the conditional probabilities of two events A, and B: A might be a hypothesis and B might.
Advertisements

S.Towers TerraFerMA TerraFerMA A Suite of Multivariate Analysis tools Sherry Towers SUNY-SB Version 1.0 has been released! useable by anyone with access.
Transverse momentum of Z bosons in Zee and Zmm decays Daniel Beecher 12 December 2005.
CS479/679 Pattern Recognition Dr. George Bebis
JET CHARGE AT LHC Scott Lieberman New Jersey Institute of Technology TUNL REU 2013 Duke High Energy Physics Group Working with: Professor Ayana Arce Dave.
G. Cowan RHUL Physics Profile likelihood for systematic uncertainties page 1 Use of profile likelihood to determine systematic uncertainties ATLAS Top.
Common Factor Analysis “World View” of PC vs. CF Choosing between PC and CF PAF -- most common kind of CF Communality & Communality Estimation Common Factor.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem, random variables, pdfs 2Functions.
Lecture 5: Learning models using EM
Slide 1 Statistics for HEP Roger Barlow Manchester University Lecture 3: Estimation.
Statistical Image Modelling and Particle Physics Comments on talk by D.M. Titterington Glen Cowan RHUL Physics PHYSTAT05 Glen Cowan Royal Holloway, University.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
7. Least squares 7.1 Method of least squares K. Desch – Statistical methods of data analysis SS10 Another important method to estimate parameters Connection.
Linear and generalised linear models
Chi Square Distribution (c2) and Least Squares Fitting
Basics of regression analysis
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
Linear and generalised linear models Purpose of linear models Least-squares solution for linear models Analysis of diagnostics Exponential family and generalised.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan RHUL Physics Bayesian Higgs combination page 1 Bayesian Higgs combination using shapes ATLAS Statistics Meeting CERN, 19 December, 2007 Glen Cowan.
Binary Variables (1) Coin flipping: heads=1, tails=0 Bernoulli Distribution.
Sung-Won Lee 1 Study of Jets Production Association with a Z boson in pp Collision at 7 and 8 TeV with the CMS Detector Kittikul Kovitanggoon Ph. D. Thesis.
Unfolding algorithms and tests using RooUnfold
880.P20 Winter 2006 Richard Kass 1 Confidence Intervals and Upper Limits Confidence intervals (CI) are related to confidence limits (CL). To calculate.
Analysis Meeting vol Shun University.
1 Probability and Statistics  What is probability?  What is statistics?
G. Cowan Lectures on Statistical Data Analysis Lecture 3 page 1 Lecture 3 1 Probability (90 min.) Definition, Bayes’ theorem, probability densities and.
R. Kass/W03P416/Lecture 7 1 Lecture 7 Some Advanced Topics using Propagation of Errors and Least Squares Fitting Error on the mean (review from Lecture.
Unfolding jet multiplicity and leading jet p T spectra in jet production in association with W and Z Bosons Christos Lazaridis University of Wisconsin-Madison.
Lab 3b: Distribution of the mean
SUPA Advanced Data Analysis Course, Jan 6th – 7th 2009 Advanced Data Analysis for the Physical Sciences Dr Martin Hendry Dept of Physics and Astronomy.
RooUnfold unfolding framework and algorithms Tim Adye Rutherford Appleton Laboratory ATLAS RAL Physics Meeting 20 th May 2008.
RooUnfold unfolding framework and algorithms Tim Adye Rutherford Appleton Laboratory Oxford ATLAS Group Meeting 13 th May 2008.
1 Iterative dynamically stabilized (IDS) method of data unfolding (*) (*arXiv: ) Bogdan MALAESCU CERN PHYSTAT 2011 Workshop on unfolding.
V0 analytical selection Marian Ivanov, Alexander Kalweit.
Bayes Theorem The most likely value of x derived from this posterior pdf therefore represents our inverse solution. Our knowledge contained in is explicitly.
Unfolding in ALICE Jan Fiete Grosse-Oetringhaus, CERN for the ALICE collaboration PHYSTAT 2011 CERN, January 2011.
G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.
JET CHARGE AT LHC Scott Lieberman New Jersey Institute of Technology TUNL REU 2013 Duke High Energy Physics Group Working with: Professor Ayana Arce Dave.
Beam Extrapolation Fit Peter Litchfield  An update on the method I described at the September meeting  Objective;  To fit all data, nc and cc combined,
A bin-free Extended Maximum Likelihood Fit + Feldman-Cousins error analysis Peter Litchfield  A bin free Extended Maximum Likelihood method of fitting.
Mitchell Naisbit University of Manchester A study of the decay using the BaBar detector Mitchell Naisbit – Elba.
8th December 2004Tim Adye1 Proposal for a general-purpose unfolding framework in ROOT Tim Adye Rutherford Appleton Laboratory BaBar Statistics Working.
Spectrum Reconstruction of Atmospheric Neutrinos with Unfolding Techniques Juande Zornoza UW Madison.
1 Introduction to Statistics − Day 2 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Brief catalogue of probability densities.
G. Cowan CERN Academic Training 2012 / Statistics for HEP / Lecture 41 Statistics for HEP Lecture 4: Unfolding Academic Training Lectures CERN, 2–5 April,
G. Cowan Lectures on Statistical Data Analysis Lecture 9 page 1 Statistical Data Analysis: Lecture 9 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis Lecture 5 page 1 Statistical Data Analysis: Lecture 5 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan Lectures on Statistical Data Analysis Lecture 10 page 1 Statistical Data Analysis: Lecture 10 1Probability, Bayes’ theorem 2Random variables and.
Status of QEL Analysis ● QEL-like Event Selection and Sample ● ND Flux Extraction ● Fitting for MINOS Collaboration Meeting FNAL, 7 th -10 th December.
R. Kass/W03 P416 Lecture 5 l Suppose we are trying to measure the true value of some quantity (x T ). u We make repeated measurements of this quantity.
Richard Kass/F02P416 Lecture 6 1 Lecture 6 Chi Square Distribution (  2 ) and Least Squares Fitting Chi Square Distribution (  2 ) (See Taylor Ch 8,
Charm Mixing and D Dalitz analysis at BESIII SUN Shengsen Institute of High Energy Physics, Beijing (for BESIII Collaboration) 37 th International Conference.
Status of physics analysis Fabrizio Cei On Behalf of the Physics Analysis Group PSI BVR presentation, February 9, /02/2015Fabrizio Cei1.
Extrapolation Techniques  Four different techniques have been used to extrapolate near detector data to the far detector to predict the neutrino energy.
I'm concerned that the OS requirement for the signal is inefficient as the charge of the TeV scale leptons can be easily mis-assigned. As a result we do.
R. Kass/Sp07P416/Lecture 71 More on Least Squares Fit (LSQF) In Lec 5, we discussed how we can fit our data points to a linear function (straight line)
1 Unfolding Study 07/11/12. 2 RooUnfold Using the RooUnfold In this study we make response.
Statistics (2) Fitting Roger Barlow Manchester University IDPASC school Sesimbra 14 th December 2010.
ASEN 5070: Statistical Orbit Determination I Fall 2014
Data Analysis in Particle Physics
Unfolding Problem: A Machine Learning Approach
Unfolding atmospheric neutrino spectrum with IC9 data (second update)
Chi Square Distribution (c2) and Least Squares Fitting
Where did we stop? The Bayes decision rule guarantees an optimal classification… … But it requires the knowledge of P(ci|x) (or p(x|ci) and P(ci)) We.
Statistical Methods for Particle Physics Lecture 4: unfolding
Introduction to Unfolding
Dilepton Mass. Progress report.
Juande Zornoza UW Madison
Unfolding with system identification
Presentation transcript:

RooUnfold unfolding framework and algorithms Tim Adye Rutherford Appleton Laboratory BaBar Statistics Working Group BaBar Collaboration Meeting 13th December 2005

Outline What is Unfolding? Overview of a few techniques and why might you want to do it? Overview of a few techniques Regularised unfolding Iterative method RooUnfold package Currently implements three methods with a common interface Status and Plans References 13th December 2005 Tim Adye

Unfolding In other fields known as “deconvolution”, “unsmearing” Given a “true” PDF in μ, that is corrupted by detector effects, described by a response function, R, we measure a distribution in ν. In terms of histograms This may involve inefficiencies: lost events bias and smearing: events moving between bins (off-diagonal Rij) With infinite statistics, it would be possible to recover the original PDF by inverting the response matrix 13th December 2005 Tim Adye

Not so simple… Unfortunately, if there are statistical fluctuations between bins this information is destroyed Since R washes out statistical fluctuations, R-1 cannot distinguish between wildly fluctuating and smooth PDFs Obtain large negative correlations between adjacent bins Large fluctuations in reconstructed bin contents Need some procedure to remove wildly fluctuating solutions Give added weight to “smoother” solutions Solve for µ iteratively, starting with a reasonable guess and truncate iteration before it gets out of hand Ignore bin-to-bin fluctuations altogether 13th December 2005 Tim Adye

What happens if you don’t smooth 13th December 2005 Tim Adye

True Gaussian, with Gaussian smearing, systematic translation, and variable inefficiency – trained using a different Gaussian 13th December 2005 Tim Adye

Double Breit-Wigner, with Gaussian smearing, systematic translation, and variable inefficiency – trained using a single Gaussian 13th December 2005 Tim Adye

So why don’t we always do this? If the true PDF and resolution function can be parameterised, then a Maximum Likelihood fit is usually more convenient Directly returns parameters of interest Does not require binning If the response function doesn’t include smearing (ie. it’s diagonal), then apply bin-by-bin efficiency correction directly If result is just needed for comparison (eg. with MC), could apply response function to MC simpler than un-applying response to data 13th December 2005 Tim Adye

When to use unfolding Use unfolding to recover theoretical distribution where there is no a-priori parameterisation this is needed for the result and not just comparison with MC there is significant bin-to-bin migration of events 13th December 2005 Tim Adye

Where could we use unfolding? Traditionally used to extract structure functions Widely used outside PP for image reconstruction Dalitz plots Cross-feed between bins due to misreconstruction “True” decay momentum distributions Theory at parton level, we measure hadrons Correct for hadronisation as well as detector effects 13th December 2005 Tim Adye

1. Regularised Unfolding Use Maximum Likelihood to fit smeared bin contents to measured data, but include regularisation function where the regularisation parameter, α, controls the degree of smoothness (select α to, eg., minimise mean squared error) Various choices of regularisation function, S, are used Tikhonov regularisation: minimise curvature for some definition of curvature, eg. RooUnfHistoSvd by Kerstin Tackmann and Heiko Lacker based on GURU by Andreas Höcker and Vakhtang Kartvelishvili uses Singular Value Decomposition RUN by Volker Blobel Maximum entropy: 13th December 2005 Tim Adye

2. Iterative method Uses Bayes’ theorem to invert and using an initial set of probabilities, pi (eg. flat) obtain an improved estimate Repeating with new pi from these new bin contents converges quite rapidly Truncating the iteration prevents us seeing the bad effects of statistical fluctuations Fergus Wilson and I have implemented this method in ROOT/C++ Supports 1D, 2D, and 3D cases 13th December 2005 Tim Adye

2D Unfolding Example 2D Smearing, bias, variable efficiency, and variable rotation 13th December 2005 Tim Adye

RooUnfold Package Make these different methods available as ROOT/C++ classes with a common interface to specify unfolding method and parameters response matrix pass directly or fill from MC sample measured histogram return reconstructed truth histogram and errors full covariance matrix Easy to do with multiple dimensions (when supported) This should make it easy to try and compare different methods in your analysis Could also be useful outside BaBar! 13th December 2005 Tim Adye

RooUnfold Classes RooUnfoldResponse response matrix with various filling and access methods create from MC, use on data (can be stored in a file) RooUnfold – unfolding algorithm base class RooUnfoldBayes – Iterative method RooUnfoldSvd – Inteface to RooUnfHistoSvd package RooUnfoldBinByBin – Simple bin-by-bin method Trivial implementation, but useful to compare with full unfolding RooUnfoldExample – Simple 1D example RooUnfoldTest and RooUnfoldTest2D Test with different training and unfolding distributions 13th December 2005 Tim Adye

RooUnfold Status Available in CVS Announced in Statistics HN See README file for details of building and running Interface can still be adjusted based on comments I already have an idea for simplifying use in multi-dimensional case 13th December 2005 Tim Adye

Plans and possible improvements So far this is mostly a programming exercise Would be interesting to compare the different methods for some real analysis distributions But YMMV Add common tools, useful for all algorithms Inputs and results in different formats already supports histograms and ROOT vectors/matrices Automatic calculation of figures of merit (eg. Â2) can also use standard ROOT functions on histograms Simplify selection of regularisation parameter More algorithms? Maximum entropy regularisation Simple matrix inversion without regularisation perhaps useful with large statistics 13th December 2005 Tim Adye

References - Overview G. Cowan, A Survey of Unfolding Methods for Particle Physics, Proc. Advanced Statistical Techniques in Particle Physics, Durham (2002) http://www.ippp.dur.ac.uk/Workshops/02/statistics/ G. Cowan, Statistical Data Analysis, Oxford University Press (1998), Chapter 11: Unfolding R. Barlow, SLUO Lectures on Numerical Methods in HEP (2000), Lecture 9: Unfolding www-group.slac.stanford.edu/sluo/Lectures/Stat_Lectures.html 13th December 2005 Tim Adye

References - Techniques V. Blobel, Unfolding Methods in High Energy Physics, DESY 84-118 (1984); also CERN 85-02 A. Höcker and V. Kartvelishvili, SVD Approach to Data Unfolding, NIM A 372 (1996) 469 www.lancs.ac.uk/depts/physics/staff/kartvelishvili.html K. Tackmann, H. Lacker, Unfolding the Hadronic Mass Spectrum in B->Xu lν Decays, BAD 894. G. D’Agostini, A multidimensional unfolding method based on Bayes’ theorem, NIM A 362 (1995) 487 13th December 2005 Tim Adye