Tutorial on Multivariate Methods (TMVA)

Slides:



Advertisements
Similar presentations
Modeling of Data. Basic Bayes theorem Bayes theorem relates the conditional probabilities of two events A, and B: A might be a hypothesis and B might.
Advertisements

G. Cowan TAE 2013 / Statistics Problems1 TAE Statistics Problems Glen Cowan Physics Department Royal Holloway, University of London
Using the Profile Likelihood in Searches for New Physics / PHYSTAT 2011 G. Cowan 1 Using the Profile Likelihood in Searches for New Physics arXiv:
G. Cowan RHUL Physics Profile likelihood for systematic uncertainties page 1 Use of profile likelihood to determine systematic uncertainties ATLAS Top.
G. Cowan RHUL Physics Comment on use of LR for limits page 1 Comment on definition of likelihood ratio for limits ATLAS Statistics Forum CERN, 2 September,
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 6 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan RHUL Physics Higgs combination note status page 1 Status of Higgs Combination Note ATLAS Statistics/Higgs Meeting Phone, 7 April, 2008 Glen Cowan.
G. Cowan Statistics for HEP / NIKHEF, December 2011 / Lecture 2 1 Statistical Methods for Particle Physics Lecture 2: Tests based on likelihood ratios.
G. Cowan Discovery and limits / DESY, 4-7 October 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits Lecture 2: Tests based on likelihood.
G. Cowan Lectures on Statistical Data Analysis 1 Statistical Data Analysis: Lecture 7 1Probability, Bayes’ theorem, random variables, pdfs 2Functions of.
G. Cowan RHUL Physics Bayesian Higgs combination page 1 Bayesian Higgs combination using shapes ATLAS Statistics Meeting CERN, 19 December, 2007 Glen Cowan.
G. Cowan Statistical Methods in Particle Physics1 Statistical Methods in Particle Physics Day 3: Multivariate Methods (II) 清华大学高能物理研究中心 2010 年 4 月 12—16.
G. Cowan CLASHEP 2011 / Topics in Statistical Data Analysis / Lecture 21 Topics in Statistical Data Analysis for HEP Lecture 2: Statistical Tests CERN.
G. Cowan Weizmann Statistics Workshop, 2015 / GDC Lecture 31 Statistical Methods for Particle Physics Lecture 3: asymptotics I; Asimov data set Statistical.
G. Cowan RHUL Physics page 1 Status of search procedures for ATLAS ATLAS-CMS Joint Statistics Meeting CERN, 15 October, 2009 Glen Cowan Physics Department.
G. Cowan, RHUL Physics Discussion on significance page 1 Discussion on significance ATLAS Statistics Forum CERN/Phone, 2 December, 2009 Glen Cowan Physics.
G. Cowan RHUL Physics LR test to determine number of parameters page 1 Likelihood ratio test to determine best number of parameters ATLAS Statistics Forum.
G. Cowan RHUL Physics Input from Statistics Forum for Exotics page 1 Input from Statistics Forum for Exotics ATLAS Exotics Meeting CERN/phone, 22 January,
G. Cowan S0S 2010 / Statistical Tests and Limits 1 Statistical Tests and Limits Lecture 1: general formalism IN2P3 School of Statistics Autrans, France.
1 Introduction to Statistics − Day 4 Glen Cowan Lecture 1 Probability Random variables, probability densities, etc. Lecture 2 Brief catalogue of probability.
G. Cowan Lectures on Statistical Data Analysis Lecture 8 page 1 Statistical Data Analysis: Lecture 8 1Probability, Bayes’ theorem 2Random variables and.
G. Cowan NExT Workshop, 2015 / GDC Lecture 11 Statistical Methods for Particle Physics Lecture 1: introduction & statistical tests Fifth NExT PhD Workshop:
G. Cowan Computing and Statistical Data Analysis / Stat 9 1 Computing and Statistical Data Analysis Stat 9: Parameter Estimation, Limits London Postgraduate.
G. Cowan IDPASC School of Flavour Physics, Valencia, 2-7 May 2013 / Statistical Analysis Tools 1 Statistical Analysis Tools for Particle Physics IDPASC.
G. Cowan Terascale Statistics School 2015 / Statistics for Run II1 Statistics for LHC Run II Terascale Statistics School DESY, Hamburg March 23-27, 2015.
G. Cowan RHUL Physics Status of Higgs combination page 1 Status of Higgs Combination ATLAS Higgs Meeting CERN/phone, 7 November, 2008 Glen Cowan, RHUL.
G. Cowan Statistical methods for HEP / Freiburg June 2011 / Lecture 2 1 Statistical Methods for Discovery and Limits in HEP Experiments Day 2: Discovery.
G. Cowan IHEP seminar / 19 August Some Developments in Statistical Methods for Particle Physics Particle Physics Seminar IHEP, Beijing 19 August,
G. Cowan CERN Academic Training 2010 / Statistics for the LHC / Lecture 21 Statistics for the LHC Lecture 2: Discovery Academic Training Lectures CERN,
G. Cowan SLAC Statistics Meeting / 4-6 June 2012 / Two Developments 1 Two developments in discovery tests: use of weighted Monte Carlo events and an improved.
G. Cowan CERN Academic Training 2012 / Statistics for HEP / Lecture 21 Statistics for HEP Lecture 2: Discovery and Limits Academic Training Lectures CERN,
Discussion on significance
Computing and Statistical Data Analysis / Stat 11
iSTEP 2016 Tsinghua University, Beijing July 10-20, 2016
Tutorial on Statistics TRISEP School 27, 28 June 2016 Glen Cowan
Estimating Statistical Significance
Statistics for the LHC Lecture 3: Setting limits
Some Statistical Tools for Particle Physics
LAL Orsay, 2016 / Lectures on Statistics
Comment on Event Quality Variables for Multivariate Analyses
ISTEP 2016 Final Project— Project on
TAE 2017 Centro de ciencias Pedro Pascual Benasque, Spain
School on Data Science in (Astro)particle Physics
Graduierten-Kolleg RWTH Aachen February 2014 Glen Cowan
Computing and Statistical Data Analysis / Stat 8
Statistical Methods for the LHC Discovering New Physics
TAE 2017 / Statistics Lecture 3
Lectures on Statistics TRISEP School 27, 28 June 2016 Glen Cowan
Project on H →ττ and multivariate methods
TAE 2018 / Statistics Lecture 3
Computing and Statistical Data Analysis Stat 5: Multivariate Methods
Statistical Methods for the LHC
TAE 2018 Benasque, Spain 3-15 Sept 2018 Glen Cowan Physics Department
TAE 2018 / Statistics Lecture 4
Computing and Statistical Data Analysis / Stat 6
Some Prospects for Statistical Data Analysis in High Energy Physics
Computing and Statistical Data Analysis / Stat 7
TRISEP 2016 / Statistics Lecture 1
TAE 2017 / Statistics Lecture 2
TRISEP 2016 / Statistics Lecture 2
Statistical Methods for HEP Lecture 3: Discovery and Limits
Statistical Analysis Tools for Particle Physics
Application of CLs Method to ATLAS Higgs Searches
iSTEP 2016 Tsinghua University, Beijing July 10-20, 2016
Decomposition of Stat/Sys Errors
TAE 2018 Centro de ciencias Pedro Pascual Benasque, Spain
Project on H→ττ and multivariate methods
Statistical Methods for Particle Physics Lecture 3: further topics
Presentation transcript:

Tutorial on Multivariate Methods (TMVA) iSTEP 2014 IHEP, Beijing August 20-29, 2014 Glen Cowan (谷林·科恩) Physics Department Royal Holloway, University of London g.cowan@rhul.ac.uk www.pp.rhul.ac.uk/~cowan G. Cowan iSTEP 2014, Beijing / TMVA Tutorial TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAA

Introduction Exercise: use samples of “toy” signal/background MC training data to train a multivariate classifier using the program TMVA. Problem sheet on: www.pp.rhul.ac.uk/~cowan/stat/beijing14/ istep_tmva_problem.pdf Code for exercise: www.pp.rhul.ac.uk/~cowan/stat/root/tmva/ tmvaExamples.tar Download to local directory and unpack: tar –xvf tmvaExamples.tar Setup for Root/TMVA at IHEP: source /home/atlas01/bin/root/bin/thisroot.sh Read the file readme.txt for more instructions. G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Each event characterized by 3 variables, x, y, z: G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Test example (x, y, z) y y no cut on z z < 0.75 x x y y z < 0.25 G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Test example results Fisher discriminant Multilayer perceptron Naive Bayes, no decor- relation Naive Bayes with decor- relation G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Extension of ATLAS Project select For the ATLAS Group Project, you defined a test statistic t to separate between signal (ttH) and background events. t tcut You selected events with t > tcut, calculated s and b, and estimated the expected discovery significance using s/√b. This is OK for a start, but does not use all of the available information from each events value of the statistic t. G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Likelihood ratio statistic for discovery test In bin i of test statistic t, expected numbers of signal/background: Likelihood function for strength parameter μ with data n1,..., nN Statistic for test of μ = 0: (Asimov Paper: CCGV EPJC 71 (2011) 1554; arXiv:1007.1727) G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Background-only distribution of q0 For background-only (μ = 0) toy MC, generate ni ~ Poisson(bi). Large-sample asymptotic formula is “half-chi-square”. G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Background-only cumulative distribution of q0 p-value is probability, assuming μ = 0, to find q0 even higher than the one observed (one minus cumulative distribution). From p-value, equivalent significance: Z = Φ-1(1 – p) Φ-1 = standard normal quantile G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Discovery sensitivity Good agreement between toy MC and large-sample formulae, so OK to use asymptotic formula for significance Z, Median significance of test of background-only hypothesis under assumption of signal+background from “Asimov data set”: You can use the Asimov data set to evaluate q0 and use this with the formula Z = √q0 to estimate the median discovery significance. G. Cowan iSTEP 2014, Beijing / TMVA Tutorial

Multinomial model Nominal ATLAS analysis uses the information from the distribution of the MVA ouput (“shape information”). No info is taken from the total observed number of events (presumably because the systematic uncertainty on b is large). This corresponds to using a multinomial model for the observed histogram of MVA output values: G. Cowan iSTEP 2014, Beijing / TMVA Tutorial