Presentation transcript:

1/21 Status of TMVA, the Toolkit for MultiVariate Analysis. Eckhard von Toerne (University of Bonn), for the TMVA core developer team: A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E.v.T., H. Voss. ACAT 2011, 6 Sept 2011.

2/21 Outline
– Overview
– New developments
– Recent physics results that use TMVA
Web site: http://tmva.sourceforge.net
See also: "TMVA – Toolkit for Multivariate Data Analysis", A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E.v.Toerne, H. Voss et al., arXiv:physics/0703039v5 [physics.data-an]

3/21 What is TMVA
– Supervised learning
– Classification and regression tasks
– Easy to train, evaluate and compare various MVA methods
– Various preprocessing methods (decorrelation, PCA, Gaussianisation, ...)
– Integrated in ROOT
[Figure: distribution of the MVA output for signal and background]
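The preprocessing is steered per method via the VarTransform booking option; a minimal sketch, assuming a Factory as in the complete session shown later (the method title and the chosen transformation chain are illustrative, not a recommendation):

   // Chain decorrelation (D), principal component analysis (P) and
   // Gaussianisation (G) of the inputs before the likelihood is built
   factory->BookMethod( TMVA::Types::kLikelihood, "LikelihoodDPG",
                        "!V:VarTransform=D,P,G" );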

4/21 TMVA workflow
Training:
– Classification: learn the features of the different event classes from a sample with known signal/background composition
– Regression: learn the functional dependence between input variables and targets
Testing:
– Evaluate the performance of the trained classifier/regressor on an independent test sample
– Compare different methods
Application:
– Apply the classifier/regressor to real data
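The application step uses the TMVA::Reader class; a minimal sketch, assuming the Factory and variable expressions from the complete session shown later (the weight-file path follows the usual Factory naming convention and is an example):

   #include "TMVA/Reader.h"

   float v1, v2;
   TMVA::Reader* reader = new TMVA::Reader( "!Color:!Silent" );
   // Variables must be declared with the same expressions and order as in training
   reader->AddVariable( "var1+var2", &v1 );
   reader->AddVariable( "var1-var2", &v2 );
   // Load the weight file written by the Factory during training
   reader->BookMVA( "MLP", "weights/MVAnalysis_MLP.weights.xml" );
   // ... inside the event loop: fill v1, v2 for each event, then evaluate
   double mvaValue = reader->EvaluateMVA( "MLP" );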

5/21 Classification/Regression
– Classification of signal/background: how to find the best decision boundary?
– Regression: how to determine the correct model?
[Figure: two-class scatter plot in (x1, x2) with hypotheses H0 and H1 separated by a decision boundary]

6/21 How to choose a method?
– If you have a training sample with only few events: the number of "parameters" must be limited. Use a linear classifier or FDA, a small BDT, or a small MLP.
– Variables are uncorrelated (or have only linear correlations): use Likelihood.
– I just want something simple: use Cuts, LD, or Fisher (a minimal booking example follows below).
– Methods for complex problems: use BDT, MLP, or SVM.

List of acronyms:
– BDT = boosted decision tree, see manual p. 103
– ANN = artificial neural network
– MLP = multi-layer perceptron, a specific form of ANN, also the name of our flagship ANN, manual p. 92
– FDA = functional discriminant analysis, see manual p. 87
– LD = linear discriminant, manual p. 85
– SVM = support vector machine, manual p. 98; SVM currently available only for classification
– Cuts = like in "cut selection", manual p. 56
– Fisher = Ronald A. Fisher; classifier similar to LD, manual p. 83
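For the simple cases a single Factory line suffices; a hedged sketch (the method title and option string are arbitrary examples):

   // A linear discriminant has few parameters and is robust on small samples
   factory->BookMethod( TMVA::Types::kLD, "LD", "!V" );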

7/21 Artificial Neural Networks
Feed-forward multilayer perceptron: the N_var discriminating input variables feed one input layer, followed by k hidden layers with M_1 ... M_k nodes and one output layer with two output classes (signal and background); each node applies a nonlinear "activation" function to a weighted sum of its inputs.
Modelling of arbitrary nonlinear functions as a nonlinear combination of simple neuron activation functions.
Advantages:
– very flexible, no assumption about the function necessary
Disadvantages:
– "black box"
– needs tuning
– seed dependent
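For a single hidden layer of M nodes with tanh activation, the network response takes the generic textbook form (consistent with the architecture above, not verbatim from the slide; the weights w and offsets θ are the parameters adjusted during training):

   y_{\mathrm{ANN}}(\mathbf{x}) \;=\; \sum_{j=1}^{M} w^{(2)}_{j}\,\tanh\!\Big(\sum_{i=1}^{N_{\mathrm{var}}} w^{(1)}_{ij}\,x_i + \theta_j\Big) + \theta^{(2)}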

8/21 Boosted Decision Trees
– Grow a forest of decision trees and determine the event class/target by majority vote
– Weights of misclassified events are increased in the next iteration
Advantages:
– ignores weak variables
– works out of the box
Disadvantages:
– vulnerable to overtraining
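A hedged booking sketch for an AdaBoost-trained forest (the numeric values are illustrative, not a tuning recommendation):

   factory->BookMethod( TMVA::Types::kBDT, "BDT",
      "NTrees=400:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:"
      "SeparationType=GiniIndex:nCuts=20" );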

9/21 No Single Best Classifier...
[Table: the classifiers Cuts, Likelihood, PDERS/k-NN, H-Matrix, Fisher, MLP, BDT, RuleFit and SVM rated against the criteria performance (no/linear correlations, nonlinear correlations), speed (training, response), robustness (overtraining, weak input variables), curse of dimensionality, and transparency]
The properties of the Function Discriminant (FDA) depend on the chosen function.

10/21 Neyman-Pearson Lemma
The likelihood ratio used as "selection criterion" y(x) gives for each selection efficiency the best possible background rejection, i.e. it maximizes the area under the "Receiver Operating Characteristic" (ROC) curve.
Varying the cut y(x) > c moves the working point (efficiency and purity) along the ROC curve. How to choose the cut? One needs to know the prior probabilities (S, B abundances):
– Measurement of a signal cross section: maximum of S/√(S+B), or equivalently of √(ε·p)
– Discovery of a signal: maximum of S/√B
– Precision measurement: high purity (p)
– Trigger selection: high efficiency (ε)
[Figure: ROC curves of background rejection vs. signal efficiency, from random guessing to good and better classification; the limiting curve is given by the likelihood ratio. Moving along the curve trades a small type-1 error and large type-2 error against a large type-1 error and small type-2 error]
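Written out (standard notation, not verbatim from the slide), the Neyman-Pearson test statistic is the density ratio

   y(\mathbf{x}) \;=\; \frac{P(\mathbf{x} \mid S)}{P(\mathbf{x} \mid B)} ,

and a working point is set by selecting events with y(x) > c for some cut value c.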

11/21 Performance with toy data
– Toy distribution: signal is a sum of Gaussians, background is flat
– Theoretical limit calculated using the Neyman-Pearson lemma
– Neural net (MLP) with two hidden layers and backpropagation training; the Bayesian option has little influence on high-statistics training
– The TMVA ANN converges towards the theoretical limit for sufficiently large Ntrain (~100k)

12/21 Recent developments
– Current version: the TMVA version shipped in ROOT release 5.30
– Unit test framework for daily software and method performance validation (C. Rosemann, E.v.T.)
– Multiclass classification for MLP, BDTG, FDA
– BDT automatic parameter optimization for building the tree architecture
– New method to treat data with distinct sub-populations (method Category)
– Optional Bayesian treatment of ANN weights in MLP with back-propagation (Jiahang Zhong)
– Extended PDEFoam functionality (A. Voigt)
– Variable transformations on a user-defined subset of variables

13/21 Unit test
– Automated framework to verify functionality and performance (ours is based on B. Eckel's description)
– A slimmed version runs every night on various OS (C. Rosemann, E.v.T.)

***************************************************************
* TMVA - U N I T test : Summary *
***************************************************************
Test 0 : Event [107/107] OK
Test 1 : VariableInfo [31/31] OK
Test 2 : DataSetInfo [20/20] OK
Test 3 : DataSet [15/15] OK
Test 4 : Factory [16/16] OK
Test 7 : LDN_selVar_Gauss [4/4] OK
....
Test 107 : BoostedPDEFoam [4/4] OK
Test 108 : BoostedDTPDEFoam [4/4] OK
Total number of failures: 0
***************************************************************

14/21 And now: switching from statistics to physics ... acknowledging the hard work of our users

15/21 Review of recent results
CMS H→WW search: BDT trained on individual m_H samples with 10 variables. Expect 1.47 signal events at m_H = 160 GeV, compared to 1.27 events with a cut-based analysis (on 4 variables) and the same background. Phys.Lett.B699:25-47,2011.

16/21 Review of recent results
Super-Kamiokande Coll., "Kinematic reconstruction of atmospheric neutrino events in a large water Cherenkov detector with proton identification", Phys.Rev.D79:112010,2009. MLP with one hidden layer.
[Figure: MLP output distributions for signal and background]

17/21 Review of recent results
CDF Coll., "First Observation of Electroweak Single Top Quark Production", Phys.Rev.Lett.103:092002,2009. BDT analysis with ~20 input variables in the lepton + missing E_T + jets channel; results for the combined s+t channel.

18/21 Review of recent results using TMVA
– CDF+D0 combined Higgs working group, hep-ex/ (SVM)
– CMS Coll., H→WW search, Phys.Lett.B699:25-47,2011 (BDT)
– IceCube Coll., astro-ph/ (MLP)
– D0 Coll., top pairs, Phys.Rev.D84:012008,2011 (BDT)
– IceCube Coll., Phys.Rev.D83:012001,2011 (BDT)
– IceCube Coll., Phys.Rev.D82:112003,2010 (BDT)
– D0 Coll., Higgs search, Phys.Rev.Lett.105:251801,2010 (BDT)
– CDF Coll., single top, Phys.Rev.D82:112005,2010 (BDT)
– D0 Coll., single top, Phys.Lett.B690:5-14,2010 (BDT)
– D0 Coll., top pairs, Phys.Rev.D82:032002,2010 (Likelihood)
– CDF Coll., single top observation, Phys.Rev.Lett.103:092002,2009 (BDT)
– Super-Kamiokande Coll., Phys.Rev.D79:112010,2009 (MLP)
– BABAR Coll., Phys.Rev.D79:051101,2009 (BDT)
+ other papers
+ several ATLAS papers with TMVA about to come out
+ many ATLAS results about to be published

19/21 Review of recent results using TMVA: Thank you for using TMVA!

20/21 Summary
– TMVA: a versatile package for classification and regression tasks
– Integrated into ROOT
– Easy to train classifiers/regression methods
– A multitude of physics results based on TMVA are coming out
Thank you for your attention!

21/21 Credits
– Several similar data-mining efforts with rising importance in most fields of science and industry
– TMVA is open source software; use and redistribution of the source are permitted according to the terms of the BSD license
Contributors to TMVA: Andreas Hoecker (CERN, Switzerland), Jörg Stelzer (CERN, Switzerland), Peter Speckmayer (CERN, Switzerland), Jan Therhaag (Universität Bonn, Germany), Eckhard von Toerne (Universität Bonn, Germany), Helge Voss (MPI für Kernphysik Heidelberg, Germany), Moritz Backes (Geneva University, Switzerland), Tancredi Carli (CERN, Switzerland), Asen Christov (Universität Freiburg, Germany), Or Cohen (CERN, Switzerland and Weizmann, Israel), Krzysztof Danielowski (IFJ and AGH/UJ, Krakow, Poland), Dominik Dannheim (CERN, Switzerland), Sophie Henrot-Versille (LAL Orsay, France), Matthew Jachowski (Stanford University, USA), Kamil Kraszewski (IFJ and AGH/UJ, Krakow, Poland), Attila Krasznahorkay Jr. (CERN, Switzerland, and Manchester U., UK), Maciej Kruk (IFJ and AGH/UJ, Krakow, Poland), Yair Mahalalel (Tel Aviv University, Israel), Rustem Ospanov (University of Texas, USA), Xavier Prudent (LAPP Annecy, France), Arnaud Robert (LPNHE Paris, France), Christoph Rosemann (DESY), Doug Schouten (S. Fraser U., Canada), Fredrik Tegenfeldt (Iowa University, USA, until Aug 2007), Alexander Voigt (CERN, Switzerland), Kai Voss (University of Victoria, Canada), Marcin Wolter (IFJ PAN Krakow, Poland), Andrzej Zemla (IFJ PAN Krakow, Poland), Jiahang Zhong (Academia Sinica, Taipei).

22/21 Spare Slides

23/21 A complete TMVA training/testing session

void TMVAnalysis()
{
   // Create Factory
   TFile* outputFile = TFile::Open( "TMVA.root", "RECREATE" );
   TMVA::Factory* factory = new TMVA::Factory( "MVAnalysis", outputFile, "!V" );

   // Add variables/targets
   TFile* input = TFile::Open( "tmva_example.root" );
   factory->AddVariable( "var1+var2", 'F' );
   factory->AddVariable( "var1-var2", 'F' );
   //factory->AddTarget( "tarval", 'F' );

   // Initialize trees
   factory->AddSignalTree( (TTree*)input->Get("TreeS"), 1.0 );
   factory->AddBackgroundTree( (TTree*)input->Get("TreeB"), 1.0 );
   //factory->AddRegressionTree( (TTree*)input->Get("regTree"), 1.0 );
   factory->PrepareTrainingAndTestTree( "", "",
      "nTrain_Signal=200:nTrain_Background=200:nTest_Signal=200:nTest_Background=200:NormMode=None" );

   // Book MVA methods
   factory->BookMethod( TMVA::Types::kLikelihood, "Likelihood",
      "!V:!TransformOutput:Spline=2:NSmooth=5:NAvEvtPerBin=50" );
   factory->BookMethod( TMVA::Types::kMLP, "MLP",
      "!V:NCycles=200:HiddenLayers=N+1,N:TestRate=5" );

   // Train, test and evaluate
   factory->TrainAllMethods();   // factory->TrainAllMethodsForRegression();
   factory->TestAllMethods();
   factory->EvaluateAllMethods();

   outputFile->Close();
   delete factory;
}

24/21 What is a multivariate analysis?
– "Combine" all input variables into one output variable
– Supervised learning means learning by example: the program extracts patterns from the training data
[Figure: input variables feed a classifier that produces a single output]
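Formally (a standard way of writing it, not verbatim from the slide), a classifier compresses the N_var-dimensional input into a single scalar:

   y:\ \mathbb{R}^{N_{\mathrm{var}}} \rightarrow \mathbb{R}, \qquad (x_1, \dots, x_{N_{\mathrm{var}}}) \mapsto y(\mathbf{x})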

25/21 Metaclassifiers: Category classifier and boosting
– The Category classifier is custom-made for HEP: use different classifiers for different phase-space regions and combine them into a single output (a hedged booking sketch follows below)
– TMVA supports boosting for all classifiers: use a collection of "weak learners" to improve their performance (boosted Fisher, boosted neural nets with few neurons each, ...)
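A sketch of booking the Category method, modelled on the TMVAClassificationCategory example shipped with TMVA (the cut expressions, variable lists and method titles are placeholders):

   // Book the meta-method, then attach one sub-classifier per region
   TMVA::MethodCategory* mcat = dynamic_cast<TMVA::MethodCategory*>(
      factory->BookMethod( TMVA::Types::kCategory, "FisherCat", "" ) );
   mcat->AddMethod( "abs(eta)<=1.3", "var1:var2:var3:var4",
                    TMVA::Types::kFisher, "Fisher_Barrel", "!H:!V:Fisher" );
   mcat->AddMethod( "abs(eta)>1.3",  "var1:var2:var3:var4",
                    TMVA::Types::kFisher, "Fisher_Endcap", "!H:!V:Fisher" );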