
1/21 ACAT 2011, Eckhard von Toerne, 6 Sept 2011
Status of TMVA, the Toolkit for MultiVariate Analysis
Eckhard von Toerne (University of Bonn)
For the TMVA core developer team: A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E.v.T., H. Voss

2/21 Outline
Overview
New developments
Recent physics results that use TMVA
– Web site: http://tmva.sourceforge.net/
– See also: "TMVA – Toolkit for Multivariate Data Analysis", A. Hoecker, P. Speckmayer, J. Stelzer, J. Therhaag, E. v. Toerne, H. Voss et al., arXiv:physics/0703039v5 [physics.data-an]

3/21 What is TMVA
Supervised learning
Classification and regression tasks
Easy to train, evaluate and compare various MVA methods
Various preprocessing methods (decorrelation, PCA, Gaussianisation, ...)
Integrated into ROOT
[Figure: example MVA output distribution]

4/21 TMVA workflow
Training:
– Classification: learn the features of the different event classes from a sample with known signal/background composition
– Regression: learn the functional dependence between input variables and targets
Testing:
– Evaluate the performance of the trained classifier/regressor on an independent test sample
– Compare different methods
Application:
– Apply the classifier/regressor to real data
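The point of the independent test sample is to detect overtraining: a method may have fitted statistical fluctuations of the training sample, so its performance must be quoted on events it has never seen. A minimal sketch of such a split in plain C++ (illustrative only; in TMVA the Factory does this via PrepareTrainingAndTestTree, see the session on slide 23):

```cpp
#include <vector>
#include <utility>
#include <cstddef>

// Split a sample into disjoint training and test halves (even/odd indices),
// so that test performance is measured on events never used in training.
std::pair<std::vector<double>, std::vector<double>>
trainTestSplit(const std::vector<double>& events)
{
    std::vector<double> train, test;
    for (std::size_t i = 0; i < events.size(); ++i)
        (i % 2 == 0 ? train : test).push_back(events[i]);
    return {train, test};
}
```

A large gap between training-sample and test-sample performance is the standard symptom of overtraining.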

5/21 Classification/Regression
[Figure: decision boundary between hypotheses H0 and H1 in the (x1, x2) plane]
Classification of signal/background: how to find the best decision boundary?
Regression: how to determine the correct model?

6/21 How to choose a method?
If you have a training sample with only few events: the number of "parameters" must be limited → use a linear classifier or FDA, a small BDT, a small MLP
Variables are uncorrelated (or have only linear correlations) → likelihood
I just want something simple → use Cuts, LD, Fisher
Methods for complex problems → use BDT, MLP, SVM

List of acronyms:
BDT = boosted decision tree, see manual p. 103
ANN = artificial neural network
MLP = multi-layer perceptron, a specific form of ANN, also the name of our flagship ANN, manual p. 92
FDA = function discriminant analysis, see manual p. 87
LD = linear discriminant, manual p. 85
SVM = support vector machine, manual p. 98; SVM currently available only for classification
Cuts = as in "cut selection", manual p. 56
Fisher = Ronald A. Fisher, classifier similar to LD, manual p. 83

7/21 Artificial Neural Networks
[Figure: feed-forward multilayer perceptron with N_var discriminating input variables in the input layer, k hidden layers, and one output layer with 2 output classes (signal and background); each node applies an "activation" function. Property summary strip: rating symbols not reproduced here, see the comparison table on slide 9]
Feed-forward multilayer perceptron: modelling of arbitrary nonlinear functions as a nonlinear combination of simple "neuron activation functions"
Advantages:
– very flexible, no assumption about the function necessary
Disadvantages:
– "black box"
– needs tuning
– seed dependent
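The "nonlinear combination of simple neuron activation functions" on this slide can be sketched in a few lines of plain C++. This is an illustrative toy, not TMVA's MLP implementation; the weights, biases and layer sizes are invented, and tanh stands in for the generic activation function:

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

// One feed-forward layer: out_j = tanh( b_j + sum_i w[j][i] * in_i ).
// tanh is one common choice of "activation" function.
std::vector<double> layer(const std::vector<double>& in,
                          const std::vector<std::vector<double>>& w,
                          const std::vector<double>& b)
{
    std::vector<double> out(b.size());
    for (std::size_t j = 0; j < b.size(); ++j) {
        double s = b[j];
        for (std::size_t i = 0; i < in.size(); ++i) s += w[j][i] * in[i];
        out[j] = std::tanh(s);
    }
    return out;
}

// A toy 2-input, 2-hidden-neuron, 1-output network: the classifier
// response is the chained application of the layers.
double mlpResponse(const std::vector<double>& x)
{
    std::vector<std::vector<double>> w1 = {{0.5, -0.3}, {0.8, 0.1}};
    std::vector<double> b1 = {0.0, 0.2};
    std::vector<std::vector<double>> w2 = {{1.0, -1.0}};
    std::vector<double> b2 = {0.1};
    return layer(layer(x, w1, b1), w2, b2)[0];
}
```

Training (e.g. backpropagation, as used by TMVA's MLP) then consists of adjusting the weights w and biases b to minimize the misclassification of the training sample.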

8/21 Boosted Decision Trees
[Figure: example decision tree. Property summary strip: rating symbols not reproduced here, see the comparison table on slide 9]
Grow a forest of decision trees and determine the event class/target by majority vote
Weights of misclassified events are increased in the next iteration
Advantages:
– ignores weak variables
– works out of the box
Disadvantages:
– vulnerable to overtraining

9/21 No Single Best Classifier…
[Table: classifiers (Cuts, Likelihood, PDERS / k-NN, H-Matrix, Fisher, MLP, BDT, RuleFit, SVM) rated against the criteria: performance with no/linear correlations, performance with nonlinear correlations, training speed, response speed, robustness against overtraining, robustness with weak input variables, curse of dimensionality, transparency; the rating symbols are not reproduced in this transcript]
The properties of the function discriminant (FDA) depend on the chosen function

10/21 Neyman-Pearson Lemma
Neyman-Pearson: the likelihood ratio used as "selection criterion" y(x) gives for each selection efficiency the best possible background rejection, i.e. it maximizes the area under the "Receiver Operating Characteristic" (ROC) curve
[Figure: ROC curve, signal efficiency ε_signal vs background rejection 1−ε_backgr., both from 0 to 1; the diagonal corresponds to random guessing, curves further from it to better classification; the "limit" in the ROC curve is given by the likelihood ratio. Moving along the curve trades a small type-1 error against a large type-2 error and vice versa]
Varying the cut value in y(x) > cut moves the working point (efficiency and purity) along the ROC curve
How to choose the cut? One needs to know the prior probabilities (S, B abundances):
– Measurement of a signal cross section: maximum of S/√(S+B), or equivalently √(ε·p)
– Discovery of a signal: maximum of S/√B
– Precision measurement: high purity (p)
– Trigger selection: high efficiency (ε)
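Choosing the working point is then a one-dimensional scan along the ROC curve. A toy sketch of the two figure-of-merit scans named on the slide, with hypothetical per-cut signal and background yields (the yields would in practice come from the trained classifier's output distributions):

```cpp
#include <vector>
#include <cmath>
#include <cstddef>

// Scan candidate cuts and return the index that maximizes the discovery
// significance S/sqrt(B). s[i] and b[i] are the signal and background
// yields surviving cut i.
std::size_t bestCutDiscovery(const std::vector<double>& s, const std::vector<double>& b)
{
    std::size_t best = 0;
    double bestSig = -1.0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        if (b[i] <= 0.0) continue;                  // avoid division by zero
        const double sig = s[i] / std::sqrt(b[i]);
        if (sig > bestSig) { bestSig = sig; best = i; }
    }
    return best;
}

// Same scan for a cross-section measurement: maximize S/sqrt(S+B).
std::size_t bestCutMeasurement(const std::vector<double>& s, const std::vector<double>& b)
{
    std::size_t best = 0;
    double bestSig = -1.0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const double n = s[i] + b[i];
        if (n <= 0.0) continue;
        const double sig = s[i] / std::sqrt(n);
        if (sig > bestSig) { bestSig = sig; best = i; }
    }
    return best;
}
```

Note that the two criteria can prefer different working points: S/√B rewards very pure selections, while S/√(S+B) tolerates more background in exchange for signal efficiency.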

11/21 Performance with toy data
3-dimensional distribution; signal: sum of Gaussians, background: flat
Theoretical limit calculated using the Neyman-Pearson lemma
Neural net (MLP) with two hidden layers and backpropagation training; the Bayesian option has little influence on high-statistics training
The TMVA ANN converges towards the theoretical limit for sufficiently large N_train (~100k)

12/21 Recent developments
Current version: TMVA 4.1.2, in ROOT release 5.30
Unit-test framework for daily software and method-performance validation (C. Rosemann, E.v.T.)
Multiclass classification for MLP, BDTG, FDA
BDT: automatic parameter optimization for building the tree architecture
New method to treat data with distinct sub-populations (method Category)
Optional Bayesian treatment of ANN weights in MLP with backpropagation (Jiahang Zhong)
Extended PDEFoam functionality (A. Voigt)
Variable transformations on a user-defined subset of variables

13/21 Unit test (C. Rosemann, E.v.T.)
Automated framework to verify functionality and performance (ours is based on B. Eckel's description)
A slimmed version runs every night on various operating systems

***************************************************************
* TMVA - U N I T test : Summary *
***************************************************************
Test 0 : Event [107/107]....................................OK
Test 1 : VariableInfo [31/31]...............................OK
Test 2 : DataSetInfo [20/20]................................OK
Test 3 : DataSet [15/15]....................................OK
Test 4 : Factory [16/16]....................................OK
Test 7 : LDN_selVar_Gauss [4/4].............................OK
...
Test 107 : BoostedPDEFoam [4/4]..............................OK
Test 108 : BoostedDTPDEFoam [4/4]............................OK
Total number of failures: 0
***************************************************************

14/21 And now: switching from statistics to physics, acknowledging the hard work of our users

15/21 Review of recent results
CMS H→WW search: BDT trained on individual m_H samples with 10 variables. Expect 1.47 signal events at m_H = 160 GeV, compared to 1.27 events with a cut-based analysis (on 4 variables) and the same background. Phys. Lett. B699:25-47, 2011.

16/21 Review of recent results
Super-Kamiokande Coll., "Kinematic reconstruction of atmospheric neutrino events in a large water Cherenkov detector with proton identification", Phys. Rev. D79:112010, 2009.
7 input variables; MLP with one hidden layer
[Figure: MLP output distributions for signal and background]

17/21 Review of recent results
CDF Coll., "First Observation of Electroweak Single Top Quark Production", Phys. Rev. Lett. 103:092002, 2009.
BDT analysis with ~20 input variables: lepton + missing E_T + jets
Results for the s+t channel

18/21 Review of recent results using TMVA
CDF+D0 combined Higgs working group, hep-ex/1107.5518 (SVM)
CMS Coll., H→WW search, Phys. Lett. B699:25-47, 2011 (BDT)
IceCube Coll., astro-ph/1101.1692 (MLP)
D0 Coll., top pairs, Phys. Rev. D84:012008, 2011 (BDT)
IceCube Coll., Phys. Rev. D83:012001, 2011 (BDT)
IceCube Coll., Phys. Rev. D82:112003, 2010 (BDT)
D0 Coll., Higgs search, Phys. Rev. Lett. 105:251801, 2010 (BDT)
CDF Coll., single top, Phys. Rev. D82:112005, 2010 (BDT)
D0 Coll., single top, Phys. Lett. B690:5-14, 2010 (BDT)
D0 Coll., top pairs, Phys. Rev. D82:032002, 2010 (Likelihood)
CDF Coll., single top observation, Phys. Rev. Lett. 103:092002, 2009 (BDT)
Super-Kamiokande Coll., Phys. Rev. D79:112010, 2009 (MLP)
BABAR Coll., Phys. Rev. D79:051101, 2009 (BDT)
+ other papers
+ many ATLAS results about to be published

19/21 Review of recent results using TMVA (list repeated from the previous slide)
Thank you for using TMVA!

20/21 Summary
TMVA is a versatile package for classification and regression tasks
Integrated into ROOT
Easy to train classifiers/regression methods
A multitude of physics results based on TMVA are coming out
Thank you for your attention!

21/21 Credits
Several similar data-mining efforts with rising importance in most fields of science and industry
TMVA is open-source software; use and redistribution of the source are permitted according to the terms of the BSD license
Contributors to TMVA: Andreas Hoecker (CERN, Switzerland), Jörg Stelzer (CERN, Switzerland), Peter Speckmayer (CERN, Switzerland), Jan Therhaag (Universität Bonn, Germany), Eckhard von Toerne (Universität Bonn, Germany), Helge Voss (MPI für Kernphysik Heidelberg, Germany), Moritz Backes (Geneva University, Switzerland), Tancredi Carli (CERN, Switzerland), Asen Christov (Universität Freiburg, Germany), Or Cohen (CERN, Switzerland and Weizmann, Israel), Krzysztof Danielowski (IFJ and AGH/UJ, Krakow, Poland), Dominik Dannheim (CERN, Switzerland), Sophie Henrot-Versille (LAL Orsay, France), Matthew Jachowski (Stanford University, USA), Kamil Kraszewski (IFJ and AGH/UJ, Krakow, Poland), Attila Krasznahorkay Jr. (CERN, Switzerland, and Manchester U., UK), Maciej Kruk (IFJ and AGH/UJ, Krakow, Poland), Yair Mahalalel (Tel Aviv University, Israel), Rustem Ospanov (University of Texas, USA), Xavier Prudent (LAPP Annecy, France), Arnaud Robert (LPNHE Paris, France), Christoph Rosemann (DESY), Doug Schouten (S. Fraser U., Canada), Fredrik Tegenfeldt (Iowa University, USA, until Aug 2007), Alexander Voigt (CERN, Switzerland), Kai Voss (University of Victoria, Canada), Marcin Wolter (IFJ PAN Krakow, Poland), Andrzej Zemla (IFJ PAN Krakow, Poland), Jiahang Zhong (Academia Sinica, Taipei).

22/21 Spare Slides

23/21 A complete TMVA training/testing session
Create the factory, add variables/targets, initialize the trees, book the MVA methods, then train, test and evaluate:

void TMVAnalysis()
{
   // Create the factory
   TFile* outputFile = TFile::Open( "TMVA.root", "RECREATE" );
   TMVA::Factory* factory = new TMVA::Factory( "MVAnalysis", outputFile, "!V" );

   // Add variables / targets
   TFile* input = TFile::Open( "tmva_example.root" );
   factory->AddVariable( "var1+var2", 'F' );
   factory->AddVariable( "var1-var2", 'F' );
   // factory->AddTarget( "tarval", 'F' );               // for regression

   // Initialize trees
   factory->AddSignalTree    ( (TTree*)input->Get("TreeS"), 1.0 );
   factory->AddBackgroundTree( (TTree*)input->Get("TreeB"), 1.0 );
   // factory->AddRegressionTree( (TTree*)input->Get("regTree"), 1.0 );
   factory->PrepareTrainingAndTestTree( "", "",
      "nTrain_Signal=200:nTrain_Background=200:nTest_Signal=200:nTest_Background=200:NormMode=None" );

   // Book MVA methods
   factory->BookMethod( TMVA::Types::kLikelihood, "Likelihood",
      "!V:!TransformOutput:Spline=2:NSmooth=5:NAvEvtPerBin=50" );
   factory->BookMethod( TMVA::Types::kMLP, "MLP",
      "!V:NCycles=200:HiddenLayers=N+1,N:TestRate=5" );

   // Train, test and evaluate
   factory->TrainAllMethods();   // factory->TrainAllMethodsForRegression();
   factory->TestAllMethods();
   factory->EvaluateAllMethods();

   outputFile->Close();
   delete factory;
}

24/21 What is a multi-variate analysis?
"Combine" all input variables into one output variable
Supervised learning means learning by example: the program extracts patterns from the training data
[Figure: input variables → classifier → output]

25/21 Metaclassifiers: Category Classifier and Boosting
The category classifier is custom-made for HEP:
– use different classifiers for different phase-space regions and combine them into a single output
TMVA supports boosting for all classifiers:
– use a collection of "weak learners" to improve their performance (boosted Fisher, boosted neural nets with few neurons each, …)

