
Comparison of Bayesian Neural Networks with TMVA classifiers
Richa Sharma, Vipin Bhatnagar
Panjab University, Chandigarh
India-CMS March 2009 Meeting, DU, Delhi

CONTENTS
● Multivariate techniques and their advantages
● BNN
  - Introduction
  - Algorithm
  - Software used
● Work done
● Comparison of BNN with some of the classifiers in TMVA

MULTIVARIATE TECHNIQUES AND THEIR ADVANTAGES
● Many statistical techniques focus on just one or two variables
● Multivariate analysis techniques allow more than two variables to be analyzed at once
● Useful when there is correlation among the variables
● More information is analyzed simultaneously, giving greater power
● Variable selection (e.g., to give maximum signal/background discrimination)

TMVA (Toolkit for Multivariate Analysis)
The Toolkit for Multivariate Analysis (TMVA) provides a ROOT-integrated machine learning environment for the processing and parallel evaluation of sophisticated multivariate classification techniques. The package includes:
● Rectangular cut optimization
● Projective likelihood estimation (PDE approach)
● Multidimensional probability density estimation (PDE range-search approach)
● Multidimensional k-nearest-neighbour classifier
● Linear discriminant analysis (H-Matrix and Fisher discriminants)
● Function discriminant analysis (FDA)
● Artificial neural networks (three different implementations)
● Boosted/bagged decision trees
● Predictive learning via rule ensembles (RuleFit)
● Support Vector Machine (SVM)
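For concreteness, a minimal sketch of how such classifiers are booked and trained with the classic TMVA Factory interface; the tree names, variable names (x1, x2), and option strings here are hypothetical, not taken from the talk:

```cpp
// book_tmva.C -- minimal TMVA classification sketch (classic Factory API).
// Hypothetical input trees "sigTree"/"bkgTree" with branches x1, x2.
#include "TFile.h"
#include "TTree.h"
#include "TMVA/Factory.h"
#include "TMVA/Types.h"

void book_tmva(TTree* sigTree, TTree* bkgTree) {
   TFile* outFile = TFile::Open("TMVA.root", "RECREATE");
   TMVA::Factory factory("TMVAClassification", outFile, "!V:!Silent");

   // Input variables used for the discrimination
   factory.AddVariable("x1", 'F');
   factory.AddVariable("x2", 'F');

   // Register signal and background trees with unit event weights
   factory.AddSignalTree(sigTree, 1.0);
   factory.AddBackgroundTree(bkgTree, 1.0);
   factory.PrepareTrainingAndTestTree("", "SplitMode=Random:NormMode=NumEvents");

   // Book a few of the classifiers compared in this talk
   factory.BookMethod(TMVA::Types::kBDT,        "BDT",        "NTrees=400");
   factory.BookMethod(TMVA::Types::kLikelihood, "Likelihood", "");
   factory.BookMethod(TMVA::Types::kFisher,     "Fisher",     "");
   factory.BookMethod(TMVA::Types::kMLP,        "MLP",        "HiddenLayers=N+1");

   factory.TrainAllMethods();
   factory.TestAllMethods();
   factory.EvaluateAllMethods();
   outFile->Close();
}
```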

Neural Networks
● An extremely simplified model of the brain
● Transforms inputs into an output to the best of its ability
● The basic computational element is often called a node or unit
● Each input has an associated weight w, which can be modified
● The unit computes some function f of the weighted sum of its inputs:
  y = f(Σ_i w_i x_i)
[Diagram: input nodes → hidden nodes → output node]
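As a sketch of this computation (my illustration, not from the slides), a single unit with a logistic-sigmoid f:

```cpp
#include <cmath>
#include <vector>

// One neural-network unit: applies an activation f to the weighted
// sum of its inputs (plus a bias term).
double unit_output(const std::vector<double>& x,
                   const std::vector<double>& w, double bias) {
    double sum = bias;
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += w[i] * x[i];               // weighted sum of inputs
    return 1.0 / (1.0 + std::exp(-sum));  // f = logistic sigmoid
}
```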

Bayesian Neural Networks
● A BNN is a model for classification based on Bayesian statistics
● The goal of a BNN is to approximate the posterior class-conditional probabilities

BAYES' THEOREM
P(C_k|x) = P(x|C_k) P(C_k) / P(x)

where x = (x_1, x_2, ..., x_P), P being the number of variables
C_k are the classes (signal and background)
P(C_k|x) = conditional probability that an object described by x belongs to class C_k
P(x|C_k) = conditional probability that an object is of the form x, given that it is of class C_k
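As a worked two-class illustration of the theorem (the likelihood and prior values below are hypothetical, not from the talk):

```cpp
#include <cstdio>

// Posterior probability of signal for a two-class problem via Bayes' theorem:
// P(S|x) = P(x|S) P(S) / [ P(x|S) P(S) + P(x|B) P(B) ]
double posterior_signal(double px_sig, double px_bkg,
                        double prior_sig, double prior_bkg) {
    const double evidence = px_sig * prior_sig + px_bkg * prior_bkg; // P(x)
    return px_sig * prior_sig / evidence;
}

int main() {
    // Hypothetical numbers: likelihoods from density estimates, equal priors
    std::printf("P(S|x) = %.3f\n", posterior_signal(0.8, 0.2, 0.5, 0.5)); // 0.800
}
```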

FBM (Flexible Bayesian Modelling) package by Radford M. Neal
The package is organized in a modular fashion:
● UTIL – specifies the final portion of a probabilistic model, supports reading of numeric input from data files or other sources, and specifies sets of training and test cases
● MC – provides support for MCMC methods
● DIST – contains programs for sampling from a distribution specified by formulas for the prior and likelihood of a Bayesian model
● NET – implements Bayesian learning based on MLP networks

WORK DONE
● Ported the BNN package used in D0 to a local machine as a standalone package
  - macros modified to remove the D0 dependence
  - library dependences changed for compilation
● No D0 environment-specific dependence remains
● Tested it on independent samples
● Compared the signal/background discrimination efficiency of the BNN with some of the classifiers in TMVA
(Thanks to S. Jain, OSU/D0 for valuable inputs on the BNN setup/testing)

STEPS INVOLVED IN MAKING AN ANALYSIS
1. Make a list of signal and background files and the variables on which to train
2. Read the ROOT files and write out the variables in a text file
3. Create unweighted events from weighted ones (see the sketch below)
4. Mix the signal and background training files
5. Run the FBM code to get the Bayesian probabilities
   - run the Markov chain
   - average over the previous iterations
6. The BNN filter is created and used in the final analysis with data
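Step 3 (unweighting) is typically done by acceptance-rejection; a minimal sketch of the idea (my illustration, not the actual D0/BNN code):

```cpp
#include <random>
#include <vector>

struct Event { std::vector<double> vars; double weight; };

// Turn a weighted sample into an unweighted one by accepting each event
// with probability weight / maxWeight (acceptance-rejection).
std::vector<Event> unweight(const std::vector<Event>& in, double maxWeight,
                            unsigned seed = 42) {
    std::mt19937 gen(seed);
    std::uniform_real_distribution<double> flat(0.0, 1.0);
    std::vector<Event> out;
    for (const Event& ev : in) {
        if (flat(gen) < ev.weight / maxWeight) {
            Event accepted = ev;
            accepted.weight = 1.0;   // accepted events carry unit weight
            out.push_back(accepted);
        }
    }
    return out;
}
```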

BNN performance using D0 MC data

Verification plots

BNN performance using a toy example

Plots showing efficiency of the BNN

Efficiency Plots of the classifiers in TMVA

COMPARISON BETWEEN BNN AND TMVA CLASSIFIERS
[Table: for each classifier (BDT, Likelihood, PDERS, Fisher, MLP, BNN), the cut applied and the resulting significance S/√(S+B); the numerical values did not survive the transcript.]
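For reference, the figure of merit in the table is computed from the signal and background counts S and B passing a given cut; a minimal sketch with hypothetical counts:

```cpp
#include <cmath>
#include <cstdio>

// Significance figure of merit used in the comparison: S / sqrt(S + B),
// where S and B are the signal and background counts passing the cut.
double significance(double nSig, double nBkg) {
    return nSig / std::sqrt(nSig + nBkg);
}

int main() {
    // Hypothetical counts after a classifier cut
    std::printf("S/sqrt(S+B) = %.2f\n", significance(100.0, 400.0)); // 4.47
}
```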

Results and future plan of work
● The efficiency of the BNN is comparable to that of the classifiers in TMVA, and is therefore good enough to promote the BNN for CMS
● It can be used as a cross-check of the studies already done using TMVA
● We therefore plan to test it on the data samples available in our group and then use it for physics studies

ALGORITHM FOR NEURAL NETWORKS
● The aim is to find an optimum classifier n(x, θ) that minimizes the rate of misclassification
● Minimize the empirical risk function with respect to θ:
  E(θ) = (1/N) Σ_i [t_i − n(x_i, θ)]²
● One can show that the error is minimized when
  n(x, θ) = ∫ t p(t|x) dt
● This yields one set of weights θ, which corresponds to a mapping function n(x, θ)
● Prone to overfitting
● Therefore we use a BNN, i.e., we average the network over the entire θ space:
  n(x′|t, x) = ∫ n(x′, θ) p(θ|t, x) dθ
● This produces a function n that is more stable and less likely to be overtrained
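In practice the last integral is estimated by averaging the network over parameter sets θ_k sampled from the posterior by MCMC (which is what FBM provides); a minimal sketch of the idea, with a toy one-unit network standing in for the real MLP:

```cpp
#include <cmath>
#include <vector>

// Toy network output n(x, theta): a single sigmoid unit whose weights and
// bias are packed into theta = (w_1..w_P, bias). Stands in for the real MLP.
double network_output(const std::vector<double>& x,
                      const std::vector<double>& theta) {
    double sum = theta.back();                    // bias term
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += theta[i] * x[i];
    return 1.0 / (1.0 + std::exp(-sum));
}

// BNN prediction: Monte Carlo estimate of
//   n(x'|t,x) = integral of n(x',theta) p(theta|t,x) dtheta
// using parameter sets theta_k drawn from the posterior by MCMC.
double bnn_output(const std::vector<double>& x,
                  const std::vector<std::vector<double>>& mcmc_samples) {
    double sum = 0.0;
    for (const auto& theta : mcmc_samples)
        sum += network_output(x, theta);
    return sum / mcmc_samples.size();             // average over theta samples
}
```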