TerraFerMA: A Suite of Multivariate Analysis Tools
Sherry Towers, SUNY-SB
Version 1.0 has been released! Usable by anyone with access to the CLHEP and Root libraries. www-d0.fnal.gov/~smjt/multiv.html

TerraFerMA = Fermilab Multivariate Analysis (aka FerMA). A convenient interface to various multivariate analysis packages (e.g. MLPfit, Jetnet, PDE/GEM, Fisher discriminant, binned likelihood, etc.). The user fills signal and background (and data) Samples, which are then used as input to the FerMA methods… Also includes a method to sort variables, to determine which are the best discriminators between signal and background. A sketch of this workflow follows below.
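A minimal sketch of that fill-Samples-then-analyse workflow. The real FerMA class and method names are not given in these slides, so every identifier below ("Sample", "Ferma", "add", "probability") is a hypothetical stand-in, not the real TerraFerMA API:

```cpp
// Hypothetical sketch of the TerraFerMA workflow described above: fill
// signal and background Samples, then hand them to a multivariate method.
// All names here are invented stand-ins, NOT the real TerraFerMA API.
#include <vector>

using Event = std::vector<double>;             // one event's discriminators

struct Sample {                                 // container the user fills
    std::vector<Event> events;
    void add(const Event& e) { events.push_back(e); }
};

int main() {
    Sample signal, background;
    signal.add({1.2, 0.4});                     // from a signal MC n-tuple
    background.add({0.1, -0.3});                // from a background MC n-tuple

    // A FerMA-style method would then be trained on the two Samples and
    // asked for the signal probability of a data event, e.g.:
    //   Ferma ferma(signal, background, Ferma::kProCor);
    //   double pSignal = ferma.probability(dataEvent);
    return 0;
}
```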

Also includes useful stats tools (correlations, means, RMSs), and a method to detect outliers. Using a multivariate package chosen by the user, FerMA will yield the probability that a data event is signal or background. TerraFerMA makes it trivial to compare the performance of different multivariate techniques! It also makes it easy to reduce the number of discriminators used in an analysis!

Overview of common multivariate analysis techniques. Simple techniques ignore all correlations between the discriminators: for example, techniques based on square cuts, or likelihood techniques which obtain the multi-D likelihood from the product of 1-D likelihoods. Advantage: fast, easy to understand, and easy to tell if the modelling of the data is sound. Disadvantage: useful discriminating information is lost if correlations are ignored. FerMA includes a method to determine optimal square cuts in a multidimensional parameter space. A sketch of the product-of-1-D-likelihoods idea follows below.
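A minimal sketch of that product-of-1-D-likelihoods idea (illustration only, not TerraFerMA code; Gaussian PDFs stand in for 1-D distributions that would really be estimated from MC):

```cpp
// Approximate the multi-D likelihood of an event by multiplying independent
// 1-D PDFs, one per discriminator, ignoring all correlations.
#include <cmath>
#include <cstdio>
#include <vector>

const double PI = 3.14159265358979323846;

double gauss(double x, double mean, double sigma) {
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * std::sqrt(2.0 * PI));
}

int main() {
    std::vector<double> event = {0.8, 0.2, 1.1};   // one data event
    std::vector<double> muS = {1.0, 0.0, 1.0};     // signal 1-D means
    std::vector<double> muB = {0.0, 0.0, 0.0};     // background 1-D means

    double lSig = 1.0, lBkg = 1.0;
    for (size_t i = 0; i < event.size(); ++i) {    // correlations ignored
        lSig *= gauss(event[i], muS[i], 1.0);
        lBkg *= gauss(event[i], muB[i], 1.0);
    }
    // Signal probability from the likelihood ratio:
    std::printf("P(signal) = %.3f\n", lSig / (lSig + lBkg));
    return 0;
}
```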

More powerful... More complicated techniques take into account simple (linear) correlations between the discriminators:
- ANOVA/MANCOVA
- H-Matrix
- Fisher discriminant
- Principal component analysis
- Probability correlation transformations
- Optimal Observables
- and many, many more…
Advantage: fast, more powerful. Disadvantage: can be a bit harder to understand, and systematics can be harder to assess; it is also harder to tell if the modelling of the data is sound. A Fisher-discriminant sketch follows below.
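Since the Fisher discriminant is the classic example of a linear technique, here is a minimal 2-D sketch (illustration only, not TerraFerMA code; the numbers are made up):

```cpp
// 2-D Fisher discriminant: the direction w = W^-1 (muS - muB), with W the
// within-class covariance matrix, maximises the separation of the projected
// class means; events are then classified by their projection x.w.
#include <cstdio>

int main() {
    // Class means and a shared 2x2 within-class covariance matrix W,
    // as would be estimated from signal/background Monte Carlo.
    double muS[2] = {1.0, 0.5}, muB[2] = {0.0, 0.0};
    double W[2][2] = {{1.0, 0.3}, {0.3, 1.0}};

    // Invert W analytically (2x2 case).
    double det = W[0][0] * W[1][1] - W[0][1] * W[1][0];
    double Winv[2][2] = {{ W[1][1] / det, -W[0][1] / det},
                         {-W[1][0] / det,  W[0][0] / det}};

    // Fisher direction w = W^-1 (muS - muB).
    double d[2] = {muS[0] - muB[0], muS[1] - muB[1]};
    double w[2] = {Winv[0][0] * d[0] + Winv[0][1] * d[1],
                   Winv[1][0] * d[0] + Winv[1][1] * d[1]};

    double x[2] = {0.7, 0.1};                    // one event
    double f = w[0] * x[0] + w[1] * x[1];        // Fisher score
    std::printf("w = (%.3f, %.3f), score = %.3f\n", w[0], w[1], f);
    return 0;
}
```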

Probability correlation transformations (ProCor). ProCor is the default multivariate package in TerraFerMA:
- Very fast
- (Relatively) easy to understand
Essentially, ProCor maps every point in the signal (or background) MC onto a multi-dimensional Gaussian PDF:
- The mapping is optimal for MC sets with linear correlations between variables
- If the mapping is not optimal, ProCor tells you!
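The slides do not spell out the ProCor algorithm. Purely as an illustration of mapping data onto a Gaussian, one standard construction is the probability integral transform: replace each value by its empirical CDF rank, then apply the inverse Gaussian CDF. The marginals become Gaussian while linear correlations survive, which fits the statement that the mapping suits linearly correlated MC sets:

```cpp
// Illustration only (not the documented ProCor algorithm): Gaussianise one
// variable via empirical-CDF rank followed by the inverse Gaussian CDF.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

const double PI = 3.14159265358979323846;

// Inverse of erf via Newton's method (std::erf is in <cmath>).
double erfinv(double t) {
    double z = 0.0;
    for (int i = 0; i < 50; ++i) {
        double err = std::erf(z) - t;
        z -= err / (2.0 / std::sqrt(PI) * std::exp(-z * z));
    }
    return z;
}

int main() {
    std::vector<double> v = {0.2, 1.7, 0.9, 3.1, 0.4};  // one MC variable
    std::vector<double> sorted = v;
    std::sort(sorted.begin(), sorted.end());

    for (double x : v) {
        // Empirical CDF rank in (0,1), then inverse Gaussian CDF.
        double r = std::lower_bound(sorted.begin(), sorted.end(), x)
                 - sorted.begin();
        double u = (r + 1.0) / (sorted.size() + 1.0);
        double z = std::sqrt(2.0) * erfinv(2.0 * u - 1.0);
        std::printf("%5.2f -> %6.3f\n", x, z);
    }
    return 0;
}
```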

Most powerful...
Analytic/binned likelihood
- Advantage: easy to understand
- Disadvantage: difficult to implement with many variables
Neural Networks
- Advantage: powerful, reasonably fast
- Disadvantage: a black box! The method has many parameters, and systematics can be difficult to assess
Kernel Estimation (Gaussian Expansion Method = GEM; Static-Kernel Probability Density Estimation = PDE)
- Advantage: powerful, easy to understand; an unbinned estimate of the original PDF; few method parameters
- Disadvantage: a bit slow

Gaussian Expansion Method / Probability Density Estimation. All kernel PDF estimation methods are developed from a very simple idea… If a data point lies in a region where the clustering of signal MC points is relatively tight, and the clustering of bkgnd MC points is relatively loose, then that point is more likely to be signal.

GEM: Whether the clustering is relatively tight can be determined from the local covariance matrix, calculated from the nearest neighbours of the point. A sketch follows below.
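A sketch of a local covariance estimate from the k nearest neighbours (illustration only, not TerraFerMA code):

```cpp
// Find the k nearest MC points to a query point and compute the sample
// covariance of those neighbours: tight local clustering shows up as small
// covariance entries.
#include <algorithm>
#include <cstdio>
#include <vector>

using Point = std::vector<double>;

double dist2(const Point& a, const Point& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += (a[i] - b[i]) * (a[i] - b[i]);
    return s;
}

// Covariance of the k nearest neighbours of q within mc.
std::vector<std::vector<double>> localCov(const std::vector<Point>& mc,
                                          const Point& q, size_t k) {
    std::vector<Point> nn = mc;
    std::sort(nn.begin(), nn.end(), [&](const Point& a, const Point& b) {
        return dist2(a, q) < dist2(b, q);
    });
    nn.resize(k);                               // keep the k closest points
    size_t d = q.size();
    Point mean(d, 0.0);
    for (const Point& p : nn)
        for (size_t i = 0; i < d; ++i) mean[i] += p[i] / k;
    std::vector<std::vector<double>> cov(d, std::vector<double>(d, 0.0));
    for (const Point& p : nn)
        for (size_t i = 0; i < d; ++i)
            for (size_t j = 0; j < d; ++j)
                cov[i][j] += (p[i] - mean[i]) * (p[j] - mean[j]) / (k - 1);
    return cov;
}

int main() {
    std::vector<Point> mc = {{0.0, 0.1}, {0.2, -0.1}, {1.1, 0.9},
                             {0.1, 0.0}, {0.9, 1.2}};
    auto cov = localCov(mc, {0.1, 0.0}, 3);
    std::printf("local cov = [[%.3f, %.3f], [%.3f, %.3f]]\n",
                cov[0][0], cov[0][1], cov[1][0], cov[1][1]);
    return 0;
}
```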

GEM/PDE: But we also want an estimate of the probability density... GEM/PDE uses the idea that any continuous function can be modelled by a sum of kernel functions (similar to the idea behind a Fourier series). GEM and PDE use multi-dimensional Gaussian kernels: each Gaussian kernel is centred on an MC point, and the widths of the Gaussian come from the local covariance matrix at that point. A 1-D sketch follows below.
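A minimal 1-D sketch of such a kernel sum (illustration only, not TerraFerMA code; a single fixed kernel width stands in for the per-point widths that GEM/PDE would take from the local covariance matrix):

```cpp
// Gaussian-kernel density estimate: the PDF at x is the average of Gaussian
// kernels centred on the MC points.
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    const double PI = 3.14159265358979323846;
    std::vector<double> mc = {0.1, 0.4, 0.5, 0.9, 1.2};  // MC points
    const double h = 0.3;                                 // kernel width

    for (double x = 0.0; x <= 1.5; x += 0.5) {
        double pdf = 0.0;
        for (double m : mc) {
            double z = (x - m) / h;
            pdf += std::exp(-0.5 * z * z) / (h * std::sqrt(2.0 * PI));
        }
        pdf /= mc.size();                                 // normalise
        std::printf("pdf(%.1f) = %.3f\n", x, pdf);
    }
    return 0;
}
```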

GEM/PDE: 1-D Gaussian kernel examples (plots).

Boring details...

The case for fewer discriminators… Using a large number of variables indiscriminately can indicate a lack of forethought in the design and conceptualization of an analysis.

The case for fewer discriminators… Also, each added variable makes it more difficult to determine whether the modelling of the data is sound, and makes the analysis harder to understand. And each added variable adds statistical noise, which can degrade the overall discrimination power!

Optimising discrimination… Maximise S/sqrt(S+B), or a similar figure of merit. An example of a cut scan follows below.
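For example, a one-dimensional cut can be optimised by scanning the cut value and keeping the one that maximises the figure of merit (illustration only; the counts below are made up):

```cpp
// Optimise a single cut by maximising S/sqrt(S+B), where S and B are the
// expected signal and background counts passing the cut.
#include <cmath>
#include <cstdio>

int main() {
    // Expected counts vs. cut value, e.g. from signal/background MC.
    double cuts[4] = {0.2, 0.4, 0.6, 0.8};
    double S[4]    = {95.0, 80.0, 60.0, 30.0};
    double B[4]    = {400.0, 150.0, 50.0, 20.0};

    double best = -1.0, bestCut = 0.0;
    for (int i = 0; i < 4; ++i) {
        double fom = S[i] / std::sqrt(S[i] + B[i]);   // S/sqrt(S+B)
        std::printf("cut %.1f: S/sqrt(S+B) = %.2f\n", cuts[i], fom);
        if (fom > best) { best = fom; bestCut = cuts[i]; }
    }
    std::printf("best cut: %.1f\n", bestCut);
    return 0;
}
```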

The curse of too many variables. Signal: a 5D Gaussian with mean (1,0,0,0,0) and widths (1,1,1,1,1). Bkgnd: a 5D Gaussian with mean (0,0,0,0,0) and widths (1,1,1,1,1). The only difference between signal and background is in the first dimension; the other four dimensions are 'useless discriminators'. A toy sketch follows below.
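A toy sketch of this effect (illustration only, not from the slides; a nearest-neighbour rule stands in for a generic multivariate method):

```cpp
// Classify toy events with a nearest-neighbour rule using the first d of
// the 5 dimensions. Only dimension 0 separates signal from background, so
// statistical noise raises the error rate as the useless dimensions are added.
#include <cstdio>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(42);
    std::normal_distribution<double> g(0.0, 1.0);
    const int N = 200, D = 5;

    auto make = [&](int label) {               // label 1 = signal
        std::vector<double> e(D);
        for (int i = 0; i < D; ++i)
            e[i] = g(rng) + (label == 1 && i == 0 ? 1.0 : 0.0);
        return e;
    };
    std::vector<std::vector<double>> train, test;
    std::vector<int> ltrain, ltest;
    for (int n = 0; n < N; ++n)
        for (int lab = 0; lab < 2; ++lab) {
            train.push_back(make(lab)); ltrain.push_back(lab);
            test.push_back(make(lab));  ltest.push_back(lab);
        }

    for (int d = 1; d <= D; ++d) {             // use the first d dimensions
        int wrong = 0;
        for (size_t t = 0; t < test.size(); ++t) {
            double best = 1e30; int pred = 0;
            for (size_t k = 0; k < train.size(); ++k) {
                double s = 0.0;
                for (int i = 0; i < d; ++i) {
                    double diff = test[t][i] - train[k][i];
                    s += diff * diff;
                }
                if (s < best) { best = s; pred = ltrain[k]; }
            }
            if (pred != ltest[t]) ++wrong;
        }
        std::printf("d = %d: error rate = %.3f\n",
                    d, wrong / double(test.size()));
    }
    return 0;
}
```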

A real-world example… A Tevatron Run I analysis used a 7-variable NN to discriminate between signal and background. Were all 7 needed? The signal and background n-tuples were run through the TerraFerMA interface to the variable-sorting method…

Another real-world example… A Tevatron physics-object-ID method uses 9 variables in the analysis. How many are actually needed?

Summary: Careful examination of the discriminators used in a multivariate analysis is always a good idea! Reducing the number of variables can simplify an analysis considerably, and can even increase its discrimination power!

TerraFerMA Version 1.0
TerraFerMA documentation: www-d0.fnal.gov/~smjt/ferma.ps
TerraFerMA user's guide: www-d0.fnal.gov/~smjt/guide.ps
TerraFerMA package: …/ferma.tar.gz (includes an example program in examples/simple/simple.cpp)

TerraFerMA Version 1.0. Soon to be included:
- Support Vector Machines
- Methods to fit for the fractions of signal and bkgrnd in a data sample
- Ensembles (many samples grouped together)
- Enhanced ability to write out / read in NN weights
Want more? Let me know!

Summary: TerraFerMA is a platform of very powerful multivariate analysis tools. In all test applications to date, TerraFerMA has significantly improved existing analyses!