Advanced Analysis Techniques in HEP (Pushpa Bhat, Fermilab; ACAT2000, Fermilab, IL, Oct. 16-20, 2000)

Presentation transcript:

Advanced Analysis Techniques in HEP
Pushpa Bhat, Fermilab
ACAT2000, Fermilab, IL, Oct. 16-20, 2000

"A reasonable man adapts himself to the world. An unreasonable man persists in trying to adapt the world to himself. So, all progress depends on the unreasonable one." - Bernard Shaw

Outline
- Introduction
- Intelligent Detectors: moving intelligence closer to action
- Optimal Analysis Methods: the Neural Network revolution
- New Searches & Precision Measurements: discovery reach for the Higgs boson; measuring the top quark mass and the Higgs mass
- Sophisticated Approaches: probabilistic approach to data analysis
- Summary

[Diagram: flow from "World before experiment" to "World after experiment", with stages Data Collection; Data Transformation; Feature Extraction; Global Decision; Data Interpretation; Data Organization, Reduction, Analysis; Express Analysis]

Intelligent Detectors
- Data analysis starts when a high-energy event occurs.
- Transform electronic data into useful "physics" information in real time.
- Move intelligence closer to action!
- Algorithm-specific hardware: neural networks in silicon.
- Configurable hardware (FPGAs, DSPs): implement "smart" algorithms in hardware.
- Innovative on-line data management plus "smart" algorithms in hardware: data in RAM disk and AI algorithms in FPGAs.
- Expert systems for control and monitoring.

Data Analysis Tasks
- Particle identification: e-ID,  -ID, b-ID, e/ , q/g
- Signal/background event classification: signals of new physics are rare and small (finding a "jewel" in a haystack)
- Parameter estimation: t mass, H mass, track parameters, for example
- Function approximation: correction functions, tag rates, fake rates
- Data exploration: knowledge discovery via data mining; data-driven extraction of information, latent structure analysis

Optimal Analysis Methods
- Because the measurements are multivariate, the optimal methods of analysis are necessarily multivariate.
- Discriminant analysis: partition the multidimensional variable space and identify boundaries.
- Cluster analysis: assign objects to groups based on similarity.
- Examples: Fisher linear discriminant, Gaussian classifier; kernel-based methods, K-nearest neighbor (clustering) methods; adaptive/AI methods.

Why Multivariate Methods? Because they are optimal!
[Figure: distributions in the (x1, x2) plane with a linear discriminant D(x1,x2) = 2.014 x1 … x2]
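A linear discriminant of the form D(x1, x2) = w1 x1 + w2 x2 can be illustrated with a toy Fisher discriminant. The sketch below is an illustration under stated assumptions, not the analysis behind the slide. The correlated Gaussian toy data and the helper names (`toy_sample`, `fisher_discriminant`, `accuracy`) are hypothetical. It compares a cut on x1 alone with a cut on the Fisher combination.

```python
import random

def mean2(points):
    """Mean of a list of (x1, x2) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def scatter2(points, mu):
    """Entries (s11, s12, s22) of the scatter matrix about the mean mu."""
    s11 = sum((p[0] - mu[0]) ** 2 for p in points)
    s12 = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    s22 = sum((p[1] - mu[1]) ** 2 for p in points)
    return s11, s12, s22

def fisher_discriminant(signal, background):
    """Return weights (w1, w2) and threshold t; call an event signal if w.x > t."""
    mu_s, mu_b = mean2(signal), mean2(background)
    a, b = scatter2(signal, mu_s), scatter2(background, mu_b)
    s11, s12, s22 = a[0] + b[0], a[1] + b[1], a[2] + b[2]
    det = s11 * s22 - s12 * s12
    d1, d2 = mu_s[0] - mu_b[0], mu_s[1] - mu_b[1]
    # w = S_w^{-1} (mu_s - mu_b), using the closed-form 2x2 inverse
    w1 = (s22 * d1 - s12 * d2) / det
    w2 = (s11 * d2 - s12 * d1) / det
    # Threshold halfway between the projected class means
    t = 0.5 * (w1 * (mu_s[0] + mu_b[0]) + w2 * (mu_s[1] + mu_b[1]))
    return (w1, w2), t

def accuracy(score, threshold, signal, background):
    """Fraction of events classified correctly by the cut score > threshold."""
    right = sum(score(p) > threshold for p in signal)
    right += sum(score(p) <= threshold for p in background)
    return right / (len(signal) + len(background))

def toy_sample(rng, shift, n):
    """Hypothetical toy data: two strongly correlated variables, shifted apart."""
    points = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)  # common component shared by x1 and x2
        points.append((u + 0.3 * rng.gauss(0.0, 1.0) + shift,
                       u + 0.3 * rng.gauss(0.0, 1.0) - shift))
    return points

rng = random.Random(1)
signal = toy_sample(rng, +0.5, 2000)
background = toy_sample(rng, -0.5, 2000)

(w1, w2), t = fisher_discriminant(signal, background)
acc_cut = accuracy(lambda p: p[0], 0.0, signal, background)
acc_fisher = accuracy(lambda p: w1 * p[0] + w2 * p[1], t, signal, background)
print(f"cut on x1 alone: {acc_cut:.3f}   Fisher discriminant: {acc_fisher:.3f}")
```

In this toy, each variable on its own overlaps heavily between the two classes, while the linear combination separates them well. That is the sense in which a multivariate treatment can outperform cuts on single variables.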

Also, they need to have optimal flexibility/complexity.
[Figure: Mth-order polynomial fits in the (x1, x2) plane: M=1 (simple), M=3 (flexible), M=10 (highly flexible)]
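The trade-off between simple and highly flexible fits can be reproduced with a small experiment. This is a minimal sketch, assuming NumPy is available. The sine-plus-noise toy data are hypothetical and are not the data shown on the slide. The orders M = 1, 3, 10 are the ones named in the figure.

```python
import numpy as np

rng = np.random.default_rng(0)

def truth(x):
    """Hypothetical underlying function for the toy data."""
    return np.sin(2.0 * np.pi * x)

# Small training set and a larger, independent validation set
x_train = np.linspace(0.0, 1.0, 15)
y_train = truth(x_train) + rng.normal(0.0, 0.2, x_train.size)
x_val = np.linspace(0.0, 1.0, 200)
y_val = truth(x_val) + rng.normal(0.0, 0.2, x_val.size)

def rms(coeffs, x, y):
    """Root-mean-square residual of the polynomial with the given coefficients."""
    return float(np.sqrt(np.mean((np.polyval(coeffs, x) - y) ** 2)))

train_rms, val_rms = {}, {}
for order in (1, 3, 10):
    coeffs = np.polyfit(x_train, y_train, order)  # least-squares fit of order M
    train_rms[order] = rms(coeffs, x_train, y_train)
    val_rms[order] = rms(coeffs, x_val, y_val)
    print(f"M={order:2d}  train RMS={train_rms[order]:.3f}  validation RMS={val_rms[order]:.3f}")
```

The training residual can only shrink as M grows, so it cannot be used to choose the complexity. The error on independent data is the quantity to watch.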

The Golden Rule: Keep it simple. As simple as possible. Not any simpler. - Einstein

Optimal Event Selection: r(x) = constant defines decision boundaries that minimize the probability of misclassification. So, the problem mathematically reduces to that of calculating r(x), the Bayes discriminant function, or the probability densities from which the posterior probability follows.
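The slide's own formulas did not survive in this transcript. For orientation, the standard textbook relations between a likelihood-ratio discriminant and the posterior probability are sketched below. The notation (s for signal, b for background, p(s) and p(b) for the prior probabilities) is an assumption and may differ from the original slide.

```latex
% Standard relations, written in assumed notation
r(x) = \frac{p(x \mid s)}{p(x \mid b)}, \qquad
p(s \mid x) = \frac{p(x \mid s)\, p(s)}{p(x \mid s)\, p(s) + p(x \mid b)\, p(b)}
            = \frac{r(x)}{r(x) + p(b)/p(s)} .
```

For equal priors this reduces to p(s|x) = r/(1 + r). The posterior is a monotonic function of r, so a cut on either quantity selects the same events.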


Probability Density Estimators
- Histogramming: the basic problem of non-parametric density estimation is very simple!
- Histogram the data in M bins in each of the d feature variables: M^d bins → Curse of Dimensionality.
- In high dimensions, we would either require a huge number of data points, or most of the bins would be empty, leading to an estimated density of zero.
- But the variables are generally correlated and hence tend to be restricted to a sub-space → Intrinsic Dimensionality.
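The growth of the bin count as M^d can be made concrete with a short counting experiment. The sketch below uses assumed values (10,000 uniformly distributed points, M = 10 bins per variable), and `occupied_fraction` is a hypothetical helper. Uniform data are the worst case, since real correlated variables occupy a lower-dimensional sub-space.

```python
import random

def occupied_fraction(n_points, m_bins, d, rng):
    """Fraction of the m_bins**d histogram cells hit by n_points uniform points."""
    cells = set()
    for _ in range(n_points):
        # Bin index in each of the d variables; random() lies in [0, 1)
        cell = tuple(min(int(rng.random() * m_bins), m_bins - 1) for _ in range(d))
        cells.add(cell)
    return len(cells) / m_bins ** d

rng = random.Random(0)
fractions = {}
for d in range(1, 7):
    fractions[d] = occupied_fraction(10000, 10, d, rng)
    print(f"d={d}: {10 ** d:>8d} bins, fraction occupied = {fractions[d]:.4f}")
```

With a fixed sample size, the occupied fraction collapses once the number of bins exceeds the number of events. Every empty bin corresponds to an estimated density of zero.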

Kernel-Based Methods
- Akin to histogramming, but adopts importance sampling.
- Place in d-dimensional space a hypercube of side h centered on each data point x_n; the density estimate at x is built from the hypercubes that contain x.
- The estimate will have discontinuities. These can be smoothed out by using different forms for the kernel function H(u). A common choice is a multivariate Gaussian kernel.
- Notation: N = number of data points; H(u) = 1 if x_n is in the hypercube, 0 otherwise; h = smoothing parameter.
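The slide's formula is missing from this transcript. A minimal sketch of a kernel estimate is given below, assuming the standard Parzen form p(x) ≈ (1 / (N h^d)) Σ_n H((x − x_n) / h), which uses the N, h and H(u) defined above. The function names and the Gaussian toy sample are hypothetical.

```python
import math
import random

def hypercube_kernel(u):
    """H(u) = 1 inside the unit hypercube centered at the origin, 0 otherwise."""
    return 1.0 if all(abs(c) <= 0.5 for c in u) else 0.0

def gaussian_kernel(u):
    """Smooth alternative: standard multivariate normal density."""
    d = len(u)
    return math.exp(-0.5 * sum(c * c for c in u)) / (2.0 * math.pi) ** (d / 2)

def parzen_density(x, data, h, kernel):
    """Assumed standard form: p(x) ~ (1 / (N h^d)) * sum_n H((x - x_n) / h)."""
    d = len(x)
    total = sum(kernel([(xi - pi) / h for xi, pi in zip(x, point)]) for point in data)
    return total / (len(data) * h ** d)

# Toy check in one dimension: points drawn from a unit Gaussian,
# whose true density at the origin is 1/sqrt(2*pi), about 0.399
rng = random.Random(2)
data = [(rng.gauss(0.0, 1.0),) for _ in range(5000)]
p_cube = parzen_density((0.0,), data, 0.3, hypercube_kernel)
p_gauss = parzen_density((0.0,), data, 0.3, gaussian_kernel)
print(f"hypercube kernel: {p_cube:.3f}   Gaussian kernel: {p_gauss:.3f}")
```

The smoothing parameter h plays the same role as the bin width in a histogram. If it is too small the estimate is noisy, and if it is too large the structure is washed out.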

K Nearest-Neighbor Method
- Place a hyper-sphere centered at each point x and allow its radius to grow, to a volume V, until it contains K data points. The density at x is then estimated from K, N and V.
- If our data set contains N_k points in class C_k and N points in total, the class-conditional densities and posterior probabilities follow from the counts inside V.
- Notation: N = number of data points; K_k = number of points in volume V for class C_k.
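The formulas on this slide are not preserved in the transcript. The sketch below assumes the standard K-nearest-neighbor estimates: density p(x) ≈ K / (N V) and class posterior p(C_k | x) ≈ K_k / K. The function names and the five-point toy data set are hypothetical.

```python
import math

def k_nearest(x, data, k):
    """Indices of the k points closest to x, and the distance to the k-th one."""
    order = sorted(range(len(data)), key=lambda i: math.dist(x, data[i]))
    nearest = order[:k]
    return nearest, math.dist(x, data[nearest[-1]])

def sphere_volume(radius, d):
    """Volume of a d-dimensional hyper-sphere."""
    return math.pi ** (d / 2) * radius ** d / math.gamma(d / 2 + 1)

def knn_density(x, data, k):
    """Assumed standard estimate p(x) ~ K / (N V)."""
    _, radius = k_nearest(x, data, k)
    return k / (len(data) * sphere_volume(radius, len(x)))

def knn_posterior(x, data, labels, k):
    """Assumed standard estimate p(C_k | x) ~ K_k / K for classes seen among the neighbors."""
    nearest, _ = k_nearest(x, data, k)
    counts = {}
    for i in nearest:
        counts[labels[i]] = counts.get(labels[i], 0) + 1
    return {label: n / k for label, n in counts.items()}

# Tiny hand-made example: "s" = signal, "b" = background
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
labels = ["s", "s", "b", "b", "b"]
x = (0.2, 0.1)
posterior = knn_posterior(x, data, labels, 3)
density = knn_density(x, data, 3)
print(posterior, density)
```

Unlike the fixed-width kernel, the volume here adapts to the local density of points, so sparse regions are averaged over a larger neighborhood.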

Discriminant Approximation with Neural Networks: The output of a feed-forward neural network can approximate the Bayesian posterior probability p(s|x,y) directly, without estimating class-conditional probabilities.

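Calculating the Discriminant: Consider the sum, over training events i, of the squared difference between the network output n(x_i, y_i, ω) and the target d_i, where d_i = 1 for signal, d_i = 0 for background, and ω is the vector of parameters. Minimizing this sum gives n(x,y,ω) → p(s|x,y) in the limit of large data samples, provided that the function n(x,y,ω) is flexible enough.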
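The claim that fitting targets of 1 (signal) and 0 (background) yields the posterior probability can be checked numerically. The sketch below is a simplified illustration rather than a neural network. It uses a piecewise-constant function of one variable as the "flexible enough" function, for which the least-squares minimum in each bin is simply the mean target. The two unit-width Gaussians and all names are hypothetical choices, and squared error is assumed as the loss.

```python
import math
import random

rng = random.Random(3)
# (x, d) pairs: d = 1 for signal, d = 0 for background, equal sample sizes
events = [(rng.gauss(+1.0, 1.0), 1) for _ in range(20000)]
events += [(rng.gauss(-1.0, 1.0), 0) for _ in range(20000)]

lo, hi, width = -2.0, 2.0, 0.5
n_bins = int((hi - lo) / width)
sum_d = [0.0] * n_bins
count = [0] * n_bins
for x, d in events:
    if lo <= x < hi:
        b = int((x - lo) / width)
        sum_d[b] += d
        count[b] += 1

# Minimizing sum_i (n_b - d_i)^2 in each bin gives n_b = mean of d_i in that bin
fitted = [sum_d[b] / count[b] for b in range(n_bins)]

# Exact posterior for these two Gaussians with equal priors: 1 / (1 + exp(-2x))
centers = [lo + (b + 0.5) * width for b in range(n_bins)]
exact = [1.0 / (1.0 + math.exp(-2.0 * c)) for c in centers]
max_diff = max(abs(f - e) for f, e in zip(fitted, exact))

for c, f, e in zip(centers, fitted, exact):
    print(f"x={c:+.2f}  fitted={f:.3f}  exact p(s|x)={e:.3f}")
```

A neural network trained with the same targets and loss approximates the same quantity, but with a smooth function that remains usable in many dimensions where binning fails.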

Neural Networks
- An NN estimates a mapping function without requiring a mathematical description of how the output formally depends on the input.
- The "hidden" transformation functions, g, adapt themselves to the data as part of the training process. The number of such functions needs to grow only as the complexity of the problem grows.
[Diagram: network with inputs x1, x2, x3, x4 and output D_NN]
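The structure in the diagram (four inputs, a layer of hidden functions g, one output) can be written down in a few lines. This is a minimal sketch of a forward pass only. The tanh hidden functions, the sigmoid output and the example weights are assumptions chosen for illustration, and training is not shown.

```python
import math

def sigmoid(a):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def network_output(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer: D = sigmoid(sum_j v_j * g_j(x) + c), with g_j = tanh(w_j . x + b_j)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sigmoid(sum(v * g for v, g in zip(output_weights, hidden)) + output_bias)

# Hypothetical weights for a 4-input, 3-hidden-node network (untrained)
hidden_weights = [[0.5, -0.2, 0.1, 0.0],
                  [-0.3, 0.8, 0.0, 0.4],
                  [0.1, 0.1, -0.6, 0.2]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.2, -0.7, 0.5]
output_bias = 0.05

d_nn = network_output([1.0, 0.5, -0.5, 2.0],
                      hidden_weights, hidden_biases, output_weights, output_bias)
print(f"D_NN = {d_nn:.3f}")
```

Adding rows to `hidden_weights` adds hidden functions, which is the knob that controls the flexibility of the mapping.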

Measuring the Top Quark Mass: The Discriminants
[Figure: distributions of the discriminant variables; shaded = top]

Measuring the Top Quark Mass (DØ, lepton+jets): m_t = ± 5.6 (stat.) ± 6.2 (syst.) GeV/c²
[Figure: background-rich and signal-rich samples]

Strategy for Discovering the Higgs Boson at the Tevatron P.C. Bhat, R. Gilmartin, H. Prosper, PRD 62 (2000) hep-ph/

Hints from the Analysis of Precision Data (LEP Electroweak Group): M_H = GeV/c²; M_H < 225 GeV/c² at 95% C.L.

Event Simulation
- Signal processes and backgrounds
- Event generation: WH, ZH, ZZ and top with PYTHIA; Wbb, Zbb with CompHEP, with fragmentation by PYTHIA
- Detector modeling with SHW: trigger, tracking, jet-finding; b-tagging (double b-tag efficiency ~45%); di-jet mass resolution ~14% (scaled down to 10% for Run II Higgs studies)

WH Results from NN Analysis (M_H = 100 GeV/c²)
[Figure panels: WH; WH vs Wbb]

WH (110 GeV/c²) NN Distributions

Results, Standard vs. NN: A good chance of discovery up to M_H = 130 GeV/c² with 20-30 fb⁻¹.

Improving the Higgs Mass Resolution: Use m_jj and H_T (= Σ E_T of the jets) to train NNs to predict the Higgs boson mass.
[Figure: mass resolutions quoted on the panels: 13.8%, 12.2%, 13.1%, 11.3%, 13%, 11%]

Newer Approaches: Ensembles of Networks
- Committees of networks: performance can be better than that of the best single network.
- Stacks of networks: control both bias and variance.
- Mixture of experts: decompose complex problems.
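Why a committee can beat its best member is easiest to see in an idealized toy. In the sketch below, each "network" is simply the true value plus independent Gaussian noise. This is a strong assumption, since real networks trained on the same data have correlated errors and gain less. All names and numbers are hypothetical.

```python
import random

rng = random.Random(4)
n_networks, n_events, sigma = 10, 2000, 0.5

# Idealized members: truth plus independent noise for every network and event
truth = [rng.uniform(0.0, 1.0) for _ in range(n_events)]
predictions = [[t + rng.gauss(0.0, sigma) for t in truth] for _ in range(n_networks)]

def mse(pred):
    """Mean squared error of a list of predictions against the truth."""
    return sum((p - t) ** 2 for p, t in zip(pred, truth)) / n_events

single_mse = [mse(p) for p in predictions]

# Committee output: plain average of the member outputs, event by event
committee = [sum(column) / n_networks for column in zip(*predictions)]
committee_mse = mse(committee)

print(f"best single network MSE: {min(single_mse):.4f}")
print(f"committee MSE:           {committee_mse:.4f}")
```

The committee error never exceeds the average error of its members. With independent errors it falls roughly as 1/(number of networks), which is why it also beats the best member in this toy.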

Exploring Models: Bayesian Approach
- Provides probabilistic information on each parameter of a model (SUSY, for example) via marginalization over the other parameters.
- The Bayesian method enables straightforward and meaningful model comparisons. It also allows all uncertainties to be treated in a consistent manner.
- Mathematically linked to adaptive algorithms such as Neural Networks (NN).
- Hybrid methods involving NNs for probability density estimation and a Bayesian treatment can be very powerful.

Summary
- We are building very sophisticated equipment and will record unprecedented amounts of data in the coming decade.
- Use of advanced "optimal" analysis techniques will be crucial to achieve the physics goals.
- Multivariate methods, particularly Neural Network techniques, have already made an impact on discoveries and precision measurements and will be the methods of choice in future analyses.
- Hybrid methods combining "intelligent" algorithms and a probabilistic approach will be the wave of the future.

Optimal Event Selection: r(x,y) = constant defines an optimal decision boundary.
[Figure: feature space with signal (S) and background (B) events, comparing the optimal boundary with conventional cuts]

Probabilistic Approach to Data Analysis Bayesian Methods (The Wave of the future)

Bayesian Analysis
- M: model
- A: uninteresting parameters
- p: interesting parameters
- d: data
[Equation labels: Likelihood, Prior, Posterior]
Reference: Bayesian Analysis of Multi-source Data, P.C. Bhat, H. Prosper, S. Snyder, Phys. Lett. B 407 (1997) 73
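Marginalizing over uninteresting parameters can be illustrated with a one-bin counting experiment. The sketch below is a toy under stated assumptions, not the multi-source analysis of the cited paper. It assumes a Poisson likelihood for an observed count d with mean p + A, a flat prior on the interesting parameter p, and a Gaussian prior on the nuisance parameter A. All numerical values are hypothetical.

```python
import math

def poisson(d, mu):
    """Likelihood of observing d counts for mean mu."""
    return math.exp(-mu) * mu ** d / math.factorial(d)

def gaussian(a, mean, sigma):
    """Unnormalized Gaussian prior weight for the nuisance parameter."""
    return math.exp(-0.5 * ((a - mean) / sigma) ** 2)

d_obs = 10                                        # observed count (data d)
step = 0.1
p_grid = [i * step for i in range(0, 301)]        # interesting parameter p in [0, 30]
a_grid = [i * step for i in range(1, 61)]         # nuisance parameter A in (0, 6]

# Posterior(p | d) is proportional to sum over A of Likelihood(d | p, A) * Prior(A),
# with a flat prior on p over the grid
unnormalized = []
for p in p_grid:
    unnormalized.append(sum(poisson(d_obs, p + a) * gaussian(a, 3.0, 0.5) for a in a_grid))

norm = sum(unnormalized) * step
posterior = [u / norm for u in unnormalized]
mode = p_grid[posterior.index(max(posterior))]
print(f"posterior mode for p: {mode:.1f}")
```

The sum over `a_grid` is the marginalization step. The uncertainty on A is folded into the posterior for p instead of being handled as a separate systematic error.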

Higgs Mass Fits: S = 80 WH events; the background distribution is assumed to be described by Wbb. Results:
- S/B = 1/10: M_fit = 114 +/- 11 GeV/c²
- S/B = 1/5: M_fit = 114 +/- 7 GeV/c²