Support Vector Machines: Get more Higgs out of your data
Daniel Whiteson, UC Berkeley
July 11, 2001

Multivariate Algorithms
Square cuts may work well for simpler tasks, but as the data are multivariate, the algorithms must be multivariate as well.

Multivariate Algorithms
HEP overlaps with Computer Science, Mathematics and Statistics in this area: how can we construct an algorithm that can be taught by example and generalize effectively? We can use solutions from those fields:
– Neural Networks
– Probability Density Estimators
– Support Vector Machines

Neural Networks
The decision function is learned using the freedom in the hidden layers. Constructed from a very simple object, they can learn complex patterns.
– Used very effectively as signal discriminators, particle identifiers and parameter estimators
– Fast evaluation makes them well suited to triggers
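As a concrete illustration (not from the talk), here is a minimal signal/background discriminator built from scikit-learn's MLPClassifier on toy Gaussian data; the layer size and all the numbers are illustrative assumptions:

```python
# Minimal sketch: a small neural-network signal/background discriminator.
# Toy Gaussian "signal" and "background" samples stand in for real HEP events.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
signal = rng.normal(loc=1.0, scale=1.0, size=(500, 2))       # hypothetical signal events
background = rng.normal(loc=-1.0, scale=1.0, size=(500, 2))  # hypothetical background events
X = np.vstack([signal, background])
y = np.hstack([np.ones(500), np.zeros(500)])

# One small hidden layer: the "freedom in the hidden layers" that lets the
# network learn a nonlinear decision function.
net = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
net.fit(X, y)
print(net.predict_proba([[0.5, 0.5]]))  # network output, usable as a discriminator
```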

Probability Density Estimation
If we knew the distributions of the signal, f_s(x), and of the background, f_b(x), then we could calculate the discriminant

D(x) = \frac{f_s(x)}{f_s(x) + f_b(x)}

and use it to discriminate. [Figure: example discriminant surface]
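If the densities really were known analytically, the discriminant would be a one-liner. A sketch with made-up Gaussian densities standing in for f_s and f_b:

```python
# Sketch of the likelihood-ratio discriminant D(x) = f_s(x) / (f_s(x) + f_b(x))
# for hypothetical, analytically known signal and background densities.
from scipy.stats import norm

f_s = norm(loc=1.0, scale=0.5).pdf   # assumed signal density
f_b = norm(loc=-1.0, scale=1.0).pdf  # assumed background density

def discriminant(x):
    return f_s(x) / (f_s(x) + f_b(x))

print(discriminant(0.0))  # intermediate value where the densities overlap
print(discriminant(1.5))  # close to 1 deep in signal territory
```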

Probability Density Estimation
Of course, we do not know the analytical distributions. Given a set of points drawn from a distribution, put down a kernel centered at each point. With high statistics, this approximates a smooth probability density. [Figure: surface built from many kernels]
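A minimal sketch of this kernel density estimation with scipy, on an illustrative one-dimensional sample:

```python
# Sketch: approximate an unknown density by placing a Gaussian kernel
# at each sampled point; with enough points the sum is a smooth estimate.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=1000)  # points drawn from the unknown density

kde = gaussian_kde(sample)       # one Gaussian kernel per training point
print(kde.evaluate([0.0, 2.0]))  # density estimates; compare with the true N(0,1) pdf
```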

Probability Density Estimation
Simple techniques have advanced to more sophisticated approaches:
– Adaptive PDE varies the width of the kernel for smoothness
– Generalized for regression analysis: measure the value of a continuous parameter
– GEM measures the local covariance and adjusts the individual kernels to give a more accurate estimate

Support Vector Machines
PDEs must evaluate a kernel at every training point for every classification of a data point. Can we build a decision surface that uses only the relevant bits of information, the points in the training set that lie near the signal-background boundary?

For a linear, separable case, this is not too difficult: we simply need to find the hyperplane that maximizes the separation.

Support Vector Machines
To find the hyperplane that gives the largest separation (the lowest "energy"), we maximize the Lagrangian with respect to the \alpha_i:

L_D = \sum_i \alpha_i - \frac{1}{2} \sum_{i,j} \alpha_i \alpha_j \, y_i y_j \, (x_i \cdot x_j)

subject to \alpha_i \ge 0 and \sum_i \alpha_i y_i = 0, where the (x_i, y_i) are the training data and the \alpha_i are positive Lagrange multipliers. The solution is

w = \sum_i \alpha_i y_i x_i,

where \alpha_i = 0 for all non-support vectors.
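In practice the quadratic program is handed to a library. A minimal sketch with scikit-learn's SVC on toy data, showing that only the support vectors carry nonzero \alpha_i; the data and the large-C hard-margin approximation are illustrative assumptions:

```python
# Sketch: fit a (nearly) hard-margin linear SVM and inspect the support vectors.
# In the dual solution w = sum_i alpha_i * y_i * x_i, only the support
# vectors have alpha_i != 0; SVC exposes exactly those points.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(2, 0.5, (50, 2)), rng.normal(-2, 0.5, (50, 2))])
y = np.hstack([np.ones(50), -np.ones(50)])

svm = SVC(kernel="linear", C=1e6)  # large C approximates the separable (hard-margin) case
svm.fit(X, y)
print(svm.support_vectors_)  # the few boundary points that define the hyperplane
print(svm.dual_coef_)        # alpha_i * y_i, stored for the support vectors only

# Recover w explicitly from the dual solution:
w = svm.dual_coef_ @ svm.support_vectors_
print(w, svm.coef_)  # the two agree for a linear kernel
```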

Support Vector Machines
But not many problems of interest are linear. Map the data to a higher-dimensional space where the separation can be made by hyperplanes. Since we want to keep working in our original space, we replace the dot product with a kernel function:

K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j).

[Figure: data that become separable only after such a nonlinear mapping]
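A minimal sketch of the substitution itself: compute the Gram matrix with an explicit kernel and train on it. The quadratic polynomial kernel and the toy circle data are assumptions for illustration, not the example from the talk:

```python
# Sketch: replace the dot product x_i . x_j with a kernel K(x_i, x_j),
# here a simple polynomial kernel, and train on the precomputed Gram matrix.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

def K(A, B):
    # Polynomial kernel (A . B + 1)^2: a dot product in a higher-dimensional
    # feature space, computed without ever mapping the points there.
    return (A @ B.T + 1.0) ** 2

svm = SVC(kernel="precomputed")
svm.fit(K(X, X), y)
print(svm.score(K(X, X), y))  # the circles, inseparable by any line, are now separated
```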

Support Vector Machines
Nor are problems that are not entirely separable very difficult: allow an imperfect decision boundary, but add a penalty for it. [Figure: training errors, points on the wrong side of the boundary, indicated by crosses]
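In the usual formulation the penalty enters through a single constant, conventionally called C. A small sketch of the trade-off on overlapping toy data (all numbers illustrative):

```python
# Sketch: soft-margin SVM. The parameter C sets the penalty for points on
# the wrong side of the boundary; small C tolerates errors, large C fights them.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# Overlapping classes, so a perfect separating hyperplane does not exist.
X = np.vstack([rng.normal(1, 1.5, (100, 2)), rng.normal(-1, 1.5, (100, 2))])
y = np.hstack([np.ones(100), -np.ones(100)])

for C in (0.01, 100.0):
    svm = SVC(kernel="linear", C=C).fit(X, y)
    errors = (svm.predict(X) != y).sum()
    print(f"C={C}: {errors} training errors, {len(svm.support_)} support vectors")
```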

Support Vector Machines
We are not limited to linear or polynomial kernels. A Gaussian kernel,

K(x_i, x_j) = \exp\!\left(-\frac{|x_i - x_j|^2}{2\sigma^2}\right),

gives a highly flexible SVM. Gaussian-kernel SVMs outperformed PDEs in recognizing handwritten numbers from the USPS database.
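In scikit-learn terms this Gaussian is the RBF kernel with gamma = 1/(2σ²); a brief sketch, with an arbitrary illustrative choice of σ:

```python
# Sketch: Gaussian-kernel SVM. K(x_i, x_j) = exp(-|x_i - x_j|^2 / (2 sigma^2))
# corresponds to scikit-learn's RBF kernel with gamma = 1 / (2 * sigma**2).
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

sigma = 0.5  # illustrative kernel width
svm = SVC(kernel="rbf", gamma=1.0 / (2.0 * sigma**2))
svm.fit(X, y)
print(svm.score(X, y))  # the Gaussian kernel easily carves out the inner circle
```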

Comparative Study for HEP
A 2-dimensional discriminant built from the variables M_jj and H_t. Signal: Wh → bb; backgrounds: Wbb, tt, WZ. [Figure: discriminator-value distributions for the Neural Net, PDE and SVM]
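The spirit of the comparison is easy to reproduce on toy data; the sketch below trains all three discriminators on the same two variables. Everything in it, from the distributions to the test point, is invented for illustration and is not the talk's Wh analysis:

```python
# Sketch: NN, PDE, and SVM discriminators trained on the same toy 2-D data,
# loosely in the spirit of the talk's (M_jj, H_t) comparison. All numbers invented.
import numpy as np
from scipy.stats import gaussian_kde
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(4)
sig = rng.normal([1.0, 1.0], 0.8, (500, 2))    # stand-in "signal"
bkg = rng.normal([-1.0, -1.0], 1.2, (500, 2))  # stand-in "background"
X = np.vstack([sig, bkg])
y = np.hstack([np.ones(500), np.zeros(500)])

nn = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0).fit(X, y)
svm = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)
f_s, f_b = gaussian_kde(sig.T), gaussian_kde(bkg.T)  # PDE discriminant from kernels

point = np.array([[0.5, 0.5]])
pde = f_s(point.T) / (f_s(point.T) + f_b(point.T))
print(nn.predict_proba(point)[0, 1], svm.predict_proba(point)[0, 1], pde[0])
```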

Comparative Study for HEP
All of these methods provide powerful signal enhancement. [Figure: signal-to-noise enhancement vs. discriminator threshold, with efficiencies of 49%, 50% and 43%]
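The threshold scan behind such a plot is only a few lines. A sketch with hypothetical discriminator output distributions (the beta-distributed outputs are an assumption):

```python
# Sketch: scan the discriminator threshold and compute the signal-to-noise
# enhancement S / sqrt(B), relative to no cut, for toy discriminator outputs.
import numpy as np

rng = np.random.default_rng(5)
d_sig = rng.beta(5, 2, 10000)  # hypothetical discriminator outputs for signal
d_bkg = rng.beta(2, 5, 10000)  # ... and for background

baseline = len(d_sig) / np.sqrt(len(d_bkg))  # S / sqrt(B) with no cut applied
for cut in (0.3, 0.5, 0.7):
    S = (d_sig > cut).sum()
    B = (d_bkg > cut).sum()
    eff = S / len(d_sig)
    print(f"cut {cut}: efficiency {eff:.0%}, enhancement {S / np.sqrt(B) / baseline:.2f}")
```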

Algorithm Comparisons

Neural Nets
  Advantages: very fast evaluation
  Disadvantages: structure must be built by hand; black box; local optimization

PDE
  Advantages: transparent operation
  Disadvantages: slow evaluation; requires high statistics

SVM
  Advantages: fast evaluation; kernel positions chosen automatically; global optimization
  Disadvantages: complex; training can be time-intensive; kernel selection by hand

Conclusions
Difficult problems in HEP overlap with those in other fields, so we can take advantage of our colleagues' years of thought and effort. There are many areas of HEP analysis where intelligent multivariate algorithms like NNs, PDEs and SVMs can help us conduct more powerful searches and make more precise measurements.