Robust Optimization and Applications in Machine Learning

Presentation transcript:

Part 2: Robust Classification

Data matrix

Classification problems

What is a linear classifier?

Separable data

Non-separable data

Loss functions

Two specific loss functions

Generalization error and regularization

Regularization and Sparsity

Robust classification

Formulation of robustness approach

Non-separable case

Link with worst-case loss minimization

Box uncertainty model

Formulation

Link with worst-case loss minimization

Our findings so far

Classification with interval data

Robust classification: main idea

Main results

Robust classification with hinge loss

Bound on robust SVM

Robust LR classification

Robust LR: dual

Moment matching

Minimax probability machine

Problem statement

Problem formulation

Marshall and Olkin’s result

SOCP formulation

Dual problem

Geometric interpretation

Solving the problem

Robustness to estimation errors

Robust MPM

Formulation of Robust MPM Lemma

R-MPM: A Specific Uncertainty Model (1)

R-MPM: A Specific Uncertainty Model (2)

Robust MPM: Estimation Errors in Means

Robust MPM: Estimation Errors in Covariance

R-MPM: putting everything together

Part 2: summary
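
The outline names two losses, the hinge loss and logistic regression (LR), but the transcript does not preserve their formulas. As a minimal sketch, assuming the standard textbook definitions and labels y in {-1, +1}, both can be written as functions of the margin y(w·x + b). The function names and the sample numbers below are illustrative and do not come from the slides.

```python
import math


def margin(w, b, x, y):
    """Signed margin y * (w . x + b) of one labelled point, with y in {-1, +1}."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)


def hinge_loss(m):
    """Hinge loss max(0, 1 - m): zero once the margin reaches 1."""
    return max(0.0, 1.0 - m)


def logistic_loss(m):
    """Logistic loss log(1 + exp(-m)), evaluated without overflow for large |m|."""
    if m >= 0:
        return math.log1p(math.exp(-m))
    return -m + math.log1p(math.exp(m))


if __name__ == "__main__":
    # Hypothetical classifier and data point, chosen only for illustration.
    w, b = [1.0, -1.0], 0.5
    x, y = [2.0, 1.0], -1
    m = margin(w, b, x, y)
    print(m, hinge_loss(m), logistic_loss(m))
```

Both losses are convex upper bounds on the 0-1 classification error (the logistic loss after dividing by log 2), which is what makes minimizing them tractable.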