Tutorial 12: Linear programming, Quadratic programming

Linear case
We already discussed that the role of the constraints in an optimization problem is to define the search region Ω within the space R^n on which f(x) is defined. Generally, each equality constraint reduces the dimensionality by 1, while each inequality constraint cuts out a region of the space without reducing its dimensionality. Now, consider the minimization of the linear function f(x) = c^T x + b over the search region Ω defined by linear equality and inequality constraints, e.g. A x = p and B x ≤ q.

Linear case: Illustration
(Figure: level lines of c^T x + b over the region Ω, with the optimum x* lying on the boundary of Ω.)
The figure illustrates that in this linear case the minimum is reached on the boundary of the region Ω. We leave the proof to you and proceed to the more general, and slightly harder, case of convex functions.

Convexity
Definition 1: The region Ω is called convex if for any x1, x2 ∈ Ω and any 0 ≤ λ ≤ 1, the point λ x1 + (1-λ) x2 also belongs to Ω.
Definition 2: The function f(x) is called convex if for any x1, x2 and any 0 ≤ λ ≤ 1,
λ f(x1) + (1-λ) f(x2) ≥ f(λ x1 + (1-λ) x2).
If '≥' is replaced with '>' (for x1 ≠ x2 and 0 < λ < 1), the function is called 'strictly convex'. Note that a linear function is convex, but not strictly convex.
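As a quick numerical illustration of Definition 2 (a minimal Matlab sketch, not part of the original slides), one can pick two random points and compare both sides of the inequality for a strictly convex quadratic and for a linear function:

% Check lambda*f(x1)+(1-lambda)*f(x2) >= f(lambda*x1+(1-lambda)*x2)
f_quad = @(x) x'*x;                 % strictly convex
c = [1; -2]; f_lin = @(x) c'*x;     % linear: convex, but not strictly
x1 = randn(2,1); x2 = randn(2,1); lambda = 0.3;
xm = lambda*x1 + (1-lambda)*x2;
disp(lambda*f_quad(x1) + (1-lambda)*f_quad(x2) - f_quad(xm))   % > 0: strict inequality
disp(lambda*f_lin(x1)  + (1-lambda)*f_lin(x2)  - f_lin(xm))    % ~ 0: equality for linear f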

Convexity: Illustration
(Figure: nested classes of functions: General ⊃ Convex ⊃ Strictly Convex, with Linear functions inside the Convex class but outside the Strictly Convex one.)
The figure illustrates the relations between linear, convex and strictly convex functions.

Linear programming: Definition
Minimization of a linear function within a convex region Ω defined by linear constraints is called linear programming. Note that with an appropriate and sufficiently 'tall' matrix B (sufficiently many inequality conditions), an arbitrary convex region can be approximated with arbitrarily high accuracy. (Can you prove it?) Many problems in linear programming have the additional constraint that the components of x be non-negative: x_i ≥ 0. We will prove another, related, claim: a set of linear equality and inequality constraints defines a convex region.
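As a small illustration (a sketch, not part of the original slides, assuming Matlab's Optimization Toolbox function linprog is available), a toy linear program with non-negative variables can be solved as follows:

% Minimize c'*x subject to B*x <= q and x >= 0 (made-up toy data)
c = [-1; -2];               % linear cost, i.e. maximize x1 + 2*x2
B = [1 1; 1 0];             % inequality constraints B*x <= q
q = [4; 3];
lb = [0; 0];                % non-negativity: x >= 0
x = linprog(c, B, q, [], [], lb)    % returns [0; 4]: the optimum lies on the boundary of Ω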

Quadratic programming: Definition
The only difference between quadratic programming and linear programming is that the objective function can be a quadratic form,
f(x) = (1/2) x^T H x + c^T x,
minimized over a region Ω defined by the same kind of linear equality and inequality constraints.
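For instance (again a minimal sketch, not from the original slides, assuming Matlab's quadprog from the Optimization Toolbox), a toy quadratic program:

% Minimize 0.5*x'*H*x + c'*x subject to B*x <= q (made-up toy data)
H = eye(2);                 % quadratic term
c = [-1; -1];               % linear term; unconstrained minimum is at (1,1)
B = [1 1];                  % single inequality constraint: x1 + x2 <= 1
q = 1;
x = quadprog(H, c, B, q)    % returns [0.5; 0.5]: the constraint is active at the optimum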

Linear constraints define convex regions 1/3
Claim: A region Ω defined by a set of linear equality and inequality constraints is convex.
Proof:
1. If Ω is empty or contains a single point, the definition of convexity is satisfied trivially.
2. Consider any x1, x2 ∈ Ω and any 0 ≤ λ ≤ 1, and let x3 = λ x1 + (1-λ) x2. We need to prove that x3 ∈ Ω.

Proof (equality constraints) 2/3
Assume the contrary, x3 ∉ Ω. This means that x3 violates at least one of the constraints. First, assume that it is one of the equality constraints, a^T x = r. Note that since x1, x2 ∈ Ω, they satisfy this constraint: a^T x1 = r and a^T x2 = r. Applying the definition of x3 and the linearity of the scalar product, we obtain:
a^T x3 = a^T (λ x1 + (1-λ) x2) = λ a^T x1 + (1-λ) a^T x2 = λ r + (1-λ) r = r.
Thus, x3 satisfies all the equality constraints, contrary to the assumption.

Proof (inequality constraints) 3/3
Now, assume that x3 violates an inequality constraint b^T x ≤ q. This means that b^T x3 > q. On the other hand, x1, x2 ∈ Ω, therefore b^T x1 ≤ q and b^T x2 ≤ q. Let us write b^T x3 = λ b^T x1 + (1-λ) b^T x2. Since 0 ≤ λ ≤ 1, we obtain b^T x3 ≤ λ q + (1-λ) q = q, which contradicts b^T x3 > q. Thus, x3 satisfies all the inequality constraints, contrary to the assumption.
We have proven that for any x1, x2 satisfying the linear equality and inequality constraints, and any 0 ≤ λ ≤ 1, x3 = λ x1 + (1-λ) x2 also satisfies these constraints. Therefore, linear constraints define a convex region.
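A small numerical sanity check of this claim (a Matlab sketch with made-up constraints, not part of the original slides): take two feasible points and verify that their convex combination is still feasible.

% Feasible region: A*x = p, B*x <= q (made-up constraints)
A = [1 1 0];  p = 1;                      % one equality constraint
B = [0 0 1; -eye(3)];  q = [2; 0; 0; 0];  % x(3) <= 2 and x >= 0
x1 = [1; 0; 0];  x2 = [0.5; 0.5; 2];      % two feasible points
lambda = 0.7;  x3 = lambda*x1 + (1-lambda)*x2;
disp(norm(A*x3 - p))                      % = 0: the equality constraint still holds
disp(max(B*x3 - q))                       % <= 0: all inequality constraints still hold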

Example: Support Vector Machines 1/6
Given a set of labeled points which is linearly separable, find the vector w that defines the hyperplanes separating the set with maximum margin.
(Figure: two classes of points separated by the parallel hyperplanes x^T w = γ and x^T w = -γ; the vector w is normal to both.)

Example: Support Vector Machines 2/6
For the sake of an elegant mathematical description, we define the data matrix A and the label matrix D as follows: the rows of A are the data points x_i^T, and D is the diagonal matrix with D_ii = +1 or -1 according to the label of x_i. We are looking for the vector w and an appropriate constant γ > 0 such that:
x_i^T w ≥ γ for the points labeled +1, and x_i^T w ≤ -γ for the points labeled -1.
Note that these two cases can be combined:
D A w ≥ γ e,    (1)
where e denotes the vector of ones.

Example: Support Vector Machines 3/6
Note that by multiplying w and γ by some factor we only seemingly increase the separation between the planes in (1): the actual distance between the planes x^T w = γ and x^T w = -γ is 2γ/||w||. Therefore, the best separation has to maintain inequality (1), with γ normalized to 1, while minimizing the length of w:
min over w of (1/2) ||w||^2  subject to  D A w ≥ e.    (2)
This is a constrained minimization problem with a quadratic objective and linear inequality constraints. It is called quadratic programming.

Example: SVM, solution in Matlab 4/6
We will use Matlab's quadprog:
>> help quadprog
w = QUADPROG(H,f,A,b) attempts to solve the quadratic programming problem:
min over w of 0.5*w'*H*w + f'*w   subject to:  A*w <= b
In our case H = I and f = 0. For clarity, we bring the constraint (2) to the form compatible with Matlab's notation:
D A w ≥ e  is equivalent to  (-D A) w ≤ -e.

Example: SVM, solution in Matlab 5/6
Thus, in the call w = QUADPROG(H,f,A,b), which minimizes 0.5*w'*H*w + f'*w subject to A*w <= b, we have:
H = I, f = 0, the constraint matrix -D A in place of A, and b = -e (a vector of -1's).

Example: SVM, solution in Matlab 6/6
n = 2; PN = 20;
A = [rand(PN,2)+.1; -rand(PN,2)-.1]          % the data: PN points per class
D = diag([ones(1,PN), -ones(1,PN)])          % the labels, as a diagonal matrix
plot(A(1:PN,1), A(1:PN,2), 'g*');            % plot the positive class
hold on;
plot(A(PN+1:2*PN,1), A(PN+1:2*PN,2), 'bo');  % plot the negative class
% adjust the input to quadprog()
H = eye(n);
f = zeros(n,1);
AA = -D*A;
b = -ones(PN*2,1);
w = quadprog(H, f, AA, b)                    % quadratic programming - takes milliseconds
% Plot the separating plane (the line through the origin orthogonal to w)
W_orth = [-.3:.01:.3]' * [w(2), -w(1)];
plot(W_orth(:,1), W_orth(:,2), 'k.')
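After the script runs, a quick check (a small sketch, not in the original slides) confirms that the constraints in (2) hold and reports the achieved margin:

% Verify the SVM constraints and compute the margin
margins = D*A*w;          % should all be >= 1 (up to solver tolerance)
disp(min(margins))        % smallest value; equals 1 for the support vectors
disp(2/norm(w))           % width of the band between x^T w = 1 and x^T w = -1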