Algorithms in the Real World


Algorithms in the Real World: Linear and Integer Programming II (Ellipsoid algorithm, Interior point methods)

Ellipsoid Algorithm
The first known-to-be-polynomial-time algorithm for linear programming (Khachiyan, 1979). It solves: find x subject to Ax ≤ b, i.e., it finds a feasible solution (there is no cost function).
Run time: O(n^4 L), where L is the number of bits needed to represent A and b.
Problem in practice: it essentially always takes this much time, and it has numerical stability issues.

Reduction from a maximization problem
One could add the constraint -c^T x ≤ -z0 and binary search over values of z0 to find the feasible solution with maximum z, but an approach based on the dual gives a direct solution.
To solve: maximize z = c^T x subject to Ax ≤ b, x ≥ 0.
Convert to: find x, y subject to
Ax ≤ b
-x ≤ 0
-A^T y ≤ -c
-y ≤ 0
-c^T x + b^T y ≤ 0
By weak duality, c^T x ≤ b^T y for any such feasible pair, so the last constraint forces c^T x = b^T y; hence any feasible (x, y) gives an optimal solution to the original problem.
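To make the reduction concrete, here is a minimal NumPy sketch (the function name and the dense block representation are illustrative, not from the slides) that assembles the five constraint groups above into a single inequality system M z ≤ q over z = [x; y]:

```python
import numpy as np

def primal_dual_feasibility_system(A, b, c):
    # Any z = [x; y] with M @ z <= q is an optimal primal/dual pair for
    #   max c^T x  s.t.  Ax <= b, x >= 0.
    # Dense for clarity; a real solver would keep these blocks sparse.
    m, n = A.shape
    M = np.vstack([
        np.hstack([A,                np.zeros((m, m))]),  # A x <= b
        np.hstack([-np.eye(n),       np.zeros((n, m))]),  # -x <= 0
        np.hstack([np.zeros((n, n)), -A.T]),              # -A^T y <= -c
        np.hstack([np.zeros((m, n)), -np.eye(m)]),        # -y <= 0
        np.hstack([-c.reshape(1, n), b.reshape(1, m)]),   # -c^T x + b^T y <= 0
    ])
    q = np.concatenate([b, np.zeros(n), -c, np.zeros(m), np.zeros(1)])
    return M, q
```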

Ellipsoid Algorithm
Consider a sequence of smaller and smaller ellipsoids, each containing the feasible region F. For iteration k, let c_k be the center of ellipsoid E_k. Eventually some c_k must land inside F, and then we are done, since we have found a feasible solution.
[Figure: ellipsoid with center c_k containing the feasible region F]

Ellipsoid Algorithm
To find the next, smaller ellipsoid:
- find the most violated constraint a_k
- identify the intersection of the current ellipsoid with the half-space of that constraint
- find the smallest ellipsoid that contains this intersection
[Figure: ellipsoid centered at c_k cut by the half-space of constraint a_k, with the feasible region F on the feasible side]
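The slides do not spell out the update, but the standard central-cut formulas are easy to sketch when the ellipsoid is represented by its center c and shape matrix Q (a minimal sketch without the numerical safeguards a robust implementation needs):

```python
import numpy as np

def ellipsoid_step(c, Q, a):
    # E = {x : (x - c)^T Q^{-1} (x - c) <= 1} is the current ellipsoid and
    # a is (the normal of) the most violated constraint a^T x <= beta.
    # Returns the smallest ellipsoid containing the half-ellipsoid
    # {x in E : a^T x <= a^T c}.
    n = len(c)
    Qa = Q @ a
    g = Qa / np.sqrt(a @ Qa)        # step direction, normalized in the Q metric
    c_new = c - g / (n + 1)         # center moves away from the violated side
    Q_new = (n**2 / (n**2 - 1.0)) * (Q - (2.0 / (n + 1)) * np.outer(g, g))
    return c_new, Q_new
```

The volume shrinks by a constant factor per iteration (roughly e^{-1/(2(n+1))}), which is what yields the polynomial bound.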

Interior Point Methods
Travel through the interior with a combination of:
- an optimization term (moves toward the objective)
- a centering term or constraint (keeps away from the boundary)
Such methods have been used since the 1950s for nonlinear programming. Karmarkar proved a variant runs in polynomial time in 1984.
[Figure: path through the interior of a 2D feasible region]

Methods
- Affine scaling: simplest, but no known time bounds
- Potential reduction: O(nL) iterations
- Central trajectory: O(n^(1/2) L) iterations
Each iteration involves solving a linear system, so it takes polynomial time. The "real world" time depends heavily on the matrix structure.

Assumptions
We are trying to solve the problem:
minimize z = c^T x
subject to Ax = b, x ≥ 0

Outline
- Centering methods overview
- Picking a direction to move toward the optimum
- Remaining feasible (Ax = b)
- General method
- Example: affine scaling
- Example: potential reduction
- Example: log barrier

Centering option 1: boundary term
The "analytical center": minimize y = -Σ_{i=1}^{n} lg(x_i).
y goes to infinity as x approaches any boundary x_i = 0.
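A small sketch of this centering term and its gradient (the function names are illustrative):

```python
import numpy as np

def analytical_center_barrier(x):
    # y = -sum_i lg(x_i): finite only for x > 0, and it blows up
    # as any coordinate x_i approaches the boundary x_i = 0.
    return -np.sum(np.log2(x))

def barrier_gradient(x):
    # d/dx_i of -lg(x_i) is -1 / (x_i ln 2): pushes away from the boundary.
    return -1.0 / (x * np.log(2.0))
```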

Centering option 2: constraint
Elliptical scaling: use the Dikin ellipsoid, an ellipsoid contained in the feasible region, centered at the current point, that "summarizes" the feasible region. The move must stay within the ellipsoid.
[Figure: Dikin ellipsoid around the current point, with step direction d]

Finding the Optimal Solution
Let's say f(x) is the combination of the "centering term" c(x) and the "optimization term" z(x) = c^T x. (Alternatively, one might replace the centering term with a constraint, as with the Dikin ellipsoid.) We would like f(x) to have its minimum at the same point of the feasible region as z(x), but it can otherwise be quite different; in particular, c(x), and hence f(x), need not be linear.
Goal: find the minimum of f(x) over the feasible region, starting at some interior point x0.
We can do this by taking a sequence of steps toward the minimum. How do we pick a direction for a step?

Picking a direction: steepest descent
Option 1: move in the direction of steepest descent of f at x0, i.e., along the negative gradient: d = -∇f(x0).
Problem: the gradient might be changing rapidly, so the local steepest-descent direction might not be a good direction overall. Any ideas for a better selection of a direction?

Picking a direction: Newton's method
Consider the truncated Taylor series
f(x0 + d) ≈ f(x0) + ∇f(x0)^T d + (1/2) d^T H(x0) d,
where H(x0) is the Hessian (matrix of second derivatives). To find the minimum of this approximation, take the derivative with respect to d and set it to 0, which in matrix form, for arbitrary dimension, gives
H(x0) d = -∇f(x0), i.e., d = -H(x0)^{-1} ∇f(x0).
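A minimal sketch of the resulting Newton step, assuming hypothetical callables grad_f and hess_f that return the gradient vector and Hessian matrix at a point:

```python
import numpy as np

def newton_direction(grad_f, hess_f, x0):
    # Minimize the quadratic model f(x0) + g^T d + (1/2) d^T H d:
    # setting its gradient (in d) to zero gives the linear system H d = -g.
    g = grad_f(x0)
    H = hess_f(x0)
    return np.linalg.solve(H, -g)   # d = -H^{-1} g, via a solve, not an explicit inverse
```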

Next Step?
Now that we have a direction, what do we do? The direction was chosen based on how to optimize f(x) without enforcing the Ax = b constraints (the centering term or constraint addresses the x ≥ 0 constraints), so the new point may lie outside the feasible region, and we may have to make a correction.

Remaining on the support plane
A feasible solution must satisfy the constraint Ax = b. Suppose the current feasible solution is x0 and we take a step in direction d. Then A(x0 + d) = Ax0 + Ad = b + Ad, so to keep satisfying Ax = b, d must lie in the null space of A, i.e., Ad = 0. Note also that then A(x0 + αd) = b for any scalar α. (We still separately have to ensure that x0 + d ≥ 0.)
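A tiny numeric check of this fact, with illustrative values: a step in the null space of A preserves Ax = b for every step length α (though, as noted, not necessarily x ≥ 0):

```python
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
x0 = np.array([0.5, 0.3, 0.2])      # feasible: A @ x0 == b
d = np.array([1.0, -1.0, 0.0])      # A @ d == 0, so d lies in null(A)
for alpha in (0.1, 0.5, 2.0):
    assert np.allclose(A @ (x0 + alpha * d), b)   # Ax = b holds for all alpha
```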

Projecting to the Null Space of A
Suppose our chosen direction is g (e.g., a Newton direction). Let P = I - A^T (A A^T)^{-1} A be the orthogonal projection onto the null space of A; we want to calculate d = Pg.
In general, to find y such that y = B^{-1} x where B is sparse, do not form B^{-1} (which is dense); instead, solve for y in By = x.

Calculating Pg (without directly multiplying by P)
Pg = (I - A^T (A A^T)^{-1} A) g = g - A^T w, where w = (A A^T)^{-1} A g.
Note that A A^T w = A A^T (A A^T)^{-1} A g = A g, so all we need to do is solve for w in A A^T w = A g, then compute g - A^T w. This system of linear equations can be solved with a sparse solver, as described in the graph-separator lectures. The solver is the workhorse of interior-point methods. Note that A A^T will be denser than A.
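A direct transcription of this computation (a dense solve for clarity; in practice A A^T is factored once, e.g., with a sparse Cholesky, and the factorization is reused every iteration):

```python
import numpy as np

def project_to_nullspace(A, g):
    # Compute d = Pg = g - A^T w, where A A^T w = A g, i.e., the projection
    # of g onto the null space of A, without ever forming P explicitly.
    w = np.linalg.solve(A @ A.T, A @ g)
    return g - A.T @ w
```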

Next step?
We now have a direction g and its projection d onto the null space of A. What do we do now? To decide how far to go, we can find the minimum of f(x) along the line defined by d; this is not too hard if f(x) is reasonably nice (e.g., has a single minimum along the line). Alternatively, we can go some fraction of the way to the boundary (e.g., 90%).

General Interior Point Method
- Pick a starting point x0.
- Factor A A^T (i.e., find an LU decomposition).
- Repeat until done (within some threshold):
  - decide on the function f(x) to optimize (it might be the same for all iterations)
  - select a direction g based on f(x) (e.g., with Newton's method)
  - project g onto the null space of A (using the factored A A^T and solving a linear system)
  - decide how far to go along that direction (must ensure the new point stays ≥ 0), as sketched below
Caveat: every method is slightly different.
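A schematic sketch of this loop, with choose_direction as a hypothetical hook standing in for whatever f(x)-based rule a concrete method uses (it is not part of any real solver API):

```python
import numpy as np

def interior_point_skeleton(A, b, c, x0, choose_direction,
                            tol=1e-8, step_fraction=0.9, max_iters=100):
    # x0 is assumed to satisfy A @ x0 == b and x0 > 0.
    x = x0.copy()
    for _ in range(max_iters):
        g = choose_direction(x)                  # e.g., a Newton direction for f
        w = np.linalg.solve(A @ A.T, A @ g)      # in practice: reuse the factorization
        d = g - A.T @ w                          # project g onto null(A)
        if np.linalg.norm(d) < tol:
            break
        # go a fraction of the way to the boundary x >= 0
        neg = d < 0
        alpha = step_fraction * np.min(-x[neg] / d[neg]) if neg.any() else 1.0
        x = x + alpha * d
    return x
```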

Affine Scaling (Preserves Lines)
A variant of Karmarkar's algorithm, actually due to Dikin (1967), but not polynomial time. In each iteration solve:
minimize c^T d
subject to Ad = 0
d^T D^{-2} d ≤ 1 (the Dikin ellipsoid constraint, where D = diag(x) for the current solution x)
Note that:
- the ellipsoid is centered around the current solution x
- d is in the null space of A and can therefore be used as the direction without projection
- we are optimizing the desired objective c^T d
What does the Dikin ellipsoid do for us?

Dikin Ellipsoid
For x > 0 (a true "interior point"), D^{-2} exists (no division by 0), and the ellipsoid {d : d^T D^{-2} d ≤ 1} is contained in the region x + d ≥ 0 (since each |d_i| ≤ x_i). The constraint therefore prevents the step from crossing any face; combined with Ad = 0, we keep feasibility. The optimal value lies on the boundary of the ellipsoid, due to the convexity of the ellipsoid and the linearity of the cost function. The ellipsoid biases the search away from corners.
[Figure: Dikin ellipsoid around x inside the feasible region, with step direction d]

Finding the Optimum
minimize c^T d
subject to Ad = 0, d^T D^{-2} d ≤ 1
At first glance this looks harder to solve than a linear program: we now have a nonlinear constraint d^T D^{-2} d ≤ 1. But the symmetry and the lack of corners actually make it easier to solve.

How to find the minimum: map the ellipsoid to the unit sphere
Substitute variables d = Dd':
minimize c^T D d'
subject to A D d' = 0
d'^T D D^{-2} D d' ≤ 1, i.e., d'^T d' ≤ 1
(Note D = D^T because D is diagonal, and D D^{-1} = D^{-1} D = I.)
The distance to the unit sphere d'^T d' = 1 is the same in every direction, so the optimal point lies along the line of (c^T D)^T = Dc (with the sign chosen so the objective decreases). So we project Dc onto the null space of B = AD:
d' = (I - B^T (B B^T)^{-1} B) Dc, and d = Dd' = D (I - B^T (B B^T)^{-1} B) Dc.
As before, solve for w in B B^T w = B Dc; then d = D(Dc - B^T w) = D^2 (c - A^T w).

Affine Interior Point Method
- Pick a starting point x0.
- Symbolically factor A A^T (LU decomposition).
- Repeat until done (within some threshold):
  - B = A D_i
  - solve B B^T w = B D_i c for w (note B B^T = A D_i D_i^T A^T has the same non-zero structure as A A^T, so the symbolic factorization can be reused)
  - d = D_i (D_i c - B^T w)
  - move in direction d a fraction α of the way to the boundary (something like α = 0.96 is used in practice)
Note that D_i changes in each iteration, since it depends on the current solution x_i.
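A minimal dense sketch of this loop. One caveat: since the problem is to minimize c^T x, the step must decrease the objective, so the sketch makes the minus sign on the projected direction explicit (the slides track the projected direction itself):

```python
import numpy as np

def affine_scaling(A, b, c, x0, alpha=0.96, iters=50, tol=1e-9):
    # Solves: minimize c^T x  s.t.  Ax = b, x > 0, starting from a feasible x0.
    # Dense linear algebra for clarity; real implementations factor
    # B B^T = A D_i^2 A^T with a sparse solver, reusing the symbolic
    # factorization of A A^T since the non-zero structure never changes.
    x = x0.copy()
    for _ in range(iters):
        D = np.diag(x)                           # D_i = diag(x)
        B = A @ D
        w = np.linalg.solve(B @ B.T, B @ (D @ c))
        d = -D @ (D @ c - B.T @ w)               # -D^2 (c - A^T w): descent direction
        if np.linalg.norm(d) < tol:
            break
        neg = d < 0
        if not neg.any():                        # objective unbounded along d
            break
        step = alpha * np.min(-x[neg] / d[neg])  # fraction alpha of the way to x >= 0
        x = x + step * d
    return x
```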

Potential Reduction Method
minimize z = q ln(c^T x - b^T y) - Σ_{j=1}^{n} ln(x_j)
subject to: Ax = b, x ≥ 0
A^T y + s = c (dual feasibility for the standard-form LP), s ≥ 0
The first term of z is the optimization term, where q is a scalar with q > n; the duality gap (c^T x - b^T y) goes to 0 near the solution. The second term of z is the centering term. The objective function is not linear; use hill climbing or a "Newton step" to optimize it.

Central Trajectory (log barrier)
Dates back to the 1950s for nonlinear problems. On step k:
1. minimize c^T x - μ_k Σ_{j=1}^{n} ln(x_j), subject to Ax = b, x > 0
2. select μ_{k+1} ≤ μ_k
Each minimization can be done with a constrained Newton step. μ_k needs to approach zero to terminate. A primal-dual version using higher-order approximations is currently the best interior-point method in practice.
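A rough sketch of this scheme, taking one equality-constrained Newton step per outer iteration; the parameter values are illustrative, and a production solver would take damped primal-dual steps with far more care:

```python
import numpy as np

def central_trajectory(A, b, c, x0, mu=10.0, shrink=0.2, outer=30):
    # Repeatedly (approximately) minimize c^T x - mu_k * sum_j ln(x_j)
    # over Ax = b, x > 0, then shrink mu_k toward 0.
    # x0 is assumed to satisfy A @ x0 == b and x0 > 0.
    x = x0.copy()
    for _ in range(outer):
        g = c - mu / x                       # gradient of the barrier objective
        H_inv = np.diag(x**2 / mu)           # Hessian is diag(mu / x^2)
        # Newton direction restricted to the null space of A (keeps Ax = b):
        w = np.linalg.solve(A @ H_inv @ A.T, A @ (H_inv @ g))
        d = -H_inv @ (g - A.T @ w)
        # damp the step so x stays strictly positive
        neg = d < 0
        step = min(1.0, 0.9 * np.min(-x[neg] / d[neg])) if neg.any() else 1.0
        x = x + step * d
        mu *= shrink                         # mu_{k+1} <= mu_k
    return x
```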

Summary of Algorithms
- Actual algorithms used in practice are very sophisticated.
- Practice matches theory reasonably well.
- Interior-point methods dominate when:
  - n is large
  - the Cholesky factors are small (i.e., low fill)
  - the problem is highly degenerate
- Simplex dominates when starting from a previous solution very close to the final solution.
- The ellipsoid algorithm is not currently practical.
- Large problems can take hours or days to solve; parallelism is very important.