Core-Sets and Geometric Optimization problems.

Core-Sets and Geometric Optimization Problems. Piyush Kumar, Department of Computer Science, Florida State University. http://piyush.compgeom.com, Email: piyush@acm.org. Joint work with Alper Yildirim.

Talk Outline
- Introduction to core-sets for geometric optimization problems
- Problems and applications
- Minimum enclosing balls (next talk)
- Axis-aligned minimum volume ellipsoids: motivation, optimization formulation / IVA, algorithm, computational experiments
- Future directions

Geometric Clustering
In order to cluster, we need:
- Points (polyhedra / balls / ellipsoids?)
- A distance measure (assume Euclidean unless otherwise stated)
- A method for evaluating our clustering
(We look at k-centers, 1-Ecenter, 1-center, kernel 1-center, k-line-centers.)
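For the k-centers objective mentioned above, the classic evaluation-friendly method is the greedy farthest-point heuristic (Gonzalez), a 2-approximation for minimizing the maximum point-to-nearest-center distance. A minimal sketch, not taken from the talk itself:

```python
def gonzalez_k_centers(points, k):
    """Greedy farthest-point 2-approximation for k-center:
    repeatedly pick the point furthest from all chosen centers."""
    def d2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    centers = [points[0]]  # arbitrary first center
    while len(centers) < k:
        # next center: the point whose nearest chosen center is furthest away
        centers.append(max(points, key=lambda p: min(d2(p, c) for c in centers)))
    return centers
```

The final max-distance to the chosen centers is at most twice the optimal k-center radius, which makes the heuristic a convenient baseline for evaluating a clustering.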

Fitting/Covering Problems
- For some shapes the problem is convex and hence tractable (MEB / MVE / AAMVE).
- Minimizing the maximum distance is O(n) in 2D but becomes harder in higher dimensions. What happens when d ≠ O(1)?
- Related: least squares / SVM regression / ...
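The O(n) 2D case referred to above is the smallest enclosing circle, solvable in expected linear time by Welzl's randomized algorithm. A self-contained sketch (my own illustration, not code from the talk):

```python
import random

def _circle_from(b):
    # Exact circle through 0, 1, 2, or 3 boundary points.
    if len(b) == 0:
        return (0.0, 0.0), 0.0
    if len(b) == 1:
        return b[0], 0.0
    if len(b) == 2:
        (x1, y1), (x2, y2) = b
        return ((x1 + x2) / 2.0, (y1 + y2) / 2.0), \
               (((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5) / 2.0
    (ax, ay), (bx, by), (cx, cy) = b
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:  # collinear: circle of the farthest pair suffices
        pairs = [[b[0], b[1]], [b[0], b[2]], [b[1], b[2]]]
        return max((_circle_from(p) for p in pairs), key=lambda cr: cr[1])
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return (ux, uy), ((ux - ax) ** 2 + (uy - ay) ** 2) ** 0.5

def _inside(p, c, r):
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 <= (r + 1e-9) ** 2

def welzl(points):
    """Smallest enclosing circle of 2D points, expected O(n) time."""
    pts = list(points)
    random.shuffle(pts)  # randomization gives the expected-linear bound
    boundary = []

    def mec(n):
        if n == 0 or len(boundary) == 3:
            return _circle_from(boundary)
        c, r = mec(n - 1)
        if _inside(pts[n - 1], c, r):
            return c, r
        boundary.append(pts[n - 1])  # pts[n-1] must lie on the boundary
        c, r = mec(n - 1)
        boundary.pop()
        return c, r

    return mec(len(pts))
```

The at-most-3 boundary points that define the final circle are exactly a core-set for this problem, which foreshadows the general theme of the talk.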

Fitting multiple subspaces/shapes?
- Minimizing the maximum distance here is non-convex / non-linear / NP-hard to approximate, yet has many applications.
- Minimizing the sum of (orthogonal) distances: SVD.
- Under other assumptions: GPCA / PPCA / PCA ...

Core-sets are a small subset of the input points whose fitting/covering is "almost" the same as the fitting/covering of the entire input. [AHV06 Survey on Core-Sets]

Core-Sets (figure: the centers and the core-set points).

Core-Sets: Why Care?
- Because they are small, so we can work on large data sets.
- Because solving the optimization problem on the core-set guarantees a near-optimal solution for the entire input.
- Because they are easy to compute; most algorithms are practical.
- Because they exist even for infinite point sets (MEB of balls, ellipsoids, etc.).

Core-Sets: summary of known results for high dimensions (table shown on slide).

High-Level Algorithm (for most core-set algorithms)
1. Compute an initial approximation of the solution and its core-set.
2. Find the furthest violator q.
3. Add q to the current core-set and update the corresponding solution.
4. Go to 2 if the solution is not good enough.
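For the minimum enclosing ball, this loop can be instantiated with the Badoiu-Clarkson center update, which yields a (1 + eps)-approximation after about 1/eps^2 iterations; the furthest points visited form the core-set. A hedged sketch of that instantiation:

```python
import math

def meb_approx(points, eps=0.1):
    """(1 + eps)-approximate minimum enclosing ball via the
    Badoiu-Clarkson update; also returns the core-set it built."""
    dim = len(points[0])
    c = list(points[0])  # initial approximation: any input point
    core = {0}
    for k in range(1, int(math.ceil(1.0 / eps ** 2)) + 1):
        # step 2: find the furthest violator from the current center
        far = max(range(len(points)),
                  key=lambda i: sum((points[i][j] - c[j]) ** 2
                                    for j in range(dim)))
        core.add(far)
        # step 3: move the center 1/(k+1) of the way toward the violator
        c = [c[j] + (points[far][j] - c[j]) / (k + 1) for j in range(dim)]
    radius = max(math.dist(c, p) for p in points)
    return c, radius, sorted(core)
```

Note that the core-set size here depends only on eps, not on the dimension or the number of points, which is the key property exploited throughout the talk.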

Axis-Aligned Minimum Volume Ellipsoids
- Motivation
- Optimization formulation
- Initial volume approximation
- Algorithm
- Computational experiments

Motivation
- Collision detection [Eberly 01] and bounding-volume hierarchies.
- Machine learning [BJKS 04].
- Kernel clustering, between MVEs and MEBs?

Optimization Formulation
The objective is the ellipsoid's volume, expressed via the volume of the unit ball in n-dimensional space, and the resulting formulation is convex. (The equations themselves appear only as images on the slides.)
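Since the slide equations did not survive transcription, here is a hedged reconstruction of a standard AAMVE formulation, assuming the usual log-det MVEE model restricted to a diagonal matrix (not necessarily the exact form used in the talk). With input points x_1, ..., x_m in R^n, substituting z_j = d_j c_j for the center c yields:

```latex
\min_{d \in \mathbb{R}^n_{>0},\; z \in \mathbb{R}^n} \; -\sum_{j=1}^{n} \ln d_j
\quad \text{s.t.} \quad
\sum_{j=1}^{n} \left( d_j x_{ij}^2 - 2 z_j x_{ij} + \frac{z_j^2}{d_j} \right) \le 1,
\qquad i = 1, \dots, m.
```

The ellipsoid is recovered as E = {x : sum_j d_j (x_j - c_j)^2 <= 1} with c_j = z_j / d_j, and its volume is V_n * prod_j d_j^{-1/2}, where V_n is the volume of the unit ball in n dimensions (as on the slide). Each constraint is convex because z_j^2 / d_j is a quadratic-over-linear term, and the objective -sum_j ln d_j is convex, which is consistent with the slide's "Convex" annotation.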

High-Level Algorithm (for most core-set algorithms)
1. Compute an initial approximation of the solution and its core-set.
2. Find the furthest point q from the current solution.
3. Add q to the current core-set and update the corresponding ellipsoid.
4. Go to 2 if the solution is not good enough.

Optimization Formulation: Lemma 1

Initial Volume Approximation (output bound stated on slide).
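The slide's output bound is lost, but a plausible initialization in the same spirit (an assumption on my part, not necessarily the talk's exact IVA) is to seed the core-set with the extreme points along each coordinate axis, so the initial ellipsoid already captures the spread of the data in every axis-aligned direction:

```python
def initial_core_set(points):
    """Hypothetical IVA-style seed: for each coordinate axis, take the
    points attaining the minimum and maximum coordinate (at most 2n
    points in n dimensions). Their axis-aligned bounding box already
    constrains the volume of any axis-aligned enclosing ellipsoid."""
    n = len(points[0])
    core = set()
    for j in range(n):
        core.add(min(range(len(points)), key=lambda i: points[i][j]))
        core.add(max(range(len(points)), key=lambda i: points[i][j]))
    return sorted(core)
```

Such a seed gives the main loop a feasible starting ellipsoid whose volume is within a dimension-dependent factor of optimal, which is what an initial volume approximation is for.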

Feasible solution of (LD). Furthest point from current ellipsoid. Quality measure of current ellipsoid.

Increase the weight of the furthest point while decreasing the weights of the remaining points, maintaining feasibility for (LD).
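This weight update can be written generically; in Khachiyan/Frank-Wolfe style schemes the step size beta is chosen in closed form to maximize the dual objective, but the feasibility-preserving shift itself is just (a sketch, with the step size left as a parameter):

```python
def update_weights(u, i, beta):
    """One dual step: move a beta fraction of the total weight onto the
    furthest violator i, scaling every other weight down by (1 - beta)
    so the weights remain a probability vector (feasible for (LD))."""
    u = [(1 - beta) * w for w in u]
    u[i] += beta
    return u
```

Because the update only rescales and shifts mass, nonnegativity and the sum-to-one constraint of (LD) are preserved for any beta in [0, 1].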

What does the algorithm output? Complexity bound (stated on slide).

Computational Experiments: implementation in MATLAB.

Future Work
- MVE/AAMVE with outliers
- k-AAMVE coverings
- Distribution-dependent tighter core-set bounds?
- Better practical methods?
Acknowledgements: NSF CAREER for support.