Core-Sets and Geometric Optimization problems. Piyush Kumar Department of Computer Science Florida State University http://piyush.compgeom.com Email: piyush@acm.org Joint work with Alper Yildirim
Talk Outline Introduction to Core-Sets for Geometric Optimization Problems. Problems and Applications Minimum Enclosing Balls (Next talk) Axis Aligned Minimum Volume Ellipsoids Motivation Optimization Formulation/IVA Algorithm Computational Experiments Future Directions.
Geometric Clustering In order to cluster, we need: Points (Polyhedra / Balls / Ellipsoids?) A distance measure (assume Euclidean unless otherwise stated). A method for evaluating our clustering. (We look at k-centers, 1-Ecenter, 1-center, kernel 1-center, k-line-centers)
Fitting/Covering Problems For some shapes, the problem is convex and hence tractable (MEB / MVE / AAMVE). Minimizing the maximum distance takes O(n) time in 2D but becomes harder in higher dimensions. Least squares / SVM Regression / … d ≠ O(1)?
Fitting multiple subspaces/shapes? Non-Convex (Min the max distance) / Non-Linear / NP-Hard to approx. Has many applications Minimize sum of distances (orthogonal) : SVD Other assumptions : GPCA/PPCA/PCA…
Core-sets are small subsets of the input points whose fitting/covering is “almost” the same as the fitting/covering of the entire input. [AHV06 Survey on Core-Sets]
Core-Sets [Figure: centers and core-set points]
Core-Sets: Why Care? Because they are small! Hence we can work on large data sets. Because if we can solve the optimization problem for the core-set, we are guaranteed to be near the optimum for the entire input. Because they are easy to compute. Most algorithms are practical. Because they exist even for infinite point sets (MEB of balls, ellipsoids, etc.)
Core-Sets Summary of known results for high dimensions.
High Level Algorithm (for most core-set algorithms) 1. Compute an initial approximation of the solution and its core-set. 2. Find the furthest violator q. 3. Add q to the current core-set and update the corresponding solution. 4. Go to 2 if the solution is not good enough.
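The four steps above can be sketched for the minimum enclosing ball (MEB) case. This is a minimal illustration, not the talk's implementation. It assumes the simple center-update rule known from the core-set literature (move the center toward the furthest point by a fraction 1/(i+1)) and a fixed budget of about 1/eps² iterations as the "good enough" test. The function name `meb_coreset` and its parameters are hypothetical.

```python
import math


def meb_coreset(points, eps):
    """Approximate minimum enclosing ball via repeated furthest-point updates.

    Returns (center, radius, core_indices). The returned ball always
    encloses every input point because the radius is measured at the end.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Step 1: the initial solution is the first point, which is also the core-set.
    center = list(points[0])
    core = [0]

    # ASSUMPTION: "good enough" is replaced by a fixed iteration budget.
    iterations = math.ceil(1.0 / eps ** 2)
    for i in range(1, iterations + 1):
        # Step 2: furthest violator from the current center.
        q = max(range(len(points)), key=lambda j: dist(center, points[j]))
        # Step 3: add it to the core-set and move the center toward it.
        if q not in core:
            core.append(q)
        center = [c + (p - c) / (i + 1) for c, p in zip(center, points[q])]
        # Step 4: loop back to step 2.

    radius = max(dist(center, p) for p in points)
    return center, radius, core
```

Only the points that were ever selected as furthest violators end up in `core`, which is why the core-set stays small even when the input is large.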
Axis Aligned Minimum Volume Ellipsoids Motivation. Optimization Formulation. Initial Volume Approximation. Algorithm. Computational Experiments.
Motivation Collision Detection [Eberly 01] Bounding volume Hierarchies Machine Learning [BJKS 04] Kernel clustering between MVEs and MEBs?
Optimization Formulation Volume of unit ball in n-dim space.
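As a small numerical aid, the volume of an axis-aligned ellipsoid can be computed from the volume of the unit ball. The sketch below assumes the ellipsoid is written as {x : Σ_j d_j (x_j − c_j)² ≤ 1} with all d_j > 0. This notation is an assumption, since the slide's formula did not survive extraction. Both function names are hypothetical.

```python
import math


def unit_ball_volume(n):
    """Volume of the unit Euclidean ball in n dimensions."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)


def axis_aligned_ellipsoid_volume(d):
    """Volume of {x : sum_j d[j] * (x[j] - c[j])**2 <= 1} with all d[j] > 0.

    ASSUMPTION: this diagonal parametrization is illustrative notation.
    The semi-axis along coordinate j has length 1 / sqrt(d[j]).
    """
    n = len(d)
    semi_axes = [1.0 / math.sqrt(dj) for dj in d]
    return unit_ball_volume(n) * math.prod(semi_axes)
```

Under this parametrization, minimizing the volume amounts to minimizing −Σ_j log d_j, since the unit-ball factor is a constant for fixed n.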
Optimization Formulation Convex
Optimization Formulation
High Level Algorithm (for most core-set algorithms) 1. Compute an initial approximation of the solution and its core-set. 2. Find the furthest point q from the current solution. 3. Add q to the current core-set and update the corresponding ellipsoid. 4. Go to 2 if the solution is not good enough.
Optimization Formulation: Lemma 1
Initial Volume Approximation Output :
Feasible solution of (LD) Furthest point from current ellipsoid. Quality measure of current ellipsoid.
Increase weight for furthest point while decreasing it for remaining Points ensuring feasibility for (LD)
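The weight-shifting step can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm. It assumes the dual weights u form a probability vector over the points. It also assumes the ellipsoid for a given u has the weighted mean as its center and d_j = 1 / (n · weighted variance of coordinate j), which follows from a standard Lagrangian derivation rather than from the slides. The step size 1/(t+2) is a generic diminishing choice, not the step size used in the talk. The name `aamve_sketch` is hypothetical.

```python
def aamve_sketch(points, iterations=200):
    """Weight-shifting sketch for an axis-aligned enclosing ellipsoid.

    Returns (center, d, kappa): every point x satisfies
    sum_j d[j] * (x[j] - center[j])**2 <= 1, and kappa >= 1 is the
    quality measure of the last unscaled ellipsoid (1 would mean optimal).
    Requires the points to vary in every coordinate.
    """
    m, n = len(points), len(points[0])
    u = [1.0 / m] * m  # uniform weights: a feasible starting point

    def ellipsoid(u):
        # ASSUMPTION: center = weighted mean, d[j] = 1 / (n * weighted variance).
        c = [sum(ui * p[j] for ui, p in zip(u, points)) for j in range(n)]
        var = [sum(ui * (p[j] - c[j]) ** 2 for ui, p in zip(u, points))
               for j in range(n)]
        return c, [1.0 / (n * v) for v in var]

    def dist(p, c, d):
        return sum(dj * (pj - cj) ** 2 for dj, pj, cj in zip(d, p, c))

    for t in range(iterations):
        c, d = ellipsoid(u)
        # Furthest point from the current ellipsoid.
        k = max(range(m), key=lambda i: dist(points[i], c, d))
        # ASSUMPTION: generic diminishing step, not the authors' step size.
        beta = 1.0 / (t + 2)
        # Raise the weight of point k and shrink the rest; the sum stays 1.
        u = [(1.0 - beta) * ui for ui in u]
        u[k] += beta

    c, d = ellipsoid(u)
    kappa = max(dist(p, c, d) for p in points)
    # Inflate by kappa so the returned ellipsoid encloses all points.
    return c, [dj / kappa for dj in d], kappa
```

Under these assumptions the weighted average of the ellipsoidal distances is exactly 1, so the largest distance `kappa` is at least 1 and serves as the quality measure: the closer it is to 1, the less the ellipsoid has to be inflated to cover every point.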
What does the algorithm output? Complexity:
Computational Experiments Implementation in Matlab
Future Work MVE/AAMVE with outliers. k-AAMVE coverings. Distribution-dependent tighter core-set bounds? Better practical methods? Acknowledgements NSF CAREER for support.