Lecture 16: Earth-Mover Distance

Slides:



Advertisements
Similar presentations
Routing Complexity of Faulty Networks Omer Angel Itai Benjamini Eran Ofek Udi Wieder The Weizmann Institute of Science.
Advertisements

Lower Bounds for Additive Spanners, Emulators, and More David P. Woodruff MIT and Tsinghua University To appear in FOCS, 2006.
Embedding Metric Spaces in Their Intrinsic Dimension Ittai Abraham, Yair Bartal*, Ofer Neiman The Hebrew University * also Caltech.
A Dependent LP-Rounding Approach for the k-Median Problem Moses Charikar 1 Shi Li 1 1 Department of Computer Science Princeton University ICALP 2012, Warwick,
On Complexity, Sampling, and -Nets and -Samples. Range Spaces A range space is a pair, where is a ground set, it’s elements called points and is a family.
Nearest Neighbor Search in High Dimensions Seminar in Algorithms and Geometry Mica Arie-Nachimson and Daniel Glasner April 2009.
Embedding the Ulam metric into ℓ 1 (Ενκρεβάτωση του μετρικού χώρου Ulam στον ℓ 1 ) Για το μάθημα “Advanced Data Structures” Αντώνης Αχιλλέως.
Algorithmic High-Dimensional Geometry 1 Alex Andoni (Microsoft Research SVC)
Overcoming the L 1 Non- Embeddability Barrier Robert Krauthgamer (Weizmann Institute) Joint work with Alexandr Andoni and Piotr Indyk (MIT)
 Distance Problems: › Post Office Problem › Nearest Neighbors and Closest Pair › Largest Empty and Smallest Enclosing Circle  Sub graphs of Delaunay.
Searching on Multi-Dimensional Data
MIT CSAIL Vision interfaces Towards efficient matching with random hashing methods… Kristen Grauman Gregory Shakhnarovich Trevor Darrell.
Metric Embeddings with Relaxed Guarantees Hubert Chan Joint work with Kedar Dhamdhere, Anupam Gupta, Jon Kleinberg, Aleksandrs Slivkins.
A Nonlinear Approach to Dimension Reduction Lee-Ad Gottlieb Weizmann Institute of Science Joint work with Robert Krauthgamer TexPoint fonts used in EMF.
Data Structures and Functional Programming Algorithms for Big Data Ramin Zabih Cornell University Fall 2012.
Complexity 16-1 Complexity Andrei Bulatov Non-Approximability.
Polynomial Time Approximation Scheme for Euclidian Traveling Salesman
Probabilistic Methods in Coding Theory: Asymmetric Covering Codes Joshua N. Cooper UCSD Dept. of Mathematics Robert B. Ellis Texas A&M Dept. of Mathematics.
Lecture 6: Point Location Computational Geometry Prof. Dr. Th. Ottmann 1 Point Location 1.Trapezoidal decomposition. 2.A search structure. 3.Randomized,
Sketching and Embedding are Equivalent for Norms Alexandr Andoni (Simons Inst. / Columbia) Robert Krauthgamer (Weizmann Inst.) Ilya Razenshteyn (MIT, now.
Dimensionality Reduction
Distance scales, embeddings, and efficient relaxations of the cut cone James R. Lee University of California, Berkeley.
Embedding and Sketching Non-normed spaces Alexandr Andoni (MSR)
Volume distortion for subsets of R n James R. Lee Institute for Advanced Study & University of Washington Symposium on Computational Geometry, 2006; Sedona,
Efficient Partition Trees Jiri Matousek Presented By Benny Schlesinger Omer Tavori 1.
Edge-disjoint induced subgraphs with given minimum degree Raphael Yuster 2012.
Similarity Searching in High Dimensions via Hashing Paper by: Aristides Gionis, Poitr Indyk, Rajeev Motwani.
1 Embedding and Similarity Search for Point Sets under Translation Minkyoung Cho and David M. Mount University of Maryland SoCG 2008.
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality Piotr Indyk, Rajeev Motwani The 30 th annual ACM symposium on theory of computing.
Lower Bounds for Embedding Edit Distance into Normed Spaces A. Andoni, M. Deza, A. Gupta, P. Indyk, S. Raskhodnikova.
11 Lecture 24: MapReduce Algorithms Wrap-up. Admin PS2-4 solutions Project presentations next week – 20min presentation/team – 10 teams => 3 days – 3.
A Binary Linear Programming Formulation of the Graph Edit Distance Presented by Shihao Ji Duke University Machine Learning Group July 17, 2006 Authors:
Sparse RecoveryAlgorithmResults  Original signal x = x k + u, where x k has k large coefficients and u is noise.  Acquire measurements Ax = y. If |x|=n,
Coarse Differentiation and Planar Multiflows
Encountering Euler's Number
Information Complexity Lower Bounds
Stream-based Geometric Algorithms
Lecture 22: Linearity Testing Sparse Fourier Transform
Haim Kaplan and Uri Zwick
Spectral Clustering.
Lecture 18: Uniformity Testing Monotonicity Testing
Sublinear Algorithmic Tools 3
Lecture 11: Nearest Neighbor Search
Computability and Complexity
COMS E F15 Lecture 2: Median trick + Chernoff, Distinct Count, Impossibility Results Left to the title, a presenter can insert his/her own image.
Lecture 10: Sketching S3: Nearest Neighbor Search
Sketching and Embedding are Equivalent for Norms
Lecture 4: CountSketch High Frequencies
Lecture 7: Dynamic sampling Dimension Reduction
Enumerating Distances Using Spanners of Bounded Degree
Randomized Algorithms CS648
Near-Optimal (Euclidean) Metric Compression
עידן שני ביה"ס למדעי המחשב אוניברסיטת תל-אביב
Yair Bartal Lee-Ad Gottlieb Hebrew U. Ariel University
Locality Sensitive Hashing
Overcoming the L1 Non-Embeddability Barrier
Introduction Wireless Ad-Hoc Network
Fair Clustering through Fairlets ( NIPS 2017)
Chapter 11 Limitations of Algorithm Power
Dimension versus Distortion a.k.a. Euclidean Dimension Reduction
Embedding Metrics into Geometric Spaces
CS5112: Algorithms and Data Structures for Applications
Lecture 6: Counting triangles Dynamic graphs & sampling
Lecture 15: Least Square Regression Metric Embeddings
Clustering.
President’s Day Lecture: Advanced Nearest Neighbor Search
Approximating Edit Distance in Near-Linear Time
Calibration and homographies
Switching Lemmas and Proof Complexity
Locality In Distributed Graph Algorithms
Presentation transcript:

Lecture 16: Earth-Mover Distance COMS E6998-9 F15 Lecture 16: Earth-Mover Distance Left to the title, a presenter can insert his/her own image pertinent to the presentation.

Administrivia, Plan Administrivia: Plan: Scriber? NO CLASS next Tuesday 11/3 (holiday) Plan: Earth-Mover Distance Scriber?

Earth-Mover Distance Definition: Which metric space? Given two sets 𝐴, 𝐵 of points in a metric space 𝐸𝑀𝐷(𝐴,𝐵) = min cost bipartite matching between 𝐴 and 𝐵 Which metric space? Can be plane, ℓ 2 , ℓ 1 … Applications in image vision Images courtesy of Kristen Grauman

Embedding EMD into ℓ 1 Why ℓ 1 ? At least as hard as ℓ 1 Can embed 0,1 𝑑 into EMD with distortion 1 ℓ 1 is richer than ℓ 2 Will focus on integer grid Δ 2 :

Embedding EMD into ℓ 1 [Charikar’02, Indyk-Thaper’03] Theorem: Can embed EMD over Δ 2 into ℓ 1 with distortion 𝑂 log Δ . In fact, will construct a randomized 𝑓: 2 Δ 2 → ℓ 1 such that: for any 𝐴,𝐵⊂ Δ 2 : 𝐸𝑀𝐷 𝐴,𝐵 ≤𝑬 ||𝑓 𝐴 −𝑓 𝐵 | | 1 ≤𝑂 log Δ ⋅𝐸𝑀𝐷(𝐴,𝐵) time to embed a set of 𝑠 points: 𝑂 𝑠 log Δ . Consequences: Nearest Neighbor Search: 𝑂(𝑐 log Δ ) approximation with 𝑂(𝑠 𝑛 1+1/𝑐 ) space, and 𝑂( 𝑛 1/𝑐 ⋅𝑠 log Δ ) query time. Computation: 𝑂(log⁡Δ) approximation in 𝑂 𝑠 log Δ time Best known: 1+𝜖 approximation in 𝑂 𝑠 time [AS’12]

What if 𝐴 ≠|𝐵| ? Suppose: Define where 𝐴 =𝑎 𝐵 =𝑏<𝑎 Define 𝐸𝑀𝐷 ∆ 𝐴,𝐵 =Δ 𝑎−𝑏 + 𝑚𝑖𝑛 𝐴′,𝜋 𝑎∈𝐴′ 𝑑(𝑎,𝜋 𝑎 ) where 𝐴 ′ ranges over all subsets of 𝐴 of size 𝑏 𝜋:𝐴′→ 𝐵 ranges over all 1-to-1 mappings For optimal 𝐴′, call 𝑎∈𝐴\ 𝐴 ′ unmatched

Embedding EMD over small grid Suppose =3 𝑓(𝐴) has nine coordinates, counting # points in each integer point 𝑓(𝐴)=(2,1,1,0,0,0,1,0,0) 𝑓(𝐵)=(1,1,0,0,2,0,0,0,1) Claim: 2 2 distortion embedding

High level embedding Set in Δ 2 box Embedding of set 𝐴: Want to prove take a quad-tree grid of cell size Δ/3 partition each cell in 3x3 recurse until of size 3x3 randomly shift it Each cell gives a coordinate: 𝑓 (𝐴)𝑐=#points in the cell 𝑐 Want to prove 𝑬 𝑓 𝐴 −𝑓 𝐵 1 ≈𝐸𝑀𝐷(𝐴,𝐵) 2 1 1 1 2 2 2 1 2 1 𝑓 𝑨 = …2210… 0002…0011…0100…0000… 𝑓 𝑩 = …1202… 0100…0011…0000…1100…

Main idea: intuition Decompose EMD over []2 into EMDs over smaller grids Recursively reduce to =𝑂(1) ≈ +

Decomposition Lemma 𝑘 /𝑘 For randomly-shifted cut-grid 𝐺 of side length 𝑘, will prove: 1) 𝐸𝑀𝐷(𝐴,𝐵) ≤ 𝐸𝑀 𝐷 𝑘 (𝐴1, 𝐵1) + 𝐸𝑀 𝐷 𝑘 (𝐴2,𝐵2)+… + 𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 (𝐴𝐺, 𝐵𝐺) 2) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥ 1 3 𝑬[𝐸𝑀 𝐷 𝑘 𝐴 1 , 𝐵 1 +𝐸𝑀 𝐷 𝑘 𝐴 2 , 𝐵 2 +…] 3) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 )] The distortion will follow by applying the lemma recursively to (AG,BG) 𝑘 /𝑘

1 (lower bound) Claim 1: for a randomly-shifted cut-grid 𝐺 of side length 𝑘: 𝐸𝑀𝐷(𝐴,𝐵) ≤ 𝐸𝑀𝐷𝑘(𝐴1, 𝐵1) + 𝐸𝑀𝐷𝑘(𝐴2,𝐵2)+… + 𝑘⋅ 𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 ) Construct a matching 𝜋 for 𝐸𝑀 𝐷 Δ 𝐴,𝐵 from the matchings on RHS as follows For each 𝑎𝐴 (suppose 𝑎 𝐴 𝑖 ) it is either: 1) matched in 𝐸𝑀𝐷( 𝐴 𝑖 , 𝐵 𝑖 ) to some 𝑏 𝐵 𝑖 (if 𝑎∈ 𝐴 𝑖 ′) then 𝜋 𝑎 =𝑏 2) or 𝑎∉ 𝐴 𝑖 ′, and then it is matched in 𝐸𝑀𝐷( 𝐴 𝐺 , 𝐵 𝐺 ) to some 𝑏 𝐵 𝑗 (𝑗≠𝑖) Cost? 1) paid by 𝐸𝑀𝐷( 𝐴 𝑖 , 𝐵 𝑖 ) 2) Move 𝑎 to center () Charge to 𝐸𝑀𝐷( 𝐴 𝑖 , 𝐵 𝑖 ) Move from cell 𝑖 to cell 𝑗 Charge 𝑘 to 𝐸𝑀𝐷( 𝐴 𝐺 , 𝐵 𝐺 ) If 𝐴 >|𝐵|, extra |𝐴|−|𝐵| pay 𝑘⋅ Δ 𝑘 =Δ on LHS & RHS 𝑘 /𝑘

2 & 3 (upper bound) Claims 2,3: for a randomly-shifted cut-grid 𝐺 of side length 𝑘, we have: 2) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥ 1 3 𝑬[𝐸𝑀 𝐷 𝑘 𝐴 1 , 𝐵 1 +𝐸𝑀 𝐷 𝑘 𝐴 2 , 𝐵 2 +…] 3) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 )] Fix a matching 𝜋 minimizing 𝐸𝑀 𝐷 Δ (𝐴,𝐵) Will construct matchings for each EMD on RHS Uncut pairs 𝑎,𝑏 ∈𝜋 are matched in respective (𝐴,𝐵) Cut pairs 𝑎,𝑏 ∈𝜋 : are unmatched in their mini-grids are matched in ( 𝐴 𝐺 , 𝐵 𝐺 )

3: Cost Claim 2: 3⋅𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝐸𝑀 𝐷 𝑘 𝐴 1 , 𝐵 1 +𝐸𝑀 𝐷 𝑘 𝐴 2 , 𝐵 2 +…] Uncut pairs (𝑎,𝑏) are matched in respective ( 𝐴 𝑖 , 𝐵 𝑖 ) Total contribution from uncut pairs ≤𝐸𝑀 𝐷 Δ (𝐴,𝐵) Consider a cut pair (𝑎,𝑏) at distance 𝑎−𝑏=( 𝑑 𝑥 , 𝑑 𝑦 ) (𝑎,𝑏) can contribute to RHS as they may be unmatched in their own mini-grids Pr[(𝑎,𝑏) cut] =1− 1− 𝑑 𝑥 𝑘 + 1− 𝑑 𝑦 𝑘 + ≤ 𝑑 𝑥 𝑘 + 𝑑 𝑦 𝑘 ≤ 1 𝑘 ||𝑎−𝑏| | 2 Expected contribution of (𝑎,𝑏) to RHS: ≤Pr[(𝑎,𝑏) cut] ⋅2𝑘≤2 𝑎−𝑏 2 Total expected cost contributed to RHS: 2⋅𝐸𝑀 𝐷 Δ (𝐴,𝐵) Total (cut & uncut pairs): 3⋅𝐸𝑀 𝐷 Δ (𝐴,𝐵) 𝑘 𝑑 𝑥

3: Cost Claim: 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 )] 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 )] Uncut pairs: contribute zero to RHS! Cut pair: 𝑎,𝑏 ∈𝜋 with 𝑎−𝑏=( 𝑑 𝑥 , 𝑑 𝑦 ) if 𝑑 𝑥 = 𝑥𝑘+ 𝑟 𝑘 , and 𝑑 𝑦 =𝑦𝑘+ 𝑟 𝑦 , then expected cost contribution to 𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 ): ≤ 𝑥+ 𝑟 𝑥 𝑘 ⋅𝑘 + 𝑦+ 𝑟 𝑦 𝑘 ⋅𝑘 = 𝑑 𝑥 + 𝑑 𝑦 = 𝑎−𝑏 2 Total expected cost ≤ 𝐸𝑀 𝐷 Δ (𝐴,𝐵) 𝑘 𝑘 𝑘 𝑑 𝑥

Recurse on decomposition For randomly-shifted cut-grid 𝐺 of side length 𝑘, we have: 1) 𝐸𝑀𝐷(𝐴,𝐵) ≤ 𝐸𝑀 𝐷 𝑘 (𝐴1, 𝐵1) + 𝐸𝑀 𝐷 𝑘 (𝐴2,𝐵2)+… + 𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 (𝐴𝐺, 𝐵𝐺) 2) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥ 1 3 𝑬[𝐸𝑀 𝐷 𝑘 𝐴 1 , 𝐵 1 +𝐸𝑀 𝐷 𝑘 𝐴 2 , 𝐵 2 +…] 3) 𝐸𝑀 𝐷 Δ 𝐴,𝐵 ≥𝑬[𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 , 𝐵 𝐺 )] We applying decomposition recursively for 𝑘=3 Choose randomly-shifted cut-grid 𝐺 1 on []2 Obtain many grids [3]2, and a big grid [/3]2 Then choose randomly-shifted cut-grid 𝐺 2 on [/3]2 Obtain more grids [3]2, and another big grid [/9]2 Then choose randomly-shifted cut-grid 𝐺 3 on [/9]2 … Then, embed each of the small grids [3]2 into ℓ1, using 𝑂(1) distortion embedding, and concatenate the embeddings Each 3 2 grid occupies 9 coordinates on ℓ 1 embedding

Proving recursion works Claim: embedding contracts distances by 𝑂 1 : 𝐸𝑀𝐷(𝐴,𝐵) ≤ ≤∑𝑖 𝐸𝑀𝐷𝑘 𝐴𝑖, 𝐵𝑖 +𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 1 , 𝐵 𝐺 1 ) ≤∑𝑖 𝐸𝑀 𝐷 𝑘 𝐴𝑖, 𝐵𝑖 + 𝑘∑𝑖 𝐸𝑀𝐷𝑘 𝐴 𝐺 1 ,𝑖 , 𝐵 𝐺 1 ,𝑖 +𝑘⋅ 𝐸𝑀 𝐷 Δ 𝑘 2 𝐴 𝐺 2 , 𝐵 𝐺 2 ≤ … ≤ sum of 𝐸𝑀 𝐷 3 costs of 3×3 instances ≤ 1 2 2 ||𝑓 𝐴 −𝑓 𝐵 | | 1 Claim: embedding distorts distances by 𝑂 log Δ in expectation: 3 log 𝑘 Δ ⋅𝐸𝑀𝐷(𝐴,𝐵) ≥3⋅𝐸𝑀𝐷(𝐴, 𝐵) + 3 log 𝑘 Δ 𝑘 ⋅𝐸𝑀 𝐷 Δ (𝐴, 𝐵) ≥𝐄[ ∑𝑖 𝐸𝑀 𝐷 𝑘 (𝐴𝑖, 𝐵𝑖) + 3 log 𝑘 Δ 𝑘 ⋅𝑘⋅𝐸𝑀 𝐷 Δ/𝑘 ( 𝐴 𝐺 1 , 𝐵 𝐺 1 ) ] ≥… ≥ sum of 𝐸𝑀 𝐷 3 costs of 3×3 instances ≥ ||𝑓 𝐴 −𝑓 𝐵 | | 1

Final theorem Theorem: can embed EMD over [Δ]2 into ℓ 1 with 𝑂( log Δ ) distortion in expectation. Notes: Dimension required: 𝑂( Δ 2 ), but a set 𝐴 of size 𝑠 maps to a vector that has only 𝑂(𝑠⋅ log Δ ) non-zero coordinates. Time: can compute in 𝑂(𝑠⋅log⁡) By Markov’s, it’s 𝑂( log Δ ) distortion with 90% probability Applications: Can compute 𝐸𝑀𝐷(𝐴,𝐵) in time 𝑂(𝑠⋅ log Δ ) NNS: 𝑂(𝑐⋅log Δ) approximation, with 𝑂( 𝑛 1+1/𝑐 ⋅𝑠) space, and 𝑂( 𝑛 1/𝑐 ⋅𝑠⋅log Δ) query time.