Approximate Near Neighbors for General Symmetric Norms


Presentation transcript:

Approximate Near Neighbors for General Symmetric Norms Ilya Razenshteyn (MIT CSAIL) joint with Alexandr Andoni (Columbia University) Aleksandar Nikolov (University of Toronto) Erik Waingarten (Columbia University) arXiv:1611.06222

Nearest Neighbor Search: motivation. Data maps to a feature vector space with a distance function; data analysis maps to geometry / linear algebra / optimization; similarity search maps to Nearest Neighbor Search.

An example: word embeddings. Word embeddings are high-dimensional vectors that capture semantic similarity between words (and more). GloVe [Pennington, Socher, Manning 2014]: 400K words, 300 dimensions. Ten nearest neighbors for "NYU"? (Shown on the slide: Yale, Harvard, graduate, faculty, undergraduate, Juilliard, university, undergraduates, Cornell, MIT.)

Approximate Near Neighbors (ANN). Dataset: n points in a metric space X (denote them by P). Approximation c > 1, distance threshold r > 0. Query: q ∈ X such that there is p* ∈ P with d_X(q, p*) ≤ r. Want: any p ∈ P such that d_X(q, p) ≤ cr. Parameters: space, query time.
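To make the (c, r)-ANN definition concrete, here is a minimal brute-force reference sketch in Python (a hypothetical helper, not one of the data structures discussed in this talk): any point within distance cr of the query is an acceptable answer whenever some point lies within distance r.

```python
import numpy as np

def ann_query(P, q, r, c, dist):
    """Return any p in P with dist(q, p) <= c*r; None if nothing qualifies.
    The ANN promise: if some p* in P has dist(q, p*) <= r, a valid answer exists."""
    for p in P:
        if dist(q, p) <= c * r:
            return p
    return None

rng = np.random.default_rng(0)
P = rng.standard_normal((100, 8))
q = P[0] + 0.05 * rng.standard_normal(8)
print(ann_query(P, q, r=0.5, c=2.0, dist=lambda a, b: np.linalg.norm(a - b)))
```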

FAQ. Q: why approximation? A: the exact case is hard for the high-dimensional problem. Q: what does "high-dimensional" mean? A: when d = ω(log n), where d is the dimension of the metric; the complexity must depend on d as 2^{o(d)}, ideally as d^{O(1)}. Q: how is the dimension defined? A: a metric is typically defined on R^d; alternatively, doubling dimension, etc. Focus of this talk: a metric on R^d, where ω(log n) ≤ d ≤ n^{o(1)}.

Which distance function to use? A distance function must capture semantic similarity well and must be algorithmically tractable. Word embeddings, etc.: cosine similarity. The goal: classify metrics according to the complexity of high-dimensional ANN. For theory: a poorly-understood property of a metric. For practice: a universal algorithm for ANN.

High-dimensional norms. An important case: X is a normed space, d_X(x_1, x_2) = ‖x_1 − x_2‖, where ‖·‖: R^d → R_+ satisfies ‖x‖ = 0 iff x = 0, ‖αx‖ = |α|·‖x‖, and ‖x_1 + x_2‖ ≤ ‖x_1‖ + ‖x_2‖. Lots of tools are available (linear functional analysis). [Andoni, Krauthgamer, R 2015] characterizes norms that allow efficient sketching (succinct summarization), which implies efficient ANN. Approximation O(√d) is easy (John's theorem).

Unit balls. A norm can be given by its unit ball B_X = {x ∈ R^d : ‖x‖ ≤ 1}. Claim: B_X is a symmetric convex body. Claim: any such body K can serve as a unit ball, via ‖x‖_K = inf{t > 0 : x/t ∈ K}. What property of a convex body makes ANN with respect to it tractable? John's theorem: any symmetric convex body is close to an ellipsoid (gives approximation √d).
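As a small illustration of the functional ‖x‖_K = inf{t > 0 : x/t ∈ K}, the following sketch recovers the norm by binary search, assuming only a membership oracle for K; with K the l_1 unit ball it should agree with Σ_i |x_i|.

```python
import numpy as np

def minkowski_norm(x, in_K, hi=1e6, iters=60):
    """||x||_K = inf { t > 0 : x/t in K }, found by binary search on t."""
    lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_K(x / mid):   # x/mid inside K means ||x||_K <= mid
            hi = mid
        else:
            lo = mid
    return hi

in_l1_ball = lambda v: np.sum(np.abs(v)) <= 1.0   # membership oracle for the l_1 ball
x = np.array([1.0, -2.0, 0.5])
print(minkowski_norm(x, in_l1_ball), np.sum(np.abs(x)))   # both approximately 3.5
```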

Our result. If X is a symmetric normed space (invariant under permutations of coordinates and sign changes) and d = n^{o(1)} (more precisely, d = 2^{o(log n / log log n)}), we can solve ANN with: approximation (log log n)^{O(1)}, space n^{1+o(1)}, query time n^{o(1)}.

Examples. Usual l_p norms: ‖x‖_p = (Σ_i |x_i|^p)^{1/p}. Top-k norm: the sum of the k largest absolute values of coordinates; interpolates between l_1 and l_∞. Orlicz norms: the unit ball is {x ∈ R^d : Σ_i G(|x_i|) ≤ 1}, where G(·) is convex, non-negative, and G(0) = 0; gives l_p norms for G(t) = t^p. Also: k-support norm, box-Θ norm, K-functional (arise in probability and machine learning).
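A short sketch of the top-k norm, checking the interpolation claim: k = 1 gives l_∞ and k = d gives l_1 (the helper name top_k_norm is ours, just for illustration).

```python
import numpy as np

def top_k_norm(x, k):
    """Sum of the k largest absolute values of the coordinates of x."""
    return np.sort(np.abs(x))[::-1][:k].sum()

x = np.array([3.0, -1.0, 4.0, 1.0, -5.0])
print(top_k_norm(x, 1), np.max(np.abs(x)))        # k = 1: the l_inf norm
print(top_k_norm(x, len(x)), np.abs(x).sum())     # k = d: the l_1 norm
```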

Prior work: symmetric norms. [Blasiok, Braverman, Chestnut, Krauthgamer, Yang 2015]: a classification of symmetric norms according to their streaming complexity, which depends on how well the norm concentrates on the Euclidean ball. Unlike streaming, ANN is always tractable.

Prior work: ANN. Mostly focused on the l_1 (Hamming/Manhattan) and l_2 (Euclidean) norms, which work for many applications and allow efficient algorithms based on hashing: Locality-Sensitive Hashing [Indyk, Motwani 1998] [Andoni, Indyk 2006]; data-dependent LSH [Andoni, Indyk, Nguyen, R 2014] [Andoni, R 2015]; [Andoni, Laarhoven, R, Waingarten 2017]: tight trade-off between space and query time for every c > 1. Few results for other norms (l_∞, general l_p; will see later).

ANN for l_∞. [Indyk 1998] ANN for d-dimensional l_∞: space d·n^{1+ε}, query time O(d log n), approximation O(ε^{-1}·log log d). Main idea: recursive partitioning. A "small" ball with Ω(n) points is easy; if there is no such ball, there is a "good" cut with respect to some coordinate. [Andoni, Croitoru, Patrascu 2008] [Kapralov, Panigrahy 2012]: approximation O(log log d) is tight for decision trees!

Metric embeddings. A map f: X → Y is an embedding with distortion C if for all a, b ∈ X: d_Y(f(a), f(b))/C ≤ d_X(a, b) ≤ d_Y(f(a), f(b)). Embeddings give reductions for geometric problems: ANN with approximation D for Y yields ANN with approximation CD for X.

Embedding norms into l_∞. For a normed space X and ε > 0 there exists f: X → l_∞^{d'} with ‖f(x)‖_∞ ∈ (1 ± ε)·‖x‖_X. Proof idea: ‖x‖_X ≈ max_{y∈N} ⟨x, y⟩; take all directions and discretize (more details later). Can we combine it with ANN for l_∞ and obtain ANN for any norm? No! Discretization requires d' = (1/ε)^{O(d)}, and this is tight even for l_2. Still, the resulting approximation O(log log d') = O(log d) ≪ √d.

The strategy.
Any norm embeds into l_∞^{d'} with dimension d' = 2^{O(d)}.
A symmetric norm embeds into ⨁_{l_∞} ⨁_{l_1} (top-k_ij norm) with dimension d' = d^{O(log log d)}.
Bypass non-embeddability into low-dimensional l_∞ by allowing a more complicated host space, which is still tractable.

l_p-direct sums of metric spaces. For metrics M_1, M_2, …, M_t, define ⨁_{l_p} M_i as follows: the ground set is M_1 × M_2 × … × M_t, and the distance is d((x_1, …, x_t), (y_1, …, y_t)) = ‖(d(x_1, y_1), d(x_2, y_2), …, d(x_t, y_t))‖_p. Example: ⨁_{l_p} l_q gives cascaded norms. Our host space: ⨁_{l_∞} ⨁_{l_1} X_ij, where X_ij is R^d equipped with the top-k_ij norm; the outer sum has size d^{O(log log d)} and the inner sum has size d.
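A sketch of the distance in the host space ⨁_{l_∞} ⨁_{l_1} X_ij: an outer l_∞-sum (a max over blocks) of inner l_1-sums of top-k_ij norms. The block sizes and k values below are made up for illustration.

```python
import numpy as np

def top_k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

def host_distance(X, Y, ks):
    """X, Y: arrays of shape (outer, inner, d); ks[i][j] is the k for block (i, j).
    Inner blocks are combined with an l_1-sum, outer blocks with an l_inf-sum (max)."""
    inner = [sum(top_k_norm(X[i, j] - Y[i, j], ks[i][j]) for j in range(X.shape[1]))
             for i in range(X.shape[0])]
    return max(inner)

rng = np.random.default_rng(0)
X, Y = rng.standard_normal((3, 2, 5)), rng.standard_normal((3, 2, 5))
ks = [[1, 3], [2, 5], [4, 4]]
print(host_distance(X, Y, ks))
```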

Two necessary steps. (1) Embed a symmetric norm into ⨁_{l_∞} ⨁_{l_1} X_ij. (2) Solve ANN for ⨁_{l_∞} ⨁_{l_1} X_ij. Prior work on ANN via product spaces: Fréchet distance [Indyk 2002], edit distance [Indyk 2004], and Ulam distance [Andoni, Indyk, Krauthgamer 2009].

ANN for ⨁_{l_∞} ⨁_{l_1} X_ij. [Indyk 2002], [Andoni 2009]: if there are data structures for c-ANN over M_1, M_2, …, M_t, then for ⨁_{l_p} M_i one can get O(c·log log n)-ANN with almost the same time and space. A powerful generalization of ANN for l_∞ [Indyk 1998]; it trivially implies ANN for general l_p. Thus, it is enough to handle ANN for X_ij (top-k norms)!

ANN for top-k norms. Top-k norms include l_1 and l_∞, so we need a unified approach. Idea: embed a top-k norm into l_∞^{d'} and use [Indyk 1998]; the approximation is the distortion times O(log log d'). Problem: l_1 requires 2^{Ω(d)}-dimensional l_∞. Solution: use randomized embeddings.

Embedding the top-k norm into l_∞. The case k = d (that is, l_1). The embedding uses min-stability of the exponential distribution: sample i.i.d. u_1, u_2, …, u_d ~ Exp(1) and map f: (x_1, x_2, …, x_d) ↦ (x_1/u_1, x_2/u_2, …, x_d/u_d). Then Pr[‖f(x)‖_∞ ≤ t] = Π_i Pr[|x_i|/u_i ≤ t] = Π_i e^{−|x_i|/t} = e^{−‖x‖_1/t}, which gives constant distortion w.h.p. (in reality, slightly different parameters). General k: sample u_i ~ max(1/k, Exp(1)).
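A quick Monte Carlo check of the min-stability computation for k = d (using the textbook parameters above, not the slightly different ones in the actual data structure): since Pr[‖f(x)‖_∞ ≤ t] = e^{−‖x‖_1/t}, the median of ‖f(x)‖_∞ is ‖x‖_1/ln 2.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(50)
l1 = np.abs(x).sum()

# f(x)_i = x_i / u_i with u_i ~ Exp(1); then Pr[||f(x)||_inf <= t] = exp(-||x||_1 / t),
# so the median of ||f(x)||_inf is ||x||_1 / ln 2.
maxima = [np.max(np.abs(x) / rng.exponential(1.0, size=x.shape)) for _ in range(2000)]
print(np.median(maxima), l1 / np.log(2))
```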

Detour: ANN for Orlicz norms. Reminder: for convex G: R → R_+ with G(0) = 0, define a norm whose unit ball is {x : Σ_i G(|x_i|) ≤ 1} (e.g., G(t) = t^p gives l_p norms). Embedding into l_∞ (as before, O(1) distortion w.h.p.): sample i.i.d. u_1, u_2, …, u_d ~ D, where Pr_{X∼D}[X ≤ t] = 1 − e^{−G(t)}, and map f: (x_1, x_2, …, x_d) ↦ (x_1/u_1, x_2/u_2, …, x_d/u_d). A special case for l_p norms appeared in [Andoni 2009].
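A sketch of sampling from D by inverse transform, t = G^{-1}(−ln(1 − U)), together with a sanity check of the resulting embedding for G(t) = t^p; the median constant (ln 2)^{1/p} below is specific to this check, not part of the construction.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_D(G_inv, size):
    """Inverse-transform sampling for Pr[X <= t] = 1 - exp(-G(t))."""
    u = rng.uniform(size=size)
    return G_inv(-np.log1p(-u))

p = 3.0
G_inv = lambda s: s ** (1.0 / p)        # inverse of G(t) = t^p
x = rng.standard_normal(40)
lp = (np.abs(x) ** p).sum() ** (1.0 / p)

maxima = [np.max(np.abs(x) / sample_D(G_inv, x.shape)) for _ in range(2000)]
print(np.median(maxima), lp / np.log(2) ** (1.0 / p))   # median is ||x||_p / (ln 2)^{1/p}
```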

Where are we? We can solve ANN for ⨁_{l_∞} ⨁_{l_1} X_ij, where X_ij is R^d equipped with a top-k_ij norm. What remains to be done: embed a d-dimensional symmetric norm into the (d^{O(log log d)}-dimensional) space ⨁_{l_∞} ⨁_{l_1} X_ij.

Starting point: embedding any norm into l_∞. For a normed space X and ε > 0 there is a linear map f: X → l_∞^{d'} such that ‖f(x)‖_∞ ∈ (1 ± ε)·‖x‖_X. The normed space X* dual to X: ‖y‖_{X*} = sup_{‖x‖_X ≤ 1} ⟨x, y⟩. The dual of l_p is l_q with 1/p + 1/q = 1 (l_1 vs. l_∞, l_2 vs. l_2, etc.). Claim: for every x ∈ X, ‖x‖_X ∈ (1 ± ε)·max_{y∈N} |⟨x, y⟩|, where N is an ε-net of B_{X*} (with respect to X*). This immediately gives an embedding.

Proof. For every y ∈ N, we have ⟨x, y⟩ ≤ ‖x‖_X · ‖y‖_{X*} ≤ ‖x‖_X, thus max_{y∈N} |⟨x, y⟩| ≤ ‖x‖_X. Conversely, there exists y with ‖y‖_{X*} ≤ 1 and ⟨x, y⟩ = ‖x‖_X (non-trivial, requires the Hahn–Banach theorem). Move y to the closest y' ∈ N to get ⟨x, y'⟩ ≥ (1 − ε)·‖x‖_X, thus max_{y∈N} |⟨x, y⟩| ≥ (1 − ε)·‖x‖_X. One can take |N| = (1/ε)^{O(d)} by the volume argument.
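A tiny illustration of the claim for X = l_1, whose dual ball is the l_∞ ball: its 2^d sign vectors form a (trivial, exact) net realizing max_{y∈N} ⟨x, y⟩ = ‖x‖_1, which also hints at why d' may need to be exponential in d.

```python
import itertools
import numpy as np

d = 6
rng = np.random.default_rng(2)
x = rng.standard_normal(d)

# For X = l_1 the dual ball B_{X*} is the l_inf ball; its 2^d sign vectors
# realize max_{y in N} <x, y> = ||x||_1 exactly.
net = [np.array(s) for s in itertools.product([-1.0, 1.0], repeat=d)]
print(max(float(np.dot(x, y)) for y in net), np.abs(x).sum())
```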

Better embeddings for symmetric norms. Recap: one cannot embed even l_2^d into l_∞^{d'} unless d' = 2^{Ω(d)}. Instead, we aim to embed a symmetric norm into ⨁_{l_∞} ⨁_{l_1} X_ij. High-level idea: the new space is more forgiving and allows us to consider an ε-net of B_{X*} up to symmetry. We show that there is an ε-net obtained by applying symmetries to merely d^{O(log log d)} vectors!

Exploiting symmetry. For a vector x, π ∈ S_d, and σ ∈ {−1, 1}^d, let x_{π,σ} denote x with coordinates permuted according to π and signs flipped according to σ. Recap: ‖x_{π,σ}‖_X = ‖x‖_X. Suppose N' is an ε-net for B_{X*} ∩ L, where L = {y : y_1 ≥ y_2 ≥ … ≥ y_d ≥ 0}. Then ‖x‖_X ∈ (1 ± ε)·max_{y∈N', π, σ} ⟨x, y_{π,σ}⟩ = (1 ± ε)·max_{y∈N'} max_{π,σ} ⟨x, y_{π,σ}⟩ = (1 ± ε)·max_{y∈N'} Σ_k (y_k − y_{k+1})·(top-k norm of x), with the convention y_{d+1} = 0.
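A brute-force check of the last equality over all permutations and sign patterns for a tiny d, with y_{d+1} = 0 as above.

```python
import itertools
import numpy as np

def top_k_norm(x, k):
    return np.sort(np.abs(x))[::-1][:k].sum()

d = 4
rng = np.random.default_rng(3)
x = rng.standard_normal(d)
y = np.sort(np.abs(rng.standard_normal(d)))[::-1]    # y_1 >= ... >= y_d >= 0

# Left-hand side: maximize <x, y_{pi,sigma}> over all permutations and sign flips.
lhs = max(np.dot(x, np.array(sigma) * y[list(pi)])
          for pi in itertools.permutations(range(d))
          for sigma in itertools.product([-1.0, 1.0], repeat=d))

# Right-hand side: sum_k (y_k - y_{k+1}) * (top-k norm of x), with y_{d+1} = 0.
rhs = sum((y[k] - (y[k + 1] if k + 1 < d else 0.0)) * top_k_norm(x, k + 1)
          for k in range(d))
print(lhs, rhs)   # equal up to floating-point error
```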

Small nets. What remains to be done: an ε-net for B_{X*} ∩ {y : y_1 ≥ y_2 ≥ … ≥ y_d ≥ 0} of size d^{O_ε(log log d)}. We will see a weaker bound of d^{O_ε(log d)}, still non-trivial. The volume bound fails; instead, we use a simple explicit construction.

Small nets, continued. We want to approximate a vector y ∈ B_{X*} with y_1 ≥ y_2 ≥ … ≥ y_d ≥ 0. Zero out all y_i's that are smaller than y_1/poly(d); round all remaining coordinates to the nearest power of 1 + ε. This leaves O_ε(log d) scales, and only the cardinality of each scale matters, giving d^{O_ε(log d)} vectors in total. This can be improved to d^{O_ε(log log d)} by one more trick.
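A sketch of this rounding map (the poly(d) threshold below is an arbitrary choice, d^2): after zeroing small coordinates and rounding to powers of 1 + ε, the net point is determined by how many coordinates fall into each of the O_ε(log d) scales, which is where the d^{O_ε(log d)} count comes from.

```python
import numpy as np

def round_to_net(y, eps=0.1):
    """y: nonincreasing, nonnegative. Zero coordinates below y_1 / d^2 (an arbitrary
    poly(d) threshold), then round the rest down to powers of (1 + eps)."""
    d = len(y)
    out = np.where(y < y[0] / d ** 2, 0.0, y.astype(float))
    nz = out > 0
    out[nz] = (1 + eps) ** np.floor(np.log(out[nz]) / np.log(1 + eps))
    return out

rng = np.random.default_rng(4)
y = np.sort(np.abs(rng.standard_normal(16)))[::-1]
print(np.round(round_to_net(y), 4))
```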

Quick summary. Embed a symmetric norm into a d^{O(log log d)}-dimensional product space of top-k norms. Use known techniques to reduce the ANN problem on the product space to ANN for the top-k norm. Use truncated exponential random variables to embed the top-k norm into l_∞, and apply a known ANN data structure there.

Two immediate open questions. (1) Improve the dependence on d from d^{O(log log d)} to d^{O(1)}: this needs a better ε-net for B_{X*} ∩ {y : y_1 ≥ y_2 ≥ … ≥ y_d ≥ 0}; looks doable. (2) Improve the approximation from (log log n)^{O(1)} to O(log log d): going beyond log log d is hard due to l_∞, so we need to bypass ANN for product spaces; maybe a randomized embedding into low-dimensional l_∞ exists for any symmetric norm?

General norms. There exists an embedding into l_2 with distortion O(√d). We built a universal d^{O(log log d)}-dimensional space that can host all d-dimensional symmetric norms. This is impossible for general norms, even for randomized embeddings: even distortion d^{0.49} requires dimension 2^{d^{Ω(1)}}. Stronger hardness results? They would be implied by the following: there is a family of spectral expanders that embeds with distortion O(1) into some log^{O(1)} n-dimensional norm, where n is the number of nodes [Naor 2016].

The main open question. Is there an efficient ANN algorithm for general high-dimensional norms with approximation d^{o(1)}? There is hope… Thanks!