Dimension reduction for finite trees in L1


Dimension reduction for finite trees in L1
James R. Lee, Mohammad Moharrami (University of Washington)
Arnaud de Mesmay (École Normale Supérieure)

dimension reduction in Lp

Given an n-point subset X ⊆ ℝ^d, find a mapping f : X → ℝ^k such that for all x, y ∈ X,

‖x − y‖_p ≤ ‖f(x) − f(y)‖_p ≤ D · ‖x − y‖_p.

Here n = size of X, k = target dimension, D = distortion. Dimension reduction as “geometric information theory.”
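For concreteness, the distortion D of a given map can be computed directly from the two point sets. A minimal sketch (the function name and the default p = 1 are illustrative, not from the talk):

```python
import itertools

import numpy as np

def distortion(X, Y, p=1):
    """Smallest D such that the map X[i] -> Y[i], after uniform rescaling,
    changes every pairwise lp distance by a factor of at most D."""
    ratios = []
    for i, j in itertools.combinations(range(len(X)), 2):
        dx = np.linalg.norm(np.asarray(X[i], float) - np.asarray(X[j], float), ord=p)
        dy = np.linalg.norm(np.asarray(Y[i], float) - np.asarray(Y[j], float), ord=p)
        ratios.append(dy / dx)
    # worst expansion divided by worst contraction
    return max(ratios) / min(ratios)
```

For example, mapping the points 0, 1, 3 on a line to 0, 2, 3 doubles the first gap and halves the second, giving distortion 4.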

the case p=2

When p=2, the Johnson-Lindenstrauss transform gives, for every n-point subset X ⊆ ℝ^d and every ε > 0, distortion D = 1 + ε with target dimension k = O(ε⁻² log n).

Applications to…
- statistics over data streams
- nearest-neighbor search
- compressed sensing
- quantum information theory
- machine learning
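A hedged sketch of one standard JL construction, a dense Gaussian projection (the constant 24 in the dimension bound and the function name are illustrative choices, not from the talk):

```python
import numpy as np

def jl_transform(X, eps, seed=0):
    """Project the rows of X (n points in R^d) to k = O(log(n)/eps^2)
    dimensions with a random Gaussian matrix; with high probability all
    pairwise Euclidean distances are preserved up to a 1 +/- eps factor."""
    n, d = X.shape
    k = int(np.ceil(24 * np.log(n) / eps**2))  # loose illustrative constant
    rng = np.random.default_rng(seed)
    G = rng.normal(size=(k, d)) / np.sqrt(k)   # entries ~ N(0, 1/k)
    return X @ G.T
```

Note the target dimension depends only on n and ε, not on the ambient dimension d, which is the whole point.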

dimension reduction in L1

The natural next case is p=1. (As before: n = size of X, k = target dimension, D = distortion.)

History:
- Carathéodory’s theorem yields D = 1 and k ≤ n(n−1)/2.
- [Schechtman’87, Bourgain-Lindenstrauss-Milman’89, Talagrand’90]: linear mappings (sampling + reweighting) yield D ≤ 1 + ε with k = O(n log n / ε²).
- [Batson-Spielman-Srivastava’09, Newman-Rabinovich’10]: sparsification techniques yield D ≤ 1 + ε with k = O(n/ε²).

the Brinkman-Charikar lower bound

There are n-point subsets of L1 such that distortion D requires k ≥ n^(Ω(1/D²)).
- [Brinkman-Karagiozova-L’07]: this lower bound is tight for these spaces.
- The original proof is a very technical argument based on LP duality.
- [L-Naor’04]: a one-page argument based on uniform convexity.

more lower bounds

[Andoni-Charikar-Neiman-Nguyen’11]: There are n-point subsets such that distortion 1 + ε requires k ≥ n^(1−O(1/log(1/ε))).

[Regev’11]: Simple, elegant, information-theoretic proof of both the Brinkman-Charikar and ACNN lower bounds. Low-dimensional embedding ⇒ encoding scheme.

the simplest of L1 objects

A tree metric is a graph-theoretic tree T = (V, E) together with non-negative lengths on the edges. It is easy to embed isometrically into ℝ^E equipped with the L1 norm.
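The isometric embedding here is the classical one: root the tree, give every edge its own coordinate, and send each vertex to the indicator of its root path weighted by edge lengths. A minimal sketch (the parent-map input format is illustrative):

```python
def tree_to_l1(parents):
    """parents: {child: (parent, edge_length)}; the root has no entry.
    Maps each vertex to a sparse vector indexed by edges: the coordinate
    for edge e equals its length iff e lies on the root-to-vertex path."""
    def root_path(v):
        coords = {}
        while v in parents:
            p, w = parents[v]
            coords[(p, v)] = w
            v = p
        return coords
    vertices = set(parents) | {p for p, _ in parents.values()}
    return {v: root_path(v) for v in vertices}

def l1_dist(fu, fv):
    """l1 distance between two sparse vectors stored as dicts."""
    return sum(abs(fu.get(e, 0.0) - fv.get(e, 0.0)) for e in set(fu) | set(fv))
```

Edges shared by both root paths cancel, so only the edges on the path between the two vertices contribute, and the ℓ1 distance equals the tree distance.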

dimension reduction for trees in L1

Charikar and Sahai (2002) gave the first nontrivial dimension reduction for trees, and A. Gupta improved their bound. In 2003 in Princeton, with Gupta and Talwar, we asked: can one do better still, even for complete binary trees?

dimension reduction for trees in L1

Theorem: For every n-point tree metric and every ε > 0, one can achieve D = 1 + ε with k = O_ε(log n). (A smaller dimension suffices for “symmetric” trees.)

Outline:
- complete binary trees via the local lemma
- Schulman’s tree codes
- complete binary trees via re-randomization
- extension to general trees

dimension reduction for the complete binary tree

Every edge gets B bits ⇒ target dimension = B log₂ n. Choose the edge labels uniformly at random; a node’s image is the concatenation of the labels on its root path. Nodes at tree distance L then have probability 1 − 2^(−Ω(BL)) of receiving labels at Hamming distance Ω(BL).

dimension reduction for the complete binary tree

Every edge gets B bits ⇒ target dimension = B log₂ n. Choose the edge labels uniformly at random. The problem: siblings have probability 2^(−B) of getting the same label, yet there are n/2 such pairs.
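This failure is easy to observe in a simulation. The sketch below (all names illustrative) labels every edge of a depth-h complete binary tree with B uniform bits and estimates the sibling collision rate, which concentrates near 2^(−B):

```python
import random

def random_edge_labels(depth, B, rng):
    """Assign each edge of the complete binary tree (nodes = binary
    strings) an i.i.d. uniform B-bit label, stored at the child node."""
    labels = {}
    frontier = ['']
    for _ in range(depth):
        nxt = []
        for node in frontier:
            for b in '01':
                child = node + b
                labels[child] = tuple(rng.randrange(2) for _ in range(B))
                nxt.append(child)
        frontier = nxt
    return labels

def sibling_collision_rate(depth, B, trials, seed=0):
    """Fraction of sibling pairs whose edge labels coincide; should be
    close to 2**-B, so collisions appear once there are >> 2**B pairs."""
    rng = random.Random(seed)
    collisions = pairs = 0
    for _ in range(trials):
        labels = random_edge_labels(depth, B, rng)
        for node in labels:
            if node.endswith('0'):
                sib = node[:-1] + '1'
                pairs += 1
                collisions += labels[node] == labels[sib]
    return collisions / pairs
```

With n/2 sibling pairs and collision probability 2^(−B), a union bound alone cannot rule out collisions unless B grows with n, which is exactly what the local lemma is brought in to avoid.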

Lovász Local Lemma

Pairs at distance L are “good” with probability 1 − 2^(−Ω(BL)), and each “distance-L” event depends on only 2^(O(L)) others. The LLL, plus a sum over levels, then yields a good embedding.

Schulman’s tree codes

The LLL argument is difficult to extend to arbitrary trees. The construction above is the same as that of Schulman’96: tree codes for interactive communication.

re-randomization

Random isometry: for every level, exchange the 0’s and 1’s with probability one half (independently for each level).

re-randomization

Pairs at distance L are “good” with probability 1 − 2^(−Ω(BL)), while the number of pairs at distance L is at most n · 2^(O(L)).
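The per-level exchange of 0’s and 1’s reflects a block of coordinates (x ↦ 1 − x), hence is an isometry on the cube in ℓ1. A minimal sketch (label format and names illustrative) that also checks distances are unchanged:

```python
import random

def rerandomize(labels, rng):
    """labels: {node: tuple of 0/1 bits}, one label per tree node (edge).
    For each level (node depth), with probability 1/2 exchange the 0's
    and 1's in every label at that level. Each exchange reflects a block
    of coordinates, so same-level Hamming distances are preserved."""
    flip = {d: rng.randrange(2) for d in {len(n) for n in labels}}
    return {n: tuple(b ^ flip[len(n)] for b in lab) for n, lab in labels.items()}
```

Because every label at a given level is XOR-ed with the same bit, two labels at that level agree in a coordinate after the exchange exactly when they agreed before.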

extension to general trees

Unfortunately, the general case is technical (the paper is 50 pages). Obstacles:
- General trees need not have O(log n) depth: use the “topological depth” of Matoušek.
- How many coordinates should change per edge, and by what magnitude? Use a multi-scale entropy functional.

open problems

- Coding vs. dimension reduction: extend and make explicit the connection between L1 dimension reduction and information theory.
- Close the gap: for distortion 10, what is the right target dimension?
- Other Lp norms: nothing non-trivial is known for p other than 1 and 2.