On the Capacity of Information Networks Nick Harvey Collaborators: Micah Adler (UMass), Kamal Jain (Microsoft), Bobby Kleinberg (MIT/Berkeley/Cornell),

On the Capacity of Information Networks Nick Harvey Collaborators: Micah Adler (UMass), Kamal Jain (Microsoft), Bobby Kleinberg (MIT/Berkeley/Cornell), and April Lehman (MIT/Google)

What is the capacity of a network?

What is the capacity of a network?
Send items from s_1 → t_1 and s_2 → t_2.
Problem: no disjoint paths (bottleneck edge).
[Figure: network with sources s_1, s_2 and sinks t_2, t_1 sharing a bottleneck edge]

An Information Network
If sending information, we can do better: send the xor b_1 ⊕ b_2 on the bottleneck edge.
[Figure: sources s_1, s_2 holding bits b_1, b_2; bottleneck edge carrying b_1 ⊕ b_2; sinks t_2, t_1]
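The xor trick can be sketched in a few lines. This is a minimal illustration, assuming the standard butterfly layout in which s_1 also has a direct side edge to t_2 and s_2 a direct side edge to t_1; the function name `butterfly` and the variable names are hypothetical, not from the slides.

```python
def butterfly(b1, b2):
    """Simulate one use of the butterfly network with unit-capacity edges.

    Returns (bit decoded at t1, bit decoded at t2).
    Assumed topology: s1 has a side edge to t2, s2 has a side edge to t1,
    and both sinks also hear the single bottleneck edge.
    """
    side_to_t2 = b1          # side edge s1 -> t2 carries b1
    side_to_t1 = b2          # side edge s2 -> t1 carries b2
    bottleneck = b1 ^ b2     # the bottleneck edge carries one bit: the xor
    decoded_at_t1 = bottleneck ^ side_to_t1   # (b1 ^ b2) ^ b2 = b1
    decoded_at_t2 = bottleneck ^ side_to_t2   # (b1 ^ b2) ^ b1 = b2
    return decoded_at_t1, decoded_at_t2


# Every pair of input bits is delivered correctly in a single round.
for b1 in (0, 1):
    for b2 in (0, 1):
        assert butterfly(b1, b2) == (b1, b2)
```

Both sinks recover their bit in one use of the bottleneck edge, which plain routing cannot do because only one of b_1, b_2 fits on that edge at a time.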

Moral of Butterfly
Network Flow Capacity ≠ Information Flow Capacity

Network Coding
New approach for information flow problems
- Blend of combinatorial optimization, information theory
- Multicast, k-Pairs
k-Pairs problems: network coding when each commodity has one sink
- Analogous to multicommodity flow
Definitions for cyclic networks are subtle

Multicommodity Flow
- Efficient algorithms for computing maximum concurrent (fractional) flow.
- Connected with metric embeddings via LP duality.
- Approximate max-flow min-cut theorems.
Network Coding
- Computing the max concurrent coding rate may be undecidable, or may be decidable in poly-time.
- No adequate duality theory.
- No cut-based parameter is known to give sublinear approximation in digraphs.
Directed and undirected problems behave quite differently.

Directed k-pairs
Coding rate can be much larger than flow rate!
Butterfly:
- Coding rate = 1
- Flow rate = ½
Thm [HKL’04, LL’04]: ∃ graphs G(V,E) where Coding Rate = Ω( flow rate ∙ |V| )
Thm: ∃ graphs G(V,E) where Coding Rate = Ω( flow rate ∙ |E| )
- And this is optimal
- Recurse on butterfly construction
[Figure: butterfly network with s_1, s_2, t_2, t_1]

Directed k-pairs
Coding rate can be much larger than flow rate!
…and much larger than the sparsity (same example)
Flow Rate ≤ Sparsity < Coding Rate in some graphs

Undirected k-pairs
No known undirected instance where coding rate ≠ max flow rate! (The undirected k-pairs conjecture)
Flow Rate ≤ Coding Rate ≤ Sparsity
- Pigeonhole principle argument
- Gap can be Ω(log n) when G is an expander

Undirected k-Pairs Conjecture
Flow Rate ≤ Coding Rate ≤ Sparsity
- Flow Rate = Coding Rate? This is the undirected k-pairs conjecture.
- Coding Rate < Sparsity? Unknown until this work.

Okamura-Seymour Graph
Every cut has enough capacity to carry all commodities separated by the cut.
[Figure: Okamura-Seymour graph with terminals s_1, t_1, s_2, t_2, s_3, t_3, s_4, t_4 and a highlighted cut]

Okamura-Seymour Max-Flow
Flow Rate = 3/4
s_i is 2 hops from t_i. At flow rate r, each commodity consumes ≥ 2r units of bandwidth in a graph with only 6 units of capacity.
[Figure: Okamura-Seymour graph with terminals s_1, t_1, s_2, t_2, s_3, t_3, s_4, t_4]
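The counting argument above (4 commodities, each using at least 2r of the 6 available units, so 8r ≤ 6) can be checked mechanically. The sketch below assumes the Okamura-Seymour instance is drawn on K_{2,3} with unit-capacity edges, with s_1/t_3, s_2/t_1, s_3/t_2 sharing the three nodes on one side and s_4, t_4 on the other; the node names `a`, `b`, `c` and the helper names are hypothetical. It only reproduces the upper bound, not a flow that achieves it.

```python
from collections import deque
from fractions import Fraction

# Assumed topology: K_{2,3}. Nodes a, b, c are each adjacent to s4 and t4.
# a hosts s1 and t3, b hosts s2 and t1, c hosts s3 and t2.
EDGES = [(x, y) for x in ("a", "b", "c") for y in ("s4", "t4")]
COMMODITIES = [("a", "b"), ("b", "c"), ("c", "a"), ("s4", "t4")]


def hop_distance(edges, source, target):
    """Breadth-first search distance in an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            return dist[u]
        for w in adj.get(u, []):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    raise ValueError("target unreachable")


def flow_rate_upper_bound(edges, commodities):
    """Total unit capacity divided by the sum of source-sink distances."""
    total_capacity = len(edges)
    total_distance = sum(hop_distance(edges, s, t) for s, t in commodities)
    return Fraction(total_capacity, total_distance)


print(flow_rate_upper_bound(EDGES, COMMODITIES))
```

With 6 edges and four commodities at distance 2 each, the bound evaluates to 6/8 = 3/4, matching the flow rate stated on the slide.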

The trouble with information flow…
If an edge codes multiple commodities, how to charge for “consuming bandwidth”?
We work around this obstacle and bound coding rate by 3/4.
For comparison, the flow argument: at flow rate r, each commodity consumes at least 2r units of bandwidth in a graph with only 6 units of capacity.
[Figure: Okamura-Seymour graph with terminals s_1, t_1, s_2, t_2, s_3, t_3, s_4, t_4]

Informational Dominance
Definition: A dominates e if for every coding solution, the messages sent on edges of A uniquely determine the message sent on e.
Given A and e, how hard is it to determine whether A dominates e? Is it even decidable?
Theorem: There is an algorithm to compute whether A dominates e in time O(k²m).
- Based on a combinatorial characterization of informational dominance

What can we prove?
Combine Informational Dominance with Shannon inequalities for Entropy.
Flow rate = coding rate for “Special Bipartite Graphs”:
- Bipartite
- Every source is 2 hops away from its sink
- Dual of flow LP is optimized by assigning length 1 to all edges
Next: show that proving conjecture for all graphs is quite hard
[Figure: Okamura-Seymour graph with nodes s_1/t_3, s_2/t_1, s_3/t_2, s_4, t_4]

k-pairs conjecture & I/O complexity
I/O complexity model [AV’88]:
- A large, slow external memory consisting of pages each containing p records
- A fast internal memory that holds 2 pages
- Basic I/O operation: read in two pages from external memory, write out one page

I/O Complexity of Matrix Transposition
Matrix transposition: Given a p×p matrix of records in row-major order, write it out in column-major order.
Obvious algorithm requires O(p²) ops.
A better algorithm uses O(p log p) ops.
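The slides do not spell out the O(p log p) algorithm, so the following is only a sketch of one standard way to reach that bound, assuming p is a power of two and each matrix row occupies one page. In round k, rows whose indices differ only in bit k are paired, and each pair exchanges the entries whose column bit k disagrees with the row bit k. Every simulated operation reads two pages and writes one, as in the model above. The function name `transpose_io` is hypothetical.

```python
def transpose_io(matrix):
    """Transpose a p x p matrix (p a power of two) stored one row per page.

    Returns (transposed_pages, number_of_io_ops), where one I/O op means
    "read two pages, write one page". Output pages of each round are
    written to fresh locations, so inputs are never overwritten mid-round.
    """
    p = len(matrix)
    if p == 0 or p & (p - 1) or any(len(row) != p for row in matrix):
        raise ValueError("expected a p x p matrix with p a power of two")
    pages = [list(row) for row in matrix]
    ops = 0
    bit = 1
    while bit < p:
        new_pages = [None] * p
        for i in range(p):
            if i & bit:
                continue          # handled together with its partner row
            j = i | bit           # partner row: same index with this bit set
            lo, hi = pages[i], pages[j]
            # I/O op 1: read pages i and j, write the new page i.
            new_pages[i] = [hi[c ^ bit] if c & bit else lo[c] for c in range(p)]
            # I/O op 2: read pages i and j again, write the new page j.
            new_pages[j] = [hi[c] if c & bit else lo[c ^ bit] for c in range(p)]
            ops += 2
        pages = new_pages
        bit <<= 1
    return pages, ops


matrix = [[4 * r + c for c in range(4)] for r in range(4)]
result, ops = transpose_io(matrix)
assert result == [list(col) for col in zip(*matrix)]
assert ops == 8   # p * log2(p) with p = 4
```

Each round swaps bit k of the row index with bit k of the column index, so after log₂ p rounds every record (r, c) sits at (c, r). Each round costs p operations, giving p log₂ p in total, and the access pattern does not depend on the record values.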

Matching Lower Bound
Theorem (Floyd ’72, AV’88): A matrix transposition algorithm using only read and write operations (no arithmetic on values) must perform Ω(p log p) I/O operations.
[Figure: I/O diagram with inputs s_1, s_2, s_3, s_4 and outputs t_1, t_3, t_2, t_4]

Ω(p log p) Lower Bound
Proof: Let N_ij denote the number of ops in which record (i,j) is written. For all j, Σ_i N_ij ≥ p log p. Hence Σ_ij N_ij ≥ p² log p. Each I/O writes only p records, so at least (p² log p)/p = p log p I/Os are needed. QED.
[Figure: I/O diagram with inputs s_1, s_2, s_3, s_4 and outputs t_1, t_3, t_2, t_4]

The k-pairs conjecture and I/O complexity
Definition: An oblivious algorithm is one whose pattern of read/write operations does not depend on the input.
Theorem: If there is an oblivious algorithm for matrix transposition using o(p log p) I/O ops, the undirected k-pairs conjecture is false.
[Figure: I/O diagram with inputs s_1, s_2, s_3, s_4 and outputs t_1, t_3, t_2, t_4]

The k-pairs conjecture and I/O complexity
Proof:
- Represent the algorithm with a diagram as before.
- Assume WLOG that each node has only two outgoing edges.
- Make all edges undirected, capacity p.
- Create a commodity for each matrix entry.
[Figure: I/O diagram with inputs s_1, s_2, s_3, s_4 and outputs t_1, t_3, t_2, t_4]

The k-pairs conjecture and I/O complexity
Proof (continued):
- The algorithm itself is a network code of rate 1.
- Assuming the k-pairs conjecture, there is a flow of rate 1.
- Σ_{i,j} d(s_i,t_j) ≤ p |E(G)|.
- Arguing as before, LHS is Ω(p² log p).
- Hence |E(G)| = Ω(p log p).
[Figure: I/O diagram with inputs s_1, s_2, s_3, s_4 and outputs t_1, t_3, t_2, t_4]

Other consequences for complexity
The undirected k-pairs conjecture implies:
- An Ω(p log p) lower bound for matrix transposition in the cell-probe model. [Same proof.]
- An Ω(p² log p) lower bound for the running time of oblivious matrix transposition algorithms on a multi-tape Turing machine. [I/O model can emulate multi-tape Turing machines with a factor p speedup.]

Distance arguments
Rate-1 flow solution implies Σ_i d(s_i,t_i) ≤ |E|
- LP duality; directed or undirected
Does rate-1 coding solution imply Σ_i d(s_i,t_i) ≤ |E|?
- Undirected graphs: this is essentially the k-pairs conjecture!
- Directed graphs: this is completely false
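The flow half of this slide can also be seen by direct counting, which is the unit-length special case of the LP duality argument. The derivation below is a sketch under two assumptions not stated on the slide: every edge has capacity 1, and the rate-1 flow of commodity i is decomposed into paths P ∈ 𝒫_i carrying f_P ≥ 0 units each, with the f_P for each commodity summing to 1.

```latex
\begin{align*}
\sum_i d(s_i,t_i)
  &= \sum_i \sum_{P \in \mathcal{P}_i} f_P \, d(s_i,t_i)
     && \text{since } \textstyle\sum_{P \in \mathcal{P}_i} f_P = 1 \\
  &\le \sum_i \sum_{P \in \mathcal{P}_i} f_P \, |P|
     && \text{since } |P| \ge d(s_i,t_i) \\
  &= \sum_{e \in E} \; \sum_i \sum_{P \in \mathcal{P}_i,\; P \ni e} f_P
     && \text{counting each path once per edge} \\
  &\le \sum_{e \in E} 1 \;=\; |E|
     && \text{unit capacities.}
\end{align*}
```

The second step is exactly where a coding solution escapes the argument: a coded edge does not split its capacity among the commodities it serves.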

Recursive construction
k commodities (s_i,t_i)
Distance d(s_i,t_i) = O(log k) ∀i
O(k) edges!
[Figure: sources s(1), …, s(8) and sinks t(1), …, t(8)]

Recursive Construction
G(1): edge capacity = 1, 2 commodities, 7 edges, distance = 3
[Figure: G(1) with s_1, s_2, t_1, t_2, and an equivalent compact drawing]

Recursive Construction
G(2): Start with two copies of G(1)
[Figure: two copies of G(1) with s_1, s_2, s_3, s_4 and t_1, t_2, t_3, t_4]

Recursive Construction
G(2): Replace middle edges with copy of G(1)
[Figure: s_1, s_2, s_3, s_4 and t_1, t_2, t_3, t_4, with the middle edges replaced]

Recursive Construction
G(2): 4 commodities, 19 edges, distance = 5
[Figure: s_1, s_2, s_3, s_4 and t_1, t_2, t_3, t_4, with a copy of G(1) in the middle]

Recursive Construction
G(n): # commodities = 2^n, |V| = O(2^n), |E| = O(2^n), distance = 2n+1
[Figure: sources s_1, s_2, …, s_{2^n−1}, s_{2^n} and sinks t_1, t_2, …, t_{2^n−1}, t_{2^n}, with a copy of G(n−1) in the middle]
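The counts on these slides can be tabulated with a short recurrence. This is a sketch under an assumption: the step from G(1) to G(2) (two copies of G(1), their two middle edges replaced by one copy of G(1), giving 2·7 − 2 + 7 = 19 edges) is extrapolated to general n as 2^(n−1) copies of G(1) whose middle edges are replaced by one copy of G(n−1). Only the G(1) and G(2) figures come from the slides; the function name `construction_size` is hypothetical.

```python
def construction_size(n):
    """Return (commodities, edges, distance) for G(n), n >= 1.

    G(1) values are taken from the slide; larger n uses the assumed
    recurrence E(n) = 2^(n-1) * (7 - 1) + E(n-1), d(n) = d(n-1) + 2.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    commodities, edges, distance = 2, 7, 3      # G(1)
    for level in range(2, n + 1):
        butterflies = 2 ** (level - 1)          # outer copies of G(1)
        # Each outer copy loses its middle edge; G(level-1) takes their place.
        edges = butterflies * (7 - 1) + edges
        commodities = 2 ** level
        distance += 2                           # two extra hops around the core
    return commodities, edges, distance


assert construction_size(1) == (2, 7, 3)
assert construction_size(2) == (4, 19, 5)

for n in range(1, 6):
    k, m, d = construction_size(n)
    print(n, k, m, d, k * d > m)   # last column: does sum of distances exceed |E|?
```

Under this recurrence the edge count is 6·2^n − 5 while the distances sum to 2^n (2n+1), so the inequality Σ_i d(s_i,t_i) ≤ |E| already fails at n = 2 (20 > 19) and the ratio keeps growing with n.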

Summary
Directed instances:
- Coding rate >> flow rate
Undirected instances:
- Conjecture: Flow rate = Coding rate
- Proof for special bipartite graphs
- Tool: Informational Dominance
- Proving conjecture solves Matrix Transposition Problem

Open Problems
Computing the network coding rate in DAGs:
- Recursively decidable?
- How do you compute an o(n)-factor approximation?
Undirected k-pairs conjecture:
- Stronger complexity consequences?
- Prove an Ω(log n) gap between sparsest cut and coding rate for some graphs
- …or, find a fast matrix transposition algorithm.

Backup Slides

Optimality
The graph G(n) proves:
Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω( flow rate ∙ |E| )
G(n) is optimal:
Thm [HKL’05]: ∀ graphs G(V,E), NCR/flow rate = O(min {|V|,|E|,k})

Informational Dominance
Def: A dominates B if information in A determines information in B in every network coding solution.
[Figure: network with s_1, s_2, t_2, t_1 in which A does not dominate B]

Informational Dominance
Def: A dominates B if information in A determines information in B in every network coding solution.
Sufficient Condition: If no path from any source → B then A dominates B
[Figure: network with s_1, s_2, t_2, t_1 in which A dominates B]
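The sufficient condition is a plain reachability test. The sketch below reads it as "after deleting the edges of A, no source can reach any edge of B", which is an assumption about what the slide's figure shows; it checks only this sufficient condition and is not the full combinatorial characterization mentioned earlier. The function name `sufficient_dominance` and the butterfly node names `u`, `v` are hypothetical.

```python
from collections import deque


def sufficient_dominance(edges, sources, A, B):
    """True if, once the edges of A are deleted, no source reaches an edge of B.

    edges, A and B are collections of directed (tail, head) pairs;
    sources is a collection of source nodes.
    """
    removed = set(A)
    adj = {}
    for u, v in edges:
        if (u, v) not in removed:
            adj.setdefault(u, []).append(v)
    reachable = set(sources)
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj.get(u, []):
            if w not in reachable:
                reachable.add(w)
                queue.append(w)
    # An edge of B outside A can be reached exactly when its tail is reachable.
    return all((u, v) in removed or u not in reachable for u, v in B)


# Assumed butterfly: both sources feed u, the bottleneck is u -> v,
# and each source also has a side edge to the opposite sink.
BUTTERFLY = [("s1", "u"), ("s2", "u"), ("u", "v"),
             ("v", "t1"), ("v", "t2"), ("s1", "t2"), ("s2", "t1")]
SOURCES = ["s1", "s2"]

assert sufficient_dominance(BUTTERFLY, SOURCES,
                            A=[("s1", "u"), ("s2", "u")], B=[("u", "v")])
assert not sufficient_dominance(BUTTERFLY, SOURCES,
                                A=[("s1", "u")], B=[("u", "v")])
```

Cutting both edges into u isolates the bottleneck from every source, so those two edges determine what it carries; cutting only one leaves a path from s_2 and the test correctly declines to certify dominance.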

Informational Dominance Example
“Obviously” flow rate = NCR = 1
- How to prove it? Markovicity?
- No two edges disconnect t_1 and t_2 from both sources!
[Figure: network with s_1, s_2, t_1, t_2]

Informational Dominance Example
Sufficient Condition: If no path from any source → B then A dominates B
[Figure: network with s_1, s_2, t_1, t_2 and Cut A]

Informational Dominance Example
Our characterization implies that A dominates {t_1,t_2} ⇒ H(A) ≥ H(t_1,t_2)
[Figure: network with s_1, s_2, t_1, t_2 and Cut A]

Rate ¾ for Okamura-Seymour
[Figure sequence: pictorial entropy inequalities on the Okamura-Seymour graph (nodes s_1/t_3, s_2/t_1, s_3/t_2, s_4, t_4), combined step by step]
3 H(source) + 6 H(undirected edge) ≥ 11 H(source)
6 H(undirected edge) ≥ 8 H(source)
¾ ≥ RATE