Network Coding: A New Direction in Combinatorial Optimization Nick Harvey.

Presentation on theme: "Network Coding: A New Direction in Combinatorial Optimization Nick Harvey."— Presentation transcript:

1 Network Coding: A New Direction in Combinatorial Optimization Nick Harvey

2 Collaborators: David Karger, Robert Kleinberg, April Rasala Lehman, Kazuo Murota, Kamal Jain, Micah Adler (UMass)

3 Transportation Problems Max Flow

4 Transportation Problems Min Cut

5 Communication Problems. “A problem of inherent interest in the planning of large-scale communication, distribution and transportation networks also arises with the current rate structure for Bell System leased-line services.” - Robert Prim, 1957. Examples: Spanning Tree, Steiner Tree, Facility Location, Steiner Forest, Steiner Network, Multicommodity Buy-at-Bulk. The motivation for network design comes largely from communication networks.

6 What is the capacity of a network? Send items from s1 → t1 and s2 → t2. Problem: there are no disjoint paths; the middle edge is a bottleneck.

7 An Information Network. If we are sending information, we can do better: send the XOR b1 ⊕ b2 on the bottleneck edge.
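The butterfly trick above can be sketched in a few lines of Python (an illustrative toy, assuming the usual crossed demands, where each sink also receives the other source's bit on a side edge):

```python
# Butterfly sketch: the bottleneck edge carries the single coded bit b1 XOR b2.
# Each sink already sees the *other* source's bit on a side edge, so XOR-ing
# the two values it receives recovers the bit it actually wants.

def bottleneck(b1: int, b2: int) -> int:
    """The coded bit sent across the bottleneck edge."""
    return b1 ^ b2

def decode(side_bit: int, coded_bit: int) -> int:
    """A sink XORs its side-edge bit with the coded bit."""
    return side_bit ^ coded_bit

b1, b2 = 1, 0
x = bottleneck(b1, b2)
assert decode(b2, x) == b1  # sink t1 sees b2 and b1^b2, recovers b1
assert decode(b1, x) == b2  # sink t2 sees b1 and b1^b2, recovers b2
```

With plain routing the bottleneck can carry only one of the two bits, which is exactly why the flow rate is ½ while the coding rate is 1.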

8 Moral of the Butterfly: Transportation Network Capacity ≠ Information Network Capacity.

9 Understanding Network Capacity. Information Theory: deep analysis of simple channels (noise, interference, etc.), but little understanding of network structures. Combinatorial Optimization: deep understanding of transportation problems on complex structures, but does not address information flow. Network Coding: combines ideas from both fields.

10 Definition: Instance. A graph G (directed or undirected); a capacity c_e on each edge e; and k commodities, each with a source s_i, a set of sinks T_i, and a demand d_i. Typically all capacities c_e = 1 and all demands d_i = 1. Technicality: we always assume G is directed, replacing each undirected edge with an equivalent directed gadget.
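As an illustration only (not code from the talk, and with hypothetical field names), the instance definition might be captured like this in Python:

```python
from dataclasses import dataclass, field

@dataclass
class Commodity:
    source: str          # source node s_i
    sinks: set[str]      # set of sink nodes T_i
    demand: float = 1.0  # demand d_i (typically 1)

@dataclass
class Instance:
    # Directed edges with capacities c_e (typically 1); an undirected
    # input graph would first have each edge replaced by a directed gadget.
    capacity: dict[tuple[str, str], float] = field(default_factory=dict)
    commodities: list[Commodity] = field(default_factory=list)

# The butterfly as a k-pairs instance: two unit-demand commodities,
# seven unit-capacity edges, with (u, v) the bottleneck.
butterfly = Instance(
    capacity={("s1", "u"): 1, ("s2", "u"): 1, ("u", "v"): 1,
              ("v", "t1"): 1, ("v", "t2"): 1, ("s1", "t2"): 1, ("s2", "t1"): 1},
    commodities=[Commodity("s1", {"t1"}), Commodity("s2", {"t2"})],
)
```

This matches the slide's convention of unit capacities and unit demands.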

11 Definition: Solution. An alphabet Σ(e) for the messages on each edge e, and a function f_e for each edge, such that: Causality: edge (u,v) sends only information previously received at u. Correctness: each sink t_i can decode the data from source s_i.

12 Multicast

13 The graph is a DAG with 1 source and k sinks. The source has r messages m1, …, mr in alphabet Σ, and each sink wants all the messages. Thm [ACLY00]: A network coding solution exists iff the connectivity from the source to each sink is at least r.

14 Multicast Example. (Figure: source s with messages m1, m2 and sinks t1, t2.)

15 Linear Network Codes. Treat the alphabet Σ as a finite field; each node outputs linear combinations of its inputs (e.g., inputs A and B yield output A + B). Thm [LYC03]: Linear codes are sufficient for multicast.

16 Multicast Code Construction. Thm [HKMK03]: Random linear codes work (over a large enough field). Thm [JS…03]: Deterministic algorithm to construct codes. Thm [HKM05]: Deterministic algorithm to construct codes (general algebraic approach).

17 Random Coding Solution. Randomly choose the coding coefficients; each sink then receives linear combinations of the source messages. If the connectivity is at least r, the linear combinations have full rank, so the sink can decode! Without coding, the problem is Steiner Tree Packing (hard!).
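The random-coding argument can be illustrated with a toy two-message example over the prime field F_257 (illustrative code, not the talk's construction): the sink collects random linear combinations of the source messages and decodes by inverting the 2×2 coefficient matrix.

```python
import random

Q = 257  # a small prime field; larger fields make full rank more likely

def encode(msgs, coeffs):
    """Each received symbol is a linear combination of the source messages."""
    return [sum(c * m for c, m in zip(row, msgs)) % Q for row in coeffs]

def decode2(coeffs, received):
    """Invert a 2x2 coefficient matrix over F_Q (Fermat inverse for 1/det)."""
    (a, b), (c, d) = coeffs
    det = (a * d - b * c) % Q
    inv = pow(det, Q - 2, Q)  # assumes det != 0, i.e. full rank
    return [(inv * (d * received[0] - b * received[1])) % Q,
            (inv * (a * received[1] - c * received[0])) % Q]

random.seed(0)
msgs = [42, 7]
while True:  # retry until the random matrix has full rank (very likely)
    coeffs = [[random.randrange(Q) for _ in range(2)] for _ in range(2)]
    if (coeffs[0][0] * coeffs[1][1] - coeffs[0][1] * coeffs[1][0]) % Q != 0:
        break
assert decode2(coeffs, encode(msgs, coeffs)) == msgs
```

The same idea scales to r messages: any r received combinations whose coefficient matrix has full rank determine the messages uniquely.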

18 Our Algorithm: a derandomization of the [HKMK] algorithm. Technique: max-rank completion of mixed matrices. A mixed matrix contains both numbers and variables; a completion is a choice of values for the variables that maximizes the rank.
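To illustrate the randomized baseline that this derandomizes: substituting random values for the variables of a mixed matrix attains the maximum rank with high probability (the Schwartz-Zippel idea underlying [HKMK03]). A toy one-variable example with exact rational arithmetic (illustrative code, not the paper's algorithm):

```python
from fractions import Fraction
import random

def rank(mat):
    """Rank over the rationals, by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in mat]
    rk = 0
    for c in range(len(m[0])):
        piv = next((r for r in range(rk, len(m)) if m[r][c] != 0), None)
        if piv is None:
            continue
        m[rk], m[piv] = m[piv], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][c] != 0:
                f = m[r][c] / m[rk][c]
                m[r] = [a - f * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk

def mixed(x):
    # A mixed matrix with one variable x: det = x^3 - 2x,
    # so the rank is 3 for all but three values of x.
    return [[x, 1, 0],
            [1, x, 1],
            [0, 1, x]]

random.seed(1)
x = random.randrange(1, 10**6)   # random substitution for the variable
assert rank(mixed(x)) == 3       # a max-rank completion, w.h.p.
assert rank(mixed(0)) == 2       # a bad substitution loses rank
```

Since the determinant is a low-degree polynomial in the variables, a random point from a large enough set avoids its zero set with high probability; the deterministic algorithms replace the random choice with an explicit one.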

19 k-Pairs Problems aka “Multiple Unicast Sessions”

20 The k-pairs problem: network coding when each commodity has one sink (analogous to multicommodity flow). Goal: compute the max concurrent rate. This is an open question.

21 Rate. Each edge e has its own alphabet Σ(e) of messages, and each source s_i has an alphabet Σ(s_i). Rate = min over sources of log|Σ(s_i)|, measured relative to the edge alphabet sizes log|Σ(e)|. NCR = sup { rate of coding solutions }. Observation: if there is a fractional flow with rational coefficients achieving rate r, then there is a network coding solution achieving rate r.

22 Directed k-pairs: the network coding rate can be much larger than the flow rate! Butterfly graph: network coding rate (NCR) = 1, flow rate = ½. Thm [HKL’04, LL’04]: ∃ graphs G(V,E) where NCR = Ω(flow rate ∙ |V|). Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω(flow rate ∙ |E|).

23 NCR / Flow Gap. G(1) is the butterfly (equivalently drawn with edge capacity 1 for coding or edge capacity ½ for flow): NCR = 1, flow rate = ½.

24 NCR / Flow Gap. G(2): start with two copies of G(1).

25 NCR / Flow Gap. G(2): replace the middle edges with a copy of G(1).

26 NCR / Flow Gap. G(2): NCR = 1, flow rate = ¼.

27 NCR / Flow Gap. G(n): built from G(n-1) in the same way. # commodities = 2^n, |V| = O(2^n), |E| = O(2^n); NCR = 1, flow rate = 2^-n.

28 Optimality. The graph G(n) proves: Thm [HKL’05]: ∃ graphs G(V,E) where NCR = Ω(flow rate ∙ |E|). And G(n) is optimal: Thm [HKL’05]: ∀ graphs G(V,E), NCR / flow rate = O(min {|V|, |E|, k}).

29 Network flow vs. information flow. Multicommodity Flow: efficient algorithms for computing the maximum concurrent (fractional) flow; connected with metric embeddings via LP duality; approximate max-flow min-cut theorems. Network Coding: computing the max concurrent network coding rate could be anything from undecidable to decidable in poly-time; there is no adequate duality theory; no cut-based parameter is known to give a sublinear approximation in digraphs; and no undirected instance is known where the network coding rate ≠ max flow (the undirected k-pairs conjecture).

30 Why not obviously decidable? How large should the alphabet be? Thm [LL05]: There exist networks where a max-rate solution requires a very large alphabet. Moreover, the rate does not increase monotonically with alphabet size, so there is no such thing as a “large enough” alphabet.

31 Approximate max-flow / min-cut? The value of the sparsest cut is an O(log n)-approximation to max-flow in undirected graphs [AR’98, LLR’95, LR’99], and an O(√n)-approximation to max-flow in directed graphs [CKR’01, G’03, HR’05], but it is not even a valid upper bound on the network coding rate in directed graphs! Example: in the butterfly, the cut {e} consisting of the bottleneck edge has capacity 1 and separates 2 commodities, i.e. sparsity ½, yet the network coding rate is 1.

32 Approximate max-flow / min-cut? The value of the sparsest cut induced by a vertex partition is a valid upper bound, but it can exceed the network coding rate by a factor of Ω(n). We next present a cut parameter which may be a better approximation…

33 Informational Dominance. Definition: A →ᵢ e if, for every network coding solution, the messages sent on the edges of A uniquely determine the message sent on e. Given A and e, how hard is it to determine whether A →ᵢ e? Is it even decidable? Theorem [HKL’05]: There is a combinatorial characterization of informational dominance, and an algorithm to decide whether A →ᵢ e in time O(k²m).

34 Informational Dominance. Def: A dominates B if the information in A determines the information in B in every network coding solution. (Figure: an example where A does not dominate B.)

35 Informational Dominance. Def: A dominates B if the information in A determines the information in B in every network coding solution. Sufficient condition: if no path from any source reaches B (once A is removed), then A dominates B. This condition is not necessary. (Figure: an example where A dominates B.)

36 Informational Dominance Example. “Obviously” flow rate = NCR = 1, but how to prove it? Markovicity? No two edges disconnect t1 and t2 from both sources!

37 Informational Dominance Example. Our characterization implies that the cut A dominates {t1, t2}, hence H(A) ≥ H(t1, t2).

38 Informational Meagerness. Def: An edge set A informationally isolates a commodity set P if A ∪ P →ᵢ P. iM(G) = min over pairs (A, P) with P informationally isolated by A of (capacity of the edges in A) / (demand of the commodities in P). Claim: network coding rate ≤ iM(G).

39 Approximate max-flow / min-cut? Informational meagerness is no better than an Ω(log n)-approximation to the network coding rate, due to a family of instances called the iterated split butterfly.

40 Approximate max-flow / min-cut? On the other hand, we don’t even know whether informational meagerness is an o(n)-approximation in general. And we don’t know a polynomial-time algorithm that computes an o(n)-approximation to the network coding rate in directed graphs.

41 Sparsity Summary. Directed graphs: Flow Rate ≤ Sparsity, and Sparsity < NCR ≤ iM(G) in some graphs. Undirected graphs: Flow Rate ≤ NCR ≤ Sparsity (an easy consequence of informational dominance); the gap can be Ω(log n) when G is an expander.

42 Undirected k-Pairs Conjecture. Flow Rate ≤ NCR ≤ Sparsity. Whether Flow Rate = NCR is the undirected k-pairs conjecture; that NCR ≤ Sparsity was unknown until this work.

43 The Okamura-Seymour Graph. Every edge cut has enough capacity to carry the combined demand of all commodities separated by the cut. (Figure: the graph with sources s1–s4 and sinks t1–t4.)

44 Okamura-Seymour Max-Flow. Flow rate = 3/4: each s_i is 2 hops from t_i, so at flow rate r each commodity consumes ≥ 2r units of bandwidth in a graph with only 6 units of capacity.
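The flow bound on this slide is the following arithmetic: with 4 commodities, each consuming at least 2r units of bandwidth at rate r, against 6 units of total capacity,

```latex
4 \cdot 2r \;\le\; 6 \quad\Longrightarrow\quad r \;\le\; \tfrac{3}{4},
```

and a flow of rate 3/4 exists, so the bound is tight.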

45 The trouble with information flow… If an edge combines messages from multiple sources, which commodities get charged for “consuming bandwidth”? We present a way around this obstacle and bound the NCR by 3/4.

46 Okamura-Seymour Proof. Thm [AHJKL’05]: flow rate = NCR = 3/4. We will prove the weaker: Thm [HKL’05]: NCR ≤ 6/7 < Sparsity. The proof uses properties of entropy. Monotonicity: A ⊆ B ⇒ H(A) ≤ H(B). Submodularity: H(A) + H(B) ≥ H(A ∪ B) + H(A ∩ B). Lemma (Cut Bound): for a cut A ⊆ E, H(A) ≥ H(A, sources separated by A).
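The two entropy properties used in this proof (monotonicity and submodularity) hold for every joint distribution, and can be sanity-checked numerically; a small sketch (illustrative only, not from the talk):

```python
import math
from itertools import product

def marginal_entropy(joint, subset):
    """H(X_subset) for a joint pmf given as {outcome tuple: probability}."""
    marg = {}
    for outcome, p in joint.items():
        key = tuple(outcome[i] for i in subset)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(p * math.log2(p) for p in marg.values() if p > 0)

# A correlated joint distribution over three bits: X2 = X0 xor X1
# (the butterfly's bottleneck message, in fact).
joint = {}
for x0, x1 in product([0, 1], repeat=2):
    joint[(x0, x1, x0 ^ x1)] = 0.25

H = lambda s: marginal_entropy(joint, sorted(s))
subsets = [frozenset(s) for s in
           [(), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]]
for A in subsets:
    for B in subsets:
        if A <= B:
            assert H(A) <= H(B) + 1e-9                    # monotonicity
        assert H(A) + H(B) >= H(A | B) + H(A & B) - 1e-9  # submodularity
```

Here any two of the three bits determine the third, so for example H({0,1}) = H({0,1,2}) = 2 bits, and the submodularity checks are all tight or slack exactly as the proof's inequalities predict.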

47 H(A) ≥ H(A, s1, s2, s4) (Cut Bound, applied to the cut A in the figure).

48 H(B) ≥ H(B, s1, s2, s4) (Cut Bound, applied to the cut B in the figure).

49 Add the inequalities: H(A) + H(B) ≥ H(A, s1, s2, s4) + H(B, s1, s2, s4). Apply submodularity: H(A) + H(B) ≥ H(A ∪ B, s1, s2, s4) + H(s1, s2, s4). Note that A ∪ B separates s3, so by the Cut Bound H(A ∪ B, s1, s2, s4) ≥ H(s1, s2, s3, s4). Conclude: H(A) + H(B) ≥ H(s1, s2, s3, s4) + H(s1, s2, s4), i.e. 6 edges carry the rate of 7 sources, so rate ≤ 6/7.

50 Rate ¾ for Okamura-Seymour. (Figure: the start of a chain of entropy inequalities over the graph; slides 50–54 are pictorial build steps.)

55 Summing the chain of inequalities yields: 3 H(source) + 6 H(undirected edge) ≥ 11 H(source), hence 6 H(undirected edge) ≥ 8 H(source), and therefore rate ≤ ¾.

56 Special Bipartite Graphs. This proof generalizes to show that max-flow = NCR for every instance which is: bipartite; has every source 2 hops away from its sink; and has the dual of the flow LP optimized by assigning length 1 to all edges.

57 The k-pairs conjecture and I/O complexity In the I/O complexity model [AV’88], one has:  A large, slow external memory consisting of pages each containing p records.  A fast internal memory that holds O(1) pages. (For concreteness, say 2.)  Basic I/O operation: read in two pages from external memory, write out one page.

58 I/O Complexity of Matrix Transposition Matrix transposition: Given a p×p matrix of records in row-major order, write it out in column-major order. Obvious algorithm requires O(p²) ops. A better algorithm uses O(p log p) ops.
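The O(p log p) algorithm can be illustrated with perfect shuffles (a sketch of the idea, not necessarily the talk's exact algorithm): transposing a 2^m × 2^m matrix stored row-major amounts to rotating the 2m address bits by m positions, and each perfect-shuffle pass rotates them by one while sweeping memory sequentially, i.e. O(p) page ops per pass for log p passes.

```python
def perfect_shuffle(a):
    """Interleave the two halves; index x moves to rotate-left(x) in bits."""
    n = len(a) // 2
    out = []
    for i in range(n):
        out.append(a[i])      # element from the first half -> even slot
        out.append(a[n + i])  # element from the second half -> odd slot
    return out

def transpose_by_shuffles(flat, m):
    """Transpose a 2^m x 2^m row-major matrix with m shuffle passes."""
    for _ in range(m):  # each pass rotates the address bits left by one
        flat = perfect_shuffle(flat)
    return flat

m = 3
p = 1 << m
row_major = [(i, j) for i in range(p) for j in range(p)]
col_major = [(i, j) for j in range(p) for i in range(p)]
assert transpose_by_shuffles(row_major, m) == col_major
```

Element (i, j) starts at address (i-bits, j-bits); after m left-rotations its address is (j-bits, i-bits), which is exactly its position in the column-major order.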

63 I/O Complexity of Matrix Transposition. Theorem (Floyd ’72, AV’88): If a matrix transposition algorithm performs only read and write operations (no bitwise operations on records), then it must perform Ω(p log p) I/O operations.

64 I/O Complexity of Matrix Transposition. Proof: Let N_ij denote the number of ops in which record (i,j) is written. For all j, Σ_i N_ij ≥ p log p, hence Σ_ij N_ij ≥ p² log p. Each I/O writes only p records, so Ω(p log p) ops are needed. QED.

65 The k-pairs conjecture and I/O complexity. Definition: An oblivious algorithm is one whose pattern of read/write operations does not depend on the input. Theorem: If there is an oblivious algorithm for matrix transposition using o(p log p) I/O ops, then the undirected k-pairs conjecture is false.

66 The k-pairs conjecture and I/O complexity. Proof: Represent the algorithm with a diagram as before, and assume WLOG that each node has only two outgoing edges.

67 Proof (cont.): Make all edges undirected with capacity p, and create a commodity for each matrix entry.

68 Proof (cont.): The algorithm itself is a network code of rate 1. Assuming the k-pairs conjecture, there is a flow of rate 1, so Σ_{i,j} d(s_i, t_j) ≤ p |E(G)|. Arguing as before, the LHS is Ω(p² log p); hence |E(G)| = Ω(p log p).

69 Other consequences for complexity. The undirected k-pairs conjecture implies: an Ω(p log p) lower bound for matrix transposition in the cell-probe model [same proof]; and an Ω(p² log p) lower bound on the running time of oblivious matrix transposition algorithms on a multi-tape Turing machine [the I/O model can emulate multi-tape Turing machines with a factor-p speedup].

70 Open Problems. Computing the network coding rate in DAGs: is it recursively decidable? How do you compute an o(n)-factor approximation? Undirected k-pairs conjecture: does flow rate = NCR? At the least, prove an Ω(log n) gap between sparsest cut and network coding rate for some graphs.

71 Summary. Information ≠ transportation. For multicast, NCR = min cut, with algorithms to find the solution. For k-pairs: directed, NCR ≫ flow rate; undirected, flow rate = NCR in the Okamura-Seymour graph. Informational dominance.

