The Power of Tuning: A Novel Approach for the Efficient Design of Survivable Networks
Ron Banner and Ariel Orda
Department of Electrical Engineering, Technion - Israel Institute of Technology
Introduction
• Transmission rates have increased to 10 Gbit/s and beyond.
• Any failure may lead to a vast amount of data loss.
• Survivability = the capability of the network to maintain service continuity in the presence of failures.
Current Survivability Schemes
• Based on securing an independent resource for each network element.
• This translates into establishing pairs of disjoint paths.
• Given a pair of disjoint paths, either 1+1 or 1:1 protection can be employed.
  - With 1+1 protection, identical traffic is transmitted over both disjoint paths.
  - With 1:1 protection, traffic is sent only on one (active) path. The other (backup) path is activated only upon a failure on the active path.
Pros and Cons of Current Survivability Schemes
• Pro: pairs of disjoint paths provide full (100%) protection against single network failures.
• Con: in practice, requiring disjoint paths is too restrictive and demands excessive redundancy.
  - Often very limiting (G. Maier, A. Pattavina, S. De Patre and M. Martinelli, 2002).
  - Sometimes even infeasible (N. Taft-Plotkin, B. Bellur and R. Ogier, 1999).
Tunable Survivability
• Since full survivability is too restrictive, we propose tunable survivability.
• Tunable survivability allows any desired degree of survivability in the range 0% to 100%.
  - Provides a quantitative measure for specifying the desired level of survivability.
  - Can substantially increase the space of feasible solutions.
  - Makes it possible to consider and quantify valuable tradeoffs (e.g., survivability vs. delay, survivability vs. jitter, survivability vs. bandwidth).
Survivable Connections
• We adopt the widely used single link failure model.
  - It has been the focus of most studies on survivability.
• Tunable survivability enables the establishment of p-survivable connections, i.e., connections that survive a single failure with probability at least p.
[Figure: a network from s to t through nodes a, b, c, d, with one link marked as having "poor" performance. Labeled examples: a (0.999)^2-survivable connection and a (0.999)^3-survivable connection. Callouts: "link b → d is bypassed!" and "A single path is enough!"]
Survivable Connections (cont.)
• Connections with tunable survivability can also employ either the 1+1 or the 1:1 protection architecture.
• However, the maximal traffic rate (bandwidth) of a survivable connection with 1+1 protection may differ from that with 1:1 protection.
  - Indeed, for any given survivable connection, the flow configuration induced by 1+1 protection is different from that induced by 1:1 protection.
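The difference between the two architectures can be made concrete with a small sketch. It assumes, in line with the capacity bounds used later in the deck, that under 1+1 protection a link shared by both paths carries two copies of the traffic, while under 1:1 protection only one path is active at a time. The function name, the link labels and the capacities are hypothetical.

```python
def connection_bandwidth(path1, path2, capacity, protection):
    """Largest rate B that a pair of paths can carry.

    path1, path2: lists of link identifiers.
    capacity: dict mapping each link identifier to its capacity.
    protection: "1+1" or "1:1".
    """
    if protection not in ("1+1", "1:1"):
        raise ValueError("protection must be '1+1' or '1:1'")
    common = set(path1) & set(path2)
    bounds = []
    for link in set(path1) | set(path2):
        if protection == "1+1" and link in common:
            # Assumption: both copies of the traffic cross a shared link.
            bounds.append(capacity[link] / 2)
        else:
            bounds.append(capacity[link])
    return min(bounds)


# Hypothetical example: both paths share the first link "s-a".
capacity = {"s-a": 10, "a-t": 8, "a-b": 9, "b-t": 7}
p1 = ["s-a", "a-t"]
p2 = ["s-a", "a-b", "b-t"]
bw_1plus1 = connection_bandwidth(p1, p2, capacity, "1+1")  # limited by 10 / 2
bw_1to1 = connection_bandwidth(p1, p2, capacity, "1:1")    # limited by 7
```

In this toy instance the shared link is the bottleneck under 1+1 protection but not under 1:1 protection, which is why the two maximal rates differ.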
How Much Is Gained by Employing Tunable Survivability?
• Through comprehensive simulations on random Internet networks, we demonstrate the power of tunable survivability.
• By only slightly relaxing the requirement of full survivability, a major increase in both bandwidth and feasibility is achieved.
Analytical Results
• Motivated by the simulation results, we investigated the tunable survivability concept.
• Established several fundamental properties of survivable connections.
• Designed polynomial (optimal) algorithmic schemes for the establishment of survivable connections for 1:1 and 1+1 protection.
• Derived, for the tunable survivability approach, a new "hybrid" protection architecture that has several advantages over both 1:1 and 1+1 protection.
Property: Two Paths Are ENOUGH!
• Claim: Given a survivable connection that admits more than two paths, it is possible to obtain the same level of survivability with only two paths.
Two Paths Are ENOUGH! (cont.)
• Proof (sketch):
  - Under the single link failure model, a failure in a link that is NOT common to all paths can never fail a survivable connection. Hence, the probability of surviving a single failure is equal to the probability that all common links are operational.
  - It is possible to construct a pair of paths that intersect only on the common links.
[Figure: a survivable connection with its common (critical) links highlighted.]
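The proof sketch can be illustrated numerically. The following minimal sketch assumes each link e has an independent failure probability p_e (the notation the deck uses in its reduction slides), so that a connection survives with the probability that all of its common links are operational. Function names, link labels and probabilities are hypothetical.

```python
from math import isclose, prod


def common_links(paths):
    """Links that appear in every path of the connection."""
    return set.intersection(*(set(path) for path in paths))


def survival_probability(paths, fail_prob):
    """Probability that all common links are operational.

    fail_prob: dict mapping each link to its failure probability p_e.
    With no common links the product is empty and the result is 1.0.
    """
    return prod(1 - fail_prob[link] for link in common_links(paths))


# Hypothetical connection with three paths that all share link "s-a".
three_paths = [
    ["s-a", "a-b", "b-t"],
    ["s-a", "a-c", "c-t"],
    ["s-a", "a-b", "b-c", "c-t"],
]
fail_prob = {link: 0.001 for path in three_paths for link in path}

# The first two paths intersect only on the common link, so this pair
# alone reaches the same survivability as the full set of three paths.
two_paths = three_paths[:2]
same_level = isclose(
    survival_probability(three_paths, fail_prob),
    survival_probability(two_paths, fail_prob),
)
```

Here the pair happens to be a subset of the original paths; the claim in the slide is more general, since the pair may be newly constructed from the links of the connection.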
Types of Survivable Connections
• Most survivable connection = a connection that has the maximum probability of surviving a single failure.
• Most survivable connection with a bandwidth of at least B = a survivable connection that, among all connections with a bandwidth of at least B, has the maximum probability of surviving a single failure.
• Widest p-survivable connection = a p-survivable connection that has the maximum bandwidth.
Polynomial Optimal Algorithms for Survivable Connections
• For each type of survivable connection, we designed a polynomial optimal algorithm (for both 1+1 and 1:1 protection).
• The most survivable connection with a bandwidth of at least B is established by a novel reduction to the min-cost flow problem.
• This reduction constitutes an algorithmic building block for the establishment of the most survivable connection and the widest p-survivable connection.
  - Indeed, a most survivable connection with a bandwidth of at least B=0 is a most survivable connection per se.
Polynomial Optimal Algorithms (cont.)
• How to establish a widest p-survivable connection?
• Idea: search for the largest B such that the most survivable connection with a bandwidth of at least B is a p-survivable connection.
• We show that it is sufficient to perform a binary search over the set $\{c_e,\ c_e/2 \mid e \in E\}$.
• Therefore, the widest p-survivable connection is established within O(log N) executions of any min-cost flow algorithm.
  - Indeed, the above set contains 2·M elements, where M is the number of links and N the number of nodes. Therefore, a binary search over this set needs to consider only O(log(2·M)) = O(log N) candidates.
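The search described above can be sketched as follows. The min-cost flow step is abstracted into a hypothetical callable, `best_survivability(B)`, assumed to return the survival probability of the most survivable connection with a bandwidth of at least B (or 0.0 when no such connection exists). The binary search is valid because that value cannot increase as B grows: any connection with bandwidth at least B also has bandwidth at least B' for every B' ≤ B.

```python
def widest_p_survivable(candidates, best_survivability, p):
    """Largest candidate bandwidth whose best connection is p-survivable.

    candidates: iterable of candidate bandwidth values.
    best_survivability: hypothetical oracle standing in for one
        min-cost flow execution; must be non-increasing in B.
    p: required level of survivability.
    Returns the chosen bandwidth, or None if no candidate qualifies.
    """
    values = sorted(set(candidates))
    lo, hi = 0, len(values) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if best_survivability(values[mid]) >= p:
            best = values[mid]  # feasible: try a larger bandwidth
            lo = mid + 1
        else:
            hi = mid - 1        # infeasible: try a smaller bandwidth
    return best


# Toy oracle (made-up numbers) that also counts how often it is called.
calls = []

def toy_oracle(bandwidth):
    calls.append(bandwidth)
    if bandwidth <= 3:
        return 1.0
    if bandwidth <= 6:
        return 0.999
    return 0.0

widest = widest_p_survivable([1, 2, 3, 4, 6, 8, 12], toy_oracle, 0.999)
```

With seven candidates the oracle is consulted at most three times, which mirrors the logarithmic number of min-cost flow executions stated in the slide.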
Hybrid Protection
• Up to now, we have focused only on the 1:1 and 1+1 protection architectures.
• Tunable survivability gives rise to a third protection architecture that combines 1:1 and 1+1 protection.
• Advantages
  - Propagates data over minimum-latency paths.
  - Produces a better congestion level (over the common links) than 1+1 protection.
  - Has a better recovery time from a failure than 1:1 protection: with 1:1 protection, signaling is required to perform the switch-over operation.
[Figure: network from s to t with intermediate nodes u and v, links e1 to e5, and paths p1 and p2.]
Hybrid Protection (cont.)
• Disadvantage
  - Requires additional nodal capabilities.
• As with 1+1 and 1:1 protection, we designed optimal algorithms for hybrid protection that establish survivable connections in polynomial running time.
[Figure: network from s to t with intermediate nodes u and v and links e1 to e5.]
Extensions
• Additive QoS extensions
  - In many cases it is important to consider additive metrics as quality criteria for survivable connections.
  - Additive metrics: delay, jitter, cost, and so on.
  - Fortunately, all the algorithms can be modified to consider additive metrics while still admitting a polynomial running time.
• Beyond the single link failure model
  - Establishing p-survivable connections is an NP-hard problem.
  - Yet, we introduce an alternative survivability criterion for multiple failures that admits optimal (polynomial) solutions.
  - Approximation algorithms are a good direction for future research.
Thank you!
How Much Is Gained by Employing Tunable Survivability?
• Experiment: generated random networks that include 10,000 Waxman topologies and 10,000 power-law topologies.
• Bandwidth Ratio (p) = the ratio between the maximum bandwidth of a p-survivable connection and the maximum bandwidth of a 1-survivable connection.
[Figure: Bandwidth Ratio (p) for the 1:1 protection architecture; curves for power-law networks and Waxman networks.]
How Much Is Gained? (cont.)
[Figure: Bandwidth Ratio (p) versus level of survivability p for the 1+1 protection architecture; curves for power-law networks and Waxman networks.]
How Much Is Gained? (cont.)
• Feasibility Ratio (p) = the ratio between the number of networks that have at least one p-survivable connection and the number of networks that have at least one connection with full survivability.
[Figure: Feasibility Ratio (p) versus level of survivability p; curves for power-law networks and Waxman networks.]
Establishing the Widest p-Survivable Connection
• Why is it enough to perform the search over the set $\{c_e,\ c_e/2 \mid e \in E\}$?
  - If one path admits a link e, then the bandwidth of the connection is at most $c_e$.
  - If both paths admit a link e, then the bandwidth of the connection is at most $c_e/2$.
  - Hence, by definition, there exists at least one tight link $e \in E$ such that the bandwidth of the connection is either $c_e$ or $c_e/2$.
• Why O(log N) executions of a min-cost flow algorithm?
  - The set contains 2·M elements.
  - A binary search over the set needs to consider only O(log(2·M)) = O(log N) values.
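As a minimal sketch of how the search space might be built, the snippet below collects, for every link capacity, the value itself and its half. It assumes the candidate set is {c_e, c_e/2 : e ∈ E}, which is consistent with the 2·M count above and with the 2∙B threshold of the 1+1 reduction. The function name and the capacities are hypothetical.

```python
def candidate_bandwidths(capacities):
    """Sorted candidate values for the bandwidth of a connection.

    capacities: iterable with one capacity c_e per link.
    Each link contributes c_e (used by one path) and c_e / 2
    (used by both paths), so at most 2 * M values result.
    """
    values = set()
    for c in capacities:
        values.add(c)
        values.add(c / 2)
    return sorted(values)


# Hypothetical capacities of a three-link network.
candidates = candidate_bandwidths([10, 8, 4])
```

Duplicate values collapse (here 8 / 2 coincides with the capacity 4), so the set can be smaller than 2·M, which only shortens the binary search.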
Waxman and Power-Law Topologies
• 10,000 Waxman networks:
  - Source and destination are located at diagonally opposite corners of a square area of unit dimension.
  - 198 nodes are uniformly spread over the square.
  - A link between two nodes u, v exists with a probability that depends on the distance between them, δ(u,v), with parameters α=1.8 and β=0.05.
• 10,000 power-law networks:
  - We assigned a number of out-degree credits to each node, using the power-law distribution $\beta \cdot x^{-\alpha}$, where α=0.75 and β=0.05.
  - Then, we connected the nodes so that every node obtained its assigned out-degree.
Property: Only the Common Links Count
• Under the single link failure model, only the links that are common to all paths can affect a survivable connection.
• Therefore, the probability that a survivable connection remains operational upon a failure is equal to the probability that all its common links are operational upon that failure.
• Hence, with $p_e$ denoting the failure probability of link e, $(p_1, p_2)$ is a most survivable connection if it maximizes $\prod_{e \in p_1 \cap p_2} (1 - p_e)$.
[Figure: a pair of paths with a common link highlighted.]
Most Survivable Connections with a Bandwidth of at Least B
• Established by reduction to the min-cost flow problem.
  - The flow demand is set to 2∙B flow units.
  - Since both the flow demand and the capacities are B-integral, the resulting flow is B-integral.
  - Hence, the flow decomposition algorithm can construct a pair of paths, each with a bandwidth of B.
Link in the original network (capacity c_e, failure probability p_e) | Links in the transformed network
c_e < B | discard the link
B ≤ c_e < 2∙B | one link: capacity B, weight w_e = 0
c_e ≥ 2∙B | two parallel links: capacity B, weight w_e = 0; and capacity B, weight w_e = -ln(1 - p_e)
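The per-link transformation can be sketched as below. This is one reading of the slide's table for the 1+1 case, not the authors' implementation: links are represented as (capacity, weight) pairs, the function name is hypothetical, and the failure probability is assumed to be strictly below 1 so that the logarithm is defined.

```python
from math import isclose, log


def transform_link(capacity, fail_prob, B):
    """Parallel links that replace one original link (1+1 reduction).

    capacity: capacity c_e of the original link.
    fail_prob: failure probability p_e of the original link (< 1).
    B: required bandwidth of the connection.
    Returns a list of (capacity, weight) pairs; empty if discarded.
    """
    if capacity < B:
        return []                   # too narrow for even one path
    free_link = (B, 0.0)            # first B flow units cost nothing
    if capacity < 2 * B:
        return [free_link]          # room for one path only
    # A second path through the same link makes it a common link,
    # which is charged -ln(1 - p_e) per flow unit.
    return [free_link, (B, -log(1 - fail_prob))]


narrow = transform_link(3, 0.001, 5)
single = transform_link(7, 0.001, 5)
shared = transform_link(10, 0.001, 5)
```

Running any min-cost flow algorithm with a demand of 2∙B on the network built from these links would then be the next step; that part is omitted here.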
Most Survivable Connections with a Bandwidth of at Least B (cont.)
• A min-cost flow maximizes the success probability of the common links.
  - Only the common links incur a non-zero cost, namely -B∙ln(1 - p_e) per common link e.
  - Hence, a min-cost flow minimizes $-B \cdot \sum_{e \in p_1 \cap p_2} \ln(1 - p_e)$.
  - Hence, it maximizes $\prod_{e \in p_1 \cap p_2} (1 - p_e)$.
Establishing Survivable Connections for 1:1 Protection
• The only difference in the reduction concerns the links that have capacities in the range [B, 2∙B).
• For 1:1 protection, only one of the paths carries B flow units.
• Hence, all links that have a capacity in the range [B, 2∙B) can be employed by both paths concurrently.
Link in the original network (capacity c_e, failure probability p_e) | Links in the transformed network
c_e < B | discard the link
c_e ≥ B | two parallel links: capacity B, weight w_e = 0; and capacity B, weight w_e = -ln(1 - p_e)