Responsive Yet Stable Traffic Engineering Srikanth Kandula Dina Katabi, Bruce Davie, and Anna Charny.

Slides:



Advertisements
Similar presentations
Path Splicing with Network Slicing Nick Feamster Murtaza Motiwala Santosh Vempala.
Advertisements

Congestion Control and Fairness Models Nick Feamster CS 4251 Computer Networking II Spring 2008.
Stable Load Control with Load Prediction in Multipath Packet Forwarding IlKyu Park, Youngseok Lee, and Yanghee Choi Proc. 15 th IEEE Int l conf. on Information.
Optimal Capacity Sharing of Networks with Multiple Overlays Zheng Ma, Jiang Chen, Yang Richard Yang and Arvind Krishnamurthy Yale University University.
Abhigyan, Aditya Mishra, Vikas Kumar, Arun Venkataramani University of Massachusetts Amherst 1.
Traffic Engineering with Forward Fault Correction (FFC)
Logically Centralized Control Class 2. Types of Networks ISP Networks – Entity only owns the switches – Throughput: 100GB-10TB – Heterogeneous devices:
COS 461 Fall 1997 Routing COS 461 Fall 1997 Typical Structure.
1 Traffic Engineering (TE). 2 Network Congestion Causes of congestion –Lack of network resources –Uneven distribution of traffic caused by current dynamic.
Flowlet Switching Srikanth Kandula Shan Sinha & Dina Katabi.
1 EL736 Communications Networks II: Design and Algorithms Class8: Networks with Shortest-Path Routing Yong Liu 10/31/2007.
Router-assisted congestion control Lecture 8 CS 653, Fall 2010.
SIGCOMM 2003 Making Intra-Domain Routing Robust to Changing and Uncertain Traffic Demands: Understanding Fundamental Tradeoffs David Applegate Edith Cohen.
Advanced Computer Networking Congestion Control for High Bandwidth-Delay Product Environments (XCP Algorithm) 1.
XCP: Congestion Control for High Bandwidth-Delay Product Network Dina Katabi, Mark Handley and Charlie Rohrs Presented by Ao-Jan Su.
Receiver-driven Layered Multicast S. McCanne, V. Jacobsen and M. Vetterli SIGCOMM 1996.
Network Architecture for Joint Failure Recovery and Traffic Engineering Martin Suchara in collaboration with: D. Xu, R. Doverspike, D. Johnson and J. Rexford.
Ashish Gupta Under Guidance of Prof. B.N. Jain Department of Computer Science and Engineering Advanced Networking Laboratory.
Traffic Engineering With Traditional IP Routing Protocols
Traffic Engineering Jennifer Rexford Advanced Computer Networks Tuesdays/Thursdays 1:30pm-2:50pm.
Multiple constraints QoS Routing Given: - a (real time) connection request with specified QoS requirements (e.g., Bdw, Delay, Jitter, packet loss, path.
December 20, 2004MPLS: TE and Restoration1 MPLS: Traffic Engineering and Restoration Routing Zartash Afzal Uzmi Computer Science and Engineering Lahore.
Network Protocols Designed for Optimizability Jennifer Rexford Princeton University
A General approach to MPLS Path Protection using Segments Ashish Gupta Ashish Gupta.
Traffic Matrix Estimation: Existing Techniques and New Directions A. Medina (Sprint Labs, Boston University), N. Taft (Sprint Labs), K. Salamatian (University.
More Traffic Engineering TE with MPLS TE in the inter-domain.
Rethinking Internet Traffic Management: From Multiple Decompositions to a Practical Protocol Jiayue He Princeton University Joint work with Martin Suchara,
RRAPID: Real-time Recovery based on Active Probing, Introspection, and Decentralization Takashi Suzuki Matthew Caesar.
Congestion Control for High Bandwidth-delay Product Networks Dina Katabi, Mark Handley, Charlie Rohrs.
A General approach to MPLS Path Protection using Segments Ashish Gupta Ashish Gupta.
Multipath Protocol for Delay-Sensitive Traffic Jennifer Rexford Princeton University Joint work with Umar Javed, Martin Suchara, and Jiayue He
S. Suri, M, Waldvogel, P. Warkhede CS University of Washington Profile-Based Routing: A New Framework for MPLS Traffic Engineering.
Congestion Control for High Bandwidth-Delay Product Environments Dina Katabi Mark Handley Charlie Rohrs.
On Self Adaptive Routing in Dynamic Environments -- A probabilistic routing scheme Haiyong Xie, Lili Qiu, Yang Richard Yang and Yin Yale, MR and.
A Study of MPLS Department of Computing Science & Engineering DE MONTFORT UNIVERSITY, LEICESTER, U.K. By PARMINDER SINGH KANG
Jennifer Rexford Princeton University MW 11:00am-12:20pm Wide-Area Traffic Management COS 597E: Software Defined Networking.
SMUCSE 8344 Constraint-Based Routing in MPLS. SMUCSE 8344 Constraint Based Routing (CBR) What is CBR –Each link a collection of attributes (performance,
Receiver-driven Layered Multicast Paper by- Steven McCanne, Van Jacobson and Martin Vetterli – ACM SIGCOMM 1996 Presented By – Manoj Sivakumar.
Routing Games for Traffic Engineering F. Larroca and J.L. Rougier IEEE International Conference on Communications (ICC 2009) Dresden, Germany, June
1 Latency Equalization: A Programmable Routing Service Primitive Minlan Yu Joint work with Marina Thottan, Li Li at Bell Labs.
MATE: MPLS Adaptive Traffic Engineering Anwar Elwalid, et. al. IEEE INFOCOM 2001.
Roadmap-Based End-to-End Traffic Engineering for Multi-hop Wireless Networks Mustafa O. Kilavuz Ahmet Soran Murat Yuksel University of Nevada Reno.
Tomo-gravity Yin ZhangMatthew Roughan Nick DuffieldAlbert Greenberg “A Northern NJ Research Lab” ACM.
Lecture 15. IGP and MPLS D. Moltchanov, TUT, Spring 2008 D. Moltchanov, TUT, Spring 2015.
Shannon Lab 1AT&T – Research Traffic Engineering with Estimated Traffic Matrices Matthew Roughan Mikkel Thorup
A Fair and Dynamic Load Balancing Mechanism F. Larroca and J.L. Rougier International Workshop on Traffic Management and Traffic Engineering for the Future.
Quick-Start for TCP and IP Draft-amit-quick-start-03.txt A.Jain, S. Floyd, M. Allman, and P. Sarolahti ICIR, December
Intradomain Traffic Engineering By Behzad Akbari These slides are based in part upon slides of J. Rexford (Princeton university)
Research Unit in Networking - University of Liège A Distributed Algorithm for Weighted Max-Min Fairness in MPLS Networks Fabian Skivée
6 December On Selfish Routing in Internet-like Environments paper by Lili Qiu, Yang Richard Yang, Yin Zhang, Scott Shenker presentation by Ed Spitznagel.
Fast recovery in IP networks using Multiple Routing Configurations Amund Kvalbein Simula Research Laboratory.
1 Slides by Yong Liu 1, Deep Medhi 2, and Michał Pióro 3 1 Polytechnic University, New York, USA 2 University of Missouri-Kansas City, USA 3 Warsaw University.
HELSINKI UNIVERSITY OF TECHNOLOGY Visa Holopainen 1/18.
XCP: eXplicit Control Protocol Dina Katabi MIT Lab for Computer Science
MATE: MPLS Adaptive Traffic Engineering Anwar Elwalid Cheng Jin Steven Low Indra Widjaja Bell Labs Michigan altech Fujitsu 2006.
TeXCP: Protecting Providers’ Networks from Unexpected Failures & Traffic Spikes Dina Katabi MIT - CSAIL nms.csail.mit.edu/~dina.
1 Traffic Engineering By Kavitha Ganapa. 2 Introduction Traffic engineering is concerned with the issue of performance evaluation and optimization of.
Internet Traffic Engineering Motivation: –The Fish problem, congested links. –Two properties of IP routing Destination based Local optimization TE: optimizing.
Real-time Transport for Assured Forwarding: An Architecture for both Unicast and Multicast Applications By Ashraf Matrawy and Ioannis Lambadaris From Carleton.
Congestion Control for High Bandwidth-Delay Product Networks Dina Katabi, Mark Handley, Charlie Rohrs Presented by Yufei Chen.
William Stallings Data and Computer Communications
CS 268: Lecture 6 Scott Shenker and Ion Stoica
ECE 544: Traffic engineering (supplement)
Loop protection and IPFRR
ISP and Egress Path Selection for Multihomed Networks
Multi-Core Parallel Routing
FAST TCP : From Theory to Experiments
TCP Congestion Control
Achieving Resilient Routing in the Internet
Srinivasan Seetharaman - College of Computing, Georgia Tech
Presentation transcript:

Responsive Yet Stable Traffic Engineering Srikanth Kandula Dina Katabi, Bruce Davie, and Anna Charny

Good mapping requires Load Balancing Good Mapping Good Performance & Low Cost ISPs needs to map traffic to underlying topology 100% Ingress Egress

100% Ingress Egress More, ISPs want to re-balance load when an unexpected event causes congestion

100% Ingress Egress Traffic Change More, ISPs want to re-balance load when an unexpected event causes congestion failure, BGP reroute, flash crowd, or attack

100% Ingress Egress Traffic Change Move Traffic

Need to rebalance load ASAP o Remove congestion before it affects users performance But, moving quickly may overshoot congestion on a different path more drops … But, rebalancing load in realtime is risky Congestion! Ingress1 Ingress2 Traffic Change

Need to rebalance load ASAP o Remove congestion before it affects users performance But, moving quickly may overshoot congestion on a different path more drops … But, rebalancing load in realtime is risky Ingress1 Ingress2 Traffic Change Congestion!

Problem: How to make Traffic Engineering: Responsive: reacts ASAP Stable: converges to balanced load without overshooting or generating new congestion Problem: How to make Traffic Engineering: Responsive: reacts ASAP Stable: converges to balanced load without overshooting or generating new congestion Need to rebalance load ASAP o Remove congestion before it affects users performance But, moving quickly may overshoot congestion on a different path more drops … But, rebalancing load in realtime is risky

Current Approaches Offline TE (e.g., OSPF-TE) Avoids the risk of instability caused by realtime adaptation, but also misses the benefits Balances the load in steady state Deal with failures and change in demands by computing routes that work under most conditions Overprovision for unanticipated events Online TE (e.g., MATE) Try to adapt to unanticipated events But, can overshoot causing drops and instability Long-Term Demands Long-Term Demands Link Weights OSPF-TE

This Talk TeXCP: Responsive & Stable Online TE Idea: o Use adaptive load balancing o But add explicit-feedback congestion control to prevent overshoot and drops TeXCP keeps utilization always within a few percent of optimal Compare to MATE and OSPF-TE, showing that TeXCP outperforms both

Typical Formalization of the TE Problem Find a routing that: Min Max-Utilization Removes hot spots and balances load High Max-Utilization is an indicator that the ISP should upgrade its infrastructure

Online TE involves solving 2 sub-problems 1.Find the traffic split that minimizes the Max-Utilization 2.Converge to the balanced traffic splits in a stable manner Also, an implementation mechanism to force traffic to follow the desired splits

A TeXCP agent per IE, at ingress node ISP configures each TeXCP agent with paths between IE Paths are pinned (e.g., MPLS tunnels) Force traffic along the right paths Implementation: Ingress Egress TeXCP Agent Solution:

Periodically, TeXCP agent probes a path for its utilization Distributedly, TeXCP agents find balanced traffic splits Sub-Problem: Solution: TeXCP Load Balancer Ingress Egress x U1 = 0.4 U2 = 0.7 U1 = 0.4 U2 = 0.7 Probes follow the slow path like ICMP messages

Periodically, TeXCP agent probes a path for its utilization Sub-Problem: Solution: A TeXCP agent iteratively moves traffic from over-utilized paths to under-utilized paths o r p is this agents traffic on path p Deal with different path capacity Deal with inactive paths ( r p =0) TeXCP Load Balancer Distributedly, TeXCP agents find balanced traffic splits

Periodically, TeXCP agent probes a path for its utilization Sub-Problem: Solution: A TeXCP agent iteratively moves traffic from over-utilized paths to under-utilized paths o r p is this agents traffic on path p Deal with different path capacity Deal with inactive paths ( r p =0) TeXCP Load Balancer Proof in paper Distributedly, TeXCP agents find balanced traffic splits

Converge to balanced load in a stable way Sub-Problem: Solution: Use Experience from Congestion Control (XCP) Congestion Control Flow from sender to receiver Senders share the bottleneck; need coordination to prevent oscillations Online TE Flow from ingress to egress TeXCP agents share physical link; need coordination to prevent oscillations Move in really small increments No Overshoot! Challenge is to move traffic quickly w/o overshoot

Load Balancer Load Balancer Path Controller Path Controller Path Controller Path Controller Ingress-Egress Demands Congestion Management layer Analogous to an Application Congestion Management Layer between Load Balancer and Data Plane o Set of light-weight per-path congestion controllers Unlike prior online TE, Load Balancer can push a decision to the data plane only as fast as the Congestion Management Layer allows it

Explicit feedback from core routers (like XCP) Periodically, collects feedback in ICMP-like probes Per-Path Light-Weight Congestion Controller Ingress Egress Utilization Feedback Probe

Explicit feedback from core routers (like XCP) Periodically, collects feedback in ICMP-like probes Per-Path Light-Weight Congestion Controller Ingress Egress U =.8 F = 100kbps Probe

Explicit feedback from core routers (like XCP) Periodically, collects feedback in ICMP-like probes Per-Path Light-Weight Congestion Controller Ingress Egress U =.8 F = 100kbps Probe U =.2 F = 500kbps U =.2 F = 500kbps

Explicit feedback from core routers (like XCP) Periodically, collects feedback in ICMP-like probes Per-Path Light-Weight Congestion Controller Ingress Egress Core router computes aggregate feedback Δ = Spare BW – Queue / Max-RTT Estimates number of IE-flows by counting probes, and divides feedback between them Occasional explicit feedback in probes… Need software changes only

Stability Idea Theorem 1: Given a particular load split, the path controller stabilizes the traffic on each link Theorem 2: Given stable path controllers, o Every TeXCP agent sees balanced load on all paths o Unused paths have higher utilization than used paths Load Balancer Path Controller Ingress Egress Informally stated: Per-path controller works at a faster timescale than load balancer Can decouple components Stabilize separately

Performance

Simulation Setup Standard for TE Rocketfuel topologies Average demands follow gravity model IE-traffic consists of large # of Pareto on-off sources TeXCP Parameters: o Each agent is configured with 10 shortest paths o Probe for explicit feedback every 0.1s o Load balancer re-computes a split every 0.5s Compare to Optimal Max-Utilization o Obtained with a centralized oracle that has Immediate and exact demands info, and uses as many paths as necessary

TeXCP Balances Load Without Oscillations TeXCP converges to a few percent of optimal Time (s) Maximum Link Utilization

TeXCP Balances Load Without Oscillations Time (s) Utilizations of all links in the network change without oscillations Link Utilization

Comparison with MATE MATE is the state-of-the-art in online TE All simulation parameters are from the MATE paper Ingress 1 Ingress 2 Ingress 3 Egress 1 Egress 2 Egress 3 L1 L2 L3

Time (s) Avg. drop rate in MATE is 20% during convergence decrease in cross traffic TeXCP MATE Explicit feedback allows TeXCP to react faster and without oscillations TeXCP balances load better than MATE Offered Link Load

Comparison with OSPF-TE OSPF-TE is the most-studied offline TE scheme It computes link weights, which when used in OSPF balance the load OSPF-TE-FAIL is an extension that optimizes for failures OSPF-TE-Multi-TM is an extension that optimizes for variations in traffic demands

Comparison with OSPF-TE under Static Load Abovenet Genuity Sprint Tiscali AT&T Optimal Ratio of Max-U to Opt.

TeXCP is within a few percent of optimal, outperforming OSPF-TE Comparison with OSPF-TE under Static Load Abovenet Genuity Sprint Tiscali AT&T Ratio of Max-U to Opt. OSPF-TE TeXCP

Comparison with OSPF-TE-Fail Abovenet Genuity Sprint Tiscali AT&T Ratio of Max-U to Opt. TeXCP allows an ISP to support same failure resilience with about ½ the capacity ! OSPF-TE-Fail TeXCP

Performance When Traffic Deviates From Long-term Averages TeXCP reacts better to realtime demands! Ratio of Max-U to Opt. Deviation from Long-term Average Demands OSPF-TE-Multi-TM TeXCP

Conclusion TeXCP: Responsive & Stable Online TE Combines load balancing with a Cong. Mngt. Layer to prevent overshoot and drops TeXCP keeps utilization always within a few percent of optimal Compared to MATE, it is faster and does not overshoot Compared to OSPF-TE o it keeps utilization 20% to 100% lower o it supports the same failure resilience with ½ the capacity major savings for the ISP