RRAPID: Real-time Recovery based on Active Probing, Introspection, and Decentralization Takashi Suzuki Matthew Caesar.

Motivation
- Today’s Internet core has bursty losses
  - Backbones have low average loss rates (<0.2%), but experience large bursts of loss
  - Loss durations vary from 10 ms to 33.72 s
  - 6 out of 7 providers experienced large outage periods sec for 1-2 times per day
  - Difficult for multimedia applications to recover from repeated loss (e.g. with FEC)
- Commonly used restoration techniques are insufficient
  - Link-layer recovery, MPLS not yet uniformly deployed
  - RON too slow (20 sec), not scalable
- => Real-time recovery desired
Source: “Assessment of VoIP Quality over Internet Backbones,” Markopoulou, Tobagi, Karam (INFOCOM 2002)

Approach
RRAPID: Real-time Recovery based on Adaptive Probing, Introspection, and Dampening
- Technique: overlay-based, real-time recovery
  - Use link-state routing
  - Determine link cost from packet receipt delay
  - Adaptively dampen route advertisements
- Desirable properties:
  - Speed: low end-to-end failure time
  - Stability: few route oscillations
  - Accuracy: avoid reacting to transient failures
  - Scalability: low probing/communication overhead

System Architecture: Reaction Mechanism
- Route Stabilization (RS):
  - Dampens route flaps
- Adaptive Tracking (AT):
  - Filters noise
  - Reacts quickly to changes
- Link Cost Estimation (LCE):
  - Estimates failure probability from packet loss
  - “Delay-deficit algorithm”
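The slide names the three layers but not their algorithms, so the following is only a minimal sketch of how such a layered pipeline could be composed. Every class, parameter, and threshold here is a hypothetical stand-in: the LCE stand-in is a plain loss fraction over a sliding window (not the "delay-deficit algorithm"), the AT stand-in is an exponentially weighted average with a fast path for large jumps, and the RS stand-in is simple hysteresis.

```python
from dataclasses import dataclass


@dataclass
class LinkCostEstimator:
    """LCE stand-in (assumption): fraction of the last `window` probes lost."""
    window: int = 5
    history: tuple = ()

    def update(self, probe_received: bool) -> float:
        self.history = (self.history + (not probe_received,))[-self.window:]
        return sum(self.history) / len(self.history)


@dataclass
class AdaptiveTracker:
    """AT stand-in (assumption): smooth small changes, follow large jumps."""
    alpha: float = 0.2   # smoothing weight for small changes
    jump: float = 0.5    # change treated as real rather than noise
    value: float = 0.0

    def update(self, raw: float) -> float:
        if abs(raw - self.value) >= self.jump:
            self.value = raw                               # react quickly
        else:
            self.value += self.alpha * (raw - self.value)  # filter noise
        return self.value


@dataclass
class RouteStabilizer:
    """RS stand-in (assumption): hysteresis so the advertised state does not flap."""
    up: float = 0.6      # advertise failure at or above this metric
    down: float = 0.2    # advertise recovery at or below this metric
    failed: bool = False

    def update(self, filtered: float) -> bool:
        if not self.failed and filtered >= self.up:
            self.failed = True
        elif self.failed and filtered <= self.down:
            self.failed = False
        return self.failed


def run_pipeline(probes):
    """Feed probe outcomes (True = received) through LCE -> AT -> RS."""
    lce, at, rs = LinkCostEstimator(), AdaptiveTracker(), RouteStabilizer()
    return [rs.update(at.update(lce.update(ok))) for ok in probes]


if __name__ == "__main__":
    outcomes = [True] * 5 + [False] * 5 + [True] * 10
    print(run_pipeline(outcomes))
```

With these made-up settings a single lost probe is absorbed by the lower layers, while a sustained run of losses eventually changes the advertised state; the real system's thresholds and filters may differ substantially.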

Simulation Results: Layered Control
- Show detailed actions of layers
  - LCE output: metric representing probability link has failed
  - AT output: metric with noise filtered
  - RS output: advertised value for link
  - Red spikes result from back-to-back packet losses
- Setup
  - Link failure at t=[150s-170s]
  - Probe every 300ms, 10% loss
- Results
  - First detection in 0.92s, next at 5.42s
  - Several false positives due to cold start; stabilizes in 100s
  - 0.92s corresponds to 3 lost probes plus propagation delay of 0.02s
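The 0.92s figure can be checked with a few lines of arithmetic. The function name below is hypothetical; the inputs (300ms probe interval, 3 lost probes, 0.02s propagation delay) are the values stated on the slide.

```python
def first_detection_time(probe_interval_s, lost_probes, propagation_delay_s):
    """Time until detection if it takes `lost_probes` consecutive losses."""
    return lost_probes * probe_interval_s + propagation_delay_s


if __name__ == "__main__":
    t = first_detection_time(0.300, 3, 0.02)
    print(f"{t:.2f} s")  # prints "0.92 s"
```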

Simulation Results: Reaction Speed
- Reaction speed
  - Probing faster improves speed
  - Probing every <400ms can give ~1s reaction times
  - Loss decreases reaction time
- Overhead
  - Probing every >50ms gives reasonable overhead
- Effect of packet loss
  - Increasing packet loss decreases accuracy
  - Advertisements and probes are dropped
  - Subsecond reactions even at 5% loss

Simulation Results: Comparison
- Compared RRAPID, RON, and “Oracle-based” routing.
- Results:
  - RON requires 4 to 10x more advertisements than RRAPID
  - RON’s overhead increases exponentially with probe speed, RRAPID’s overhead increases linearly
  - Packet loss has an extreme effect on RON, moderate effect on RRAPID

Emulation Results: Real Internet Workload
- Method
  - Measured performance on real Internet workload
  - Traces acquired between UIUC and Stanford
  - Emulated 2-path overlay topology, one trace for each path
  - 1 natural failure at t=[123.4s to 133.7s]; introduced two failures at t=[40s to 50s] and t=[60s to 70s]
- Result
  - Stable, sub-second reactions
[Figure: number of flows on link #1 and on link #2 (overlay path 1, overlay path 2)]

Analysis
- Simplified model of system
  - Modeled RS layer as MIAD
    - Increase by 1, decrease by 1/k
    - Advertisement threshold limited to n
  - Ignored AT layer effects
  - => n*k state Markov chain
- Given:
  - Probe loss probability p
  - Number of paths N
  - Probe interval I
- We can determine:
  - Speed: average reaction time
  - Overhead: average advertisement rate
- Found best-case expected overhead and reaction time for variable transient loss rates.
- Results
  - Can react quickly, stably for fairly large amounts of transient packet loss
  - Overhead and reaction time increase super-linearly with loss rate
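The slide gives only the outline of the model, so the sketch below fills in the missing details with assumptions that are not taken from the source: each lost probe raises an RS counter by 1, each received probe lowers it by 1/k (never below 0), probe losses are independent with probability p, and an advertisement fires when the counter first reaches n. Under those assumptions the counter moves over n*k discrete levels, and the expected number of probes before transient loss alone triggers an advertisement can be computed by solving the hitting-time equations. The function name and the choice of Gauss-Jordan elimination are illustrative.

```python
def expected_probes_to_threshold(p, n, k):
    """Expected probes until the counter first reaches n, starting from 0.

    State s in 0..n*k-1 stands for counter value s/k; reaching n is absorbing.
    Solves h(s) = 1 + p*h(s+k) + (1-p)*h(max(s-1, 0)) with h = 0 at threshold.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in (0, 1]")
    size = n * k
    # Augmented matrix for the linear system; last column is the right side.
    a = [[0.0] * (size + 1) for _ in range(size)]
    for s in range(size):
        a[s][s] += 1.0
        a[s][size] = 1.0
        up = s + k             # probe lost: counter + 1
        down = max(s - 1, 0)   # probe received: counter - 1/k, floored at 0
        if up < size:          # otherwise the threshold is reached (h = 0)
            a[s][up] -= p
        a[s][down] -= 1.0 - p
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(size):
            if r != col and a[r][col] != 0.0:
                f = a[r][col] / a[col][col]
                for c in range(col, size + 1):
                    a[r][c] -= f * a[col][c]
    return a[0][size] / a[0][0]


if __name__ == "__main__":
    probe_interval_s = 0.3  # I, the value used in the simulation setup
    for loss in (0.05, 0.10, 0.20):
        probes = expected_probes_to_threshold(loss, n=3, k=4)
        rate = 1.0 / (probes * probe_interval_s)
        print(f"p={loss:.2f}: {probes:.1f} probes, {rate:.6f} adverts/s")
```

When every probe is lost (p = 1, a hard failure), the result is exactly n probes, which is the best-case reaction time in this toy model; for small p the expected count grows quickly, which corresponds to a low rate of spurious advertisements.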

Conclusions
1. Can achieve sub-second reactions on most links with reasonable stability
   - Congested links increase reaction time
   - Can react well on most Internet links
2. Trade-off relationship between overhead and reaction speed
3. Lossy links worsen reaction time
   - Hard to react quickly, stably if all paths have >10% loss.
Future work:
- Improve scalability with route aggregation
- Extend evaluation of system parameters
- Consider wider range of topologies, cross traffic, offered loads