Enabling Flow-level Latency Measurements across Routers in Data Centers Parmjeet Singh, Myungjin Lee Sagar Kumar, Ramana Rao Kompella.

Slides:



Advertisements
Similar presentations
Data Center Networking with Multipath TCP
Advertisements

Interconnection Networks: Flow Control and Microarchitecture.
Logically Centralized Control Class 2. Types of Networks ISP Networks – Entity only owns the switches – Throughput: 100GB-10TB – Heterogeneous devices:
Florin Dinu T. S. Eugene Ng Rice University Inferring a Network Congestion Map with Traffic Overhead 0 zero.
Programmable Measurement Architecture for Data Centers Minlan Yu University of Southern California 1.
Architectures for Congestion-Sensitive Pricing of Network Services Thesis Defense by Murat Yuksel CS Department, RPI July 3 rd, 2002.
A Fast and Compact Method for Unveiling Significant Patterns in High-Speed Networks Tian Bu 1, Jin Cao 1, Aiyou Chen 1, Patrick P. C. Lee 2 Bell Labs,
Fine-Grained Latency and Loss Measurements in the Presence of Reordering Myungjin Lee, Sharon Goldberg, Ramana Rao Kompella, George Varghese.
Packet Caches on Routers: The Implications of Universal Redundant Traffic Elimination Ashok Anand, Archit Gupta, Aditya Akella University of Wisconsin,
Bridging Router Performance and Queuing Theory Dina Papagiannaki, Intel Research Cambridge with Nicolas Hohn, Darryl Veitch and.
1 Reversible Sketches for Efficient and Accurate Change Detection over Network Data Streams Robert Schweller Ashish Gupta Elliot Parsons Yan Chen Computer.
Traffic Engineering With Traditional IP Routing Protocols
Server-based Inference of Internet Performance V. N. Padmanabhan, L. Qiu, and H. Wang.
1 Modeling and Taming Parallel TCP on the Wide Area Network Dong Lu,Yi Qiao Peter Dinda, Fabian Bustamante Department of Computer Science Northwestern.
Shadow Configurations: A Network Management Primitive Richard Alimi, Ye Wang, Y. Richard Yang Laboratory of Networked Systems Yale University.
© 2003 By Default! A Free sample background from Slide 1 SAVE: Source Address Validity Enforcement Protocol Authors: Li,
MPLS and Traffic Engineering
Privacy-Preserving Cross-Domain Network Reachability Quantification
CSCI 4550/8556 Computer Networks Comer, Chapter 20: IP Datagrams and Datagram Forwarding.
Impact of BGP Dynamics on Intra-Domain Traffic Patterns in the Sprint IP Backbone Sharad Agarwal, Chen-Nee Chuah, Supratik Bhattacharyya, Christophe Diot.
User-level Internet Path Diagnosis R. Mahajan, N. Spring, D. Wetherall and T. Anderson.
Core Stateless Fair Queueing Stoica, Shanker and Zhang - SIGCOMM 98 Rigorous fair Queueing requires per flow state: too costly in high speed core routers.
A General approach to MPLS Path Protection using Segments Ashish Gupta Ashish Gupta.
A victim-centric peer-assisted framework for monitoring and troubleshooting routing problems.
Ashish Gupta (98130) Ashish Gupta (98131) Under guidance of Prof. B. N. Jain.
Core Stateless Fair Queueing Stoica, Shanker and Zhang - SIGCOMM 98 Fair Queueing requires per flow state: too costly in high speed core routers Yet, some.
A Scalable, Commodity Data Center Network Architecture.
RelSamp: Preserving Application Structure in Sampled Flow Measurements Myungjin Lee, Mohammad Hajjat, Ramana Rao Kompella, Sanjay Rao.
Not All Microseconds are Equal: Fine-Grained Per-Flow Measurements with Reference Latency Interpolation Myungjin Lee †, Nick Duffield‡, Ramana Rao Kompella†
Jennifer Rexford Fall 2010 (TTh 1:30-2:50 in COS 302) COS 561: Advanced Computer Networks Stub.
MATE: MPLS Adaptive Traffic Engineering Anwar Elwalid, et. al. IEEE INFOCOM 2001.
1 Multi-Protocol Label Switching (MPLS) presented by: chitralekha tamrakar (B.S.E.) divya krit tamrakar (B.S.E.) Rashmi shrivastava(B.S.E.) prakriti.
A Machine Learning-based Approach for Estimating Available Bandwidth Ling-Jyh Chen 1, Cheng-Fu Chou 2 and Bo-Chun Wang 2 1 Academia Sinica 2 National Taiwan.
Tomo-gravity Yin ZhangMatthew Roughan Nick DuffieldAlbert Greenberg “A Northern NJ Research Lab” ACM.
1 Meeyoung Cha, Sue Moon, Chong-Dae Park Aman Shaikh Placing Relay Nodes for Intra-Domain Path Diversity To appear in IEEE INFOCOM 2006.
Introduction to MPLS and Traffic Engineering Zartash Afzal Uzmi.
VL2 – A Scalable & Flexible Data Center Network Authors: Greenberg et al Presenter: Syed M Irteza – LUMS CS678: 2 April 2013.
IPv4 TO IPv6 TRANSITION AND INTEROPERABILITY FOR TELECOM SERVICE PROVIDER Business Problem In today’s environment of growing connectivity where almost.
Department of Computer Science at Florida State LFTI: A Performance Metric for Assessing Interconnect topology and routing design Background ‒ Innovations.
Rev PA Signaled Provisioning of the IP Network Resources Between the Media Gateways in Mobile Networks Leena Siivola
David G. Andersen CMU Guohui Wang, T. S. Eugene Ng Rice Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan Intel Labs Pittsburgh 1 c-Through:
Noise Can Help: Accurate and Efficient Per-flow Latency Measurement without Packet Probing and Time Stamping Dept. of Computer Science and Engineering.
High-Fidelity Latency Measurements in Low-Latency Networks Ramana Rao Kompella Myungjin Lee (Purdue), Nick Duffield (AT&T Labs – Research)
Use cases Navigation Problem notification Problem analysis.
1 Route Optimization for Large Scale Network Mobility Assisted by BGP Feriel Mimoune, Farid Nait-Abdesselam, Tarik Taleb and Kazuo Hashimoto GLOBECOM 2007.
Resource/Accuracy Tradeoffs in Software-Defined Measurement Masoud Moshref, Minlan Yu, Ramesh Govindan HotSDN’13.
Requirements for Simulation and Modeling Tools Sally Floyd NSF Workshop August 2005.
Datacenter Network Simulation using ns3
Jennifer Rexford Princeton University MW 11:00am-12:20pm Measurement COS 597E: Software Defined Networking.
April 4th, 2002George Wai Wong1 Deriving IP Traffic Demands for an ISP Backbone Network Prepared for EECE565 – Data Communications.
Trajectory Sampling for Direct Traffic Oberservation N.G. Duffield and Matthias Grossglauser IEEE/ACM Transactions on Networking, Vol. 9, No. 3 June 2001.
N. Hu (CMU)L. Li (Bell labs) Z. M. Mao. (U. Michigan) P. Steenkiste (CMU) J. Wang (AT&T) Infocom 2005 Presented By Mohammad Malli PhD student seminar Planete.
On the Impact of Clustering on Measurement Reduction May 14 th, D. Saucez, B. Donnet, O. Bonaventure Thanks to P. François.
An Efficient Gigabit Ethernet Switch Model for Large-Scale Simulation Dong (Kevin) Jin.
Jiaxin Cao, Rui Xia, Pengkun Yang, Chuanxiong Guo,
Theophilus Benson*, Ashok Anand*, Aditya Akella*, Ming Zhang + *University of Wisconsin, Madison + Microsoft Research.
Goal Providing QoS guarantee without maintaining per- flow state on the data or control planes. Motivation Integrated Services – Per-flow state at all.
Internet Traffic Engineering Motivation: –The Fish problem, congested links. –Two properties of IP routing Destination based Local optimization TE: optimizing.
Placing Relay Nodes for Intra-Domain Path Diversity Meeyoung Cha Sue Moon Chong-Dae Park Aman Shaikh Proc. of IEEE INFOCOM 2006 Speaker 游鎮鴻.
MMPTCP: A Multipath Transport Protocol for Data Centres 1 Morteza Kheirkhah University of Edinburgh, UK Ian Wakeman and George Parisis University of Sussex,
PATH DIVERSITY WITH FORWARD ERROR CORRECTION SYSTEM FOR PACKET SWITCHED NETWORKS Thinh Nguyen and Avideh Zakhor IEEE INFOCOM 2003.
1 Scalability and Accuracy in a Large-Scale Network Emulator Nov. 12, 2003 Byung-Gon Chun.
MOZART: Temporal Coordination of Measurement (SOSR’ 16)
Problem: Internet diagnostics and forensics
NOX: Towards an Operating System for Networks
Dingming Wu+, Yiting Xia+*, Xiaoye Steven Sun+,
EE 122: Lecture 18 (Differentiated Services)
EE 122: Differentiated Services
Lu Tang , Qun Huang, Patrick P. C. Lee
Author: Ramana Rao Kompella, Kirill Levchenko, Alex C
Presentation transcript:

Enabling Flow-level Latency Measurements across Routers in Data Centers Parmjeet Singh, Myungjin Lee Sagar Kumar, Ramana Rao Kompella

Latency-critical applications in data centers  Guaranteeing low end-to-end latency is important  Web search (e.g., Google’s instant search service)  Retail advertising  Recommendation systems  High-frequency trading in financial data centers  Operators want to troubleshoot latency anomalies  End-host latencies can be monitored locally  Detection, diagnosis and localization through a network: no native support of latency measurements in a router/switch

Prior solutions  Lossy Difference Aggregator (LDA)  Kompella et al. [SIGCOMM ’09]  Aggregate latency statistics  Reference Latency Interpolation (RLI)  Lee et al. [SIGCOMM ’10]  Per-flow latency measurements More suitable due to more fine-grained measurements

Deployment scenario of RLI  Upgrading all switches/routers in a data center network  Pros  Provide finest granularity of latency anomaly localization  Cons  Significant deployment cost  Possible downtime of entire production data centers  In this work, we are considering partial deployment of RLI  Our approach: RLI across Routers (RLIR)

Overview of RLI architecture  Goal  Latency statistics on a per-flow basis between interfaces  Problem setting  No storing timestamp for each packet at ingress and egress due to high storage and communication cost  Regular packets do not carry timestamps Router Ingress I Egress E

Overview of RLI architecture  Premise of RLI: delay locality  Approach 1) The injector sends reference packets regularly 2) Reference packet carries ingress timestamp 3) Linear interpolation: compute per-packet latency estimates at the latency estimator 4) Per-flow estimates by aggregating per-packet estimates Latency Estimator Reference Packet Injector Ingress IEgress E R L Delay Time R L 1 Linear interpolation line 2 Interpolated delay

Full vs. Partial deployment  Full deployment: 16 RLI sender-receiver pairs  Partial deployment: 4 RLI senders + 2 RLI receivers  % deployment cost reduction Switch 1Switch 5 Switch 2Switch 4 Switch 3 Switch 6 RLI Sender (Reference Packet Injector)RLI Receiver (Latency Estimator)

Case 1: Presence of cross traffic  Issue: Inaccurate link utilization estimation at the sender leads to high reference packet injection rate  Approach  Not actively addressing the issue  Evaluation shows no much impact on packet loss rate increase  Details in the paper Switch 1Switch 5 Switch 2Switch 4 Switch 3 Switch 6 RLI Sender (Reference Packet Injector)RLI Receiver (Latency Estimator) Link utilization estimation on Switch 1 Bottleneck Link Cross Traffic

Case 2: RLI Sender side  Issue: Traffic may take different routes at an intermediate switch  Approach: Sender sends reference packets to all receivers Switch 1Switch 5 Switch 2Switch 4 Switch 3 Switch 6 RLI Sender (Reference Packet Injector)RLI Receiver (Latency Estimator)

Case 3: RLI Receiver side  Issue: Hard to associate reference packets and regular packets that traversed the same path  Approaches  Packet marking: requires native support from routers  Reverse ECMP computation: ‘reverse’ engineer intermediate routes using ECMP hash function  IP prefix matching at limited situation Switch 1Switch 5 Switch 2Switch 4 Switch 3 Switch 6 RLI Sender (Reference Packet Injector)RLI Receiver (Latency Estimator)

Deployment example in fat-tree topology RLI Sender (Reference Packet Injector)RLI Receiver (Latency Estimator) IP prefix matching Reverse ECMP computation / IP prefix matching

Evaluation  Simulation setup  Trace: regular traffic (22.4M pkts) + cross traffic (70M pkts)  Simulator  Results  Accuracy of per-flow latency estimates Traffic Divider Switch1Switch2 Cross Traffic Injector Packet Trace RLI Receiver RLI Sender Cross Traffic Regular Traffic Reference packets 10% / 1% injection rate

67% Accuracy of per-flow latency estimates 10% injection 1% injection 10% injection 1% injection Bottleneck link utilization:93% Relative error CDF 1.2%4.5% 18%31%

Summary  Low latency applications in data centers  Localization of latency anomaly is important  RLI provides flow-level latency statistics, but full deployment (i.e., all routers/switches) cost is expensive  Proposed a solution enabling partial deployment of RLI  No too much loss in localization granularity (i.e., every other router)

Thank you! Questions?