1 Flexlab: A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems Jonathon Duerig, Robert Ricci, Junxing Zhang, Daniel Gebhardt,

Slides:



Advertisements
Similar presentations
Ch. 12 Routing in Switched Networks
Advertisements

PLATO: Predictive Latency- Aware Total Ordering Mahesh Balakrishnan Ken Birman Amar Phanishayee.
Current methods for negotiating firewalls for the Condor ® system Bruce Beckles (University of Cambridge Computing Service) Se-Chang Son (University of.
PlanetLab: An Overlay Testbed for Broad-Coverage Services Bavier, Bowman, Chun, Culler, Peterson, Roscoe, Wawrzoniak Presented by Jason Waddle.
Computer Networking Lecture 20 – Queue Management and QoS.
A Measurement Study of Available Bandwidth Estimation Tools MIT - CSAIL with Jacob Strauss & Frans Kaashoek Dina Katabi.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
Managing Cloud Resources: Distributed Rate Limiting Alex C. Snoeren Kevin Webb, Bhanu Chandra Vattikonda, Barath Raghavan, Kashi Vishwanath, Sriram Ramabhadran,
Confused, Timid, and Unstable: Picking a Video Streaming Rate is Hard Published in 2012 ACM’s Internet Measurement Conference (IMC) Five students from.
Experimental evaluation of TCP-L June 5, 2003 Stefan Alfredsson Karlstad University.
BZUPAGES.COM 1 User Datagram Protocol - UDP RFC 768, Protocol 17 Provides unreliable, connectionless on top of IP Minimal overhead, high performance –No.
APOHN: Subnetwork Layering to Improve TCP Performance over Heterogeneous Paths April 4, 2006 Dzmitry Kliazovich, Fabrizio Granelli, University of Trento,
Ahmed Mansy, Mostafa Ammar (Georgia Tech) Bill Ver Steeg (Cisco)
Using DSVM to Implement a Distributed File System Ramon Lawrence Dept. of Computer Science
Advanced Computer Networking Congestion Control for High Bandwidth-Delay Product Environments (XCP Algorithm) 1.
Transparent Checkpoint of Closed Distributed Systems in Emulab Anton Burtsev, Prashanth Radhakrishnan, Mike Hibler, and Jay Lepreau University of Utah,
Yale LANS ShadowStream: Performance Evaluation as a Capability in Production Internet Live Streaming Networks Chen Tian Richard Alimi Yang Richard Yang.
Extensible Networking Platform IWAN 2005 Extensible Network Configuration and Communication Framework Todd Sproull and John Lockwood
1 Modeling and Emulation of Internet Paths Pramod Sanaga, Jonathon Duerig, Robert Ricci, Jay Lepreau University of Utah.
1 In VINI Veritas: Realistic and Controlled Network Experimentation Jennifer Rexford with Andy Bavier, Nick Feamster, Mark Huang, and Larry Peterson
15-441: Computer Networking Lecture 26: Networking Future.
An Overlay Data Plane for PlanetLab Andy Bavier, Mark Huang, and Larry Peterson Princeton University.
1 End-to-End Detection of Shared Bottlenecks Sridhar Machiraju and Weidong Cui Sahara Winter Retreat 2003.
1 Emulating AQM from End Hosts Presenters: Syed Zaidi Ivor Rodrigues.
1 Sonia Fahmy Ness Shroff Students: Roman Chertov Rupak Sanjel Center for Education and Research in Information Assurance and Security (CERIAS) Purdue.
Dennis Ippoliti 12/6/ MULTI-PATH ROUTING A packet by packet multi-path routing approach.
1 A Large-Scale Network and Distributed Systems Testbed Jay Lepreau Chris Alfeld David Andersen (MIT) Kristin Wright University of Utah
The Future of the Internet Jennifer Rexford ’91 Computer Science Department Princeton University
Congestion Control for High Bandwidth-Delay Product Environments Dina Katabi Mark Handley Charlie Rohrs.
Sven Ubik, CESNET TNC2004, Rhodos, 9 June 2004 Performance monitoring of high-speed networks from NREN perspective.
P2P File Sharing Systems
DaVinci: Dynamically Adaptive Virtual Networks for a Customized Internet Jennifer Rexford Princeton University With Jiayue He, Rui Zhang-Shen, Ying Li,
SEDA: An Architecture for Well-Conditioned, Scalable Internet Services
These materials are licensed under the Creative Commons Attribution-Noncommercial 3.0 Unported license (
Overlay Network Physical LayerR : router Overlay Layer N R R R R R N.
CS332, Ch. 26: TCP Victor Norman Calvin College 1.
The ProactiveWatch Monitoring Service. Are These Problems For You? Your business gets disrupted when your IT environment has issues Your employee and.
Fairness Attacks in the eXplicit Control Protocol Christo Wilson Christopher Coakley Ben Y. Zhao University of California Santa Barbara.
Advanced Network Architecture Research Group 2001/11/74 th Asia-Pacific Symposium on Information and Telecommunication Technologies Design and Implementation.
Wide-Area Service Composition: Performance, Availability and Scalability Bhaskaran Raman SAHARA, EECS, U.C.Berkeley Presentation at Ericsson, Jan 2002.
Virtual Workspaces Kate Keahey Argonne National Laboratory.
1 Testbeds Breakout Tom Anderson Jeff Chase Doug Comer Brett Fleisch Frans Kaashoek Jay Lepreau Hank Levy Larry Peterson Mothy Roscoe Mehul Shah Ion Stoica.
DaVinci: Dynamically Adaptive Virtual Networks for a Customized Internet Jiayue He, Rui Zhang-Shen, Ying Li, Cheng-Yen Lee, Jennifer Rexford, and Mung.
CS640: Introduction to Computer Networks Aditya Akella Lecture 20 - Queuing and Basics of QoS.
Large-scale Virtualization in the Emulab Network Testbed Mike Hibler, Robert Ricci, Leigh Stoller Jonathon Duerig Shashi Guruprasad, Tim Stack, Kirk Webb,
1 Capacity Dimensioning Based on Traffic Measurement in the Internet Kazumine Osaka University Shingo Ata (Osaka City Univ.)
1 Evaluating NGI performance Matt Mathis
Shivkumar Kalyanaraman Rensselaer Polytechnic Institute 1 Based upon slides from Jay Lepreau, Utah Emulab Introduction Shiv Kalyanaraman
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP – III Reliability and Implementation Issues.
CS640: Introduction to Computer Networks Aditya Akella Lecture 15 TCP – III Reliability and Implementation Issues.
Jennifer Rexford Fall 2014 (TTh 3:00-4:20 in CS 105) COS 561: Advanced Computer Networks TCP.
OverQos: Less Bandwidth for More Reliable Service ?? --A Criticism Hongyu Gao Gregory Peaker.
Selective Packet Inspection to Detect DoS Flooding Using Software Defined Networking Author : Tommy Chin Jr., Xenia Mountrouidou, Xiangyang Li and Kaiqi.
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
TCP continued. Discussion – TCP Throughput TCP will most likely generate the saw tooth type of traffic. – A rough estimate is that the congestion window.
1 Transport Layer: Basics Outline Intro to transport UDP Congestion control basics.
Networks, Part 2 March 7, Networks End to End Layer  Build upon unreliable Network Layer  As needed, compensate for latency, ordering, data.
DECOR: A Distributed Coordinated Resource Monitoring System Shan-Hsiang Shen Aditya Akella.
Access Link Capacity Monitoring with TFRC Probe Ling-Jyh Chen, Tony Sun, Dan Xu, M. Y. Sanadidi, Mario Gerla Computer Science Department, University of.
1 Three ways to (ab)use Multipath Congestion Control Costin Raiciu University Politehnica of Bucharest.
HOW TO BUILD A BETTER TESTBED Fabien Hermenier Robert Ricci LESSONS FROM A DECADE OF NETWORK EXPERIMENTS ON EMULAB TridentCom ’
Using Rhythmic Nonces for Puzzle-Based DoS Resistance Ellick M. Chan, Carl A. Gunter, Sonia Jahid, Evgeni Peryshkin, and Daniel Rebolledo University of.
MicroGrid Update & A Synthetic Grid Resource Generator Xin Liu, Yang-suk Kee, Andrew Chien Department of Computer Science and Engineering Center for Networked.
1 Scalability and Accuracy in a Large-Scale Network Emulator Nov. 12, 2003 Byung-Gon Chun.
Architecture and Algorithms for an IEEE 802
Applying Control Theory to Stream Processing Systems
3 | Analyzing Server, Network, and Client Health
TCP Congestion Control
TCP Congestion Control
EE 122: Lecture 22 (Overlay Networks)
Presentation transcript:

1 Flexlab: A Realistic, Controlled, and Friendly Environment for Evaluating Networked Systems Jonathon Duerig, Robert Ricci, Junxing Zhang, Daniel Gebhardt, Sneha Kasera, Jay Lepreau University of Utah HotNets-V November 30, 2006

2 Emulators (Emulab Sucks) Examples: Modelnet & Emulab The Good: Control, repeatability, wide variety of network conditions The Bad: Artificial network conditions

3 Overlay Testbeds (PlanetLab Sucks) Examples: RON & PlanetLab The Good: Real network conditions The Bad: Overloaded, No privileged operations, Poor repeatability, Hard to develop/debug

4 Goal: Best of Both Worlds (Don’t Suck)

5 Model-driven Emulation (How not to suck)

6 Key Points Flexlab is an emulation framework into which different network models may be plugged Exploit an overlay testbed to generate measurements for some example models –Models make different fidelity, overhead, and repeatability trade-offs Application-Centric Internet Modeling

7 Flexlab: Application

8 Flexlab: Application Monitor

9 Flexlab: Network Model

10 Flexlab: Measurement Repository

11 Flexlab: Path Emulator

12 Flexlab: Feedback

13 ACIM: Application-Centric Internet Modeling

14 Imagine Ideal Fidelity

15 ACIM Architecture

16 ACIM Challenges Hardening implementation to deal with PlanetLab unreliability CPU starvation on PlanetLab –Host artifacts in throughput –Packet loss from libpcap Reverse path congestion Measuring bottleneck queue size in time Discovering when bottleneck link is saturated

17 ACIM Network Conditions

18 ACIM Available Bandwidth Throughput == available bandwidth iff agent is saturating && bottleneck link is saturated Agent saturating  socket buffer full Bottleneck queue saturated  queue filling up  RTT increasing recently

19 Sample Experiment

20 Sample Results

21 Sample Results

22 Sample Results

23 Sample Results

24 Network Model Trade-offs

25 Sample Real Application: BitTorrent. with Static Model

26 BitTorrent w/ ACIM Model

27 BitTorrent w/ PlanetLab What is “correct”? Challenging to determine; work-in-progress.

28 Conclusions Contribution: Modeling Framework for Emulation –Models can allow the experimenter to trade-off fidelity, repeatability, and overhead Contribution: Application-Centric Internet Modeling Contribution: Running on Emulab and PlanetLab in alpha stage

29 Backup Slides

30 Why not just add more nodes to every PlanetLab site? (cf. public review) Remaining problems: –Poor repeatability –Hard to develop/debug –No privileged operations Malicious traffic cannot be tested Some Flexlab network models reduce network load Emulab node pool stat muxed and shared more efficiently than per-site pools Overload can (will?) still happen with PL’s pure shared-host model Major practical barriers: admin, cost

31 PlanetLab Overload (What)

32 PlanetLab Overload (Why) Only a few nodes per site –Sites supply their own nodes –No incentive to increase number of nodes No admission control No resource guarantees No incentive to minimize usage Typically tedious to set up experiments (exceptions: Emulab portal, Plush, other?)

33 Network Model 1: Static

34 Static Trade-offs Low fidelity Fixed continuous overhead Complete repeatability

35 Network Model 2: Dynamic

36 Dynamic Trade-offs Moderate fidelity Overhead proportional to number of paths used High repeatability

37 Low-Frequency Measurements Miss Changes (Changepoint Analysis) Path 20 Sec. Period 2 Sec. Period SrcDestCount Avg magnitude of 2 sec changes Commodity 22039% CommodityInternet % Internet2 00-

38 Flexlab and VINI Entirely different kinds of realism and control Flexlab: passes “experiment” traffic over shared path –Real Internet conditions from other traffic on same path, but app. traffic is not from real users –Control: of all software –Environment: friendly local dev. environ, dedicated hosts VINI: can pass “real traffic” over dedicated link –Real routing, real neighbor ISPs, potentially traffic from real users, but network resources are not realistic/representative –Dedicated pipes with dedicated bandwidth, that insulate experiment from normal Internet conditions –Control: restricted to VINI’s APIs (Click, XORP, etc) –Environment: distributed environ; shared host resources.

39 Dealing with PlanetLab Unreliability Our initial design was optimistic Nodes fail –There is no set of ‘good nodes’ –Agents must react robustly to node failure Most errors are transient –Log everything –Replay packet analysis

40 CPU Starvation on PlanetLab Host Artifacts –Long period when agent can’t read or write –Empty socket buffer or full receive window –Solution: Detect and ignore Packet loss from libpcap –Long period without reading libpcap buffer –Many packets are dropped at once –Solution: Detect and ignore

41 Handling Reverse Path Congestion Can cause ack compression Throughput Measurement –Throughput numbers become much noisier –We abuse the TCP timestamp option –PlanetLab: homogenous OS environment –Extending it would require hacking client RTT Measurement –Future work

42 Measuring Bottleneck Queue Size Important to emulate loss episodes due to congestion No one knows how in terms of bytes/packets Easier to measure in terms of time: –full = RTT when queue is full –empty = RTT when queue is empty –queue_time = full - empty

43 Initial Conditions Needed to bootstrap ACIM –ACIM uses traffic to generate conditions –But conditions must exist for first traffic We created a measurement framework –All pairs of sites are measured –Put data into measurement repository Set initial conditions to latest measurements

44 Path Emulator (detail)