Presentation transcript:

Faithful Reproduction of Network Experiments. Dimosthenis Pediaditakis, Charalampos Rotsos, Andrew W. Moore. Computer Laboratory, Systems Research Group, University of Cambridge, UK. (ANCS 2014, Marina del Rey, California, USA.)

Research on networked systems: yesterday. 100 Mbps and 1 GbE links.

Research on networked systems: the modern era. 1 GbE and 10 GbE links, WAN links at 40+ Gbps. How do we evaluate new ideas?

Simulation (ns-3): too much abstraction. Fat-tree topology with 8x clients, 12x switches, 1 GbE links, 8 Gbps aggregate. The ns-3 flat model yields 2.75x lower throughput.

Emulation (MiniNet): poor scalability. With an identical experiment setup, MiniNet runs out of CPU cycles: 4.5x lower throughput and performance artifacts.

Everything is a trade-off: fidelity, scalability, reproducibility. Emulation sacrifices scalability; simulation sacrifices fidelity. Reproducibility is natural for simulation; for emulation, MiniNet is the pioneer, but how do we maintain it across different platforms?

SELENA: Standing on the shoulders of giants. (Table: simulation, emulation, SELENA, hybrid, and testbeds compared on reproducibility, real network stacks, unmodified applications, hardware requirements, scalability, fidelity, and execution speed.) Fidelity: emulation over Xen with real OS components. Reproducibility: the MiniNet approach. Scalability: time dilation (the DieCast approach). Full user control: trade execution speed for fidelity and scalability.

API and experimental workflow: an experiment description, written against the Python API, is processed by the Selena compiler.
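
The slides do not show the API itself; the following is a hypothetical sketch of what such a Python experiment description could look like. All class and method names here (Experiment, add_host, add_switch, add_link, the tdf argument) are illustrative assumptions, not SELENA's actual API.

```python
# Hypothetical sketch of a SELENA-style experiment description.
# Names and signatures are illustrative; the real API may differ.

class Experiment:
    def __init__(self, name, tdf=1):
        self.name, self.tdf = name, tdf   # tdf: time-dilation factor
        self.nodes, self.links = [], []

    def add_host(self, name):
        self.nodes.append(("host", name))
        return name

    def add_switch(self, name):
        self.nodes.append(("switch", name))
        return name

    def add_link(self, a, b, bw_mbps=1000, delay_ms=0.1):
        self.links.append((a, b, bw_mbps, delay_ms))

exp = Experiment("star-topology", tdf=10)     # run under 10x time dilation
sw = exp.add_switch("s1")
for i in range(8):
    h = exp.add_host(f"h{i}")
    exp.add_link(h, sw, bw_mbps=1000)
# The description would then be handed to the Selena compiler for deployment on Xen.
```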

SELENA’s emulation model over Xen. (Figure: OVS bridge.)

The concept of time dilation: "I command you to slow down." One tick corresponds to 1/C_Hz seconds of real time. Example: 10 Mbits transferred over 6 real-time ticks gives rate_REAL = 10 / (6 × 1/C_Hz) Mbps. Under 2x dilated time (TDF = 2), guest time advances at half the rate (equivalently, the guest perceives a doubled tick rate), so the same transfer appears to span only 3 virtual-time ticks: rate_VIRT = 10 / (3 × 1/C_Hz) Mbps = 2 × rate_REAL.
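
As a sanity check on the arithmetic above, here is a small worked example in plain Python. The tick frequency C_Hz = 1000 is an assumed value chosen only for illustration; any value gives the same 2x ratio.

```python
# Worked example of the time-dilation arithmetic from the slide.

C_Hz = 1000                 # assumed tick frequency (ticks per second)
tick = 1.0 / C_Hz           # one tick = 1/C_Hz seconds
data_mbits = 10.0
TDF = 2                     # time-dilation factor

rate_real = data_mbits / (6 * tick)   # 10 Mbits observed over 6 real-time ticks
rate_virt = data_mbits / (3 * tick)   # the same data appears to span 3 virtual ticks

assert abs(rate_virt - TDF * rate_real) < 1e-9
print(rate_real, rate_virt)           # perceived rate is TDF times the real rate
```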

Scaling resources via time dilation.
Step 1: Create a scenario.
Step 2: Choose a time-dilation factor (TDF). This scales all resources (network, CPU, RAM bandwidth, disk I/O) linearly and symmetrically.
Step 3: Control the “perceived” available resources independently, configured via SELENA’s API: CPU (Xen Credit2), network (Xen VIF QoS, netem), disk I/O (inside the guests via cgroups). A sketch of this arithmetic follows below.
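
A minimal sketch of the resource arithmetic implied by steps 2 and 3, assuming the host capacities listed below; the numbers and the helper function are illustrative, not part of SELENA.

```python
# Illustrative only: how a chosen TDF scales the capacity an experiment can
# faithfully emulate, with optional independent caps layered on top.

HOST = {"net_gbps": 10, "cpu_cores": 4, "disk_mbps": 500}   # assumed host capacities

def perceived_capacity(host, tdf, caps=None):
    """Scale every resource by the TDF, then apply independent per-resource caps."""
    scaled = {k: v * tdf for k, v in host.items()}
    for k, cap in (caps or {}).items():
        scaled[k] = min(scaled[k], cap)
    return scaled

# A TDF of 10 lets a 10 Gbps host stand in for ~100 Gbps of emulated capacity,
# while disk I/O is capped independently (as one might do with cgroups).
print(perceived_capacity(HOST, tdf=10, caps={"disk_mbps": 1000}))
```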

Xen PV-guest time-keeping. A PV guest obtains time from the hypervisor: TSC values read via rdtsc and the Xen clock source, VIRQ_TIMER virtual interrupts, and HYPERVISOR_set_timer_op to set the next timer event. Guest notions of time: wall-clock time (epoch), system time (since boot), and an independent mode. rdtsc modes of operation: native or emulated. Time consumers in the guest: scheduled timers, periodic timers, loop delays.

Implementing time dilation (Linux guest over the Xen hypervisor). rdtsc handling: either trap, emulate and scale “rdtsc”, or use native “rdtsc” (constant, invariant TSC). Start-of-day: the guest receives a dilated wall-clock time. VCPU time info is scaled: _u.tsc_timestamp = tsc_stamp; _u.system_time = system_time; _u.tsc_to_system_mul = tsc_to_system_mul;. Single-shot timers (VCPUOP_set_singleshot_timer) are stretched: set_timer(&v->singleshot_timer, dilatedTimeout);. The periodic VIRQ_TIMER is implemented but not really used.
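
A toy model of the two transformations described above (scaling guest-visible time and stretching timer deadlines), written as plain Python purely for intuition; the real changes live inside the Xen hypervisor's PV time interface and timer code in C, and the function names here are not Xen's.

```python
# Toy model of time dilation, for intuition only.

TDF = 10  # time-dilation factor chosen for the experiment

def guest_time(real_elapsed_s, start_of_day_s=0.0):
    """Guest-visible time advances TDF times slower than real time."""
    return start_of_day_s + real_elapsed_s / TDF

def host_deadline(guest_timeout_s):
    """A timeout requested by the guest must fire TDF times later in real time."""
    return guest_timeout_s * TDF

# After 10 real seconds the guest believes only 1 second has passed,
# and a 1-second guest timer actually fires after 10 real seconds.
assert guest_time(10.0) == 1.0
assert host_deadline(1.0) == 10.0
```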

Summarizing the elements of fidelity: resource scaling via time dilation; real network stacks and other OS components; real applications, including SDN controllers; realistic SDN switch models. Why are switch models important, and how do they affect performance?

OpenFlow switch X-ray. Components: switch network OS, ASIC, OpenFlow agent, control applications, control channel. What matters for performance: available control-channel capacity and synchronicity; PCI bus capacity is limited in comparison to the data plane; the ASIC driver affects how fast policy is configured in the ASIC; co-processor resources are scarce and switch-OS scheduling is non-trivial; control-application complexity. Control-plane performance is critical for the data plane.

Building an OpenFlow switch model.
Pica8 P-3290 switch: measure message-processing performance with OFLOPS and extract the latency characteristics of flow-table management, the packet interception/injection mechanism, and counter extraction.
Configurable switch model: replicate the measured latency and loss characteristics. Implementation: a Mirage-OS based switch with flexible, functional, non-bloated code; unikernel performance; and a small footprint for scalable emulations (a sketch of such a model follows).
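
The slides do not include the model itself; the following is a hypothetical sketch of what a configurable, measurement-driven switch model could look like. The distributions and parameter values are placeholders, not the Pica8/OFLOPS measurements.

```python
# Hypothetical sketch of an empirically parameterised switch-latency model.
import random

class SwitchModel:
    def __init__(self, flow_mod_ms, pkt_in_ms, loss_prob):
        self.flow_mod_ms = flow_mod_ms   # (mean, stddev) for flow-table updates
        self.pkt_in_ms = pkt_in_ms       # (mean, stddev) for packet interception/injection
        self.loss_prob = loss_prob       # probability a control message is dropped

    def flow_mod_latency(self):
        mean, std = self.flow_mod_ms
        return max(0.0, random.gauss(mean, std))

    def packet_in_latency(self):
        if random.random() < self.loss_prob:
            return None                  # message lost
        mean, std = self.pkt_in_ms
        return max(0.0, random.gauss(mean, std))

# Placeholder parameters; a real model would be fitted to the switch measurements.
model = SwitchModel(flow_mod_ms=(2.0, 0.5), pkt_in_ms=(1.0, 0.3), loss_prob=0.01)
print(model.flow_mod_latency(), model.packet_in_latency())
```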

Evaluation methodology:
1. Run the experiment on real hardware.
2. Reproduce the results in MiniNet, ns-3, and SELENA (for various TDF values).
3. Compare against the “real” baseline.

Throughput fidelity. MiniNet and ns-3: … Gbps and 5.2 Gbps. SELENA with 10x dilation: 99.5% accuracy, executing 9x faster than ns-3.

Latency fidelity. Setup: 18 nodes, 1 Gbps links, … flows. MiniNet and ns-3 accuracy: 32% and 44%. SELENA accuracy: 71% with 5x dilation, 98.7% with 20x dilation.

SDN control-plane fidelity. Workload: completion times of 1 Mb TCP flows with exponential arrivals (λ = 0.02). The stepping behaviour is caused by TCP SYN and SYN-ACK losses; the MiniNet switch model does not capture this throttling effect, and it cannot capture transient switch-OS scheduling effects.
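
For concreteness, a small sketch of the arrival process behind this workload, assuming λ = 0.02 is the rate of the exponential inter-arrival distribution; the time unit is an assumption, since the slide does not state it.

```python
# Illustrative generator for the workload: 1 Mb TCP flows whose inter-arrival
# times are exponentially distributed with rate lambda = 0.02.
import random

LAMBDA = 0.02                 # arrival rate (mean inter-arrival time = 1/0.02 = 50 units)
FLOW_SIZE_BITS = 1_000_000    # 1 Mb per flow

def flow_arrivals(n_flows, seed=0):
    rng = random.Random(seed)
    t = 0.0
    for _ in range(n_flows):
        t += rng.expovariate(LAMBDA)
        yield t, FLOW_SIZE_BITS

for start, size in flow_arrivals(5):
    print(f"flow of {size} bits starts at t = {start:.1f}")
```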

Application fidelity (LAMP). Fat-tree (CLOS) topology: 1 Gbps links, 10x switches, 4x clients, 4x web servers running Apache2, PHP, MySQL, Redis, and WordPress.

A layered SDN controller hierarchy. Setup: 4-pod fat-tree topology, 1 GbE links, 32 Gbps aggregate traffic, with a layered control-plane architecture (1st-layer and 2nd-layer controllers). Question: how does a layered controller hierarchy affect performance? With more layers, control decisions are taken higher in the hierarchy and flow-setup latency increases (network, request pipelining, CPU load); resilience is also affected.

Scalability analysis. Fat-tree topology, 1 GbE links, multi-Gbit sink link; Domain-0 is allocated 4 cores. Why does it top out at 250% CPU utilisation? Result: near-linear scalability. (Figure: OVS bridge.)

How to (not) use SELENA.
SELENA is primarily a network emulation framework: a perfect match for network-bound applications, with tuning knobs for experimenting with the relative performance of CPU, disk I/O and network, and with real applications, SDN controllers and network stacks.
Time dilation is not a panacea: it does not capture device-specific disk I/O performance, cache thrashing and data locality, multi-core effects (e.g. per-core lock contention), hardware features (e.g. Intel DDIO), or the scheduling effects of Xen at scale (hundreds of VMs).
Rule of thumb for choosing the TDF: keep Dom-0 and Dom-U utilisation low; observation time-scales matter.

Work in progress: API compatibility with MiniNet; further scalability improvements (multi-machine emulation, optimised guest-to-guest Xen communication); new features and use cases (coupling SDN with workload consolidation, emulation of live VM migration, incorporating energy models).

SELENA is free and open. Give it a try.

Research on networked systems: past, present, future. Animation: three example networks, showing the evolution of the “network characteristics” on which research is conducted. Past: 2-3 layers, hierarchical, ToR, 100 Mbps, bare-metal OS. Present: fat-tree, 1 Gbps links, virtualisation, WAN links. Near future: flexible architectures, 10 Gbps, elastic resource management, SDN controllers, OpenFlow switches, large scale (data centres). The point of this slide is that real-world systems progress at a fast pace (complexity, size) but common tools have not kept up with this pace. I will challenge the audience to think about: which of the three illustrated networks they believe they can model with existing tools; at what level of fidelity (including protocols, SDN, applications, network emulation); and what common sizes and link speeds they can model.

A simple example with ns-3. Here I will assume a simple star topology: 10x clients, 1x server, 1x switch (10 Gbps aggregate). I will provide the throughput plot and explain why performance suffers. Point out that ns-3 is not appropriate for faster networks: overly simple models and non-real applications. Using DCE is even slower and not fully POSIX-compliant.

A simple example with MiniNet. Same setup as before, with the throughput plot. Better fidelity in terms of protocols, applications, etc., but at a performance penalty. Explain what the bottleneck is, especially in relation to MiniNet’s implementation.

Everything is a trade-off. Nothing comes for free when it comes to modelling and the three key experimentation properties: fidelity, scalability, reproducibility. MiniNet aims for fidelity and sacrifices scalability. ns-3 aims for scalability (many abstractions) and sacrifices fidelity, while still having scalability limitations. On the importance of reproducibility: MiniNet is a pioneer, but reproducibility is difficult to maintain from machine to machine; MiniNet can guarantee it only at the level of configuration, not at the level of performance.

SELENA: Standing on the shoulders of giants. Fidelity: use emulation, with unmodified applications and protocols (fidelity plus usability); Xen provides support for common OSes, good scalability, and fine control over resources. Reproducible experiments: the MiniNet approach, with high-level experiment descriptions and automation. Maintaining fidelity under scale: the DieCast approach, time dilation (more on that later). The user is the master: the tuning knob is experiment execution speed.

SELENA architecture. Animation here: three steps showing how an experiment is specified (Python API), compiled, and deployed. Explain the mapping of network entities and features to Xen emulation components, and give hints of the optimisation tweaks we use under the hood. (Diagram: experiment description, Python API, Selena compiler.)

Time dilation and reproducibility. Explain how time dilation also facilitates reproducibility across different platforms. Reproducibility means, first, replication of configuration: network architecture, links, protocols, applications, traffic/workloads; in SELENA this is done through the Python API and the Xen API. Second, reproduction of results and observed performance: each platform must have enough resources to run the experiment faithfully; in SELENA this is achieved via time dilation, so an older platform or older hardware will require a different minimum TDF to reproduce the same results.

Demystifying time dilation (1/3). Explain the concept in high-level terms and give a solid example with a timeline, similar to slide 8 of http://sysnet.ucsd.edu/projects/time-dilation/nsdi06-tdf-talk.pdf. Explain that everything happens at the hypervisor level: guest time sandboxing (experiment VMs), common time for kernel and user space, and no modifications needed for PV guests (Linux, FreeBSD, ClickOS, OSv, Mirage).

Demystifying time dilation (2/3). Here we explain the low-level details. Give credit to DieCast, but also explain the incremental work we did. Best shown and explained with an animation.

Demystifying time dilation (3/3). Resource scaling: linear and symmetric scaling for network, CPU, RAM bandwidth and disk I/O; the TDF only increases the perceived performance headroom of these; SELENA allows the perceived speeds of CPU, network and disk I/O to be configured independently (disk I/O from within the guests at the moment, via cgroups). Typical workflow: 1. create a scenario; 2. decide the minimum necessary TDF (more on that later); 3. independently scale resources, based on the users' requirements and the focus of their studies.

Summarizing the elements of fidelity: resource scaling via time dilation (already covered); real stacks and other OS components; real applications, including SDN controllers; realistic SDN switch models, why they are important, and how much they can affect observed behaviours.

Inside an OpenFlow switch. Present a model of an OpenFlow switch's internals: show the components and the paths/interactions that affect performance, for the data plane (which we do not currently model) and the control plane. (Random image from the web, just a placeholder.)

Building a realistic OpenFlow switch model. Methodology for constructing an empirical model: Pica8 switch, OFLOPS measurements; collect, analyse, and extract trends into a stochastic model. Use a Mirage-based switch to implement the model: flexible, functional, non-bloated code; performant (unikernel, no context switches); small footprint for scalable emulations.

Evaluation methodology: 1. run the experiment on real hardware; 2. reproduce the results in MiniNet, ns-3, and SELENA (for various TDF values); 3. compare each one against the “real” baseline. We evaluate multiple aspects of fidelity: data plane, flow level, SDN control, and application.

Data-plane fidelity. Figure from the paper. Explain the star topology; show the comparison of MiniNet and ns-3, the same figures as slides 2 and 3, but now compared against SELENA and the real hardware. Point out how increasing the TDF affects fidelity.

Flow-level fidelity. Figure from the paper; explain the fat-tree topology.

Execution speed. Compare against ns-3 and MiniNet. Point out that SELENA executes faster than ns-3; ns-3, moreover, replicates only a half-speed network, so the difference is even bigger.

SDN control-plane fidelity. Figure from the paper; explain the experiment setup. Point out the shortcomings of MiniNet (it is only as good as OVS) and the poor support for SDN in ns-3.

Application-level fidelity. Figure from the paper; explain the experiment setup and the latency aspect. Show how CPU utilisation matters for fidelity, then open the discussion of performance bottlenecks and limitations to make a smooth transition to the next slide.

Near-linear scalability. Figure from the paper; explain how scalability is determined for a given TDF.

Limitations discussion. Explain the effects of running on Xen, and what happens if the TDF is too low and utilisation is high. Explain that insufficient CPU compromises the emulated network speeds and the guests' ability to utilise the available bandwidth, skews the performance of networked applications, and adds excessive latency; scheduling also contributes.

A more complicated example. Showcase the power of SELENA, using the MRC2 experiment.

Work in progress: API compatibility with MiniNet; further scalability improvements (multi-machine emulation, optimised guest-to-guest Xen communication); new features and use cases (coupling SDN with workload consolidation, emulation of live VM migration, incorporating energy models).

SELENA is free and open. Give it a try.