
ClickOS and the Art of Network Function Virtualization
Joao Martins*, Mohamed Ahmed*, Costin Raiciu§, Roberto Bifulco*, Vladimir Olteanu§, Michio Honda*, Felipe Huici*
* NEC Labs Europe, Heidelberg, Germany; § University Politehnica of Bucharest
Presenter: YoungGyoun Moon, EE807 - Software-defined Networked Computing
Some slides are adapted from the authors' presentation.

The Idealized Network

[Figure: the classic layered view - applications and transport at the end hosts, with network, datalink, and physical layers along the path.]

A Middlebox World

carrier-grade NAT, load balancer, DPI, QoE monitor, ad insertion, BRAS, session border controller, transcoder, WAN accelerator, DDoS protection, firewall, IDS

Typical Enterprise Networks

[Figure: a typical enterprise network connected to the Internet.]

Hardware Middleboxes - Drawbacks

- Expensive equipment/power costs
- Difficult to add new features (vendor lock-in)
- Difficult to manage
- Cannot be scaled on demand (peak planning)

Shifting Middlebox Processing to Software

- Can share the same hardware across multiple users/tenants
- Reduced equipment/power costs through consolidation
- Safe to try new features on an operational network/platform

But can it be built using commodity hardware while still achieving high performance?
ClickOS: a tiny Xen-based virtual machine that runs Click

Requirements for a Software Middlebox - and how ClickOS meets them:

- Fast instantiation: 30 msec boot times
- Small footprint: 5 MB when running
- Isolation: provided by Xen
- Flexibility: provided by Click
- Performance: 10 Gb/s line rate, 45 μsec delay per packet

What's ClickOS?

Work consisted of:
- Build system to create ClickOS images (5 MB in size)
- Emulating a Click control plane over MiniOS/Xen
- Reducing boot times (roughly 30 milliseconds)
- Optimizations to the data plane (10 Gb/s for almost all packet sizes)
- Implementation of a wide range of middleboxes

[Figure: a ClickOS domU runs Click directly on MiniOS with paravirtualized drivers, in place of a conventional guest OS running applications.]

Programming Abstraction for Middleboxes

Building a middlebox using Click: we can use 300+ stock Click elements!
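Middleboxes are composed by wiring elements together. As a minimal sketch (the element names are standard stock Click, but this particular pipeline is our illustration, not one from the paper):

```click
// Illustrative Click configuration: take packets from one
// interface, count them, queue them, and emit them on another.
FromDevice(eth0)
    -> Counter
    -> Queue(1024)
    -> ToDevice(eth1);
```

Elements are instantiated and connected with `->`; realistic middleboxes chain classification, rewriting, and scheduling elements from the same stock library.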

Click middlebox inside a Linux VM:
- (+) Can achieve domain isolation
- (-) Memory-hungry (128 MB or more)
- (-) Large boot time (5 seconds to boot)

Why use MiniOS?
- Single address space (no kernel/user separation): completely eliminates system call costs
- Runs a single application per VM, on a single core only (enough for 10 Gbps packet processing)
- Cooperative scheduler: reduces context switch overhead
- Small footprint and minimal boot time

ClickOS Architecture

Console + Xen store = communication path (Dom0 to VM)
- Allows users to install Click configs in the ClickOS VM
- Set and retrieve state in elements (e.g. packet counter)

ClickOS Boot Time

Task                                                  Time spent
Issue and carry out the hypercall to create the VM    5.2 ms
Build and boot VM image                               9.0 ms
Set up the Xen store                                  2.2 ms
Create the console                                    4.4 ms
Attach the VM to the back-end switch                  1.4 ms
Install the Click configuration                       6.6 ms
Total                                                 28.8 ms

Notes:
- (optimized) Re-designed the MiniOS toolchain for faster boot.
- (added) Provide a console through which Click configurations can be controlled.
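As a quick arithmetic sanity check, the per-stage times in the table do sum to the reported total:

```python
# Per-stage ClickOS boot costs from the table above, in milliseconds.
stages = {
    "hypercall to create the VM": 5.2,
    "build and boot VM image": 9.0,
    "set up the Xen store": 2.2,
    "create the console": 4.4,
    "attach VM to back-end switch": 1.4,
    "install the Click configuration": 6.6,
}

# Round to one decimal place to avoid float noise.
total_ms = round(sum(stages.values()), 1)
print(total_ms)  # 28.8, matching the "Total" row
```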

Booting a Large Number of ClickOS VMs

[Figure: boot time rises to roughly 220 msec as many VMs are started, due to contention on the Xen store.]

ClickOS Memory Footprint

Usage                                                         Size
Basic memory footprint of a ClickOS image                     5 MB
Netmap packet ring buffers (e.g. slot ring)                   8 MB
Middlebox application state (e.g. rules for IP router
and firewall, rules for IDS)                                  137 KB
Total                                                         < 14 MB

Note: (optimized) the image includes 200+ supported Click elements and a minimal set of libraries.

ClickOS

Work consisted of:
- Build system to create ClickOS images (5 MB in size)
- Emulating a Click control plane over MiniOS/Xen
- Reducing boot times (roughly 30 milliseconds)
- Optimizations to the data plane (10 Gb/s for almost all packet sizes)
- Implementation of a wide range of middleboxes

Performance Analysis

[Table: packet rates needed to fill 10 Gbit/s at various packet sizes, ranging from Mpps for small packets down to Kpps for maximum-sized packets.]
[Figure: the Xen network I/O pipeline - Click FromDevice/ToDevice over netfront in the guest; the Xen ring API (data), event channel, and Xen bus/store; netback, vif, Open vSwitch, and the native driver in the driver domain. Baseline measurements: 300 Kpps, 350 Kpps, 225 Kpps (maximum-sized packets).]

Performance Analysis

Copying packets between guests greatly affects packet I/O:
(1) Packet metadata allocations (772 ns)
(2) Backend switch is slow (~600 ns)
(3) Mini netfront not as good as Linux (~3.4 us)

[Figure: these costs annotated on the I/O pipeline between the driver domain (Dom0) and the ClickOS domain.]

Optimizing Network I/O

NF(netfront)-MiniOS-opt:
- Reuse memory grants
- Now uses polling for Rx

NB-vale:
- Introduce VALE as the backend switch, a netmap-supported software switch for VMs (Luigi Rizzo, CoNEXT 2012)
- Increase I/O request batch size

[Figure: the pipeline with VALE and the netmap driver replacing Open vSwitch in the driver domain (Dom0).]

Optimizing Network I/O

Optimized Xen network I/O pipe:
- netback: no longer involved with packet transfer
- netfront: uses the netmap API for better performance, mapping the packet buffers directly into its address space
- More optimizations: asynchronous transmit, grant re-use

Optimized VALE:
- Static MAC address-to-port mapping (instead of a learning bridge)

[Figure: directly mapped packet buffers shared between netfront in the ClickOS domain and VALE in the driver domain (Dom0).]
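The VALE change swaps a learning bridge for a fixed lookup table. The idea can be sketched as follows (this Python sketch is our illustration of the concept, not the actual in-kernel VALE code, which is written in C; the MAC addresses and port numbers are hypothetical):

```python
# Sketch: static MAC-address-to-port forwarding, in contrast to a
# learning bridge that updates its table on every received frame.
# The table is filled once, when VMs are attached to the switch.
MAC_TO_PORT = {
    "00:16:3e:00:00:01": 1,  # hypothetical ClickOS VM 1
    "00:16:3e:00:00:02": 2,  # hypothetical ClickOS VM 2
}

FLOOD_PORT = 0  # hypothetical port used for unknown destinations


def forward_port(dst_mac: str) -> int:
    """Return the output port for a destination MAC address.

    With a static mapping there is no per-packet table maintenance
    (no learning, no aging), which removes work from the data path.
    """
    return MAC_TO_PORT.get(dst_mac, FLOOD_PORT)
```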

ClickOS Prototype Overview

- Click changes are minimal (~600 LoC)
- New toolstack for fast boot times
- Cross-compile toolchain for MiniOS-based apps
- netback changes comprise ~500 LoC
- netfront changes comprise ~600 LoC; we have implemented netfront drivers for both MiniOS and Linux

EVALUATION

Experiments

- Delay compared with other systems
- Switch performance for 1+ NICs
- ClickOS/MiniOS performance
- Chaining experiments
- Scalability over multiple guests
- Scalability over multiple NICs
- Implementation and evaluation of middleboxes
- Linux performance

ClickOS Base TX Performance

- Intel Xeon E core 3.2 GHz (Sandy Bridge), 16 GB RAM, 1x Intel x520 10 Gbps NIC
- One CPU core assigned to the VMs, the rest to Domain-0
[Figure: a ClickOS VM connected to a measurement box by a 10 Gb/s direct cable.]

ClickOS Base TX Performance

[Figure: base TX throughput results.]

ClickOS Tx vs. Rx Performance

[Figure: Tx vs. Rx throughput results.]

ClickOS (virtualized) Middlebox Performance

- Intel Xeon E core 3.2 GHz (Sandy Bridge), 16 GB RAM, 1x Intel x520 10 Gbps NIC
- One CPU core assigned to the VMs, 3 CPU cores for Domain-0
[Figure: ClickOS between Host 1 and Host 2, connected by 10 Gb/s direct cables.]

ClickOS (virtualized) Middlebox Performance

[Figure: middlebox throughput results; line rate for 512 B packets.]

Chaining ClickOS VMs (back-to-back)

[Figure: throughput as VMs are chained back-to-back; 1 CPU core runs all the VMs.]

Scaling out – 100 VMs Aggregate Throughput

[Figure: aggregate throughput with 100 VMs; 1 CPU core runs all the VMs (Tx only).]

Conclusions

- Virtual machines can do flexible high-speed networking
- ClickOS: a tailor-made operating system for network processing
  - Small is better: low footprint is the key to heavy consolidation
  - Memory footprint: 5 MB; boot time: 30 ms
- Future work:
  - Massive consolidation of VMs (thousands)
  - Improved inter-VM communication for service chaining
  - Reactive VMs (e.g., per-flow)

Split/Merge (NSDI '13)

- Virtualized middleboxes often require elasticity: the ability to scale in/out to handle variations in workload
- Challenges:
  - Load distribution across VM replicas
  - Middleboxes are mostly stateful: how to keep consistency?

Split/Merge

[Figure: VM state is classified as internal (e.g. a cache), coherent (e.g. a packet counter), or partitionable (flow states). On a split, partitionable state is divided across replicas while coherency is maintained; on a merge, the replicas' state is combined back into one VM. Virtual network interfaces remain unchanged.]
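The "partitionable" category is what makes splitting possible: each flow's state can live on exactly one replica. A minimal sketch of such a split (our illustration with a hypothetical flow-table layout, not the Split/Merge implementation):

```python
# Sketch: divide per-flow middlebox state across replicas.
# Keys are hypothetical flow identifiers (e.g. 5-tuples).

def partition_flows(flow_table: dict, n_replicas: int) -> list:
    """Split a flow-state table into n_replicas disjoint tables.

    Each flow is assigned to exactly one replica by hashing its
    key, so replicas share no partitionable state and need no
    coherency protocol for it (unlike "coherent" state such as a
    global packet counter).
    """
    replicas = [dict() for _ in range(n_replicas)]
    for flow_key, state in flow_table.items():
        replicas[hash(flow_key) % n_replicas][flow_key] = state
    return replicas
```

Merging is the inverse: the disjoint per-replica tables are simply unioned back into a single table.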

Discussion

- Poor chaining performance: NFV often requires a service chain composed of multiple middleboxes
- Example: LB - NAT - WEB FW - INTERNAL LB - INTERNAL FW - ...
- What's the solution? NetVM (NSDI '14): zero-copy delivery for inter-VM communication

Discussion

Design choices for middlebox VMs:

                   Linux VM                          MiniOS
Memory footprint   128 MB                            5 MB
Boot time          5 s                               30 ms
Applications       Multiple apps can run on a VM     Single app per VM
Performance        DPDK, Open vSwitch                Optimized Xen I/O stack (netmap)

Discussion

- Virtualized middleboxes: why do we need virtualization?
  - (+) Isolation, easy migration, suitable for multi-tenancy
  - (-) Performance: ClickOS (Rx+Tx) reaches 3.7 Gbps for 64 B packets
- What are the pros and cons of a non-virtualized software middlebox?

Thank you for listening!

Backup Slides

Linux Guest Performance

Note that our Linux optimizations apply only to netmap-based applications.

Scaling out – Multiple NICs/VMs

[Figure: Host 1 and Host 2 connected by 6x 10 Gb/s direct cables, with ClickOS on the host under test.]