Netslice: Enabling Critical Network Infrastructure with Commodity Routers Prof. Hakim Weatherspoon, Cornell University Joint with Tudor Marian, Ki Suh.

Presentation transcript:

Netslice: Enabling Critical Network Infrastructure with Commodity Routers
Prof. Hakim Weatherspoon, Cornell University
Joint with Tudor Marian, Ki Suh Lee
TRUST Autumn 2010 Conference, Stanford University
November 10, 2010

Commodity Datacenters
• Datacenters are becoming a commodity
  ◦ Unit of replacement
  ◦ Datacenter in a box: already set up with commodity hardware & software (Intel, Linux, petabyte of storage)
  ◦ Plug in network, power & cooling, and turn on
  ◦ Typically connected via optical fiber
  ◦ May have a network of such datacenters

Commodity Datacenters
[Figure; credit as transcribed: Titan tech boom, Randy Katz]
Critical Network Infrastructure, by Hakim Weatherspoon, 11/10/2010

Network Of Globally Distributed Datacenters
• Cloud computing: datacenters interconnected via fiber
  ◦ Long Fat Networks (LFN) or λ-networks
• Packet processors and extensible routers: middleboxes
  ◦ Increase functionality, performance, reliability, and security of the network
  ◦ E.g. DPI, IDS, PEP, protocol accelerators, overlay routers, multimedia servers, security appliances, and network monitors
[Diagram labels: packet processor, middleboxes]

Network Of Globally Distributed Datacenters
• Packet processors
  ◦ Maelstrom [NSDI’08, TONS’10], SMFS [FAST’09]
  ◦ FEC, TCP-Split, De-duplication, Network-sync
• OS abstractions for packet processing in user-space
  ◦ Netslice
• Lambda networks
  ◦ Cornell NLR Rings [DSN’10]
  ◦ SDNA/BiFocals [IMC’10]
[Diagram labels: Maelstrom, Netslice, SDNA/BiFocals, TCP-Split, SMFS, Cornell NLR Rings]

Challenges
• Large traffic volume processed per second (10 Gb/s)
• Typical packet processors are realized in hardware
  ◦ Trade off flexibility and programmability for speed
• Goal: improve datacenter communication
  ◦ Packet processors and extensible routers
  ◦ Abundance of commodity servers readily available
[Diagram labels: Commodity hardware, Software, Proprietary specialized hardware]

Takeaway
• Raw socket cannot take advantage of multicore/multiqueue
• Need new OS abstraction to take advantage of parallelism
[Chart labels: 9.7G, 2.25G]

Outline: Packet Processing Abstractions
• The case for user-space packet processors
• What’s wrong with the raw socket in a multicore environment?
• Hardware and software overheads
• Contention and lack of application control of resources
• Need new OS abstraction to take advantage of parallelism
• Netslice
• Evaluate
• Group introduction
• Conclude

The Case Against Low-level Packet Processors
• Idiosyncrasies of memory allocator
  ◦ Small virtual address spaces
  ◦ Inability to swap out pages
  ◦ Limit on contiguous memory chunks
• Execution contexts and preemptive precedence
  ◦ Interrupt, bottom half, task/user context
• Synchronization primitives
  ◦ Tightly coupled with execution contexts (e.g. can I block?)
• Lack of development tools
• Lack of fault isolation
  ◦ A bug in the kernel is lethal
[Diagram labels as transcribed: Hardware Operating System Kernel Network Stack Application User-space Application]

High-level Packet Processors: Where Have All My Cycles Gone?
• Opportunity: exploit hardware parallelism
• Overheads
  ◦ Contention, contention, contention!
  ◦ Memory wall
• OS design overheads
  ◦ System calls
  ◦ Context switches
  ◦ Scheduling
  ◦ Blocking
[Diagram labels: CPU, Memory]

Contention: Amdahl’s Law
• Bounds maximum expected parallelism speedup
• Fraction P of a program parallelized to run on N CPUs
• The speedup is $S = \frac{1}{(1 - P) + \frac{P}{N}}$
[Diagram: a program of total length 1 split into a serial fraction (1-P) and a parallel fraction (P); total time = serial execution + parallel execution]
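
The bound on this slide can be checked numerically. The sketch below uses the standard statement of Amdahl's law, S = 1 / ((1 - P) + P / N); the function name and the sample value P = 0.9 are illustrative choices, not taken from the talk.

```python
def amdahl_speedup(p: float, n: int) -> float:
    """Upper bound on speedup when a fraction p of the work is spread over n CPUs."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be >= 1")
    # The serial part (1 - p) is unaffected; the parallel part shrinks by a factor n.
    return 1.0 / ((1.0 - p) + p / n)


if __name__ == "__main__":
    # Illustrative only: with 10% serial work the bound flattens out quickly.
    for cpus in (1, 2, 4, 8, 16):
        print(cpus, round(amdahl_speedup(0.9, cpus), 2))
```

Even a small serial fraction caps the gain from adding cores, which is why the following slides concentrate on removing contention rather than on adding CPUs.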

Contention: Cache-coherent Architecture
• Effects of cache coherency & memory accesses
  ◦ Cores read and write blocks of data concurrently
  ◦ Commodity system: Xeon X533 with 4MB L2 cache

Contention: Peripheral NIC
• Slow cores vs. fast network interface cards (NICs)
• More cores exhibit contention and overheads
• Hardware transmit/receive multi-queue support
[Diagram label: tx/rx queues]

Software overheads in the Conventional Network Stack  Raw socket: all traffic from all NICs to user-space  Hardware and software are loosely coupled  Applications have no (end-to-end) control over resources tx/rx queues Network Stack Application Raw socket Network Stack Network Stack Network Stack Network Stack Network Stack Network Stack Network Stack Network Stack Application Network Stack 15Critical Network Infrastructure, by Hakim Weatherspoon11/10/2010

Software overheads in the Conventional Network Stack  API too general, hence complex network stack too bloated  Raw sockets, end-point sockets, files: all the same  Path taken by a packet is unnecessarily expensive  Hides information from applications  Limited functionality: least common denominator API  Inefficient API: issue one system call per packet Network Stack Application Raw socket Application TCP socket UDP socket AF_UNIX socket Application File API TCP socket Application File access 16Critical Network Infrastructure, by Hakim Weatherspoon11/10/2010

Netslice: user-space packet processor
• Give power to the application
  ◦ Packet processing in user-space
• Four-pronged approach (high level)
  ◦ Contention prevention: spatially partition hardware
  ◦ End-to-end control: provide fine-grained control over hardware
  ◦ Streamline path for packets
  ◦ Export a rich, efficient, backwards compatible API
[Diagram labels as transcribed: Hardware Operating System Kernel Network Stack Netslice Application User-space Application]

Netslice Spatial Partitioning  Contention prevention  Independent (parallel) execution contexts  Split each Network Interface Controller (NIC)  One NIC queue per NIC per context  Group and split the CPU cores  Implicit resources (bus and memory bandwidth) Temporal partitioning (time-sharing) Spatial partitioning (exclusive-access) 18Critical Network Infrastructure, by Hakim Weatherspoon11/10/2010

Netslice Spatial Partitioning Example
• 2x quad-core Intel Xeon X5570 (Nehalem)
  ◦ Two simultaneous hyperthreads per core: OS sees 16 CPUs
  ◦ Non-Uniform Memory Access (NUMA)
  ◦ QuickPath point-to-point interconnect
  ◦ Shared L3 cache
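
To make the partitioning concrete, here is a minimal sketch of how 16 logical CPUs and two multi-queue NICs could be carved into independent slices, with one queue per NIC per slice. The `Slice` type, the `build_slices` helper, the NIC names, and in particular the pairing of CPU k with CPU k + 8 are assumptions made for illustration; the slides do not state which CPUs are grouped together.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Slice:
    """One exclusive-access partition (hypothetical layout, not Netslice's actual structure)."""
    index: int
    kernel_cpu: int                          # CPU running the in-kernel execution context
    user_cpu: int                            # CPU running the user-space execution context
    nic_queues: Tuple[Tuple[str, int], ...]  # (NIC name, queue index) pairs owned by this slice


def build_slices(num_cpus: int, nics: List[str]) -> List[Slice]:
    """Split CPUs into pairs and give each pair queue k of every NIC."""
    if num_cpus < 2 or num_cpus % 2:
        raise ValueError("need an even number of CPUs, at least 2")
    half = num_cpus // 2
    return [
        Slice(
            index=k,
            kernel_cpu=k,
            user_cpu=k + half,  # assumed pairing; the real grouping may differ
            nic_queues=tuple((nic, k) for nic in nics),
        )
        for k in range(half)
    ]


if __name__ == "__main__":
    for s in build_slices(16, ["nic0", "nic1"]):
        print(s)
```

No CPU and no NIC queue appears in more than one slice, which is the property that separates spatial partitioning from time-sharing.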

Fine-grained Hardware Control
• End-to-end control
  ◦ App controls NIC queue and CPU slice allocation
  ◦ NIC hardware interrupt routing & NIC queue
  ◦ Kernel execution context
  ◦ User-space execution context
• Tight coupling of software and hardware components
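
Controlling where the user-space execution context runs comes down to setting CPU affinity. The sketch below shows that step with Python's `os.sched_setaffinity`, a Linux-only wrapper around the `sched_setaffinity(2)` system call. The helper name `pin_to_cpus` is hypothetical, and the CPU set is chosen locally here rather than obtained from a Netslice device.

```python
import os


def pin_to_cpus(cpus):
    """Restrict the caller to the given CPU ids and return the resulting CPU set (Linux only)."""
    os.sched_setaffinity(0, cpus)   # pid 0 means "the caller"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)   # CPUs we are currently allowed to run on
    target = {min(allowed)}             # illustrative choice: the lowest-numbered one
    print("pinned to", pin_to_cpus(target))
    os.sched_setaffinity(0, allowed)    # restore the original mask
```

Pinning keeps a packet-processing thread on the core whose cache already holds its working set, instead of letting the scheduler migrate it.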

Streamlined Path for Packets
• Inefficient conventional network stack
  ◦ One network stack “to rule them all”
  ◦ Performs too many memory accesses
  ◦ Pollutes cache, context switches, synchronization, system calls, blocking API

Netslice API
• Expresses fine-grained hardware control
• Flexible: based on ioctl
    ioctl(fd, NETSLICE_CPUMASK_GET, &mask);
    sched_setaffinity(getpid(), sizeof(cpu_set_t), &mask.u_peer);
• Backwards compatible (read/write)
    fd = open("/dev/netslice-1", O_RDWR);
    read(fd, iov, IOVS);
• Efficient: batch send/receive (read/write)
  ◦ Amortize overhead of protection domain crossing
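
The batching idea can be tried without a Netslice device. The sketch below moves several packets across the user/kernel boundary with one vectored system call in each direction, using `os.writev` and `os.readv` on an ordinary pipe. The pipe, the fixed `PACKET_SIZE`, and the helper names are stand-ins chosen for illustration; the semantics of Netslice's own batched `read`/`write` are not specified on the slide and may differ.

```python
import os

PACKET_SIZE = 64  # hypothetical fixed packet size, so each buffer holds exactly one packet


def send_batch(fd, packets):
    """Write all packets with a single writev() call; returns the number of bytes written."""
    return os.writev(fd, packets)


def recv_batch(fd, max_packets):
    """Read up to max_packets packets with a single readv() call, one packet per buffer."""
    bufs = [bytearray(PACKET_SIZE) for _ in range(max_packets)]
    nbytes = os.readv(fd, bufs)     # buffers are filled in order
    return [bytes(b) for b in bufs[: nbytes // PACKET_SIZE]]


def demo():
    """Round-trip eight packets through a pipe using two system calls in total."""
    r, w = os.pipe()
    try:
        packets = [bytes([i]) * PACKET_SIZE for i in range(8)]
        sent = send_batch(w, packets)
        received = recv_batch(r, len(packets))
    finally:
        os.close(r)
        os.close(w)
    return sent, received


if __name__ == "__main__":
    sent, received = demo()
    print(sent, "bytes sent;", len(received), "packets received")
```

With one call per packet the same transfer would take sixteen system calls, so the fixed cost of each protection-domain crossing is spread over the whole batch.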

Experimental Setup
• R710 packet processors
  ◦ Dual-socket quad-core 2.93GHz Xeon X5570 (Nehalem)
  ◦ 8MB of shared L3 cache and 12GB of RAM (6GB connected to each of the two CPU sockets)
  ◦ Two Myri-10G NICs
• R900 client end-hosts
  ◦ Four-socket 2.40GHz Xeon E7330 (Penryn)
  ◦ 6MB of L2 cache and 32GB of RAM

Netslice Evaluation
• Compare against state-of-the-art
  ◦ RouteBricks in-kernel, Click & pcap-mmap user-space
• Additional baseline scenario
  ◦ All traffic through single NIC queue (receive-livelock)
• What is the basic forwarding performance?
• How efficient is the streamlining of one Netslice?
• What is the benefit of batching?
• How does Netslice scale with the number of cores?
• Can we build high-speed complex packet processors?

Simple Packet Routing
• End-to-end throughput, MTU (1500 byte) packets
• Error bars (always present) denote standard error of mean
[Chart labels: 9.7G, 2.25G, 7.6G, 7.5G, 5.6G; annotations: 74% of kernel, 1/11 of Netslice]

Single Netslice user and kernel context CPU placement
• There are several choices

Linear Scaling with CPUs
• IPsec with 128-bit key, typically used by VPNs
• AES encryption in Cipher-block Chaining mode
[Chart: x-axis "# of CPUs used"; labels: 9.1G, 8G]

Netslice Implementation of Maelstrom
• In-kernel reference version: 8432 lines of C
• Netslice version: 1197 lines of user-space C
• Forwarding throughput: ±37.25 Mbps
• Maelstrom/Netslice goodput: ±35.7 Mbps
  ◦ 27.27% FEC overhead (for r=8, c=3)
  ◦ Mbps × (1 + c ⁄ (r+c)) = 8901 Mbps
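
The FEC arithmetic on this slide can be reproduced directly. The sketch below follows the slide's own formula, in which the overhead is c / (r + c) and the on-wire rate is goodput × (1 + c / (r + c)). The function names are illustrative. The goodput value itself is missing from this transcript, so `implied_goodput` only shows what the formula implies for an 8901 Mbps total; it is not a reported measurement.

```python
from fractions import Fraction


def fec_overhead(r: int, c: int) -> Fraction:
    """Overhead fraction as used on the slide: c repair packets per r + c packets."""
    return Fraction(c, r + c)


def wire_rate(goodput_mbps, r: int, c: int):
    """Total rate according to the slide's formula: goodput * (1 + c / (r + c))."""
    return goodput_mbps * (1 + fec_overhead(r, c))


def implied_goodput(wire_mbps, r: int, c: int):
    """Invert the slide's formula to get the goodput implied by a given total rate."""
    return wire_mbps / (1 + fec_overhead(r, c))


if __name__ == "__main__":
    r, c = 8, 3
    print(f"overhead: {float(fec_overhead(r, c)):.2%}")
    print(f"goodput implied by 8901 Mbps: {float(implied_goodput(8901, r, c)):.2f} Mbps")
```

For r = 8 and c = 3 the overhead is 3/11, which matches the 27.27% quoted on the slide.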

10Gbps and beyond
• Netslice
  ◦ Flexible API and spatial partitioning
• Nehalem CPUs
  ◦ FSB not the bottleneck any longer
• Multiqueue NICs
  ◦ Each core carves a private slice of every NIC
• Batching
  ◦ Userspace multi-read / multi-write instead of ossified conventional read / write
• Traditional tricks
  ◦ Pin down memory, minimize LLC contention, etc.

Conclusion
• Network layer is fundamental to datacenter operations
  ◦ Packet processors enhance network functionality and performance
• Improve network performance with software packet processors running on commodity servers in userspace
  ◦ OS support to build packet processing applications
  ◦ Harness implicit parallelism of modern hardware to scale
  ◦ Solution completely portable; kernel module loaded at runtime

Paper Trail
Theme: “Datacenter Middleboxes”
• BiFocals/SDNA in IMC-2010
• NLR study in DSN-2010
• SMFS in FAST-2009
• Maelstrom (FEC) in TONS-2010 and NSDI-2008
• FWP, NSDI-2008 Poster Session
• More at

Questions