OSDI ‘14 Best Paper Award Adam Belay George Prekas Ana Klimovic

Presentation transcript:

IX: A Protected Dataplane Operating System for High Throughput and Low Latency
OSDI '14 Best Paper Award
Adam Belay, George Prekas, Ana Klimovic, Samuel Grossman, Christos Kozyrakis, Edouard Bugnion
Presented by Jeremy Erickson
EECS 582 – W16

What do we want? We want our networks fast! (Specifically, packet processing on commodity servers.)
When do we want it? Now. (Or back in 2014.)
https://memegenerator.net/instance/64946097

Challenges
- Tail latency
  - Some SLAs require µs tail latency
- High packet rates
  - Key-value stores: tons of small packets
- Protection
  - Multiple services share a single server
- Resource efficiency
  - Load varies significantly over time
https://memegenerator.net/instance/57260030

What else is out there?
- Basic Linux networking stack
  - Potentially tuned for parallel incoming connections
- User-space networking stacks
  - mTCP (NSDI '14 in April, compared to IX's OSDI '14 in October)
  - Not secure
- InfiniBand (RDMA)
  - Requires special hardware; datacenter only
- Some others...

Why don't we just use those?
- Basic Linux networking stack
  - Too slow
- User-space networking stacks
  - mTCP (NSDI '14 in April, compared to IX's OSDI '14 in October)
  - Not secure
- InfiniBand (RDMA)
  - Requires special hardware; datacenter only
- Some others...

Lessons learned from middleboxes
- Integrate the networking stack and the application into the dataplane
  - Application gets (almost) direct access to hardware
- Run each packet to completion
  - No context switching to introduce latency (tail latency, yuck)
- Adaptive batching when congested, for higher throughput
  - Also prevents head-of-line blocking
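To make "run to completion" and "adaptive batching" concrete, here is a minimal Python sketch. It is not IX code: `take_batch`, `run_to_completion`, `handle_packet`, and the `MAX_BATCH` value of 64 are all names and numbers assumed for illustration. The point it tries to show is that the loop never waits to fill a batch. It takes whatever is already queued, up to a cap, so batches are small when load is light and only grow under congestion.

```python
from collections import deque

MAX_BATCH = 64  # assumed cap on batch size, chosen only for this sketch


def take_batch(rx_queue, max_batch=MAX_BATCH):
    """Take the packets that are already waiting, up to max_batch.

    Never waits for more packets to arrive, so an idle system sees batches of 1.
    """
    n = min(len(rx_queue), max_batch)
    return [rx_queue.popleft() for _ in range(n)]


def run_to_completion(rx_queue, handle_packet, max_batch=MAX_BATCH):
    """Process each batch fully (stack + application) before polling again.

    handle_packet is a hypothetical callback standing in for protocol
    processing plus application logic. Returns the batch sizes used.
    """
    batch_sizes = []
    while rx_queue:
        batch = take_batch(rx_queue, max_batch)
        for pkt in batch:
            handle_packet(pkt)  # no hand-off to another thread, no context switch
        batch_sizes.append(len(batch))
    return batch_sizes
```

With 100 packets already queued, this loop uses a batch of 64 and then a batch of 36. With 3 packets queued, it uses a single batch of 3.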

Design
The paper provides this diagram. I thought it was a little hard to see what's going on.

Design (maybe this is easier to visualize?)
Similar to Exokernel, but with system protection and backwards compatibility.
[Diagram labels, top to bottom:
- Ring 3: httpd (libix), memcached (libix) | Ring 3: IXCP, App1 ... AppN
- Ring 0 (Dataplane): IX, IX | Linux kernel
- VMX Ring 0: Dune
- Hardware: 40Gbps NIC, 40Gbps NIC, 1Gbps NIC; Core 0, Core 1, Core 2, Core 3]

Important details
- Zero-copy API
  - IX grabs packets from the NIC and puts them in a memory buffer
  - The memory buffer is exposed to the dataplane application
  - Read-only, through virtual memory protection
- Incoming traffic is spread pseudo-randomly among threads
  - Uses NIC hardware Receive Side Scaling (RSS), which hashes each flow to a receive queue
- For outgoing connections, IX probes candidate source ports to find one whose return traffic lands on the same thread
- The non-IX portion of the machine can still run legacy code. Backwards compatible!
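The port-probing trick can be sketched roughly as follows. This is an illustration under loud assumptions, not the IX implementation: `rss_queue` uses CRC32 as a stand-in for the NIC's RSS hash (real NICs typically use a Toeplitz hash with a configured key), and `NUM_QUEUES`, `EPHEMERAL_PORTS`, and `pick_source_port` are hypothetical names and values. The idea is that the thread opening a connection tries source ports until the hash of the return-direction tuple maps to its own queue.

```python
import zlib

NUM_QUEUES = 4                         # assumed: one RX queue per dataplane thread
EPHEMERAL_PORTS = range(32768, 61000)  # assumed ephemeral port range


def rss_queue(src_ip, src_port, dst_ip, dst_port, num_queues=NUM_QUEUES):
    """Stand-in for the NIC's RSS hash; CRC32 is used here only for illustration."""
    key = f"{src_ip}:{src_port}>{dst_ip}:{dst_port}".encode()
    return zlib.crc32(key) % num_queues


def pick_source_port(local_ip, remote_ip, remote_port, my_queue, in_use=()):
    """Probe source ports until return traffic would hash to my_queue."""
    for port in EPHEMERAL_PORTS:
        if port in in_use:
            continue
        # Return packets arrive with the remote end as source and us as destination.
        if rss_queue(remote_ip, remote_port, local_ip, port) == my_queue:
            return port
    return None  # no free port maps to this queue
```

For example, a thread that owns queue 2 would call `pick_source_port("10.0.0.1", "10.0.0.2", 11211, 2)` (made-up addresses) and bind the returned port, so replies are delivered to the same thread that sent the request.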

Performance

Questions
- What potential performance downsides are there to this approach?
  - Large packets + run-to-completion + slow application ?= delay
- What potential usability downsides are there to this approach?
  - Applications must be specifically developed (or ported) to a new API
- How could they make it even faster?
  - They mention at the end of the paper that they could take advantage of single-root input/output virtualization (SR-IOV) to share the NICs between multiple dataplanes and scale even further
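The "?= delay" concern can be put in back-of-the-envelope form. The sketch below is a toy model, not a measurement: `completion_times` is a hypothetical helper and the service times are invented. It assumes a single thread that runs each request to completion in arrival order, so every request behind a slow one inherits its delay.

```python
def completion_times(service_times_us):
    """Finish time of each request when one thread runs them back to back.

    service_times_us: per-request processing time in microseconds
    (hypothetical values; nothing here comes from the paper).
    """
    done, clock = [], 0.0
    for s in service_times_us:
        clock += s          # the thread is busy until this request completes
        done.append(clock)
    return done
```

With invented service times of 2, 2, 500, and 2 µs, the last request finishes at 506 µs even though it needs only 2 µs of work itself.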