IX: A Protected Dataplane Operating System for High Throughput and Low Latency
OSDI '14 Best Paper Award
Adam Belay, George Prekas, Ana Klimovic, Samuel Grossman, Christos Kozyrakis, Edouard Bugnion
Presented by Jeremy Erickson
EECS 582 – W16
What do we want? When do we want it?
- We want our networks fast! (Specifically, fast packet processing on commodity servers)
- When do we want it? Now! (Or back in 2014)
(image: https://memegenerator.net/instance/64946097)
Challenges
- Tail latency: some SLAs require µs-scale tail latency
- High packet rates: key-value stores generate tons of small packets
- Protection: multiple services share a single server
- Resource efficiency: load varies significantly over time
(image: https://memegenerator.net/instance/57260030)
What else is out there?
- Basic Linux networking stack, potentially tuned for parallel incoming connections
- User-space networking stacks, e.g., mTCP (NSDI '14 in April, compared to IX's OSDI '14 in October)
- InfiniBand (RDMA)
- Some others...

Why don't we just use those?
- Basic Linux networking stack: too slow
- User-space networking stacks (mTCP): not secure
- InfiniBand (RDMA): requires special hardware -- datacenter only
- Some others...
Lessons learned from Middleboxes
- Integrate the networking stack and the application into a dataplane
  - The application gets (almost) direct access to the hardware
- Run each packet to completion (sketched below)
  - No context switching to introduce latency (tail latency, yuck)
- Adaptive batching when congested, for higher throughput
  - Also prevents head-of-line blocking
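To make the run-to-completion and adaptive-batching ideas concrete, here is a minimal sketch of the kind of per-core receive loop a dataplane like IX runs. It is written under assumptions rather than taken from the IX sources: ix_rx_poll, ix_handle_packet, ix_tx_flush, and MAX_BATCH are hypothetical names standing in for "dequeue from the NIC ring", "protocol plus application processing", and "flush responses".

```c
#include <stddef.h>

#define MAX_BATCH 64  /* bounding the batch keeps tail latency bounded */

struct pkt;                                        /* opaque packet descriptor */
size_t ix_rx_poll(struct pkt **pkts, size_t max);  /* hypothetical: dequeue up to max packets */
void   ix_handle_packet(struct pkt *p);            /* hypothetical: TCP/IP + application logic */
void   ix_tx_flush(void);                          /* hypothetical: push queued responses to the NIC */

void dataplane_loop(void)
{
    struct pkt *batch[MAX_BATCH];

    for (;;) {
        /* Adaptive batching: take whatever has accumulated, up to the cap.
         * Under light load this is ~1 packet (lowest latency); under
         * congestion the batch grows, amortizing per-iteration overheads. */
        size_t n = ix_rx_poll(batch, MAX_BATCH);

        /* Run to completion: protocol processing and the application's work
         * for one packet finish before the next packet is touched, with no
         * context switch in the middle. */
        for (size_t i = 0; i < n; i++)
            ix_handle_packet(batch[i]);

        /* Send out everything this batch produced in one shot. */
        ix_tx_flush();
    }
}
```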
Design
The paper provides this diagram: [IX architecture figure from the paper]
I thought it was a little hard to see what's going on.
Design (maybe this is easier to visualize?)
Similar to Exokernel, but with system protection and backwards compatibility
[Diagram: each IX dataplane pairs an application linked with libix (httpd, memcached) in Ring 3 with an IX kernel in Ring 0 (Dataplane), pinned to dedicated cores (e.g., Core 0, Core 1) and a dedicated 40Gbps NIC. The Linux kernel runs in VMX Ring 0 via Dune on the remaining cores (e.g., Core 2, Core 3) with a 1Gbps NIC, hosting IXCP and ordinary processes (App1 ... AppN) in Ring 3.]
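To ground the "httpd (libix)" and "memcached (libix)" boxes, here is a rough sketch of what an event-driven application sitting on top of a libix-style library could look like. Every identifier here (ix_poll, ix_send, ix_recv_done, IX_EV_RECV, struct ix_event) is invented for illustration and is not the real libix API; the point is only the shape: the Ring 3 application consumes events from the Ring 0 dataplane, reads payloads in place, and its responses are batched back into the dataplane.

```c
#include <stddef.h>

/* Hypothetical libix-style event interface (not the real API). */
enum ix_ev_type { IX_EV_ACCEPT, IX_EV_RECV, IX_EV_SENT, IX_EV_DEAD };

struct ix_event {
    enum ix_ev_type type;
    long            conn;   /* connection handle */
    const void     *buf;    /* payload buffer, mapped read-only (zero-copy) */
    size_t          len;
};

size_t ix_poll(struct ix_event *evs, size_t max);            /* fetch pending events */
void   ix_send(long conn, const void *buf, size_t len);      /* queue a response */
void   ix_recv_done(long conn, const void *buf, size_t len); /* return buffer to the NIC pool */

/* A trivial echo server body: handle each RECV event to completion. */
void echo_loop(void)
{
    struct ix_event evs[64];

    for (;;) {
        size_t n = ix_poll(evs, 64);
        for (size_t i = 0; i < n; i++) {
            if (evs[i].type == IX_EV_RECV) {
                ix_send(evs[i].conn, evs[i].buf, evs[i].len);      /* echo back */
                ix_recv_done(evs[i].conn, evs[i].buf, evs[i].len); /* release buffer */
            }
        }
        /* The queued sends/releases are handed to the dataplane as a batch
         * on the next transition into Ring 0. */
    }
}
```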
Important details
- Zero-copy API
  - Packets are grabbed from the NIC by IX and placed in a memory buffer
  - The memory buffer is exposed to the dataplane application, read-only through virtual memory protection
- Incoming traffic is randomly distributed among threads
  - Uses the NIC's hardware Receive Side Scaling (RSS)
- For outgoing connections, IX probes ephemeral ports until the RSS hash of the return traffic maps back to the same thread (see the sketch below)
- The non-IX portion of the machine can still run legacy code -- backwards compatible!
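The port-probing trick is easiest to see in code. The sketch below computes the Toeplitz hash (the hash family NICs use for RSS) over the 4-tuple that the return packets will carry, and keeps trying ephemeral ports until the hash lands on this thread's queue. It is a simplified illustration, not IX code: real NICs map the hash through an indirection table and the key and byte-ordering details depend on the device, and pick_local_port / toeplitz_hash are made-up helper names.

```c
#include <stdint.h>
#include <string.h>

/* Simplified software Toeplitz hash; 'key' must be at least len + 4 bytes
 * long (standard RSS keys are 40 bytes, the input here is 12 bytes). */
static uint32_t toeplitz_hash(const uint8_t *input, size_t len,
                              const uint8_t *key)
{
    uint32_t hash = 0;
    /* 32-bit window over the key, advanced one bit per input bit. */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8)  |  (uint32_t)key[3];

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            if (input[i] & (1u << bit))
                hash ^= window;
            /* Slide the window: shift left, pull in the next key bit. */
            window = (window << 1) | ((key[i + 4] >> bit) & 1u);
        }
    }
    return hash;
}

/* Pick a local ephemeral port so the RSS hash of the return traffic maps to
 * this thread's RX queue (hypothetical helper; addresses and ports are
 * assumed to already be in network byte order). */
static uint16_t pick_local_port(uint32_t local_ip, uint32_t remote_ip,
                                uint16_t remote_port,
                                unsigned my_queue, unsigned num_queues,
                                const uint8_t *rss_key)
{
    for (uint32_t p = 49152; p <= 65535; p++) {      /* ephemeral range */
        uint16_t lport = (uint16_t)p;
        uint8_t tuple[12];

        /* Return packets have src = remote, dst = local, so build the
         * hashed 4-tuple with the roles swapped. */
        memcpy(&tuple[0],  &remote_ip,   4);
        memcpy(&tuple[4],  &local_ip,    4);
        memcpy(&tuple[8],  &remote_port, 2);
        memcpy(&tuple[10], &lport,       2);

        uint32_t h = toeplitz_hash(tuple, sizeof(tuple), rss_key);
        if (h % num_queues == my_queue)   /* real NICs use an indirection table */
            return lport;
    }
    return 0;  /* no suitable port found */
}
```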
Performance
[Evaluation figures from the paper]
Questions
- What potential performance downsides are there to this approach?
- What potential usability downsides are there to this approach?
- How could they make it even faster?

Questions
- Performance downsides: large packets + run-to-completion + a slow application ?= delay
- Usability downsides: applications must be specifically developed (or ported) to a new API
- Making it faster: they mention at the end of the paper that they could take advantage of Single Root I/O Virtualization (SR-IOV) to share the NICs between multiple dataplanes and scale even harder