LiNK: An Operating System Architecture for Network Processors Steve Muir, Jonathan Smith Princeton University, University of Pennsylvania


The Network Processor environment
Many types of network device include a processor:
  – High-performance host NICs, e.g., gigabit cards
  – Remote management (ILO) devices
  – Router line cards
  – Virtual network devices (more on this later...)

Example applications
High-speed packet processing
Traffic shaping
Firewall/intrusion detection
Monitoring/logging
Remote management

Why a network processor OS?
Device-independent portability layer
  – run the same application on diverse network processors
Hardware abstraction
  – hide details of the network processor environment from the application
Isolation between components
  – protect services and drivers from bugs
Multitasking
  – simplifies the programming task
Library of common functionality
  – memory management, profiling, logging, etc.

LiNK: The Lightweight Network Kernel
Simple, event-driven main loop
  – interrupt handlers scheduled as events
  – synchronous processing reduces complexity of kernel
Network service components
  – take advantage of simple, uniform application structure
Posted service requests
  – efficient communication between LiNK and clients
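The event-driven main loop described above can be illustrated with a minimal sketch. This is not LiNK's code: the class `EventLoop` and the methods `raise_interrupt` and `run` are hypothetical names, and the sketch only assumes what the slide states, namely that interrupt handlers are queued as events and processed synchronously, one at a time.

```python
from collections import deque


class EventLoop:
    """Hypothetical single-threaded kernel loop (illustrative, not LiNK's API)."""

    def __init__(self):
        self.pending = deque()

    def raise_interrupt(self, handler, *args):
        # The interrupt does no work itself; it only queues its handler as an event.
        self.pending.append((handler, args))

    def run(self):
        # Handle events in arrival order, each one to completion before the next.
        handled = 0
        while self.pending:
            handler, args = self.pending.popleft()
            handler(*args)
            handled += 1
        return handled


log = []
loop = EventLoop()
loop.raise_interrupt(log.append, "rx")
loop.raise_interrupt(log.append, "timer")
count = loop.run()
```

Because only one handler runs at a time, handlers in this sketch never need locks around shared state, which is one way synchronous processing can keep a kernel simple.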

Network Service Components
Fundamental structure shared by all applications
Three elements:
  1. Transmit filter – add packet headers, packet scheduling
  2. Receive handler – generate response packets
  3. Timeout callback – flush caches, packet retransmit
Service functions scheduled by kernel
  – functions run to completion or explicit yield
Many services fit into this model
  – ARP, ICMP, traffic shaping
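The three-element structure could be sketched as a record of three callbacks. Everything named here (`ServiceComponent`, `make_echo_service`, `dispatch`, the `HDR` prefix) is hypothetical, and routing replies through the transmit filter is an assumption made for illustration; the slides do not specify how the kernel wires the elements together.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ServiceComponent:
    """Hypothetical container for the three elements named on the slide."""
    transmit_filter: Callable[[bytes], bytes]            # e.g. add packet headers
    receive_handler: Callable[[bytes], Optional[bytes]]  # e.g. generate a response
    timeout_callback: Callable[[], None]                 # e.g. flush caches


def make_echo_service():
    seen = []  # toy cache of received packets

    def transmit_filter(payload):
        return b"HDR" + payload  # stand-in for a real packet header

    def receive_handler(packet):
        seen.append(packet)
        if packet.startswith(b"PING"):
            return b"PONG" + packet[4:]
        return None  # no response for other traffic

    def timeout_callback():
        seen.clear()

    return ServiceComponent(transmit_filter, receive_handler, timeout_callback), seen


def dispatch(service, packet):
    # Kernel side (assumed): run the receive handler to completion, then
    # pass any response through the transmit filter.
    reply = service.receive_handler(packet)
    return None if reply is None else service.transmit_filter(reply)


service, seen = make_echo_service()
reply = dispatch(service, b"PING42")
```

A request/response service such as ICMP echo maps naturally onto this shape: the receive handler builds the reply, and the timeout callback discards stale state.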

Posted Service Request object

Posted Service Request processing
Client posts to shared object - asynchrony
Kernel polls and processes - concurrency
Chaining required for efficiency
Speculation and reference arguments
Events for completion and failure notification
[Figure: chained requests (Alloc_Mem, Cap_Ref_New, Make_Frameset, Set_Peer) annotated with speculation and data dependency]
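A rough sketch of the posting model follows, under stated assumptions. `RequestQueue`, `Ref`, `post` and `poll` are hypothetical names, and the operations `alloc_mem` and `length` are toy stand-ins; the slides do not give the real request format or the semantics of operations such as Alloc_Mem. The sketch shows a client speculatively chaining a request onto an earlier one through a reference argument before the first has completed, with the kernel reporting completion or failure as events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Reference argument: 'use the result of request number `index`'."""
    index: int


class RequestQueue:
    """Hypothetical shared object between a client and the kernel."""

    def __init__(self, ops):
        self.ops = ops      # operation name -> callable (toy kernel operations)
        self.posted = []    # written by the client
        self.results = {}   # request index -> result, written by the kernel
        self.events = []    # (request index, "done" or "failed")
        self.cursor = 0     # next request the kernel will process

    def post(self, op, *args):
        # Client side: append and return immediately (asynchrony).
        self.posted.append((op, args))
        return Ref(len(self.posted) - 1)

    def poll(self):
        # Kernel side: drain everything posted so far, in order.
        while self.cursor < len(self.posted):
            index = self.cursor
            op, args = self.posted[index]
            self.cursor += 1
            try:
                # Resolving a Ref to a failed request raises KeyError,
                # so a failure propagates down the chain.
                resolved = [self.results[a.index] if isinstance(a, Ref) else a
                            for a in args]
                self.results[index] = self.ops[op](*resolved)
                self.events.append((index, "done"))
            except Exception:
                self.events.append((index, "failed"))


ops = {"alloc_mem": lambda size: bytearray(size), "length": len}
queue = RequestQueue(ops)
buf = queue.post("alloc_mem", 12)
queue.post("length", buf)          # speculative: posted before alloc_mem has run
bad = queue.post("alloc_mem", -1)  # fails inside the toy operation
queue.post("length", bad)          # fails because its dependency failed
queue.poll()
```

Chaining lets the client hand over a whole dependent sequence in one go instead of waiting for each completion event before posting the next request.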

Virtual Network Processors
Multi-core processors are cheaply available
  – multi-threaded and/or multi-core CPUs
Virtualisation technology has matured too
  – low overhead, enhanced by hardware support
Use a processing element as a network processor
A perfect prototyping and development environment
  – testing and debugging on real hardware is hard
But maybe something more...

Current Implementation
Hybrid Linux/LiNK system
  – Linux as host operating system, LiNK as kernel module
  – LiNK accessible to Linux as standard Ethernet device
LiNK provides user-space network subsystem
  – Linux provides filesystem, processes, scheduling, VM, etc.
Simple way to prototype new NP features
  – e.g., a direct user-space interface to the (virtual) NP

Evaluating the Virtual NIC
Ported the Flash webserver to Linux+LiNK
Provided TCP/IP protocol stack as user-space library
  – webserver used unmodified
WebStone 2.5 HTTP benchmark
  – simulates realistic workload with multiple clients
  – small number of files, representative size and distribution
Compare performance of Linux and Linux+LiNK

Evaluation: Communication Overhead

Evaluation: Flash performance

Conclusions
Network processors need operating systems
The Lightweight Network Kernel is one alternative
  – Simple, specialised structure
  – Asynchronous, high-performance communication
The future: virtual network processors

Questions/comments

Evaluation: Network Throughput

Evaluation: Polling Performance

Evaluation: Polling Scalability

Related Work
Network device polling
  – Click modular router [Kohler] - 4x performance increase
  – Scout/IXP1200 router [Peterson] - similar
Parallel network protocol stacks [Bjornberg, Naburn]
  – processor-per-packet scales well for simple protocols
  – complex protocols => severe lock contention
Network appliance optimisations
  – I/O Lite - unified buffer management [Pai]
  – Soft timers - polling in an interrupt-driven kernel [Aron]

Future Work
Responsiveness of polling
  – dynamic specialisation, e.g., run-time code generation
Scalability
  – how many processors can Piglet support?
  – how many applications/devices?
Alternative applications for Piglet
  – network processor (e.g., IXP1200) operating system