CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 U-Net: A User-Level Network Interface for Parallel and Distributed Computing T. von Eicken, A. Basu,

Presentation transcript:

CS 443 Advanced OS
Fabián E. Bustamante, Spring 2005

U-Net: A User-Level Network Interface for Parallel and Distributed Computing
T. von Eicken, A. Basu, V. Buch and W. Vogels
Cornell University
Appears in SOSP 1995 (ACM SIGOPS)
Presented by: Joseph Paris

2 Introduction
There has been a shift in the local area network bottleneck
  – Traditionally, limited bandwidth
  – Now the issue is the message path through software
Taking a look at the UNIX networking architecture
  – The message path through the kernel consists of
      Several copies
      Crossing multiple levels of abstraction between device drivers and user applications
Resulting in… overhead
  – Processing overheads limit the peak communication bandwidth and result in high latency
  – So upgrades in networking technology go largely unnoticed by the general user community
A vendor-supplied problem?
  – Vendors may think of large data-stream cases and less about per-message overhead

3 Observation
Most applications use relatively small messages and rely heavily on quick round-trip requests and replies
  – Distributed shared memory
  – Remote procedure calls
  – Remote object-oriented method invocations
  – Distributed cooperative file caches
They could also benefit from more flexible interfaces to the network
  – The traditional architecture cannot easily support new protocols or interfaces
  – Integrating application-specific information into protocol processing gives
      Higher efficiency
      Greater flexibility
  – E.g. video, audio, transferring directly from data structures

4 Motivation
Low end-to-end communication latencies
Separating processing overhead from network latency
  – Distributed systems
      Object-oriented technology
        – Objects are generally small (100 bytes vs. Kbytes)
      Electronic workplace
        – Simple database servers that handle object naming, location, authentication, protection (20-80 bytes for requests, bytes for response)
      Cache coherence
        – Keeping copies consistent introduces a large number of small coherence messages
      Fault-tolerance algorithms/group communication
        – Global locks, scheduling, coherence
      RPCs, file systems, etc.

5 Motivation
Small-message bandwidth
  – The same trends that demand low latencies also demand high bandwidth for small messages
      Object-oriented technology, electronic workplace, cache coherence, RPCs, etc.
  – Part of decreasing the overall end-to-end latency is having high-bandwidth technology for small messages
  – Basically, we want full network bandwidth with messages that are as small as possible
Protocol interface flexibility
  – Traditionally, protocol stacks are implemented as part of the kernel
      Lack of integration of kernel and application buffer management
  – Solution
      Remove the communication subsystem's boundary with the application-specific protocols
      Tight coupling between the communication protocol and the application
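The small-message bandwidth argument can be illustrated with a simple cost model. This model is an assumption made for illustration and is not taken from the slides: every message pays a fixed per-message software overhead plus its transmission time on the link. The link speed and overhead values are hypothetical.

```python
def effective_bandwidth(msg_bytes, overhead_s, link_bytes_per_s):
    """Delivered bytes per second when each message costs a fixed
    software overhead plus its wire time (assumed model)."""
    wire_time = msg_bytes / link_bytes_per_s
    return msg_bytes / (overhead_s + wire_time)


def half_bandwidth_size(overhead_s, link_bytes_per_s):
    """Message size at which exactly half of the link bandwidth is delivered.

    Setting n / (o + n / B) = B / 2 and solving for n gives n = o * B.
    """
    return overhead_s * link_bytes_per_s


# Hypothetical figures: a 15 MB/s link with 100 us and 10 us of overhead.
LINK = 15e6
for overhead in (100e-6, 10e-6):
    size = half_bandwidth_size(overhead, LINK)
    print(f"overhead {overhead * 1e6:.0f} us: half bandwidth at {size:.0f} bytes")
```

Under this model, cutting the per-message overhead by a factor of ten also cuts by ten the message size needed to reach half of the link bandwidth, which is why overhead matters most for small messages.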

6 Solution - U-Net
Why?
  – Focus on low latency and high bandwidth using small messages
  – Emphasis on protocol design and integration flexibility
  – Desire to meet goals on widely available ‘off the shelf’ hardware
How?
  – Simply, remove the kernel from the critical path of sending and receiving messages
      Eliminates the system call overhead
      Offers opportunities to streamline the buffer management
  – What’s required?
      Virtualizing the network interface among processes
      Protection such that processes using the network cannot interfere with each other
        – Message multiplexing and de-multiplexing
      Managing communication resources without the kernel
      Efficient and versatile programming interface to the network

7 Design & Implementation of U-Net
Virtualize the network interface in such a way that a combination of OS and hardware mechanisms can provide the illusion of owning the interface
  – In hardware
      Components manipulated by a process correspond to real hardware
  – In software
      Memory locations are interpreted by the OS
  – Or a combination of both
The role of U-Net is limited to
  – Multiplexing the actual network interface among all processes
  – Enforcing protection boundaries
  – Enforcing consumption limits
This leaves the process with control over
  – Contents of the message
  – Management of send and receive resources (such as buffers)

8 Design & Implementation of U-Net
We have 3 main building blocks
  – Endpoints
      Serve as an application's handle into the network and contain…
  – Communication segments
      Regions of memory that hold message data
  – Message queues
      Hold descriptors for messages that are to be sent or have been received
  – Each process that wants to access the network
      Creates one or more endpoints
      Associates a communication segment with the endpoint
      And a set of send, receive, and free message queues
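The three building blocks can be pictured with a small data-structure sketch. It is only an illustration: the class names, field names, and sizes are hypothetical and do not reproduce the actual U-Net layout.

```python
from collections import deque


class Descriptor:
    """Describes one message buffer inside a communication segment."""

    def __init__(self, offset, length):
        self.offset = offset  # start of the buffer within the segment
        self.length = length  # number of bytes in the buffer


class Endpoint:
    """An application's handle into the network (illustrative only)."""

    def __init__(self, segment_size):
        self.segment = bytearray(segment_size)  # communication segment
        self.send_q = deque()  # descriptors of messages to be sent
        self.recv_q = deque()  # descriptors of messages that have arrived
        self.free_q = deque()  # descriptors of buffers free for arrivals


# A process creates an endpoint and offers two receive buffers
# to the network interface through the free queue.
ep = Endpoint(segment_size=1024)
ep.free_q.append(Descriptor(offset=512, length=256))
ep.free_q.append(Descriptor(offset=768, length=256))
```

A process that needs several independent communication channels would simply create several such endpoints, each with its own segment and queues.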

9 Design & Implementation of U-Net
Sending
  – The user process composes the data in the communication segment
  – Pushes a descriptor for the message onto the send queue
  – At this point the network interface is expected to pick the message up and insert it into the network
      If there is a back-up
        – Leave the descriptor in the queue
        – Eventually exert back-pressure on the user process when the queue becomes full
Receiving
  – Messages are de-multiplexed based on their destination
      Data is transferred to the appropriate communication segment
      The message descriptor is pushed onto the corresponding receive queue
  – Receive notification model
      Polling
        – The process checks the receive queue, or blocks waiting for the next message to arrive via the UNIX select call
      Event driven
        – Register an up-call
            » Signals that the state of the receive queue satisfies a certain condition
        – Only two conditions are currently supported
            » Queue is non-empty
            » Queue is almost full
        – To keep performance high (and cost low), all pending messages can be consumed in a single up-call
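The send and receive paths described on this slide can be sketched as two small queue classes. This is a simulation under stated assumptions, not the U-Net interface: the class and method names are hypothetical, and the network interface is played by ordinary function calls.

```python
from collections import deque


class SendQueue:
    """Bounded descriptor queue; a full queue is the back-pressure signal."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = deque()

    def push(self, descriptor):
        if len(self.items) >= self.capacity:
            return False  # queue full: the process has to retry later
        self.items.append(descriptor)
        return True


class RecvQueue:
    """Receive queue supporting polling and a 'non-empty' up-call."""

    def __init__(self, upcall=None):
        self.items = deque()
        self.upcall = upcall

    def deliver(self, descriptor):
        """Called by the simulated network interface on arrival."""
        self.items.append(descriptor)

    def notify(self):
        """Fire the up-call if the 'queue is non-empty' condition holds."""
        if self.items and self.upcall is not None:
            self.upcall(self)

    def poll(self):
        """Return the next descriptor, or None if nothing is pending."""
        return self.items.popleft() if self.items else None


received = []
upcalls = []


def drain(queue):
    """Up-call body: consume every pending message in one invocation."""
    upcalls.append(1)
    msg = queue.poll()
    while msg is not None:
        received.append(msg)
        msg = queue.poll()


sq = SendQueue(capacity=2)
accepted = [sq.push(("buffer", i)) for i in range(3)]  # third push is refused

rq = RecvQueue(upcall=drain)
for name in ("m1", "m2", "m3"):
    rq.deliver(name)
rq.notify()  # one up-call drains all three messages
```

Draining the whole queue inside one up-call spreads the cost of the notification over every message that arrived in the meantime.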

10 Design and Implementation of U-Net
Multiplexing and de-multiplexing messages
  – Uses a tag in each incoming message to determine the destination endpoint
      Communication segment
      Message queue descriptor
  – The exact form of the message tag depends on the network substrate
      E.g. ATM uses virtual channel identifiers
Getting the tag via an OS-level service assists
  – An application in determining the correct tag to use, based on a specification of the destination process and the route between the two nodes
      Route discovery
      Switch-path setup
      Other signaling that is specific to the network technology
  – Authentication and authorization
      Performs checks to ensure that the application is allowed to access specific network resources
      Also checks to make sure there are no conflicts with other applications
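Tag-based de-multiplexing amounts to a table lookup, as the following sketch suggests. The `Demux` class, its method names, and the tag values are hypothetical; the conflict check in `register` merely stands in for the OS-level service mentioned above.

```python
class Demux:
    """Maps an incoming message tag to the receive queue of its endpoint."""

    def __init__(self):
        self.table = {}  # tag -> (owner, receive queue)

    def register(self, tag, owner, recv_queue):
        # Stand-in for the OS service: refuse a tag that is already owned.
        if tag in self.table:
            raise PermissionError(f"tag {tag} is already in use")
        self.table[tag] = (owner, recv_queue)

    def deliver(self, tag, payload):
        entry = self.table.get(tag)
        if entry is None:
            return False  # no endpoint owns this tag: drop the message
        entry[1].append(payload)
        return True


demux = Demux()
queue_a, queue_b = [], []
demux.register(32, "process_a", queue_a)  # tags play the role of ATM VCIs
demux.register(33, "process_b", queue_b)

delivered = demux.deliver(32, b"hello")
stray = demux.deliver(99, b"nobody home")
```

Because the table is only changed through `register`, a process cannot receive traffic on a tag it was never granted.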

11 Design and Implementation of U-Net
Base-level architecture
  – Hardware cannot support direct-access “true zero-copy”
      Where data can be sent directly out of the application's data structures without intermediate buffering
      Requires special memory mapping to span the process's entire address space into the network interface
  – So we only get “zero-copy” support for now
      Which in reality requires a single copy, namely between the application’s data structures and a buffer in the communication segment
  – Queue-based interface to the network
      Stages messages in a limited-size communication segment on their way between application data structures and the network
      Send and receive queues hold descriptors with information about the destination, origin, endpoints, length, as well as offsets within the communication segment
        – Management of the send buffer is entirely up to the process
            » Must be properly aligned for the requirements of the network interface
        – Cannot control the order in which messages are received into the receive buffer
      Free queues hold descriptors for free buffers that are made available to the network interface for storing arriving messages
  – Small-message optimization
      Send and receive queues may hold entire messages in descriptors (instead of pointers to data)
      Avoids buffer management and can improve round-trip latency
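The difference between a regular descriptor and the small-message optimization can be sketched as follows. The threshold `INLINE_MAX`, the dictionary layout, and both function names are assumptions made for illustration; the slides do not give the real descriptor format.

```python
INLINE_MAX = 40  # hypothetical threshold, not a figure from the slides


def make_send_descriptor(segment, next_offset, payload):
    """Return (descriptor, new_offset) for a payload.

    Small payloads travel inside the descriptor itself. Larger ones are
    copied once into the communication segment and referenced by offset
    and length.
    """
    if len(payload) <= INLINE_MAX:
        return {"inline": bytes(payload)}, next_offset
    end = next_offset + len(payload)
    if end > len(segment):
        raise MemoryError("communication segment is full")
    segment[next_offset:end] = payload  # the single copy of base-level "zero-copy"
    return {"offset": next_offset, "length": len(payload)}, end


def read_descriptor(segment, desc):
    """Recover the message bytes a descriptor refers to."""
    if "inline" in desc:
        return desc["inline"]
    start = desc["offset"]
    return bytes(segment[start:start + desc["length"]])


segment = bytearray(256)
small, offset = make_send_descriptor(segment, 0, b"ping")
large, offset = make_send_descriptor(segment, offset, b"x" * 100)
```

The inline case never touches the segment, which is where the saving in buffer management comes from.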

12 Evaluation
Two U-Net implementations
  – SBA-100
      Non-programmable, U-Net done completely in software
      Performs poorly
        – 33-40% increase in overhead due to the ATM header CRC calculation being done in software
  – SBA-200
      Programmable, custom firmware
      Reflects the base-level U-Net architecture in hardware
Three tests
  – U-Net Active Messages implementation (UAM)
      » Active Messages is a mechanism that allows efficient overlapping of communication with computation in multiprocessors
      » Communication takes the form of requests and matching replies
  – Split-C
      » Parallel extension to C for programming distributed-memory machines using a global address space abstraction
      » Consists of one thread of control per process from a single code image; the threads interact through reads and writes on shared data
      » Implemented with U-Net Active Messages
  – TCP/UDP
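The request and matching-reply pattern of Active Messages can be sketched with two simulated nodes. This is not the UAM API: the `Node` class, the handler names, and the in-memory inboxes are hypothetical stand-ins for the real network.

```python
from collections import deque


class Node:
    """A node whose incoming messages name the handler that processes them."""

    def __init__(self, name, handlers):
        self.name = name
        self.handlers = handlers
        self.inbox = deque()
        self.memory = {}

    def send(self, dest, handler, *args):
        dest.inbox.append((handler, self, args))

    def poll(self):
        """Run the handler of every pending message; return how many ran."""
        count = 0
        while self.inbox:
            handler, src, args = self.inbox.popleft()
            self.handlers[handler](self, src, *args)
            count += 1
        return count


def read_request(node, src, key):
    # Request handler: answer with the requested value.
    node.send(src, "read_reply", key, node.memory.get(key))


def read_reply(node, src, key, value):
    # Reply handler: fold the data into the requester's state.
    node.memory[key] = value


HANDLERS = {"read_request": read_request, "read_reply": read_reply}
a = Node("a", HANDLERS)
b = Node("b", HANDLERS)
b.memory["x"] = 42

a.send(b, "read_request", "x")  # a asks b for the value of x
b.poll()                        # b runs the request handler and replies
a.poll()                        # a runs the reply handler
```

Between sending the request and polling for the reply, node `a` is free to do other work, which is the overlap of communication and computation the slide refers to.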

13 Evaluation: Active Messages (UAM)
[Figure: U-Net round-trip time as a function of message size]
[Figure: U-Net bandwidth as a function of message size]

14 Evaluation: Split-C using UAM
[Figure: CPU and network breakdown for two applications]
[Figure: Overall execution time]

15 Evaluation: TCP/UDP
[Figure: U-Net UDP bandwidth as a function of message size]
[Figure: U-Net TCP bandwidth as a function of message size]

16 Evaluation: TCP/UDP
[Figure: Latency as a function of message size]

17 Conclusion
Processing overhead on messages has been minimized
Latency experienced by the application is once again dominated by the actual message transmission time
Simple networking interface that supports traditional inter-networking protocols and abstractions such as Active Messages
Demonstrates that removing the kernel from the communication path can offer new flexibility in addition to high performance
  – TCP/UDP protocols achieve latencies and throughput close to the raw maximum