Unconventional Networking Makoto Bentz October 13, 2010 CS 6410.

Slides:



Advertisements
Similar presentations
Operating Systems Components of OS
Advertisements

Multiple Processor Systems
Low Latency Messaging Over Gigabit Ethernet Keith Fenech CSAW 24 September 2004.
System Area Network Abhiram Shandilya 12/06/01. Overview Introduction to System Area Networks SAN Design and Examples SAN Applications.
WHAT IS AN OPERATING SYSTEM? An interface between users and hardware - an environment "architecture ” Allows convenient usage; hides the tedious stuff.
CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 U-Net: A User-Level Network Interface for Parallel and Distributed Computing T. von Eicken, A. Basu,
CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 Supporting Parallel Applications on Clusters of Workstations: The Intelligent Network Interface Approach.
AMLAPI: Active Messages over Low-level Application Programming Interface Simon Yau, Tyson Condie,
Lightweight Remote Procedure Call Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented by Alana Sweat.
Background Computer System Architectures Computer System Software.
Introduction to Operating Systems CS-2301 B-term Introduction to Operating Systems CS-2301, System Programming for Non-majors (Slides include materials.
VIA and Its Extension To TCP/IP Network Yingping Lu Based on Paper “Queue Pair IP, …” by Philip Buonadonna.
Contiki A Lightweight and Flexible Operating System for Tiny Networked Sensors Presented by: Jeremy Schiff.
Extensibility, Safety and Performance in the SPIN Operating System Brian Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gun Sirer, Marc E. Fiuczynski,
1: Operating Systems Overview
OPERATING SYSTEM OVERVIEW
Haoyuan Li CS 6410 Fall /15/2009.  U-Net: A User-Level Network Interface for Parallel and Distributed Computing ◦ Thorsten von Eicken, Anindya.
Extensibility, Safety and Performance in the SPIN Operating System Brian Bershad, Stefan Savage, Przemyslaw Pardyak, Emin Gun Sirer, Marc E. Fiuczynski,
Active Messages: a Mechanism for Integrated Communication and Computation von Eicken et. al. Brian Kazian CS258 Spring 2008.
Parallel Processing Architectures Laxmi Narayan Bhuyan
Exokernel: An Operating System Architecture for Application-Level Resource Management Dawson R. Engler, M. Frans Kaashoek, and James O’Toole Jr. M.I.T.
User-Level Interprocess Communication for Shared Memory Multiprocessors Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, and Henry M. Levy Presented.
COM S 614 Advanced Systems Novel Communications U-Net and Active Messages.
Ethan Kao CS 6410 Oct. 18 th  Active Messages: A Mechanism for Integrated Communication and Control, Thorsten von Eicken, David E. Culler, Seth.
NPACI: National Partnership for Advanced Computational Infrastructure August 17-21, 1998 NPACI Parallel Computing Institute 1 Cluster Archtectures and.
Optimizing RPC “Lightweight Remote Procedure Call” (1990) Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska, Henry M. Levy (University of Washington)
1 Lightweight Remote Procedure Call Brian N. Bershad, Thomas E. Anderson, Edward D. Lazowska and Henry M. Levy Presented by: Karthika Kothapally.
ATM and Fast Ethernet Network Interfaces for User-level Communication Presented by Sagwon Seo 2000/4/13 Matt Welsh, Anindya Basu, and Thorsten von Eicken.
Computer System Architectures Computer System Software
Lecture 4: Parallel Programming Models. Parallel Programming Models Parallel Programming Models: Data parallelism / Task parallelism Explicit parallelism.
LOGO OPERATING SYSTEM Dalia AL-Dabbagh
 What is an operating system? What is an operating system?  Where does the OS fit in? Where does the OS fit in?  Services provided by an OS Services.
Operating System Review September 10, 2012Introduction to Computer Security ©2004 Matt Bishop Slide #1-1.
Multiple Processor Systems. Multiprocessor Systems Continuous need for faster and powerful computers –shared memory model ( access nsec) –message passing.
Chapter 6 Operating System Support. This chapter describes how middleware is supported by the operating system facilities at the nodes of a distributed.
High Performance User-Level Sockets over Gigabit Ethernet Pavan Balaji Ohio State University Piyush Shivam Ohio State University.
1b.1 Types of Parallel Computers Two principal approaches: Shared memory multiprocessor Distributed memory multicomputer ITCS 4/5145 Parallel Programming,
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
Introduction, background, jargon Jakub Yaghob. Literature T.G.Mattson, B.A.Sanders, B.L.Massingill: Patterns for Parallel Programming, Addison- Wesley,
Department of Computer Science University of the West Indies.
Slide 3-1 Copyright © 2004 Pearson Education, Inc. Operating Systems: A Modern Perspective, Chapter 3.
Minimizing Communication Latency to Maximize Network Communication Throughput over InfiniBand Design and Implementation of MPICH-2 over InfiniBand with.
Multiple Processor Systems. Multiprocessor Systems Continuous need for faster computers –shared memory model ( access nsec) –message passing multiprocessor.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
OPERATING SYSTEM SUPPORT DISTRIBUTED SYSTEMS CHAPTER 6 Lawrence Heyman July 8, 2002.
Processes Introduction to Operating Systems: Module 3.
A summary by Nick Rayner for PSU CS533, Spring 2006
EXTENSIBILITY, SAFETY AND PERFORMANCE IN THE SPIN OPERATING SYSTEM
CS533 - Concepts of Operating Systems 1 The Mach System Presented by Catherine Vilhauer.
1: Operating Systems Overview 1 Jerry Breecher Fall, 2004 CLARK UNIVERSITY CS215 OPERATING SYSTEMS OVERVIEW.
LRPC Firefly RPC, Lightweight RPC, Winsock Direct and VIA.
M. Accetta, R. Baron, W. Bolosky, D. Golub, R. Rashid, A. Tevanian, and M. Young MACH: A New Kernel Foundation for UNIX Development Presenter: Wei-Lwun.
1 Isolating Web Programs in Modern Browser Architectures CS6204: Cloud Environment Spring 2011.
1Thu D. NguyenCS 545: Distributed Systems CS 545: Distributed Systems Spring 2002 Communication Medium Thu D. Nguyen
Software Overhead in Messaging Layers Pitch Patarasuk.
Lecture 4 Mechanisms & Kernel for NOSs. Mechanisms for Network Operating Systems  Network operating systems provide three basic mechanisms that support.
Caltech CS184 Spring DeHon 1 CS184b: Computer Architecture (Abstractions and Optimizations) Day 14: May 7, 2003 Fast Messaging.
HIGH-PERFORMANCE NETWORKING :: USER-LEVEL NETWORKING :: REMOTE DIRECT MEMORY ACCESS Moontae Lee (Nov 20, 2014)Part 1 CS6410.
3/12/2013Computer Engg, IIT(BHU)1 PARALLEL COMPUTERS- 2.
Cluster Computers. Introduction Cluster computing –Standard PCs or workstations connected by a fast network –Good price/performance ratio –Exploit existing.
Major OS Components CS 416: Operating Systems Design, Spring 2001 Department of Computer Science Rutgers University
Background Computer System Architectures Computer System Software.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 April 28, 2005 Session 29.
CSCI/CMPE 4334 Operating Systems Review: Exam 1 1.
Lecture 13 Parallel Processing. 2 What is Parallel Computing? Traditionally software has been written for serial computation. Parallel computing is the.
ECE 259 / CPS 221 Advanced Computer Architecture II (Parallel Computer Architecture) Interactions with Microarchitectures and I/O Copyright 2004 Daniel.
Chapter 2: Operating-System Structures
Operating Systems: A Modern Perspective, Chapter 3
Chapter 2: Operating-System Structures
Presented by Roberto Starc
Presentation transcript:

Unconventional Networking Makoto Bentz October 13, 2010 CS 6410

Papers U-Net: A User-Level Network Interface for Parallel and Distributed Computing,U-Net: A User-Level Network Interface for Parallel and Distributed Computing, Von Eicken, Basu, Buch and Werner Vogels. 15th SOSP, December U-Net: A User-Level Network Interface for Parallel and Distributed Computing, Active Messages: A Mechanism for Integrated Communication and ControlActive Messages: A Mechanism for Integrated Communication and Control, Thorsten von Eicken, David E. Culler, Seth Copen Goldstein, and Klaus Erik Schauser. In Proceedings of the 19th Annual International Symposium on Computer Architecture, Active Messages: A Mechanism for Integrated Communication and Control Evaluation of the Virtual Interface Architecture (VIA)Evaluation of the Virtual Interface Architecture (VIA). Xin Liu June 8, 1999 Department of Computer Science and Engineering University of California, San Diego Evaluation of the Virtual Interface Architecture (VIA)

Overview Active Messages: Instructions between multiprocessors U-Net: Networking beyond the kernel, separation into the User Space VIA: Fast Messaging on virtualized standard network interfaces (VIA)

Parallel versus Distributed CPUCPU CacheCache MemoryMemory CPUCPU CacheCache Parallel Computing Multiple processors connected in a computer Tightly coupled Shared memory

Parallel versus Distributed CPUCPU CacheCache... MemoryMemory CPUCPU CacheCache MemoryMemory Distributed Computing Multiple autonomous computers connected by network Loosely coupled Distributed memory Network

Active Messages Written at by von Eicken, Cohen and at University of California Berkeley, in David E. Culler’s research group Other projects by group include Split-C, Castle and Ninja Ninja

Active Messages: Authors Seth Copen Cohen is now a professor at CMU Programmable matter Klaus Erik Schauser is now a professor at UCSB Parallel computing

Thorston von Eicken Advised by David E. Culler (MIT, teaches at U of C Berkeley) Former Cornell Assistant Professor Active Messages becomes his Ph.D. thesis Now teaches at UCSB Worked at ExpertCity.com, worked on GoToMyPC and GoToMeeting Now works at RightScale, a cloud computing company

Active Messages: Motivation Message passing machines have reasonable hardware costs but suffer from Poor overlap between communication and computation High overhead for messages And this is not due to hardware Comm. Proc. Wasted Time task task task

Active Messages:Hardware nCube Gflops/s Hypercube layout Thinking Machines Connection Machines (CM)-5 Non-standard von Neumann architecture Fat-tree network MIMD NSA FROSTBURG a CM-5 machine

Active Messages:Design Get around the comm/proc delay Active messages are messaging objects that can perform its own processing Each message carries a userspace handler or code The message contains the arguments to pass to the handler Single Program, Multiple Data

Active Messages:Design Program pre-allocates receiving structures, eliminating buffering The network itself acts as a buffer for small messages Creates low overhead on both sides Message handlers are not allowed to block

Active Messaging:Testing nCube/2 11us send + 15us receive (nearly an order of magnitude reduction) Almost near minimal instructions CM-5 1.6us send + 1.7us receive

Active Messaging:Split-C Spit-C PUT and GET perform split-phase copies of memory blocks between nodes For high performance, should overlap communication and computation

Active Messaging:Split-C

Active Messaging:Message Driven Architectures Message Driven Architectures Support languages with support for dynamic parallelism Active Message handlers execute immediately upon message arrival, cannot suspend or block

Active Messaging:Further Work: TAM Threaded Abstract Machine is a fine-grained parallel execution model based on Active Message No testing results

Active Messaging: Discussion This paper becomes von Eicken’s thesis I personally disliked the style of this paper, but liked the content Too much in one article, works much better as a thesis

U-Net: Authors Assistant Professor von Eicken Anindya Basu (advised by von Eicken) Vineet Buch (M.S. from Cornell) co-founds like.com (now part of Google) worked with BlueRun (VC) Werner Vogels former Researcher at Cornell now Amazon CTO

U-Net: Motivation Networks are now so fast the biggest delay is kernel processing of the message stacks Developing in kernel space limits development of new message send/receive interfaces

U-Net: Motivation Removing the kernel from the critical path would allow specialization

U-Net: Motivation Previously, Kernel’s Responsibility, and opaque to the application Want to make this applicationspecific Would like to be able to integrate application specific behavior to optimize performance

U-Net: Design U-Net must be able to Multiplex the network between processes Provide isolation and protection between processes Manage resources without kernel paths Allow the development of an interface

U-Net: Design Concerns Focus on low overheads to optimize small message transmission latency and bandwidth Emphasis on protocol design and integration To do both on standard off-the-shelf hardware

U-Net: Design: Low Overhead U-Net components map to Real hardware in the NI -and/or- Memory locations interpreted by the OS Zero-Copy tries to limit intermediate buffering between NI and User-Level

U-Net: Design: Protocols Tags specific to the network substrate used for addressing ATM (VCI) Ethernet (Physical addressing) Endpoints, communication channels and queues are only accessible by owners

U-Net: Design: Hardware Implemented on SPARCstation Fore SBA-100 interfaces Fore SBA-200 interfaces Could use on-board processor on SBA-200, custom firmware due to poor bundled firmware

U-Net: Testing Results

U-Net: Discussion Feasible technology Good performance on off the shelf hardware with existing standards “This encouraging result should, however, not obscure the fact that significant additional system resources, such as parallel process schedulers and parallel file systems, still need to be developed before the cluster of workstations can be viewed as a unified resource.”

FM/VIA Virtual Interface Architecture standardizes System Area Networks (Microsoft, Intel, Compaq) VIA however, has no simple programming interface Fast Messages provided layering for small messages

FM/VIA :: Testing Results

FM/VIA :: Testing

Effects of U-Net FM/VIA was not useful because it did not meet the requirement of fast end points FM/VIA goes on to influence InfiniBand, iWarp Zero-copy