The Multikernel: A new OS architecture for scalable multicore systems Andrew Baumann et al. 2009. 10. 08. CS530 Graduate Operating System Presented by.

Presentation transcript:

The Multikernel: A new OS architecture for scalable multicore systems
Andrew Baumann et al., SOSP’09
CS530 Graduate Operating System
Presented by Jaeung Han, Changdae Kim

Introduction
Mix of cores, caches, interconnect links
  – Increases scalability and correctness challenges for OS designers
  – No longer acceptable to tune a general-purpose OS design for one particular hardware model
Rethinking the structure of the OS
  – Build the OS as a distributed system
Multikernel
  – Allows us to apply insights from distributed systems

Observation
The architecture of future computers
  – Rising core counts
  – Increasing hardware diversity

Future computer – Many cores
Many cores
  – Sharing within the OS is becoming a problem
      – Cache-coherence protocol limits scalability
  – Prevents effective use of heterogeneous cores
Scaling existing OSes
  – Increasingly difficult to scale conventional OSes
      – Removal of the dispatcher lock in Windows 7: 6k lines of code in 58 files
  – Optimizations are specific to hardware platforms
      – Cache hierarchy, consistency model, access costs
Borrowed from The Barrelfish operating system for heterogeneous multicore systems

Future computer – Increasing hardware diversity
Non-uniformity
  – Memory hierarchy becomes more complicated
      – NUMA
      – Many levels of cache sharing
  – Device access
  – Interconnect increasingly looks like a network
Core diversity
  – Architectural differences on a single die
      – Streaming instructions (SIMD, SSE, etc.)
      – Virtualization support, power management
  – Within a system
      – Programmable NICs
      – GPUs
      – FPGAs (in CPU sockets)

Future computer – Increasing hardware diversity
System diversity


Observation
The architecture of future computers
  – Rising core counts
  – Increasing hardware diversity
  → A monolithic OS needs to balance resources delicately
Increasing node heterogeneity
  – Prevents memory structure optimization at source code level
  – The OS needs to adapt its communication patterns at run time
→ Future general-purpose systems will have limited support for cache coherence or shared memory
→ Time to reconsider how the OS should be restructured

The multikernel model
Multikernel
  – OS as a distributed system of cores
      – Communicate using messages
      – No memory is shared
  – Design principles
      – Make all inter-core communication explicit
      – Make OS structure hardware-neutral
      – View state as replicated instead of shared

Traditional OS vs. multikernel
Traditional OSes scale up by
  – Reducing lock granularity
  – Partitioning state
Multikernel
  – State is partitioned/replicated by default rather than shared

Why message-passing?
Decouples system structure from inter-core communication mechanism
  – Communication patterns explicitly expressed
  – Naturally supports heterogeneous cores
  – Naturally supports non-coherent interconnects
Better match for future hardware
  – With cheap explicit message passing
  – Without cache-coherence

Make inter-core communication explicit
No memory is shared between cores
Explicit communication facilitates
  – Reasoning about the use of the system interconnect
  – Deploying well-known networking optimizations in the OS
  – Providing isolation and resource management on heterogeneous cores
  – Decoupling requests and responses
  – Human or automated analysis
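
The decoupling of requests and responses can be illustrated with a minimal Python sketch. It assumes each core owns a private dictionary and an inbox queue; the `Core` class, its methods, and the `"get"`/`"reply"` message kinds are hypothetical names chosen for illustration, not Barrelfish interfaces.

```python
from collections import deque


class Core:
    """A hypothetical core that owns private state and a message inbox."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.inbox = deque()   # the only way other cores can reach this core
        self.private = {}      # never touched directly by another core
        self.pending = {}      # request id -> continuation to run on reply
        self.next_req = 0

    def send_request(self, target, key, on_reply):
        """Split-phase send: enqueue the request and return immediately."""
        req_id = self.next_req
        self.next_req += 1
        self.pending[req_id] = on_reply
        target.inbox.append(("get", self, req_id, key))

    def poll(self):
        """Handle every queued message; return how many were handled."""
        handled = 0
        while self.inbox:
            kind, sender, req_id, payload = self.inbox.popleft()
            if kind == "get":
                value = self.private.get(payload)
                sender.inbox.append(("reply", self, req_id, value))
            elif kind == "reply":
                self.pending.pop(req_id)(payload)
            handled += 1
        return handled


core0, core1 = Core(0), Core(1)
core1.private["page_size"] = 4096
results = []
core0.send_request(core1, "page_size", results.append)
# core0 is free to do other work here; nothing has been answered yet.
answered_early = list(results)
core1.poll()   # core1 serves the request
core0.poll()   # core0 picks up the reply and runs the continuation
```

Because the sender never blocks on the reply, every use of the interconnect shows up as a message in an inbox, which is what makes the communication pattern easy to inspect.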

Make OS structure hardware-neutral
Separate the OS structure as much as possible from the hardware
  – Adapting the OS to run on hardware with new performance characteristics will not require extensive changes to the code base
  – Isolate the distributed communication algorithms from hardware implementation details
  – Enable late binding of both the protocol implementation and the message transport
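
One way to picture late binding of the message transport is the Python sketch below. It assumes a small `Transport` interface with two stand-in implementations; `RingTransport`, `MailboxTransport`, `bind_transport`, and the `"interconnect"` key are all hypothetical and exist only to show protocol code that does not know which transport it runs on.

```python
from abc import ABC, abstractmethod
from collections import deque


class Transport(ABC):
    """Hypothetical transport interface that the protocol code targets."""

    @abstractmethod
    def send(self, msg): ...

    @abstractmethod
    def recv(self): ...


class RingTransport(Transport):
    """Stand-in for a shared-memory ring buffer with a fixed number of slots."""

    def __init__(self, slots=8):
        self.slots = slots
        self.ring = deque()

    def send(self, msg):
        if len(self.ring) >= self.slots:
            raise BufferError("ring full")
        self.ring.append(msg)

    def recv(self):
        return self.ring.popleft() if self.ring else None


class MailboxTransport(Transport):
    """Stand-in for a hardware mailbox that holds one message at a time."""

    def __init__(self):
        self.slot = None

    def send(self, msg):
        if self.slot is not None:
            raise BufferError("mailbox occupied")
        self.slot = msg

    def recv(self):
        msg, self.slot = self.slot, None
        return msg


def ping(transport, payload):
    """Protocol logic written only against the Transport interface."""
    transport.send(("ping", payload))
    kind, body = transport.recv()
    return ("pong", body) if kind == "ping" else None


TRANSPORTS = {"ring": RingTransport, "mailbox": MailboxTransport}


def bind_transport(platform):
    """Late binding: pick the transport from a run-time platform description."""
    return TRANSPORTS[platform["interconnect"]]()
```

Calling `ping(bind_transport({"interconnect": "mailbox"}), 7)` exercises the same protocol code as the ring variant; only the binding step differs.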

View state as replicated
The state is replicated and consistency is maintained by exchanging messages
  – Improves system scalability by reducing
      – Load on the system interconnect
      – Contention for memory
      – Overhead for synchronization
Replication is
  – Required to support domains that do not share memory
  – A useful framework within which to support changes to the set of running cores in an OS
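
A minimal Python sketch of replicated state kept consistent by messages follows. It assumes a single writer at a time and in-order delivery; the `Replica` class and the `"cores_online"` key are hypothetical, and concurrent writers would need an agreement protocol that this sketch does not attempt.

```python
from collections import deque


class Replica:
    """Hypothetical per-core replica of a small piece of OS state."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.state = {}
        self.inbox = deque()
        self.peers = []

    def read(self, key):
        # Reads are purely local: no interconnect traffic, no locks.
        return self.state.get(key)

    def update(self, key, value):
        # Apply locally, then tell every other replica by message.
        self.state[key] = value
        for peer in self.peers:
            peer.inbox.append((key, value))

    def apply_pending(self):
        """Apply queued updates in arrival order; return how many were applied."""
        applied = 0
        while self.inbox:
            key, value = self.inbox.popleft()
            self.state[key] = value
            applied += 1
        return applied


def make_replicas(n):
    replicas = [Replica(i) for i in range(n)]
    for r in replicas:
        r.peers = [p for p in replicas if p is not r]
    return replicas


replicas = make_replicas(4)
replicas[0].update("cores_online", 4)
stale = replicas[3].read("cores_online")   # not yet applied on core 3
for r in replicas:
    r.apply_pending()
```

The stale read before `apply_pending` shows the trade-off: reads are cheap and local, and consistency is reached only once the update messages have been processed.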

Barrelfish
A substantial prototype operating system structured according to the multikernel model
Goals for Barrelfish
  – Gives performance comparable to existing commodity OSes
  – Demonstrates evidence of scalability
  – Can be re-targeted to different hardware without refactoring
  – Can exploit the message-passing abstraction
  – Can exploit the modularity of the OS

Implementation of Barrelfish (1/4)
System structure
  – Factored the OS instance on each core into a privileged-mode CPU driver and a distinguished user-mode monitor process

Implementation of Barrelfish (2/4)
CPU drivers
  – Enforce protection, perform authorization, time-slice processes, and mediate access to the core and hardware
  – Serially handle traps and exceptions
  – Share no state with other cores, so they can be completely event-driven, single-threaded, and nonpreemptable
Monitors
  – Collectively coordinate system-wide state
  – Encapsulate much of the mechanism and policy that would be found in the kernel of a traditional OS
  – Mediate local operations on global state
  – Keep replicated data structures globally consistent
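
The event-driven, single-threaded, nonpreemptable style of the CPU driver can be sketched in Python as a run-to-completion loop. The `CpuDriver` class, the event names, and the handlers are hypothetical illustrations and do not reflect the real Barrelfish code.

```python
from collections import deque


class CpuDriver:
    """Hypothetical per-core driver: one queue, one thread, no preemption."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.events = deque()
        self.handlers = {}
        self.log = []          # core-local state, shared with no other core

    def on(self, kind, handler):
        self.handlers[kind] = handler

    def raise_event(self, kind, arg=None):
        self.events.append((kind, arg))

    def run(self):
        # Each handler runs to completion before the next event is taken,
        # so handlers need no locks to protect the driver's state.
        while self.events:
            kind, arg = self.events.popleft()
            self.handlers[kind](self, arg)


def handle_syscall(driver, arg):
    driver.log.append(("syscall", arg))
    # An event raised inside a handler is queued, not run immediately.
    driver.raise_event("timer")


def handle_timer(driver, arg):
    driver.log.append(("timer", arg))


def handle_page_fault(driver, addr):
    driver.log.append(("page_fault", addr))


driver = CpuDriver(0)
driver.on("syscall", handle_syscall)
driver.on("timer", handle_timer)
driver.on("page_fault", handle_page_fault)
driver.raise_event("syscall", "yield")
driver.raise_event("page_fault", 0x1000)
driver.run()
```

The timer event raised by the system-call handler is processed only after the already queued page fault, which is the serial handling the slide describes.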

Implementation of Barrelfish (3/4)
Process structure
  – Represented by a collection of dispatcher objects
  – Communication occurs between dispatchers
Inter-core communication
  – All communication occurs with messages
      – The transport on current hardware is cache-coherent shared memory
Memory management
  – Physical memory must be managed as a global resource
      – All memory management is performed explicitly through system calls

Implementation of Barrelfish (4/4)
System knowledge base
  – Maintains knowledge of the underlying hardware
  – Runs as an OS service
  – Used by the OS to derive system policies

Evaluation
Unmap (TLB shootdown)
  – Send a message to every core with a mapping, wait for all to be acknowledged
  – Linux/Windows:
      1. Kernel sends IPIs
      2. Spins on acknowledgement
  – Barrelfish:
      1. User request to local monitor
      2. Single-phase commit to remote monitors
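
The Barrelfish unmap steps can be sketched in Python as follows. This is a simplified, synchronous illustration that assumes each monitor tracks its core's TLB contents as a set of pages; `Monitor`, `handle_invalidate`, and `unmap` are hypothetical names, and real message delivery would be asynchronous.

```python
class Monitor:
    """Hypothetical per-core monitor that tracks pages cached in the local TLB."""

    def __init__(self, core_id):
        self.core_id = core_id
        self.tlb = set()

    def handle_invalidate(self, page):
        # Drop the local TLB entry, then acknowledge to the initiator.
        self.tlb.discard(page)
        return ("ack", self.core_id)


def unmap(local, monitors, page):
    """Invalidate `page` everywhere; return the core ids that acknowledged."""
    # Only cores that actually hold the mapping need a message.
    targets = [m for m in monitors if m is not local and page in m.tlb]
    local.tlb.discard(page)
    acks = set()
    for m in targets:
        kind, core_id = m.handle_invalidate(page)
        if kind == "ack":
            acks.add(core_id)
    # The unmap completes only once every targeted core has acknowledged.
    if acks != {m.core_id for m in targets}:
        raise RuntimeError("missing acknowledgement")
    return acks


monitors = [Monitor(i) for i in range(4)]
for i in (0, 1, 3):
    monitors[i].tlb.add(0x2000)
acked = unmap(monitors[0], monitors, 0x2000)
```

Core 2 never held the mapping, so it receives no message; only cores 1 and 3 are contacted and must acknowledge.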

Results of unmap (TLB shootdown)

Evaluation
IP loopback

Evaluation
Compute-bound workloads
  – NAS OpenMP, SPLASH-2

Evaluation
IO workloads
  – Network throughput
      – Mbit/s vs. 951 Mbit/s, UDP echo
  – Web server and relational DB
      – requests per second vs. requests per second for lighttpd/Linux

Conclusion
Current OS structure is poorly suited for future hardware architectures
  – Poor at managing diversity and scale
Multicore machines resemble networked systems
  – Need to view the OS as a distributed system
      – Concurrency, communication, heterogeneity
      – Tailor messaging mechanisms and algorithms to the machine
      – Hide sharing as an optimization

Heterogeneous cores work better than homogeneous cores
Heterogeneous cores consume less power than homogeneous cores