6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris

Presentation transcript:

6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris TA: Jinyang Li

Operating System
Software that turns silicon into something useful
  – Provides applications with a programming interface
  – Manages hardware resources on behalf of applications

Distributed Operating System
The holy grail: transparency
  – Provide applications with a virtual machine consisting of many processors distributed around the network
Distributed OS engineering is difficult:
  – Failures
  – High degree of concurrency
  – Long latencies
  – New classes of security attacks

Client/Server Architecture
A modular architecture to structure distributed systems
  – Clients request services from servers
  – Clients and servers communicate with messages
  – Servers are typically trusted
Other architectures
  – Peer-to-peer (decentralized)
  – Single address space

6.894 topics
Client-server components
  – Remote procedure call, threads, address spaces, etc.
Storage
  – File systems, transactions
Security
  – Confidentiality, authentication, etc.
Scalable servers

6.894 is an advanced course
Perform actual systems research
  – Perform a research project
  – Study recent research papers
Design systems for real workloads
  – New abstractions, protocols, data structures, algorithms, etc.
Build a real system (lab)
  – Real enough that you can use it

Internet video-on-demand server
An example to study the issues and give an overview
Requirements:
  – Low- and high-quality video
  – Many users, spread around the Internet
  – Last-mile bandwidth may be low
  – Access control

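Client and server structure

Client() {
    fd = connect("server");
    write(fd, "video.mpg");          // request the video by name
    while (!eof(fd)) {
        read(fd, buf);               // receive the next block
        display(buf);
    }
}

Server() {
    while (1) {
        cfd = accept();              // one client at a time
        read(cfd, name);             // name of the requested file
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);         // blocks on the disk
            write(cfd, block);       // blocks on the network
        }
        close(cfd);
        close(fd);
    }
}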

Performance “analysis”
Server capacity:
  – Network (100 Mbit/s)
  – Disk (20 Mbyte/s)
Obtained performance: one client stream
Server is limited by software structure
If a video is 200 Kbit/s, the server should be able to support more than one client.
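The gap between hardware capacity and obtained performance can be made concrete with a back-of-the-envelope calculation. The sketch below is not from the slides: the function names are hypothetical, and it assumes decimal units (1 Mbit = 1000 Kbit), 8 bits per byte, and no seek or protocol overhead.

```c
#include <stdio.h>

/* Number of streams a resource can carry; both rates in Kbit/s. */
long streams_supported(long capacity_kbit, long stream_kbit) {
    return capacity_kbit / stream_kbit;
}

/* The server is limited by whichever of network and disk carries fewer streams.
   Assumption: 1 Mbit = 1000 Kbit and 1 Mbyte = 8 Mbit. */
long server_stream_limit(long net_mbit, long disk_mbyte, long stream_kbit) {
    long net = streams_supported(net_mbit * 1000, stream_kbit);
    long disk = streams_supported(disk_mbyte * 8 * 1000, stream_kbit);
    return net < disk ? net : disk;
}

/* Prints the estimate for the numbers on the slide. */
void print_estimate(void) {
    printf("network: %ld streams\n", streams_supported(100 * 1000, 200));
    printf("disk:    %ld streams\n", streams_supported(20 * 8 * 1000, 200));
    printf("limit:   %ld streams\n", server_stream_limit(100, 20, 200));
}
```

Under these assumptions the network could carry 500 streams of 200 Kbit/s and the disk 800, so a server that serves one client at a time is leaving most of its hardware idle.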

Better single-server performance
Goal: run at server’s hardware speed
  – Disk or network should be the bottleneck
Method:
  – Pipeline blocks of each request
  – Multiplex requests from multiple clients
Two implementation approaches:
  – Multithreaded server
  – Asynchronous I/O

Multithreaded server

server() {
    while (1) {
        cfd = accept();
        read(cfd, name);
        fd = open(name);
        while (!eof(fd)) {
            read(fd, block);         // thread blocks; scheduler runs another
            write(cfd, block);
        }
        close(cfd);
        close(fd);
    }
}

for (i = 0; i < 10; i++)
    fork(server);                    // start 10 server threads

When waiting for I/O, the thread scheduler runs another thread
All shared data must be protected by locks
Release locks when blocking

Asynchronous I/O

struct callback {
    bool (*is_ready)(void *arg);     // has the awaited event happened?
    void (*handler)(void *arg);      // nonblocking code to run when it has
    void *arg;
};

main() {
    while (1) {
        for (c = each callback) {
            if (c->is_ready(c->arg))
                c->handler(c->arg);
        }
    }
}

Code is structured as a collection of handlers
Handlers are nonblocking
Create new handlers for blocking operations
When an operation completes, call its handler
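The callback loop can be sketched as compilable C. This is one possible reading of the slide, not the course's implementation: the fixed-size table, the one-shot retirement of handlers, and the names `add_callback`, `run_once`, and the `demo_*` helpers are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CALLBACKS 16   /* assumption: small fixed-size table */

struct callback {
    bool (*is_ready)(void *arg);   /* has the awaited event happened? */
    void (*handler)(void *arg);    /* nonblocking code to run when it has */
    void *arg;
    bool active;
};

static struct callback callbacks[MAX_CALLBACKS];

/* Registers a handler in the first free slot; returns false if the table is full. */
bool add_callback(bool (*is_ready)(void *), void (*handler)(void *), void *arg) {
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        if (!callbacks[i].active) {
            callbacks[i] = (struct callback){ is_ready, handler, arg, true };
            return true;
        }
    }
    return false;
}

/* One pass of the event loop: runs every ready handler once and retires it.
   A handler that wants to keep going registers a new callback itself.
   Returns the number of handlers that ran. */
int run_once(void) {
    int ran = 0;
    for (size_t i = 0; i < MAX_CALLBACKS; i++) {
        struct callback *c = &callbacks[i];
        if (c->active && c->is_ready(c->arg)) {
            c->active = false;
            c->handler(c->arg);
            ran++;
        }
    }
    return ran;
}

/* Hypothetical event source for demonstration: becomes ready after
   `threshold` readiness checks. */
struct demo { int ticks; int threshold; int fired; };

bool demo_ready(void *arg) {
    struct demo *d = arg;
    return ++d->ticks >= d->threshold;
}

void demo_fire(void *arg) {
    struct demo *d = arg;
    d->fired++;
}
```

A real server would replace the readiness checks with a single call such as `select` or `poll` rather than testing each callback in turn; the loop above only shows the control structure.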

Asynchronous server

init() {
    on_accept(accept_cb);
}

accept_cb() {
    cfd = accept();                  // new client connection
    on_readable(cfd, name_cb);
}

on_readable(fd, fn) {
    c = new callback(test_readable, fn, fd);
    add c to callback list;
}

name_cb(cfd) {
    read(cfd, name);
    fd = open(name);
    on_readable(fd, read_cb);
}

read_cb(cfd, fd) {
    read(fd, block);
    on_writable(cfd, write_cb);      // wait until the client socket can take the block
}

write_cb(cfd, fd) {
    write(cfd, block);
    on_readable(fd, read_cb);        // go back for the next block of the file
}

Multithreaded vs. Async

Multithreaded:
Hard to program
  – Locking code
  – Need to know what blocks
Coordination explicit
State stored on thread’s stack
  – Memory allocation implicit
Context switch may be expensive
Multiprocessors

Async:
Hard to program
  – Callback code
  – Need to know what blocks
Coordination implicit
State passed around explicitly
  – Memory allocation explicit
Lightweight context switch
Uniprocessors

Coordination example
Threaded server:
  – Thread for network interface
  – Interrupt wakes up network thread
  – Buffer shared between server threads and network thread, protected by locks and condition variables
Asynchronous I/O
  – Poll for packets: how often to poll?
  – Or, interrupt generates an event
Be careful: disable interrupts when manipulating the callback queue.

Scheduling: polling vs. interrupts
Maintain peak performance under heavy load
  – The interrupt model can lead to livelock
Solution:
  – Use interrupts under low load (good latency)
  – Use polling under heavy load (good throughput)
Polling is typically more efficient than interrupts
  – Fits naturally into the asynchronous I/O model
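One way to switch between the two regimes is to watch the receive queue depth. The sketch below is an illustration only: the slides say when each mode is preferable but not how to choose, so the thresholds, the hysteresis policy, and the names `rx_mode` and `next_mode` are assumptions.

```c
/* Hypothetical receive modes for a network driver. */
enum rx_mode { MODE_INTERRUPT, MODE_POLL };

/* Assumed thresholds; real values would be tuned by measurement. */
#define HIGH_WATER 32   /* queue depth at which interrupts are turned off */
#define LOW_WATER   4   /* queue depth at which interrupts are turned back on */

/* Chooses the mode for the next interval from the current mode and the
   number of packets waiting. The gap between the two thresholds keeps the
   driver from flapping between modes when load hovers near one value. */
enum rx_mode next_mode(enum rx_mode current, int queued) {
    if (current == MODE_INTERRUPT && queued >= HIGH_WATER)
        return MODE_POLL;
    if (current == MODE_POLL && queued <= LOW_WATER)
        return MODE_INTERRUPT;
    return current;
}
```

With these values a driver stays interrupt-driven until 32 packets are queued, then polls until the backlog drains to 4 or fewer.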

Other design issues
Disk scheduling
  – Elevator algorithm
Memory management
  – File system buffer cache
Address spaces (VM management)
  – Fault isolate different servers
  – Efficient local communication?
Efficient transfers between disk and networks
  – Avoid copies
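For the disk scheduling item, the elevator algorithm serves pending requests in the direction the head is moving, then reverses. The sketch below is a simplified variant under stated assumptions: requests are plain block numbers, the head starts out moving toward higher numbers, and no new requests arrive during the sweep. The function name `elevator_order` is hypothetical.

```c
#include <stdlib.h>

/* Comparison function for qsort: ascending integers. */
static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Reverses a[lo..hi) in place. */
static void reverse(int *a, size_t lo, size_t hi) {
    while (lo + 1 < hi) {
        hi--;
        int t = a[lo];
        a[lo] = a[hi];
        a[hi] = t;
        lo++;
    }
}

/* Reorders reqs[0..n) into service order for a head at position `head`
   that first sweeps upward, then comes back down. */
void elevator_order(int *reqs, size_t n, int head) {
    qsort(reqs, n, sizeof reqs[0], cmp_int);

    /* Requests before `split` lie below the head. */
    size_t split = 0;
    while (split < n && reqs[split] < head)
        split++;

    /* Reversing everything puts the upper requests first in descending order,
       followed by the lower requests in descending order. Reversing the upper
       part again yields: upper ascending, then lower descending. */
    reverse(reqs, 0, n);
    reverse(reqs, 0, n - split);
}
```

For requests {98, 183, 37, 122, 14, 124, 65, 67} and a head at block 53, this yields 65, 67, 98, 122, 124, 183 on the upward sweep, then 37 and 14 on the way back.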

More than one processor
Problem: single machine may not scale to enough clients
Solutions:
  – Multiprocessors: help when CPU is the bottleneck
  – Server clusters: help when bandwidth between server and backbone is high
  – Distributed server clusters: help when bandwidth between client and distant server is low

Clusters
Naming transparency
  – Server cluster transparent to client?
Server selection
  – Metrics: CPU load, presence of data
Consistency
  – Partition data
Availability
  – More processors can decrease reliability
  – Replicate data (makes consistency more difficult)
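Server selection using the two metrics named above could look like the following. This is a sketch under assumptions, not a policy taken from the slides: it prefers any server that already holds the data and breaks ties by lower CPU load. The `struct server` layout and `select_server` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-server state as seen by the front end. */
struct server {
    double cpu_load;   /* 0.0 (idle) to 1.0 (saturated) */
    bool has_data;     /* does this server hold the requested video? */
};

/* Returns the index of the chosen server, or -1 if there are none.
   Assumed policy: presence of data outranks load; among servers that are
   equal on data, the least loaded wins. */
int select_server(const struct server *s, size_t n) {
    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (best < 0) {
            best = (int)i;
            continue;
        }
        const struct server *b = &s[best];
        bool better_data = s[i].has_data && !b->has_data;
        bool same_data = s[i].has_data == b->has_data;
        if (better_data || (same_data && s[i].cpu_load < b->cpu_load))
            best = (int)i;
    }
    return best;
}
```

A different weighting is equally defensible: a heavily loaded server that has the data may be a worse choice than an idle one that must fetch it, which is one reason selection interacts with the replication and partitioning decisions.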

Distributed clusters
Replication policies
Data distribution
Consistency
Network monitoring and modeling
Global load balancing
Tradeoff between accuracy, latency, and network load

Making it secure: access control
Redo design: don’t add on
  – Firewalls: insecure and break many things
CPU cycles are an issue
  – A secure HTTP server can do about connections a second
Pulls in other global issues
  – Name-to-key binding
  – Key management infrastructure

Example summary
Pipelining of disk and network requests
  – Need a lot of sophisticated software infrastructure
Replication for reliability and performance
  – Need sophisticated protocols
Difficult: we did it for one application
  – What if data changes rapidly?
  – Lack of abstractions!

6.894 lab: real systems
Multi-finger (due next week)
  – Asynchronous I/O
HTTP proxy
  – High-performance proxy
  – Cache, consistency, etc.
Open-ended file system project
  – Research