Introduction to CS739: Distributed Systems
UNIVERSITY of WISCONSIN-MADISON, Computer Sciences Department
CS 739 Distributed Systems, Andrea C. Arpaci-Dusseau


What are distributed systems?
What are the benefits and challenges?
How will CS739 be structured?
  Readings, Writeups, Presentations
  Projects

Goals of Course
  Learn about challenges and existing techniques for building distributed systems and services
  Read and discuss influential papers from SOSP, OSDI, NSDI
  Gain some experience programming in distributed environment
    Warm-up project
    Final project

What is a Distributed System?
  Leslie Lamport says: “You know you have one when the crash of a computer you never heard of stops you from doing any work”
  More technical definition: “Collection of independent computers that appears to its users as a single coherent system”
  How are parallel, distributed, networked systems different?
    All contain nodes (processing, memory, disk) connected with network
    Spectrum from more unified to less unified: parallel, distributed, networked
    Consider distributed services as well…

Benefits of Distributed Systems
  Great price/performance
    Leverage commodity components (nodes and networks)
    Use many, many of them
  Incremental scalability
    Can add x% new nodes (or disks or memory) to improve performance x%
  Improved availability
    Continue operating when some nodes stop working
  Improved reliability
    Deliver correct results when some nodes misbehave, corrupt data
  Allow geographically-distributed individuals to share data or cooperate
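The availability benefit can be made concrete with a small calculation. The following is a minimal sketch, assuming each node is up independently with the same probability and that the service keeps working as long as at least one replica is up. Both the independence assumption and the name `service_availability` are illustrative and do not come from the slides.

```python
def service_availability(node_availability: float, replicas: int) -> float:
    """Probability that at least one of `replicas` nodes is up.

    Assumes node failures are independent (an idealization; real
    failures are often correlated, e.g. a shared power or network fault).
    """
    if not 0.0 <= node_availability <= 1.0:
        raise ValueError("node_availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The service is down only if every replica is down at the same time.
    return 1.0 - (1.0 - node_availability) ** replicas


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "replica(s):", f"{service_availability(0.9, n):.6f}")
```

Under these assumptions, three replicas that are each up 90% of the time give a service that is up about 99.9% of the time. Correlated failures would make the real number lower.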

Distributed System Challenges
  Lack of global state information
    Different nodes have different view of system
      – What are the contents of file A?
      – How many jobs are running on node X?
      – Which nodes are currently part of the system?
    See delays, different ordering of messages, lost messages, network partitions
    Tension with goal of “single coherent system”
  Handling slow, failed and misbehaving nodes
    How do you avoid slow nodes?
    How do you get back data or work from failed node?
    When nodes disagree, how do you know who is wrong?
    Tension with goal of “available and reliable”
  When is it okay to have some centralized components?
    Simplifies state management, but single point-of-failure and performance bottleneck
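The question "What are the contents of file A?" can have different answers on different nodes when messages arrive in different orders. The following is a minimal single-process sketch of that effect, assuming a toy key-value replica where the last delivered write wins. The `Replica` class and the node names are hypothetical and are not part of the course material.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A node holding its own copy of a key-value store (hypothetical)."""
    name: str
    store: dict = field(default_factory=dict)

    def deliver(self, message: tuple) -> None:
        # Each message is a (key, value) write; the last one delivered wins.
        key, value = message
        self.store[key] = value


def deliver_all(replica: Replica, messages) -> None:
    """Deliver messages to a replica in the order given."""
    for message in messages:
        replica.deliver(message)


# Two writes to the same file, sent in this order.
writes = [("file_A", "v1"), ("file_A", "v2")]

node_x = Replica("X")
node_y = Replica("Y")

deliver_all(node_x, writes)            # messages arrive in send order
deliver_all(node_y, reversed(writes))  # the network reorders the messages

# The two nodes now disagree about the contents of file_A.
print(node_x.name, node_x.store)
print(node_y.name, node_y.store)
```

Neither node can tell from its local state alone that its view differs from the other's, which is the tension with the goal of a "single coherent system".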

Content of 739
  Distributed system courses can be very different…
  Theoretical: distributed algorithms (e.g., to allow nodes to come to consensus or agreement)
    4 lectures
  Practical: distributed programming (e.g., using RPC, JAVA RMI, CORBA, DCOM, MPI, PVM)
    Warm-up project
  Research systems: new ideas for making distributed systems better
    Focus of course
    Implemented systems with new conceptual ideas
    Recent papers in top systems conferences (SOSP, OSDI, NSDI)

Learning by Reading
  Intense reading list; assume sophisticated reader (736)
    Usually cover 1 fascinating paper per class
    No exams
  Three types of classes
    1) Formal lecture: Only for 4 theory topics
    2) Discussions: Most papers
      – I ask questions, expect everyone to enthusiastically participate; fairly casual
      – Task 1: Read paper 2-3 times before class
      – Task 2: Write-up to me BEFORE class
      – Task 3: Take turns being scribe (about 2 times in semester)
          Write up notes from discussion in LaTeX
          Post to web page within 72 hours

Learning by Reading (cont)
  Types of classes (cont)
    3) Group-led lectures: 4 topics
      – Small group gives overview of about 3-4 related papers
      – Topics:
          Distributed system analysis
          Process migration
          Programming environments
          Specialized distributed services
      – Advantages
          Good practice for giving presentations
          Learn about topic in slightly more depth
      – Tasks
          Group:
            » Finalize related papers (1 week before)
            » Present to me (2 days before)
            » Use slides
          Everyone else: Skim papers
      – Handout: State preferences by next week

Course Topics: Reading List
  Distributed Operating Systems (Survey, Amoeba vs Sprite)
  Network File Systems (NFS, Coda, LBFS)
  Theory: Time, Ordering, and Distributed Snapshots (2 Lamport papers)
  Analysis of Distributed Systems (1 + Group Presentation)
  Programming Environments (DSM, MapReduce, Group)
  Process Migration (1 + Group)
  Specialized Distributed Services (Porcupine + Group)
  SPRING BREAK
  Theory: Consensus (Byzantine failures and fail-stop processors)
  Cluster-based File Systems (Petal+Frangipani and GoogleFS)
  Communication Primitives (RPC vs U-Net)
  P2P Systems (Measurement, CFS, Amazon, Pangaea, LOCKSS)
  Miscellaneous: Trust, Recovery, Mistakes, Speculation, Sensor Networks

Learning by Doing
  Warm-up Project
    Goal: Become familiar with existing distributed programming environments
      Examples: Hadoop (open-source MapReduce), MPI, PVM
    Task 0: Get environment running
    Task 1: Implement simple application (e.g., sorting)
    Task 2: Report sufficient numbers to indicate did something
  Final Project
    Goal 1: Experience with “research process” in general
      – Work on open-ended project, unknown result
      – New idea where don’t know if it will work
    Goal 2: Learn about specific topic in depth
    Topic from my list or your own choice; work with project partner
    Deliverables: 20 minute talk, short research paper
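For Task 1, one common way to structure a distributed sort is range partitioning: split the keys into ranges, sort each range on its own worker, then concatenate the results in range order. The sketch below simulates that structure in a single Python process. It does not use Hadoop, MPI, or PVM, and the function names and the fixed `boundaries` argument are assumptions made for illustration only.

```python
from bisect import bisect_right


def partition_by_range(values, boundaries):
    """Assign each value to the bucket whose key range contains it.

    `boundaries` must be sorted. With boundaries [10, 20] the buckets
    hold values < 10, values in [10, 20), and values >= 20.
    """
    buckets = [[] for _ in range(len(boundaries) + 1)]
    for value in values:
        buckets[bisect_right(boundaries, value)].append(value)
    return buckets


def range_partitioned_sort(values, boundaries):
    """Sort by partitioning, sorting each bucket, and concatenating."""
    buckets = partition_by_range(values, boundaries)
    # In a real deployment each bucket would be sorted on a separate
    # worker; here the "workers" run one after another in this process.
    sorted_buckets = [sorted(bucket) for bucket in buckets]
    # Buckets cover increasing key ranges, so concatenation is sorted.
    return [value for bucket in sorted_buckets for value in bucket]


if __name__ == "__main__":
    data = [23, 5, 17, 10, 42, 1, 20]
    print(range_partitioned_sort(data, boundaries=[10, 20]))
```

How evenly the work is spread depends on how the boundaries are chosen. Measuring bucket sizes and per-bucket sort times is one way to produce the numbers that Task 2 asks for.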

Agenda for Next Class
  See website:
  Read: Survey: Distributed Operating Systems
    Andrew S. Tanenbaum and Robbert Van Renesse
    ACM Computing Surveys, Volume 17, Issue 4 (December 1985), pp
    Long paper: Focus on Sections 1 and 2
  Answer question: What were the goals of distributed systems at this time? Which design issue (i.e., communication primitives, naming and protection, resource management, fault tolerance, services) seems most challenging (or interesting)? Why?
    Answer to me with Subject cs739: Survey
  Think about group presentation papers