Distributed systems [Fall 2014] G22.3033-002 Lec 1: Course Introduction.

Waitlist status
Course admittance priority: Ph.D., M.S.
If you are not going to take the class, drop early to let others in

Class staff
Instructor: Prof. Jinyang Li (me)
  – Office hour: Wed 4-5pm (715 Bway Rm 708)
Instructional Assistant: Yang Cui
  – Office hour: Thu 4-5pm (715 Bway Rm 707)

Background
What I assume you already know:
  – OS organization
  – Programming experience in C or C++
  – Concurrency and threading
  – Programming with sockets, TCP/IP

Course readings
No official textbook
Lectures are based on research papers
  – Check webpage for schedules
Useful reference books
  – Principles of Computer System Design (Saltzer and Kaashoek)
  – Distributed Systems (Tanenbaum and Van Steen)
  – Advanced Programming in the UNIX Environment (Stevens)
  – UNIX Network Programming (Stevens)

Meeting times & lecture structure
Tuesdays 5:10-7pm
  – With a 10-minute break in the middle
Lectures cover basic concepts, followed by paper discussion
  – Read assigned papers before lecture
Sometimes the instructional assistant will lead a 30-minute discussion on labs

Important addresses
URL:
  – Check regularly for schedule
We’ll use Piazza.com for making announcements and conducting discussion

How are you evaluated?
Participation 10%
Labs 40%
Quizzes 50%
  – Mid-term and final (90 minutes each)

Using Piazza
Please post all questions on Piazza instead of emailing course staff
You can make your post either private (only staff can see it) or public (visible to the whole class)
We encourage you to make public posts
  – The whole class benefits from seeing your question and its answer

Participation
Participation is 10% of your final grade
1. Paper summary submitted via Piazza (before lecture)
Summarize the assigned paper before class:
  – 3 things you’ve learnt from the paper
  – 1 weakness of the paper
  – Answer to the assigned question (if there is one)
2. In-class participation
3. Piazza discussion
  – Asking questions and answering others’ questions

Questions?

What are distributed systems? Examples?
Multiple hosts
A local or wide area network
Machines communicate to provide some service for applications

Why distributed systems? For ease of use
Handle geographic separation
Provide users (or applications) with location transparency:
  – Web: access information with a few “clicks”
  – Network file system: access files on remote servers as if they were on a local disk; share files among multiple computers

Why distributed systems? For availability
Build a reliable system out of unreliable parts
  – Hardware can fail: power outages, disk failures, memory corruption, network switch failures…
  – Software can fail: bugs, misconfiguration, upgrades…
  – How to achieve availability?
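
One common answer to the availability question is replication. The sketch below is a back-of-the-envelope estimate, not part of the original slides: it assumes each replica is up with probability p and that replicas fail independently. The function name `availability` is hypothetical, and correlated failures (for example a shared power outage) make the real figure lower.

```python
def availability(p: float, n: int) -> float:
    """Probability that at least one of n replicas is up.

    Assumption: each replica is up with probability p, and failures
    are independent of each other.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 1:
        raise ValueError("n must be at least 1")
    # The service is down only if all n replicas are down at once.
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, availability(0.99, n))
```

Under these assumptions, two replicas that are each up 99% of the time give roughly 99.99% availability.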

Why distributed systems? For scalable capacity
Aggregate resources of many computers
  – CPU: MapReduce, Spark, Grid computing
  – Bandwidth: Akamai CDN, BitTorrent
  – Disk: Google File System, Hadoop File System
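
To make the idea of aggregating CPU concrete, here is a minimal single-process sketch of the split, map, and merge pattern that MapReduce-style systems apply across machines. It is an illustration only: threads stand in for separate computers, the names `map_count` and `word_count` are hypothetical, and CPython threads do not give real CPU parallelism for this workload.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_count(chunk: list[str]) -> Counter:
    """Map step: count the words in one partition of the input."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def word_count(lines: list[str], workers: int = 4) -> Counter:
    """Partition the input, count each partition, then merge the results."""
    # Round-robin partitioning: worker i gets lines i, i+workers, ...
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(map_count, chunks))
    total = Counter()
    for partial in partials:  # reduce step: add up the partial counts
        total.update(partial)
    return total


if __name__ == "__main__":
    print(word_count(["a b a", "b c"], workers=2))
```

In a real cluster each partition would live on a different machine, and the merge step would move data over the network.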

Why distributed systems? For modular functionality
Only need to build a service to accomplish a single task well
  – Authentication server
  – Backup server
Compose multiple simple services to achieve sophisticated functionality
  – A distributed file system: a block service + a metadata lookup service
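
The file system example can be sketched as two small in-memory services and a thin layer that composes them. This is a toy under stated assumptions: the class names, method names, and the 4-byte block size are all hypothetical, and in a real system each service would run on its own machines behind a network interface.

```python
class BlockService:
    """Stores opaque blocks of bytes and hands back an id for each one."""

    def __init__(self):
        self._blocks = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        block_id = self._next_id
        self._next_id += 1
        self._blocks[block_id] = data
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]


class MetadataService:
    """Maps a file name to the ordered list of block ids that hold it."""

    def __init__(self):
        self._files = {}

    def set_blocks(self, name: str, block_ids: list[int]) -> None:
        self._files[name] = list(block_ids)

    def lookup(self, name: str) -> list[int]:
        return self._files[name]


class FileSystem:
    """Composes the two services into a whole-file read/write interface."""

    BLOCK_SIZE = 4  # tiny on purpose, so the example spans several blocks

    def __init__(self, blocks: BlockService, meta: MetadataService):
        self.blocks = blocks
        self.meta = meta

    def write(self, name: str, data: bytes) -> None:
        ids = [self.blocks.put(data[i:i + self.BLOCK_SIZE])
               for i in range(0, len(data), self.BLOCK_SIZE)]
        self.meta.set_blocks(name, ids)

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks.get(i) for i in self.meta.lookup(name))


if __name__ == "__main__":
    fs = FileSystem(BlockService(), MetadataService())
    fs.write("notes.txt", b"hello world")
    print(fs.read("notes.txt"))
```

Neither service knows about files as a whole; only the composing layer does, which is what lets each service stay simple.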

Challenges
System design
  – What is the right interface or abstraction?
  – How to partition functions for scalability?
Consistency
  – How to share data consistently among multiple readers/writers?
Fault tolerance
  – How to keep the system available despite node or network failures?

Challenges (continued)
Different deployment scenarios
  – Clusters
  – Wide area distribution
  – Sensor networks
Implementation
  – How to maximize concurrency?
  – What’s the bottleneck?
  – How to reduce load on the bottleneck resource?

The downside
“A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.” -- Leslie Lamport
Much more complex

The important things in distributed systems design

#1: Abstraction & Interface
Application users access your service via some interface
An example, a storage service’s API:
  – File system (mkdir, readdir, write, read)
  – Database (create tables, SQL queries)
  – Disk (read block, write block)
Conflicting goals:
  – Simple vs. efficient to implement
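
One way to see the tension between a simple interface and an efficient one is to build a byte-level write on top of a block-only disk interface. The sketch below is illustrative: `Disk`, `write_bytes`, and the 8-byte block size are hypothetical. The disk interface is trivial to implement, but a small write that straddles a block boundary costs two full block reads and two full block writes.

```python
BLOCK_SIZE = 8  # hypothetical, kept tiny for the example


class Disk:
    """Disk-style interface: only whole blocks can be read or written."""

    def __init__(self, num_blocks: int):
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]
        self.writes = 0  # counts block writes, to show the cost

    def read_block(self, n: int) -> bytes:
        return self.blocks[n]

    def write_block(self, n: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("must write exactly one full block")
        self.blocks[n] = bytes(data)
        self.writes += 1


def write_bytes(disk: Disk, offset: int, data: bytes) -> None:
    """File-style write at any byte offset, built on block I/O.

    Every block the write touches is read, patched, and written back.
    """
    pos = 0
    while pos < len(data):
        block_no, start = divmod(offset + pos, BLOCK_SIZE)
        n = min(BLOCK_SIZE - start, len(data) - pos)
        block = bytearray(disk.read_block(block_no))
        block[start:start + n] = data[pos:pos + n]
        disk.write_block(block_no, block)
        pos += n


if __name__ == "__main__":
    disk = Disk(2)
    write_bytes(disk, 6, b"abcd")  # straddles blocks 0 and 1
    print(disk.blocks, disk.writes)
```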

#2: Fault Tolerance
How to keep the system running when some machine is down?
Does the system still give “correct” service?
How to incorporate a recovered machine correctly?
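
A minimal sketch of the first question, keeping the service running when a machine is down: a client that tries replicas in turn. All names here (`Replica`, `ReplicaDown`, `call_with_failover`) are hypothetical. The sketch also assumes a failure shows up as an explicit exception, whereas over a real network a crashed server is hard to tell apart from a slow one.

```python
class ReplicaDown(Exception):
    """Raised when a replica cannot serve a request."""


class Replica:
    def __init__(self, name: str, up: bool = True):
        self.name = name
        self.up = up

    def handle(self, request: str) -> str:
        if not self.up:
            raise ReplicaDown(self.name)
        return f"{self.name} served {request}"


def call_with_failover(replicas: list[Replica], request: str) -> str:
    """Try each replica in order and return the first successful reply."""
    failures = []
    for replica in replicas:
        try:
            return replica.handle(request)
        except ReplicaDown as exc:
            failures.append(str(exc))
    raise RuntimeError(f"all replicas failed: {failures}")


if __name__ == "__main__":
    replicas = [Replica("a", up=False), Replica("b")]
    print(call_with_failover(replicas, "get x"))
```

Failover keeps the service reachable, but it does not by itself answer the other two questions: whether the surviving replica gives a correct answer, and how the failed one rejoins.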

#3: Consistency
Contract with apps/users about the meaning of operations
Difficult due to:
  – Failures, multiple copies of data, concurrency
E.g. how to keep 2 replicas “identical”
  – If one is down, it will miss updates
  – If the network is broken, both might process different updates
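
The two failure cases for a pair of replicas can be reproduced in a few lines. This is a toy model, not a real replication protocol: `Replica`, `broadcast`, and `catch_up` are hypothetical names, and each replica simply keeps an ordered log of the updates it has applied.

```python
class Replica:
    def __init__(self):
        self.log = []   # updates applied so far, in order
        self.up = True

    def apply(self, update: str) -> None:
        self.log.append(update)


def broadcast(replicas: list[Replica], update: str) -> None:
    """Naive replication: send the update to every replica that is up."""
    for replica in replicas:
        if replica.up:
            replica.apply(update)


def catch_up(stale: Replica, fresh: Replica) -> None:
    """Copy the updates a recovered replica missed.

    Only valid when the stale log is a prefix of the fresh log.
    """
    if fresh.log[:len(stale.log)] != stale.log:
        raise ValueError("logs diverged; appending cannot reconcile them")
    stale.log.extend(fresh.log[len(stale.log):])


if __name__ == "__main__":
    a, b = Replica(), Replica()
    broadcast([a, b], "x=1")
    b.up = False                 # b is down and misses the next update
    broadcast([a, b], "x=2")
    print(a.log, b.log)          # the replicas now differ
    b.up = True
    catch_up(b, a)
    print(a.log == b.log)
```

The first case (one replica down) is repaired by copying the missing suffix. The second case (a broken network where each side accepts different updates) makes `catch_up` raise, because there is no longer a single agreed order of updates to copy.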

#4: Performance
Latency & throughput
To increase throughput, exploit parallelism
  – Many resources exist in multiples: CPU cores, IO and CPU
To reduce latency:
  – Figure out what takes time: queuing, network, storage, some expensive algorithm, many serial steps?
How much performance is enough?
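
A rough way to reason about both metrics is to write down where the time goes for one request. The sketch below uses made-up step timings and hypothetical function names. It also assumes that steps run serially within a request, and that requests are independent with no shared bottleneck, so throughput scales linearly with the number of workers.

```python
def request_latency_ms(steps: dict[str, float]) -> float:
    """Latency of one request whose steps run one after another."""
    return sum(steps.values())


def slowest_step(steps: dict[str, float]) -> str:
    """The step to look at first when trying to reduce latency."""
    return max(steps, key=steps.get)


def throughput_per_s(latency_ms: float, workers: int) -> float:
    """Requests per second if `workers` requests proceed in parallel.

    Assumption: no shared bottleneck, so scaling is linear.
    """
    return workers * 1000.0 / latency_ms


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    steps = {"queuing": 2.0, "network": 0.5, "storage": 10.0, "compute": 0.5}
    latency = request_latency_ms(steps)
    print(latency, slowest_step(steps), throughput_per_s(latency, workers=4))
```

Adding workers raises throughput in this model but leaves per-request latency unchanged; only shortening or overlapping the steps reduces latency.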