Published by Stephany Ursula Logan. Modified over 9 years ago.
1
Distributed systems [Fall 2014] G22.3033-002 Lec 1: Course Introduction
2
Waitlist status Course admittance priority: Ph.D., M.S. If you are not going to take the class, drop early to let others in
3
Class staff Instructor: Prof. Jinyang Li (me) –jinyang@cs.nyu.edu –Office Hour: Wed 4-5pm (715 Bway Rm 708) Instructional Assistant: Yang Cui –cuiyang1125@gmail.com –Office Hour: Thu 4-5pm (715 Bway Rm 707)
4
Background What I assume you already know: –OS organization –Programming experience in C or C++ –Concurrency and threading –Programming w/ sockets, TCP/IP
5
Course readings No official textbook Lectures are based on research papers –Check webpage for schedules Useful reference books –Principles of Computer System Design (Saltzer and Kaashoek) –Distributed Systems (Tanenbaum and van Steen) –Advanced Programming in the UNIX Environment (Stevens) –UNIX Network Programming (Stevens)
6
Meeting times & Lecture structure Tuesdays 5:10-7pm –With a 10-minute break in the middle Lectures cover basic concepts, followed by paper discussion –Read assigned papers before lecture Sometimes the instructional assistant will lead a 30-minute discussion on labs.
7
Important addresses URL: http://www.news.cs.nyu.edu/~jinyang/fa14-ds –Check regularly for schedule We’ll use Piazza.com for making announcements and conducting discussion
8
How are you evaluated? Participation 10% Labs 40% Quizzes 50% –mid-term and final (90 minutes each)
9
Using Piazza Please post all questions on Piazza instead of emailing course staff You can make your post either private (only staff can see it) or public (visible to the whole class) We encourage you to make public posts –The whole class benefits from seeing your question and its answer
10
Participation Participation is 10% of your final grade 1. Paper summary submitted (before lecture) via Piazza Summarize the assigned paper before class –3 things you’ve learnt from the paper –1 weakness of the paper –Answer the assigned question (if there is one) 2. In-class participation 3. Piazza discussion Asking questions and answering others’ questions
11
Questions?
12
What are distributed systems? Examples? Multiple hosts A local or wide area network Machines communicate to provide some service for applications
13
Why distributed systems? for ease-of-use Handle geographic separation Provide users (or applications) with location transparency: –Web: access information with a few “clicks” –Network file system: access files on remote servers as if they are on a local disk, share files among multiple computers
14
Why distributed systems? for availability Build a reliable system out of unreliable parts –Hardware can fail: power outage, disk failures, memory corruption, network switch failures… –Software can fail: bugs, mis-configuration, upgrade … –How to achieve 0.99999 availability?
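One way to get a feel for the 0.99999 question is a back-of-the-envelope calculation. The sketch below assumes replicas fail independently and that the service is up whenever at least one replica is up. Both are simplifying assumptions that correlated failures (shared power, shared bugs) can break, and the function names are illustrative, not from the course.

```python
def replicated_availability(a: float, n: int) -> float:
    """Availability of a service backed by n replicas, each up with
    probability a, assuming independent failures and that a single
    live replica is enough to serve requests."""
    return 1.0 - (1.0 - a) ** n


def replicas_needed(a: float, target: float) -> int:
    """Smallest replica count whose combined availability meets target."""
    n = 1
    while replicated_availability(a, n) < target:
        n += 1
    return n


def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes in a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60
```

Under these assumptions, machines that are each up 99% of the time need three replicas to reach 0.99999, and 0.99999 availability leaves a budget of a little over five minutes of downtime per year.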
15
Why distributed systems? for scalable capacity Aggregate resources of many computers –CPU: MapReduce, Spark, Grid computing –Bandwidth: Akamai CDN, BitTorrent –Disk: Google file system, Hadoop File System
16
Why distributed systems? for modular functionality Only need to build a service to accomplish a single task well. –Authentication server –Backup server. Compose multiple simple services to achieve sophisticated functionality –A distributed file system: a block service + a meta-data lookup service
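The composition idea can be sketched in a few lines. This is a single-process toy, assuming a hypothetical in-memory block service and metadata service; the class and method names are invented for illustration, and a real system would place each service on separate machines behind RPC.

```python
class BlockService:
    """Stores small fixed-size blocks by id; knows nothing about files."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self._blocks = {}
        self._next_id = 0

    def put(self, data):
        assert len(data) <= self.block_size
        block_id = self._next_id
        self._next_id += 1
        self._blocks[block_id] = data
        return block_id

    def get(self, block_id):
        return self._blocks[block_id]


class MetadataService:
    """Maps a file name to the ordered list of block ids holding its data."""

    def __init__(self):
        self._files = {}

    def set_blocks(self, name, block_ids):
        self._files[name] = list(block_ids)

    def lookup(self, name):
        return self._files[name]


class FileSystem:
    """Composes the two simple services into a file read/write interface."""

    def __init__(self, blocks, meta):
        self.blocks = blocks
        self.meta = meta

    def write(self, name, data):
        size = self.blocks.block_size
        # Split the data into block-sized chunks and store each one.
        ids = [self.blocks.put(data[i:i + size])
               for i in range(0, len(data), size)]
        self.meta.set_blocks(name, ids)

    def read(self, name):
        return b"".join(self.blocks.get(b) for b in self.meta.lookup(name))
```

Neither underlying service knows about the other; only the thin FileSystem layer ties them together, which is what lets each one be built, scaled, and replaced separately.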
17
Challenges System design –What is the right interface or abstraction? –How to partition functions for scalability? Consistency –How to share data consistently among multiple readers/writers? Fault Tolerance –How to keep system available despite node or network failures?
18
Challenges (continued) Different deployment scenarios –Clusters –Wide area distribution –Sensor networks Implementation –How to maximize concurrency? –What’s the bottleneck? –How to reduce load on the bottleneck resource?
19
The downside “A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.” -- Leslie Lamport Much more complex
20
The important things in distributed systems design
21
#1: Abstraction & Interface Application users access your service via some interface For example, a storage service’s API could be: –File system (mkdir, readdir, write, read) –Database (create tables, SQL queries) –Disk (read block, write block) Conflicting goals: –simple vs. efficient to implement
22
#2: Fault Tolerance How to keep the system running when some machine is down? Does the system still give “correct” service? How to incorporate a recovered machine correctly?
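One common way to reintegrate a recovered machine is to replay the updates it missed from an ordered log. The sketch below is a minimal single-threaded illustration, assuming a primary that never fails and no writes arriving during recovery; the class names are hypothetical, and this is not necessarily the scheme the course papers use.

```python
class Backup:
    def __init__(self):
        self.state = {}
        self.applied = 0   # number of log entries applied so far
        self.up = True

    def apply(self, entry):
        key, value = entry
        self.state[key] = value
        self.applied += 1


class Primary:
    def __init__(self, backup):
        self.log = []      # every update, in the order it was accepted
        self.state = {}
        self.backup = backup

    def write(self, key, value):
        entry = (key, value)
        self.log.append(entry)
        self.state[key] = value
        # Keep serving even if the backup is down; it will catch up later.
        if self.backup.up:
            self.backup.apply(entry)

    def recover_backup(self):
        """Replay, in order, the entries the backup missed while down."""
        for entry in self.log[self.backup.applied:]:
            self.backup.apply(entry)
        self.backup.up = True
```

The backup's applied counter records how far into the log it got, so recovery only needs to send the suffix it missed rather than the whole state.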
23
#3: Consistency Contract with apps/users about meaning of operations. Difficult due to: –Failure, multiple copies of data, concurrency E.g. how to keep 2 replicas “identical” –If one is down, it will miss updates –If net is broken, both might process different updates
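The broken-network case can be made concrete with a toy model. The sketch below assumes two in-memory key-value replicas that push each write to their peers unless a hypothetical partitioned flag is set; it only demonstrates how the replicas diverge and does not attempt to resolve the conflict.

```python
class KVReplica:
    def __init__(self):
        self.data = {}
        self.peers = []
        self.partitioned = False   # True: updates cannot reach peers

    def put(self, key, value):
        self.data[key] = value
        if not self.partitioned:
            for peer in self.peers:
                peer.data[key] = value


def diverged_keys(a, b):
    """Keys on which the two replicas currently disagree."""
    keys = set(a.data) | set(b.data)
    return sorted(k for k in keys if a.data.get(k) != b.data.get(k))
```

If both replicas keep accepting writes to the same key while partitioned, diverged_keys reports that key once the link is back. Deciding which value wins, or refusing writes during the partition in the first place, is exactly the contract a consistency model has to spell out.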
24
#4: Performance Latency & Throughput To increase throughput, exploit parallelism –Many resources exist in multiples: CPU cores, IO and CPU To reduce latency, –Figure out what takes time: queuing, network, storage, some expensive algorithm, many serial steps? How much performance is enough?
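A simple model helps separate the two metrics. The sketch below assumes a request passes through its steps serially and that independent requests can run on parallel workers with no shared bottleneck; the step names and timings in the usage note are made up for illustration.

```python
def total_latency_ms(steps):
    """Latency of one request whose steps run one after another."""
    return sum(steps.values())


def slowest_step(steps):
    """The step to look at first when trying to reduce latency."""
    return max(steps, key=steps.get)


def max_throughput_per_sec(latency_ms, parallel_workers):
    """Upper bound on requests per second if each worker handles one
    request at a time and workers do not contend with each other."""
    return parallel_workers * 1000.0 / latency_ms
```

With hypothetical timings of 2 ms queuing, 0.5 ms network, and 10 ms storage, a request takes 12.5 ms and storage dominates. Four independent workers then bound throughput at 320 requests per second, and adding workers raises throughput without making any single request faster.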