1 CS525 Advanced Distributed Systems Spring 2015 Indranil Gupta (Indy) Lecture 1 January 20, 2015 All Slides © IG.

Slides:



Advertisements
Similar presentations
Distributed Data Processing
Advertisements

CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy)
Distributed Systems 1 Topics  What is a Distributed System?  Why Distributed Systems?  Examples of Distributed Systems  Distributed System Requirements.
Chapter 5 Operating Systems. 5 The Operating System When working with multimedia, the operating system is perhaps the most important, the most complex,
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 1: Welcome, and Introduction Web: courses.engr.illinois.edu/cs425/ All slides.
Objektorienteret Middleware Presentation 2: Distributed Systems – A brush up, and relations to Middleware, Heterogeneity & Transparency.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) August 26 – December 9, 2014 Lecture 1-29 Web: courses.engr.illinois.edu/cs425/ All.
A. Frank 1 Internet Resources Discovery (IRD) Peer-to-Peer (P2P) Technology (1) Thanks to Carmit Valit and Olga Gamayunov.
Networks 1 CS502 Spring 2006 Network Input & Output CS-502 Operating Systems Spring 2006.
CS-3013 & CS-502, Summer 2006 Network Input & Output1 CS-3013 & CS-502, Summer 2006.
11 CS525 Advanced Distributed Systems Spring 2015 Indranil Gupta (Indy) Wrap-Up January 20 – May 5, 2015 All Slides © IG.
Operating Systems: Principles and Practice
Distributed Systems Lecture 1: Overview CS425/CSE424/ECE428 Fall 2011 Nikita Borisov.
Introduction. Readings r Van Steen and Tanenbaum: 5.1 r Coulouris: 10.3.
Cloud MapReduce : a MapReduce Implementation on top of a Cloud Operating System Speaker : 童耀民 MA1G Authors: Huan Liu, Dan Orban Accenture.
Computer System Architectures Computer System Software
Communication (II) Chapter 4
11 CS525 Advanced Distributed Systems Spring 2014 Indranil Gupta (Indy) Wrap-Up January 21 – May 6, 2014 All Slides © IG.
Web: courses.engr.illinois.edu/cs425/
1 CS525 Advanced Distributed Systems Spring 2011 Indranil Gupta (Indy) Lecture 1 January 18, 2011 All Slides © IG.
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
Distributed systems [Fall 2014] G Lec 1: Course Introduction.
1 CS 425 Distributed Systems Fall 2011 Slides by Indranil Gupta Measurement Studies All Slides © IG Acknowledgments: Jay Patel.
Introduction. Readings r Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 m Note: All figures from this book.
Advanced Computer Networks Topic 2: Characterization of Distributed Systems.
Distributed Systems and Algorithms Sukumar Ghosh University of Iowa Spring 2011.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems Gossiping Steve Ko Computer Sciences and Engineering University at Buffalo.
Lecture 28-1 Computer Science 425 Distributed Systems Lecture 28 “Wrap-Up” Klara Nahrstedt.
Lecture 1-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2012 Indranil Gupta (Indy) August 28, 2012 Lecture 1  2012, I. Gupta,
Amit Warke Jerry Philip Lateef Yusuf Supraja Narasimhan Back2Cloud: Remote Backup Service.
Most of contents are provided by the website Introduction TJTSD66: Advanced Topics in Social Media Dr.
CSE 486/586, Spring 2012 CSE 486/586 Distributed Systems Gossiping Steve Ko Computer Sciences and Engineering University at Buffalo.
11 CS525 Advanced Distributed Systems Spring 2013 Indranil Gupta (Indy) Wrap-Up January 15 – April 30, 2013 All Slides © IG.
 Course Overview Distributed Systems IT332. Course Description  The course introduces the main principles underlying distributed systems: processes,
Unit 9: Distributing Computing & Networking Kaplan University 1.
Distributed systems [Fall 2015] G Lec 1: Course Introduction.
Application Software System Software.
Lecture 1-1  2002, M. T. Harandi, J. Hou, and I. Gupta (modified Y. Hu) ECE 428 Distributed Systems Yih-Chun Hu August 25, 2005.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) August 25 – December 8, 2015 Lecture 1-29 Web: courses.engr.illinois.edu/cs425/ All.
CS614: Advanced Course in Computer Systems (Spring’04) Instructor: Ken Birman TA: non assigned (yet)
Introduction to CS739: Distribution Systems UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau.
Introduction TO Network Administration
11 CS525 Advanced Distributed Systems Spring 2011 Indranil Gupta (Indy) Wrap-Up January 18 – May 3, 2011 All Slides © IG.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
Lecture 29-1 Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2012 Indranil Gupta (Indy) August 27-December 11, 2012 Lecture 1-29.
1 TCS Confidential. 2 Objective : In this session we will be able to learn:  What is Cloud Computing?  Characteristics  Cloud Flavors  Cloud Deployment.
Indranil Gupta Spring 09 January 20, 2009 – May 5, 2009 CS525 Advanced Distributed Systems Spring 2009.
CSE 486/586 CSE 486/586 Distributed Systems Gossiping Steve Ko Computer Sciences and Engineering University at Buffalo.
CSE 486/586, Spring 2013 CSE 486/586 Distributed Systems The Internet in 2 Hours: The First Hour Steve Ko Computer Sciences and Engineering University.
Computer Science 425 Distributed Systems CS 425 / CSE 424 / ECE 428 Fall 2011 Gossiping Reading: Section 15.4 / 18.4  2011, N. Borisov, I. Gupta, K. Nahrtstedt,
Chapter SOFTWARE Are the programs which are written by different programming languages. These programs are: a series of instruction that tells.
UNIX U.Y: 1435/1436 H Operating System Concept. What is an Operating System?  The operating system (OS) is the program which starts up when you turn.
For more course tutorials visit NTC 406 Entire Course NTC 406 Week 1 Individual Assignment Network Requirements Analysis Paper NTC 406.
Lecture 1 Book: Hadoop in Action by Chuck Lam Online course – “Cloud Computing Concepts” lecture notes by Indranil Gupta.
CSE 486/586 Distributed Systems Gossiping
CS/CE/TE 6378 Advanced Operating Systems
CS525 Advanced Distributed Systems Spring 2013
CS525 Advanced Distributed Systems Spring 2009
CS 425 / ECE 428 Distributed Systems Fall 2017 Indranil Gupta (Indy)
Ministry of Higher Education
湖南大学-信息科学与工程学院-计算机与科学系
湖南大学-信息科学与工程学院-计算机与科学系
Software Defined Networking (SDN)
Global State and Gossip
CSC 724 Advanced Distributed Systems Introduction
Web: courses.engr.illinois.edu/cs425/
Quasardb Is a Fast, Reliable, and Highly Scalable Application Database, Built on Microsoft Azure and Designed Not to Buckle Under Demand MICROSOFT AZURE.
Web: courses.engr.illinois.edu/cs425/
CS 425 / ECE 428 Distributed Systems Fall 2018 Indranil Gupta (Indy)
Presentation transcript:

1 CS525 Advanced Distributed Systems Spring 2015 Indranil Gupta (Indy) Lecture 1 January 20, 2015 All Slides © IG

2 What is a Distributed System? (examples) The InternetGnutella peer to peer system Datacenter/Cloud A Sensor Network

3 Can you name some examples of Operating Systems?

4 … Linux WinXP Vista Unix FreeBSD Mac OSX 2K Aegis Scout Hydra Mach SPIN OS/2 Express Flux Hope Spring AntaresOS EOS LOS SQOS LittleOS TINOS PalmOS WinCE TinyOS …

5 What is an Operating System?

6 User interface to hardware (device driver) Provides abstractions (processes, file system) Resource manager (scheduler) Means of communication (networking) …

7 FOLDOC definition The low-level software which handles the interface to peripheral hardware, schedules tasks, allocates storage, and presents a default interface to the user when no application program is running. The OS may be split into a kernel which is always present and various system programs which use facilities provided by the kernel to perform higher-level house-keeping tasks, often acting as servers in a client-server relationship. Some would include a graphical user interface and window system as part of the OS, others would not. The operating system loader, BIOS, or other firmware required at boot time or when installing the operating system would generally not be considered part of the operating system, though this distinction is unclear in the case of a roamable operating system such as RISC OS. The facilities an operating system provides and its general design philosophy exert an extremely strong influence on programming style and on the technical cultures that grow up around the machines on which it runs.

8 Can you name some examples of Distributed Systems?

9 Client-server (e.g., NFS) The Internet The Web A sensor network DNS BitTorrent (peer to peer overlay) Datacenters Hadoop

10 What is a Distributed System?

11 FOLDOC definition A collection of (probably heterogeneous) automata whose distribution is transparent to the user so that the system appears as one local machine. This is in contrast to a network, where the user is aware that there are several machines, and their location, storage replication, load balancing and functionality is not transparent. Distributed systems usually use some kind of client-server organization.

12 Textbook definitions A distributed system is a collection of independent computers that appear to the users of the system as a single computer. [Andrew Tanenbaum] A distributed system is several computers doing something together. Thus, a distributed system has three primary characteristics: multiple computers, interconnections, and shared state. [Michael Schroeder]

13 Unsatisfactory Why are these definitions short? Why do these definitions look inadequate to us? Because we are interested in the insides of a distributed system –algorithmics –design and implementation –maintenance –study

14 I shall not today attempt further to define the kinds of material I understand to be embraced within that shorthand description; and perhaps I could never succeed in intelligibly doing so. But I know it when I see it… [Potter Stewart, Associate Justice, US Supreme Court (talking about his interpretation of a technical term laid down in the law, case Jacobellis versus Ohio 1964) ]

15 A working definition for us A distributed system is a collection of entities, each of which is autonomous, programmable, asynchronous and failure-prone, and which communicate through an unreliable communication medium. Our interest in distributed systems involves –algorithmics, design and implementation, maintenance, study Entity=a process on a device (PC, PDA, mote) Communication Medium=Wired or wireless network

16 A range of interesting problems for Distributed System designers P2P systems [Gnutella, Kazaa, BitTorrent] Cloud Infrastructures [AWS, Azure, GCE] Cloud Storage [Key-value stores, NoSQL, BigTable] Cloud Programming [MapReduce, Pig, Hive, Storm, Pregel] Coordination [Paxos] Routing [Sensor Networks, Internet]

17 A range of challenges Failures: no longer the exception, but rather a norm Scalability: 1000s of machines, Terabytes of data Asynchrony: clock skew and clock drift Security: of data, users, computations, etc.

18 Multicast

19 Multicast Distributed Group of “Nodes”= “Nodes”=Processes at Internet- based hosts Node with a piece of information to be communicated to everyone

20 Fault-tolerance and Scalability Multicast sender Multicast Protocol Nodes may crash Nodes may crash Packets may Packets may be dropped be dropped 1000’s of nodes 1000’s of nodes X X

21 Centralized UDP/TCP packets Simplest Simplest implementation implementation Problems? Problems?

22 Tree-Based UDP/TCP packets e.g., IPmulticast, SRM e.g., IPmulticast, SRM RMTP, TRAM,TMTP RMTP, TRAM,TMTP Lower load per node Lower load per node Tree setup Tree setup and maintenance and maintenance Problems? Problems?

23 A Third Approach Multicast sender

24 Gossip messages (UDP) Periodically, transmit to b random targets

25 Other nodes do same after receiving multicast Gossip messages (UDP)

26

27 “Epidemic” Multicast (or “Gossip”) Protocol rounds (local clock) Protocol rounds (local clock) b random targets per round b random targets per round Uninfected Uninfected Infected Infected Gossip Message (UDP)

28 Properties Claim that this simple protocol Is lightweight in large groups Spreads a multicast quickly Is highly fault-tolerant

29 Analysis From old mathematical branch of Epidemiology [Bailey 75] Population of (n+1) individuals mixing homogeneously Contact rate between any individual pair is At any time, each individual is either uninfected (numbering x) or infected (numbering y) Then, and at all times Infected–uninfected contact turns latter infected, and it stays infected

30 Analysis (contd.) Continuous time process Then (why?) with solution (correct? can you derive it?)

31 Epidemic Multicast Protocol rounds (local clock) Protocol rounds (local clock) b random targets per round b random targets per round Uninfected Uninfected Infected Infected Gossip Message (UDP)

32 Epidemic Multicast Analysis (why?) Substituting, at time t=clog(n), num. infected is (correct? can you derive it?)

33 Analysis (contd.) Set c,b to be small numbers independent of n Within clog(n) rounds, [ low latency ] –all but number of nodes receive the multicast [reliability] –each node has transmitted no more than cblog(n) gossip messages [lightweight]

34 Fault-tolerance Packet loss –50% packet loss: analyze with b replaced with b/2 –To achieve same reliability as 0% packet loss, takes twice as many rounds Node failure –50% of nodes fail: analyze with n replaced with n/2 and b replaced with b/2 –Same as above

35 Fault-tolerance With failures, is it possible that the epidemic might die out quickly? Possible, but improbable: –Once a few nodes are infected, with high probability, the epidemic will not die out –So the analysis we saw in the previous slides is actually behavior with high probability [Galey and Dani 98] Think: why do rumors spread so fast? why do infectious diseases cascade quickly into epidemics? why does a virus or worm spread rapidly?

36 So,… Is this all theory and a bunch of equations? Or are there implementations yet?

37 Some implementations Clearinghouse and Bayou projects: and database transactions [PODC ‘87] refDBMS system [Usenix ‘94] Bimodal Multicast [ACM TOCS ‘99] Sensor networks [Li Li et al, Infocom ’02, and PBBF, ICDCS ‘05] Usenet NNTP (Network News Transport Protocol) ! [‘79] AWS EC2 and S3 Cloud (rumored). [’00s]

38 NNTP Inter-server Protocol Server retains news posts for a while, transmits them lazily, deletes them after a while 1.Each client uploads and downloads news posts from a news server 2.

39 We’ll cover some of these other implementations during the course But let’s dwell on the big picture of the course

40 Angles of Distributed Systems Distributed System (D.S.) Theory Infrastructured D.S.’s e.g., Internet-based Non-infrastructured D.S.’s e.g., ad-hoc network based

41 CS 525 and Distributed Systems D.S. Theory Peer to peer systems Cloud Computing Sensor Networks

42 CS 525 and Distributed Systems Causality, snapshots, consensus,… …DHTs, apps, … …Smart Dust, TinyOS, Aggregation, In-network processing… …MapReduce, NoSQL, …

43 Interesting: Area Overlaps Epidemics NNTP Gossip-based ad-hoc routing

44 Interesting: Area Overlaps The InternetGnutella peer to peer system Clouds A Sensor Network Do projects that are either entrepreneurial or research

Entrepreneurial Project Proposes new ideas that can be accommodated into a (your own!) startup –Company –Or non-profit Has to be a marketable product or services to users –Need to write a short Business Plan You don’t actually need to found a company; you need to develop ideas for it. What you do later with it is up to you only. EnterpriseWorks incubator at Illinois Has to use or leverage concepts from the CS525 class To help you get leverage the current and bleeding edge of d.s. research, we will read 2 research papers per class 45

Research Project Your project has to be related to distributed systems Different from your ongoing research projects (ask Indy) Must show keen awareness of the current state of the art Must solve thoroughly at least one practical research problem Must have innovative ideas and originality (algorithms) Must build a real system and evaluate it in deployment You will write a conference-quality research paper as a part of your project We will submit the best papers from this class to top conferences/workshops in the area of distributed systems –Past versions of CS525 highly successful in getting papers into conferences and journals (see course website) To help you get insight into the current and bleeding edge of d.s. research, we will read 2 research papers per class 46

Research vs. Users Initial direction =/= Final outcome –Apple I and II (Wozniak and Jobs): Research challenge was to minimize cost of chips in the PC. Users loved Apple II and III because it had color and it had flexibility for users to write their own software (until then, every new game was done in hardware!) –Flickr (Caterina Fake): Initially were writing “Game Nevernding”. Research challenges included scalability. Users loved it because of social network, and tagging. Tagging enabled groups (Squared Circle group), news feeds, and find photos of anything. –TiVo (Mike Ramsay): Initially were writing a network server for video content. Research challenges included disk management, n/w management, security. Users were amazed by pausing live TV and being given significant flexibility but without needing to be a “techie”. –Mosaic (Andreesen) was originally an NSF-funded project at UIUC, and then became a startup Be ready to change direction Both in research project and entrepreneurial project 47

Materials for Course All readings available on the course website (you don’t need to buy anything) 48

Project Buildup To ensure semester-wide progress, project is structured into systematic stages: –Initial meeting in Feb (+ open office hours) –Survey report due late-Feb (proposal + survey) –Midterm report due Mar-end (first prototype of system built + initial experimental results) –Final report due early May (final version of project and paper) Project groups: 2-3 students 49

50 Let’s Look at the Course Information Sheet… No exams Paper Reading –Presentations (groups of 2) –Reviews (2 papers per lecture, on and after Feb 12 th ) –See instructions on website for presentations and reviews –Preferable to do a presentation: you can skip 6 review sessions –Without presentation, you have to review all sessions from Feb 12 th Project –You can get access to Google cloud (limited, so use wisely!) –Limited access to multiple testbeds: PlanetLab, Emulab, and Cloud Computing Tesbed (CCT) Ask us! Class Participation a must (and fun!) TA: Mainak Ghosh (mghosh4) My office hours: right after lecture/class (3112 SC), until about 5.30 pm Please read instructions on course website – you’re responsible for following them!

51 Things for you to do today Look at the course website Take the survey on website (10 minutes): Follow “Schedule / Papers and Presentations link” and read instructions – –Need to sign up for a presentation slot by Jan 29 Take a look at conference papers arising out of previous versions of this course (CS598IG/CS525) –Many CS525 project papers published in conferences and journals

Prerequisites Background in OS’es is required (CS241/CS423) Distributed Systems/Algorithms is recommended –If you haven’t taken CS425/ECE428 or ECE526, then … –… you should highly consider taking the Coursera course on Cloud Computing Concepts (Parts 1 and 2). It’s a free course. –Starting Feb 2 nd : register here: ng 52

53 Next Lecture Cloud Computing –Take a look at all papers on website for that session –Read at least one of those papers completely –Try to read all of them completely –(no reviews required yet)

54 Backup Slides

55 Analysis (contd.) (why?) Substituting, at time t=clog(n)