Distributed Systems CS 15-440 Overview and Introduction Lecture 1, Aug 25, 2014 Mohammad Hammoud.



Why Should You Study Distributed Systems?

Application Domain | Associated Networked Application
Finance and commerce | eCommerce, e.g. Amazon and eBay, PayPal, online banking and trading
The information society | Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace
Creative industries and entertainment | Online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr
Healthcare | Health informatics, online patient records, monitoring patients
Education | e-learning, virtual learning environments; distance learning
Transport and logistics | GPS in route finding systems, map services: Google Maps, Google Earth
Science | The Grid as an enabling technology for collaboration between scientists
Environmental management | Sensor technology to monitor earthquakes, floods or tsunamis

Definition of a Distributed System

A distributed system is:
- A collection of independent computers that appears to its users as a single coherent system (Tanenbaum book)
- One in which components located at networked computers communicate and coordinate their actions only by passing messages (Coulouris book)

Why Distributed Systems?

Big data continues to grow:
- In mid-2010, the information universe carried 1.2 zettabytes
- 2020 predictions expect nearly 44 times more

Applications are becoming data-intensive and compute-intensive

Why Distributed Systems?

Individual computers have limited resources compared to the scale of current-day problems and application domains:

1. Caches and Memory:
- L1 Cache: 16KB-64KB, 2-4 cycles
- L2 Cache: 512KB-8MB, 6-15 cycles
- L3 Cache: 4MB-32MB, [...] cycles
- Main Memory: 1GB-4GB, 300+ cycles

Why Distributed Systems?

Individual computers have limited resources compared to the scale of current-day problems and application domains:

2. Hard Disk Drive:
- Limited capacity
- Limited number of channels
- Limited bandwidth

Why Distributed Systems?

Individual computers have limited resources compared to the scale of current-day problems and application domains:

3. Processor:
- The number of transistors that can be integrated on a single die has continued to grow at Moore’s pace
- Chip Multiprocessors (CMPs) are now available

[Figure: a single processor chip (P, L1, L2) and a CMP (multiple P with L1 caches, an interconnect, and an L2 cache)]

Why Distributed Systems?

3. Processor (cont’d):
- Up until a few years ago, CPU speed grew at the rate of 55% annually, while the memory speed grew at the rate of only 7% [H & P]

[Figure: the processor-memory speed gap]
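To make the two growth rates concrete, the short sketch below compounds them year over year and reports how far the CPU pulls ahead of memory. It assumes both rates stay constant and that the gap is normalised to 1 at year 0; the function name `speed_gap` is hypothetical and only serves this illustration.

```python
def speed_gap(years, cpu_rate=0.55, mem_rate=0.07):
    """Relative CPU-to-memory speed ratio after `years` of compound growth.

    Assumes constant annual growth rates and a ratio of 1.0 at year 0.
    """
    return ((1 + cpu_rate) / (1 + mem_rate)) ** years


for years in (1, 5, 10):
    print(f"after {years:2d} years the gap has grown {speed_gap(years):.1f}x")
```

Under these assumptions the gap widens by roughly 45% per year, which is why a single machine's memory system becomes the bottleneck long before its processor does.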

Why Distributed Systems?

3. Processor (cont’d):
- Even if 100s or 1000s of cores are placed on a CMP, it is a challenge to deliver input data to these cores fast enough for processing

Example: a data set of 4 TBs read over 4 I/O channels of 100 MB/s each takes [...] seconds (or 3 hours) to load.

[Figure: a CMP with its memory, loading the data set over the I/O channels]

Why Distributed Systems?

When the 4 TB data set is split across 100 machines, it takes only 3 minutes to load the data.

[Figure: the data set divided into splits, each loaded by one of 100 machines]
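The load-time comparison can be checked with a back-of-the-envelope calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), a perfectly even split, and that every machine in the cluster has the same four 100 MB/s channels as the single machine. The slides do not state the per-machine configuration, so that last point is an assumption, and `load_seconds` is a hypothetical helper.

```python
TB = 10**12  # decimal terabyte, in bytes
MB = 10**6   # decimal megabyte, in bytes


def load_seconds(data_bytes, machines, channels_per_machine, channel_bytes_per_s):
    """Time to read an evenly split data set when all channels stream in parallel."""
    aggregate_bandwidth = machines * channels_per_machine * channel_bytes_per_s
    return data_bytes / aggregate_bandwidth


single = load_seconds(4 * TB, machines=1, channels_per_machine=4,
                      channel_bytes_per_s=100 * MB)
cluster = load_seconds(4 * TB, machines=100, channels_per_machine=4,
                       channel_bytes_per_s=100 * MB)

print(f"1 machine:    {single:,.0f} s (~{single / 3600:.1f} h)")
print(f"100 machines: {cluster:,.0f} s (~{cluster / 60:.1f} min)")
```

With these assumptions the single machine needs on the order of hours and the cluster on the order of minutes, which matches the magnitude of the figures quoted on the slides. The exact cluster figure depends on how many channels each machine really has.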

Requirements

But this requires:
- A way to express the problem as parallel processes and execute them on different machines (Programming Models and Concurrency)
- A way for processes on different machines to exchange information (Communication)
- A way for processes to cooperate, synchronize with one another and agree on shared values (Synchronization)
- A way to enhance reliability and improve performance (Consistency and Replication)

Requirements

But this requires (cont’d):
- A way to recover from partial failures (Fault Tolerance)
- A way to secure communication and ensure that a process gets only those access rights it is entitled to (Security)
- A way to extend interfaces so as to mimic the behavior of another system, reduce diversity of platforms, and provide a high degree of portability and flexibility (Virtualization)
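Several of these requirements (expressing a problem as parallel workers, exchanging information by messages, and synchronizing before combining results) can be illustrated on a single machine. The sketch below is only an analogy: threads stand in for machines and in-memory queues stand in for the network, and the names `worker` and `distributed_sum` are hypothetical. A real distributed system would also need to handle partial failures, security and replication, none of which appear here.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive chunks as messages, reply with partial sums, stop on None."""
    while True:
        chunk = inbox.get()
        if chunk is None:        # sentinel message: no more work
            break
        outbox.put(sum(chunk))   # reply with a partial result


def distributed_sum(data, n_workers=4):
    """Split `data`, hand the splits to workers, and combine their replies."""
    inbox, outbox = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(n_workers)]
    for w in workers:
        w.start()

    # Programming model: one split of the data per worker.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    for chunk in chunks:
        inbox.put(chunk)         # communication: work is sent as messages
    for _ in workers:
        inbox.put(None)          # one stop message per worker

    for w in workers:            # synchronization: wait for every worker
        w.join()
    return sum(outbox.get() for _ in chunks)


print(distributed_sum(list(range(1_000_000))))
```

The workers share no state and interact only through the two queues, which mirrors the message-passing view of a distributed system. Replacing the queues with sockets or an RPC layer is where the remaining requirements start to matter.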

An Introductory Course on Distributed Systems

0. Introduction
1. Processes and Communications
2. Naming
3. Synchronization
4. Consistency and Replication
5. Fault Tolerance
6. Programming Models
7. Distributed File Systems
8. Security
9. Virtualization

Considered: a reasonably critical and comprehensive perspective.
Thoughtful: fluent, flexible and efficient perspective.
Masterful: a powerful and illuminating perspective.

Intended Learning Outcomes

Introduction
ILO0: Outline the characteristics of distributed systems and the challenges that must be addressed in their design

Processes and Communication
ILO1: Explain and contrast the communication mechanisms between processes and systems

Naming
ILO2: Identify why entities and resources in distributed systems should be named, and examine the naming conventions and the name-resolution mechanisms

Synchronization
ILO3: Describe and analyze how multiple machines and services should cooperate and synchronize to correctly solve a problem

Intended Learning Outcomes

Consistency and Replication
ILO4: Identify how replication of resources improves performance in distributed systems, and explain algorithms to maintain consistent replicas

Fault Tolerance
ILO5: Explain how a distributed system can be made fault tolerant

Programming Models
ILO6: Explain and apply the shared memory, message passing, MapReduce, Pregel and GraphLab programming models and describe the important differences between them

Distributed File Systems
ILO7: Explain distributed file systems as a paradigm for general-purpose distributed systems, analyze their various aspects and architectures, and contrast them against parallel file systems

Intended Learning Outcomes

Security
ILO8: Explain the various concepts and mechanisms that are generally incorporated in distributed systems to support security

Virtualization
ILO9: Explain resource virtualization, how it applies to distributed systems, and how it allows distributed resource management and scheduling

Course Objectives

The course aims at providing an in-depth and hands-on understanding of:
- Principles on which distributed systems are based
- Principles on which distributed systems are optimized
- Distributed system programming models and analytics engines
- How modern distributed systems meet the demands of contemporary distributed applications

Teaching Team

Instructor: Mohammad Hammoud (MHH)
Teaching Assistant: Dania Abed Rabbou (DA)

Office Hours (MHH):
- Wednesday, 4:30-5:30PM
- Welcome when my office door is open
- By appointment

Office Hours (DA):
- Tuesday, 9:30AM-12:00PM and Thursday, 10:30AM-12:00PM
- Welcome when her office door is open
- By appointment

Teaching Methods

Lectures (28):
- Motivate learning
- Provide a framework or roadmap to organize the information of the course
- Explain subjects and reinforce the critical big ideas

Recitations (14):
- Get students to reveal what they don’t understand, so we can help them
- Allow students to practice skills they will need to become competent/expert

Assignments & Projects

Assignments: 5 required problem solving and reading assignments
Projects: 4 large programming projects

Projects

For all the projects except the final one, the following rules apply:
- If you submit one day late, 25% will be deducted from your project score as a penalty
- If you are two days late, 50% will be deducted
- The project will not be graded (and you will receive a zero score) if you are more than two days late
- There will be a 3-grace-day quota

Assessment Methods

How do we measure learning?

Type | # | Weight
Project | 4 | 45%
Exam | 2 | 25% (10% Midterm & 15% Final)
Problem Solving and Reading Assignment | 5 | 15%
Quiz | 2 | 10%
Class and Recitation Participation and Attendance | 42 | 5%

Target Audience and Prerequisites

Target Audience:
- Seniors

Prerequisites:
- Students should have a basic knowledge of computer systems and object-oriented programming

Text Books

The primary textbooks for this course are:
1. Andrew S. Tanenbaum and Maarten Van Steen, Distributed Systems: Principles and Paradigms, 2nd Ed., Pearson
2. George Coulouris, Jean Dollimore, Tim Kindberg, and Gordon Blair, Distributed Systems: Concepts and Design, 5th Ed., Addison Wesley, 2011

Reference Book:
- Tom White, Hadoop: The Definitive Guide, 2nd Ed., O’Reilly Media, 2011

Next Lecture

We will discuss the trends in distributed systems and the challenges encountered when designing such systems.

To Do:
- Read C1 & T1
- Attend Next Lecture
- Attend Recitation on Thursday (OOP Programming in Java)

Questions?