Distributed Systems CS

Presentation transcript:

Distributed Systems, CS 15-440. Introduction: Lecture 1, September 03, 2018. Mohammad Hammoud

On the Verge of a Disruptive Century: Breakthroughs
- Astronomy
- Ubiquitous Computing
- Smaller, Faster, Cheaper Sensors
- Gene Sequencing and Biotechnology

A Common Theme is Data
The amount of data is only growing… 1.2 zettabytes (1 zettabyte = 10^21 B, or 1 billion TB)

We Live in a World of Data…

What Do We Do With Data?
- Store
- Access
- Encrypt
- Share
- Process
- … and more!
We want to do these seamlessly...

Using Diverse Interfaces & Devices
We also want to access, share and process our data from all of our devices, anytime, anywhere!
- Mobile Devices
- Computers
- Personal Monitors and Sensors
- Consumer Electronics
- …and even appliances

Data Becoming Critical to Our Lives
Domains of Data: Health, Science, Education, Work, Environment, Finance … and more

How to Store and Process Data at Scale?
A system can be scaled:
- Either vertically (or up): can be achieved by hardware upgrades (e.g., faster CPU, more memory, and/or larger disk)
- And/or horizontally (or out): can be achieved by adding more machines

Vertical Scaling
Caveat: Individual computers can still suffer from limited resources with respect to the scale of today’s problems
Caches and Memory:
- L1 Cache: 16-32 KB/core, 4-5 cycles
- L2 Cache: 128-256 KB/core, 12-15 cycles
- L3 Cache: 512 KB-2 MB/core, 30-50 cycles
- Main Memory: 8 GB-128 GB, 300+ cycles

Vertical Scaling
Caveat: Individual computers can still suffer from limited resources with respect to the scale of today’s problems
Disks: some advancements, but still:
- Limited capacity
- Limited number of channels
- Limited bandwidth

Vertical Scaling
Caveat: Individual computers can still suffer from limited resources with respect to the scale of today’s problems
Processors:
- Moore’s law still holds
- Chip Multiprocessors (CMPs) are now available
Note: Moore’s law states that the number of transistors that can be placed in a processor will double approximately every two years, for half the cost.
[Figure: a single processor chip (P, L1, L2) next to a CMP (several processors, each with its own L1, sharing an L2 cache over an interconnect)]

Vertical Scaling
Caveat: Individual computers can still suffer from limited resources with respect to the scale of today’s problems
Processors:
- But up until a few years ago, CPU speed grew at the rate of 55% annually, while the memory speed grew at the rate of only 7%
[Figure: the processor-memory speed gap between P and M]

Vertical Scaling
Caveat: Individual computers can still suffer from limited resources with respect to the scale of today’s problems
Processors:
- But up until a few years ago, CPU speed grew at the rate of 55% annually, while the memory speed grew at the rate of only 7%
- Even if 100s or 1000s of cores are placed on a CMP, it is a challenge to deliver input data to these cores fast enough for processing
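
To see how quickly the two growth rates quoted above diverge, a minimal sketch can compound them year over year. The function name `speed_gap` and the choice to normalize both speeds to 1.0 at year zero are assumptions made for illustration; only the 55% and 7% annual rates come from the slide.

```python
def speed_gap(years, cpu_growth=0.55, mem_growth=0.07):
    """Ratio of CPU speed to memory speed after `years`.

    Both speeds are normalized to 1.0 at year zero (an assumption);
    the default growth rates are the ones quoted on the slide.
    """
    cpu = (1.0 + cpu_growth) ** years
    mem = (1.0 + mem_growth) ** years
    return cpu / mem


# Print the relative gap after a few horizons.
for years in (1, 5, 10):
    print(f"after {years:2d} years: CPU/memory gap = {speed_gap(years):.1f}x")
```

Under these assumptions the gap after a decade is roughly forty-fold, which is why adding more cores to one chip does not by itself solve the problem of feeding them data.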

Vertical Scaling
10000 seconds (or about 3 hours) to load data
[Figure: a data set of 4 TB loaded through 4 I/O channels of 100 MB/s each into the memory of a CMP (four processors, each with an L1 cache, sharing an L2 cache over an interconnect)]
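
The load time on this slide follows from dividing the data size by the aggregate I/O bandwidth. The sketch below reproduces that arithmetic; it assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), which the slide does not state, and the function name `load_time_seconds` is hypothetical.

```python
TB = 10**12  # decimal terabyte in bytes (assumed; the slide does not specify)
MB = 10**6   # decimal megabyte in bytes (assumed)


def load_time_seconds(data_bytes, channels, bytes_per_sec_per_channel):
    """Time to stream `data_bytes` through `channels` equal I/O channels."""
    return data_bytes / (channels * bytes_per_sec_per_channel)


# Figures from the slide: 4 TB of data, 4 I/O channels at 100 MB/s each.
seconds = load_time_seconds(4 * TB, channels=4, bytes_per_sec_per_channel=100 * MB)
print(f"{seconds:.0f} s = {seconds / 3600:.1f} h")
```

The result is 10,000 seconds, a little under three hours, before any processing has started.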

How to Store and Process Data at Scale?
A system can be scaled:
- Either vertically (or up): can be achieved by hardware upgrades (e.g., faster CPU, more memory, and/or larger disk)
- And/or horizontally (or out): can be achieved by adding more machines

Horizontal Scaling
Only 3 minutes to load data
[Figure: a data set of 4 TB divided into splits across 100 machines, each with its own processor, L1 and L2 caches, and memory]
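
The effect of splitting the data set can be estimated the same way: each machine loads only its own split, and all machines load in parallel. The slide does not state the per-machine I/O configuration, so the sketch below assumes each machine has four 100 MB/s channels and that splits are equal in size; it also ignores network and coordination overhead. The function name `parallel_load_time_seconds` is hypothetical.

```python
def parallel_load_time_seconds(data_bytes, machines, channels_per_machine,
                               bytes_per_sec_per_channel):
    """Load time when the data is split evenly and machines load in parallel.

    Assumes equal splits and no coordination or network overhead.
    """
    split_bytes = data_bytes / machines
    per_machine_bandwidth = channels_per_machine * bytes_per_sec_per_channel
    return split_bytes / per_machine_bandwidth


# 4 TB over 100 machines; per-machine channels are an assumption, not from the slide.
t = parallel_load_time_seconds(4 * 10**12, machines=100,
                               channels_per_machine=4,
                               bytes_per_sec_per_channel=100 * 10**6)
print(f"{t:.0f} s = {t / 60:.1f} min")
```

Under these assumptions the load takes on the order of a couple of minutes instead of hours, the same ballpark as the figure on the slide, whose exact value depends on per-machine details it does not give.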

Requirements
But, this necessitates:
- A way to express the problem in terms of parallel processes and execute them on different machines (Programming and Concurrency Models)
- A way to organize processes (Architectures)
- A way for distributed processes to exchange information (Communication Paradigms)
- A way to locate and share resources (Naming Protocols)

Requirements
But, this necessitates:
- A way for distributed processes to cooperate, synchronize with one another, and agree on shared values (Synchronization)
- A way to reduce latency, enhance reliability, and improve performance (Caching, Replication, and Consistency)
- A way to enhance load scalability, reduce diversity across heterogeneous systems, and provide a high degree of portability and flexibility (Virtualization)
- A way to recover from partial failures (Fault Tolerance)

So, What is a Distributed System?
A distributed system is:
- A collection of independent computers that appears to its users as a single coherent system
- One in which components located at networked computers communicate and coordinate their actions only by passing messages
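
The second definition, coordination only by passing messages, can be illustrated with a small simulation. This is a sketch under loud assumptions: threads and in-process queues stand in for networked computers, and the names `Node` and `run_counter` are hypothetical. What it preserves from the definition is that the server's state is never read or written directly by the client; it changes only in response to messages.

```python
import queue
import threading


class Node:
    """A hypothetical node: private state plus an inbox for incoming messages."""

    def __init__(self, name):
        self.name = name
        self.inbox = queue.Queue()
        self.counter = 0  # private state, updated only when a message arrives

    def send(self, other, message):
        # Deliver a (sender, message) pair to the other node's inbox.
        other.inbox.put((self.name, message))


def run_counter(node, client):
    """Handle 'add' requests until 'stop' arrives, then reply with the total."""
    while True:
        sender, message = node.inbox.get()
        if message == "stop":
            node.send(client, ("total", node.counter))
            return
        kind, value = message
        if kind == "add":
            node.counter += value


client = Node("client")
server = Node("server")
worker = threading.Thread(target=run_counter, args=(server, client))
worker.start()

# The client never touches server.counter; it only sends messages.
for value in (1, 2, 3):
    client.send(server, ("add", value))
client.send(server, "stop")

sender, reply = client.inbox.get(timeout=5)
worker.join()
print(sender, reply)
```

In a real distributed system the inboxes would be replaced by network communication, and messages could be delayed, reordered, or lost, which is what the later course topics address.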

Features
Distributed Systems imply four main features:
1. Geographical Separation
2. No Common Physical Clock
3. No Common Physical Memory
4. Autonomy and Heterogeneity

Parallel vs. Distributed Systems
Distributed systems contrast with parallel systems, which entail:
1. Strong Coupling
2. A Common Physical Clock
3. A Shared Physical Memory
4. Homogeneity

Some Administrivia!

Introduction to Distributed Systems
Considered: a reasonably critical and comprehensive perspective. Thoughtful: a fluent, flexible and efficient perspective. Masterful: a powerful and illuminating
0. Introduction (1 Lecture)
1. Networking (2 Lectures)
2. Architectures (1 Lecture)
3. Remote Procedure Calls (3 Lectures)
4. Naming (2 Lectures)
5. Synchronization (3 Lectures)
6. Caching (3 Lectures)
7. Replication (3 Lectures)
8. Distributed Frameworks (4 Lectures)
9. Fault Tolerance (4 Lectures)

Course Objectives
The course aims to provide an in-depth understanding of, and hands-on experience with:
- How modern distributed systems meet the demands of contemporary distributed applications
- Distributed system programming models and analytics engines
- Principles on which distributed systems are optimized
- Principles on which distributed systems are based

15-440 Teaching Team
Instructor: Mohammad Hammoud (MHH)
Teaching Assistant: Tamim Jabban (TJ)
MHH Office Hours:
- Monday, 10:30-11:59AM
- Welcome when my office door is open
- By appointment
TJ Office Hours:
- Sunday and Tuesday, 9:30-11:59AM
- Welcome when his office door is open
- By appointment

Teaching Methods
26 Lectures:
- Motivate learning
- Provide a framework or roadmap to organize the information of the course
- Explain subjects and reinforce the critical big ideas
14 Recitations:
- Get you to reveal what you don’t understand, so that we can help you
- Allow you to practice skills you will need to become an expert

Assignments and Projects
Assignments: 6 required problem-solving and reading assignments
Projects: 4 large programming projects

Projects
For all the projects except the final one, the following rules apply:
- If you submit one day late, 25% will be deducted from your project score as a penalty
- If you are two days late, 50% will be deducted
- The project will not be graded (and you will receive a zero score) if you are more than two days late
- You will have a 3-grace-day quota

Assessment Methods
How do we measure learning?
Type (#): Weight
- Project (4): 40%
- Exam (2): 25% (10% Midterm & 15% Final)
- Problem Set (6): 15%
- Quiz: 10%
- Class and Recitation Participation and Attendance (40): 5%

Next Lecture: Networking, Part I
Questions?