Distributed systems [Fall 2015] G22.3033-002 Lec 1: Course Introduction.

Slides:



Advertisements
Similar presentations
Serverless Network File Systems. Network File Systems Allow sharing among independent file systems in a transparent manner Mounting a remote directory.
Advertisements

Spark: Cluster Computing with Working Sets
Business Continuity and DR, A Practical Implementation Mich Talebzadeh, Consultant, Deutsche Bank
Lecture 6 – Google File System (GFS) CSE 490h – Introduction to Distributed Computing, Winter 2008 Except as otherwise noted, the content of this presentation.
Distributed systems [Fall 2009] G Lec 1: Course Introduction & Lab Intro.
Based on last years lecture notes, used by Juha Takkinen.
©Silberschatz, Korth and Sudarshan18.1Database System Concepts Centralized Systems Run on a single computer system and do not interact with other computer.
Distributed (storage) systems G Lec 1: Course Introduction & Lab Intro.
DISTRIBUTED COMPUTING
Lecture 2 – MapReduce CPE 458 – Parallel Programming, Spring 2009 Except as otherwise noted, the content of this presentation is licensed under the Creative.
Google Distributed System and Hadoop Lakshmi Thyagarajan.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Distributed Systems Lecture 1: Overview CS425/CSE424/ECE428 Fall 2011 Nikita Borisov.
Distributed Data Stores – Facebook Presented by Ben Gooding University of Arkansas – April 21, 2015.
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
Introduction. Readings r Van Steen and Tanenbaum: 5.1 r Coulouris: 10.3.
CSC 456 Operating Systems Seminar Presentation (11/13/2012) Leon Weingard, Liang Xin The Google File System.
Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / st October, 2007.
Cloud Distributed Computing Environment Content of this lecture is primarily from the book “Hadoop, The Definite Guide 2/e)
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
Distributed systems [Fall 2014] G Lec 1: Course Introduction.
Lecture 0 Anish Arora CSE 6333 Introduction to Distributed Computing.
MapReduce: Hadoop Implementation. Outline MapReduce overview Applications of MapReduce Hadoop overview.
EXPOSE GOOGLE APP ENGINE AS TASKTRACKER NODES AND DATA NODES.
Hadoop/MapReduce Computing Paradigm 1 Shirish Agale.
Hadoop Hardware Infrastructure considerations ©2013 OpalSoft Big Data.
Distributed Systems and Security: An Introduction Brad Karp UCL Computer Science CS GZ03 / M th September, 2008.
1 COMPSCI 110 Operating Systems Who - Introductions How - Policies and Administrative Details Why - Objectives and Expectations What - Our Topic: Operating.
Introduction. Readings r Coulouris, Dollimore and Kindberg Distributed Systems: Concepts and Design Edn. 3 m Note: All figures from this book.
Course Information Andy Wang Operating Systems COP 4610 / CGS 5765.
Advanced Principles of Operating Systems (CE-403).
Operating Systems and Systems Programming CS162 Teaching Staff.
Operating Systems Lecture 1 Jinyang Li. Class goals Understand how an OS works by studying its: –Design principles –Implementation realities Gain some.
Welcome to CPS 210 Graduate Level Operating Systems –readings, discussions, and programming projects Systems Quals course –midterm and final exams Gateway.
Problem-solving on large-scale clusters: theory and applications Lecture 4: GFS & Course Wrap-up.
 Course Overview Distributed Systems IT332. Course Description  The course introduces the main principles underlying distributed systems: processes,
Unit 9: Distributing Computing & Networking Kaplan University 1.
HADOOP DISTRIBUTED FILE SYSTEM HDFS Reliability Based on “The Hadoop Distributed File System” K. Shvachko et al., MSST 2010 Michael Tsitrin 26/05/13.
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Distributed Computing Systems CSCI 4780/6780. Scalability ConceptExample Centralized servicesA single server for all users Centralized dataA single on-line.
6.894: Distributed Operating System Engineering Lecturers: Frans Kaashoek Robert Morris
Introduction to CS739: Distribution Systems UNIVERSITY of WISCONSIN-MADISON Computer Sciences Department CS 739 Distributed Systems Andrea C. Arpaci-Dusseau.
Distributed systems [Fall 2012] G Lec 1: Course Introduction & Lab Intro.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
CS426: Building Decentralized Systems Mahesh Balakrishnan.
Distributed Systems and Security: An Introduction Brad Karp and Steve Hailes UCL Computer Science CS Z03 / nd October, 2006.
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
Cloud Distributed Computing Environment Hadoop. Hadoop is an open-source software system that provides a distributed computing environment on cloud (data.
COT 4600 Operating Systems Fall 2010 Dan C. Marinescu Office: HEC 439 B Office hours: Tu-Th 3:30-4:30 PM.
Distributed File System. Outline Basic Concepts Current project Hadoop Distributed File System Future work Reference.
1 Student Date Time Wei Li Nov 30, 2015 Monday 9:00-9:25am Shubbhi Taneja Nov 30, 2015 Monday9:25-9:50am Rodrigo Sanandan Dec 2, 2015 Wednesday9:00-9:25am.
BIG DATA/ Hadoop Interview Questions.
COMP7330/7336 Advanced Parallel and Distributed Computing MapReduce - Introduction Dr. Xiao Qin Auburn University
Lecture 3 – MapReduce: Implementation CSE 490h – Introduction to Distributed Computing, Spring 2009 Except as otherwise noted, the content of this presentation.
COMPSCI 110 Operating Systems
CMSC 621: Advanced Operating Systems Advanced Operating Systems
Distributed systems [Fall 2010] G
Hadoop Aakash Kag What Why How 1.
Advanced Operating Systems
湖南大学-信息科学与工程学院-计算机与科学系
EECS 498 Introduction to Distributed Systems Fall 2017
EECS 498 Introduction to Distributed Systems Fall 2017
Cse 344 May 4th – Map/Reduce.
Distributed systems [Fall 2016] G Jinyang Li
Hadoop Technopoints.
Andy Wang Operating Systems COP 4610 / CGS 5765
Andy Wang Operating Systems COP 4610 / CGS 5765
Database System Architectures
MapReduce: Simplified Data Processing on Large Clusters
Presentation transcript:

Distributed systems [Fall 2015] G Lec 1: Course Introduction

Waitlist status Many people are wait-listed If you are not going to take the class, drop early to let others in

Class staff Instructor: Prof. Jinyang Li (me) –Office Hour: Thu 5-6pm (715 Bway Rm 708) Teaching Assistant: Lamont Nelson –Office Hour: TBA or by appointment (715 Bway Rm 707) Grader: Quanlu Zhang

Background What I assume you already know: –Familiar with OS & networking –Substantial programming experience –Concurrency and threading

Course readings No official textbook Lectures are based on research papers –Check webpage for schedules Useful reference books –Principles of Computer System Design. (Saltzer and Kaashoek) –Distributed Systems (Tanenbaum and Steen)

Meeting times & Lecture structure Tuesdays 5:10-7pm –With a 10-minute break in the middle Lecture will do basic concepts followed by paper discussion –Read assigned papers before lecture Instructional assistant will do a few optional recitations on labs.

Potential recitation times Friday 5-6pm Friday 6-7pm Monday 7-8pm Wednesday 7-8pm

Important addresses URL: –Check regularly for schedule We’ll use Piazza.com for making announcements and conducting discussion Please check piazza.com to enter your preferences for the recitation time

How are you evaluated? Participation 5% Labs 35% ( 5 programming assignments in Go) Quizzes 60% –mid-term and final (90 minutes each)

Using Piazza Please post all questions on Piazza You can post either anonymously or as yourself. We encourage you to answer others’ questions. –It counts as your class participation.

Questions?

What are distributed systems? Multiple hosts A local or wide area network Machines communicate to provide some service for applications

Why distributed systems? for ease-of-use Handle geographic separation Provide users (or applications) with location transparency: Examples: –Web: access information with a few “clicks” –Network file system: access files on remote servers as if they are on a local disk, share files among multiple computers

Why distributed systems? for availability Build a reliablesystem out of unreliable parts –Hardware can fail: power outage, disk failures, memory corruption, network switch failures… –Software can fail: bugs, mis-configuration, upgrade … –How to achieve availability? Examples: –Amazon’s S3 key-value store

Why distributed systems? for scalable capacity Aggregate the resources (CPU, bandwidth, disk) of many computers Examples? –CPU: MapReduce, Spark, Grid computing –Bandwidth: Akamai CDN, BitTorrent –Disk: Google file system, Hadoop File System

Why distributed systems? for modular functionality Only need to build a service to accomplish a single task well. –Authentication server –Backup server. Compose multiple simple services to achieve sophisticated functionality –A distributed file system: a block service + a meta-data lookup service

Challenges System design –What is the right interface or abstraction? –How to partition functions for scalability? Consistency –How to share data consistently among multiple readers/writers? Fault Tolerance –How to keep system available despite node or network failures?

Challenges (continued) Different deployment scenarios –Clusters –Wide area distribution –Sensor networks Implementation –How to maximize concurrency? –What’s the bottleneck? –How to reduce load on the bottleneck resource?

The downside A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.” -- Leslie Lamport Much more complex

Main topics in distributed systems

Distributed systems Local storage Network Local CPU Applications ( web search, Facebook, Whatsapp...) #1 Abstraction & Interface

Application users access your service via some interface An example, a storage service’s API: –File system (mkdir, readdir, write, read) –Database (create tables, SQL queries) –Disk (read block, write block) Conflicting goals: –Simple, Flexible, Efficient to implement

#2 System architecture Where to run the system? –Data center or wide area? How is system functionality spread across machines? –Client/server? –Peer-to-peer? –A single master?

#3: Fault Tolerance How to keep the system running when some machine is down? Dropbox operates on ~10,000 servers, how to read files when some are down? Replication: store the same data on multiple servers. Does the system still give “correct” data?

#4: Consistency Contract with apps/users about meaning of operations. Difficult due to: –Failure, multiple copies of data, concurrency E.g. how to keep 2 replicas “identical” –If one is down, it will miss updates –If net is broken, both might process different updates

#5 Performance Latency & Throughput To increase throughput, exploit parallelism –Many resources exist in multiples CPU cores, IO and CPU –Need ways to divide the load To reduce latency, –Figure out what takes time: queuing, network, storage, some expensive algorithm, many serial steps?