Operating Systems and The Cloud David E. Culler CS162 – Operating Systems and Systems Programming Lecture 39 December 1, 2014 Proj: CP 2 12/3.

Slides:



Advertisements
Similar presentations
The Datacenter Needs an Operating System Matei Zaharia, Benjamin Hindman, Andy Konwinski, Ali Ghodsi, Anthony Joseph, Randy Katz, Scott Shenker, Ion Stoica.
Advertisements

 Open source software framework designed for storage and processing of large scale data on clusters of commodity hardware  Created by Doug Cutting and.
Overview of MapReduce and Hadoop
MapReduce Online Created by: Rajesh Gadipuuri Modified by: Ying Lu.
THE DATACENTER NEEDS AN OPERATING SYSTEM MATEI ZAHARIA, BENJAMIN HINDMAN, ANDY KONWINSKI, ALI GHODSI, ANTHONY JOSEPH, RANDY KATZ, SCOTT SHENKER, ION STOICA.
Berkeley Data Analytics Stack (BDAS) Overview Ion Stoica UC Berkeley UC BERKELEY.
Lecture 18-1 Lecture 17-1 Computer Science 425 Distributed Systems CS 425 / ECE 428 Fall 2013 Hilfi Alkaff November 5, 2013 Lecture 21 Stream Processing.
Spark: Cluster Computing with Working Sets
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica Spark Fast, Interactive,
Discretized Streams: Fault-Tolerant Streaming Computation at Scale Wenting Wang 1.
Data-Intensive Computing with MapReduce/Pig Pramod Bhatotia MPI-SWS Distributed Systems – Winter Semester 2014.
Discretized Streams An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters Matei Zaharia, Tathagata Das, Haoyuan Li, Scott Shenker,
Distributed Computations
NPACI: National Partnership for Advanced Computational Infrastructure August 17-21, 1998 NPACI Parallel Computing Institute 1 Cluster Archtectures and.
Undergraduate Poster Presentation Match 31, 2015 Department of CSE, BUET, Dhaka, Bangladesh Wireless Sensor Network Integretion With Cloud Computing H.M.A.
April 30, 2014 Anthony D. Joseph CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing.
AMPCamp Introduction to Berkeley Data Analytics Systems (BDAS)
Google Distributed System and Hadoop Lakshmi Thyagarajan.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Apache Spark and the future of big data applications Eric Baldeschwieler.
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
Data Mining on the Web via Cloud Computing COMS E6125 Web Enhanced Information Management Presented By Hemanth Murthy.
CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing April 29, 2013 Anthony D. Joseph
By: Jeffrey Dean & Sanjay Ghemawat Presented by: Warunika Ranaweera Supervised by: Dr. Nalin Ranasinghe.
Tyson Condie.
Cloud Computing Ali Ghodsi UC Berkeley, AMPLab
CS162 Operating Systems and Systems Programming Lecture 24 Capstone: Cloud Computing December 2, 2013 Anthony D. Joseph and John Canny
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
Presented by CH.Anusha.  Apache Hadoop framework  HDFS and MapReduce  Hadoop distributed file system  JobTracker and TaskTracker  Apache Hadoop NextGen.
MapReduce: Hadoop Implementation. Outline MapReduce overview Applications of MapReduce Hadoop overview.
Hadoop/MapReduce Computing Paradigm 1 Shirish Agale.
Introduction to Hadoop and HDFS
f ACT s  Data intensive applications with Petabytes of data  Web pages billion web pages x 20KB = 400+ terabytes  One computer can read
Contents HADOOP INTRODUCTION AND CONCEPTUAL OVERVIEW TERMINOLOGY QUICK TOUR OF CLOUDERA MANAGER.
MapReduce M/R slides adapted from those of Jeff Dean’s.
Introduction to Hadoop Owen O’Malley Yahoo!, Grid Team
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
Hadoop implementation of MapReduce computational model Ján Vaňo.
Resilient Distributed Datasets: A Fault- Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Matthew Winter and Ned Shawa
Operating Systems and The Cloud, Part II: Search => Cluster Apps => Scalable Machine Learning David E. Culler CS162 – Operating Systems and Systems Programming.
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
{ Tanya Chaturvedi MBA(ISM) Hadoop is a software framework for distributed processing of large datasets across large clusters of computers.
Cloud Distributed Computing Environment Hadoop. Hadoop is an open-source software system that provides a distributed computing environment on cloud (data.
Next Generation of Apache Hadoop MapReduce Owen
Spark System Background Matei Zaharia  [June HotCloud ]  Spark: Cluster Computing with Working Sets  [April NSDI.
BIG DATA/ Hadoop Interview Questions.
© 2007 IBM Corporation IBM Software Strategy Group IBM Google Announcement on Internet-Scale Computing (“Cloud Computing Model”) Oct 8, 2007 IBM Confidential.
Resilient Distributed Datasets A Fault-Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
Big Data is a Big Deal!.
Hadoop Aakash Kag What Why How 1.
Introduction to Distributed Platforms
Distributed Programming in “Big Data” Systems Pramod Bhatotia wp
Spark Presentation.
Introduction to MapReduce and Hadoop
Hadoop Clusters Tess Fulkerson.
Apache Spark Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing Aditya Waghaye October 3, 2016 CS848 – University.
Introduction to Spark.
Cse 344 May 4th – Map/Reduce.
CS110: Discussion about Spark
Hadoop Technopoints.
Overview of big data tools
Discretized Streams: A Fault-Tolerant Model for Scalable Stream Processing Zaharia, et al (2012)
Apache Hadoop and Spark
Big-Data Analytics with Azure HDInsight
MapReduce: Simplified Data Processing on Large Clusters
Lecture 29: Distributed Systems
CS639: Data Management for Data Science
Presentation transcript:

Operating Systems and The Cloud David E. Culler CS162 – Operating Systems and Systems Programming Lecture 39 December 1, 2014 Proj: CP 2 12/3

Goals Today Give you a sense of kind of operating systems issues that arise in The Cloud Encourage you to think about graduate studies and creating what is out beyond what you see around you … 12/1/14UCB CS162 Fa14 L39 2

The Datacenter is the new Computer ?? “The datacenter as a computer” is still young –Complete systems as building blocks (PC+Unix+HTTP+SQL+ …) –Higher Level Systems formed as Clusters, e.g., Hadoop cluster –Scale => More reliable than its components –Innovation => Rapid (ease of) development, Predictable Behavior despite variations in demand, etc. = ? 12/1/14UCB CS162 Fa14 L39 3

Datacenter/Cloud Computing OS ??? If the datacenter/cloud is the new computer, what is its Operating System? –Not the host OS for the individual nodes, but for the millions of nodes that form the ensemble of quasi-distributed resources ! Will it be as much of an enabler as the LAMP stack was to the.com boom ? Open source stack for every Web 2.0 company: –Linux OS –Apache web server –MySQL, MariaDB or MongoDB DBMS –PHP, Perl, or Python languages for dynamic web pages 12/1/14UCB CS162 Fa14 L39 4

Classical Operating Systems Data sharing –Inter-Process Communication, RPC, files, pipes, … Programming Abstractions –Storage & I/O Resources, Libraries (libc), system calls, … Multiplexing of resources –Scheduling, virtual memory, file allocation/protection, … 12/1/14UCB CS162 Fa14 L39 5

Datacenter/Cloud Operating System Data sharing –Google File System, key/value stores –Apache project: Hadoop Distributed File System Programming Abstractions –Google MapReduce –Apache projects: Hadoop, Pig, Hive, Spark, … –Nyad, Driad, … Multiplexing of resources –Apache projects: Mesos, YARN (MapReduce v2), ZooKeeper, BookKeeper, … 12/1/14UCB CS162 Fa14 L39 6

Google Cloud Infrastructure Google File System (GFS), 2003 –Distributed File System for entire cluster –Single namespace Google MapReduce (MR), 2004 –Runs queries/jobs on data –Manages work distribution & fault- tolerance –Colocated with file system Apache open source versions: Hadoop DFS and Hadoop MR 12/1/14UCB CS162 Fa14 L39 7

GFS/HDFS Insights Petabyte storage –Files split into large blocks (128 MB) and replicated across many nodes –Big blocks allow high throughput sequential reads/writes Data striped on hundreds/thousands of servers –Scan 100 TB on 1 50 MB/s = 24 days –Scan on 1000-node cluster = 35 minutes Failures will be the norm –Mean time between failures for 1 node = 3 years –Mean time between failures for 1000 nodes = 1 day Use commodity hardware –Failures are the norm anyway, buy cheaper hardware No complicated consistency models –Single writer, append-only data 12/1/14UCB CS162 Fa14 L39 8

MapReduce Insights Restricted key-value model –Same fine-grained operation (Map & Reduce) repeated on huge, distributed (within DC) data –Operations must be deterministic –Operations must be idempotent/no side effects –Only communication is through the shuffle –Operation (Map & Reduce) output saved (on disk) 12/1/14UCB CS162 Fa14 L39 9

What is (was) MapReduce Used For? At Google: –Index building for Google Search –Article clustering for Google News –Statistical machine translation –… At Yahoo!: –Index building for Yahoo! Search –Spam detection for Yahoo! Mail –… At Facebook: –Data mining –Ad optimization –Spam detection –… 12/1/14UCB CS162 Fa14 L39 10

A Time-Travel Perspective 12/1/14UCB CS162 Fa14 L /2004

Research as “Time Travel” Imagine a technologically plausible future Create an approximation of that vision using technology that exists. Discover what is True in that world –Empirical experience »Bashing your head, stubbing your toe, reaching epiphany –Quantitative measurement and analysis –Analytics and Foundations Courage to ‘break trail’ and discipline to do the hard science 12 12/1/14UCB CS162 Fa14 L39

NOW – Scalable Internet Service Cluster Design 13 VAX clusters 12/1/14UCB CS162 Fa14 L39

1993 Massively Parallel Processor is King 12/1/14UCB CS162 Fa14 L39 14

NOW – Scalable High Performance Clusters 15 GSC+ => PCI => ePCI … 10m Ethernet, FDDI, ATM, Myrinet, … VIA, Fast Ethernet, => infiniband, gigEtherNet 12/1/14UCB CS162 Fa14 L39

NOW – Scalable High Performance Clusters 16 12/1/14UCB CS162 Fa14 L39

17 UltraSparc/Myrinet NOW Active Message: Ultra-fast user-level RPC When remote memory is closer than local disk … Global Layer system built over local systems –Remote (parallel) execution, Scheduling, Uniform Naming –xFS – cluster-wide p2p file system –Network Virtual Memory 12/1/14

Inktomi – Fast Massive Web Search Fiat Lux - High Dynamic Range Imaging 18 Paul Gauthier Paul Debevec Lycos infoseek 12/1/14UCB CS162 Fa14 L39

inktomi.berkeley.edu World’s 1 st Massive AND Fast search engine inktomi.com 12/1/14UCB CS162 Fa14 L39

World Record Sort, 1 st Cluster on Top Distributed File Storage stripped over all the disks with fast communication. 12/1/14UCB CS162 Fa14 L39

21 Massive Cheap Storage Serving Fine Art at 12/1/14

… google.com 22 N0 $’ s in Search Big $ ’s in caches ??? $ ’s in mobile N0 $’ s in Search Big $ ’s in caches ??? $ ’s in mobile Yahoo moves from inktomi to Google 12/1/14UCB CS162 Fa14 L39

meanwhile Clusters of SMPs 12/1/14UCB CS162 Fa14 L39 23

Expeditions to the 21 st Century 24 12/1/14UCB CS162 Fa14 L39

Internet Services to support small mobile devices 25 12/1/14UCB CS162 Fa14 L39

Ninja Internet Service Architecture 26 12/1/14UCB CS162 Fa14 L39

Startup of the Week … 27 12/1/14UCB CS162 Fa14 L39

… and … 28 12/1/14UCB CS162 Fa14 L39

29 Gribble, 99 12/1/14UCB CS162 Fa14 L39

Security & Privacy in a Pervasive Web 30 12/1/14UCB CS162 Fa14 L39

A decade before the cloud 31 12/1/14UCB CS162 Fa14 L39

99.9 Club 32 12/1/14UCB CS162 Fa14 L39

10 th ANNIVERSARY REUNION 2008 Network of Workstations (NOW): NOW Team 2008: L-R, front row: Prof. Tom Anderson †‡ (Washington), Prof. Rich Martin ‡ (Rutgers), Prof. David Culler* †‡ (Berkeley), Prof. David Patterson* † (Berkeley). Middle row: Eric Anderson (HP Labs), Prof. Mike Dahlin †‡ (Texas), Prof. Armando Fox ‡ (Berkeley), Drew Roselli (Microsoft), Prof. Andrea Arpaci-Dusseau ‡ (Wisconsin), Lok Liu, Joe Hsu. Last row: Prof. Matt Welsh ‡ (Harvard/Google), Eric Fraser, Chad Yoshikawa, Prof. Eric Brewer* †‡ (Berkeley), Prof. Jeanna Neefe Matthews (Clarkson), Prof. Amin Vahdat ‡ (UCSD), Prof. Remzi Arpaci-Dusseau (Wisconsin), Prof. Steve Lumetta (Illinois). *3 NAE members † 4 ACM fellows ‡ 9 NSF CAREER Awards Google 12/1/14UCB CS162 Fa14 L39

Time Travel It’s not just storing it, it’s what you do with the data 12/1/14UCB CS162 Fa14 L39 34

The Data Deluge Billions of users connected through the net –WWW, Facebook, twitter, cell phones, … –80% of the data on FB was produced last year Clock Rates stalled Storage getting cheaper –Store more data! 12/1/14UCB CS162 Fa14 L39 35

Data Grows Faster than Moore’s Law Projected Growth Increase over /1/14UCB CS162 Fa14 L39 36

Complex Questions Hard questions –What is the impact on traffic and home prices of building a new ramp? Detect real-time events –Is there a cyber attack going on? Open-ended questions –How many supernovae happened last year? 12/1/14UCB CS162 Fa14 L39 37

MapReduce Pros Distribution is completely transparent –Not a single line of distributed programming (ease, correctness) Automatic fault-tolerance –Determinism enables running failed tasks somewhere else again –Saved intermediate data enables just re-running failed reducers Automatic scaling –As operations as side-effect free, they can be distributed to any number of machines dynamically Automatic load-balancing –Move tasks and speculatively execute duplicate copies of slow tasks (stragglers) 12/1/14UCB CS162 Fa14 L39 38

MapReduce Cons Restricted programming model –Not always natural to express problems in this model –Low-level coding necessary –Little support for iterative jobs (lots of disk access) –High-latency (batch processing) Addressed by follow-up research and Apache projects –Pig and Hive for high-level coding –Spark for iterative and low-latency jobs 12/1/14UCB CS162 Fa14 L39 39

UCB / Apache Spark Motivation Complex jobs, interactive queries and online processing all need one thing that MR lacks: Efficient primitives for data sharing Stage 1 Stage 2 Stage 3 Iterative job Query 1 Query 2 Query 3 Interactive mining Job 1 Job 2 … Stream processing 12/1/14UCB CS162 Fa14 L39 40

Spark Motivation Complex jobs, interactive queries and online processing all need one thing that MR lacks: Efficient primitives for data sharing Stage 1 Stage 2 Stage 3 Iterative job Query 1 Query 2 Query 3 Interactive mining Job 1 Job 2 … Stream processing Problem: in MR, the only way to share data across jobs is using stable storage (e.g. file system)  slow! 12/1/14UCB CS162 Fa14 L39 41

Examples iter. 1 iter Input HDFS read HDFS write HDFS read HDFS write Input query 1 query 2 query 3 result 1 result 2 result 3... HDFS read Opportunity: DRAM is getting cheaper  use main memory for intermediate results instead of disks 12/1/14UCB CS162 Fa14 L39 42

iter. 1 iter Input Goal: In-Memory Data Sharing Distributed memory Input query 1 query 2 query 3... one-time processing × faster than network and disk 12/1/14UCB CS162 Fa14 L39 43

Solution: Resilient Distributed Datasets (RDDs) Partitioned collections of records that can be stored in memory across the cluster Manipulated through a diverse set of transformations (map, filter, join, etc) Fault recovery without costly replication –Remember the series of transformations that built an RDD (its lineage) to recompute lost data 12/1/14UCB CS162 Fa14 L39 44

12/1/14UCB CS162 Fa14 L39 45

Velox Model Serving Tachyon Spark Streamin g SparkSQ L BlinkDB GraphX MLlib MLBa se Spark R Cancer Genomics, Energy Debugging, Smart Buildings Sample Clean Apache Spark Berkeley Data Analytics Stack (open source software) HDFS, S3, … Apache MesosYarn Resource Virtualization Storage Processing Engine Access and Interfaces In-house Apps Tachyon 12/1/14UCB CS162 Fa14 L39 46