MapReduce CSE 454 Slides based on those by Jeff Dean, Sanjay Ghemawat, and Dan Weld.

Slides:



Advertisements
Similar presentations
Lecture 12: MapReduce: Simplified Data Processing on Large Clusters Xiaowei Yang (Duke University)
Advertisements

MapReduce: Simplified Data Processing on Large Clusters These are slides from Dan Weld’s class at U. Washington (who in turn made his slides based on those.
These are slides with a history. I found them on the web... They are apparently based on Dan Weld’s class at U. Washington, (who in turn based his slides.
MapReduce. 2 (2012) Average Searches Per Day: 5,134,000,000 (2012) Average Searches Per Day: 5,134,000,000.
Distributed Computations
MapReduce: Simplified Data Processing on Large Clusters Cloud Computing Seminar SEECS, NUST By Dr. Zahid Anwar.
CS 345A Data Mining MapReduce. Single-node architecture Memory Disk CPU Machine Learning, Statistics “Classical” Data Mining.
An Introduction to MapReduce: Abstractions and Beyond! -by- Timothy Carlstrom Joshua Dick Gerard Dwan Eric Griffel Zachary Kleinfeld Peter Lucia Evan May.
Map Reduce Allan Jefferson Armando Gonçalves Rocir Leite Filipe??
MapReduce Dean and Ghemawat. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM, Vol. 51, No. 1, January Shahram.
CS 425 / ECE 428 Distributed Systems Fall 2014 Indranil Gupta (Indy) Lecture 3: Mapreduce and Hadoop All slides © IG.
Distributed Computations MapReduce
L22: SC Report, Map Reduce November 23, Map Reduce What is MapReduce? Example computing environment How it works Fault Tolerance Debugging Performance.
MapReduce Simplified Data Processing On large Clusters Jeffery Dean and Sanjay Ghemawat.
Introduction to Google MapReduce WING Group Meeting 13 Oct 2006 Hendra Setiawan.
MapReduce : Simplified Data Processing on Large Clusters Hongwei Wang & Sihuizi Jin & Yajing Zhang
U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science U NIVERSITY OF M ASSACHUSETTS A MHERST Department of Computer Science MapReduce:
Map Reduce Architecture
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
SIDDHARTH MEHTA PURSUING MASTERS IN COMPUTER SCIENCE (FALL 2008) INTERESTS: SYSTEMS, WEB.
MapReduce.
Introduction to Parallel Programming MapReduce Except where otherwise noted all portions of this work are Copyright (c) 2007 Google and are licensed under.
By: Jeffrey Dean & Sanjay Ghemawat Presented by: Warunika Ranaweera Supervised by: Dr. Nalin Ranasinghe.
MapReduce. Web data sets can be very large – Tens to hundreds of terabytes Cannot mine on a single server Standard architecture emerging: – Cluster of.
Leroy Garcia. What is Map Reduce?  A patented programming model developed by Google Derived from LISP and other forms of functional programming  Used.
Google MapReduce Simplified Data Processing on Large Clusters Jeff Dean, Sanjay Ghemawat Google, Inc. Presented by Conroy Whitney 4 th year CS – Web Development.
Süleyman Fatih GİRİŞ CONTENT 1. Introduction 2. Programming Model 2.1 Example 2.2 More Examples 3. Implementation 3.1 ExecutionOverview 3.2.
Take a Close Look at MapReduce Xuanhua Shi. Acknowledgement  Most of the slides are from Dr. Bing Chen,
Map Reduce for data-intensive computing (Some of the content is adapted from the original authors’ talk at OSDI 04)
MapReduce: Simplified Data Processing on Large Clusters Jeffrey Dean and Sanjay Ghemawat.
1 The Map-Reduce Framework Compiled by Mark Silberstein, using slides from Dan Weld’s class at U. Washington, Yaniv Carmeli and some other.
The Technology Behind. The World Wide Web In July 2008, Google announced that they found 1 trillion unique webpages! Billions of new web pages appear.
MapReduce: Hadoop Implementation. Outline MapReduce overview Applications of MapReduce Hadoop overview.
MAP REDUCE : SIMPLIFIED DATA PROCESSING ON LARGE CLUSTERS Presented by: Simarpreet Gill.
SLIDE 1IS 240 – Spring 2013 Prof. Ray Larson University of California, Berkeley School of Information Principles of Information Retrieval Lecture.
MapReduce How to painlessly process terabytes of data.
MapReduce M/R slides adapted from those of Jeff Dean’s.
Mass Data Processing Technology on Large Scale Clusters Summer, 2007, Tsinghua University All course material (slides, labs, etc) is licensed under the.
MapReduce Kristof Bamps Wouter Deroey. Outline Problem overview MapReduce o overview o implementation o refinements o conclusion.
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
MapReduce Programming Model. HP Cluster Computing Challenges  Programmability: need to parallelize algorithms manually  Must look at problems from parallel.
SLIDE 1IS 240 – Spring 2013 MapReduce, HBase, and Hive University of California, Berkeley School of Information IS 257: Database Management.
MapReduce and the New Software Stack CHAPTER 2 1.
Google Cluster Computing Faculty Training Workshop Module III: Nutch This presentation © Michael Cafarella Redistributed under the Creative Commons Attribution.
By Jeff Dean & Sanjay Ghemawat Google Inc. OSDI 2004 Presented by : Mohit Deopujari.
MapReduce: Simplified Data Processing on Large Clusters Lim JunSeok.
MapReduce : Simplified Data Processing on Large Clusters P 謝光昱 P 陳志豪 Operating Systems Design and Implementation 2004 Jeffrey Dean, Sanjay.
C-Store: MapReduce Jianlin Feng School of Software SUN YAT-SEN UNIVERSITY May. 22, 2009.
Map Reduce. Functional Programming Review r Functional operations do not modify data structures: They always create new ones r Original data still exists.
MapReduce: Simplified Data Processing on Large Clusters By Dinesh Dharme.
MapReduce: simplified data processing on large clusters Jeffrey Dean and Sanjay Ghemawat.
Data Parallel and Graph Parallel Systems for Large-scale Data Processing Presenter: Kun Li.
MapReduce: Simplified Data Processing on Large Cluster Authors: Jeffrey Dean and Sanjay Ghemawat Presented by: Yang Liu, University of Michigan EECS 582.
Dr Zahoor Tanoli COMSATS.  Certainly not suitable to process huge volumes of scalable data  Creates too much of a bottleneck.
MapReduce: Simplied Data Processing on Large Clusters Written By: Jeffrey Dean and Sanjay Ghemawat Presented By: Manoher Shatha & Naveen Kumar Ratkal.
1 Student Date Time Wei Li Nov 30, 2015 Monday 9:00-9:25am Shubbhi Taneja Nov 30, 2015 Monday9:25-9:50am Rodrigo Sanandan Dec 2, 2015 Wednesday9:00-9:25am.
COMP7330/7336 Advanced Parallel and Distributed Computing MapReduce - Introduction Dr. Xiao Qin Auburn University
MapReduce: Simplified Data Processing on Large Clusters Jeff Dean, Sanjay Ghemawat Google, Inc.
Lecture 4. MapReduce Instructor: Weidong Shi (Larry), PhD
Cloud Computing.
Introduction to Google MapReduce
MapReduce: Simplified Data Processing on Large Clusters
The Map-Reduce Framework
Auburn University COMP7330/7336 Advanced Parallel and Distributed Computing MapReduce - Introduction Dr. Xiao Qin Auburn.
MapReduce Computing Paradigm Basics Fall 2013 Elke A. Rundensteiner
MapReduce Simplied Data Processing on Large Clusters
CS427 Multicore Architecture and Parallel Computing
MapReduce: Simplified Data Processing on Large Clusters
MapReduce: Simplified Data Processing on Large Clusters
Presentation transcript:

MapReduce CSE 454 Slides based on those by Jeff Dean, Sanjay Ghemawat, and Dan Weld

What’s the Problem? So far… Classification + IR Simple enough, counting a bunch of words… 100 TB datasets Scanning on 1 node – 23 days On 1000 nodes – 33 mins Sounds great, but what about MTBF? 1 node – 3 years 1000 nodes – 1 day Source: Introduction to Hadoop (O’Malley)

Motivation Large-Scale Data Processing Want to use 1000s of CPUs But don’t want the hassle of managing things MapReduce provides Automatic parallelization & distribution Fault tolerance I/O scheduling Monitoring & status updates

MapReduce Functional programming (e.g. Lisp) Many problems can be phrased this way Easy to distribute across nodes Nice retry/failure semantics

Map in Lisp (Scheme) (map f list [list 2 list 3 …]) (map square ‘( )) ( ) (reduce + ‘( )) (+ 16 (+ 9 (+ 4 1) ) ) 30 (reduce + (map square (map – l 1 l 2 )))) Unary operator Binary operator

MapReduce ala Google map(key, val) is run on each item in set emits new-key / new-val pairs reduce(key, vals) is run for each unique key emitted by map() emits final output

Count Words in Documents Input consists of (url, contents) pairs map(key=url, val=contents): For each word w in contents, emit (w, “1”) reduce(key=word, values=uniq_counts): Sum all “1”s in values list Emit result “(word, sum)”

Count, Illustrated map(key=url, val=contents): For each word w in contents, emit (w, “1”) reduce(key=word, values=uniq_counts): Sum all “1”s in values list Emit result “(word, sum)” see bob throw see spot run see 1 bob1 run1 see 1 spot 1 throw1 bob1 run1 see 2 spot 1 throw1

Grep Input consists of (url+offset, single line) map(key=url+offset, val=line): If contents matches regexp, emit (line, “1”) reduce(key=line, values=uniq_counts): Don’t do anything; just emit line

Reverse Web-Link Graph Map For each URL linking to target, … Output pairs Reduce Concatenate list of all source URLs Outputs: pairs

Inverted Index Map Reduce

Example uses: distributed grep distributed sort web link-graph reversal term-vector / hostweb access log statsinverted index construction document clusteringmachine learningstatistical machine translation... Model is Widely Applicable MapReduce Programs In Google Source Tree

Other Users Open Source version Hadoop PowerSet on Amazon’s EC2 system Microsoft Dryad Yahoo (10k machines, 1 petabyte) Hadoop + Pig (related to Google’s Sawzall) Kosmix KFS (open C++ implementation)

Typical cluster: 100s/1000s of 2-CPU x86 machines, 2-4 GB of memory Limited bisection bandwidth Storage is on local IDE disks GFS: distributed file system manages data (SOSP'03) Job scheduling system: jobs made up of tasks, scheduler assigns tasks to machines Implementation is a C++ library linked into user programs Implementation Overview Image: Introduction to Hadoop (O’Malley)

GFS (HDFS) Architecture Client Master Many { Chunk Server Chunk Server Chunk Server } metadata only data only

Execution How is this distributed?  Partition input key/value pairs into chunks, run map() tasks in parallel  After all map()s are complete, consolidate all emitted values for each unique emitted key  Now partition space of output map keys, and run reduce() in parallel If map() or reduce() fails, re-execute!

Job Processing JobTracker TaskTracker 0 TaskTracker 1TaskTracker 2 TaskTracker 3TaskTracker 4TaskTracker 5 1.Client submits “grep” job, indicating code and input files 2.JobTracker breaks input file into k chunks, (in this case 6). Assigns work to trackers. 3.After map(), tasktrackers exchange map-output to build reduce() keyspace 4.JobTracker breaks reduce() keyspace into m chunks (in this case 6). Assigns work. 5.reduce() output may go to NDFS “grep”

Map Design Given a set of keys {k 1 …k n } how do we map? Use a hash(k 1 ) Actually hash(k 1 ) mod R How about getting all URLs from a website to one place? hash(domain(k 1 )) mod R Issues? Assumes equal distribution of data (that’s why hashes work well) Powerlaw data distributions (e.g. searches for “google” visits to “ etc.) Combiner function

Task Granularity & Pipelining Fine granularity tasks: map tasks >> machines Minimizes time for fault recovery Can pipeline shuffling (sorting) with map execution Better dynamic load balancing Often use 200,000 map & 5000 reduce tasks Running on 2000 machines

Worker Failure Handled via re-execution Detect failure via periodic heartbeats Task completion committed through master Robust: Lost 1600/1800 machines once  finished ok Master Failure Could handle, … ? But don't yet (master failure unlikely) Fault Tolerance

Re-execute completed + in-progress map tasks Re-execute in progress reduce tasks Refinement: Redundant Execution

Slow workers significantly delay completion time Other jobs consuming resources on machine Bad disks w/ soft errors transfer data slowly Weird things: processor caches disabled (!!) Solution: Near end of phase, spawn backup tasks Whichever one finishes first "wins" Dramatically shortens job completion time Refinement: Redundant Execution

Normal No backup tasks 200 processes killed MR_Sort Backup tasks reduce job completion time a lot! System deals well with failures

Refinement: Locality Optimization Master scheduling policy: Asks GFS for locations of replicas of input file blocks Map tasks typically split into 64MB (GFS block size) Map tasks scheduled so GFS input block replica are on same machine or same rack Effect Thousands of machines read input at local disk speed Without this, rack switches limit read rate

MR_Grep Locality optimization helps: 1800 machines read 1 TB at peak ~31 GB/s W/out this, rack switches would limit to 10 GB/s Startup overhead is significant for short jobs

Refinement Skipping Bad Records Map/Reduce functions sometimes fail for particular inputs Best solution is to debug & fix Not always possible ~ third-party source libraries On segmentation fault: Send UDP packet to master from signal handler Include sequence number of record being processed If master sees two failures for same record: Next worker is told to skip the record

Sorting guarantees within each reduce partition Final file (combination of reduce partitions) not necessarily sorted! Compression of intermediate data Local execution for debugging/testing User-defined counters Other Refinements

Rewrote Google's production indexing System using MapReduce Set of 10, 14, 17, 21, 24 MapReduce operations New code is simpler, easier to understand 3800 lines C++  700 MapReduce handles failures, slow machines Easy to make indexing faster add more machines Experience

Detect website crawl issues Adoption of new technologies (micro- formats, web 2.0, etc.) Yahoo Experience (OSCON’07)

Related Work Programming model inspired by functional language primitives Partitioning/shuffling similar to many large-scale sorting systems NOW-Sort ['97] Re-execution for fault tolerance BAD-FS ['04] and TACC ['97] Locality optimization has parallels with Active Disks/Diamond work Active Disks ['01], Diamond ['04] Backup tasks similar to Eager Scheduling in Charlotte system Charlotte ['96] Dynamic load balancing solves similar problem as River's distributed queues River ['99]

Comparison RDBMSMapReduc e BigTable access+ online- offline+ online distribution- custom+ native partitioning- static+ dynamic updates- slower+ fastest+ faster schemas- static+ dynamic- static joins- slow/hard+ fast/easy- slow/hard Source: Cutting, OSCON’07 Presentation

Hadoop…

Some Mappings Map/Collect/Reduce  Map/Collect/Reduce GFS  HDFS Namenodes + datanodes Master  TaskTracker/JobTracker

Other Important Objects MapReduceBase Basic implementation, extend for implementing Mappers and Reducers RecordReader/RecordWriter Define how key/values are read and written to file (lines, binary data, etc.) FileSplit What we read from

SequenceFiles Binary key/value pairs Can be: Uncompressed Values compressed Both compressed How?

Getting data in bin/hadoop dfs –ls List bin/hadoop dfs –copyFromLocal f1 f2 Copy etc. See:

Example public class WCMap extends MapReduceBase implements Mapper { private static final IntWritable ONE = new IntWritable(1); public void map(WritableComparable key, Writable value, OutputCollector output, Reporter reporter) throws IOException { StringTokenizer itr = new StringTokenizer(value.toString()); while (itr.hasMoreTokens()) { output.collect(new Text(itr.next()), ONE); } Source: Introduction to Hadoop (O’Malley)

Example public class WCReduce extends MapReduceBase implements Reducer { public void reduce(WritableComparable key, Iterator values, OutputCollector output, Reporter reporter) throws IOException { int sum = 0; while (values.hasNext()) { sum += ((IntWritable) values.next()).get(); } output.collect(key, new IntWritable(sum)); } Source: Introduction to Hadoop (O’Malley)

Example public static void main(String[] args) throws IOException { JobConf conf = new JobConf(WordCount.class); conf.setJobName("wordcount"); conf.setOutputKeyClass(Text.class); conf.setOutputValueClass(IntWritable.class); conf.setMapperClass(WCMap.class); conf.setCombinerClass(WCReduce.class); conf.setReducerClass(WCReduce.class); conf.setInputPath(new Path(args[0])); conf.setOutputPath(new Path(args[1])); JobClient.runJob(conf); } Source: Introduction to Hadoop (O’Malley)

Higher level abstractions Pig – SQL like (pig latin) input = LOAD 'documents' USING StorageText(); words = FOREACH input GENERATE FLATTEN(Tokenize(*)); grouped = GROUP words BY $0; counts = FOREACH grouped GENERATE group, COUNT(words);

Conclusions MapReduce proven to be useful abstraction Greatly simplifies large-scale computations “Fun” to use: focus on problem, let library deal w/ messy details