MapReduce: Simplified Data Processing on Large Clusters
Lim JunSeok (2009-21146)
Contents
1. Introduction
2. Programming Model
3. Structure
4. Performance & Experience
5. Conclusion
Introduction
What is MapReduce?
A simple and powerful interface that enables automatic parallelization and distribution of large-scale computations.
A programming model that
- executes processing in a distributed manner
- exploits a large set of commodity computers for large data sets (> 1 TB)
with an underlying runtime system that
- parallelizes the computation across large-scale clusters of machines
- handles machine failures
- schedules inter-machine communication to make efficient use of the network and disks
Motivation
Want to process lots of data (> 1 TB)
- Raw data: crawled documents, Web request logs, …
- Derived data: inverted indices, summaries of the number of pages, the set of most frequent queries in a given day
Want to parallelize across hundreds/thousands of CPUs
And want to make this easy
(Figures: Google data centers, distributed file system, The Digital Universe 2009-2020)
Motivation
Application: sifting through large amounts of data
Used for
- generating the Google search index
- clustering problems for Google News and Froogle products
- extraction of data used to produce reports of popular queries
- large-scale graph computation
- large-scale machine learning
- …
(Figures: Google Search, PageRank, machine learning)
Motivation
Platform: clusters of inexpensive machines
- Commodity computers (15,000 machines in 2003)
- Scales to large clusters: thousands of machines
- Data is distributed and replicated across the machines of the cluster
- Recovers from machine failures
- Examples: Hadoop, Google File System
Programming Model
MapReduce Programming Model
(Diagram: Map → partitioning function → Reduce)
MapReduce Programming Model
Map phase: local computation
- Process each record independently and locally
Reduce phase: aggregation
- Aggregate the filtered output
(Diagram: local storage → Map → Reduce → result, running on commodity computers)
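The model boils down to two user-supplied functions plus a runtime that shuffles intermediate pairs by key. A minimal single-machine sketch in Python; the run_mapreduce driver and its names are illustrative, not the paper's C++ interface:

```python
from collections import defaultdict

# The user supplies two functions:
#   map_fn(k1, v1)         -> list of intermediate (k2, v2) pairs
#   reduce_fn(k2, [v2, ..]) -> aggregated output value
# Splitting, scheduling, shuffling, and fault tolerance are the runtime's job.

def run_mapreduce(records, map_fn, reduce_fn):
    """Sequential stand-in for the distributed runtime."""
    # Map phase: process each input record independently.
    intermediate = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            intermediate[out_key].append(out_value)
    # Reduce phase: aggregate all values that share a key.
    return {key: reduce_fn(key, values) for key, values in intermediate.items()}
```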
Example: Word Counting
File 1: "Hello World Bye SQL"
File 2: "Hello Map Bye Reduce"
(Diagram: Map procedure → partitioning function → Reduce procedure)
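A word-count sketch over the two example files; the function names are illustrative and the grouping step stands in for the runtime's shuffle:

```python
from collections import defaultdict

files = {
    "file1": "Hello World Bye SQL",
    "file2": "Hello Map Bye Reduce",
}

def word_count_map(filename, contents):
    # Emit (word, 1) for every word in the document.
    return [(word, 1) for word in contents.split()]

def word_count_reduce(word, counts):
    # Sum all counts emitted for the same word.
    return sum(counts)

# Shuffle: group intermediate pairs by key (done by the runtime in MapReduce).
grouped = defaultdict(list)
for name, text in files.items():
    for word, count in word_count_map(name, text):
        grouped[word].append(count)

result = {word: word_count_reduce(word, counts) for word, counts in grouped.items()}
print(result)  # {'Hello': 2, 'World': 1, 'Bye': 2, 'SQL': 1, 'Map': 1, 'Reduce': 1}
```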
Example: PageRank
PageRank review: link analysis algorithm
Example: PageRank
Key ideas for MapReduce
- The PageRank calculation depends only on the PageRank values of the previous iteration
- The PageRank calculation of each Web page can be processed in parallel
Algorithm
- Map: provide each page's PageRank 'fragments' to the pages it links to
- Reduce: sum up the PageRank fragments for each page (a code sketch follows the Reduce-phase slide below)
Example: PageRank
Key ideas for MapReduce (figure)
Example: PageRank
PageRank calculation with 4 pages (figure)
Example: PageRank
Map phase: provide each page's PageRank 'fragments' to the links
(Figures: PageRank fragment computation of page 1 and page 2)
Example: PageRank
Map phase: provide each page's PageRank 'fragments' to the links
(Figures: PageRank fragment computation of page 3 and page 4)
Example: PageRank
Reduce phase: sum up the PageRank fragments for each page
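A minimal sketch of one PageRank iteration expressed as map and reduce steps. The 4-page link graph, the damping factor, and the function names are assumptions for illustration; they are not taken from the slides:

```python
from collections import defaultdict

# Hypothetical link graph over 4 pages: page -> list of pages it links to.
links = {1: [2, 3], 2: [3, 4], 3: [1], 4: [1, 3]}
rank = {page: 0.25 for page in links}   # PageRank values from the previous iteration
d, n = 0.85, len(links)                  # damping factor, number of pages

def pagerank_map(page, outlinks):
    # Map: each page sends a PageRank "fragment" rank/outdegree to every page it links to.
    share = rank[page] / len(outlinks)
    return [(target, share) for target in outlinks]

def pagerank_reduce(page, fragments):
    # Reduce: sum the fragments received by a page and apply
    #   PR(p) = (1 - d)/N + d * sum(fragments for p)
    return (1 - d) / n + d * sum(fragments)

fragments = defaultdict(list)
for page, outlinks in links.items():
    for target, share in pagerank_map(page, outlinks):
        fragments[target].append(share)

new_rank = {page: pagerank_reduce(page, shares) for page, shares in fragments.items()}
print(new_rank)  # one iteration; repeat until the ranks converge
```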
Structure
Execution Overview
(1) Split the input files into M pieces of 16-64 MB each, then start many copies of the program
(2) One copy, the master, is special: the rest are workers that are assigned work by the master (M map tasks and R reduce tasks)
(3) Map phase
- The assigned worker reads the corresponding input split
- Parses the input data into key/value pairs
- Produces intermediate key/value pairs
Execution Overview
(4) Buffered pairs are written to local disk, partitioned into R regions by the partitioning function
- The locations are passed back to the master
- The master forwards these locations to the reduce workers
(5) Reduce phase 1: read and sort
- Each reduce worker reads the intermediate data for its partition
- Sorts the intermediate key/value pairs to group the data by key
Execution Overview
(6) Reduce phase 2: reduce function
- Iterate over the sorted intermediate data, passing each key and its values to the reduce function
- The output is appended to a final output file for this reduce partition
(7) Return to user code
- When all map and reduce tasks are complete, the master wakes up the user program
- The MapReduce call returns back to the user code
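A single-process sketch of the flow above with hypothetical sizes M and R; the real system runs these steps on many machines under a master, but the data movement is the same:

```python
from collections import defaultdict

M, R = 4, 2   # number of map tasks and reduce tasks (illustrative)

def split_input(text, m):
    # (1) Split the input into M pieces (the real system uses 16-64 MB file chunks).
    words = text.split()
    step = max(1, len(words) // m)
    return [" ".join(words[i:i + step]) for i in range(0, len(words), step)]

def map_task(split_text):
    # (3) Parse the split into key/value pairs and emit intermediate pairs.
    return [(word, 1) for word in split_text.split()]

def partition(key, r):
    # (4) Partitioning function: hash(key) mod R decides which reduce task gets the pair.
    return hash(key) % r

def reduce_task(region):
    # (5)-(6) Sort to group by key, then run the reduce function (here: sum) per key.
    grouped = defaultdict(list)
    for key, value in sorted(region):
        grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}

regions = [[] for _ in range(R)]                 # R intermediate regions
for split_text in split_input("hello world bye sql hello map bye reduce", M):
    for key, value in map_task(split_text):      # map tasks would run on different workers
        regions[partition(key, R)].append((key, value))

outputs = [reduce_task(region) for region in regions]   # (7) results returned to user code
print(outputs)
```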
Failure Tolerance
Worker failure: handled via re-execution
Failure detection: heartbeat
- The master pings every worker periodically
Handling failure: re-execution
- Map tasks: re-execute both completed and in-progress map tasks, since their output is stored on the failed worker's local disk
  - Reset the state of the map tasks and re-schedule them
- Reduce tasks: re-execute only in-progress reduce tasks
  - Completed reduce tasks do NOT need to be re-executed: their results are stored in the global file system
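A toy sketch of the re-execution rule when a worker dies; the task-state bookkeeping is an assumption for illustration, the real master tracks much more:

```python
def handle_worker_failure(tasks, failed_worker):
    """Reset tasks so the scheduler will re-execute them after a worker dies."""
    for task in tasks:
        if task["worker"] != failed_worker:
            continue
        if task["kind"] == "map":
            # Map output lives on the failed worker's local disk, so even
            # completed map tasks must be reset and re-scheduled.
            task["state"] = "idle"
        elif task["state"] == "in_progress":
            # Only in-progress reduce tasks are re-executed; completed reduce
            # output is already safe in the global file system (GFS).
            task["state"] = "idle"

tasks = [
    {"kind": "map", "state": "completed", "worker": "w1"},
    {"kind": "reduce", "state": "in_progress", "worker": "w1"},
    {"kind": "reduce", "state": "completed", "worker": "w1"},
]
handle_worker_failure(tasks, "w1")
print([t["state"] for t in tasks])  # ['idle', 'idle', 'completed']
```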
Failure Tolerance
Master failure
- Job state is checkpointed to the global file system
- A new master recovers and continues the tasks from the checkpoint
Robust to large-scale worker failure
- Simply re-execute the tasks!
- Simply start a new master!
- E.g. lost 1,600 of 1,800 machines once, but finished fine
Locality
Network bandwidth is a relatively scarce resource
Input data is stored on the local disks of the machines
- GFS divides each file into 64 MB blocks
- Several copies of each block are stored on different machines
Local computation
- The master takes the locations of the input data's replicas into account
- A map task is scheduled on a machine whose local disk contains a replica of the input data
- Failing that, the master schedules the map task near a replica (e.g. on a worker on the same network switch)
Most input data is read locally and consumes no network bandwidth
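A sketch of this locality preference when assigning a map task; the replica set and rack lookup are hypothetical inputs, not the paper's data structures:

```python
def schedule_map_task(replica_machines, idle_workers, rack_of):
    """Prefer a worker holding a replica, then one on the same rack, then any idle worker."""
    # 1. A worker whose local disk already holds a replica of the input split.
    for worker in idle_workers:
        if worker in replica_machines:
            return worker
    # 2. Otherwise, a worker near a replica (e.g. behind the same network switch).
    replica_racks = {machine: rack_of[machine] for machine in replica_machines}.values()
    for worker in idle_workers:
        if rack_of[worker] in replica_racks:
            return worker
    # 3. Otherwise, any idle worker; the input will be read over the network.
    return idle_workers[0] if idle_workers else None

rack_of = {"m1": "rackA", "m2": "rackA", "m3": "rackB"}
print(schedule_map_task({"m1"}, ["m2", "m3"], rack_of))  # 'm2': same rack as the replica on m1
```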
Task Granularity
Backup Tasks
Slow workers significantly lengthen completion time
- Other jobs consuming resources on the machine
- Bad disks with soft errors that transfer data very slowly
- Weird things: processor caches disabled
Solution: near the end of a phase, spawn backup copies of the remaining tasks
- Whichever copy finishes first wins
As a result, job completion time is dramatically shortened
- E.g. the job takes 44% longer to complete if the backup task mechanism is disabled
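A sketch of why racing a backup copy against a straggler shortens the tail; the timing model is made up purely for illustration:

```python
def completion_time(task_times, straggler_slowdown=5.0, backups=False):
    """A phase finishes when its slowest task finishes; backups take the faster attempt."""
    times = list(task_times)
    times[-1] *= straggler_slowdown            # one straggler runs far slower than the rest
    if backups:
        # Near the end of the phase the master launches a backup copy of each
        # remaining task; whichever attempt finishes first wins.
        times = [min(t, max(task_times)) for t in times]
    return max(times)

normal = [10.0] * 100                           # 100 tasks, 10 s each
print(completion_time(normal, backups=False))   # 50.0 -> phase dominated by the straggler
print(completion_time(normal, backups=True))    # 10.0 -> the backup copy wins the race
```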
Performance & Experience
Performance
Experiment setting: a cluster of 1,800 machines, each with
- 4 GB of memory
- Dual-processor 2 GHz Xeons with Hyper-Threading
- Dual 160 GB IDE disks
- Gigabit Ethernet
Approximately 100-200 Gbps of aggregate bandwidth
Performance
MR_Grep: grep task with MapReduce
- Grep: search for a relatively rare three-character pattern through 1 terabyte of data
- The input scan rate drops to zero after about 80 seconds
- Computation peaks at over 30 GB/s when 1,764 workers are assigned
- Locality optimization helps: without it, rack switches would limit the rate to 10 GB/s
(Figure: data transfer rate over time)
Performance
MR_Sort: sorting task with MapReduce
- Sort: sort 1 terabyte of 100-byte records
- Takes about 14 minutes
- The input rate is higher than the shuffle rate and the output rate, thanks to locality
- The shuffle rate is higher than the output rate: the output phase writes two copies for reliability
Performance
MR_Sort: backup tasks and failure tolerance
- Backup tasks reduce job completion time significantly
- The system deals well with failures
Experience
Large-scale indexing: MapReduce is used for the Google Web search service
As a result
- The indexing code is simpler, smaller, and easier to understand
- Performance is good enough
- It is easy to change the indexing process: a few months → a few days
- MapReduce takes care of failures and slow machines
- It is easy to make indexing faster by adding more machines
Experience
MapReduce instances over time: the number of MapReduce instances grows significantly
- 2003/02: first version
- 2004/09: almost 900
- 2006/03: about 4,000
- 2007/01: over 6,000
Experience
New MapReduce programs per month: the number of new MapReduce programs increases continuously (figure)
Experience
MapReduce statistics for different months

                              Aug. '04   Mar. '06   Sep. '07
Number of jobs (1000s)              29        171      2,217
Avg. completion time (secs)        634        874        395
Machine years used                 217      2,002     11,081
Map input data (TB)              3,288     52,254    403,152
Map output data (TB)               758      6,743     34,774
Reduce output data (TB)            193      2,970     14,018
Avg. machines per job              157        268        394
Unique implementations
  Map                              395      1,958      4,083
  Reduce                           269      1,208      2,418
Conclusion
Is every task suitable for MapReduce?
NOT every task is suitable for MapReduce.
Suitable if…
- You have a cluster and the computation can be done locally
- Working with a large dataset
- Working with independent data
- The information to share across the cluster is small
- E.g. word count, grep, K-means clustering, PageRank
NOT suitable if…
- The data cannot be worked on independently
- The computation cannot be cast into Map and Reduce
- The information to share across the cluster is large (exponential or even linear in size)
Is it a trend? Really?
Percentage of matching job postings
- SQL: 4%
- MapReduce: 0……%
Conclusion
- Focus on the problem: let the library deal with the messy details of automatic parallelization and distribution
- MapReduce has proven to be a useful abstraction
- MapReduce simplifies large-scale computations at Google
- The functional programming paradigm can be applied to large-scale applications
EOD