
Slide 1: CS347: MapReduce

Slide 2: Motivation for Map-Reduce
Distribution makes simple computations complex:
– Communication
– Load balancing
– Fault tolerance
– …
What if we could write "simple" programs that were automatically parallelized?

Slide 3: Motivation for Map-Reduce
Recall one of our sort strategies (diagram): fragments R1, R2, R3 are each locally sorted and range-partitioned on keys k0, k1 into R'1, R'2, R'3, which together form the result. The pattern: process data & partition, then additional processing.

Slide 4: Motivation for Map-Reduce
Another example: asymmetric fragment + replicate join (diagram). R (Ra, Rb) is partitioned by f into R1, R2, R3, while S (Sa, Sb) is replicated to every site; each site performs a local join, and the union of the local results is the final result. Again: process data & partition, then additional processing.

Slide 5: From our point of view…
What if we didn't have to think too hard about the number of sites or fragments? MapReduce goal: a library that hides the distribution from the developer, who can then focus on the fundamental "nuggets" of their computation.

Slide 6: Building a Text Index, Part I (the original Map-Reduce application)
A stream of disk pages (page 1: rat, dog; page 2: dog, cat; page 3: rat, dog) is loaded and tokenized into (word, page) pairs: (rat, 1), (dog, 1), (dog, 2), (cat, 2), (rat, 3), (dog, 3). The pairs are sorted and flushed to disk as intermediate runs, e.g. (cat, 2), (dog, 1), (dog, 2), (dog, 3), (rat, 1), (rat, 3).

Slide 7: Building a Text Index, Part II
The intermediate runs, (cat, 2), (dog, 1), (dog, 2), (dog, 3), (rat, 1), (rat, 3) and (ant, 5), (cat, 4), (dog, 4), (dog, 5), (eel, 6), are merged into one sorted run: (ant, 5), (cat, 2), (cat, 4), (dog, 1), (dog, 2), (dog, 3), (dog, 4), (dog, 5), (eel, 6), (rat, 1), (rat, 3). Collapsing each word's postings gives the final index: (ant: 5), (cat: 2, 4), (dog: 1, 2, 3, 4, 5), (eel: 6), (rat: 1, 3).

Slide 8: Generalizing: Map-Reduce
The first phase of index building (pages in; tokenize, sort, and flush sorted (word, page) runs out) is the Map.

Slide 9: Generalizing: Map-Reduce
The second phase (intermediate runs in; merged final index out) is the Reduce.

Slide 10: Map Reduce
Input: R = {r1, r2, …, rn} and functions M, R:
– M(ri) → {[k1, v1], [k2, v2], …}
– R(ki, valSet) → [ki, valSet′]
Let S = {[k, v] | [k, v] ∈ M(r) for some r ∈ R} (S is a bag)
Let K = {k | [k, v] ∈ S, for any v}
Let G(k) = {v | [k, v] ∈ S} (G is a bag)
Output = {[k, T] | k ∈ K, T = R(k, G(k))}
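To pin the definition down, here is a minimal sequential sketch of these semantics in Python. It is illustrative only (the names mapreduce, map_fn, and reduce_fn are ours, not part of any real MapReduce API), and it ignores everything distribution adds: partitioning, parallelism, and fault tolerance.

    from collections import defaultdict

    def mapreduce(records, map_fn, reduce_fn):
        # Build S: every [k, v] pair emitted by M over every input record,
        # grouped so that groups[k] is the bag G(k).
        groups = defaultdict(list)
        for r in records:
            for k, v in map_fn(r):
                groups[k].append(v)
        # Output = { [k, T] | k in K, T = R(k, G(k)) }
        return {k: reduce_fn(k, values) for k, values in groups.items()}

The defaultdict of lists plays the role of the bags G(k); the final dictionary comprehension is the Output set.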

Slide 11: References
MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI 2004. Available at http://labs.google.com/papers/mapreduce-osdi04.pdf

Slide 12: Example: Counting Word Occurrences
map(String doc, String value):
  // doc is the document name
  // value is the document contents
  for each word w in value:
    EmitIntermediate(w, "1");
Example: map(doc, "cat dog cat bat dog") emits [cat 1], [dog 1], [cat 1], [bat 1], [dog 1]

Slide 13: Example: Counting Word Occurrences
reduce(String key, Iterator values):
  // key is a word
  // values is a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));
Example: reduce("dog", ["1", "1", "1", "1"]) emits "4", which becomes ("dog", 4)
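For concreteness, the same word count can be run through the sequential mapreduce sketch from slide 10 (again, the names and input data are ours, purely illustrative):

    def wc_map(doc):
        name, contents = doc                  # doc = (document name, contents)
        return [(w, 1) for w in contents.split()]

    def wc_reduce(word, counts):              # sum the per-occurrence 1s
        return sum(counts)

    result = mapreduce([("d1", "cat dog cat bat dog")], wc_map, wc_reduce)
    # result == {"cat": 2, "dog": 2, "bat": 1}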

Slides 14–25: Mappers (animation sequence, one diagram per slide)
The source data is divided into splits; a pool of workers picks up splits, runs the map function over them, and as each worker finishes one split it moves on to the next, until all splits are consumed.

Slides 26–27: Shuffle (animation sequence)
The intermediate map output is repartitioned by key and moved to the workers that will run the reduce.

Slide 28: Reduce (diagram)
Workers run the reduce function over their shard of keys.

Slide 29: Another way to think about it:
– Mapper: (some query)
– Shuffle: GROUP BY
– Reducer: SELECT Aggregate()
A concrete rendering of this analogy follows the list.
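As a sketch, a query like SELECT region, SUM(amount) … GROUP BY region drops into the sequential mapreduce helper from slide 10 like this (the rows are hypothetical sample data):

    rows = [("west", 10), ("east", 5), ("west", 7)]
    totals = mapreduce(
        rows,
        lambda row: [(row[0], row[1])],        # Mapper: the "query" emits (region, amount)
        lambda region, amounts: sum(amounts))  # Reducer: SELECT SUM(amount)
    # The shuffle (grouping by region) happens inside mapreduce;
    # totals == {"west": 17, "east": 5}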

Slide 30: Doesn't have to be relational
– Mapper: parse data into K, V pairs
– Shuffle: repartition by K
– Reducer: transform the V's for a K into a Vfinal

Slide 31: Process model (diagram)

Slide 32: Process model
The number of mappers can be varied to tune performance; the number of reduce tasks is bounded by the number of reduce shards.

Slide 33: Implementation Issues
– Combine function
– File system
– Partition of input, keys
– Failures
– Backup tasks
– Ordering of results

Slide 34: Combine Function
Combine is like a local reduce applied before distribution: a worker holding [cat 1], [cat 1], [cat 1], …, [dog 1], [dog 1], … sends only [cat 3], …, [dog 2], … A sketch follows.
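A minimal sketch of where a combiner slots in, extending the sequential helper from slide 10. It assumes (as word count does) that the combine function can legally be the same as the reduce function, i.e. the reduction is associative and commutative; the function names are ours:

    def map_with_combiner(records, map_fn, combine_fn):
        # Pre-aggregate each key's values locally, before anything
        # is sent across the network in the shuffle.
        local = defaultdict(list)
        for r in records:
            for k, v in map_fn(r):
                local[k].append(v)
        # e.g. [cat 1], [cat 1], [cat 1] on this worker becomes [cat 3]
        return [(k, combine_fn(k, values)) for k, values in local.items()]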

Slide 35: Data flow
– A map worker must be able to access any part of the input file, so the input lives on a distributed file system
– A reduce worker must be able to access the local disks of map workers
– Any worker must be able to write its part of the answer; the answer is left as a distributed file
A high-throughput network is essential.

Slide 36: Partition of input, keys
How many splits of the input file, and how many workers? It is best to have many splits per worker: load balance improves, and if a worker fails its tasks are easier to spread across the remaining workers. Should workers be assigned to splits "near" them? Similar questions arise for reduce workers.

Slide 37: What takes time?
Reading input data, mapping data, shuffling data, reducing data.
– Map and reduce are separate phases: latency is determined by the slowest task, and reduce-shard skew can increase latency
– Map and shuffle can be overlapped, but with lots of intermediate data the shuffle may be slow

Slide 38: Failures
The distributed implementation should produce the same output as a non-faulty sequential execution of the program would have. General strategy: the master detects worker failures and has the work re-done by another worker. (Diagram: the master asks the worker for split j "ok?" and, on failure, reassigns j.)

Slide 39: Backup Tasks
A straggler is a machine that takes unusually long to finish its work (e.g., because of a bad disk). A straggler can delay final completion. When the job is close to finishing, the master schedules backup executions of the remaining tasks. It must be possible to eliminate the redundant results.

Slide 40: Ordering of Results
The final result (at each node) is in key order: [k1, T1], [k2, T2], [k3, T3], [k4, T4]. The input pairs delivered to a reducer, e.g. [k1, v1], [k3, v3], are also in key order.

Slide 41: Example: Sorting Records
– Map: extract the sort key k, output [k, record]
– Reduce: do nothing!
(Diagram question: one or two records for k = 6?) A sketch follows the list.
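A sketch of the idea, assuming the framework lets us supply a range partitioner so that reading the reduce shards in shard order yields one globally sorted file (the cut points below are made up):

    def sort_map(record):
        return [(record["key"], record)]   # Map: extract k, output [k, record]

    def identity_reduce(k, records):
        return records                     # Reduce: do nothing!

    # Hypothetical range partitioner over 3 reduce shards: shard 0 gets
    # k <= 100, shard 1 gets 100 < k <= 200, shard 2 gets the rest.
    def shard_for(k, cutpoints=(100, 200)):
        return sum(k > c for c in cutpoints)

Within each shard the framework delivers keys in sorted order (slide 40), so concatenating shard 0, shard 1, shard 2 gives the sorted result.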

Slide 42: Other Issues
– Skipping bad records
– Debugging

Slide 43: MR Claimed Advantages
– The model is easy to use and hides the details of parallelization and fault recovery
– Many problems are expressible in the MR framework
– Scales to thousands of machines

Slide 44: MR Possible Disadvantages
– The one-input, two-stage data flow is rigid and hard to adapt to other scenarios
– Custom code must be written even for the most common operations, e.g., projection and filtering
– The opaque nature of the map and reduce functions impedes optimization

Slide 45: Hadoop
Open-source Map-Reduce system. Also a toolkit:
– HDFS: the filesystem for Hadoop
– HBase: the database for Hadoop
Also an ecosystem:
– Tools
– Recipes
– Developer community

Slide 46: MapReduce pipelines
The output of one MapReduce becomes the input of another. Example (sketched below):
– Stage 1: translate data to canonical form
– Stage 2: term count
– Stage 3: sort by frequency
– Stage 4: extract top-k
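A sketch of the first two stages chained through the sequential helper from slide 10 (the documents and stage logic are illustrative):

    docs = [("d1", "The cat saw the Dog"), ("d2", "a dog")]
    # Stage 1: canonical form (here: just lowercase each document)
    canon = mapreduce(docs,
                      lambda d: [(d[0], d[1].lower())],
                      lambda name, texts: texts[0])
    # Stage 2: term count over stage 1's output
    counts = mapreduce(list(canon.items()),
                       lambda d: [(w, 1) for w in d[1].split()],
                       lambda w, ones: sum(ones))
    # counts == {"the": 2, "cat": 1, "saw": 1, "dog": 2, "a": 1}
    # Stages 3 and 4 (sort by frequency, extract top-k) would chain on the same way.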

Slide 47: Make it database-y?
Simple idea: each operator is a MapReduce. How to do:
– Select
– Project
– Group by, aggregate
– Join

Slide 48: Reduce-side join
The shuffle puts all values for the same key at the same reducer.
– Mapper input: tuples from R and tuples from S
– Mapper output: (join value, (R|S, tuple))
– Reducer: local join of all R tuples with all S tuples
A sketch follows the list.
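A minimal sketch of a reduce-side join over the sequential helper, assuming each input record arrives tagged with its source relation and joins on an attribute we call "k" (the tags, names, and schema are ours):

    def rsj_map(tagged):                      # tagged = ("R" or "S", tuple)
        rel, tup = tagged
        return [(tup["k"], (rel, tup))]       # key on the join attribute

    def rsj_reduce(k, tagged_tuples):         # all R and S tuples for k meet here
        r_side = [t for rel, t in tagged_tuples if rel == "R"]
        s_side = [t for rel, t in tagged_tuples if rel == "S"]
        return [(r, s) for r in r_side for s in s_side]   # local join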

Slide 49: Map-side join
Like a hash join, but every mapper has a copy of the hash table.
– Mapper: read in a hash table of R; input is tuples of S; output is tuples of S joined with tuples of R
– Reducer: pass through
A sketch follows the list.
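A matching sketch of the map-side variant, assuming R is small enough to hold in memory on every mapper (same illustrative schema as above):

    R_tuples = [{"k": 1, "a": "x"}, {"k": 2, "a": "y"}]   # hypothetical small R
    r_hash = {}                               # built once per mapper, before mapping S
    for r in R_tuples:
        r_hash.setdefault(r["k"], []).append(r)

    def msj_map(s_tuple):                     # input: tuples of S only
        return [(s_tuple["k"], (r, s_tuple))  # emit already-joined pairs
                for r in r_hash.get(s_tuple["k"], [])]
    # Reducer: pass through.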

Slide 50: Comparison
– The reduce-side join shuffles all the data
– The map-side join requires one table to be small

Slide 51: Semi-join?
One idea (sketched below):
– MapReduce 1: extract the join keys of R; reduce-side join them with S
– MapReduce 2: map-side join the result of MapReduce 1 with R
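A sketch of stage 1, in the style of the join sketches above (stage 2 is exactly the map-side join, with stage 1's smaller output playing the role of S):

    def semi_map1(tagged):                    # R contributes only its join keys
        rel, tup = tagged
        if rel == "R":
            return [(tup["k"], ("Rkey", None))]
        return [(tup["k"], ("S", tup))]

    def semi_reduce1(k, vals):                # keep S tuples whose key appears in R
        if any(rel == "Rkey" for rel, _ in vals):
            return [t for rel, t in vals if rel == "S"]
        return []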

Slide 52: Platforms for SQL-like queries
– Pig Latin
– Hive
– MapR
– MemSQL
– …

Slide 53: Why not just use a DBMS?
Many DBMSs exist and are highly optimized. See: A Comparison of Approaches to Large-Scale Data Analysis. Pavlo et al. SIGMOD 2009.

Slide 54: Why not just use a DBMS?
One reason: loading data into a DBMS is hard. (Pavlo et al., SIGMOD 2009.)

Slide 55: Why not just use a DBMS?
Other possible reasons:
– MapReduce is more scalable
– MapReduce is more easily deployed
– MapReduce is more easily extended
– MapReduce is more easily optimized
– MapReduce is free (that is, Hadoop)
– I already know Java
– MapReduce is exciting and new

Slide 56: Data store
Instead of HDFS, the data could be stored in a database. Example: HadoopDB/Hadapt:
– The data store is PostgreSQL
– Allows for indexing and fast local query processing
HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads. Azza Abouzeid, Kamil Bajda-Pawlikowski, Daniel J. Abadi, Avi Silberschatz, Alex Rasin. VLDB 2009.

Slide 57: Batch processing?
MapReduce materializes intermediate results: map output is written to disk before the reduce starts. What if we pipelined the data from map to reduce?
– Reduce could quickly produce approximate answers
– MapReduce could implement continuous queries
MapReduce Online. Tyson Condie, Neil Conway, Peter Alvaro, Joe Hellerstein, Khaled Elmeleegy, Russell Sears. NSDI 2010.

Slide 58: Not just database queries!
Any large, partitionable computation:
– Build an inverted index
– Image thumbnailing
– Machine translation
– …
– Anything expressible in the Map/Reduce functions via a general-purpose language

