1
Presenters: Abhishek Verma, Nicolas Zea
2
MapReduce
Clean abstraction
Extremely rigid: 2-stage group-by aggregation
Code reuse and maintenance difficult
Google → MapReduce, Sawzall
Yahoo → Hadoop, Pig Latin
Microsoft → Dryad, DryadLINQ
Improving MapReduce in heterogeneous environments
3
[Diagram: MapReduce data flow. Input records are split across map tasks; each map emits (key, value) pairs such as (k1, v1); intermediate records are sorted locally (quicksort), shuffled so that all values for a key reach the same reduce task, and reduced into the output records.]
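To make the diagram concrete, here is a minimal single-machine sketch of the same two-stage group-by aggregation in C#; the word-count workload and all names are illustrative, not taken from the slides:

using System;
using System.Collections.Generic;
using System.Linq;

// Map emits (key, value) records, the shuffle groups them by key,
// and reduce aggregates each group, mirroring the diagram above.
class MapReduceSketch
{
    static void Main()
    {
        string[] inputRecords = { "cs ece", "cs ee", "ece ece" };

        // Map phase: each input record emits (word, 1) pairs.
        var mapped = inputRecords
            .SelectMany(line => line.Split(' ')
                .Select(word => new KeyValuePair<string, int>(word, 1)));

        // Shuffle: group intermediate records by key (k1, k2, ... in the diagram).
        var shuffled = mapped.GroupBy(kv => kv.Key);

        // Reduce phase: aggregate the values of each group into the output records.
        foreach (var group in shuffled)
            Console.WriteLine($"{group.Key}: {group.Sum(kv => kv.Value)}");
    }
}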
4
Extremely rigid data flow; other flows (stages, joins, splits) have to be hacked in
Common operations must be coded by hand: join, filter, projection, aggregates, sorting, distinct
Semantics hidden inside the map and reduce functions
Difficult to maintain, extend, and optimize
[Diagram: chains of map (M) and reduce (R) stages wired together by hand.]
5
Christopher Olston, Benjamin Reed, Utkarsh Srivastava, Ravi Kumar, Andrew Tomkins (Yahoo! Research)
6
Pigs Eat Anything: can operate on data without metadata: relational, nested, or unstructured.
Pigs Live Anywhere: not tied to one particular parallel framework.
Pigs Are Domestic Animals: designed to be easily controlled and modified by its users. UDFs: transformation functions, aggregates, grouping functions, and conditionals.
Pigs Fly: processes data quickly(?)
7
Dataflow language: procedural, different from SQL
Quick start and interoperability
Nested data model
UDFs as first-class citizens
Parallelism required
Debugging environment
8
Data Model
Atom: 'cs'
Tuple: ('cs', 'ece', 'ee')
Bag: { ('cs', 'ece'), ('cs') }
Map: [ 'courses' → { ('523', '525', '599') } ]
Expressions
Fields by position: $0
Fields by name: f1
Map lookup: #
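As a rough analogue only (not Pig APIs), the nested data model and the expression forms can be pictured with ordinary C# collections; every type and variable name below is made up for illustration:

using System;
using System.Collections.Generic;

// A rough C# analogue of Pig's nested data model, for illustration only.
class PigDataModelSketch
{
    static void Main()
    {
        // Atom: a simple value such as 'cs'.
        string atom = "cs";

        // Tuple: an ordered list of fields, e.g. ('cs', 'ece', 'ee').
        object[] tuple = { "cs", "ece", "ee" };

        // Bag: a collection of tuples, e.g. { ('cs', 'ece'), ('cs') }.
        var bag = new List<object[]> { new object[] { "cs", "ece" }, new object[] { "cs" } };

        // Map: keys mapped to values, e.g. [ 'courses' -> { ('523', '525', '599') } ].
        var map = new Dictionary<string, object>
        {
            ["courses"] = new List<object[]> { new object[] { "523", "525", "599" } }
        };

        // Expressions: $0 (field by position) corresponds to tuple[0];
        // f1 (field by name) would be a named column; '#' (map lookup) is map lookup by key.
        Console.WriteLine($"atom = {atom}, $0 = {tuple[0]}, bag size = {bag.Count}, map keys = {map.Count}");
    }
}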
9
Find the top 10 most visited pages in each category

Visits
User  URL         Time
Amy   cnn.com     8:00
Amy   bbc.com     10:00
Amy   flickr.com  10:05
Fred  cnn.com     12:00

URL Info
URL         Category  PageRank
cnn.com     News      0.9
bbc.com     News      0.8
flickr.com  Photos    0.7
espn.com    Sports    0.9
10
Load Visits → Group by url → Foreach url generate count
Load Url Info → Join on url → Group by category → Foreach category generate top10 urls
11
visits      = load '/data/visits' as (user, url, time);
gVisits     = group visits by url;
visitCounts = foreach gVisits generate url, count(visits);

urlInfo     = load '/data/urlInfo' as (url, category, pRank);
visitCounts = join visitCounts by url, urlInfo by url;

gCategories = group visitCounts by category;
topUrls     = foreach gCategories generate top(visitCounts,10);

store topUrls into '/data/topUrls';
Notes on the script above:
Operates directly over files
Schemas are optional and can be assigned dynamically
UDFs can be used in every construct
15
LOAD: specifying input data
FOREACH: per-tuple processing
FLATTEN: eliminate nesting
FILTER: discarding unwanted data
COGROUP: getting related data together (see the LINQ analogue after this list)
GROUP, JOIN
STORE: asking for output
Other: UNION, CROSS, ORDER, DISTINCT
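As a point of comparison only, here is a hedged C# (LINQ) analogue of COGROUP vs. JOIN on the visits/urlInfo example; this is not Pig syntax, the record types are invented, and it only illustrates the idea that a join is a cogroup followed by flattening the cross product of the grouped bags:

using System;
using System.Linq;

// COGROUP keeps the related records from both inputs together as nested bags per key;
// JOIN flattens the cross product of those bags.
class CogroupSketch
{
    record Visit(string User, string Url, string Time);
    record UrlInfo(string Url, string Category, double PageRank);

    static void Main()
    {
        var visits = new[] { new Visit("Amy", "cnn.com", "8:00"), new Visit("Fred", "cnn.com", "12:00") };
        var urlInfo = new[] { new UrlInfo("cnn.com", "News", 0.9) };

        // COGROUP visits by url, urlInfo by url: one output tuple per key,
        // containing a bag of matching visits and a bag of matching urlInfo rows.
        var cogrouped = visits.Select(v => v.Url).Union(urlInfo.Select(u => u.Url))
            .Select(key => new
            {
                Url = key,
                Visits = visits.Where(v => v.Url == key).ToList(),
                Info = urlInfo.Where(u => u.Url == key).ToList()
            });

        // JOIN: flatten the cross product of the two bags for each key.
        var joined = cogrouped
            .SelectMany(g => g.Visits.SelectMany(v => g.Info, (v, u) => new { v.User, v.Url, u.Category }));

        foreach (var row in joined)
            Console.WriteLine($"{row.User} {row.Url} {row.Category}");
    }
}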
17
Every group or join operation forms a map-reduce boundary
Other operations are pipelined into the map and reduce phases
[Diagram: the example dataflow cut at those boundaries:
Map 1 / Reduce 1: Load Visits → Group by url → Foreach url generate count
Map 2 / Reduce 2: Load Url Info → Join on url
Map 3 / Reduce 3: Group by category → Foreach category generate top10 urls]
18
Write-run-debug cycle
Sandbox dataset
Objectives: realism, conciseness, completeness
Problems: UDFs
19
Optional “safe” query optimizer: performs only high-confidence rewrites
User interface: boxes-and-arrows UI; promote collaboration, sharing code fragments and UDFs
Tight integration with a scripting language: use loops and conditionals of the host language
20
Yuan Yu, Michael Isard, Dennis Fetterly, Mihai Budiu, Ulfar Erlingsson, Pradeep Kumar Gunda, Jon Currey
21
[Diagram: Dryad system architecture. A job manager (JM) on the control plane handles the job schedule, using a name server (NS) to discover cluster machines and per-machine daemons (PD) to start the job's vertices (V); on the data plane, vertices exchange data through file, TCP, or FIFO channels.]
22
// A typed collection plus two user-defined functions, queried with LINQ:
Collection<T> collection;
bool IsLegal(Key k);
string Hash(Key k);
var results = from c in collection
              where IsLegal(c.key)
              select new { hash = Hash(c.key), c.value };
23
Partitioned collections of C# objects
Partitioning schemes: hash, range, round-robin (see the sketch below)
Apply, Fork operators
Hints
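A plain C# sketch of what the three partitioning schemes mean for a collection of objects; this is illustrative only and deliberately avoids guessing at the actual DryadLINQ API (the partition count and data are made up):

using System;
using System.Linq;

class PartitioningSketch
{
    static void Main()
    {
        var items = Enumerable.Range(0, 20).ToArray();
        int n = 4; // number of partitions (illustrative)

        // Hash partitioning: partition chosen by a hash of the record's key.
        var hashParts = items.GroupBy(x => Math.Abs(x.GetHashCode()) % n);

        // Round-robin partitioning: records dealt out to partitions in turn.
        var rrParts = items.Select((x, i) => (x, part: i % n)).GroupBy(t => t.part, t => t.x);

        // Range partitioning: partition chosen by comparing the key to split points.
        var splits = new[] { 5, 10, 15 };
        var rangeParts = items.GroupBy(x => splits.Count(s => x >= s));

        foreach (var p in rangeParts)
            Console.WriteLine($"range partition {p.Key}: {string.Join(",", p)}");
    }
}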
24
[Diagram: the LINQ fragment above is compiled into a query plan (a Dryad job); automatically generated C# vertex code runs over the partitioned data and produces the results collection of C# objects.]
25
[Diagram: DryadLINQ execution. A C# program on the client machine invokes a query; DryadLINQ compiles the C# query expression into a distributed query plan and hands it to a Dryad job manager (JM) in the data center, which executes the generated vertex code over the input tables; ToDryadTable writes the output tables, and the results come back to the client as C# objects that can be consumed with foreach.]
26
LINQ expressions are converted to an execution plan graph (EPG), similar to a database query plan
The EPG is a DAG annotated with metadata properties
The EPG is the skeleton of the Dryad dataflow graph
As long as native operations are used, properties can propagate, helping optimization
27
Pipelining: multiple operations in a single process
Removing redundancy
Eager aggregation: move aggregations in front of partitionings (see the sketch below)
I/O reduction: try to use TCP and in-memory FIFO channels instead of disk
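A minimal sketch of the eager-aggregation idea, assuming a word-count-like query; the partition contents and names are made up for illustration:

using System;
using System.Linq;

// Compute partial aggregates on each partition before repartitioning,
// so far less data has to cross the partition boundary.
class EagerAggregationSketch
{
    static void Main()
    {
        var partitions = new[]
        {
            new[] { "cnn.com", "cnn.com", "bbc.com" },
            new[] { "cnn.com", "espn.com" }
        };

        // Eager plan: count within each partition first...
        var partialCounts = partitions
            .SelectMany(p => p.GroupBy(url => url)
                              .Select(g => (Url: g.Key, Count: g.Count())));

        // ...then only the small (url, partialCount) records are repartitioned
        // and combined into the final counts.
        var finalCounts = partialCounts
            .GroupBy(pc => pc.Url)
            .Select(g => (Url: g.Key, Count: g.Sum(pc => pc.Count)));

        foreach (var (url, count) in finalCounts)
            Console.WriteLine($"{url}: {count}");
    }
}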
28
As information from the running job becomes available, the execution graph is mutated
Dataset-size-based decisions, e.g. intelligent partitioning of the data
29
Aggregation can be turned into a tree to improve I/O, based on locality
Example: part of the computation is done locally and aggregated before being sent across the network
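A sketch of that locality-based aggregation tree, assuming partial results tagged with a (made-up) rack label; only one value per rack has to cross the network to the final aggregator:

using System;
using System.Linq;

// First level of the tree: combine within each rack; second level: combine per-rack results.
class AggregationTreeSketch
{
    static void Main()
    {
        var partials = new[]
        {
            (Rack: "rack1", Value: 10), (Rack: "rack1", Value: 7),
            (Rack: "rack2", Value: 3),  (Rack: "rack2", Value: 12)
        };

        // Aggregate locally within each rack.
        var perRack = partials.GroupBy(p => p.Rack)
                              .Select(g => (Rack: g.Key, Value: g.Sum(p => p.Value)));

        // Only the per-rack results are sent to the root aggregator.
        int total = perRack.Sum(r => r.Value);
        Console.WriteLine($"total = {total}");
    }
}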
30
TeraSort: scalability
240-computer cluster of 2.6 GHz dual-core AMD Opterons
Sort 10 billion 100-byte records on a 10-byte key
Each computer stores 3.87 GB
31
DryadLINQ vs. Dryad: SkyServer
The Dryad version is hand-optimized, with no dynamic-optimization overhead
DryadLINQ comes within about 10% of the native code
32
High level and data-type transparent
Automatic-optimization friendly
Manual optimizations using the Apply operator
Leverage any system running the LINQ framework
Support for interacting with SQL databases
Single-computer debugging made easy
Strong typing, narrow interface
Deterministic replay execution
33
Dynamic optimizations appear data intensive: what kind of overhead?
EPG analysis overhead → high latency
No real comparison with other systems
Progress tracking is difficult; no speculation
Will solid-state drives diminish the advantages of MapReduce?
Why not use parallel databases?
MapReduce vs. Dryad
How different from Sawzall and Pig?
34
Language             Sawzall                  Pig Latin                    DryadLINQ
Built by             Google                   Yahoo                        Microsoft
Programming          Imperative               Imperative & Declarative     Hybrid
Resemblance to SQL   Least                    Moderate                     Most
Execution engine     Google MapReduce         Hadoop                       Dryad
Performance*         Very efficient           5-10 times slower            1.3-2 times slower
Implementation       Internal, inside Google  Open source, Apache license  Internal, inside Microsoft
Model                Operate per record       Sequence of MR               DAGs
Usage                Log analysis             + Machine learning           + Iterative computations
35
Matei Zaharia, Andy Konwinski, Anthony Joseph, Randy Katz, Ion Stoica (University of California at Berkeley)
36
Speculative tasks are executed only if there are no failed or waiting tasks available
Notion of progress: three phases of execution: 1. copy phase, 2. sort phase, 3. reduce phase
Each phase is weighted by the % of data processed
Determines whether a task has failed or is a straggler and available for speculation
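A sketch of that progress score as the LATE paper describes Hadoop's version of it: a map's score is the fraction of its input read, and a reduce weights its copy, sort, and reduce phases at 1/3 each; the function and variable names below are mine, not Hadoop's:

using System;

class ProgressScoreSketch
{
    // phasesCompleted: how many of copy/sort/reduce have finished (0-2);
    // fractionOfCurrentPhase: fraction of data processed in the current phase.
    static double ReduceProgress(int phasesCompleted, double fractionOfCurrentPhase)
        => (phasesCompleted + fractionOfCurrentPhase) / 3.0;

    // A map task's score is simply the fraction of its input read so far.
    static double MapProgress(double fractionOfInputRead) => fractionOfInputRead;

    static void Main()
    {
        // A reduce task that finished its copy phase and is halfway through sorting.
        Console.WriteLine(ReduceProgress(1, 0.5));   // 0.5
        // A map task that has read 80% of its input split.
        Console.WriteLine(MapProgress(0.8));         // 0.8
    }
}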
37
1. Nodes can perform work at exactly the same rate
2. Tasks progress at a constant rate throughout time
3. There is no cost to launching a speculative task on an idle node
4. The three phases of execution take approximately the same time
5. Tasks with a low progress score are stragglers
6. Maps and Reduces require roughly the same amount of work
38
Virtualization breaks down homogeneity
Amazon EC2: multiple VMs on the same physical host compete for memory and network bandwidth
Example: two map tasks can compete for disk bandwidth, causing one to become a straggler
39
The progress threshold in Hadoop is fixed and assumes low progress = faulty node
Too many speculative tasks get executed
Speculative execution can harm running tasks
40
A task's phases are not equal
The copy phase is typically the most expensive, due to network communication cost
This makes many tasks jump rapidly from 1/3 progress to 1, creating fake stragglers
Real stragglers get usurped, and fake stragglers cause unnecessary copying
With the fixed threshold, any task with a progress score above 80% is never speculatively executed
41
Longest Approximate Time to End
Primary assumption: the best task to speculate on is the one that will finish furthest into the future
Secondary assumption: tasks make progress at an approximately constant rate
ProgressRate = ProgressScore / T, where T = the time the task has been running
Estimated time to completion = (1 - ProgressScore) / ProgressRate
42
Launch speculative tasks on fast nodes: the best chance to overcome the straggler, vs. using the first available node
Cap on the total number of speculative tasks
‘Slowness’ minimum threshold
Does not take data locality into account
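Putting the last two slides together, a sketch of the LATE decision in C#: estimate each running task's time left from its progress rate, keep only tasks below a slowness threshold, and speculate on the one expected to finish furthest in the future. The threshold rule and task data here are invented for illustration, and the cap on concurrent speculative tasks is not modeled:

using System;
using System.Linq;

class LateSketch
{
    record RunningTask(string Id, double ProgressScore, double SecondsRunning);

    // Progress rate = progress score divided by the time the task has been running.
    static double ProgressRate(RunningTask t) => t.ProgressScore / t.SecondsRunning;

    // Estimated time left = (1 - progress score) / progress rate.
    static double EstimatedTimeLeft(RunningTask t) =>
        (1.0 - t.ProgressScore) / ProgressRate(t);

    static void Main()
    {
        var tasks = new[]
        {
            new RunningTask("A", 0.90, 100),  // fast task, almost done
            new RunningTask("B", 0.20, 100),  // slow task: the straggler
            new RunningTask("C", 0.50, 100)
        };

        // 'Slowness' threshold: only tasks well below the average rate qualify (illustrative rule).
        double averageRate = tasks.Average(ProgressRate);
        var slow = tasks.Where(t => ProgressRate(t) < 0.5 * averageRate);

        // Speculate on the slow task with the longest approximate time to end.
        var victim = slow.OrderByDescending(EstimatedTimeLeft).FirstOrDefault();
        if (victim != null)
            Console.WriteLine($"speculate on {victim.Id}, ~{EstimatedTimeLeft(victim):F0}s remaining");
    }
}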
43
Sort on an EC2 test cluster: 1.0-1.2 GHz Opteron/Xeon with 1.7 GB of memory
44
Sort: manually slowed down 8 VMs with background processes
45
Grep WordCount
48
1. Make decisions early
2. Use finishing times
3. Nodes are not equal
4. Resources are precious
49
Is focusing the work on small VMs fair? Would it be better to pay for a large VM and implement a system with more customized control?
Could this be used in other systems? Progress tracking is key.
Is this a fundamental contribution, or just an optimization? “Good” research?