CSCI5570 Large Scale Data Processing Systems

CSCI5570 Large Scale Data Processing Systems
Distributed Data Analytics Systems
James Cheng
CSE, CUHK
Slide Ack.: modified based on the slides from Mosharaf Chowdhury

FlumeJava: Easy, Efficient Data-Parallel Pipelines
PLDI 2010
Craig Chambers, Ashish Raniwala, Frances Perry, Stephen Adams, Robert R. Henry, Robert Bradshaw, Nathan Weizenbaum

Problem
- Long and complicated data-parallel pipelines
  - Long chains of MapReduce jobs
  - Iterative jobs
  - ...
- Difficult to program and manage
  - Each MapReduce job needs to keep intermediate results
  - High overhead at the synchronization barrier between MapReduce jobs
    - Curse of the last reducer
    - Start-up cost

Solution
- Expose a limited set of parallel operations on immutable parallel collections
- In the running example later in these slides: from 16 data-parallel operations down to 2 MSCR operations

Goals
- Expressiveness
  - Abstractions
- Performance
  - Data representation
  - Implementation strategy
  - Lazy evaluation
  - Dynamic optimization
- Usability & deployability
  - Implemented as a Java library

FlumeJava Workflow
1. Write a Java program using the FlumeJava library, e.g.:

   PCollection<String> words =
       lines.parallelDo(new DoFn<String, String>() {
         void process(String line, EmitFn<String> emitFn) {
           for (String word : splitIntoWords(line)) {
             emitFn.emit(word);
           }
         }
       }, collectionOf(strings()));

2. Call FlumeJava.run();
3. Optimize
4. Execute

Core Abstractions
- Parallel collections: PCollection<T>, PTable<K, V>
- Data-parallel operations
  - Primitives: parallelDo(), groupByKey(), combineValues(), flatten()
  - Derived operations: count(), join(), top()

Parallel Collections
- PCollection<T>
  - An immutable bag of elements of type T
  - Ordered (a sequence) or unordered (a collection)
  - T can be built-in or user-defined
- PTable<K, V>
  - An immutable unordered bag of key-value pairs (i.e., an immutable multi-map), where keys are of type K and values are of type V
  - Same as PCollection<Pair<K, V>>
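
The slides do not show how the initial collections are created. A minimal sketch, based on the paper's first example (readTextFileCollection appears in the FlumeJava paper; the file path is illustrative, and the real library's signatures may differ):

  // Read a text file into a parallel collection of lines.
  PCollection<String> lines =
      readTextFileCollection("/gfs/data/shakes/hamlet.txt");
  // A PTable<K, V> is just a PCollection of pairs:
  // PTable<K, V> == PCollection<Pair<K, V>>.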

Primitive Operations: parallelDo()
- Supports elementwise computation over an input PCollection<T> to produce a new output PCollection<S>
- Takes a DoFn<T, S> argument that maps each element of PCollection<T> to zero or more elements of PCollection<S>
- E.g., split lines into words (by the DoFn) in parallel (by parallelDo):

  PCollection<String> words =
      lines.parallelDo(new DoFn<String, String>() {
        void process(String line, EmitFn<String> emitFn) {
          for (String word : splitIntoWords(line)) {
            emitFn.emit(word);
          }
        }
      }, collectionOf(strings()));

Primitive Operations: parallelDo()
- Can express both map and reduce
- Subclasses of DoFn: MapFn, FilterFn, ...
- DoFn functions should not access any global mutable state (for data consistency, as they run in parallel)
  - DoFn objects can maintain state in their own local variables
  - Multiple DoFn replicas operate concurrently with no shared state
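
A minimal sketch of these rules (MapFn is the one-to-one DoFn subclass named above; the lowercasing and sampling logic are illustrative, not from the slides):

  // One-to-one elementwise transform via the MapFn convenience subclass.
  PCollection<String> lower = words.parallelDo(
      new MapFn<String, String>() {
        public String map(String word) { return word.toLowerCase(); }
      }, collectionOf(strings()));

  // A DoFn may keep state in its own fields (each replica has its own
  // copy) but must not touch global mutable state:
  PCollection<String> sampled = words.parallelDo(
      new DoFn<String, String>() {
        private int seen = 0;  // local to this DoFn replica
        void process(String word, EmitFn<String> emitFn) {
          seen++;
          if (seen % 100 == 0) { emitFn.emit(word); }  // every 100th word
        }
      }, collectionOf(strings()));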

Primitive Operations: groupByKey()
- Converts a multi-map of type PTable<K, V> (which can have many key/value pairs with the same key) into a uni-map of type PTable<K, Collection<V>>, where each key maps to an unordered collection of all the values with that key
- Captures the essence of the shuffle step of MapReduce
- E.g., compute a table mapping each URL to the collection of documents that link to it:

  PTable<URL, DocInfo> backlinks =
      docInfos.parallelDo(new DoFn<DocInfo, Pair<URL, DocInfo>>() {
        void process(DocInfo docInfo, EmitFn<Pair<URL, DocInfo>> emitFn) {
          // for each URL in a doc, output a (url, doc) pair
          for (URL targetUrl : docInfo.getLinks()) {
            emitFn.emit(Pair.of(targetUrl, docInfo));
          }
        }
      }, tableOf(recordsOf(URL.class), recordsOf(DocInfo.class)));
  // group all docs with the same URL (i.e., key) into one collection
  PTable<URL, Collection<DocInfo>> referringDocInfos =
      backlinks.groupByKey();

Primitive Operations: combineValues()
- Takes an input PTable<K, Collection<V>> and an associative combining function on the elements of Collection<V>, and returns a PTable<K, V> where all elements of each Collection<V> are combined into a single output value
- Could be implemented with parallelDo, but is more efficient: the associative combining function can be applied partially, as a MapReduce combiner, before the shuffle
- E.g., count the occurrences of each distinct word:

  // for each word, output a (word, 1) pair
  PTable<String, Integer> wordsWithOnes =
      words.parallelDo(new DoFn<String, Pair<String, Integer>>() {
        void process(String word, EmitFn<Pair<String, Integer>> emitFn) {
          emitFn.emit(Pair.of(word, 1));
        }
      }, tableOf(strings(), ints()));
  // group all 1's with the same key (i.e., the same word)
  PTable<String, Collection<Integer>> groupedWordsWithOnes =
      wordsWithOnes.groupByKey();
  // combine all 1's with the same key into their sum
  PTable<String, Integer> wordCounts =
      groupedWordsWithOnes.combineValues(SUM_INTS);

Primitive Operations: flatten()
- Takes a list of PCollection<T>s and returns a single PCollection<T> that contains all the elements of the inputs
- Does not actually copy the inputs, but rather creates a view of them as one logical PCollection
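
A minimal sketch of flatten(), assuming the variadic form described above (the log-file paths are illustrative):

  PCollection<String> log1 = readTextFileCollection("/logs/2010-01-01");
  PCollection<String> log2 = readTextFileCollection("/logs/2010-01-02");
  // A view over both inputs as one logical collection; no data is copied.
  PCollection<String> allLogs = flatten(log1, log2);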

Derived Operations: count()
- Takes a PCollection<T> and returns a PTable<T, Integer> mapping each distinct element of PCollection<T> to the number of times it occurs
- Can be implemented using parallelDo(), groupByKey(), and combineValues()
- E.g., count the occurrences of each distinct word (same result as the code on the previous slide):

  PTable<String, Integer> wordCounts = words.count();
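
For intuition, a hedged sketch of how count() might expand into the three primitives, mirroring the word-count code on the combineValues() slide (the chained style is illustrative):

  PTable<String, Integer> wordCounts =
      words.parallelDo(new DoFn<String, Pair<String, Integer>>() {
            void process(String word, EmitFn<Pair<String, Integer>> emitFn) {
              emitFn.emit(Pair.of(word, 1));      // (word, 1) per occurrence
            }
          }, tableOf(strings(), ints()))
           .groupByKey()                          // (word, [1, 1, ...])
           .combineValues(SUM_INTS);              // (word, n)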

Derived Operations: join(), top()
- join()
  - Takes as input a multi-map PTable<K, V1> and a multi-map PTable<K, V2>, and returns a uni-map PTable<K, Tuple2<Collection<V1>, Collection<V2>>> such that, for each key in either input table, Collection<V1> holds all values with that key in the first table and Collection<V2> holds all values with that key in the second table
  - Various joins can be computed from Tuple2<Collection<V1>, Collection<V2>>
- top()
  - Takes a comparison function and a count k, and returns the greatest k elements according to the comparison function
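
A hedged usage sketch, assuming join is invoked as a static helper (wordsA, wordsB, and CountComparator are illustrative names, not from the slides):

  // Word counts from two corpora, joined by word: for each word, the
  // collections of counts from the first and second table.
  PTable<String, Integer> countsA = wordsA.count();
  PTable<String, Integer> countsB = wordsB.count();
  PTable<String, Tuple2<Collection<Integer>, Collection<Integer>>> both =
      join(countsA, countsB);
  // top(): the 10 greatest (word, count) pairs under a count ordering.
  PCollection<Pair<String, Integer>> top10 =
      countsA.top(new CountComparator(), 10);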

FlumeJava Workflow (recap)
Step 1 (write a Java program using the FlumeJava library): DONE!
NEXT: Step 2 (FlumeJava.run()), which triggers Step 3 (Optimize) and Step 4 (Execute)

Deferred Evaluation
- To enable optimization, FlumeJava evaluates its parallel operations lazily (deferred evaluation)
- Each PCollection object is represented internally in either deferred (not yet computed) or materialized (computed) state
- A deferred PCollection holds a pointer to the deferred operation that computes it
- A deferred operation holds references to the PCollections that are its arguments (deferred or materialized) and to the deferred PCollections that are its results
- Calling a FlumeJava operation, e.g., parallelDo(), just creates a ParallelDo deferred-operation object and returns a new deferred PCollection that points to it
- The result of executing a series of FlumeJava operations is thus a directed acyclic graph (DAG) of deferred PCollections and operations, called the execution plan
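
A minimal sketch of the deferred/materialized distinction (SplitWordsFn is a hypothetical DoFn subclass; the input path is illustrative):

  // Each call below only extends the execution plan (the DAG);
  // no MapReduce job runs yet.
  PCollection<String> lines =
      readTextFileCollection("/gfs/data/input");              // deferred
  PCollection<String> words =
      lines.parallelDo(new SplitWordsFn(),
                       collectionOf(strings()));              // deferred
  PTable<String, Integer> counts = words.count();             // deferred

  FlumeJava.run();  // optimize the DAG, then execute it
  // After run() returns, counts is materialized and can be read.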

Optimization
Optimizer strategy (passes run in order):
1. Sink flattens
2. Lift CombineValues
3. Insert fusion blocks
4. Fuse ParallelDos
5. Fuse MSCRs
Optimizer output: a plan containing only MSCR, Flatten, and Operate operations

ParallelDo Fusion
- Producer-consumer fusion
  - One ParallelDo operation performs function f, and its result is consumed by another ParallelDo operation that performs function g
  - Replaced by a single multi-output ParallelDo that computes both f and g ∘ f, e.g.: A and D replaced by (A + D) to give A.1 and D.0
  - If the result of f is not needed by other operations, it is not produced, e.g.: A and B replaced by (A + B) to give B.0 only; A.0 is not produced
- Sibling fusion
  - Two or more ParallelDo operations read the same input PCollection
  - Fused into a single multi-output ParallelDo operation that computes the results of all the fused operations in a single pass over the input, e.g.: B, C and D are fused into (B + C + D)
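
A hedged sketch of a pipeline that producer-consumer fusion would collapse (LowercaseFn and StemFn are illustrative DoFn subclasses, not from the slides):

  PCollection<String> lower =
      words.parallelDo(new LowercaseFn(), collectionOf(strings()));
  PCollection<String> stems =
      lower.parallelDo(new StemFn(), collectionOf(strings()));
  // The optimizer fuses the two into one multi-output ParallelDo that
  // makes a single pass over words; if lower is not consumed by any
  // other operation, it is never actually produced.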

MapShuffleCombineReduce (MSCR)
- Transforms combinations of the four primitives into a single MapReduce
- Generalizes MapReduce:
  - Multiple input channels
  - Multiple reducers/combiners
  - Multiple outputs per reducer
  - Pass-through outputs

MSCR Fusion
- An MSCR operation is produced from a set of related GroupByKey operations
- GroupByKey operations are related if they consume the same input, or inputs created by the same (fused) ParallelDo operations
- E.g., an MSCR fusion seeded by three GroupByKey operations (starred PCollections are needed by later operations)

Optimization: Let's do it!
(Apply the optimizer passes in order — sink flattens, lift CombineValues, insert fusion blocks, fuse ParallelDos, fuse MSCRs — on the example that follows.)

An Example: Step 1
Figure: the initial plan (16 data-parallel operations) vs. the plan after sinking flattens.

An Example: Step 2
Figure: the plan after Step 1 vs. the plan after ParallelDo fusion.

An Example: Step 3
Figure: the plan after Step 2 vs. the plan after MSCR fusion.

An Example: Final Result
From 16 data-parallel operations down to 2 MSCR operations (executed as just 2 MapReduce jobs!)

Some Results
- 5x reduction in the average number of MapReduce stages
- Faster than other approaches, except hand-optimized MapReduce chains
- 319 users over a one-year period at Google