
1 Comp6611 Course Lecture Big data applications Yang PENG Network and System Lab CSE, HKUST Monday, March 11, 2013 ypengab@cse.ust.hk Material adapted from slides by Christophe Bisciglia, Aaron Kimball, & Sierra Michels-Slettvet, Google Distributed Computing Seminar, 2007 (licensed under the Creative Commons Attribution 3.0 License)

2 Today's Topics MapReduce Background information/overview Map and Reduce -------- from a programmer's perspective Architecture and workflow -------- a global overview Virtues and defects Improvement Spark Background MapReduce Spark

3 MapReduce Background Before MapReduce, large-scale data processing was difficult Managing parallelization and distribution Application development is tedious and hard to debug Resource scheduling and load-balancing Data storage and distribution Distributed file system “Moving computation is cheaper than moving data.” Fault/crash tolerance Scalability Background MapReduce Spark

4 Does the “Divide and Conquer” paradigm still work for big data? (Diagram: a piece of work is partitioned among workers and the partial results are combined into the final result.) Background MapReduce Spark

5 Programming Model Opportunity: design a software abstraction that undertakes the divide and conquer and reduces programmers' workload for resource management, task scheduling, and distributed synchronization and communication. Functional programming, which has a long history, provides higher-order functions that support divide and conquer: Map: do something to everything in a list Fold: combine the results of a list in some way (Diagram: the abstraction sits between the application and the computer.) Background MapReduce Spark

6 Map Map is a higher-order function How map works: Function is applied to every element in a list Result is a new list Background MapReduce Spark

7 Fold Fold is also a higher-order function How fold works: Accumulator set to initial value Function applied to list element and the accumulator Result stored in the accumulator Repeated for every item in the list Result is the final value in the accumulator Background MapReduce Spark

8 Map/Fold in Action
Simple map example: (map (lambda (x) (* x x)) '(1 2 3 4 5))  '(1 4 9 16 25)
Fold examples: (fold + 0 '(1 2 3 4 5))  15 ; (fold * 1 '(1 2 3 4 5))  120
Sum of squares: (define (sum-of-squares v) (fold + 0 (map (lambda (x) (* x x)) v))) ; (sum-of-squares '(1 2 3 4 5))  55
Background MapReduce Spark
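The same computations can be written in Scala, the language used for the Spark examples later in the lecture; a minimal sketch for the REPL (sumOfSquares is just an illustrative name):

    // map: apply a function to every element, producing a new list
    List(1, 2, 3, 4, 5).map(x => x * x)         // List(1, 4, 9, 16, 25)

    // fold: combine the elements of a list using an accumulator
    List(1, 2, 3, 4, 5).foldLeft(0)(_ + _)      // 15
    List(1, 2, 3, 4, 5).foldLeft(1)(_ * _)      // 120

    // sum of squares: a fold over a mapped list
    def sumOfSquares(v: List[Int]): Int =
      v.map(x => x * x).foldLeft(0)(_ + _)

    sumOfSquares(List(1, 2, 3, 4, 5))           // 55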

9 MapReduce
Programmers specify two functions:
map (k1, v1) → list(k2, v2)
reduce (k2, list(v2)) → list(v2)

An implementation of WordCount:

    function map(String name, String document):
      // K1 name: document name
      // V1 document: document contents
      for each word w in document:
        emit (w, 1)

    function reduce(String word, Iterator partialCounts):
      // K2 word: a word
      // list(V2) partialCounts: a list of aggregated partial counts
      sum = 0
      for each pc in partialCounts:
        sum += ParseInt(pc)
      emit (word, sum)

Background MapReduce Spark
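The same two functions can be sketched in plain Scala over in-memory collections (illustrative only, not the actual Hadoop API; the function names are made up for this sketch):

    // map: emit (word, 1) for every word in the document
    def wordCountMap(name: String, document: String): Seq[(String, Int)] =
      document.split("\\s+").toSeq.filter(_.nonEmpty).map(w => (w, 1))

    // reduce: sum the partial counts gathered for one word
    def wordCountReduce(word: String, partialCounts: Iterator[Int]): (String, Int) =
      (word, partialCounts.sum)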

10 It's just divide and conquer! (Diagram: map tasks read initial key/value pairs from the data store and emit (k, values…) pairs; a barrier aggregates values by key; reduce tasks then turn each key's values into the final values for that key.) Background MapReduce Spark

11 Behind the scenes… Background MapReduce Spark

12 Programming interface input reader Map function partition function: given the key and the number of reducers, it returns the index of the desired reducer; e.g., a hash function for load balancing compare function: used to sort the intermediate output, giving an ordering guarantee within each partition Reduce function output writer Background MapReduce Spark
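As a sketch of the default behaviour, a hash-based partition function might look like the following (simplified; Hadoop's real Partitioner is a class, and this standalone Scala function is only illustrative):

    // Return the index of the reduce task that receives this key.
    // Hashing spreads keys roughly evenly across reducers (load balance),
    // while every occurrence of a given key still goes to the same reducer.
    def partition(key: String, numReducers: Int): Int =
      (key.hashCode & Integer.MAX_VALUE) % numReducers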

13 Output of a Hadoop job

ypeng@vm115:~/hadoop-0.20.2$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount /user/hduser/wordcount/15G-enwiki-input /user/hduser/wordcount/15G-enwiki-output
13/01/16 07:00:48 INFO input.FileInputFormat: Total input paths to process : 1
13/01/16 07:00:49 INFO mapred.JobClient: Running job: job_201301160607_0003
13/01/16 07:00:50 INFO mapred.JobClient: map 0% reduce 0%
.........................
13/01/16 07:01:50 INFO mapred.JobClient: map 18% reduce 0%
13/01/16 07:01:52 INFO mapred.JobClient: map 19% reduce 0%
13/01/16 07:02:06 INFO mapred.JobClient: map 20% reduce 0%
13/01/16 07:02:08 INFO mapred.JobClient: map 20% reduce 1%
13/01/16 07:02:10 INFO mapred.JobClient: map 20% reduce 2%
.........................
13/01/16 07:06:41 INFO mapred.JobClient: map 99% reduce 32%
13/01/16 07:06:47 INFO mapred.JobClient: map 100% reduce 33%
13/01/16 07:06:55 INFO mapred.JobClient: map 100% reduce 39%
.........................
13/01/16 07:07:21 INFO mapred.JobClient: map 100% reduce 99%
13/01/16 07:07:31 INFO mapred.JobClient: map 100% reduce 100%
13/01/16 07:07:43 INFO mapred.JobClient: Job complete: job_201301160607_0003
(Continued on the next slide. The map/reduce percentages show the job's progress.)
Background MapReduce Spark

14 Counters in a Hadoop job

13/01/16 07:07:43 INFO mapred.JobClient: Counters: 18
13/01/16 07:07:43 INFO mapred.JobClient: Job Counters
13/01/16 07:07:43 INFO mapred.JobClient: Launched reduce tasks=24
13/01/16 07:07:43 INFO mapred.JobClient: Rack-local map tasks=17
13/01/16 07:07:43 INFO mapred.JobClient: Launched map tasks=249
13/01/16 07:07:43 INFO mapred.JobClient: Data-local map tasks=203
13/01/16 07:07:43 INFO mapred.JobClient: FileSystemCounters
13/01/16 07:07:43 INFO mapred.JobClient: FILE_BYTES_READ=12023025990
13/01/16 07:07:43 INFO mapred.JobClient: HDFS_BYTES_READ=15492905740
13/01/16 07:07:43 INFO mapred.JobClient: FILE_BYTES_WRITTEN=14330761040
13/01/16 07:07:43 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=752814339
13/01/16 07:07:43 INFO mapred.JobClient: Map-Reduce Framework
13/01/16 07:07:43 INFO mapred.JobClient: Reduce input groups=39698527
13/01/16 07:07:43 INFO mapred.JobClient: Combine output records=508662829
13/01/16 07:07:43 INFO mapred.JobClient: Map input records=279422018
13/01/16 07:07:43 INFO mapred.JobClient: Reduce shuffle bytes=2647359503
13/01/16 07:07:43 INFO mapred.JobClient: Reduce output records=39698527
13/01/16 07:07:43 INFO mapred.JobClient: Spilled Records=828280813
13/01/16 07:07:43 INFO mapred.JobClient: Map output bytes=24932976267
13/01/16 07:07:43 INFO mapred.JobClient: Combine input records=2813475352
13/01/16 07:07:43 INFO mapred.JobClient: Map output records=2376465967
13/01/16 07:07:43 INFO mapred.JobClient: Reduce input records=71653444
(Summary of the counters reported for the job.)
Background MapReduce Spark

15 Master in MapReduce Resource Management Maintain the current resource usage of each Worker (CPU, RAM, used & free disk space, etc.) Periodically check for worker failures Task Scheduling “Moving computation is cheaper than moving data.” Map and reduce tasks are assigned to idle Workers Tasks on failed workers are re-scheduled When the job is close to the end, the Master launches backup tasks Counters Provide interactive job progress Store the occurrences of various events Are helpful for performance tuning Background MapReduce Spark

16 Data-oriented Map scheduling (Diagram: Workers 1-6 spread across Rack 1 and Rack 2, each storing replicas of input splits 1-5.) Launch map 1 on Worker 3 Launch map 2 on Worker 4 Launch map 3 on Worker 1 Launch map 4 on Worker 2 Launch map 5 on Worker 5 Background MapReduce Spark
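The locality preference behind these launch decisions can be sketched as follows (a toy Scala model; the Worker/Split types and chooseWorker are illustrative, not the actual scheduler code):

    case class Worker(id: String, rack: String)
    case class Split(id: Int, replicaWorkers: Seq[Worker])  // workers storing a copy of the split

    // Prefer a data-local worker, then a rack-local one, then any idle worker.
    def chooseWorker(split: Split, idleWorkers: Seq[Worker]): Option[Worker] = {
      val dataLocal = idleWorkers.find(w => split.replicaWorkers.exists(_.id == w.id))
      lazy val rackLocal = idleWorkers.find(w => split.replicaWorkers.exists(_.rack == w.rack))
      dataLocal.orElse(rackLocal).orElse(idleWorkers.headOption)
    }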

17 Data flow in MapReduce jobs (Diagram: a Mapper reads a local, rack-local, or non-local split from GFS into a circular buffer in memory, spills to disk, and merges the spills, optionally through a Combiner, into intermediate files on disk; Reducers fetch their partitions from these intermediate files, other Mappers and other Reducers do the same, and each Reducer writes its output back to GFS.) Background MapReduce Spark

18 Map internal The map phase reads the task’s input split from GFS, parses it into records (key/value pairs), and applies the map function to each record. After the map function has been applied to each record, the commit phase registers the final output with the Master, which will tell the reduce tasks the location of the map output. Background MapReduce Spark

19 Reduce internal The shuffle phase fetches the reduce task’s input data. The sort phase groups records with the same key together. The reduce phase applies the user-defined reduce function to each key and corresponding list of values. Background MapReduce Spark
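A toy Scala model of the three phases over in-memory data (illustrative only; the real shuffle fetches over the network and the sort runs partly on disk):

    // shuffle phase: map outputs fetched for this reduce task
    val fetched: Seq[(String, Int)] = Seq(("the", 1), ("a", 1), ("the", 1))

    // sort phase: group records with the same key together
    val grouped: Map[String, Seq[Int]] =
      fetched.groupBy(_._1).map { case (k, pairs) => (k, pairs.map(_._2)) }

    // reduce phase: apply the user-defined reduce function to each key and its values
    val reduced: Map[String, Int] = grouped.map { case (k, vs) => (k, vs.sum) }
    // reduced == Map("the" -> 2, "a" -> 1)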

20 Backup Tasks There are barriers in a MapReduce job: no reduce function executes until all maps finish, and the job cannot complete until all reduces finish. The execution time of a job will be severely lengthened if a task is blocked. When the job is close to the end, the Master schedules backup/speculative tasks for the remaining unfinished ones. A job can take 44% longer if backup tasks are disabled. (Diagram: Map → Reduce → Job complete.) Background MapReduce Spark
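One way to picture the idea (a heavily simplified sketch, not Hadoop's actual speculative-execution heuristic; the Task type and thresholds are made up): once most of the job is done, duplicate the stragglers and keep whichever copy finishes first.

    case class Task(id: Int, progress: Double)   // progress in [0.0, 1.0]

    // Candidates for backup execution: tasks lagging far behind the average
    // when the job as a whole is close to the end.
    def backupCandidates(tasks: Seq[Task]): Seq[Task] = {
      val avg = tasks.map(_.progress).sum / tasks.size
      if (avg < 0.9) Seq.empty
      else tasks.filter(_.progress < avg - 0.2)
    }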

21 Virtues and defects of MR Virtues Towards large scale data Programming friendly Implicit parallelism Data-locality Fault/crash tolerance Scalability Open-source with good ecosystem[1] Defects Bad for iterative ML algorithms Not sure [1] http://docs.hortonworks.com/CURRENT/index.htm#About_Hortonworks_Data_Platform/Understanding_Hadoop_Ecosystem.htm Background MapReduce Spark

22 Network traffic in MapReduce 1. A map task may read its split from a remote ChunkServer 2. Reduce tasks copy the output of the map tasks 3. Reduce output is written to GFS Background MapReduce Spark

23 Disk R/W in MapReduce 1. A ChunkServer reads a local block so a remote node can fetch its split 2. Intermediate results are spilled to disk 3. The copied partitions are written to the reducer's local disk 4. The reduce output is written to the local ChunkServer 5. The reduce output is written to a remote ChunkServer Background MapReduce Spark

24 Iterative MapReduce Performing a graph algorithm using MapReduce. Background MapReduce Spark

25 Motivation of Spark Iterative algorithms (machine learning, graphs) Interactive data mining tools (R, Excel, Python) Background MapReduce Spark

26 Programming Model Fine-grained: the outputs of every iteration are distributed and stored in stable storage Coarse-grained: only log the transformations used to build a dataset (i.e., its lineage) Resilient distributed datasets (RDDs) Immutable, partitioned collections of objects Created through parallel transformations (map, filter, groupBy, join, …) on data in stable storage Can be cached for efficient reuse Actions on RDDs Count, reduce, collect, save, … Background MapReduce Spark

27 Spark Operations Transformations (define a new RDD): map, filter, sample, groupByKey, reduceByKey, sortByKey, flatMap, union, join, cogroup, cross, mapValues Actions (return a result to the driver program): collect, reduce, count, save, lookup(key) Background MapReduce Spark
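A short example that chains several of these operations (assuming a SparkContext named sc is available, as in the Spark shell; the HDFS path and the frequency threshold are illustrative):

    // Transformations are lazy: nothing runs until an action is called.
    val counts = sc.textFile("hdfs://.../docs")       // RDD of lines
      .flatMap(_.split(" "))                          // transformation: flatMap
      .map(w => (w, 1))                               // transformation: map
      .reduceByKey(_ + _)                             // transformation: reduceByKey

    val frequent = counts.filter { case (_, n) => n > 100 }   // transformation: filter

    frequent.count()       // action: triggers the computation
    frequent.collect()     // action: returns the results to the driver program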

28 Example: Log Mining
Load error messages from a log into memory, then interactively search for various patterns

    lines = spark.textFile("hdfs://...")             // Base RDD
    errors = lines.filter(_.startsWith("ERROR"))     // Transformed RDD
    messages = errors.map(_.split('\t')(2))
    cachedMsgs = messages.cache()

    cachedMsgs.filter(_.contains("foo")).count       // Action
    cachedMsgs.filter(_.contains("bar")).count
    ...

(Diagram: the Driver sends tasks to Workers; each Worker reads a block from the distributed store, builds its cache partition, and returns results.)
Result: full-text search of Wikipedia in <1 sec (vs 20 sec for on-disk data)
Result: scaled to 1 TB data in 5-7 sec (vs 170 sec for on-disk data)

29 RDD Fault Tolerance
RDDs maintain lineage information that can be used to reconstruct lost partitions
Ex: messages = textFile(...).filter(_.startsWith("ERROR")).map(_.split('\t')(2))
(Diagram: HDFS File → filter (func = _.contains(...)) → Filtered RDD → map (func = _.split(...)) → Mapped RDD)
Background MapReduce Spark
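In the Scala API the lineage can be inspected with RDD.toDebugString (the exact output format depends on the Spark version; the path below is illustrative and sc is assumed to be a SparkContext):

    val messages = sc.textFile("hdfs://.../log")
      .filter(_.startsWith("ERROR"))
      .map(_.split('\t')(2))

    println(messages.toDebugString)
    // Prints the chain of parent RDDs (e.g. MapPartitionsRDD <- ... <- HadoopRDD);
    // Spark replays this chain to recompute any lost partition.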

30 Example: Logistic Regression Goal: find the best line separating two sets of points (Diagram: a scatter of + and – points, a random initial line, and the target separating line.) Background MapReduce Spark

31 Example: Logistic Regression

    val data = spark.textFile(...).map(readPoint).cache()   // keep variable "data" in memory
    var w = Vector.random(D)

    for (i <- 1 to ITERATIONS) {
      val gradient = data.map(p =>
        (1 / (1 + exp(-p.y * (w dot p.x))) - 1) * p.y * p.x
      ).reduce(_ + _)
      w -= gradient
    }

    println("Final w: " + w)

Background MapReduce Spark

32 Logistic Regression Performance Hadoop: 127 s / iteration. Spark: 174 s for the first iteration, 6 s for further iterations. Background MapReduce Spark

33 Spark Programming Interface (e.g., PageRank)

34 Representing RDDs

35 Spark Scheduler Dryad-like DAGs Pipelines functions within a stage Cache-aware work reuse & locality Partitioning-aware to avoid shuffles (Figure legend: cached data partition.) Background MapReduce Spark

36 Behavior with Not Enough RAM

37 Fault Recovery Results

38 Conclusion Both MapReduce and Spark are excellent big-data systems: scalable, fault-tolerant, and programmer-friendly. In particular, Spark provides a more effective approach for iterative computing jobs. Background MapReduce Spark

39 QUESTIONS?

