COP5725 Advanced Database Systems: MapReduce (Spring 2016, Tallahassee, Florida)

What is MapReduce? Programming model – for expressing distributed computations at massive scale – “the computation takes a set of input key/value pairs, and produces a set of output key/value pairs. The user of the MapReduce library expresses the computation as two functions: map and reduce” Execution framework – for organizing and performing data-intensive computations – processes parallelizable problems across huge datasets using a large number of computers (nodes) Open-source implementations: Hadoop and others

How Much Data? Google processes 20 PB a day (2008); Facebook has 2.5 PB of user data + 15 TB/day (4/2009); eBay has 6.5 PB of user data + 50 TB/day (5/2009); CERN’s LHC (Large Hadron Collider) will generate 15 PB a year (“640K ought to be enough for anybody”)

Who Cares? Ready-made large-data problems – Lots of user-generated content, and even more user-behavior data (examples: Facebook friend suggestions, Google ad placement) – Business intelligence: gather everything in a data warehouse and run analytics to generate insight Utility computing – Provision Hadoop clusters on demand in the cloud – Lower barriers to entry for tackling large-data problems – Commoditization and democratization of large-data capabilities

Spread Work Over Many Machines Challenges – Workload partitioning: how do we assign work units to workers? – Load balancing: what if we have more work units than workers? – Synchronization: what if workers need to share partial results? – Aggregation: how do we aggregate partial results? – Termination: how do we know all the workers have finished? – Fault tolerance: what if workers die? Common theme – Communication between workers (e.g., to exchange state) – Access to shared resources (e.g., data) We need a synchronization mechanism

Current Methods Programming models – Shared memory (pthreads) – Message passing (MPI) Design patterns – Master-slaves – Producer-consumer flows – Shared work queues (Diagrams: processes P1–P5 communicating via shared memory vs. message passing; master-slaves, producer-consumer, and work-queue arrangements)

Problem with Current Solutions Lots of programming work – communication and coordination – workload partitioning – status reporting – optimization – locality Repeat for every problem you want to solve Stuff breaks – One server may stay up three years (1,000 days) – If you have 10,000 servers, expect to lose 10 a day

What We Need A distributed system that is – Scalable – Fault-tolerant – Easy to program – Applicable to many problems – …

How Do We Scale Up? Divide and conquer (Diagram: the “work” is partitioned into units w1, w2, w3; each unit is handled by a worker, producing partial results r1, r2, r3, which are combined into the final “result”)

General Ideas Iterate over a large number of records, extract something of interest from each (Map), shuffle and sort intermediate results, aggregate intermediate results (Reduce), and generate the final output Key idea: provide a functional abstraction for these two operations – map (k, v) → [(k', v')] – reduce (k', [v']) → [(k', v'')] All values with the same key are sent to the same reducer – The execution framework handles everything else…
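To make the abstraction concrete, here is a minimal, single-machine sketch in plain Python (not a real framework API; the names run_mapreduce, mapper, and reducer are illustrative) of what the execution framework does on the user's behalf: apply map to every record, group intermediate values by key, and hand each group to reduce.

    from collections import defaultdict

    def run_mapreduce(records, mapper, reducer):
        """Toy, in-memory model of the MapReduce execution framework."""
        # Map phase: apply the user's mapper to every (key, value) record.
        intermediate = []
        for k, v in records:
            intermediate.extend(mapper(k, v))
        # Shuffle and sort: group all intermediate values by key.
        groups = defaultdict(list)
        for k2, v2 in intermediate:
            groups[k2].append(v2)
        # Reduce phase: apply the user's reducer to each key and its values.
        output = []
        for k2 in sorted(groups):
            output.extend(reducer(k2, groups[k2]))
        return output

A real framework runs the map and reduce calls on many machines and moves the intermediate data over the network; only the data flow is the same. The word count functions sketched later plug directly into run_mapreduce.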

General Ideas (Diagram: input pairs (k1, v1) … (k6, v6) pass through map, producing intermediate pairs such as (a, 1), (b, 2) from one mapper and (c, 3), (c, 6) from another; the shuffle-and-sort stage aggregates values by key, e.g. (a, [1, 5]), (b, [2, 7]), (c, [2, 3, 6, 8]); reduce then emits the final results (r1, s1), (r2, s2), (r3, s3))

Two More Functions Apart from map and reduce, does the execution framework really handle everything else? Not quite… usually, programmers can also specify: – partition (k', number of partitions) → partition for k' Divides up the key space for parallel reduce operations Often a simple hash of the key, e.g., hash(k') mod n – combine (k', [v']) → [(k', v'')] Mini-reducers that run in memory after the map phase Used as an optimization to reduce network traffic
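A minimal sketch of these two hooks, again in plain Python with illustrative names (hash_partition and combine_counts are not part of any real API): the partitioner routes a key to one of n reducers, and the combiner performs a per-mapper mini-reduce before anything is sent over the network.

    from collections import defaultdict

    def hash_partition(key, num_partitions):
        """Assign a key to a reduce partition: often just hash(k') mod n."""
        return hash(key) % num_partitions

    def combine_counts(pairs):
        """Mini-reducer applied to one mapper's output: sum counts locally
        so that fewer (key, value) pairs cross the network."""
        sums = defaultdict(int)
        for key, value in pairs:
            sums[key] += value
        return list(sums.items())

    # One mapper's raw word-count output...
    mapper_output = [("the", 1), ("dog", 1), ("the", 1), ("the", 1)]
    # ...shrinks to [("the", 3), ("dog", 1)] after local combining; each
    # pair is then routed with hash_partition(key, num_partitions).
    print(combine_counts(mapper_output))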

(Diagram: the same map/shuffle/reduce pipeline with combine and partition added; for example, one mapper's output (c, 3), (c, 6) is combined locally into (c, 9), so after shuffle and sort the reducer for key c receives [2, 9, 8] instead of [2, 3, 6, 8])

Motivation for Local Aggregation Ideal scaling characteristics: – Twice the data, twice the running time – Twice the resources, half the running time Why can’t we achieve this? – Synchronization requires communication – Communication kills performance Thus… avoid communication! – Reduce intermediate data via local aggregation – Combiners can help

Word Count v1.0 Input: key-value pairs {document id → document text} Output: for each word appearing in the input, the number of times it occurs, i.e., pairs <word, count>

(Walk-through: each mapper emits <word, 1> for every word in its documents; grouping by reduce key then yields, e.g., <“obama”, {1}>, <“the”, {1, 1}>, <“is”, {1, 1, 1}>, and each reducer sums its list to produce the final counts)
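A sketch of what the Word Count v1.0 mapper and reducer might look like, written as plain Python functions rather than against a real Hadoop API; the small driver at the bottom only stands in for the framework's shuffle so the example runs end to end.

    from collections import defaultdict

    def wc_map(doc_id, text):
        """Emit <word, 1> for every word in the document."""
        for word in text.split():
            yield (word.lower(), 1)

    def wc_reduce(word, counts):
        """All counts for one word arrive together; sum them."""
        yield (word, sum(counts))

    # Tiny local driver standing in for the framework.
    docs = [("d1", "the cat sat"), ("d2", "the cat ate")]
    groups = defaultdict(list)
    for doc_id, text in docs:
        for word, one in wc_map(doc_id, text):
            groups[word].append(one)
    for word in sorted(groups):
        print(list(wc_reduce(word, groups[word])))  # e.g. [('cat', 2)]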

Word Count v2.0

Word Count v3.0 Key idea: preserve state across input key-value pairs!
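The "preserve state" idea is commonly called in-mapper combining: the mapper keeps a dictionary of partial counts across all the input pairs it sees and emits them only once, at the end of the task (in Hadoop this would typically happen in the mapper's cleanup step). A local sketch of the pattern, with illustrative names:

    from collections import defaultdict

    class InMapperCombiningWordCount:
        """Mapper that holds partial counts across *all* of its input
        key-value pairs and emits them once, when the task finishes."""

        def __init__(self):
            self.counts = defaultdict(int)   # state preserved across pairs

        def map(self, doc_id, text):
            for word in text.split():
                self.counts[word.lower()] += 1   # no per-word emits here

        def close(self):
            # Emit one <word, partial count> pair per distinct word.
            return list(self.counts.items())

    mapper = InMapperCombiningWordCount()
    mapper.map("d1", "the cat sat")
    mapper.map("d2", "the cat ate")
    print(sorted(mapper.close()))  # [('ate', 1), ('cat', 2), ('sat', 1), ('the', 2)]

Compared with v1.0 this emits far fewer intermediate pairs, at the cost of the memory needed to hold the dictionary inside the mapper.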

Combiner Design Combiners and reducers share the same method signature – Sometimes, reducers can serve as combiners – Often, not… Remember: combiners are optional optimizations – Should not affect algorithm correctness – May be run 0, 1, or multiple times Example: find the average of all integers associated with the same key

Computing the Mean v1.0 Why can’t we use the reducer as a combiner? (The mean is not associative: averaging partial averages does not, in general, give the correct overall average.)

Computing the Mean v2.0 Why doesn’t this work? Combiners must have the same input and output key-value types, which must also match the mapper’s output type and the reducer’s input type, because the combiner may run zero, one, or multiple times.

Computing the Mean v3.0

Computing the Mean v4.0
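The code for these versions was shown on the original slides and is not reproduced here; the sketch below shows the standard way to make the combiner safe, which the later versions presumably follow: every stage passes (sum, count) pairs, so the combiner's input and output types match and it may run any number of times, and only the final reducer divides.

    def mean_map(key, value):
        # Emit a (sum, count) pair instead of the raw value, so mapper,
        # combiner, and reducer all use the same intermediate type.
        yield (key, (value, 1))

    def mean_combine(key, pairs):
        s = sum(p[0] for p in pairs)
        c = sum(p[1] for p in pairs)
        yield (key, (s, c))          # same type in, same type out

    def mean_reduce(key, pairs):
        s = sum(p[0] for p in pairs)
        c = sum(p[1] for p in pairs)
        yield (key, s / c)           # divide only at the very end

    # Quick check with values 1, 3, 5 for key "a":
    partial = list(mean_combine("a", [(1, 1), (3, 1)])) + list(mean_combine("a", [(5, 1)]))
    print(list(mean_reduce("a", [p[1] for p in partial])))   # [('a', 3.0)]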

MapReduce Runtime Handles scheduling – Assigns workers to map and reduce tasks Handles “data distribution” – Moves processes to the data Handles synchronization – Gathers, sorts, and shuffles intermediate data Handles errors and faults – Detects worker failures and restarts their tasks Everything happens on top of a distributed file system

Execution Overview (Diagram, following the original MapReduce paper: (1) the user program submits the job to the master; (2) the master schedules map tasks and reduce tasks on workers; (3) map workers read the input splits; (4) map workers write intermediate files to their local disks; (5) reduce workers read those intermediate files remotely; (6) reduce workers write the output files. Data flows from input files, through the map phase, to intermediate files on local disk, through the reduce phase, to output files.)

Implementation Google has a proprietary implementation in C++ – Bindings in Java, Python Hadoop is an open-source implementation in Java – Development led by Yahoo, used in production – Now an Apache project – Rapidly expanding software ecosystem Lots of custom research implementations – For GPUs, cell processors, etc.

Distributed File System Don’t move data to workers… move workers to the data! – Store data on the local disks of the nodes in the cluster – Start workers on the nodes that hold the data locally Why? – Not enough RAM to hold all the data in memory – Disk access is slow, but disk throughput (data transfer rate) is reasonable A distributed file system is the answer – GFS (Google File System) for Google’s MapReduce – HDFS (Hadoop Distributed File System) for Hadoop

GFS: Assumptions Commodity hardware over “exotic” hardware – Scale “out”, not “up” Scale out (horizontally): add more nodes to a system Scale up (vertically): add resources to a single node High component failure rates – Inexpensive commodity components fail all the time “Modest” number of huge files – Multi-gigabyte files are common, if not encouraged Files are write-once, mostly appended to – Perhaps concurrently Large streaming reads over random access – High sustained throughput over low latency

Seeks vs. Scans Consider a 1 TB database with 100-byte records – We want to update 1 percent of the records Scenario 1: random access – Each update takes ~30 ms (seek, read, write) – 10^8 updates = ~35 days Scenario 2: rewrite all records – Assume 100 MB/s throughput – Time = 5.6 hours(!) Lesson: avoid random seeks!
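The figures on the slide follow directly from the stated assumptions; a quick back-of-the-envelope check (the 5.6-hour number assumes the rewrite both reads and writes the full terabyte):

    # 1 TB of 100-byte records, updating 1% of them.
    records = 1e12 / 100                  # 10^10 records
    updates = 0.01 * records              # 10^8 updates

    # Scenario 1: random access, ~30 ms per update (seek + read + write).
    random_days = updates * 0.030 / 86_400
    print(f"random access: ~{random_days:.0f} days")       # ~35 days

    # Scenario 2: sequentially read and rewrite the whole 1 TB at 100 MB/s.
    sequential_hours = 2 * 1e12 / 100e6 / 3600
    print(f"full rewrite:  ~{sequential_hours:.1f} hours")  # ~5.6 hours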

GFS: Design Files stored as chunks – Fixed size (64 MB) Reliability through replication – Each chunk replicated across 3+ chunk servers Single master to coordinate access and keep metadata – Simple centralized management No data caching – Little benefit, due to large datasets and streaming reads Simplified API – Push some of the issues onto the client (e.g., data layout)

Relational Databases vs. MapReduce Relational databases: – Multipurpose: analysis and transactions; batch and interactive – Data integrity via ACID transactions – Lots of tools in the software ecosystem (for ingesting, reporting, etc.) – Supports SQL (and SQL integration, e.g., JDBC) – Automatic SQL query optimization MapReduce (Hadoop): – Designed for large clusters, fault tolerant – Data is accessed in “native format” – Supports many query languages – Programmers retain control over performance – Open source

Workloads OLTP (online transaction processing) – Typical applications: e-commerce, banking, airline reservations – User-facing: real-time, low-latency, highly concurrent – Tasks: relatively small set of “standard” transactional queries – Data access pattern: random reads, updates, writes (involving relatively small amounts of data) OLAP (online analytical processing) – Typical applications: business intelligence, data mining – Back-end processing: batch workloads, less concurrency – Tasks: complex analytical queries, often ad hoc – Data access pattern: table scans, large amounts of data involved per query

OLTP/OLAP Integration OLTP database for user-facing transactions – Retain records of all activity – Periodic ETL (e.g., nightly) Extract-Transform-Load (ETL) – Extract records from source – Transform: clean data, check integrity, aggregate, etc. – Load into OLAP database OLAP database for data warehousing – Business intelligence: reporting, ad hoc queries, data mining, etc. – Feedback to improve OLTP services

Relational Algebra in MapReduce Projection – Map over tuples, emit new tuples with appropriate attributes – No reducers, unless for regrouping or resorting tuples – Alternatively: perform in reducer, after some other processing Selection – Map over tuples, emit only tuples that meet criteria – No reducers, unless for regrouping or resorting tuples – Alternatively: perform in reducer, after some other processing
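Both operators are naturally map-only; a minimal sketch with tuples modeled as Python dicts (the attribute names and the predicate are purely illustrative):

    def project_map(key, tup):
        # Projection: emit a new tuple with only the desired attributes.
        yield (key, {"url": tup["url"], "time": tup["time"]})

    def select_map(key, tup):
        # Selection: emit the tuple only if it satisfies the predicate.
        if tup["time"] > 60:
            yield (key, tup)

    visits = [(1, {"user": "u1", "url": "a.html", "time": 90}),
              (2, {"user": "u2", "url": "b.html", "time": 10})]
    print([t for k, v in visits for t in project_map(k, v)])  # both tuples, two attributes each
    print([t for k, v in visits for t in select_map(k, v)])   # only the 90-second visit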

Relational Algebra in MapReduce Group by – Example: What is the average time spent per URL? – In SQL: SELECT url, AVG(time) FROM visits GROUP BY url – In MapReduce: Map over tuples, emit time, keyed by url Framework automatically groups values by keys Compute average in reducer Optimize with combiners
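A sketch of how that query maps onto the model; the visits data and the grouping loop stand in for real input and for the framework's shuffle.

    from collections import defaultdict

    def visit_map(rowid, visit):
        # Emit time keyed by url, so the framework groups values by URL.
        yield (visit["url"], visit["time"])

    def avg_reduce(url, times):
        # Compute AVG(time) for one URL group.
        yield (url, sum(times) / len(times))

    visits = [(1, {"url": "a.html", "time": 90}),
              (2, {"url": "a.html", "time": 30}),
              (3, {"url": "b.html", "time": 10})]
    groups = defaultdict(list)
    for rowid, visit in visits:
        for url, t in visit_map(rowid, visit):
            groups[url].append(t)
    print({url: next(avg_reduce(url, ts))[1] for url, ts in groups.items()})
    # {'a.html': 60.0, 'b.html': 10.0}

A combiner for this job would use the (sum, count) representation from the mean example above, since partial averages cannot simply be averaged again.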

Join in MapReduce Reduce-side Join: group by join key – Map over both sets of tuples – Emit each tuple as the value, with the join key as the intermediate key – Execution framework brings together tuples sharing the same key – Perform the actual join in the reducer – Similar to a “sort-merge join” in database terminology

Reduce-side Join: Example (Diagram: tuples R1 and R4 from relation R and S2 and S3 from relation S are mapped to <join key, tuple> pairs; after the shuffle, each reducer receives all R and S tuples sharing a join key and joins them. Note: there is no guarantee whether the R tuples or the S tuples arrive first.)
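A sketch of the pattern: each tuple is tagged with the relation it came from, because, as the slide notes, the reducer cannot rely on R tuples arriving before S tuples (relation names and attributes are illustrative).

    from collections import defaultdict

    def join_map(relation, tup):
        # Tag every tuple with its relation; key it by the join attribute.
        yield (tup["key"], (relation, tup))

    def join_reduce(join_key, tagged_tuples):
        # Separate the two relations, then form the cross product.
        r_side = [t for tag, t in tagged_tuples if tag == "R"]
        s_side = [t for tag, t in tagged_tuples if tag == "S"]
        for r in r_side:
            for s in s_side:
                yield (join_key, (r, s))

    R = [{"key": 1, "a": "r1"}, {"key": 2, "a": "r2"}]
    S = [{"key": 1, "b": "s1"}, {"key": 1, "b": "s2"}]
    groups = defaultdict(list)
    for rel, tuples in (("R", R), ("S", S)):
        for t in tuples:
            for k, v in join_map(rel, t):
                groups[k].append(v)
    for k, tagged in groups.items():
        print(list(join_reduce(k, tagged)))   # key 1 joins r1 with s1 and s2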

Join in MapReduce Map-side Join: parallel scans – Assume the two datasets are sorted by the join key (Diagram: tuples R1–R4 and S1–S4, each relation sorted by join key; a single sequential scan through both datasets performs the join, called a “merge join” in database terminology)

Join in MapReduce Map-side Join – If the datasets are sorted by the join key, the join can be accomplished by a scan over both datasets – How can we accomplish this in parallel? Partition and sort both datasets in the same manner – In MapReduce: map over one dataset and read from the other dataset’s corresponding partition No reducers necessary (unless to repartition or resort)
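A sketch of the per-partition merge step each mapper would perform, assuming both of its inputs are already sorted by join key (a classic two-pointer merge; duplicate keys are handled by materializing the matching runs on both sides):

    from itertools import groupby
    from operator import itemgetter

    def merge_join(r_sorted, s_sorted):
        """Join two lists of (key, value) pairs, each sorted by key."""
        r_groups = groupby(r_sorted, key=itemgetter(0))
        s_groups = groupby(s_sorted, key=itemgetter(0))
        r_key, r_run = next(r_groups, (None, None))
        s_key, s_run = next(s_groups, (None, None))
        while r_key is not None and s_key is not None:
            if r_key == s_key:
                r_vals = [v for _, v in r_run]   # materialize matching runs
                s_vals = [v for _, v in s_run]
                for rv in r_vals:
                    for sv in s_vals:
                        yield (r_key, rv, sv)
                r_key, r_run = next(r_groups, (None, None))
                s_key, s_run = next(s_groups, (None, None))
            elif r_key < s_key:
                r_key, r_run = next(r_groups, (None, None))   # advance R
            else:
                s_key, s_run = next(s_groups, (None, None))   # advance S

    R = [(1, "r1"), (2, "r2"), (4, "r4")]
    S = [(1, "s1"), (1, "s1b"), (3, "s3"), (4, "s4")]
    print(list(merge_join(R, S)))   # joins on keys 1 and 4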

Join in MapReduce In-memory Join – Basic idea: load one dataset into memory and stream over the other dataset Works if R << S and R fits into memory Called a “hash join” in database terminology – MapReduce implementation Distribute R to all nodes Map over S; each mapper loads R into memory, hashed by join key For every tuple in S, look up the join key in R No reducers, unless for regrouping or resorting tuples
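A sketch of the mapper-side hash join, assuming the smaller relation R has already been distributed to every node (for example as a local file) and fits in memory; attribute names are illustrative.

    from collections import defaultdict

    def build_hash_table(r_tuples):
        """Load the small relation R into memory, hashed by join key."""
        table = defaultdict(list)
        for t in r_tuples:
            table[t["key"]].append(t)
        return table

    def hash_join_map(s_tuple, r_table):
        """Map over S: probe the in-memory table for matching R tuples."""
        for r in r_table.get(s_tuple["key"], []):
            yield (s_tuple["key"], (r, s_tuple))

    R = [{"key": 1, "a": "r1"}, {"key": 3, "a": "r3"}]
    S = [{"key": 1, "b": "s1"}, {"key": 2, "b": "s2"}, {"key": 3, "b": "s3"}]
    r_table = build_hash_table(R)
    for s in S:                          # S is streamed; R stays in memory
        for out in hash_join_map(s, r_table):
            print(out)                   # keys 1 and 3 produce joined pairs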

Which Join Algorithm to Use? In typical performance: in-memory join > map-side join > reduce-side join – Why? What are the limitations of each? – In-memory join: limited by memory – Map-side join: requires a particular sort order and partitioning – Reduce-side join: general purpose, with no special requirements

Processing Relational Data: Summary MapReduce algorithms for processing relational data: – Group by, sorting, and partitioning are handled automatically by shuffle/sort in MapReduce – Selection, projection, and other computations (e.g., aggregation) are performed either in the mapper or the reducer – Multiple strategies for relational joins Complex operations require multiple MapReduce jobs – Example: top ten URLs in terms of average time spent – Opportunities for automatic optimization