CS-4513 Distributed Computing Systems Hugh C. Lauer


1 CS-4513 Distributed Computing Systems Hugh C. Lauer
Map-Reduce
Assumptions:
- Graduate-level operating systems
- Making choices about operating systems
(Why a micro-century? …just about enough time for one concept.)

2 MapReduce
A programming model and implementation for processing very large data sets
- Many terabytes, on clusters of distributed computers
- Supports a broad variety of real-world tasks
- Foundation of Google’s applications

3 Why MapReduce?
- An important new model for distributed and parallel computing
- Fundamentally different from traditional models of parallelism: data parallelism, task parallelism, pipelined parallelism
- An abstraction to automate the mechanics of data handling and let the programmer concentrate on the semantics of the problem

4 Last Year in CS-4513
Divided the class into four teams; each team researched and taught one aspect:
- The abstraction itself and its algorithms
- Distributed MapReduce
- The class of problems that MapReduce can help solve
- The Google File System that supports MapReduce
Today’s material is drawn from those presentations.

5 Google Cluster
- 1000’s of PC-class systems: dual-processor x86, 4-8 GB RAM, commodity disks
- High-speed interconnect (Mb/sec)
- Distributed, replicated file system, optimized for GByte-size files: reading and appending
- Non-negligible failure rates

6 Typical Applications
- Search TBytes for words or phrases
- Create Page Rank among pages
- Conceptually simple, yet devilishly difficult to implement in a distributed environment

7 Basic Abstraction
- Partition the application into two functions, Map and Reduce, both written by the programmer
- Let the system partition execution among distributed platforms: scheduling, communication, synchronization, fault tolerance, reliability, etc.
- As of January 2008: 10,000 separate MapReduce programs developed within Google; 100,000 MapReduce jobs per day; 20 Petabytes of data processed per day

8 Map and Reduce
- Map – written by programmer: takes input key-value pairs; generates a set of intermediate key-value pairs
- System: organizes intermediate pairs by key
- Reduce – written by programmer: processes or merges all values for a given key; iterates through all keys

9 Example – Count Occurrences of Words in Collection of Documents
Pseudo-code:

  map(String key, String value):
    // key: document name
    // value: document contents
    for each word w in value:
      EmitIntermediate(w, "1");

Note: key is not used in this simple application.

10 Example – Count Occurrences of Words (continued)
Pseudo-code:

  reduce(String key, Iterator values):
    // key: a word
    // values: a list of counts
    int result = 0;
    for each v in values:
      result += ParseInt(v);
    Emit(AsString(result));
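To make the two functions concrete, here is a minimal, self-contained C++ sketch that runs the same word-count logic on a single machine. It is an illustration, not Google's MapReduce library: the EmitIntermediate/Emit names follow the pseudo-code above, while the in-memory std::map "shuffle" and the tiny main() driver are assumptions made purely for demonstration.

  // word_count_sketch.cpp -- single-machine illustration of the word-count
  // Map/Reduce pair from the slides; not the distributed implementation.
  #include <iostream>
  #include <map>
  #include <sstream>
  #include <string>
  #include <utility>
  #include <vector>

  // Intermediate store: stands in for the cross-machine shuffle the real library does.
  static std::map<std::string, std::vector<std::string>> intermediate;

  // Called by Map to emit an intermediate (key, value) pair.
  void EmitIntermediate(const std::string& word, const std::string& count) {
    intermediate[word].push_back(count);
  }

  // Map: document name + contents -> (word, "1") for every word.
  void Map(const std::string& key, const std::string& value) {
    std::istringstream words(value);
    std::string w;
    while (words >> w) EmitIntermediate(w, "1");
  }

  // Reduce: one word plus all its emitted counts -> total count for that word.
  void Reduce(const std::string& key, const std::vector<std::string>& values) {
    int result = 0;
    for (const std::string& v : values) result += std::stoi(v);
    std::cout << key << "\t" << result << "\n";  // Emit(AsString(result)), plus the key for readability
  }

  int main() {
    // Toy "documents" standing in for the input splits.
    std::vector<std::pair<std::string, std::string>> docs = {
        {"doc1", "the quick brown fox"},
        {"doc2", "the lazy dog and the fox"}};
    for (const auto& d : docs) Map(d.first, d.second);                 // map phase
    for (const auto& kv : intermediate) Reduce(kv.first, kv.second);   // reduce phase
    return 0;
  }

Running it prints each distinct word with its total count; the distributed system described in the following slides performs exactly these steps, but spread across many machines.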

11 Example – Count Occurrences of Words (continued)
The MapReduce specification gives:
- Names of input and output files
- Tuning parameters
It is expressed as a C++ main() function, linked with the MapReduce library.

12 Full C++ Text of Word Frequency Application
Approximately 70 lines of C++ code.
Dean, J. and Ghemawat, S., “MapReduce: Simplified Data Processing on Large Clusters,” in Proceedings of the Symposium on Operating Systems Design and Implementation (OSDI), San Francisco, CA, 2004.
Note: this paper is an earlier version of the CACM paper distributed to the class; it contains some details not included in the CACM paper.

13 Other Examples
- Distributed grep: key is the pattern to search for; values are the lines to search
- Count of URL access frequency: similar to word count; computed from web access logs
- Reverse web-link graph: obtain the list of sources for each URL target
- Large-scale indexing: the Google production search service
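As one concrete instance, here is a minimal single-machine C++ sketch of the reverse web-link graph example, under the same assumptions as the earlier word-count sketch (an illustration, not Google's code): Map emits a (target, source) pair for every link found on a source page, and Reduce lists all sources that point at each target.

  // reverse_links_sketch.cpp -- single-machine illustration of the
  // reverse web-link graph example; not the distributed implementation.
  #include <iostream>
  #include <map>
  #include <string>
  #include <vector>

  static std::map<std::string, std::vector<std::string>> intermediate;

  // Map: (source page, links found on that page) -> (target, source) pairs.
  void Map(const std::string& source, const std::vector<std::string>& targets) {
    for (const std::string& target : targets)
      intermediate[target].push_back(source);  // EmitIntermediate(target, source)
  }

  // Reduce: (target, list of sources) -> one line listing every page that links to target.
  void Reduce(const std::string& target, const std::vector<std::string>& sources) {
    std::cout << target << " <-";
    for (const std::string& s : sources) std::cout << " " << s;
    std::cout << "\n";
  }

  int main() {
    // Toy crawl: each call is one page and the pages it links to.
    Map("a.html", {"b.html", "c.html"});
    Map("b.html", {"c.html"});
    for (const auto& kv : intermediate) Reduce(kv.first, kv.second);
    return 0;
  }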

14 What It Does
Map: (k1, v1) → list(k2, v2)
Reduce: (k2, list(v2)) → list(v2)
MapReduce library:
- Converts input arguments to many (k1, v1) pairs; calls Map for each pair
- Reorganizes the intermediate lists from Map
- Calls Reduce for each intermediate key k2
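A hedged sketch of what the library side does with those types is below. The template is an illustrative stand-in for the driver loop, not the real library's interface: it feeds each (k1, v1) pair to Map, groups the emitted (k2, v2) pairs by k2, and hands each group to Reduce.

  // mapreduce_skeleton.cpp -- illustrative, single-process stand-in for the
  // library's driver loop; types mirror Map: (k1,v1) -> list(k2,v2) and
  // Reduce: (k2, list(v2)) -> list(v2).
  #include <functional>
  #include <iostream>
  #include <map>
  #include <sstream>
  #include <string>
  #include <utility>
  #include <vector>

  template <typename K1, typename V1, typename K2, typename V2>
  std::map<K2, std::vector<V2>> MapReduce(
      const std::vector<std::pair<K1, V1>>& input,
      std::function<std::vector<std::pair<K2, V2>>(const K1&, const V1&)> map_fn,
      std::function<std::vector<V2>(const K2&, const std::vector<V2>&)> reduce_fn) {
    // Shuffle: group all intermediate values by intermediate key k2.
    std::map<K2, std::vector<V2>> grouped;
    for (const auto& [k1, v1] : input)
      for (const auto& [k2, v2] : map_fn(k1, v1)) grouped[k2].push_back(v2);
    // Reduce each group; the output is list(v2) per key.
    std::map<K2, std::vector<V2>> output;
    for (const auto& [k2, values] : grouped) output[k2] = reduce_fn(k2, values);
    return output;
  }

  int main() {
    // Word count expressed against the skeleton.
    std::vector<std::pair<std::string, std::string>> docs = {
        {"doc1", "a b a"}, {"doc2", "b"}};
    auto result = MapReduce<std::string, std::string, std::string, int>(
        docs,
        [](const std::string&, const std::string& text) {
          std::vector<std::pair<std::string, int>> out;
          std::istringstream in(text);
          std::string w;
          while (in >> w) out.push_back({w, 1});
          return out;
        },
        [](const std::string&, const std::vector<int>& counts) {
          int sum = 0;
          for (int c : counts) sum += c;
          return std::vector<int>{sum};
        });
    for (const auto& [word, totals] : result)
      std::cout << word << "\t" << totals.front() << "\n";
    return 0;
  }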

15 Brute-Force Implementation
(Diagram: overall execution flow of a MapReduce job, stepped through on the following slides.)

16 Brute-Force Implementation
Step 0: Split the input files into pieces of 16-64 MBytes each.

17 Brute-Force Implementation
Step 1: Fork the user program
- Many distributed processes, scattered across the cluster
- One designated as the Master

18 Brute-Force Implementation
Step 2: Master
- Assigns worker tasks
- Manages results
- Monitors behavior & faults

19 Brute-Force Implementation
Step 3: Map workers
- Read input splits via GFS
- Parse key-value pairs
- Pass each pair to the Map function
- Buffer output in local memory

20 Brute-Force Implementation
Step 4: Intermediate files
- Written to local disk (via GFS)
- Master notified

21 Brute-Force Implementation
Step 5: Reduce workers
- Read intermediate data (streaming)
- Sort it by intermediate key

22 Brute-Force Implementation
Step 6: Call the Reduce function
- Once for each key, with its list of values
- Writes the output file
- Notifies the master

23 Result
One output file for each Reduce worker, which is then either:
- Combined by the application program, or
- Passed to another MapReduce call, or
- Passed to another distributed application

24 Questions? This presentation is stored at //

25 Distributed System Issues
- Fault tolerance
- Distributed file access
- Scalable performance

26 Managing Faults and Failures
- In a cluster of 1800 nodes, there will always be a handful of failures
- Question: with 1800 hard drives, each with a 100,000-hour MTBF, what is the mean time between drive failures across the cluster? (See the worked estimate below.)
- Some processors may be “slow” – called stragglers:
  - Intermittent memory or bus errors
  - Recoverable disk or network errors
  - Over-scheduling by the system
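A back-of-the-envelope answer, assuming independent failures: with 1,800 drives each rated at 100,000 hours MTBF, the cluster-wide mean time between drive failures is roughly 100,000 / 1,800 ≈ 56 hours. In other words, some drive in the cluster fails about every two to three days, which is why the system must treat failure as routine.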

27 Managing Faults and Failures (continued)
- The Master task periodically pings worker tasks
  - If no response, it starts a new worker task with the same responsibility
  - The new worker reads data from a different replica
- It also starts backup tasks for stragglers – just in case!
  - Whichever task finishes first “wins”; the other task(s) are shut down
- Performance penalty for backup tasks: a few percent loss in system resources, for an enormous improvement in response time

28 Questions?

29 Google File System Assumptions
- System failures are the norm
- The system stores mostly large (multi-gigabyte) files
- Expected read operations:
  - Large streaming accesses (> 1 MByte per access)
  - Few random accesses (a few KB out of someplace random)
- Expected write operations:
  - Long appending writes
  - Multiple clients appending concurrently
  - Updates in place to the middle of a file are extremely rare … and expensive
- Bandwidth trumps latency

30 Google File System (continued)
- One Master server per cluster
- Many chunk servers in each cluster
- Clients

31 Google File System (continued)
- Files are partitioned into 64-MByte chunks
- Each chunk is replicated across chunk servers
  - A chunk server stores its chunks as traditional Linux files on a node of the cluster
  - At least three replicas per chunk (on different servers)
  - Dynamic re-replication if a chunk server fails
- No caching of file data (not useful in streaming!)
(The sketch below illustrates the chunk-index arithmetic implied by this partitioning.)
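For a sense of how a client locates data under this scheme, here is a minimal sketch of the offset-to-chunk arithmetic implied by the 64-MByte partitioning above. It is an illustration only, not the real GFS client library: the client would send the (file name, chunk index) pair to the master, which returns the chunk handle and replica locations.

  // chunk_index_sketch.cpp -- illustrative arithmetic for mapping a file offset
  // to a GFS-style chunk index; not the actual GFS client code.
  #include <cstdint>
  #include <iostream>

  constexpr std::uint64_t kChunkSize = 64ULL * 1024 * 1024;  // 64 MBytes per chunk

  // A byte offset within a file falls in exactly one fixed-size chunk.
  std::uint64_t ChunkIndex(std::uint64_t byte_offset) {
    return byte_offset / kChunkSize;
  }

  int main() {
    std::uint64_t offset = 5ULL * 1024 * 1024 * 1024;  // a read at the 5-GByte mark
    std::cout << "offset " << offset << " falls in chunk " << ChunkIndex(offset)
              << " (the master then supplies that chunk's handle and replica locations)\n";
    return 0;
  }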

32 Google File System (continued)
- The Master maintains metadata & chunk info; it also performs garbage collection
- All data transactions are between clients and chunk servers
  - Transactions between a client and the Master are for control and info only
- Atomic transactions, replicated log
- The Master can be restarted on a different node as necessary; so can chunk servers

33 Reference
Ghemawat, Sanjay, Gobioff, Howard, and Leung, Shun-Tak, “The Google File System,” in Proceedings of the 2003 Symposium on Operating Systems Principles (SOSP), Bolton Landing (Lake George), NY, October 2003.

34 Additional Reference
Dean, Jeffrey, and Ghemawat, Sanjay, “MapReduce: Simplified Data Processing on Large Clusters,” Communications of the ACM, vol. 51, no. 1, January 2008.

35 Questions?

