CSC 660: Advanced Operating Systems Slide #1
CSC 660: Advanced OS: Concurrent Programming

CSC 660: Advanced Operating Systems Slide #2 Topics
1. Multi-core processors
2. Types of concurrency
3. Threads
4. MapReduce

CSC 660: Advanced Operating Systems Slide #3 Multi-Core Processors
Building multiple processors on one die.
Multi-core vs. SMP:
– Cheaper.
– May share cache/bus.
– Faster communication.
(Figure: Intel Core 2 Duo architecture)

CSC 660: Advanced Operating Systems Slide #4 Why Multi-Core?

CSC 660: Advanced Operating Systems Slide #5 Amdahl's Law
The speedup of a computation, when a proportion P of it is improved by a factor S, is given by
Speedup = 1 / ((1 - P) + P/S).
The graph shows the speedup for computations where 10%, 20%, 50%, or 100% of the computation is parallelizable.
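For example, if 90% of a computation is parallelizable (P = 0.9) and that portion is improved by a factor of S = 8, the overall speedup is 1 / (0.1 + 0.9/8), roughly 4.7; even as S grows without bound the speedup can never exceed 1 / (1 - P) = 10.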

CSC 660: Advanced Operating Systems Slide #6 Concurrency vs. Parallelism
Concurrency: logically simultaneous processing; does not require multiple processors.
Parallelism: physically simultaneous processing; requires multiple processors.

CSC 660: Advanced Operating Systems Slide #7 Parallel Programming Paradigms
Shared memory: communicate by altering shared variables. Requires synchronization. Faster, but harder to reason about.
Message passing: communicate by exchanging messages. Doesn't require synchronization. Safer and easier to reason about.

CSC 660: Advanced Operating Systems Slide #8 Shared Memory
Shared-memory concurrency is threads + synchronization.
Synchronization types:
– Locks
– Semaphores
– Monitors
– Transactional memory
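As a small illustration of one of these primitives, here is a minimal Java sketch (my own, not from the original slides) that uses a counting semaphore to limit how many threads may be in a critical region at once:

import java.util.concurrent.Semaphore;

public class SemaphoreDemo {
    // Allow at most 2 threads in the critical region at a time.
    private static final Semaphore permits = new Semaphore(2);

    public static void main(String[] args) {
        for (int i = 0; i < 5; i++) {
            final int id = i;
            new Thread(() -> {
                try {
                    permits.acquire();          // block until a permit is free
                    try {
                        System.out.println("Thread " + id + " in critical region");
                        Thread.sleep(100);      // simulate work
                    } finally {
                        permits.release();      // give the permit back
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }
    }
}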

CSC 660: Advanced Operating Systems Slide #9 Threads
Multiple paths of execution running in a single shared memory space.
– Threads have local data, e.g. the stack.
– Code and global data are shared.
– Concurrent accesses to shared data must be synchronized.
Types of threads:
– pthreads (POSIX threads)
– Java threads
– .NET threads
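A minimal Java sketch (an illustration, not from the slides) of these points: each thread's loop variable lives on its own stack, the counter object is shared by all threads, and the increment must be synchronized or updates can be lost:

public class SharedCounter {
    private long count = 0;

    // synchronized: only one thread may execute this at a time,
    // so the read-modify-write on the shared counter is atomic.
    public synchronized void increment() { count++; }

    public synchronized long get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        SharedCounter counter = new SharedCounter();   // shared (global) data
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                // the loop variable lives on this thread's own stack
                for (int j = 0; j < 100_000; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        System.out.println(counter.get());   // 400000 with synchronization
    }
}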

CSC 660: Advanced Operating Systems Slide #10 Thread Implementation
Kernel threads:
– The kernel supports and schedules threads.
– Blocking I/O blocks only one thread.
User threads (green threads):
– Cooperatively scheduled within a single kernel thread.
– Lightweight (faster to start and switch than kernel threads).
– Must use non-blocking I/O.

CSC 660: Advanced Operating Systems Slide #11 Why are threads hard?
Synchronization:
– Access to shared data must be coordinated with locks.
Deadlock:
– Locks must always be acquired in the same order (see the sketch below).
– You must know every path that accesses the data and what other data each path requires.
Breaks modularity:
– See deadlock above.
Debugging:
– Data and timing dependencies.
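A minimal Java sketch of the lock-ordering problem (my own illustration): thread 1 holds lock A and waits for B while thread 2 holds B and waits for A, so the program typically hangs.

public class DeadlockDemo {
    private static final Object lockA = new Object();
    private static final Object lockB = new Object();

    public static void main(String[] args) {
        // Thread 1 acquires A, then B.
        new Thread(() -> {
            synchronized (lockA) {
                pause(50);                      // widen the window for the race
                synchronized (lockB) {
                    System.out.println("thread 1 holds A and B");
                }
            }
        }).start();

        // Thread 2 acquires B, then A: the opposite order, which can deadlock.
        new Thread(() -> {
            synchronized (lockB) {
                pause(50);
                synchronized (lockA) {
                    System.out.println("thread 2 holds B and A");
                }
            }
        }).start();
    }

    private static void pause(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }
}

Acquiring the locks in the same global order in both threads removes the cycle.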

CSC 660: Advanced Operating Systems Slide #12 Why are threads hard?
Performance:
– Coarse-grained locks give low concurrency.
– Fine-grained locks make the code complex and carry their own performance cost.
Support:
– Different OSes use different thread libraries.
– Many libraries are not thread safe.
– Few debugging tools.
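A rough Java sketch of the granularity trade-off (mine, not from the slides): one lock guarding a whole table is simple but serializes every access, while per-bucket locks allow more concurrency at the cost of more code and more lock operations.

import java.util.concurrent.locks.ReentrantLock;

// Coarse-grained: one lock guards the whole table.
class CoarseTable {
    private final ReentrantLock lock = new ReentrantLock();
    private final int[] buckets = new int[16];

    void increment(int key) {
        lock.lock();
        try { buckets[Math.floorMod(key, buckets.length)]++; } finally { lock.unlock(); }
    }
}

// Fine-grained: one lock per bucket.
class StripedTable {
    private final int[] buckets = new int[16];
    private final ReentrantLock[] locks = new ReentrantLock[16];

    StripedTable() {
        for (int i = 0; i < locks.length; i++) locks[i] = new ReentrantLock();
    }

    void increment(int key) {
        int b = Math.floorMod(key, buckets.length);
        locks[b].lock();
        try { buckets[b]++; } finally { locks[b].unlock(); }
    }
}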

CSC 660: Advanced Operating Systems Slide #13 Software Transactional Memory
Memory transactions:
– A set of memory operations that executes atomically.
– The illusion of serial execution, like database transactions.
– As easy to code as coarse-grained locking.
– Scalability comparable to fine-grained locking, without the deadlock issues.
Language support:
– Haskell
– Fortress

CSC 660: Advanced Operating Systems Slide #14 Synchronized vs Atomic Code

CSC 660: Advanced Operating Systems Slide #15 Implementing STM
Data versioning:
– A transaction works on a new copy of the data.
– The copy becomes visible only if the transaction succeeds.
Conflict detection:
– A conflict occurs when two transactions work with the same data and at least one writes it.
– Track read and write sets for each transaction.
– Pessimistic detection checks during the transaction.
– Optimistic detection checks after the transaction.
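A minimal Java sketch of the conflict rule, assuming read and write sets are simply sets of object identifiers (the class and method names are my own):

import java.util.HashSet;
import java.util.Set;

class TxSets {
    final Set<Object> readSet = new HashSet<>();
    final Set<Object> writeSet = new HashSet<>();

    // Two transactions conflict if one's write set intersects the other's
    // read or write set, i.e. they share data and at least one writes it.
    static boolean conflicts(TxSets a, TxSets b) {
        return intersects(a.writeSet, b.readSet)
            || intersects(a.writeSet, b.writeSet)
            || intersects(b.writeSet, a.readSet);
    }

    private static boolean intersects(Set<Object> x, Set<Object> y) {
        for (Object o : x) {
            if (y.contains(o)) return true;
        }
        return false;
    }
}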

CSC 660: Advanced Operating Systems Slide #16 Message Passing
Threads have no shared state, so no synchronization is needed; communication happens via messages.
Message-passing languages:
– Erlang
– E
– Oz
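The languages listed above have message passing built in; as a rough Java analogy (my own sketch, not from the slides), two threads can communicate only through a blocking queue that acts as the receiver's mailbox, with no shared mutable state:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class MessagePassingDemo {
    public static void main(String[] args) throws InterruptedException {
        // The queue is the only thing the two threads share: it acts as
        // the receiver's mailbox, so there is no shared mutable state.
        BlockingQueue<String> mailbox = new ArrayBlockingQueue<>(16);

        Thread receiver = new Thread(() -> {
            try {
                String msg;
                while (!(msg = mailbox.take()).equals("stop")) {
                    System.out.println("received: " + msg);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();

        mailbox.put("hello");
        mailbox.put("world");
        mailbox.put("stop");
        receiver.join();
    }
}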

CSC 660: Advanced Operating Systems Slide #17 Erlang
Features:
– No mutable state.
– Message-passing concurrency.
– Green threads (thousands of parallel connections).
– Fault tolerance ( % availability).
Applications:
– Telecommunications
– Air traffic control
– IM (ejabberd at jabber.org)

CSC 660: Advanced Operating Systems Slide #18 Yaws vs Apache: Throughput vs Load

CSC 660: Advanced Operating Systems Slide #19 MapReduce
Map: input (k1, v1); output list(k2, v2).
Reduce: input (k2, list(v2)); output list(v2).
Similar to the map and reduce functions in Lisp.

CSC 660: Advanced Operating Systems Slide #20 Example: wc

map(String key, String value):
  // key: document name
  // value: document contents
  for each word w in value:
    EmitIntermediate(w, "1");

reduce(String key, Iterator values):
  // key: a word
  // values: a list of counts
  int result = 0;
  for each v in values:
    result += ParseInt(v);
  Emit(AsString(result));
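To make the pseudocode concrete, here is a small single-machine Java sketch (my own, not the Google MapReduce library) that applies the same map and reduce steps to count words in a few in-memory documents:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCount {
    // map: for each word in the document, emit (word, 1)
    static void map(String doc, Map<String, List<Integer>> intermediate) {
        for (String w : doc.split("\\s+")) {
            intermediate.computeIfAbsent(w, k -> new ArrayList<>()).add(1);
        }
    }

    // reduce: sum the list of counts for one word
    static int reduce(List<Integer> counts) {
        int result = 0;
        for (int c : counts) result += c;
        return result;
    }

    public static void main(String[] args) {
        String[] docs = { "the quick brown fox", "the lazy dog", "the fox" };
        Map<String, List<Integer>> intermediate = new HashMap<>();
        for (String d : docs) map(d, intermediate);              // map phase
        intermediate.forEach((word, counts) ->                   // reduce phase
            System.out.println(word + " " + reduce(counts)));
    }
}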

CSC 660: Advanced Operating Systems Slide #21 MapReduce Clusters
A cluster consists of thousands of machines.
Machines are dual-processor x86 boxes running Linux with 2-4 GB of memory.
Networking: 100 Mbps to 1 Gbps.
Storage: local IDE hard disks, managed by GoogleFS.
Users submit jobs to a scheduling system.

CSC 660: Advanced Operating Systems Slide #22 Execution Overview
1. The MapReduce library in the user program splits the input files into M pieces of 16-64 MB each.
2. The master copy and the workers start. The master has M map tasks and R reduce tasks to assign to idle workers (M >> number of workers).
3. A map worker reads the contents of its input split, parses key/value pairs, and passes them to the user-defined Map function.
4. Periodically, map workers write their output pairs to local disk, partitioned into R regions; the locations of these pairs are passed back to the master.
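The paper's default partitioning function is a hash of the intermediate key modulo R; a one-method Java sketch of that idea (the method name is my own):

// Choose which of the R reduce regions an intermediate key belongs to.
static int partition(String key, int r) {
    return Math.floorMod(key.hashCode(), r);   // hash(key) mod R
}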

CSC 660: Advanced Operating Systems Slide #23 Execution Overview
5. When a reduce worker is notified of a location, it uses RPCs to read the map output pairs from the local disks of the map workers, then sorts the data by key.
6. The reduce worker iterates over the sorted intermediate data. For each unique intermediate key, it passes the key and the set of intermediate values to the user-defined Reduce function, whose output is appended to the final output file.
7. When all map and reduce tasks are complete, the master wakes up the user program and the MapReduce() call returns to user code.

CSC 660: Advanced Operating Systems Slide #24 Execution Overview

CSC 660: Advanced Operating Systems Slide #25 Fault Tolerance
Worker failure:
– The master pings workers periodically.
– If a worker does not respond, the master marks it as failed.
– The master marks the failed worker's task as idle and reassigns it.
Master failure:
– The current implementation fails if the master dies.
– The master writes periodic checkpoints; if it dies, a new copy can be restarted from the last checkpoint.

CSC 660: Advanced Operating Systems Slide #26 Backup Tasks
Stragglers:
– Workers that take an unusually long time to complete their tasks.
– Often caused by a failing disk that has become slow.
Backup tasks:
– When a MapReduce operation is close to completion, the master schedules backup executions of the remaining tasks.
– A task is marked completed when either the original worker or the backup worker finishes it.
– Increases the computational requirements slightly but improves execution time noticeably.

CSC 660: Advanced Operating Systems Slide #27 Google Index Build
The crawler returns about 20 TB of data.
The indexer runs 5-10 MapReduce operations.
The code for each operation is significantly simpler than in the prior indexer.
MapReduce's efficiency is high enough that extra passes over the data do not have to be avoided.

CSC 660: Advanced Operating Systems Slide #28 References
1. Ali-Reza Adl-Tabatabai, Christos Kozyrakis, and Bratin Saha, "Multicore Programming with Transactional Memory," Computer Architecture, 4(10).
2. Gregory Andrews, Fundamentals of Multithreaded, Parallel, and Distributed Programming, Addison-Wesley.
3. Gregory Andrews and Fred Schneider, "Concepts and Notations for Concurrent Programming," ACM Computing Surveys, 15(1).
4. Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters," OSDI'04: Sixth Symposium on Operating System Design and Implementation, December 2004.
5. John Ousterhout, "Why Threads are a Bad Idea."
6. Abraham Silberschatz, Peter Baer Galvin, and Greg Gagne, Operating System Concepts, 6th edition, Wiley.
7. Herb Sutter, "The Free Lunch is Over: A Fundamental Turn Towards Concurrency," Dr. Dobb's Journal, 30(3), March 2005.