How to Parallelize an Algorithm


How to Parallelize an Algorithm Lecture 2

Today’s Outline
- Quiz
- Functional Programming Review
- Algorithm Parallelization
- Announcements
  - Projects
  - Readings
[Speaker note: start by explaining a specific instantiation of MapReduce.]

Quiz
This one is graded (unlike the last one).

Fold & Map Review
Fold:
  foldl f z []     = z
  foldl f z (x:xs) = foldl f (f z x) xs
  [foldr f z (x:xs) = f x (foldr f z xs)]
Applies a function to every element in the list; each iteration can access the result of the previous.
Map:
  map f []     = []
  map f (x:xs) = f x : map f xs
Applies a function to every element in the list; each iteration is independent of all others.
How would you parallelize Map? Fold?
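The two definitions above can be translated into Python to make the data flow concrete. This is an illustrative sketch (the names foldl and my_map are ours, not a library API): fold threads an accumulator through every step, while map touches each element independently.

```python
def foldl(f, z, xs):
    # foldl f z (x:xs) = foldl f (f z x) xs
    # Each step reads the accumulator produced by the previous step.
    acc = z
    for x in xs:
        acc = f(acc, x)
    return acc

def my_map(f, xs):
    # map f (x:xs) = f x : map f xs
    # Each element is processed independently of all the others.
    return [f(x) for x in xs]

print(foldl(lambda z, x: z + x, 0, [1, 2, 3]))  # 6
print(my_map(lambda x: x * 2, [1, 2, 3]))       # [2, 4, 6]
```

The loop in foldl is exactly the dependency chain that makes fold hard to parallelize; the list comprehension in my_map has no such chain.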

Group Exercises See handout

Answers to Group Exercises
Concat: given a list of lists, concatenates all sublists.
  concat xss = foldr (++) [] xss
Group: given a list of (key, value) pairs, collects values under their keys.
  group xss = foldl group_func [] xss
  group_func result (k,v) =
    if (has (k,v) result)
    then (map (update (k,v)) result)
    else (k,v) :: result
  update (k1,v1) (k2,v_list) =
    if (EQ k1 k2) then (k1, v1::v_list) else (k2, v_list)
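The group function above can be sketched in Python as well. This version uses a dict keyed on the first element of each pair purely for illustration (the slide's list-of-pairs version is equivalent, though it prepends values rather than appending):

```python
def group(pairs):
    # Collect every value that shares a key into one (key, value-list)
    # pair, preserving the order in which keys are first seen.
    buckets = {}
    for k, v in pairs:
        buckets.setdefault(k, []).append(v)
    return list(buckets.items())

print(group([("a", 1), ("b", 2), ("a", 3)]))  # [('a', [1, 3]), ('b', [2])]
```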

Why Parallelize?
Reasons to parallelize:
- Scalability
- Better utilization of resources
Reasons not to parallelize:
- The problem isn’t easily parallelizable
- There are no extra resources to use
- The overhead of coordination is larger than the benefit of parallelization
What issues are there in parallelization?

Implicit Serialization Example
DocInfo:
  f = read_file("file.txt")
  words = count_uniq_words(f)
  spaces = count_spaces(f)
  f = capitalize(f)
  f = remove_punct(f)
  words2 = count_uniq_words(f)
  puts("unique words: " + words)
  puts("num spaces: " + spaces)
  puts("unique scrubbed words: " + words2)
Which statements can be reordered?
[Speaker note: the point of this exercise is to show that imperative programming introduces a lot of unnecessary serialization; the only "real" required serialization is a data dependency.]

Data Dependency Graph
Which operations can be done in parallel? (Same program as the previous slide.)
[Diagram: dataflow dependency graph. read_file produces f0; count_uniq_words and count_spaces each read f0; capitalize produces f1, remove_punct produces f2, and the second count_uniq_words reads f2; the three puts calls serialize on the console (console0 → console1 → console2).]
[Speaker note: show the dataflow dependency; ask students to imagine a system that could automatically generate and understand this graph.]
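Operations whose inputs are already available can run at the same time. A minimal Python sketch of the parallel part of the graph, using concurrent.futures (the file contents are inlined as a stand-in for read_file, and the counting functions are simplified assumptions):

```python
from concurrent.futures import ThreadPoolExecutor

def count_uniq_words(text):
    return len(set(text.split()))

def count_spaces(text):
    return text.count(" ")

# Stand-in for f0 = read_file("file.txt"); the contents are illustrative.
f0 = "the quick brown fox the"

# count_uniq_words and count_spaces depend only on f0, not on each
# other, so both can be submitted to the pool in the same step.
with ThreadPoolExecutor() as pool:
    words_future = pool.submit(count_uniq_words, f0)
    spaces_future = pool.submit(count_spaces, f0)
    words, spaces = words_future.result(), spaces_future.result()

print("unique words:", words)  # 4
print("num spaces:", spaces)   # 4
```

The capitalize/remove_punct chain, by contrast, must wait: each rewrites f, creating the data dependency the graph makes visible.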

Distributing Computation
[Diagram: the dependency graph mapped onto hardware — Storage feeds read_file; RAM holds f0, f1, and f2; the counting and scrubbing operations are spread across Cpu1–Cpu4; the puts calls serialize on the console (console0 → console1 → console2).]
Takes 5 steps.

Eliminating Dependencies 1 of 3
Synchronization points? f[0,1,2], console[0,1,2]
Ideas for removing them: your ideas here.
[Diagram: the dependency graph from the previous slides, with the synchronization points highlighted.]
[Speaker note: show that you can shuffle operations and unnecessary side effects around by removing state from the shared context.]

Eliminating Dependencies 2 of 3
capitalize and remove_punct can be combined and run first, creating a copy of the data before "counting".
DocInfo 2.0:
  f = read_file("file.txt")
  scrubbed_f = scrub_words(f)
  words = count_uniq_words(f)
  spaces = count_spaces(f)
  words2 = count_uniq_words(scrubbed_f)
  puts("unique words: " + words)
  puts("num spaces: " + spaces)
  puts("unique scrubbed words: " + words2)
[Speaker note: show that you can shuffle operations and unnecessary side effects around by removing state from the shared context.]

Dependency Graph 2.0
  f = read_file("file.txt")
  scrubbed_f = scrub_words(f)
  words = count_uniq_words(f)
  spaces = count_spaces(f)
  words2 = count_uniq_words(scrubbed_f)
  puts("unique words: " + words)
  puts("num spaces: " + spaces)
  puts("unique scrubbed words: " + words2)
[Diagram: read_file produces f; scrub_words copies it into scrubbed_f; the two count_uniq_words calls and count_spaces then run with no dependencies among them; the puts calls still serialize on the console (console0 → console1 → console2).]
[Speaker note: this graph shows you can break dependencies by copying the data.]

Distributing Computation 2.0
[Diagram: the 2.0 dependency graph mapped onto hardware — Storage feeds read_file; RAM holds f and scrubbed_f; the three counts run across Cpu1–Cpu4; the puts calls serialize on the console.]
2 steps.

Eliminating Dependencies 3 of 3
capitalize and remove_punct only need to be applied to each word (not the whole file) before "counting".
DocInfo 3.0:
  f = read_file("file.txt")
  words = count_uniq_words(f)
  spaces = count_spaces(f)
  words2 = count_uniq_scrubbed_words(f)
  puts("unique words: " + words)
  puts("num spaces: " + spaces)
  puts("unique scrubbed words: " + words2)
[Speaker note: show that you can shuffle operations and unnecessary side effects around by removing state from the shared context.]

Dependency Graph 3.0
  f = read_file("file.txt")
  words = count_uniq_words(f)
  spaces = count_spaces(f)
  words2 = count_uniq_scrubbed_words(f)
  puts("unique words: " + words)
  puts("num spaces: " + spaces)
  puts("unique scrubbed words: " + words2)
[Diagram: read_file produces f0; count_uniq_words, count_spaces, and count_uniq_scrubbed_words all read f0 independently; the puts calls serialize on the console (console0 → console1 → console2).]
[Speaker note: the point of this exercise is to show that imperative programming introduces a lot of unnecessary serialization; the only "real" required serialization is a data dependency.]
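In version 3.0 all three counts depend only on the original file contents, so they can run in the same parallel step. A hedged Python sketch (the file contents are inlined, and scrubbing is simplified to lowercasing and stripping punctuation — an assumption about what scrub does):

```python
import string
from concurrent.futures import ThreadPoolExecutor

def count_uniq_words(text):
    return len(set(text.split()))

def count_spaces(text):
    return text.count(" ")

def count_uniq_scrubbed_words(text):
    # Scrubbing (capitalize/remove_punct) is folded into the count,
    # applied per word instead of rewriting the whole file first.
    scrub = lambda w: w.lower().strip(string.punctuation)
    return len(set(scrub(w) for w in text.split()))

f0 = "The fox, the Fox."  # stand-in for read_file("file.txt")

# All three counts read only f0, so all three run in one step.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(fn, f0)
               for fn in (count_uniq_words, count_spaces,
                          count_uniq_scrubbed_words)]
    words, spaces, words2 = (f.result() for f in futures)

print(words, spaces, words2)  # 4 3 2
```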

Distributing Computation 3.0
[Diagram: the 3.0 dependency graph mapped onto hardware — Storage feeds read_file; RAM holds f0; count_uniq_words, count_spaces, and count_uniq_scrubbed_words run across Cpu1–Cpu4; the puts calls serialize on the console.]
2 steps.

Parallelization Summary
Parallelization tradeoff:
- Good: better scalability
- Bad: less algorithmic flexibility, higher complexity
- Neutral: optimizes for large input over small input
Why avoid data dependencies?
- Lowers complexity
- Makes parallelization possible
How do you avoid data dependencies?
- Avoid stateful algorithms
- Avoid side effects (clone state instead of modifying it)
- Avoid global variables and member variables

Parallelizing Map
Definition of map:
  map f []     = []
  map f (x:xs) = f x : map f xs
What’s required to parallelize map?
- The function must be stateless
- The function must be available to each computation unit
- The input must be accessible by all computation units
- Output ordering isn’t important
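Given those requirements — a stateless function, shared input, order-insensitive output — map parallelizes directly. A sketch with a thread pool (executor.map actually preserves input order, which is stricter than map requires; square is an illustrative stand-in for any stateless function):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    # Stateless: the result depends only on x, never on shared state.
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    # Each element is handed to a worker independently and the
    # results are reassembled into a list.
    result = list(pool.map(square, range(8)))

print(result)  # [0, 1, 4, 9, 16, 25, 36, 49]
```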

Parallelizing Fold
Definition of fold:
  foldl f z []     = z
  foldl f z (x:xs) = foldl f (f z x) xs
What’s required to parallelize fold? You can’t.
Why can’t you parallelize fold? Each step depends on the result of the previous.
How is fold useful in parallel computing, then?
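The sequential chain becomes visible if each fold step records the accumulator it consumed. Python's functools.reduce is the same foldl; the trace list below is our own instrumentation, added for illustration:

```python
from functools import reduce

trace = []

def f(z, x):
    # Record the accumulator this step depends on before producing
    # the next one: no step can start early.
    trace.append((z, x))
    return z + x

total = reduce(f, [1, 2, 3, 4], 0)
print(total)  # 10
print(trace)  # [(0, 1), (1, 2), (3, 3), (6, 4)]
```

Step n's input is step n-1's output, so there is never more than one step runnable at a time.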

MapReduce maps a fold over the sorted result of a map!
  mapreduce fm fr l = map (reducePerKey fr) (group (map fm l))
  reducePerKey fr (k,v_list) = (k, foldl (fr k) [] v_list)
Notes:
- Assume map here is actually concatMap.
- The argument l is a list of documents.
- The result of the first map is a list of key-value pairs.
- The function fr takes three arguments: key, context, current. With currying, this locks the value of "key" for each list during the fold.
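That definition can be rendered in a few lines of Python. This is a sketch of the structure on the slide, not any framework's API: fm and fr are the user-supplied map and reduce functions, and group sorts the intermediate pairs by key. Word count is the usual example (treating the slide's [] initial accumulator as 0 once folding starts is our simplification).

```python
from functools import reduce

def group(pairs):
    # Sort and collect the intermediate (key, value) pairs by key.
    buckets = {}
    for k, v in pairs:
        buckets.setdefault(k, []).append(v)
    return sorted(buckets.items())

def reduce_per_key(fr, kv):
    # reducePerKey fr (k, v_list) = (k, foldl (fr k) [] v_list)
    k, v_list = kv
    return (k, reduce(lambda z, v: fr(k, z, v), v_list, []))

def mapreduce(fm, fr, docs):
    # mapreduce fm fr l = map (reducePerKey fr) (group (concatMap fm l))
    pairs = [kv for doc in docs for kv in fm(doc)]
    return [reduce_per_key(fr, kv) for kv in group(pairs)]

# Word count: the mapper emits (word, 1) for every word; the reducer
# folds the ones into a running total per key.
fm = lambda doc: [(w, 1) for w in doc.split()]
fr = lambda k, z, v: (z if z != [] else 0) + v

print(mapreduce(fm, fr, ["a b a", "b c"]))  # [('a', 2), ('b', 2), ('c', 1)]
```

The mapper calls are independent (parallel map), the folds for different keys are independent of each other, and only the fold within a single key is sequential — which is exactly why fold is still useful in parallel computing.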