Pipelined Pattern This pattern is implemented in Seeds, see “Pipeline Template Tutorial.” http://coit-grid01.uncc.edu/seeds/docs/pipeline_tutorial.pdf ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2012 June 25, 2012.

Pipelined Computations Problem divided into a series of tasks that have to be completed one after the other (the basis of sequential programming). Each task executed by a separate process or processor.

Example Add all the elements of array a to an accumulating sum:

for (i = 0; i < n; i++)
    sum = sum + a[i];

The loop could be "unfolded" to yield:

sum = sum + a[0];
sum = sum + a[1];
sum = sum + a[2];
sum = sum + a[3];
sum = sum + a[4];
...

Pipeline for an unfolded loop

Another Example Frequency filter - Objective: to remove specific frequencies (f0, f1, f2, f3, etc.) from a digitized signal, f(t). The signal enters the pipeline from the left:

Where pipelining can be used to good effect Assuming the problem can be divided into a series of sequential tasks, the pipelined approach can provide increased execution speed under the following three types of computations:
1. If more than one instance of the complete problem is to be executed.
2. If a series of data items must be processed sequentially, each requiring multiple operations.
3. If the information to start the next process can be passed forward before the preceding process has completed all its internal operations.

“Type 1” Pipeline Space-Time Diagram

“Type 2” Pipeline Space-Time Diagram

“Type 3” Pipeline Space-Time Diagram Pipeline processing where information passes to the next stage before the previous stage has completed.

If the number of stages is larger than the number of processors in any pipeline, a group of stages can be assigned to each processor:

Example Pipelined Solutions (Examples of each type of computation)

“Type 1” pipeline: Adding Numbers If more than one instance of the complete problem is to be executed.

Basic code for process Pi:

recv(&accumulation, Pi-1);
accumulation = accumulation + number;
send(&accumulation, Pi+1);

except for the first process, P0, which is

send(&number, P1);

and the last process, Pn-1, which is

recv(&number, Pn-2);
accumulation = accumulation + number;
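The send/recv pseudocode above can be simulated in plain Java (a sketch, not Seeds code; all class and variable names here are invented) using one thread per pipeline stage and blocking queues as the links between stages:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class AddPipeline {
    // Each stage thread: receive the accumulation from the previous stage,
    // add its own number, and send the result to the next stage.
    public static int pipelineSum(int[] numbers) {
        int n = numbers.length;
        // links.get(i) carries the accumulation from stage i-1 to stage i
        List<BlockingQueue<Integer>> links = new ArrayList<>();
        for (int i = 0; i <= n; i++) links.add(new ArrayBlockingQueue<>(1));
        List<Thread> stages = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                try {
                    int accumulation = links.get(id).take();    // recv(&accumulation, Pi-1)
                    accumulation = accumulation + numbers[id];  // add this stage's number
                    links.get(id + 1).put(accumulation);        // send(&accumulation, Pi+1)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.start();
            stages.add(t);
        }
        try {
            links.get(0).put(0);               // seed the first stage with zero
            int result = links.get(n).take();  // final sum leaves the last stage
            for (Thread t : stages) t.join();
            return result;
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(pipelineSum(new int[]{1, 2, 3, 4, 5}));  // prints 15
    }
}
```

Here a zero is fed into stage 0 and every stage adds its own number, which is equivalent to the slides' special-cased first and last processes.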

“Type 2” pipeline: Sorting Numbers If a series of data items must be processed sequentially, each requiring multiple operations. A parallel version of insertion sort.

Pipeline for sorting using insertion sort The basic algorithm for process Pi is

recv(&number, Pi-1);
if (number > x) {
    send(&x, Pi+1);
    x = number;
} else
    send(&number, Pi+1);
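The stage logic above can be checked with a small sequential simulation (a sketch; the names are invented): stage[i] plays the role of the x held by process Pi, and each incoming number travels down the stages until one of them keeps it. Each stage keeps the larger of its held number and the incoming one, so stage 0 ends up holding the largest:

```java
import java.util.Arrays;

public class PipelineSort {
    public static int[] pipelineSort(int[] numbers) {
        int n = numbers.length;
        int[] stage = new int[n];
        boolean[] holds = new boolean[n];  // whether stage i holds a number yet
        for (int number : numbers) {
            int incoming = number;
            for (int i = 0; i < n; i++) {  // number travels down the pipeline
                if (!holds[i]) {           // empty stage keeps the first arrival
                    stage[i] = incoming;
                    holds[i] = true;
                    break;
                }
                if (incoming > stage[i]) { // keep the larger, pass the smaller on
                    int smaller = stage[i];
                    stage[i] = incoming;
                    incoming = smaller;
                } // else: pass incoming onward unchanged
            }
        }
        return stage;  // stage 0 holds the largest, so the result is descending
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pipelineSort(new int[]{4, 3, 1, 2, 5})));
        // prints [5, 4, 3, 2, 1]
    }
}
```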

Prime Number Generation: Sieve of Eratosthenes Type 2 pipeline computation A series of all integers is generated from 2. The first number, 2, is prime and kept. All multiples of this number are deleted, as they cannot be prime. The process is repeated with each remaining number. The algorithm removes non-primes, leaving only primes.

The code for a process, Pi, could be based upon

recv(&x, Pi-1);
/* repeat following for each number */
recv(&number, Pi-1);
if ((number % x) != 0)
    send(&number, Pi+1);

Each process will not receive the same quantity of numbers, and the quantity is not known beforehand. Use a "terminator" message, which is sent at the end of the sequence:

recv(&x, Pi-1);
for (i = 0; i < n; i++) {
    recv(&number, Pi-1);
    if (number == terminator) break;
    if ((number % x) != 0)
        send(&number, Pi+1);
}
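A stage-by-stage simulation of the sieve (a sketch of the idea, not the slides' message-passing code; names are invented) shows what each stage does: the first number in a stage's incoming stream becomes its prime x, and only numbers not divisible by x are forwarded, with a terminator value ending each stream:

```java
import java.util.ArrayList;
import java.util.List;

public class SievePipeline {
    static final int TERMINATOR = -1;

    public static List<Integer> primesUpTo(int n) {
        List<Integer> primes = new ArrayList<>();
        List<Integer> stream = new ArrayList<>();
        for (int k = 2; k <= n; k++) stream.add(k);  // generator stage: 2, 3, ..., n
        stream.add(TERMINATOR);
        while (stream.get(0) != TERMINATOR) {
            int x = stream.get(0);                   // first number received is prime
            primes.add(x);
            List<Integer> next = new ArrayList<>();  // stream forwarded to next stage
            for (int i = 1; i < stream.size(); i++) {
                int number = stream.get(i);
                if (number == TERMINATOR || number % x != 0)
                    next.add(number);                // forward non-multiples and terminator
            }
            stream = next;
        }
        return primes;
    }

    public static void main(String[] args) {
        System.out.println(primesUpTo(20));  // prints [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```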

Type 3 pipeline: Solving a System of Linear Equations If the information to start the next process can be passed forward before the preceding process has completed all its internal operations. Solving a system of linear equations in upper-triangular form (created by a Gaussian elimination step if needed). The a's and b's are constants and the x's are unknowns to be found.

Back Substitution First, the unknown x0 is found from the last equation; i.e.,

x0 = b0 / a0,0

The value obtained for x0 is substituted into the next equation to obtain x1; i.e.,

x1 = (b1 - a1,0 x0) / a1,1

The values obtained for x1 and x0 are substituted into the next equation to obtain x2:

x2 = (b2 - a2,0 x0 - a2,1 x1) / a2,2

and so on until all the unknowns are found.

Pipeline Solution First pipeline stage computes x0 and passes x0 onto the second stage, which computes x1 from x0 and passes both x0 and x1 onto the next stage, which computes x2 from x0 and x1, and so on. Type 3 pipeline computation

The ith process (0 < i < n) receives the values x0, x1, x2, …, xi-1 and computes xi from the equation:

xi = (bi - sum for j = 0 to i-1 of ai,j xj) / ai,i

Sequential Code

x[0] = b[0]/a[0][0];          /* computed separately */
for (i = 1; i < n; i++) {     /* for remaining unknowns */
    sum = 0;
    for (j = 0; j < i; j++)
        sum = sum + a[i][j]*x[j];
    x[i] = (b[i] - sum)/a[i][i];
}
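Transcribed to runnable Java, the sequential code gives the following (the 3x3 system in main is an invented example; as in the slides' code, equation i involves only the unknowns x0..xi, i.e. a[i][j] = 0 for j > i):

```java
public class BackSubstitution {
    public static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        double[] x = new double[n];
        x[0] = b[0] / a[0][0];              // computed separately
        for (int i = 1; i < n; i++) {       // remaining unknowns
            double sum = 0;
            for (int j = 0; j < i; j++)
                sum = sum + a[i][j] * x[j]; // substitute known values
            x[i] = (b[i] - sum) / a[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        // Invented example system:
        //   2 x0           = 4
        //   1 x0 + 3 x1    = 11
        //   2 x0 + 1 x1 + 4 x2 = 19
        double[][] a = { {2, 0, 0}, {1, 3, 0}, {2, 1, 4} };
        double[] b = { 4, 11, 19 };
        double[] x = solve(a, b);
        System.out.println(x[0] + " " + x[1] + " " + x[2]);  // 2.0 3.0 3.0
    }
}
```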

Parallel Code Pseudocode of process Pi (1 < i < n) could be

for (j = 0; j < i; j++) {
    recv(&x[j], Pi-1);
    send(&x[j], Pi+1);
}
sum = 0;
for (j = 0; j < i; j++)
    sum = sum + a[i][j]*x[j];
x[i] = (b[i] - sum)/a[i][i];
send(&x[i], Pi+1);

Processes now have additional computations to do after receiving and resending values.

Pipeline processing using back substitution

Seeds Pipeline Pattern The following has been taken directly from the "Pipeline Template Tutorial" by Jeremy Villalobos. http://coit-grid01.uncc.edu/seeds/docs/pipeline_tutorial.pdf

Interface:

public abstract class PipeLine extends BasicLayerInterface {
    public abstract Data Compute(int stage, Data input);
    public abstract Data DiffuseData(int segment);
    public abstract void GatherData(int segment, Data dat);
    public abstract int getDataCount();
    public abstract int getStageCount();
}

Example BubbleSort The Diffuse method distributes one list of integers at a time.

package edu.uncc.grid.example.pipeline;
import edu.uncc.grid.pgaf.datamodules.Data;
import edu.uncc.grid.pgaf.datamodules.DataMap;
import edu.uncc.grid.pgaf.interfaces.basic.PipeLine;

public class BubbleSortModule extends PipeLine {
    int[][] ListsOfLists;
    …
    public Data DiffuseData(int segment) {
        DataMap<String, int[]> m = new DataMap<String, int[]>();
        m.put("list", ListsOfLists[segment]);
        return m;
    }

The Compute method uses the stage number to determine which index of the list the node should work on.

public Data Compute(int stage, Data dat) {
    DataMap<String, int[]> m = (DataMap<String, int[]>) dat;
    int[] input = (int[]) m.get("list");
    int index = stage;
    for (int i = index; i < 10; i++) {
        if (input[index] > input[i]) {
            int temp = input[index];
            input[index] = input[i];
            input[i] = temp;
        }
    }
    DataMap<String, int[]> out = new DataMap<String, int[]>();
    System.out.print("W:" + stage + ":");
    for (int i = 0; i < input.length; i++)
        System.out.print(input[i] + ",");
    System.out.println();
    out.put("list", input);
    return out;
}
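Stripped of the Seeds types, the per-stage logic above can be tested standalone (a sketch; the slide hard-codes the list length as 10, while here the array's own length is used). Stage s places the smallest element of input[s..] at position s, so applying stages 0 through n-1 in order sorts the list ascending:

```java
import java.util.Arrays;

public class BubbleSortStages {
    // One pipeline stage: move the smallest element of input[stage..]
    // to position stage, as in the Compute method above.
    public static int[] computeStage(int stage, int[] input) {
        for (int i = stage; i < input.length; i++) {
            if (input[stage] > input[i]) {
                int temp = input[stage];
                input[stage] = input[i];
                input[i] = temp;
            }
        }
        return input;
    }

    public static void main(String[] args) {
        int[] list = {9, 4, 7, 1, 3, 8, 2, 6, 5, 0};
        for (int stage = 0; stage < list.length; stage++)
            computeStage(stage, list);  // the list flows through every stage
        System.out.println(Arrays.toString(list));
        // prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```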

The Gather method puts the list back into the ListsOfLists array.

public void GatherData(int segment, Data dat) {
    DataMap<String, int[]> m = (DataMap<String, int[]>) dat;
    ListsOfLists[segment] = m.get("list");
}

ListsOfLists can only be seen by the node on your local computer. All remote nodes will have this variable as null.

public void printLists() {
    for (int i = 0; i < ListsOfLists.length; i++) {
        for (int j = 0; j < ListsOfLists[i].length; j++) {
            System.out.print(ListsOfLists[i][j] + ",");
        }
        System.out.println();
    }
}

public int getDataCount() {
    return 10;
}

public int getStageCount() { … }

public void initializeModule(String[] args) {}

Bootstrap class

public class RunBubbleSortModule {
    public static void main(String[] args) {
        try {
            int[][] list = { … };  // initialize array statically
            BubbleSortModule bubble = new BubbleSortModule();
            bubble.ListsOfLists = list;
            Seeds.start("/path/to/seeds/seed/folder", false);
            PipeID id = Seeds.startPattern(new Operand((String[]) null,
                new Anchor("localhost", Types.DataFlowRoll.SINK_SOURCE), bubble));
            System.out.println(id.toString());
            Seeds.waitOnPattern(id);
            bubble.printLists();
            Seeds.stop();
        } catch (SecurityException e) { … }
    }
}

Questions