B. Wilkinson/Clayton Ferner Seeds.ppt Modification date August

Presentation transcript:

Seeds Framework
B. Wilkinson/Clayton Ferner. Seeds.ppt. Modification date: August 15, 2014.

“Seeds” Parallel Grid Application Framework: some key features
Pattern-programming framework with a Java user interface (C++ version developed).
Self-deploys on computers, clusters, and geographically distributed computers.
Three development layers (basic, advanced, and expert) expose increasing detail. We will use the basic level.
http://coit-grid01.uncc.edu/seeds/

Seeds programming: Workpool
Several standard patterns are implemented, including Workpool, Pipeline, All-to-all, and Stencil.
The Workpool has three phases:
1. The master diffuses data to the slaves.
2. The slaves perform computations.
3. The master gathers results from the slaves.
The programmer specifies what the master and slaves do, and what is transferred between them, without implementing low-level message-passing routines. Message passing is done by Seeds.
[Figure: Workpool. The master diffuses data to the slaves, the slaves compute, and the master gathers the results.]
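The three workpool phases can be mimicked on a single machine with plain Java threads. The sketch below is not Seeds code: the class `WorkpoolSketch` and its `diffuse`, `compute`, and `gather` methods are hypothetical names chosen for illustration, and a thread pool stands in for the slaves.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical single-machine analogy of the workpool pattern (not the Seeds API).
final class WorkpoolSketch {
    private long total = 0; // master-side answer, updated only in gather()

    // Master: produce the input for one task (here, simply the segment number).
    int diffuse(int segment) {
        return segment;
    }

    // Slave: a pure function of its input; it touches no master state.
    static long compute(int input) {
        return (long) input * input;
    }

    // Master: fold one partial result into the answer.
    void gather(int segment, long partial) {
        total += partial;
    }

    long run(int tasks, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int s = 0; s < tasks; s++) {
                final int input = diffuse(s);
                Callable<Long> task = () -> compute(input);
                pending.add(pool.submit(task));
            }
            for (int s = 0; s < tasks; s++) {
                gather(s, pending.get(s).get()); // blocks until task s has finished
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return total;
    }

    public static void main(String[] args) {
        // Sum of the squares 0*0 + 1*1 + ... + 9*9
        System.out.println(new WorkpoolSketch().run(10, 4)); // prints 285
    }
}
```

As in the slides, the programmer writes only the three phase methods; here the `run` method plays the part that the framework plays in Seeds.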

User program
Two classes:
“Module” class: the diffuse, compute, and gather methods, plus any other methods associated with the application.
“Bootstrap” class: creates an instance of the module class, starts the framework, and executes the module's pattern.
[Figure: the Bootstrap class runs the module; the Module class contains Diffuse, Compute, and Gather.]

Seeds Workpool: DiffuseData, Compute, and GatherData methods
Master: DiffuseData creates a DataMap d and returns d to each slave. GatherData receives a Data argument (data) and updates the private variable total (the answer).
Slaves: Compute receives a Data argument (data), casts it to a DataMap input, and creates and returns a DataMap output.
Note: the DiffuseData, Compute, and GatherData method names start with a capital letter, although method names should not.

Data and DataMap classes
For implementation convenience there are two classes:
Data class: used to pass data between the master and the slaves. (It uses a “segment” number to keep track of packets; see later.)
DataMap class: used inside the Compute method. DataMap is a subclass of Data, which allows casting.
DataMap methods:
put(String, data): puts data into the DataMap, identified by the string.
get(String): gets the stored data identified by the string.
DataMap extends Java's HashMap, which implements Map; see http://doc.java.sun.com/DocWeb/api/java.util.HashMap
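The put/get/cast round trip can be tried without Seeds installed. In the sketch below, `SketchData` and `SketchDataMap` are hypothetical stand-ins, not the Seeds classes; `SketchData` is modelled as an interface purely so that the map type can also extend `HashMap`, which is an assumption about the arrangement rather than something the slides show.

```java
import java.util.HashMap;

// Hypothetical stand-ins for illustration only; these are NOT the Seeds classes.
interface SketchData { }

final class SketchDataMap<K, V> extends HashMap<K, V> implements SketchData {
    private static final long serialVersionUID = 1L;
}

final class DataMapSketch {
    // Master side: store a value under a string key and hand it over as the base type.
    static SketchData pack(long seed) {
        SketchDataMap<String, Object> d = new SketchDataMap<>();
        d.put("seed", seed); // the long is autoboxed to a Long
        return d;            // implicit upcast to SketchData
    }

    // Slave side: cast back to the map type and look the value up by its key.
    @SuppressWarnings("unchecked")
    static long unpack(SketchData data) {
        SketchDataMap<String, Object> input = (SketchDataMap<String, Object>) data;
        return (Long) input.get("seed");
    }

    public static void main(String[] args) {
        System.out.println(unpack(pack(42L))); // prints 42
    }
}
```

Because the map stores `Object` values, the receiver has to cast each value back to its real type, which is why the module code in the slides casts the result of every `get`.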

Module class
    public Data DiffuseData(int segment) {   // segment: used by Seeds to keep track of where to put results
        DataMap<String, Object> d = new DataMap<String, Object>();
        inputData = …
        d.put("name_of_inputdata", inputData);
        return d;                            // passed to Compute() by the framework
    }

    public Data Compute(Data data) {
        // data produced by DiffuseData(), cast into a DataMap
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        // output returned to GatherData()
        DataMap<String, Object> output = new DataMap<String, Object>();
        inputData = input.get("name_of_inputdata");
        …                                    // computation
        output.put("name_of_results", results);   // to return to GatherData()
        return output;                       // passed to GatherData() by the framework
    }

    public void GatherData(int segment, Data dat) {
        // the framework gives GatherData the Data object together with its segment number
        DataMap<String, Object> out = (DataMap<String, Object>) dat;   // Data cast into a DataMap
        outdata = out.get("name_of_results");
        result …                             // aggregate outdata from all the worker nodes; result is a private variable
    }

Question: Will a class field modified in the DiffuseData or GatherData methods be updated with the same values as in the Compute method?
Answer: No. DiffuseData and GatherData run on the master, while Compute runs on the slaves, and these are different JVMs (and different nodes).

Seeds implementations
Three Java versions have been developed:
1. Full JXTA P2P version, intended for a cluster or a fully distributed system (grid system). Requires an Internet connection.
2. JXTA P2P version that does not need an external network but is otherwise identical. Suitable for testing on a single computer.
3. Multicore (thread-based) version, specifically for a single multicore computer.
The multicore version executes much faster on a single computer. The only difference is minor coding changes in the bootstrap class.

Bootstrap class (JXTA P2P version)
This code deploys the framework and starts execution of the pattern. Different patterns have similar code.
    package edu.uncc.grid.example.workpool;

    import java.io.IOException;
    import net.jxta.pipe.PipeID;
    import edu.uncc.grid.pgaf.Anchor;
    import edu.uncc.grid.pgaf.Operand;
    import edu.uncc.grid.pgaf.Seeds;
    import edu.uncc.grid.pgaf.p2p.Types;

    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MyModule pi = new MyModule();
                Seeds.start("/path/to/seeds/seed/folder", false);
                PipeID id = Seeds.startPattern(new Operand(
                        (String[]) null,
                        new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE),
                        pi));
                System.out.println(id.toString());
                Seeds.waitOnPattern(id);
                Seeds.stop();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Bootstrap class (multicore version)
Thread based, and much faster on a multicore platform. The bootstrap class does not need to start and stop JXTA P2P, so Seeds.start() and Seeds.stop() are not needed. Otherwise the user code is similar.
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MyModule pi = new MyModule();
                Thread id = Seeds.startPatternMulticore(
                        new Operand(
                                (String[]) null,
                                new Anchor(args[0], Types.DataFlowRole.SINK_SOURCE),
                                pi),
                        4);
                id.join();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

Measuring time
The code in the bootstrap class can be instrumented:
    public class RunMyModule {
        public static void main(String[] args) {
            try {
                long start = System.currentTimeMillis();
                MyModule m = new MyModule();
                Seeds.start(. );
                PipeID id = ( … );
                Seeds.waitOnPattern(id);
                Seeds.stop();
                long stop = System.currentTimeMillis();
                double time = (double) (stop - start) / 1000.0;
                System.out.println("Execution time = " + time);
            } catch (SecurityException e) {
                …
    …

Compiling/executing Can be done on the command line (ant script provided) or through an IDE (Eclipse)

Examples of applications using the Workpool pattern
1. Computing π by the Monte Carlo method

Monte Carlo Methods A so-called “embarrassingly parallel” computation as it decomposes into obviously independent tasks that can be done in parallel without any task communications during the computation. Monte Carlo methods use random selections. For parallelizing Monte Carlo code, must address best way to generate random numbers in parallel.

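Calculating π using the Monte Carlo method
A circle is formed within a 2 x 2 square. The ratio of the area of the circle to the area of the square is given by:
$$\frac{\text{area of circle}}{\text{area of square}} = \frac{\pi (1)^2}{2 \times 2} = \frac{\pi}{4}$$
Points within the square are chosen randomly, and a score is kept of how many points happen to lie within the circle. The fraction of points within the circle will be $\pi/4$, given a sufficient number of randomly selected samples.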

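Typically only one quadrant is used. One quadrant can be described by the integral:
$$\int_0^1 \sqrt{1 - x^2}\,dx = \frac{\pi}{4}$$
Random pairs of numbers $(x_r, y_r)$ are generated, each between 0 and 1. A pair is counted as in the circle if $y_r \le \sqrt{1 - x_r^2}$, that is, if $x_r^2 + y_r^2 \le 1$.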
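Before parallelizing, the one-quadrant method can be checked with a short sequential program. The sketch below is plain Java with no Seeds dependency; the class name `MonteCarloPiSketch`, the method `estimatePi`, and the seed value are hypothetical choices for illustration. A point is counted as inside when x*x + y*y <= 1, and the inside fraction is multiplied by 4.

```java
import java.util.Random;

// Sequential sketch of the one-quadrant Monte Carlo estimate of pi.
final class MonteCarloPiSketch {
    // Draw `samples` random points in the unit square and count those
    // that fall inside the quarter circle of radius 1.
    static double estimatePi(long samples, long seed) {
        Random r = new Random(seed);
        long inside = 0;
        for (long i = 0; i < samples; i++) {
            double x = r.nextDouble(); // in [0, 1)
            double y = r.nextDouble(); // in [0, 1)
            if (x * x + y * y <= 1.0) {
                inside++;
            }
        }
        // The inside fraction approximates pi/4.
        return 4.0 * inside / samples;
    }

    public static void main(String[] args) {
        System.out.println("Estimate of pi: " + estimatePi(1_000_000L, 12345L));
    }
}
```

The estimate converges slowly: the error shrinks roughly with the square root of the number of samples, which is one reason to spread the samples over many workers.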

Alternative (better) Monte Carlo method (not used here)
Generate random values of x to compute f(x), and sum the values of f(x):
$$\text{area} = \int_{x_1}^{x_2} f(x)\,dx = \lim_{N \to \infty} \frac{1}{N} \sum_{r=1}^{N} f(x_r)\,(x_2 - x_1)$$
where $x_r$ are randomly generated values of x between $x_1$ and $x_2$. The Monte Carlo method is very useful if the function cannot be integrated numerically (for example, because it has a large number of variables).
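The alternative method averages f at random points and scales by the interval width. The sketch below is a minimal plain-Java illustration; the names `MonteCarloIntegralSketch` and `integrate` are hypothetical, and the quarter-circle function is used only as a convenient test case whose exact integral is π/4.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

// Sketch of Monte Carlo integration by averaging f at random points.
final class MonteCarloIntegralSketch {
    // Approximates the integral of f over [x1, x2] using n random samples.
    static double integrate(DoubleUnaryOperator f, double x1, double x2, int n, long seed) {
        Random r = new Random(seed);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double xr = x1 + (x2 - x1) * r.nextDouble(); // random x between x1 and x2
            sum += f.applyAsDouble(xr);
        }
        return (x2 - x1) * sum / n; // mean value of f times the interval width
    }

    public static void main(String[] args) {
        // Quarter circle: the exact value of this integral is pi/4.
        double area = integrate(x -> Math.sqrt(1.0 - x * x), 0.0, 1.0, 1_000_000, 2014L);
        System.out.println("Estimate of pi: " + 4.0 * area);
    }
}
```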

Workpool implementation
The master (source/sink) uses DiffuseData to send a starting seed for a random sequence to each slave. Each slave (compute node) runs Compute and returns the number of points, out of 1000 random points, that fall inside the arc of the circle. The master aggregates the answers in GatherData.

Seeds Monte Carlo code: MonteCarloPiModule.java
DiffuseData method (required to be implemented)
    public Data DiffuseData(int segment) {
        DataMap<String, Object> d = new DataMap<String, Object>();
        d.put("seed", R.nextLong());
        return d;   // returns a random seed for each job unit
    }

Compute method (required to be implemented)
    public Data Compute(Data data) {
        DataMap<String, Object> input = (DataMap<String, Object>) data;
        DataMap<String, Object> output = new DataMap<String, Object>();
        Long seed = (Long) input.get("seed");   // get random seed
        Random r = new Random();
        r.setSeed(seed);
        Long inside = 0L;
        for (int i = 0; i < DoubleDataSize; i++) {
            double x = r.nextDouble();
            double y = r.nextDouble();
            double dist = x * x + y * y;
            if (dist <= 1.0) {
                ++inside;
            }
        }
        output.put("inside", inside);   // to return to GatherData()
        return output;
    }

GatherData method (required to be implemented)
    public void GatherData(int segment, Data dat) {
        DataMap<String, Object> out = (DataMap<String, Object>) dat;
        Long inside = (Long) out.get("inside");
        total += inside;   // aggregate answer from all the worker nodes
    }

getDataCount method (required to be implemented)
    public int getDataCount() {
        return random_samples;
    }
Sets the number of data “envelopes” sent from the master by DiffuseData to the slaves, in this case the number of “seeds”. (The number of physical slave processors might be different.) Initialized in:
    public void initializeModule(String[] args) {
        random_samples = 3000;
    }

Method to compute the π result (used in the bootstrap module)
    public double getPi() {   // returns value of pi based on all workers
        double pi = (total / (random_samples * DoubleDataSize)) * 4;
        return pi;
    }

Complete module class MonteCarloPiModule
    package edu.uncc.grid.example.workpool;

    import java.util.Random;
    import java.util.logging.Level;
    import edu.uncc.grid.pgaf.datamodules.Data;
    import edu.uncc.grid.pgaf.datamodules.DataMap;
    import edu.uncc.grid.pgaf.interfaces.basic.Workpool;
    import edu.uncc.grid.pgaf.p2p.Node;

    public class MonteCarloPiModule extends Workpool {
        private static final long serialVersionUID = 1L;
        private static final int DoubleDataSize = 1000;
        double total;
        int random_samples;
        Random R;

        public MonteCarloPiModule() {
            R = new Random();
        }

        public void initializeModule(String[] args) {
            total = 0;
            Node.getLog().setLevel(Level.WARNING);   // reduce verbosity for logging
            random_samples = 3000;                   // set # of random samples
        }

        public Data Compute(Data data) {
            // input gets the data produced by DiffuseData()
            DataMap<String, Object> input = (DataMap<String, Object>) data;
            DataMap<String, Object> output = new DataMap<String, Object>();
            Long seed = (Long) input.get("seed");    // get random seed
            Random r = new Random();
            r.setSeed(seed);
            Long inside = 0L;
            for (int i = 0; i < DoubleDataSize; i++) {
                double x = r.nextDouble();
                double y = r.nextDouble();
                double dist = x * x + y * y;
                if (dist <= 1.0) {
                    ++inside;
                }
            }
            output.put("inside", inside);   // store partial answer to return to GatherData()
            return output;                  // output will emit partial answers done by this method
        }

        public Data DiffuseData(int segment) {
            DataMap<String, Object> d = new DataMap<String, Object>();
            d.put("seed", R.nextLong());
            return d;                       // returns a random seed for each job unit
        }

        public void GatherData(int segment, Data dat) {
            DataMap<String, Object> out = (DataMap<String, Object>) dat;
            Long inside = (Long) out.get("inside");
            total += inside;                // aggregate answer from all the worker nodes
        }

        public double getPi() {             // returns value of pi based on the job done by all the workers
            double pi = (total / (random_samples * DoubleDataSize)) * 4;
            return pi;
        }

        public int getDataCount() {
            return random_samples;
        }
    }

Bootstrap class (JXTA P2P version)
    ...
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                MonteCarloPiModule pi = new MonteCarloPiModule();
                Seeds.start("/path/to/seeds/seed/folder", false);
                PipeID id = Seeds.startPattern(new Operand(
                        (String[]) null,
                        new Anchor("hostname", Types.DataFlowRoll.SINK_SOURCE),
                        pi));
                System.out.println(id.toString());
                Seeds.waitOnPattern(id);
                Seeds.stop();
                System.out.println("The result is: " + pi.getPi());
            } catch (SecurityException e) {
    ...

Discussion: Does anyone see a potential flaw in the code? (Clue: random number generation.)

Workpool pattern 2. Matrix addition and multiplication Matrix addition and multiplication very easy to parallelize as each result value independent of other result values.

Matrix Addition, C = A + B
Add corresponding elements of each matrix to form the elements of the result matrix. Given elements of A as $a_{i,j}$ and elements of B as $b_{i,j}$, each element of C is computed as:
$$c_{i,j} = a_{i,j} + b_{i,j}$$
Easy to parallelize: each processor computes one C element or a group of C elements.

Workpool implementation: slave computation
Each slave adds one row of A to one row of B to create one row of C (rather than each slave adding single elements).
Note: generally we want the computation/communication ratio to be as large as possible. Here it is O(1): for a row of n elements, a slave receives 2n elements and returns n, but performs only n additions.
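The row-at-a-time decomposition can be tried sequentially first. In the sketch below, `RowAddSketch`, `addRows`, and `add` are hypothetical names; `addRows` does the work that one slave would do for one row, and `add` loops over the rows in place of the master. The 3 x 3 values are the ones used in the MatrixAddModule example.

```java
import java.util.Arrays;

// Sequential sketch of row-wise matrix addition (one "task" per row).
final class RowAddSketch {
    // Work for one row: add corresponding elements of rowA and rowB.
    static int[] addRows(int[] rowA, int[] rowB) {
        int[] rowC = new int[rowA.length];
        for (int i = 0; i < rowA.length; i++) {
            rowC[i] = rowA[i] + rowB[i];
        }
        return rowC;
    }

    // Stand-in for the master: hand out one row pair per task and store each result row.
    static int[][] add(int[][] a, int[][] b) {
        int[][] c = new int[a.length][];
        for (int row = 0; row < a.length; row++) {
            c[row] = addRows(a[row], b[row]);
        }
        return c;
    }

    public static void main(String[] args) {
        int[][] a = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        int[][] b = {{2, 5, 8}, {3, 4, 9}, {1, 5, 2}};
        // prints [[4, 10, 16], [6, 8, 18], [2, 10, 4]]
        System.out.println(Arrays.deepToString(add(a, b)));
    }
}
```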

Workpool implementation
The master (source/sink) sends one row of A and one row of B to each slave (compute node), with one slave for each row. Each slave returns one row of C. The following example uses 3 x 3 arrays and 3 slaves.

MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …

    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;

        public MatrixAddModule() {
            matrixC = new int[3][3];   // in this example the matrices are 3 x 3
        }

        public void initMatrices() {   // some initial values
            matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
            matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        }

        public int getDataCount() {    // required method: number of data objects (slaves)
            return 3;
        }

        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }

DiffuseData method
The DataMap d that is returned holds pairs of a string key and its associated array.
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] rowB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int k = segment;               // segment variable used to select rows
        for (int i = 0; i < 3; i++) {  // copy one row of A and one row of B to be sent to slaves
            rowA[i] = matrixA[k][i];
            rowB[i] = matrixB[k][i];
        }
        d.put("rowA", rowA);           // rowA and rowB put in the DataMap d to send to slaves
        d.put("rowB", rowB);
        return d;
    }

Compute method
    public Data Compute(Data data) {
        int[] rowC = new int[3];
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, int[]> output = new DataMap<String, int[]>();
        int[] rowA = (int[]) input.get("rowA");   // get the two rows from the data received
        int[] rowB = (int[]) input.get("rowB");
        for (int i = 0; i < 3; i++) {             // add the rows
            rowC[i] = rowA[i] + rowB[i];
        }
        output.put("rowC", rowC);                 // put the result row into output, with a key, to be sent back to the master
        return output;
    }

GatherData method
Note the segment variable and the Data from the slave.
    public void GatherData(int segment, Data dat) {
        DataMap<String, int[]> out = (DataMap<String, int[]>) dat;
        int[] rowC = (int[]) out.get("rowC");     // get the C row sent from the slave
        for (int i = 0; i < 3; i++) {             // place the row into the result matrix
            matrixC[segment][i] = rowC[i];        // the segment associated with the Data chooses the correct row
        }
    }

Bootstrap class (multicore version)
    public class RunMonteCarloPiModule {
        public static void main(String[] args) {
            try {
                long start = System.currentTimeMillis();
                MatrixAddModule m = new MatrixAddModule();
                m.initMatrices();
                Thread id = Seeds.startPatternMulticore(
                        new Operand(
                                (String[]) null,
                                new Anchor(args[0], Types.DataFlowRole.SINK_SOURCE),
                                m),
                        4);
                id.join();
                long stop = System.currentTimeMillis();
                double time = (double) (stop - start) / 1000.0;
                System.out.println("Execution time = " + time);
                m.printResult();
            } catch …

Matrix Multiplication
Sequential code to compute A x B, assuming square (n x n) matrices:
    for (i = 0; i < n; i++)             // for each row of A
        for (j = 0; j < n; j++) {       // for each column of B
            c[i][j] = 0;
            for (k = 0; k < n; k++)
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
        }
Requires $n^3$ multiplications and $n^3$ additions, giving a sequential time complexity of $O(n^3)$. Very easy to parallelize, as each result is independent.

Workpool implementation
With one slave computing one element of the result: the master (source/sink) sends one row of A and one column of B to each slave (compute node), with one slave for each element of the result. Each slave returns one element of C. The following example uses 3 x 3 arrays and 9 slaves.

MatrixAddModule.java (continues on several slides)
    package edu.uncc.grid.example.workpool;
    import …

    public class MatrixAddModule extends Workpool {
        private static final long serialVersionUID = 1L;
        int[][] matrixA;
        int[][] matrixB;
        int[][] matrixC;

        public MatrixAddModule() {
            matrixC = new int[3][3];   // in this example the matrices are 3 x 3
        }

        public void initMatrices() {   // some initial values
            matrixA = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
            matrixB = new int[][]{{2,5,8},{3,4,9},{1,5,2}};
        }

        public int getDataCount() {    // required method: number of data objects (slaves)
            return 9;
        }

        public void initializeModule(String[] args) {
            Node.getLog().setLevel(Level.WARNING);
        }

DiffuseData method
The DataMap d that is returned holds pairs of a string key and its associated array.
    public Data DiffuseData(int segment) {
        int[] rowA = new int[3];
        int[] colB = new int[3];
        DataMap<String, int[]> d = new DataMap<String, int[]>();
        int a = segment / 3, b = segment % 3;   // segment variable used to select the row of A and column of B
        for (int i = 0; i < 3; i++) {           // copy one row of A and one column of B to be sent to slaves
            rowA[i] = matrixA[a][i];
            colB[i] = matrixB[i][b];
        }
        d.put("rowA", rowA);                    // rowA and colB put in the DataMap d to send to slaves
        d.put("colB", colB);
        return d;
    }

Note on mapping rows and columns to segments
    segment   Arow   Bcol
    0         0      0
    1         0      1
    2         0      2
    3         1      0
    4         1      1
    5         1      2
    6         2      0
    7         2      1
    8         2      2

    int Arow = segment / 3;
    int Bcol = segment % 3;
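The mapping generalizes from 3 to any n x n result by dividing and taking the remainder with n. The sketch below uses hypothetical names (`SegmentMapSketch`, `rowOf`, `colOf`, `segmentOf`) and also shows the inverse mapping, which the slides do not need but which is a handy check.

```java
// Sketch of the segment <-> (row, column) mapping for an n x n result matrix.
final class SegmentMapSketch {
    // Row of A used by this segment.
    static int rowOf(int segment, int n) {
        return segment / n;
    }

    // Column of B used by this segment.
    static int colOf(int segment, int n) {
        return segment % n;
    }

    // Inverse mapping: the segment that computes element (row, col).
    static int segmentOf(int row, int col, int n) {
        return row * n + col;
    }

    public static void main(String[] args) {
        int n = 3;
        for (int segment = 0; segment < n * n; segment++) {
            System.out.println("segment " + segment
                    + ": Arow " + rowOf(segment, n)
                    + ", Bcol " + colOf(segment, n));
        }
    }
}
```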

Compute method
    public Data Compute(Data data) {
        DataMap<String, int[]> input = (DataMap<String, int[]>) data;
        DataMap<String, Integer> output = new DataMap<String, Integer>();
        int[] rowA = (int[]) input.get("rowA");   // get the row of A and the column of B from the data received
        int[] colB = (int[]) input.get("colB");
        int out = 0;
        for (int i = 0; i < 3; i++) {             // matrix multiplication, one result element
            out += rowA[i] * colB[i];
        }
        output.put("out", out);                   // put the result into output, with a key, to be sent back to the master
        return output;
    }

GatherData method
Note the segment variable and the Data from the slave.
    public void GatherData(int segment, Data dat) {
        DataMap<String, Integer> out = (DataMap<String, Integer>) dat;
        int answer = out.get("out");              // get the result sent from the slave; a cast from Integer to int is not necessary
        int a = segment / 3, b = segment % 3;     // the segment associated with the Data chooses the correct element
        matrixC[a][b] = answer;                   // place the element into the result matrix
    }

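Workpool: numerical integration
The master (source/sink) sends the start and end of a partition to each slave (compute node), with one slave for each partition. Each slave returns the computed area under the curve F(x) between its start and end.
[Figure: curve F(x) against x, showing the area of one partition between Start and End.]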
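The partitioning can be sketched sequentially as follows. The slides do not say how each slave computes its area, so the trapezoidal rule used here is an assumption, as are the names `PartitionIntegralSketch`, `area`, and `integrate`. The `area` method is the work one slave would do for one (start, end) pair, and `integrate` stands in for the master.

```java
import java.util.function.DoubleUnaryOperator;

// Sequential sketch of numerical integration split into partitions.
final class PartitionIntegralSketch {
    // Work for one partition: area under f between start and end,
    // using the trapezoidal rule with `steps` strips (an assumed method).
    static double area(DoubleUnaryOperator f, double start, double end, int steps) {
        double h = (end - start) / steps;
        double sum = 0.5 * (f.applyAsDouble(start) + f.applyAsDouble(end));
        for (int i = 1; i < steps; i++) {
            sum += f.applyAsDouble(start + i * h);
        }
        return sum * h;
    }

    // Stand-in for the master: send out one (start, end) pair per partition
    // and add up the areas that come back.
    static double integrate(DoubleUnaryOperator f, double a, double b,
                            int partitions, int steps) {
        double width = (b - a) / partitions;
        double total = 0.0;
        for (int segment = 0; segment < partitions; segment++) {
            double start = a + segment * width;
            total += area(f, start, start + width, steps);
        }
        return total;
    }

    public static void main(String[] args) {
        // Integral of x^2 from 0 to 3; the exact value is 9.
        System.out.println(integrate(x -> x * x, 0.0, 3.0, 4, 1000));
    }
}
```

Only two numbers go out and one comes back per partition, so the computation/communication ratio grows with the number of strips each slave evaluates.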

Questions