Computer Science 320 Barrier Actions



1-D Continuous Cellular Automata
A 1-D array of cells, each having a value between 0.0 and 1.0. Each cell has a neighborhood consisting of itself and the cells to its left and right. The array wraps around to accommodate the neighbors of the first and last cells.

Processing a 1-D CCA
Initially, all cells are 0 except the center cell, X[C/2] = 1. At each step, calculate a new value for each cell by multiplying the average of the values in its neighborhood by a constant A, adding another constant B, and keeping the fractional part.
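The update rule above can be sketched with exact rational arithmetic. This is a minimal illustration, not the course's program: the class `CcaRuleSketch`, its nested `Frac` helper, and the methods `step` and `run` are hypothetical names, and `Frac` is a stand-in built on `java.math.BigInteger` rather than the BigRational class used in the slides.

```java
import java.math.BigInteger;

public class CcaRuleSketch {
    /** Hypothetical minimal rational number, always stored in lowest terms. */
    static final class Frac {
        final BigInteger num, den;

        Frac(long n, long d) {
            this(BigInteger.valueOf(n), BigInteger.valueOf(d));
        }

        Frac(BigInteger n, BigInteger d) {
            BigInteger g = n.gcd(d);   // assumes d > 0, so g > 0
            num = n.divide(g);
            den = d.divide(g);
        }

        Frac add(Frac o) {
            return new Frac(num.multiply(o.den).add(o.num.multiply(den)),
                            den.multiply(o.den));
        }

        Frac mul(Frac o) {
            return new Frac(num.multiply(o.num), den.multiply(o.den));
        }

        /** Keeps only the fractional part (values here are never negative). */
        Frac fracPart() {
            return new Frac(num.mod(den), den);
        }

        @Override
        public String toString() {
            return den.equals(BigInteger.ONE) ? num.toString() : num + "/" + den;
        }
    }

    /** One time step: average the wrapped neighborhood, scale by a, add b, keep the fraction. */
    static Frac[] step(Frac[] cur, Frac a, Frac b) {
        int c = cur.length;
        Frac third = new Frac(1, 3);
        Frac[] next = new Frac[c];
        for (int i = 0; i < c; i++) {
            // (i - 1 + c) % c and (i + 1) % c wrap around at the array ends.
            Frac sum = cur[(i - 1 + c) % c].add(cur[i]).add(cur[(i + 1) % c]);
            next[i] = sum.mul(third).mul(a).add(b).fracPart();
        }
        return next;
    }

    /** Starts with all cells 0 except the center cell, then applies the given number of steps. */
    static Frac[] run(int cells, int steps, Frac a, Frac b) {
        Frac[] x = new Frac[cells];
        for (int i = 0; i < cells; i++) {
            x[i] = new Frac(i == cells / 2 ? 1 : 0, 1);
        }
        for (int s = 0; s < steps; s++) {
            x = step(x, a, b);
        }
        return x;
    }

    public static void main(String[] args) {
        // The 10-cell example with A = 1 and B = 11/12, after 5 steps.
        for (Frac f : run(10, 5, new Frac(1, 1), new Frac(11, 12))) {
            System.out.print(f + " ");
        }
        System.out.println();
    }
}
```

Because every value stays an exact fraction, no rounding error accumulates from step to step, which is the reason for using rational arithmetic here instead of floating point.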

Processing a 1-D CCA
s   X0       X1       X2       X3       X4       X5       X6       X7       X8       X9
1   11/12    11/12    11/12    11/12    1/4      1/4      1/4      11/12    11/12    11/12
2   5/6      5/6      5/6      11/18    7/18     1/6      7/18     11/18    5/6      5/6
3   3/4      3/4      73/108   19/36    11/36    25/108   11/36    19/36    73/108   3/4
4   2/3      52/81    46/81    34/81    22/81    16/81    22/81    34/81    46/81    52/81
5   551/972  527/972  149/324  109/324  23/108   53/324   23/108   109/324  149/324  527/972
After 5 iterations on 10 cells, with A = 1 and B = 11/12. Note the rational number results.

Imaging the 1-D CCA
After 5 iterations on 10 cells, with A = 1 and B = 11/12.

Imaging the 1-D CCA
After 200 iterations on 400 cells, with A = 1 and B = 11/12.

Resources for Rational Arithmetic
edu.rit.smp.ca.BigRational
– assign
– add
– mul
– fracPart
– normalize
– floatValue

Program Design: Data
Could use a byte matrix for the pixel data and a BigRational matrix for the cells. But storage is then O(SC), which limits both the number of cells and the number of steps when scaling up.

Program Design: Data
Represent just the current row and the previous row (2 arrays of C cells). Swap the array references after each step to avoid copying cells. Also need just the row of current pixel values, not the whole matrix.
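The two-row design can be sketched as follows. This is an illustration under stated assumptions: `TwoRowSwapSketch` and `simulate` are hypothetical names, and `double` values replace BigRational to keep the example short, so results are approximate rather than exact.

```java
public class TwoRowSwapSketch {
    /**
     * Runs the automaton while holding only two rows of C cells,
     * so memory use is O(C) no matter how many steps are taken.
     */
    static double[] simulate(int cells, int steps, double a, double b) {
        double[] current = new double[cells];
        double[] next = new double[cells];
        current[cells / 2] = 1.0;   // all cells 0 except the center cell

        for (int s = 0; s < steps; s++) {
            for (int i = 0; i < cells; i++) {
                double avg = (current[(i - 1 + cells) % cells]
                            + current[i]
                            + current[(i + 1) % cells]) / 3.0;
                double v = avg * a + b;
                next[i] = v - Math.floor(v);   // keep the fractional part
            }
            // Swap the references instead of copying C cells.
            double[] tmp = current;
            current = next;
            next = tmp;
        }
        return current;
    }

    public static void main(String[] args) {
        double[] row = simulate(10, 5, 1.0, 11.0 / 12.0);
        for (double v : row) {
            System.out.printf("%.4f ", v);
        }
        System.out.println();
    }
}
```

After the swap, the old current row becomes scratch space for the following step, so no allocation happens inside the loop.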

CCASeq
import edu.rit.image.GrayImageRow;
import edu.rit.image.PJGGrayImage;
import edu.rit.image.PJGImage;
import edu.rit.smp.ca.BigRational;
import edu.rit.util.Range;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Constants.
static final BigRational ZERO = new BigRational ("0");
static final BigRational ONE = new BigRational ("1");
static final BigRational ONE_THIRD = new BigRational ("1/3");

// Command line arguments.
static int C;                // Number of cells
static int S;                // Number of time steps
static BigRational A;
static BigRational B;
static File imagefile;

CCASeq
// Old and new cell arrays.
static BigRational[] currentCell;
static BigRational[] nextCell;

// Grayscale image matrix.
static byte[][] pixelmatrix;
static PJGGrayImage image;
static PJGImage.Writer writer;

// One row of the grayscale image matrix.
static byte[] pixelrow;
static GrayImageRow imagerow;

CCASeq
// Parse command line arguments.
if (args.length != 5) usage();
C = Integer.parseInt (args[0]);
S = Integer.parseInt (args[1]);
// Fold the 1/3 of the neighborhood average into A once, up front.
A = new BigRational (args[2]).mul(ONE_THIRD);
B = new BigRational (args[3]);
imagefile = new File (args[4]);

// Allocate storage for old and new cell arrays. Initialize all cells to
// 0, except center cell to 1.
currentCell = new BigRational[C];
nextCell = new BigRational[C];
for (int i = 0; i < C; ++ i){
    currentCell[i] = new BigRational();
    nextCell[i] = new BigRational();
}
currentCell[C/2].assign(ONE);

CCASeq
// Set up pixel matrix, image, and image writer.
pixelmatrix = new byte [S+1] [];
image = new PJGGrayImage (S+1, C, pixelmatrix);
writer = image.prepareToWrite(
    new BufferedOutputStream (new FileOutputStream(imagefile)));

// Allocate storage for one pixel matrix row.
pixelrow = new byte[C];
imagerow = new GrayImageRow(pixelrow);
imagerow.setInterpretation(PJGGrayImage.ZERO_IS_WHITE);

CCASeq
// Do S time steps.
for (int s = 0; s < S; ++ s){
    // Calculate next state of each cell.
    for (int i = 0; i < C; ++ i){
        nextCell[i]
            .assign (currentCell[i])
            .add (currentCell[(i-1+C)%C])
            .add (currentCell[(i+1)%C])
            .mul (A)
            .add (B)
            .normalize()
            .fracPart();
    }
    // Write current CA state to image file.
    writeCurrentCell(s);
    // Advance one time step -- swap old and new cell arrays.
    BigRational[] tmp = currentCell;
    currentCell = nextCell;
    nextCell = tmp;
}

CCASeq
private static void writeCurrentCell(int r) throws IOException{
    // Set image row's gray values based on current cell states.
    for (int i = 0; i < C; ++ i)
        imagerow.setPixel(i, currentCell[i].floatValue());
    // Set row r of the pixel matrix.
    pixelmatrix[r] = pixelrow;
    // Write row-r slice of the image to the image file.
    writer.writeRowSlice(new Range(r, r));
}

Parallelize!
The calculation of each row depends on the previous row, so the outer loop cannot run in parallel. But the calculations within a row are independent, so the inner loop can run in parallel. The array reference swap and the writing of pixel data to the file must still be synchronized.

Barrier Action
All threads in a parallel for loop wait at a barrier until all are finished. A barrier action allows one thread to do some cleanup while the others continue to wait at the barrier.
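The same idea exists in the standard Java library: `java.util.concurrent.CyclicBarrier` accepts a `Runnable` that runs in exactly one thread once all threads have arrived and before any of them is released. The sketch below uses that class instead of the Parallel Java library, with `double` cells for brevity; `BarrierActionSketch`, `simulate`, and the fixed block partitioning are assumptions made for illustration.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

public class BarrierActionSketch {
    static double[] simulate(int cells, int steps, double a, double b, int threads) {
        // rows[0] is the current row, rows[1] is the next row.
        final double[][] rows = { new double[cells], new double[cells] };
        rows[0][cells / 2] = 1.0;

        // Barrier action: runs in one thread while the others still wait.
        CyclicBarrier barrier = new CyclicBarrier(threads, () -> {
            double[] tmp = rows[0];
            rows[0] = rows[1];
            rows[1] = tmp;
        });

        Thread[] team = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            // Each thread owns a fixed block of cells.
            final int lo = t * cells / threads;
            final int hi = (t + 1) * cells / threads;
            team[t] = new Thread(() -> {
                try {
                    for (int s = 0; s < steps; s++) {   // sequential outer loop
                        double[] cur = rows[0];
                        double[] next = rows[1];
                        for (int i = lo; i < hi; i++) { // this thread's share of the inner loop
                            double avg = (cur[(i - 1 + cells) % cells]
                                        + cur[i]
                                        + cur[(i + 1) % cells]) / 3.0;
                            double v = avg * a + b;
                            next[i] = v - Math.floor(v);
                        }
                        barrier.await();                // wait until the swap is done
                    }
                } catch (InterruptedException | BrokenBarrierException e) {
                    Thread.currentThread().interrupt();
                }
            });
            team[t].start();
        }
        try {
            for (Thread th : team) {
                th.join();
            }
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return rows[0];
    }

    public static void main(String[] args) {
        for (double v : simulate(10, 5, 1.0, 11.0 / 12.0, 4)) {
            System.out.printf("%.4f ", v);
        }
        System.out.println();
    }
}
```

Putting the swap in the barrier action means no thread can start the next step with a stale array reference, and no thread can overwrite a row that another thread is still reading.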

Barrier Action
new ParallelTeam().execute(new ParallelRegion(){
    ...
    execute(0, 99, new IntegerForLoop(){
        // Each thread runs its own chunk [first, last] of the index range.
        public void run(int first, int last){
            for (int i = first; i <= last; ++i){
                // Loop body
            }
        }
    },
    new BarrierAction(){
        public void run(){
            // Code to execute in a single thread
        }
    });
    ...
});

Barrier Action

CCASmp
// Do S time steps. Sequential outer loop.
for (int s = 0; s < S; ++ s){
    final int step = s;
    // Calculate next state of each cell. Parallel inner loop.
    execute (0, C-1, new IntegerForLoop(){
        public IntegerSchedule schedule(){
            return IntegerSchedule.guided();
        }
        public void run (int first, int last){
            for (int i = first; i <= last; ++ i){
                nextCell[i]
                    .assign (currentCell[i])
                    .add (currentCell[(i-1+C)%C])
                    .add (currentCell[(i+1)%C])
                    .mul (A)
                    .add (B)
                    .normalize()
                    .fracPart();
            }
        }
    },

CCASmp
    // Synchronize threads before next outer loop iteration.
    new BarrierAction(){
        public void run() throws Exception{
            // Write current CA state to image file.
            writeCurrentCell (step);
            // Advance one time step -- swap old and new cell
            // arrays.
            BigRational[] tmp = currentCell;
            currentCell = nextCell;
            nextCell = tmp;
        }
    });
}

Performance