Computer Science 320 Reduction


Estimating π
Throw N darts at the unit square, and let C be the number of darts that land within the quadrant of the unit circle.
Then C / N should be about the same as the ratio quadrant area / square area.
The circle's area is π * R^2, so with R = 1 the quadrant's area is π / 4 and the square's area is 1.
Then C / N ≈ π / 4, and π ≈ 4 * C / N.

Monte Carlo Methods
Throw N darts, and let C be the number of darts that land within the quadrant of the unit circle.
π ≈ 4 * C / N
Monte Carlo methods make use of random numbers to solve a problem.
The more points we generate, the more accurate the estimate tends to be.
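Why more points help can be made precise with a short variance calculation. This is a sketch under the idealized assumption that the darts are independent and uniformly distributed over the unit square, which a pseudorandom generator only approximates; the symbols p and σ are introduced here for illustration.

```latex
\begin{align*}
p &= \Pr\left[x^2 + y^2 \le 1\right] = \frac{\pi}{4},
  \qquad C \sim \mathrm{Binomial}(N, p) \\
\mathbb{E}\left[\frac{4C}{N}\right] &= 4p = \pi \\
\operatorname{Var}\left[\frac{4C}{N}\right] &= \frac{16\,p\,(1-p)}{N}
  = \frac{\pi\,(4-\pi)}{N} \\
\sigma &= \sqrt{\frac{\pi\,(4-\pi)}{N}} \approx \frac{1.64}{\sqrt{N}}
\end{align*}
```

Under these assumptions the typical error shrinks like 1/√N, so gaining one more decimal digit of π costs roughly 100 times as many points.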

Sequential Program PiSeq
Inputs:
–the seed for the random number generator
–the number of points to generate
Output: the estimate of π
Resources: java.util.Random

Sequential Program PiSeq (declarations and setup)

    import java.util.Random;

    // Command line arguments.
    static long seed;
    static long N;

    // Pseudorandom number generator.
    static Random prng;

    // Number of points within the unit circle.
    static long count;

    // In main(): start timing.
    long time = -System.currentTimeMillis();

    // Validate command line arguments.
    if (args.length != 2) usage();
    seed = Long.parseLong (args[0]);
    N = Long.parseLong (args[1]);

    // Set up PRNG.
    prng = new Random (seed);

Sequential Program PiSeq (computation and output)

    // Generate N random points in the unit square, count how many are in
    // the unit circle.
    count = 0;
    for (long i = 0; i < N; ++ i) {
        double x = prng.nextDouble();
        double y = prng.nextDouble();
        if (x * x + y * y <= 1.0) ++ count;
    }

    // Stop timing.
    time += System.currentTimeMillis();

    // Print results.
    System.out.println("pi = 4 * " + count + " / " + N + " = " +
        (4.0 * count / N));
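The PiSeq fragments can be assembled into one runnable class using only the standard library. This is a minimal sketch, not the course's actual source file: the class name PiSeqSketch, the method name estimate, and the default seed and N used when no arguments are given are all assumptions made for illustration.

```java
import java.util.Random;

// Hypothetical class name; a sketch assembled from the slide fragments.
class PiSeqSketch {

    // Returns 4 * C / N, where C counts the points (out of n pseudorandom
    // points in the unit square) that fall inside the unit circle quadrant.
    static double estimate(long seed, long n) {
        Random prng = new Random(seed);
        long count = 0;
        for (long i = 0; i < n; ++i) {
            double x = prng.nextDouble();
            double y = prng.nextDouble();
            if (x * x + y * y <= 1.0) ++count;
        }
        return 4.0 * count / n;
    }

    public static void main(String[] args) {
        // Assumed defaults so the sketch also runs without arguments.
        long seed = args.length == 2 ? Long.parseLong(args[0]) : 142857L;
        long n = args.length == 2 ? Long.parseLong(args[1]) : 1_000_000L;

        long time = -System.currentTimeMillis();   // start timing
        double pi = estimate(seed, n);
        time += System.currentTimeMillis();        // stop timing

        System.out.println("pi = " + pi);
        System.out.println(time + " msec");
    }
}
```

Keeping the loop in a method that takes the seed and N makes the estimate reproducible: the same two inputs always yield the same output.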

Parallelize!
Multiple threads generate and throw darts.
Shared variables: prng and count.
These are WMRM, so the threads must be synchronized.
java.util.Random is multiple thread-safe, because it uses an atomic compare-and-set (CAS) operation.
edu.rit.pj.reduction.SharedLong also employs CAS and thus is multiple thread-safe.
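The CAS idea mentioned above can be illustrated with the standard java.util.concurrent.atomic.AtomicLong class. This is a sketch of the general retry-loop pattern, not the actual source of java.util.Random or SharedLong; the class CasCounterSketch and the method runDemo are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical counter illustrating a compare-and-set retry loop.
class CasCounterSketch {
    private final AtomicLong value = new AtomicLong(0);

    // Read the current value, compute the new one, and try to swap it in.
    // If another thread changed the value in between, the swap fails and
    // the loop retries with the fresh value.
    long incrementAndGet() {
        while (true) {
            long old = value.get();
            long next = old + 1;
            if (value.compareAndSet(old, next)) return next;
        }
    }

    long get() {
        return value.get();
    }

    // Runs `threads` threads that each increment the counter `perThread` times.
    static long runDemo(int threads, long perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] team = new Thread[threads];
        for (int t = 0; t < threads; ++t) {
            team[t] = new Thread(() -> {
                for (long i = 0; i < perThread; ++i) counter.incrementAndGet();
            });
            team[t].start();
        }
        try {
            for (Thread th : team) th.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 100_000));   // prints 400000
    }
}
```

No increment is lost, but every failed swap costs a retry, which is why heavy contention on one shared variable slows the threads down.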

Parallel Program PiSmp

    // Generate N random points in the unit square, count how many are in
    // the unit circle.
    count = new SharedLong(0);
    new ParallelTeam().execute(new ParallelRegion() {
        public void run() throws Exception {
            execute(0, N-1, new LongForLoop() {
                public void run (long first, long last) {
                    for (long i = first; i <= last; ++ i) {
                        double x = prng.nextDouble();
                        double y = prng.nextDouble();
                        if (x*x + y*y <= 1.0) count.incrementAndGet();
                    }
                }
            });
        }
    });

Performance of PiSmp

Problem
Synchronization on prng and count means threads must wait on each iteration; the more threads there are, the more waiting occurs.
There might be billions of iterations!

Solution
Each thread gets its own prng and count, and the counts are combined at the end: reduction!

Parallel Program PiSmp2

    new ParallelTeam().execute (new ParallelRegion() {
        public void run() throws Exception {
            execute (0, N-1, new LongForLoop() {
                // Set up per-thread PRNG and counter.
                Random prng_thread = new Random (seed);
                long count_thread = 0;

                // Extra padding to avert cache interference.
                long pad0, pad1, pad2, pad3, pad4, pad5, pad6, pad7;
                long pad8, pad9, pada, padb, padc, padd, pade, padf;

                // Parallel loop body.
                public void run (long first, long last) {
                    // Generate random points.
                    for (long i = first; i <= last; ++ i) {
                        double x = prng_thread.nextDouble();
                        double y = prng_thread.nextDouble();
                        if (x*x + y*y <= 1.0) ++ count_thread;
                    }
                }

                public void finish() {
                    // Reduce per-thread counts into shared count.
                    count.addAndGet (count_thread);
                }
            });
        }
    });
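The same reduction pattern can be sketched with plain Java threads for readers without the Parallel Java library. This is an illustration under assumptions, not the course code: the class PiReductionSketch, the method countHits, and the contiguous chunking of the index range are hypothetical, and AtomicLong stands in for SharedLong. Like PiSmp2, every thread seeds its own generator with the same seed.

```java
import java.util.Random;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of per-thread counting followed by a reduction.
class PiReductionSketch {

    // Counts hits among n points using `threads` threads. Each thread works
    // on a private PRNG and a private counter, and touches the shared
    // counter exactly once, when it finishes its chunk.
    static long countHits(long seed, long n, int threads) {
        AtomicLong count = new AtomicLong(0);   // stand-in for SharedLong
        Thread[] team = new Thread[threads];
        for (int t = 0; t < threads; ++t) {
            // Assumed chunking: split 0..n-1 into contiguous ranges.
            final long first = n * t / threads;
            final long last = n * (t + 1) / threads - 1;
            team[t] = new Thread(() -> {
                Random prngThread = new Random(seed);   // same seed, as in PiSmp2
                long countThread = 0;
                for (long i = first; i <= last; ++i) {
                    double x = prngThread.nextDouble();
                    double y = prngThread.nextDouble();
                    if (x * x + y * y <= 1.0) ++countThread;
                }
                // Reduce the per-thread count into the shared count.
                count.addAndGet(countThread);
            });
            team[t].start();
        }
        try {
            for (Thread th : team) th.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(ex);
        }
        return count.get();
    }

    public static void main(String[] args) {
        long n = 10_000_000L;
        long c = countHits(142857L, n, 4);
        System.out.println("pi = 4 * " + c + " / " + n + " = " + (4.0 * c / n));
    }
}
```

The inner loop now runs without any synchronization; the only shared update is one addAndGet per thread, however many iterations there are.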

Performance of PiSmp2

Another Problem
The parallel version with one thread produces the same value of π as the sequential version.
With multiple threads, we get a different value of π for the same seed and N.
Each thread seeds its own PRNG with the same seed, so each thread is generating the same points!
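The effect can be reproduced without any threads at all. The sketch below simulates, sequentially, what K identically seeded threads would count when each handles N / K points; the class SameSeedSketch and its method names are hypothetical, and N is assumed to be divisible by K.

```java
import java.util.Random;

// Hypothetical demonstration that identically seeded PRNGs repeat points.
class SameSeedSketch {

    // Hits among the first `points` points produced from `seed`.
    static long hits(long seed, long points) {
        Random prng = new Random(seed);
        long count = 0;
        for (long i = 0; i < points; ++i) {
            double x = prng.nextDouble();
            double y = prng.nextDouble();
            if (x * x + y * y <= 1.0) ++count;
        }
        return count;
    }

    // Total that k workers report when each one seeds its own PRNG with the
    // same seed and generates n / k points (n assumed divisible by k).
    static long hitsWithSharedSeed(long seed, long n, int k) {
        long total = 0;
        for (int worker = 0; worker < k; ++worker) {
            total += hits(seed, n / k);   // every worker replays the same points
        }
        return total;
    }

    public static void main(String[] args) {
        long seed = 142857L;
        long n = 1_000_000L;
        System.out.println("one worker:   " + hits(seed, n));
        System.out.println("four workers: " + hitsWithSharedSeed(seed, n, 4));
    }
}
```

With K workers the total is just K times the count for the first N / K points, so the run effectively samples only N / K distinct points rather than N.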