Monte Carlo Integration Using MPI

Monte Carlo Integration Using MPI
Barry L. Kurtz, Chris Mitchell
Appalachian State University

Monitoring MPI Performance
Goals:
– We will use MPI
– We will parallelize the algorithm to increase accuracy
– We will parallelize the algorithm to increase speed
– We will vary the number of processors from 1 to 8 under these conditions
Node performance monitoring:
– Graphical plot of CPU usage on each node
– Separates out the types of CPU tasks

Integration Using Monte Carlo
Main idea:
– Similar to the PI program demonstrated with MATLAB: place random points in a rectangular area and find the percentage of points that satisfy the given criterion
– Our functions will be in the first quadrant only
Variables:
– Number of processors used
– The function being integrated
– The number of histories in the sample space
– The low and high ends of the integration interval
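
The hit-or-miss idea can be written out directly. Below is a minimal serial sketch, assuming a bounding rectangle of height max over [low, high]; the function name monteCarloArea and the use of rand() are illustrative choices, not code from the slides.

    #include <stdlib.h>

    /* Serial hit-or-miss estimate of the area under fp on [low, high],
       assuming 0 <= fp(x) <= max on the interval (first-quadrant functions). */
    double monteCarloArea(double (*fp)(double), double low, double high,
                          double max, long n)
    {
        long hits = 0;
        long i;

        for (i = 0; i < n; i++) {
            /* a uniformly random point inside the bounding rectangle */
            double x = low + (high - low) *
                       ((double) rand() / ((double) RAND_MAX + 1.0));
            double y = max * ((double) rand() / ((double) RAND_MAX + 1.0));
            if (y < fp(x))      /* the point falls under the curve: a hit */
                hits++;
        }
        /* fraction of hits times the area of the rectangle */
        return ((double) hits / (double) n) * (max * (high - low));
    }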

Example: f(x) = 2x²
– Given the range 0 to 5
– The analytic solution is (2/3)x³ evaluated from 0 to 5, giving (2/3)·5³ = 250/3 = 83 1/3
Sample calculation:
– # Hits = 3, Total pts = 10
– Area of rectangle = 50 · 5 = 250
– Estimate of integral: 250 · 3/10 = 75
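
Used together with the monteCarloArea sketch above (a hypothetical helper, not from the slides), the example corresponds roughly to:

    #include <stdio.h>

    double monteCarloArea(double (*fp)(double), double, double, double, long);

    double f(double x) { return 2.0 * x * x; }   /* the example function */

    int main(void)
    {
        /* rectangle height 50 = f(5), the maximum of f on [0, 5] */
        double estimate = monteCarloArea(f, 0.0, 5.0, 50.0, 10000000L);
        printf("estimate = %f (exact value: 250/3 = 83.333...)\n", estimate);
        return 0;
    }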

Parallelization Techniques
Strategy 1: increase the total number of points by giving each processor the specified number of points
– As the number of processors increases, we expect accuracy to increase because of the larger total number of points
– Computation time should not change dramatically
Strategy 2: divide a specified number of points "equally" between the processors
– As the number of processors increases, we expect accuracy to stay the same
– Total computation time should decrease
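
A minimal sketch of the one-line difference between the two strategies, using the variable names that appear on the later slides (numHist, size); the speedFocus flag is purely illustrative:

    /* 'numHist' is the requested number of histories; 'size' comes from
       MPI_Comm_size().  The 'speedFocus' flag is hypothetical, shown only
       to contrast the two strategies. */
    if (speedFocus)
        numHist = numHist / size;   /* speed: each rank does only its share */
    /* otherwise (accuracy): every rank runs the full numHist histories     */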

Three Test Functions
– f(x) = 2x² – strictly increasing function
– g(x) = e^(-x) – strictly decreasing function
– h(x) = 2 + sin(x) – oscillating function
How will we find the area of the enclosing rectangle?
– Issues arise in finding the maximum value of the given function on the given interval
– Think of a solution that could apply to all three functions given above
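
For reference, a straightforward C transcription of the three test functions (the names follow the slide; <math.h> supplies exp and sin):

    #include <math.h>

    double f(double x) { return 2.0 * x * x; }    /* strictly increasing */
    double g(double x) { return exp(-x); }        /* strictly decreasing */
    double h(double x) { return 2.0 + sin(x); }   /* oscillating         */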

Finding the Maximum

MPI Code for Finding Max
double findMax(double low, double high, double (*fp)(double))
{
    double i, interval,    /* size of steps between tests       */
           result,         /* function return value             */
           max = 0;        /* holds the largest value found yet */

    interval = (high - low) / 100;
    for (i = low; i < high; i += interval) {
        result = fp(i);
        if (result > max)
            max = result;
    }
    return max;
}
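
A usage note (assumed, not shown on this slide): the returned value serves as the height of the bounding rectangle for the sampling loop that follows. Because the function is sampled at only 100 evenly spaced points, the true maximum can be slightly undershot; for the three smooth test functions above this is a minor concern.

    max = findMax(low, high, fp);   /* height of the bounding rectangle */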

MPI Initialization for Accuracy
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  :::
MPI_Bcast(&numHist, 1, MPI_INT, MASTER, MPI_COMM_WORLD);
MPI_Bcast(&low, 1, MPI_DOUBLE, MASTER, MPI_COMM_WORLD);
MPI_Bcast(&high, 1, MPI_DOUBLE, MASTER, MPI_COMM_WORLD);
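
The fragment assumes surrounding declarations along these lines (an assumption; only the calls above appear on the slide):

    #define MASTER 0             /* rank of the master process              */

    int    size, rank, numHist;  /* process count, this rank, history count */
    double low, high;            /* integration interval                    */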

MPI Code for Accuracy
/* history calculation loop */
for (i = 0; i < numHist; i++) {
    x = ((double) random() / ((double) (RAND_MAX) + (double) (1)));
    x *= (high - low);
    x += low;
    y = ((double) random() / ((double) (RAND_MAX) + (double) (1))) * max;
    /* if the point is below the function value, it's a hit */
    if (y < fp(x))          /* fp is the function to be integrated */
        hits++;
    total++;
}
/* calculate this process' estimate of the function's area */
subArea = ((double) (hits) / (double) (total)) * (max * (high - low));
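
One practical detail the slide leaves out (and the later prediction question about random-number generation hints at): each rank should seed its generator differently, otherwise every process draws the identical stream and the extra histories add no new information. A minimal sketch, assuming the random()/srandom() family used in the loop above; the offset factor is an arbitrary choice:

    #include <time.h>    /* for time() */

    /* give each rank its own seed */
    srandom((unsigned) time(NULL) + 17 * rank);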

Gather the Data and Calculate the Result
/* calculate the total hits and histories generated by all processes */
MPI_Reduce(&hits, &allHits, 1, MPI_INT, MPI_SUM, MASTER, MPI_COMM_WORLD);
MPI_Reduce(&total, &allTotal, 1, MPI_INT, MPI_SUM, MASTER, MPI_COMM_WORLD);
if (rank == MASTER) {
    area = ((double) (allHits) / (double) (allTotal)) * (max * (high - low));
    printf("\nArea of function between %5.3f and %5.3f is: %f\n",
           low, high, area);
}
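
Presumably the program ends with the standard MPI teardown (assumed; not shown on the slide). The comment below also spells out why the reductions sum hits and totals before dividing:

    /* Dividing the summed hits by the summed totals pools all the samples
       into one large experiment, so the combined estimate is correct no
       matter how the histories were distributed across the ranks. */
    MPI_Finalize();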

MPI Initialization for Speed
MPI_Init(&argc, &argv);
MPI_Comm_size(MPI_COMM_WORLD, &size);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  :::
numHist = numHist / size;
MPI_Bcast(&numHist, 1, MPI_INT, MASTER, MPI_COMM_WORLD);
MPI_Bcast(&low, 1, MPI_DOUBLE, MASTER, MPI_COMM_WORLD);
MPI_Bcast(&high, 1, MPI_DOUBLE, MASTER, MPI_COMM_WORLD);
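
The slides do not say how the times in the tables below were measured; one conventional approach is sketched here (an assumption, using the standard MPI_Wtime routine):

    double t0, t1;

    MPI_Barrier(MPI_COMM_WORLD);   /* align the ranks before timing */
    t0 = MPI_Wtime();
    /* ... history loop and reductions ... */
    t1 = MPI_Wtime();
    if (rank == MASTER)
        printf("elapsed time: %f s\n", t1 - t0);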

What Are Your Predictions?
– Will accuracy increase linearly with the number of processors?
– Will the execution time decrease linearly with the number of processors?
– How important is the random number generation?
– Would you expect occasional anomalies?

The Performance Monitor
– Monitors performance on a local cluster
– Separates the following types of CPU usage: User %, System %, Easy %, Total %
– Provides a quick, intuitive view of the load balancing for the algorithm distribution
– Developed at Appalachian State by Keith Woodie and Michael Economy

Results for Increasing Accuracy
Number of histories per processor = 10,000,000 (exact value of the integral: 250/3 ≈ 83.3333)

# Processors   Time (s)   Result      Abs Err
     1          2.469     83.4247     0.091366667
     2          2.604     83.344062   0.010728667
     3          2.470     83.3935     0.060166667
     4          2.612     83.3531     0.019766667
     5          2.637     83.344005   0.010671667
     6          2.616     83.318346   0.014987333
     7          2.602     83.344739   0.011405667
     8          2.618     83.334184   0.000850667

Results for Increasing Speed
Total number of histories = 10,000,000

# Processors   Time (s)   Result      Abs Err
     1          2.524     83.335025   0.001691667
     2          1.261     83.316375   0.016958333
     3          0.838     83.296083   0.037250333
     4          0.629     83.313425   0.019908333
     5          0.515     83.263475   0.069858333
     6          0.421     83.323333   0.010000333
     7          0.361     83.26295    0.070383333
     8          0.317     83.368875   0.035541667
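
Reading the two tables together: in the accuracy runs the time stays essentially flat (about 2.5 s to 2.6 s) while the error generally shrinks as points accumulate, whereas the speed runs show near-linear speedup, roughly 2.524 s / 0.317 s ≈ 8.0 on 8 processors, with the error fluctuating around the single-processor level.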