Lecture 5 PRAM Algorithms (cont.)

Slides:



Advertisements
Similar presentations
1 Parallel Algorithms (chap. 30, 1 st edition) Parallel: perform more than one operation at a time. PRAM model: Parallel Random Access Model. p0p0 p1p1.
Advertisements

Parallel Algorithms.
PRAM Algorithms Sathish Vadhiyar. PRAM Model - Introduction Parallel Random Access Machine Allows parallel-algorithm designers to treat processing power.
1 Lecture 5 PRAM Algorithm: Parallel Prefix Parallel Computing Fall 2008.
Block LU Factorization Lecture 24 MA471 Fall 2003.
Instructor Neelima Gupta Table of Contents Parallel Algorithms.
Parallel Algorithms and Computing Selected topics Parallel Architecture.
Optimal PRAM algorithms: Efficiency of concurrent writing “Computer science is no more about computers than astronomy is about telescopes.” Edsger Dijkstra.
Lecture 3: Parallel Algorithm Design
Parallel Matrix Operations using MPI CPS 5401 Fall 2014 Shirley Moore, Instructor November 3,
Advanced Algorithms Piyush Kumar (Lecture 12: Parallel Algorithms) Welcome to COT5405 Courtesy Baker 05.
PRAM (Parallel Random Access Machine)
Efficient Parallel Algorithms COMP308
TECH Computer Science Parallel Algorithms  several operations can be executed at the same time  many problems are most naturally modeled with parallelism.
Advanced Topics in Algorithms and Data Structures Lecture pg 1 Recursion.
Advanced Topics in Algorithms and Data Structures Classification of the PRAM model In the PRAM model, processors communicate by reading from and writing.
PRAM Models Advanced Algorithms & Data Structures Lecture Theme 13 Prof. Dr. Th. Ottmann Summer Semester 2006.
Simulating a CRCW algorithm with an EREW algorithm Efficient Parallel Algorithms COMP308.
Slide 1 Parallel Computation Models Lecture 3 Lecture 4.
Overview Efficient Parallel Algorithms COMP308. COMP 308 Exam Time allowed : 2.5 hours Answer four questions (out of six). If you attempt to answer more.
Advanced Topics in Algorithms and Data Structures 1 Lecture 4 : Accelerated Cascading and Parallel List Ranking We will first discuss a technique called.
CSE621/JKim Lec4.1 9/20/99 CSE621 Parallel Algorithms Lecture 4 Matrix Operation September 20, 1999.
UMass Lowell Computer Science Analysis of Algorithms Prof. Karen Daniels Fall, 2001 Lecture 9 Tuesday, 11/20/01 Parallel Algorithms Chapters 28,
1 Lecture 6 More PRAM Algorithm Parallel Computing Fall 2008.
Sorting, Searching, and Simulation in the MapReduce Framework Michael T. Goodrich Dept. of Computer Science.
The Euler-tour technique
Design of parallel algorithms Matrix operations J. Porras.
1 Matrix Addition, C = A + B Add corresponding elements of each matrix to form elements of result matrix. Given elements of A as a i,j and elements of.
Parallel Computers 1 The PRAM Model for Parallel Computation (Chapter 2) References:[2, Akl, Ch 2], [3, Quinn, Ch 2], from references listed for Chapter.
1 Lecture 3 PRAM Algorithms Parallel Computing Fall 2008.
Fall 2008Paradigms for Parallel Algorithms1 Paradigms for Parallel Algorithms.
Basic PRAM algorithms Problem 1. Min of n numbers Problem 2. Computing a position of the first one in the sequence of 0’s and 1’s.
Simulating a CRCW algorithm with an EREW algorithm Lecture 4 Efficient Parallel Algorithms COMP308.
RAM and Parallel RAM (PRAM). Why models? What is a machine model? – A abstraction describes the operation of a machine. – Allowing to associate a value.
1 Lecture 2: Parallel computational models. 2  Turing machine  RAM (Figure )  Logic circuit model RAM (Random Access Machine) Operations supposed to.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 8, 2005 Session 8.
Parallel Algorithms Sorting and more. Keep hardware in mind When considering ‘parallel’ algorithms, – We have to have an understanding of the hardware.
A.Broumandnia, 1 5 PRAM and Basic Algorithms Topics in This Chapter 5.1 PRAM Submodels and Assumptions 5.2 Data Broadcasting 5.3.
Matrix Multiplication Instructor : Dr Sushil K Prasad Presented By : R. Jayampathi Sampath Instructor : Dr Sushil K Prasad Presented By : R. Jayampathi.
1 Lectures on Parallel and Distributed Algorithms COMP 523: Advanced Algorithmic Techniques Lecturer: Dariusz Kowalski Lectures on Parallel and Distributed.
RAM, PRAM, and LogP models
Bulk Synchronous Processing (BSP) Model Course: CSC 8350 Instructor: Dr. Sushil Prasad Presented by: Chris Moultrie.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 3, 2005 Session 7.
Parallel Algorithms. Parallel Models u Hypercube u Butterfly u Fully Connected u Other Networks u Shared Memory v.s. Distributed Memory u SIMD v.s. MIMD.
06/12/2015Applied Algorithmics - week41 Non-periodicity and witnesses  Periodicity - continued If string w=w[0..n-1] has periodicity p if w[i]=w[i+p],
5 PRAM and Basic Algorithms
Fall 2008Simple Parallel Algorithms1. Fall 2008Simple Parallel Algorithms2 Scalar Product of Two Vectors Let a = (a 1, a 2, …, a n ); b = (b 1, b 2, …,
Unit1: Modeling & Simulation Module5: Logic Simulation Topic: Unknown Logic Value.
3/12/2013Computer Engg, IIT(BHU)1 PRAM ALGORITHMS-3.
3/12/2013Computer Engg, IIT(BHU)1 PRAM ALGORITHMS-1.
PRAM and Parallel Computing
Properties and Applications of Matrices
Lecture 3: Parallel Algorithm Design
PRAM Model for Parallel Computation
Parallel Algorithms (chap. 30, 1st edition)
Lecture 2: Parallel computational models
Lecture 22 review PRAM: A model developed for parallel machines
Dr. Ameria Eldosoky Discrete mathematics
PRAM Algorithms.
PRAM Model for Parallel Computation
Parallel Matrix Operations
[ ] [ ] [ ] [ ] EXAMPLE 3 Scalar multiplication Simplify the product:
CSE838 Lecture notes copy right: Moon Jung Chung
Numerical Algorithms Quiz questions
Unit –VIII PRAM Algorithms.
Professor Ioana Banicescu CSE 8843
Parallel Algorithms A Simple Model for Parallel Processing
Module 6: Introduction to Parallel Computing
Matrix Addition, C = A + B Add corresponding elements of each matrix to form elements of result matrix. Given elements of A as ai,j and elements of B as.
Presentation transcript:

Lecture 5 PRAM Algorithms (cont.) Parallel Computing Fall 2010

PRAM Algorithm: Matrix Multiplication A simple algorithm for multiplying two n × n matrices on a CREW PRAM with time complexity T = O(lg n) and P = n3 follows. For convenience, processors are indexed as triples (i, j, k), where i, j, k = 1, . . . , n. In the first step processor (i, j, k) concurrently reads aij and bjk and performs the multiplication aijbjk. In the following steps, for all i, k the results (i, ∗, k) are combined, using the parallel sum algorithm to form cik = j aijbjk. After lgn steps, the result cik is thus computed. The same algorithm also works on the EREW PRAM with the same time and processor complexity. The first step of the CREW algorithm need to be changed only. We avoid concurrency by broadcasting element aij to processors (i, j, ∗) using the broadcasting algorithm of the EREW PRAM in O(lg n) steps. Similarly, bjk is broadcast to processors (∗, j, k). The above algorithm also shows how an n-processor EREW PRAM can simulate an n-processor CREW PRAM with an O(lg n) slowdown.

Matrix Multiplication CREW EREW 1. aij to all (i,j,*) procs O(1) O(lgn) bjk to all (*,j,k) procs O(1) O(lgn) 2. aij*bjk at (i,j,k) proc O(1) O(1) 3. parallel sumj aij *bjk (i,*,k) procs O(lgn) O(lgn) n procs participate 4. cik = sumj aij*bjk O(1) O(1) T=O(lgn),P=O(n3 ) W=O( n3 lgn) W2 = O(n3 )

PRAM Algorithm: Logical AND operation Problem. Let X1 . . .,Xn be binary/boolean values. Find X = X1 ∧ X2 ∧ . . . ∧ Xn. The sequential problem accepts a P = 1, T = O(n),W = O(n) direct solution. An EREW PRAM algorithm solution for this problem works the same way as the PARALLEL SUM algorithm and its performance is P = O(n), T = O(lg n),W = O(n lg n) along with the improvements in P and W mentioned for the PARALLEL SUM algorithm. In the remainder we will investigate a CRCW PRAM algorithm. Let binary value Xi reside in the shared memory location i. We can find X = X1 ∧ X2 ∧ . . . ∧ Xn in constant time on a CRCW PRAM. Processor 1 first writes an 1 in shared memory cell 0. If Xi = 0, processor i writes a 0 in memory cell 0. The result X is then stored in this memory cell. The result stored in cell 0 is 1 (TRUE) unless a processor writes a 0 in cell 0; then one of the Xi is 0 (FALSE) and the result X should be FALSE, as it is.

End Thank you!