Lecture 22 review PRAM: A model developed for parallel machines

Slides:



Advertisements
Similar presentations
Machine cycle.
Advertisements

PRAM Algorithms Sathish Vadhiyar. PRAM Model - Introduction Parallel Random Access Machine Allows parallel-algorithm designers to treat processing power.
Lecture 38: Chapter 7: Multiprocessors Today’s topic –Vector processors –GPUs –An example 1.
Optimal PRAM algorithms: Efficiency of concurrent writing “Computer science is no more about computers than astronomy is about telescopes.” Edsger Dijkstra.
PRAM (Parallel Random Access Machine)
Efficient Parallel Algorithms COMP308
TECH Computer Science Parallel Algorithms  several operations can be executed at the same time  many problems are most naturally modeled with parallelism.
Advanced Topics in Algorithms and Data Structures Classification of the PRAM model In the PRAM model, processors communicate by reading from and writing.
PRAM Models Advanced Algorithms & Data Structures Lecture Theme 13 Prof. Dr. Th. Ottmann Summer Semester 2006.
Simulating a CRCW algorithm with an EREW algorithm Efficient Parallel Algorithms COMP308.
Uzi Vishkin.  Introduction  Objective  Model of Parallel Computation ▪ Work Depth Model ( ~ PRAM) ▪ Informal Work Depth Model  PRAM Model  Technique:
Slide 1 Parallel Computation Models Lecture 3 Lecture 4.
Advanced Topics in Algorithms and Data Structures An overview of the lecture 2 Models of parallel computation Characteristics of SIMD models Design issue.
Models of Parallel Computation
1 Lecture 8 Architecture Independent (MPI) Algorithm Design Parallel Computing Fall 2007.
Communication [Lower] Bounds for Heterogeneous Architectures Julian Bui.
Models of Parallel Computation Advanced Algorithms & Data Structures Lecture Theme 12 Prof. Dr. Th. Ottmann Summer Semester 2006.
1 Lecture 3 PRAM Algorithms Parallel Computing Fall 2008.
RAM and Parallel RAM (PRAM). Why models? What is a machine model? – A abstraction describes the operation of a machine. – Allowing to associate a value.
1 Lecture 2: Parallel computational models. 2  Turing machine  RAM (Figure )  Logic circuit model RAM (Random Access Machine) Operations supposed to.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 8, 2005 Session 8.
Parallel Algorithms Sorting and more. Keep hardware in mind When considering ‘parallel’ algorithms, – We have to have an understanding of the hardware.
Design and analysis of algorithms for multicore architectures Alejandro Salinger April 2 nd, 2009 Joint work with Alex López-Ortiz and Reza Dorrigiv.
-1.1- Chapter 2 Abstract Machine Models Lectured by: Nguyễn Đức Thái Prepared by: Thoại Nam.
RAM, PRAM, and LogP models
LogP and BSP models. LogP model Common MPP organization: complete machine connected by a network. LogP attempts to capture the characteristics of such.
Bulk Synchronous Processing (BSP) Model Course: CSC 8350 Instructor: Dr. Sushil Prasad Presented by: Chris Moultrie.
Parallel Computing Department Of Computer Engineering Ferdowsi University Hossain Deldari.
Computer Science and Engineering Parallel and Distributed Processing CSE 8380 February 3, 2005 Session 7.
Parallel Processing & Distributed Systems Thoai Nam Chapter 2.
Data Structures and Algorithms in Parallel Computing Lecture 1.
Data Structures and Algorithms in Parallel Computing Lecture 4.
1 A simple parallel algorithm Adding n numbers in parallel.
1 Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Computer Architecture Lecture 25 Fasih ur Rehman.
CPT-S Advanced Databases
These slides are based on the book:
PRAM and Parallel Computing
Community Grids Laboratory
Distributed and Parallel Processing
EECE571R -- Harnessing Massively Parallel Processors ece
PRAM Model for Parallel Computation
Dynamic connection system
Parallel Computing Lecture
Lecture 2: Parallel computational models
Parallel computation models
PRAM Algorithms.
Lecture 5: GPU Compute Architecture
PRAM Model for Parallel Computation
Lecture 16: Parallel Algorithms I
Lecture 7: Introduction to Distributed Computing.
Lecture 5: GPU Compute Architecture for the last time
Data Structures and Algorithms in Parallel Computing
Parallel and Distributed Algorithms
Symmetric Multiprocessing (SMP)
Objective of This Course
STUDY AND IMPLEMENTATION
Lecture 5 PRAM Algorithms (cont.)
CSE838 Lecture notes copy right: Moon Jung Chung
Thread Synchronization
Part 2: Parallel Models (I)
COMP60621 Fundamentals of Parallel and Distributed Systems
Unit –VIII PRAM Algorithms.
EE 4xx: Computer Architecture and Performance Programming
© 2012 Elsevier, Inc. All rights reserved.
Parallel Algorithms A Simple Model for Parallel Processing
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Process Synchronization
COMP60611 Fundamentals of Parallel and Distributed Systems
Processor design Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section 11.3.
Presentation transcript:

Lecture 22 review PRAM: A model developed for parallel machines An unbounded collection of processors Each processor has an infinite number of registers An unbounded collection of shared memory cells. All processors can access all memory cells in unit time (when there is no memory conflict). All processors execute PRAM instructions synchronously Each PRAM instruction executes in 3-phase cycles The only way processors exchange data is through the shared memory.

PRAM EREW <= CREW<= CRCW(common)<= CRCW(random)<=CRCW(priority)

PRAM variations Bounded memory PRAM, PRAM(m) In a given step, only m memory accesses can be serviced. Lemma: Assume m'<m. Any problem that can be solved on a p-processor and m-cell PRAM in t steps can be solved on a max(p,m')-processor m'-cell PRAM in O(tm/m') steps. Bounded number of processors PRAM Lemma: Any problem that can be solved by a p processor PRAM in t steps can be solved by a p’ processor PRAM in t = O(tp/p’) steps. E.g. Matrix multiplication PRAM algorithm with time complexity O(log(N)) on N^3 processors on P processors, the problem can be solved in O(log(N)N^3/P). LPRAM L units to access global memory Lemma: Any algorithm that runs in a p processor PRAM can run in LPRAM with a loss of a factor of L.

PRAM summary The RAM model is widely used. PRAM is simple and easy to understand This model never reachs beyond the algorithm community. It may get more important as threaded programming becomes more popular. The BSP (bulk synchronous parallel) model is another try after PRAM. Asynchronously progress Model latency and limited bandwidth