An Introduction to Parallel Computing
Dr. David Cronk
Innovative Computing Lab, University of Tennessee
Distribution A: Approved for public release; distribution is unlimited.

Slide 2: Outline
Parallel Architectures
Parallel Processing
› What is parallel processing?
› An example of parallel processing
› Why use parallel processing?
Parallel Programming
› Programming models
› Message passing issues
– Data distribution
– Flow control

Slide 3: Shared Memory Architectures
[Diagram: CPUs connected to a single main memory over a shared bus]
Single address space
All processors have access to a pool of shared memory
Symmetric multiprocessors (SMPs) – access time to memory is uniform

Slide 4: Shared Memory Architectures
Single address space
All processors have access to a pool of shared memory
Non-Uniform Memory Access (NUMA) – access time depends on which part of memory is referenced
[Diagram: two bus-based CPU/memory nodes connected by a network]

Slide 5: Distributed Memory Architectures
[Diagram: processors (P), each with its own local memory (M), connected by a network]

Slide 6: Networks
Grid – each processor is connected to its 4 neighbors
Cylinder – a grid closed in one dimension
Torus – a closed cylinder (the grid is closed in both dimensions)
Hypercube – 2^n processors, each connected to n other processors, where n is the degree of the hypercube
Fully connected – every processor is directly connected to every other processor
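As a small illustration of the hypercube topology (this code is not part of the original slides), the sketch below enumerates each node's neighbors: node r is connected to the n nodes whose ranks differ from r in exactly one bit position.

    #include <stdio.h>

    /* Print the neighbors of each node in an n-dimensional hypercube.
     * A hypercube of degree n has 2^n nodes, and node r is connected
     * to the n nodes obtained by flipping one bit of r. */
    int main(void)
    {
        int n = 3;                 /* degree of the hypercube (8 nodes) */
        int nodes = 1 << n;

        for (int r = 0; r < nodes; r++) {
            printf("node %d:", r);
            for (int d = 0; d < n; d++)
                printf(" %d", r ^ (1 << d));   /* flip bit d to get a neighbor */
            printf("\n");
        }
        return 0;
    }

For n = 3 this prints the three neighbors of each of the eight nodes.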

Slide 7: Parallel Processing
What is parallel processing?
› Using multiple processors to solve a single problem
Task parallelism
– The problem consists of a number of independent tasks
– Each processor, or group of processors, performs a separate task
Data parallelism
– The tasks are not independent; they cooperate on a common data set
– Each processor works on a different part of the data

Slide 8: Parallel Processing
An example: numerical integration
We can approximate the integral as a sum of rectangles

Slides 9–10: Parallel Processing (figures only; the illustrations are not reproduced in this transcript)
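The slides above illustrate the rectangle approximation graphically. As a hedged sketch of how such a computation can be parallelized (this code is not from the original presentation, and the integrand 4/(1+x^2) on [0,1], which integrates to pi, is only an assumed example), each process can sum every size-th rectangle and the partial sums can then be combined with a reduction:

    #include <stdio.h>
    #include <mpi.h>

    /* Approximate an integral as a sum of rectangles (midpoint rule),
     * with the rectangles divided among the MPI processes. */
    double f(double x) { return 4.0 / (1.0 + x * x); }   /* integrates to pi on [0,1] */

    int main(int argc, char **argv)
    {
        int rank, size;
        long n = 1000000;               /* total number of rectangles */
        double h, local_sum = 0.0, total;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        h = 1.0 / (double)n;            /* width of each rectangle */
        /* Each process handles every size-th rectangle, starting at its rank. */
        for (long i = rank; i < n; i += size)
            local_sum += f((i + 0.5) * h) * h;

        /* Combine the partial sums on process 0. */
        MPI_Reduce(&local_sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0)
            printf("integral = %.12f\n", total);

        MPI_Finalize();
        return 0;
    }

Built with mpicc and launched under mpirun, each process computes roughly 1/size of the rectangles, which is the data-parallel pattern described on slide 7.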

Slide 11: Parallel Processing
Why parallel processing?
› Faster time to completion
– Computation can be performed faster with more processors
› Able to run larger jobs, or at a higher resolution
– Larger jobs can complete in a reasonable amount of time on multiple processors
– Data for larger jobs can fit in memory when spread across multiple processors

Slide 12: Parallel Programming
Outline
› Programming models
› Message passing issues
– Data distribution
– Flow control

Slide 13: Parallel Programming
Programming models
› Shared memory
– All processes have access to global memory
› Distributed memory (message passing)
– Processes have access only to local memory; data is shared via explicit message passing
› Combination shared/distributed
– Groups of processes share access to "local" data, while data is shared between groups via explicit message passing
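For a concrete feel of the shared-memory model, the following minimal OpenMP sketch (an assumed illustration, not taken from the slides) splits a loop across threads that all read and write the same array, so no explicit data movement is needed:

    #include <stdio.h>
    #include <omp.h>

    /* Shared-memory model: all threads operate on the same array in
     * global memory; the loop iterations are divided among threads. */
    int main(void)
    {
        double a[1000];

        #pragma omp parallel for        /* iterations split among threads */
        for (int i = 0; i < 1000; i++)
            a[i] = 2.0 * i;

        printf("a[999] = %f\n", a[999]);
        return 0;
    }

Compile with an OpenMP-capable compiler (for example, gcc -fopenmp). The message-passing counterpart appears after the next slide.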

Slide 14: Message Passing
Message passing is the most common method of programming for distributed memory
With message passing, there is an explicit sender and an explicit receiver of data
In message passing systems, different processes are identified by unique identifiers
› For simplicity, assume each process has a unique numerical identifier (in MPI, its rank)
Senders send data to a specific process based on this identifier
Receivers specify which process to receive from based on this identifier
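A minimal MPI sketch of this idea (an assumed illustration, not taken from the slides): process 0 sends a value to process 1, and each call names its partner by rank.

    #include <stdio.h>
    #include <mpi.h>

    /* Explicit message passing: process 0 sends an integer to process 1,
     * each side identifying its partner by rank. Run with at least
     * two processes. */
    int main(int argc, char **argv)
    {
        int rank, value;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);   /* send to rank 1 */
        } else if (rank == 1) {
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);                           /* receive from rank 0 */
            printf("rank 1 received %d\n", value);
        }

        MPI_Finalize();
        return 0;
    }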

Slide 15: Parallel Programming
Message passing issues
› Data distribution
– Minimize overhead
» Latency (message start-up time): a few large messages are better than many small ones
» Memory movement
– Maximize load balance
» Less idle time spent waiting for data or synchronizing
» Each process should do about the same amount of work
› Flow control
– Minimize waiting
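One common way to get both properties, shown below as a hedged sketch (the helper block_range is hypothetical and not from the slides), is a block distribution: each process owns one contiguous chunk whose size differs from the others by at most one item, so the load stays balanced and each chunk can travel in a single large message rather than many small ones.

    #include <stdio.h>

    /* Block distribution of n items over p processes: compute the
     * half-open range [lo, hi) owned by a given rank. Chunk sizes
     * differ by at most one item. */
    void block_range(long n, int p, int rank, long *lo, long *hi)
    {
        long base = n / p, rem = n % p;          /* first 'rem' ranks get one extra item */
        *lo = rank * base + (rank < rem ? rank : rem);
        *hi = *lo + base + (rank < rem ? 1 : 0);
    }

    int main(void)
    {
        long lo, hi;
        for (int r = 0; r < 4; r++) {
            block_range(10, 4, r, &lo, &hi);
            printf("rank %d: items [%ld, %ld)\n", r, lo, hi);
        }
        return 0;
    }

For 10 items over 4 processes this prints chunks of sizes 3, 3, 2, and 2.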

Slides 16–17: Data Distribution (figures only; the illustrations are not reproduced in this transcript)

Slide 18: Flow Control
[Diagram only: two interleavings of "Send to n" / "Recv from n" operations among processes 0–4, contrasting how the ordering of sends and receives affects waiting; not fully reproduced in this transcript]
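As an illustration of the general point (a sketch under assumptions, not a reconstruction of the slide's diagram), posting receives with non-blocking calls lets a process avoid sitting idle while its partner is still busy; here each process in a chain posts its receive before performing its own send.

    #include <stdio.h>
    #include <mpi.h>

    /* Flow control sketch: each interior process in a chain posts a
     * non-blocking receive for its upstream neighbor before sending
     * to its downstream neighbor, so it does not block waiting. */
    int main(int argc, char **argv)
    {
        int rank, size, in = -1, out;
        MPI_Request req = MPI_REQUEST_NULL;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        out = rank * 10;

        if (rank > 0)                    /* post the receive early */
            MPI_Irecv(&in, 1, MPI_INT, rank - 1, 0, MPI_COMM_WORLD, &req);
        if (rank < size - 1)             /* pass data down the chain */
            MPI_Send(&out, 1, MPI_INT, rank + 1, 0, MPI_COMM_WORLD);
        if (rank > 0) {
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            printf("rank %d received %d from rank %d\n", rank, in, rank - 1);
        }

        MPI_Finalize();
        return 0;
    }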

Slide 19: "This presentation was made possible through support provided by DoD HPCMP PET activities through Mississippi State University (MSU) under contract No. N D-7110."