Types of Parallel Computers


Types of Parallel Computers

Two principal types:
Shared memory multiprocessor
Distributed memory multicomputer

ITCS 4/5145 Cluster Computing, UNC-Charlotte, B. Wilkinson, 2006.

Shared Memory Multiprocessor

Conventional Computer

Consists of a processor executing a program stored in a (main) memory. Each main memory location is identified by its address. Addresses start at 0 and extend to 2^b - 1 when there are b bits (binary digits) in the address; for example, a 32-bit address can identify 2^32 (about 4 billion) locations.

(Figure: main memory supplies instructions to the processor and exchanges data with it.)

Shared Memory Multiprocessor System

Natural way to extend the single-processor model: have multiple processors connected to multiple memory modules, such that each processor can access any memory module.

(Figure: processors connected through processor-memory interconnections to memory modules forming one address space.)

Simplistic view of a small shared memory multiprocessor

(Figure: processors connected to a shared memory over a single bus.)

Examples: dual Pentiums, quad Pentiums.

Real computer systems have cache memory between the main memory and the processors: Level 1 (L1) cache and Level 2 (L2) cache.

Example: quad shared memory multiprocessor.

(Figure: four processors, each with its own L1 cache, L2 cache, and bus interface, sharing a processor/memory bus; a memory controller connects the bus to the shared memory, and an I/O interface connects it to the I/O bus.)

“Recent” innovation: dual-core and multi-core processors

Two or more independent processors in one package. Actually an old idea, but not put into wide practice until recently. Since the L1 cache is usually inside the package and the L2 cache outside it, dual-/multi-core processors usually share the L2 cache.

Examples

Dual-core Pentiums (Intel Core™2 Duo processors) -- two processors in one package sharing a common L2 cache. Introduced April 2005. (Also hyper-threaded.)

Xbox 360 game console -- triple-core PowerPC microprocessor.

PlayStation 3 Cell processor -- 9-core design.

References and more information:
http://www.intel.com/products/processor/core2duo/index.htm
http://en.wikipedia.org/wiki/Dual_core

Programming Shared Memory Multiprocessors

Several possible ways:

1. Use threads - the programmer decomposes the program into individual parallel sequences (threads), each able to access shared variables declared outside the threads. Example: Pthreads.

2. Use library functions and preprocessor compiler directives with a sequential programming language to declare shared variables and specify parallelism. Example: OpenMP - an industry standard consisting of library functions, compiler directives, and environment variables; needs an OpenMP compiler.

3. Use a modified sequential programming language -- added syntax to declare shared variables and specify parallelism. Example: UPC (Unified Parallel C); needs a UPC compiler.

4. Use a specially designed parallel programming language -- with syntax to express parallelism; the compiler automatically creates executable code for each processor (not now common).

5. Use a regular sequential programming language such as C and ask a parallelizing compiler to convert it into parallel executable code (also not now common).

A minimal OpenMP sketch of approach 2 follows this list.
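As an illustration of approach 2, the sketch below uses an OpenMP compiler directive to split a loop across threads; the reduction clause gives each thread a private copy of the shared variable sum and combines them at the end, avoiding a race. A minimal sketch, assuming a compiler with OpenMP support (e.g., gcc -fopenmp):

    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        const int n = 1000000;
        double sum = 0.0;

        /* Directive: distribute loop iterations across threads.
           reduction(+:sum) avoids a race on the shared variable. */
        #pragma omp parallel for reduction(+:sum)
        for (int i = 1; i <= n; i++)
            sum += 1.0 / ((double)i * (double)i);

        printf("threads available: %d\n", omp_get_max_threads());
        printf("sum = %f (approaches pi^2/6 = 1.644934...)\n", sum);
        return 0;
    }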

Message-Passing Multicomputer

Complete computers connected through an interconnection network.

(Figure: computers, each with a processor and local memory, exchanging messages over an interconnection network.)

Interconnection Networks

Limited and exhaustive interconnections:
2- and 3-dimensional meshes
Hypercube (not now common)

Using switches:
Crossbar
Trees
Multistage interconnection networks

Two-dimensional array (mesh)

(Figure: grid of computers/processors connected by links.)

Also three-dimensional meshes - used in some large high-performance systems. A sketch of mesh node numbering follows.
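A minimal sketch of how nodes on a 2-D mesh can be numbered and their neighbors found, assuming row-major numbering on a p x p grid (the numbering scheme is an illustrative assumption; real machines vary):

    #include <stdio.h>

    /* Neighbors of a node on a p x p 2-D mesh, nodes numbered
       row-major: node id sits at (row, col) = (id / p, id % p). */
    void mesh_neighbors(int id, int p) {
        int row = id / p, col = id % p;
        if (row > 0)     printf("north: %d\n", id - p);
        if (row < p - 1) printf("south: %d\n", id + p);
        if (col > 0)     printf("west:  %d\n", id - 1);
        if (col < p - 1) printf("east:  %d\n", id + 1);
    }

    int main(void) {
        mesh_neighbors(5, 4);   /* node 5 of a 4 x 4 mesh: 1, 9, 4, 6 */
        return 0;
    }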

Three-dimensional hypercube

(Figure: eight nodes labeled 000-111 at the corners of a cube; adjacent labels differ in exactly one bit.)

Four-dimensional hypercube

(Figure: two 3-D hypercubes with corresponding corners connected.)

Hypercubes were popular in the 1980s, but are not now common. The sketch below shows the bit-flip neighbor rule.
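In a d-dimensional hypercube, each of the 2^d nodes has a d-bit label, and two nodes are linked exactly when their labels differ in one bit. A minimal sketch of this neighbor rule (illustrative only):

    #include <stdio.h>

    /* Print the d neighbors of a node: flip each of its d label bits. */
    void hypercube_neighbors(unsigned node, unsigned d) {
        for (unsigned i = 0; i < d; i++)
            printf("neighbor across dimension %u: %u\n", i, node ^ (1u << i));
    }

    int main(void) {
        hypercube_neighbors(5, 4);   /* node 0101 of the 4-D hypercube:
                                        prints 4, 7, 1, 13 */
        return 0;
    }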

Crossbar switch

(Figure: processors on one side, memories on the other, connected through a grid of switches.)

Tree

(Figure: processors at the leaves, switch elements at the internal nodes and the root, connected by links.)

Multistage Interconnection Network

Example: Omega network, built from 2 x 2 switch elements (straight-through or crossover connections).

(Figure: eight inputs, 000-111, routed through stages of 2 x 2 switches to eight outputs, 000-111.)

A routing sketch follows.
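The standard way to route through an Omega network is destination-tag routing: perfect-shuffle wiring connects the stages, and at each stage the switch examines one bit of the destination address (most significant first), taking the upper output for 0 and the lower for 1. A minimal simulation sketch (the routing scheme is the textbook one; the code itself is illustrative):

    #include <stdio.h>

    /* Trace a message through a k-stage Omega network with 2^k ports. */
    unsigned omega_route(unsigned src, unsigned dst, unsigned k) {
        unsigned n = 1u << k;
        unsigned pos = src;
        for (unsigned stage = 0; stage < k; stage++) {
            /* Perfect-shuffle wiring: rotate the k-bit address left. */
            pos = ((pos << 1) | (pos >> (k - 1))) & (n - 1);
            /* Switch setting: low bit becomes the chosen destination
               bit, taken most significant bit first. */
            unsigned bit = (dst >> (k - 1 - stage)) & 1u;
            pos = (pos & ~1u) | bit;
        }
        return pos;   /* always equals dst */
    }

    int main(void) {
        for (unsigned src = 0; src < 8; src++)   /* 8 x 8 network, k = 3 */
            printf("%u -> %u\n", src, omega_route(src, 5, 3));
        return 0;
    }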

Networked Computers as a Computing Platform

A network of computers became a very attractive alternative to expensive supercomputers and parallel computer systems for high-performance computing in the early 1990s. There were several early projects; notable ones include the Berkeley NOW (network of workstations) project and the NASA Beowulf project.

Key advantages:

Very high performance workstations and PCs readily available at low cost.
The latest processors can easily be incorporated into the system as they become available.
Existing software can be used or modified.

Beowulf Clusters*

A group of interconnected "commodity" computers achieving high performance at low cost. Typically uses commodity interconnects (high-speed Ethernet) and the Linux OS.

* "Beowulf" comes from the name given to the NASA Goddard Space Flight Center cluster project.

Cluster Interconnects

Originally fast Ethernet on low-cost clusters; Gigabit Ethernet offers an easy upgrade path.

More specialized/higher performance:
Myrinet - 2.4 Gbits/sec; disadvantage: single vendor
cLAN
SCI (Scalable Coherent Interface)
QsNet
InfiniBand - may be important, as InfiniBand interfaces may be integrated on next-generation PCs

Dedicated cluster with a master node and compute nodes

(Figure: user computers reach the master node over an external network through its Ethernet interface; the master node connects through a switch on the local network to the compute nodes.)

Software Tools for Clusters

Based upon message-passing parallel programming:

Parallel Virtual Machine (PVM) - developed in the late 1980s; became very popular.
Message-Passing Interface (MPI) - standard defined in the 1990s.

Both provide a set of user-level libraries for message passing, used with regular programming languages (C, C++, ...). A minimal MPI sketch follows.
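A minimal MPI sketch of message passing between two processes, assuming an MPI implementation is installed (compile with mpicc; run with, e.g., mpirun -np 2 ./a.out):

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[]) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total processes   */

        if (rank == 0) {
            int msg = 42;
            /* Send one int to rank 1 with message tag 0. */
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
            printf("rank 0 of %d sent %d\n", size, msg);
        } else if (rank == 1) {
            int msg;
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1 received %d\n", msg);
        }

        MPI_Finalize();
        return 0;
    }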