
CM-5 Massively Parallel Supercomputer Alan Moser, Thinking Machines Corporation, 1993

CM-5 General Information Brainchild of W. Daniel Hillis and Lewis W. Tucker, who founded Thinking Machines Corporation in the 1980s. The CM-5 was the last in a line of successors to the original CM-1 Connection Machine.

CM-5 Connection Machine Hardware Overview Processing nodes, each containing a 32 MHz SPARC RISC processor and 32 MB of distributed memory (the memory size may vary according to customer specifications), delivering 128 Mflops per processing node and a total peak performance of 1 Teraflops (tera = 2^40, or roughly 10^12).
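As a rough sanity check on those figures, the node count implied by the per-node rate can be worked out directly. This is only illustrative arithmetic using the slide's own numbers; treating both "tera" and "mega" as powers of two is an assumption.

```python
# Back-of-the-envelope check of the slide's performance figures (illustrative only).
# Assumption: "tera" and "mega" are both taken as powers of two (2^40 and 2^20).
MFLOPS_PER_NODE = 128                      # peak per processing node (from the slide)
TERA = 2 ** 40                             # slide's definition: tera = 2^40 ~ 10^12

nodes_for_one_teraflops = TERA / (MFLOPS_PER_NODE * 2 ** 20)
print(f"tera = 2^40 = {TERA:.3e} (vs. 10^12)")
print(f"nodes needed for ~1 Teraflops peak: {nodes_for_one_teraflops:.0f}")   # ~8192
```

With decimal units instead of powers of two, the count comes out near 7,800 nodes; either way, a full-performance configuration is on the order of several thousand processing nodes.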

Processing Node vs. Processor A processing node is not a single processor but a set of five chips: a single 32 MHz SPARC RISC processor plus four separate vector units capable of performing 64-bit floating-point and integer arithmetic.
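The 128 Mflops per-node figure is consistent with each of the four vector units retiring one result per 32 MHz clock; the flop-per-cycle accounting below is an assumption, not something stated on the slide.

```python
# How the per-node peak can be decomposed (assumed accounting, not from the slide):
# four vector units, each retiring one 64-bit floating-point result per clock.
CLOCK_MHZ = 32          # SPARC node clock (from the slide)
VECTOR_UNITS = 4        # vector units per processing node (from the slide)
FLOPS_PER_CYCLE = 1     # assumption: one result per unit per cycle

peak_mflops_per_node = CLOCK_MHZ * VECTOR_UNITS * FLOPS_PER_CYCLE
print(f"peak per node: {peak_mflops_per_node} Mflops")   # 128, matching the slide
```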

Diagram of Processing Node

CM-5 Operating System The CM-5 runs CMOST, an enhanced version of the UNIX OS. Each processing node runs a microkernel of the OS.

CM-5 MIMD or SIMD Machine? Referred to as a synchronized MIMD machine, somewhere between MIMD and SIMD, combining the best aspects of both types of machines.

CM-5 Batch Processing or Timesharing? The CM-5 allows both batch processing and timesharing. Timesharing is provided by dividing the processing nodes into partitions, each controlled by a partition manager. Protection is enforced by hardware so that one partition cannot interfere with another.
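One way to picture the partition manager's job is as keeping a non-overlapping assignment of node ranges to partitions. This is only an illustrative sketch with hypothetical names; on the real machine the isolation between partitions is enforced by hardware, not by a software check like this.

```python
# Illustrative sketch only: hypothetical bookkeeping for dividing processing
# nodes into timeshared partitions. On the CM-5 the actual protection is
# enforced in hardware, not by a Python check like this.
class PartitionManager:
    def __init__(self, total_nodes):
        self.total_nodes = total_nodes
        self.partitions = {}          # name -> range of node indices

    def create_partition(self, name, first_node, num_nodes):
        new = range(first_node, first_node + num_nodes)
        if new.stop > self.total_nodes:
            raise ValueError("partition exceeds machine size")
        for other in self.partitions.values():
            if max(new.start, other.start) < min(new.stop, other.stop):
                raise ValueError("partitions may not overlap")
        self.partitions[name] = new

mgr = PartitionManager(total_nodes=512)
mgr.create_partition("batch", 0, 256)        # batch jobs on nodes 0..255
mgr.create_partition("timeshare", 256, 256)  # timeshared users on nodes 256..511
```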

Interconnection Network(s) What? The CM-5 has not one but three overlapping interconnection networks: the Data Network, the Control Network, and the Diagnostic Network.

3 Overlapping Interconnection Networks

CM-5 Data Network Supports simultaneous sending of messages between processing nodes. Its design addresses several problems: balancing message loads in the network, the “fetch-deadlock” problem, and timesharing a parallel computer.

CM-5 Data Network A binary fat tree.

CM-5 Data Network (cont.) Messages are passed between processing nodes by routing up to the least common ancestor of the source and destination and then back down. Bandwidth increases at each level up the tree, which avoids “bottlenecks” at the root node.
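To make the least-common-ancestor idea concrete, the sketch below (idealized, not actual CM-5 routing code) finds how many levels a message must climb in a binary tree before it can descend toward its destination; nearby nodes meet low in the tree, so their traffic never reaches the root.

```python
# Idealized sketch (not actual CM-5 routing): in a binary fat tree, a message
# from leaf `src` to leaf `dst` climbs only as high as their least common ancestor.
def lca_level(src: int, dst: int) -> int:
    """Number of tree levels a message must climb before it can descend to dst."""
    level = 0
    while src != dst:           # move both leaves up one level at a time
        src //= 2
        dst //= 2
        level += 1
    return level

# Neighbouring leaves meet low in the tree; distant ones must climb higher.
print(lca_level(4, 5))   # -> 1 (siblings share a parent)
print(lca_level(0, 7))   # -> 3 (opposite halves of an 8-leaf tree meet at the root)
```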

CM-5 Control Network Designed as a simple tree. It provides the synchronization that allows the CM-5 to operate like a SIMD computer. In general, the control network provides fast broadcasting of data, barrier synchronization, and parallel prefix/postfix scan operations.
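The scan operations are ordinary parallel prefix sums; the tiny sequential emulation below (illustrative only, not the CM-5 programming interface) shows the value each node would receive from a prefix scan over one value per node.

```python
# Illustrative emulation of a parallel prefix (scan) operation: node i receives
# the combination of the values held by nodes 0..i. On the CM-5 the control
# network computes this in hardware; here it is just a sequential loop.
from itertools import accumulate
import operator

per_node_values = [3, 1, 4, 1, 5]                        # one value per node
prefix_sums = list(accumulate(per_node_values, operator.add))
print(prefix_sums)   # [3, 4, 8, 9, 14] -> value delivered to each node
```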

CM-5 Diagnostic Network Organized as an incomplete binary tree. It can map out or ignore parts of the tree that are faulty, and can select and access groups of system chips in parallel, including a single chip, a single type of chip, chips within a user partition, or chips associated with a portion of the system such as a board or cabinet.
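A minimal sketch of the map-out idea, using a hypothetical tree structure rather than the real diagnostic protocol: walk an incomplete binary tree of chips, collect only those in the requested group, and skip any subtree marked faulty.

```python
# Hypothetical sketch of selecting chips through an incomplete binary tree,
# skipping subtrees that have been mapped out as faulty. Not the real CM-5
# diagnostic protocol, just the selection idea.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    chip_type: Optional[str] = None        # leaf chips carry a type, e.g. "router"
    faulty: bool = False
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def select(node: Optional[Node], chip_type: str, out: List[Node]) -> None:
    if node is None or node.faulty:        # mapped-out subtrees are ignored
        return
    if node.chip_type == chip_type:
        out.append(node)
    select(node.left, chip_type, out)
    select(node.right, chip_type, out)

tree = Node(left=Node(chip_type="router"),
            right=Node(faulty=True, left=Node(chip_type="router")))
found: List[Node] = []
select(tree, "router", found)
print(len(found))   # -> 1: the router under the faulty subtree is skipped
```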

CM-5 Diagnostic Network Diagram

References “The Network Architecture of the Connection Machine CM-5,” Thinking Machines Corporation, February 7, 1996. W. Daniel Hillis and Lewis W. Tucker, “The CM-5 Connection Machine: A Scalable Supercomputer,” Communications of the ACM, Volume 36, Number 11, November 1993.