Presented by: Quinn Gaumer, CPS 221

• 16,384 Processing Nodes (32 MHz)
• 30 m x 30 m
• Teraflop
• 1992

• With 16,384 processors, the interconnect plays a large role
• 3 Types of Networks
  ◦ Data
  ◦ Control
  ◦ Diagnostic

• Easily Attainable High Performance
• Scaling
• Data Parallel Programming
• High Reliability and Availability
• Space/Time Shared
• Fast Time to Market
• Modular

Include:
• Control Processor
• Processing Nodes
• Slices of Data and Control Networks
  ◦ Privileged vs. Non-Privileged
• Program Isolation
• Time Sharing

• Provide Simple View of Network to Processors
• Sharing and Fault Tolerance
• Decouple Network/Processor by Providing Contract
  ◦ Software -> ISA -> Hardware

• “The data network promises to eventually accept and deliver all messages injected into the network by the processors as long as the processors promise to eventually eject all messages from the network when they are delivered to the processors.”

• Collection of Memory-Mapped FIFOs
  ◦ Outgoing/Incoming
• Restricted Operations
  ◦ Implemented with protected pages
• Physical/Relative (Virtual) Addresses
  ◦ Programs use only relative addresses
• Network Independent of User
  ◦ Delivery guaranteed by the network, not the processing node
  ◦ Requires network diagnostics
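
A minimal Python sketch of the relative-to-physical addressing idea above. The base-plus-offset scheme and the names `to_physical`, `partition_base`, and `partition_size` are assumptions for illustration; the slides do not say how the translation is actually implemented.

```python
def to_physical(relative: int, partition_base: int, partition_size: int) -> int:
    """Map a partition-relative node address to a physical node address.

    Hypothetical scheme: a partition is a contiguous block of nodes, so the
    physical address is the partition base plus the relative address.
    """
    if not 0 <= relative < partition_size:
        # A program can never name a node outside its own partition.
        raise ValueError(f"relative address {relative} is outside the partition")
    return partition_base + relative
```

Because user code only ever supplies `relative`, the same program could run unchanged in any partition of the same size.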

• Fat Tree Structure
  ◦ The closer to the root, the thicker the tree
  ◦ Ensures no bottlenecks at root
• User Partitions and I/O are Sub-trees
  ◦ Guarantees network independence
  ◦ Messages in partition stay within partition
• Many Optimal Node-to-Node Paths
  ◦ Choose randomly among open links
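
The routing idea above can be sketched in Python as follows. This is an illustration under stated assumptions, not the machine's actual router: the tree arity of 4, the `up_links` count, and the function names are hypothetical, and link availability ("open" links) is not modelled, so the upward choice is simply uniform at random.

```python
import random

def lca_height(src: int, dst: int, arity: int = 4) -> int:
    """Number of levels a message must climb before src and dst share a subtree."""
    height = 0
    while src != dst:
        src //= arity
        dst //= arity
        height += 1
    return height

def route(src: int, dst: int, arity: int = 4, up_links: int = 2, rng=None):
    """Return (up, down): the parent link chosen at each level going up,
    then the child index taken at each level going down."""
    rng = rng or random.Random()
    height = lca_height(src, dst, arity)
    # Upward: every parent link leads to an equally short path, so pick at random.
    up = [rng.randrange(up_links) for _ in range(height)]
    # Downward: the child index at each level is fixed by the destination address.
    down = [(dst // arity ** level) % arity for level in reversed(range(height))]
    return up, down
```

In this model, two nodes in the same aligned sub-tree never climb above that sub-tree's root, which is the property that keeps a partition's messages inside the partition.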

• Data can be only 1-5 Words
• Wormhole Routing
• CRC Checking done at every Link
  ◦ Additional !CRC sent when error first found
• Primary Errors allow Diagnostic Network to Determine Location
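
A rough Python sketch of per-link CRC checking on a 1-5 word message. The CRC-32 from `zlib`, the 32-bit word size, and the helper names are assumptions chosen for convenience; the slides do not specify the polynomial or packet layout, and the "!CRC" signalling is not modelled.

```python
import struct
import zlib

MAX_WORDS = 5  # the slides limit a message to 1-5 words

def make_packet(words):
    """Pack 32-bit words (assumed size) and append a CRC-32 trailer."""
    if not 1 <= len(words) <= MAX_WORDS:
        raise ValueError("payload must be 1-5 words")
    payload = struct.pack(f">{len(words)}I", *words)
    return payload + struct.pack(">I", zlib.crc32(payload))

def crc_ok(packet: bytes) -> bool:
    """Recompute the CRC over the payload and compare with the trailer."""
    payload = packet[:-4]
    (crc,) = struct.unpack(">I", packet[-4:])
    return zlib.crc32(payload) == crc

def first_bad_link(packet: bytes, links):
    """Pass the packet through each link (a bytes -> bytes function) and
    return the index of the first link after which the CRC check fails."""
    for index, link in enumerate(links):
        packet = link(packet)
        if not crc_ok(packet):
            return index
    return None
```

Checking at every hop, rather than only at the destination, is what lets the first failing check point at the faulty link.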

• Message Counters at every Link
• Kirchhoff’s Law to Determine Missing Messages
• What to do with a Bad Chip or Link?
  ◦ Route Messages Away from Failure
  ◦ Map Out Nearby Processors
  ◦ Which is better? Both.
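
The Kirchhoff-style check can be sketched as a conservation test: once the network has drained, the messages counted into a router should equal the messages counted out of it. The data layout and the name `conservation_violations` are hypothetical.

```python
def conservation_violations(nodes):
    """nodes maps a router name to {"in": [per-link counts], "out": [per-link counts]}.

    Returns {name: messages_in - messages_out} for every router whose
    counters do not balance, i.e. where messages went missing (positive)
    or appeared from nowhere (negative).
    """
    violations = {}
    for name, counts in nodes.items():
        difference = sum(counts["in"]) - sum(counts["out"])
        if difference != 0:
            violations[name] = difference
    return violations
```

For example, a router that received 10 + 5 messages but forwarded only 8 + 6 would be reported with a difference of 1.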

• Solution: Virtual Channels
  ◦ One channel each for requests and responses
  ◦ 4 channels per chip (Incoming and Outgoing)
• Deadlock still possible!
  ◦ User sends but never attempts to receive messages
  ◦ Higher-level languages to implement communication protocol

• Objectives
  ◦ Clear all messages for new user
  ◦ Allow all messages in transit to eventually finish
• “All Fall Down” Method
  ◦ Evenly misroute all messages in transit to nodes
  ◦ Message saved at node
  ◦ Resent when swapped in
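
A toy Python model of the "All Fall Down" steps listed above. Round-robin assignment is an assumption standing in for "evenly misroute", and the function names are hypothetical; the real mechanism drops messages to nodes in hardware.

```python
def all_fall_down(in_transit, nodes):
    """Spread in-transit messages evenly over the partition's nodes and
    return what each node has to save when the user is swapped out."""
    saved = {node: [] for node in nodes}
    for index, message in enumerate(in_transit):
        # Round-robin is used here only to model an even distribution.
        saved[nodes[index % len(nodes)]].append(message)
    return saved

def swap_in(saved):
    """When the user is swapped back in, every saved message is re-injected."""
    return [message for messages in saved.values() for message in messages]
```

After `all_fall_down` the network holds none of the old user's messages, and `swap_in` returns exactly the same set of messages for resending, so nothing in transit is lost.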

• Control Processor broadcasts program
  ◦ Not individual instructions (as in SIMD)
• Each processor runs the program on its own data set
• Inter-Processor Communication
  ◦ Hardware barriers let processes communicate without shared semaphores
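
As a software analogy only, the barrier idea can be sketched with Python threads: every node runs the same function on its own data slice, and a barrier separates the local phase from the phase that reads other nodes' results. `threading.Barrier` stands in for the hardware barrier, and `run_nodes` is a hypothetical name.

```python
import threading

def run_nodes(data_slices):
    """Run one thread per 'processing node'; each returns the global sum."""
    count = len(data_slices)
    partial = [0] * count
    totals = [0] * count
    barrier = threading.Barrier(count)

    def node(index):
        partial[index] = sum(data_slices[index])  # phase 1: local work
        barrier.wait()                            # wait until every node is done
        totals[index] = sum(partial)              # phase 2: all partials are ready

    threads = [threading.Thread(target=node, args=(i,)) for i in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return totals
```

No locks or semaphores guard `partial`: each node writes only its own slot before the barrier and only reads after it.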

• Program is smaller than an instruction stream
  ◦ Easier to deliver
• Local fetch allows commodity processors
  ◦ Fast new RISC processors, less R&D
• Control system useful for other problems
• Execution of generic MIMD code
  ◦ Message passing

• Broadcasting
  ◦ User/Supervisor
  ◦ Interrupt
  ◦ Utility
• Combining
  ◦ Reduction
  ◦ Forward/Backward Scan
  ◦ Router Done
• Global Operations
  ◦ Synchronous/Asynchronous OR
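
The combining operations named above can be illustrated sequentially in Python. This shows only what the results look like, not how the control network computes them in hardware; treating the scans as inclusive is an assumption.

```python
from functools import reduce
from itertools import accumulate
import operator

def reduction(values, op=operator.add):
    """Combine one value per node into a single result."""
    return reduce(op, values)

def forward_scan(values, op=operator.add):
    """Inclusive prefix scan: node i gets op over nodes 0..i."""
    return list(accumulate(values, op))

def backward_scan(values, op=operator.add):
    """Inclusive suffix scan: node i gets op over nodes i..n-1."""
    return list(accumulate(list(values)[::-1], op))[::-1]
```

For the per-node values `[3, 1, 4, 1]`, `forward_scan` gives `[3, 4, 8, 9]`, `backward_scan` gives `[9, 6, 5, 1]`, and `reduction` with `max` gives `4`.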

• Binary Tree
• Four Types of Packets
  ◦ Single Source: Broadcasting
  ◦ Multiple Source: Combining
  ◦ Idle: Filler
  ◦ Abstain: Allow control node to skip waiting
• Collisions on Network
  ◦ Multiple/Multiple: Buffering based on arrival time
  ◦ Multiple/Single: Single Source Packets Prioritized
  ◦ Single/Single: Error
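
The three collision rules can be written as a small decision function. This is a sketch under assumptions: "buffering based on arrival time" is read here as "earlier arrival goes first", Idle and Abstain packets are not modelled, and `Packet` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:
    kind: str      # "single" (broadcast) or "multiple" (combine)
    arrival: int   # arrival time at the control network node

def resolve(a: Packet, b: Packet):
    """Return the two colliding packets in the order they are forwarded."""
    if a.kind == "single" and b.kind == "single":
        # Single/Single: two broadcasts colliding is an error.
        raise RuntimeError("two single-source packets collided")
    if a.kind == "multiple" and b.kind == "multiple":
        # Multiple/Multiple: buffer, then forward in arrival order (assumed).
        return sorted([a, b], key=lambda packet: packet.arrival)
    # Multiple/Single: the single-source packet has priority.
    return sorted([a, b], key=lambda packet: packet.kind != "single")
```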

• Control Processor for each Partition
  ◦ Executes scalar code while processing nodes execute parallel code
• Connect any Control Processor to any Partition
  ◦ Problems can occur in control networks too
  ◦ Diagnostics may show part of control network must be mapped out

• Binary Network
  ◦ Pods (physical subsystems) are leaves
• JTAG
  ◦ Designed for Multichip…but serial
• Do JTAG for each Pod
• Combine Responses with OR/AND
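
A small Python sketch of combining per-pod responses with OR and AND. Representing each pod's JTAG response as an integer bit vector is an assumption, as are the names `combine_or` and `combine_and`; the slides do not describe the response format.

```python
from functools import reduce
import operator

def combine_or(responses):
    """A bit is set in the result if ANY pod set it (e.g. 'some pod reports a fault')."""
    return reduce(operator.or_, responses, 0)

def combine_and(responses, width):
    """A bit is set in the result only if ALL pods set it (e.g. 'every pod passed')."""
    all_ones = (1 << width) - 1
    return reduce(operator.and_, responses, all_ones)
```

Combining in a tree this way means the diagnostic controller reads one merged response instead of scanning every pod serially.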