ECE/CS 757: Advanced Computer Architecture II


ECE/CS 757: Advanced Computer Architecture II
Instructor: Mikko H. Lipasti
Spring 2017
University of Wisconsin-Madison
Lecture notes created by Natalie Enright Jerger

Lecture Outline
- Introduction to Networks
- Network Topologies
- Network Routing
- Network Flow Control
- Router Microarchitecture
- Technology example: On-chip Nanophotonics
Mikko Lipasti-University of Wisconsin

What is an Interconnection Network?
[Figure: abstraction stack (Application, Algorithm, Programming Language, Operating System, Instruction Set Architecture, Microarchitecture, Register Transfer Level, Circuits, Devices, Technology), with Computer Architecture marked, beside Logic and Mem blocks labeled "Communication".]
Spring 2014 ECE 1749H: Interconnection Networks (Enright Jerger)

What is an Interconnection Network?
[Figure: abstraction stack from Application down to Technology, with Computer Architecture marked, beside Logic and Mem blocks labeled "Dedicated Channels".]
- Application: ideally wants low-latency, high-bandwidth, dedicated channels between logic and memory
- Technology: dedicated channels are too expensive in terms of area and power

What is an Interconnection Network?
[Figure: abstraction stack from Application down to Technology, with Computer Architecture marked, beside Logic and Mem blocks labeled "Interconnection Network".]
- An interconnection network is a programmable system that transports data between terminals
- Technology: an interconnection network helps use scarce resources efficiently
- Application: managing communication can be critical to performance

Interconnection Networks: Introduction
Interconnection networks should be designed to transfer the maximum amount of information in the least amount of time, within cost and power constraints, so as not to bottleneck the system.

Types of Interconnection Networks
Interconnection networks can be grouped into four domains, depending on the number and proximity of the devices to be connected.
On-Chip Networks (OCNs or NoCs)
- Devices include microarchitectural elements (functional units, register files), caches, directories, and processors
- Current and future systems: dozens to hundreds of devices
- Examples: Intel TeraFLOPS research prototype (80 cores); Intel Single-chip Cloud Computer (48 cores)
- Proximity: millimeters

Types of Interconnection Networks (2)
System/Storage Area Networks (SANs)
- Multiprocessor and multicomputer systems: interprocessor and processor-memory interconnections
- Server and data center environments: storage and I/O components
- Hundreds to thousands of devices interconnected, e.g., the IBM Blue Gene/L supercomputer (64K nodes, each with 2 processors)
- Maximum interconnect distance: tens of meters (typical) to a few hundred meters
- Examples (standards and proprietary): InfiniBand, Myrinet, Quadrics, Advanced Switching Interconnect

Types of Interconnection Networks (3)
Local Area Networks (LANs)
- Interconnect autonomous computer systems
- Machine room, or throughout a building or campus
- Hundreds of devices interconnected (thousands with bridging)
- Maximum interconnect distance: a few kilometers to a few tens of kilometers
- Example (most popular): Ethernet, with 10 Gbps over 40 km
Wide Area Networks (WANs)
- Interconnect systems distributed across the globe
- Internetworking support required
- Many millions of devices interconnected
- Maximum distance: many thousands of kilometers
- Example: ATM (asynchronous transfer mode)

Interconnection Network Domains
[Figure: network domains plotted by distance (meters, from 10^-3 to 10^6) against number of devices interconnected (1 to >100,000). Domains shown, from shortest to longest distance: System/Storage Area Networks, Local Area Networks, Metropolitan Area Networks, Wide Area Networks.]
Slide courtesy Timothy Mark Pinkston and José Duato

Interconnection Network Domains
[Figure: network domains plotted by distance (meters, from 10^-3 to 10^6) against number of devices interconnected (1 to >100,000). Domains shown, from shortest to longest distance: On-Chip Interconnection Networks/Networks-on-Chip, System/Storage Area Networks, Local Area Networks, Metropolitan Area Networks, Wide Area Networks.]
Slide courtesy Timothy Mark Pinkston and José Duato

Why Study Networks on Chip?

Why Study On-Chip Networks?

Example of Multi- and Many-Core Architectures

Why Study Interconnects?
- They provide external connectivity from the system to the outside world
- They also provide connectivity within a single computer system at many levels: I/O units, boards, chips, modules, and blocks inside chips
- Trends: high demand on communication bandwidth; increased computing power and storage capacity; switched networks are replacing buses
- Computer architects and engineers must understand interconnect problems and solutions in order to design and evaluate systems more effectively
Slide courtesy Timothy Mark Pinkston and José Duato

On-Chip Networks (OCNs or NoCs)
Why an on-chip network?
- Ad hoc wiring does not scale beyond a small number of cores
  - Prohibitive area
  - Long latency
- An OCN offers
  - Scalability
  - Efficient multiplexing of communication
  - Often modular in nature (eases verification)
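
To make the scaling argument concrete, here is a minimal sketch that counts wires under two strategies. It assumes that "ad hoc wiring" means a dedicated channel between every pair of cores, and it uses a 2D mesh only as one example of a shared on-chip network. The function names are illustrative and not part of the lecture.

```python
def dedicated_channels(n: int) -> int:
    """Channels needed to wire every pair of n cores directly (assumed model)."""
    return n * (n - 1) // 2


def mesh_links(rows: int, cols: int) -> int:
    """Bidirectional links in a rows x cols 2D mesh without wraparound."""
    return rows * (cols - 1) + cols * (rows - 1)


if __name__ == "__main__":
    for rows, cols in [(2, 2), (4, 4), (8, 8)]:
        n = rows * cols
        print(f"{n:3d} cores: {dedicated_channels(n):5d} dedicated channels, "
              f"{mesh_links(rows, cols):4d} mesh links")
```

Under these assumptions the pairwise channel count grows quadratically with the number of cores, while the mesh link count grows roughly linearly.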

Differences Between On-Chip and Off-Chip Networks
- Significant research in multi-chassis (off-chip) interconnection networks
  - Supercomputers
  - Clusters of workstations
  - Internet routers
- Leverage that research and insight, but...
  - Constraints are different
  - New opportunities

Off-Chip vs. On-Chip
- Off-chip: I/O bottlenecks
  - Pin-limited bandwidth
  - Inherent overheads of off-chip I/O transmission
- On-chip: wiring constraints
  - Metal layer limitations
  - Horizontal and vertical layout
  - Short, fixed length
  - Repeater insertion limits routing of wires: avoid routing over dense logic; impacts wiring density
- On-chip: tight area and power budgets, ultra-low latencies, abundant wires

Off-Chip vs. On-Chip
On-chip:
- Power: consumes 10-15% or more of the die power budget
- Latency: different order of magnitude; routers consume a significant fraction of latency

New Opportunities
- Abundant wiring
  - Change in relative cost of wires and buffers
  - Many short flits
- Tightly integrated into the system
  - Not commodity: fully customized design
  - Allows for optimization with the uncore (cache coherence)
- Emerging technology
  - Optics
  - 3D

On-Chip Network Evolution
- Ad hoc wiring
  - Small number of nodes
- Buses and crossbars
  - Simplest variant of on-chip networks
  - Low core counts
  - Like traditional multiprocessors
  - Bus traffic quickly saturates with a modest number of cores
  - Crossbars: higher bandwidth, but poor area and power scaling
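
A rough sketch of why buses and crossbars stop scaling. It makes two simplifying assumptions that are not in the slides: that crossbar area tracks the number of crosspoints (inputs times outputs), and that a bus divides its total bandwidth equally among contending cores. The names and the 64-unit bus bandwidth are hypothetical.

```python
def crossbar_crosspoints(inputs: int, outputs: int) -> int:
    """Crosspoint count, used here as a crude proxy for crossbar area."""
    return inputs * outputs


def bus_share(total_bandwidth: float, cores: int) -> float:
    """Average bandwidth per core when all cores contend equally for one bus."""
    return total_bandwidth / cores


if __name__ == "__main__":
    BUS_BANDWIDTH = 64.0  # hypothetical units, for illustration only
    for cores in (4, 8, 16, 32):
        print(f"{cores:2d} cores: "
              f"{crossbar_crosspoints(cores, cores):4d} crosspoints, "
              f"{bus_share(BUS_BANDWIDTH, cores):5.1f} bandwidth units per core")
```

Doubling the core count quadruples the crosspoints and halves each core's share of the bus, which is the trend the slide describes.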

Multicore Examples (1): Sun Niagara
[Figure: crossbar (XBAR) connecting numbered blocks 1 to 5.]
- Niagara 2: 8x9 crossbar (area roughly equal to that of a core)
- Rock: hierarchical crossbar (5x5 crossbar connecting clusters of 4 cores)

Multicore Examples (2): IBM Cell
- Element Interconnect Bus
- 12 elements
- 4 unidirectional rings
- 16 bytes wide
- Operates at 1.6 GHz
[Figure: ring topology of the IBM Cell.]
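
As a back-of-the-envelope check on these figures, the sketch below multiplies width by clock rate. It assumes each ring moves one 16-byte transfer per cycle, which the slide does not state, so the result is only an illustration of the arithmetic and not a datasheet number.

```python
RINGS = 4              # from the slide
WIDTH_BYTES = 16       # from the slide
CLOCK_HZ = 1.6e9       # from the slide


def ring_bandwidth(width_bytes: int, clock_hz: float) -> float:
    """Bytes per second for one ring, assuming one transfer per cycle."""
    return width_bytes * clock_hz


if __name__ == "__main__":
    per_ring = ring_bandwidth(WIDTH_BYTES, CLOCK_HZ)
    total = RINGS * per_ring
    print(f"per ring:  {per_ring / 1e9:.1f} GB/s")
    print(f"all rings: {total / 1e9:.1f} GB/s")
```

If a ring can carry several transfers at once on different segments, the real peak would be higher than this simple product suggests.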

Many-Core Example: Intel TeraFLOPS
- 80-core prototype
- 5 GHz
- Each tile: processing engine + on-chip network router
- 2D mesh
Speaker notes: The last current example I have is the Intel Polaris prototype. This is an 80-core prototype that connects cores via a 2D grid, or mesh, network. I've highlighted the mesh in white on the die photo. At each intersection in the mesh there is a router, which decides how messages travel through the network. In the blown-up photo of one of the 80 tiles, you can see that the router occupies roughly 25% of the tile's area. Significant area and power will be devoted to the on-chip network in future designs, so it is crucial that we use this area efficiently for high-performing solutions.
To do: add more details about TeraFLOPS from the intro of the book; add RAW and TRIPS.
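
To get a feel for how far messages travel in a mesh of this size, here is a small sketch that computes hop counts. It assumes the 80 tiles are arranged as an 8 x 10 grid and that messages take minimal paths, so the hop count between two routers is their Manhattan distance. Both assumptions, and the function names, are added here for illustration.

```python
from itertools import product


def hops(src: tuple[int, int], dst: tuple[int, int]) -> int:
    """Minimal hop count between two mesh routers (Manhattan distance)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def mesh_stats(rows: int, cols: int) -> tuple[int, float]:
    """Return (diameter, average hops) over all pairs of distinct routers."""
    nodes = list(product(range(rows), range(cols)))
    dists = [hops(a, b) for a in nodes for b in nodes if a != b]
    return max(dists), sum(dists) / len(dists)


if __name__ == "__main__":
    diameter, average = mesh_stats(8, 10)  # assumed 8 x 10 tile layout
    print(f"diameter: {diameter} hops, average: {average:.2f} hops")
```

Because every hop passes through a router, these counts hint at why router latency, area, and power matter so much in the notes above.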

Many-Core Example (2): Intel SCC
- Intel's Single-chip Cloud Computer (SCC) uses a 2D mesh with state-of-the-art routers
- Potential platform for project

Performance and Cost
[Figure: latency (sec) plotted against offered traffic (bits/sec), marking the zero-load latency and the saturation throughput.]
- Performance: latency and throughput
- Cost: area and power
- Saturation throughput: the amount of traffic the network can accept before all messages start to experience very high latencies
- Zero-load latency is a theoretical lower bound on latency, and saturation throughput is an upper bound on throughput; topology, routing, and flow control all affect these, as we will see in the coming weeks
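
The shape of the latency curve can be sketched with a toy model. The formula below is not from the lecture: it simply assumes latency equals the zero-load latency divided by the remaining fraction of capacity, which is flat at low load and grows without bound as offered traffic approaches saturation. All parameter values are made up.

```python
def latency(offered: float, zero_load_latency: float, saturation: float) -> float:
    """Toy latency model (an assumption): T0 / (1 - offered / saturation)."""
    if not 0 <= offered < saturation:
        raise ValueError("offered traffic must be in [0, saturation)")
    return zero_load_latency / (1.0 - offered / saturation)


if __name__ == "__main__":
    T0 = 10.0          # hypothetical zero-load latency, in cycles
    SATURATION = 1.0   # hypothetical saturation throughput, normalized
    for load in (0.0, 0.25, 0.5, 0.75, 0.9, 0.99):
        print(f"offered {load:4.2f}: latency {latency(load, T0, SATURATION):7.1f}")
```

Real networks deviate from this curve, but the two bounds it exposes are the ones named on the slide: the intercept at zero load and the asymptote at saturation.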

Lecture Outline
- Introduction to Networks
- Network Topologies
- Network Routing
- Network Flow Control
- Router Microarchitecture
- Technology example: On-chip Nanophotonics