Virtual-Channel Flow Control William J. Dally


Virtual-Channel Flow Control William J. Dally Presented by: Nick Kirchem March 5, 2004

Motivation
- Interconnection network is critical
  - Performance is sensitive to network latency and throughput
  - Interconnect is a large fraction of cost and power consumption
- Interconnect throughput is limited to a fraction of capacity due to coupled resource allocation
  - A single buffer is associated with each physical channel
  - A blocked packet blocks the entire physical channel
  - True for circuit switching and wormhole routing

Solution: Virtual Channels
- Add "lanes" to each physical channel (lane = virtual channel)

VCs and Flow Control Background
- Virtual channels decouple physical channels from buffer memory
  - The most costly resources of an interconnection network
- Associate multiple virtual channels with a single physical channel
- Paper analyzes flow control
  - Determines how resources are allocated, and
  - How collisions over resources are resolved
- Most beneficial to flow control strategies that block

Virtual Channels Structure
- Each node contains a set of buffers and a switch

Virtual Channels Structure
- Organize flit buffers into several lanes

Virtual Channels State Logic
- Status register for transmitting node
  - Lane-is-free bit
  - Number of free flit buffers in lane
  - (Optionally) priority of packet in lane
- Status register for receiving node
  - Input and output pointers for each lane buffer
  - Channel state (free, waiting, active)
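The status fields listed on this slide could be sketched as two small per-lane records. This is only an illustration: the class and field names (`TransmitLaneStatus`, `ReceiveLaneStatus`, `make_channel_status`) are hypothetical and do not come from the paper, and real hardware would pack these fields into a few bits per lane.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ChannelState(Enum):
    """Receiver-side lane states named on the slide."""
    FREE = "free"
    WAITING = "waiting"
    ACTIVE = "active"


@dataclass
class TransmitLaneStatus:
    """Per-lane status kept at the transmitting node (hypothetical names)."""
    lane_is_free: bool = True
    free_flit_buffers: int = 0
    priority: Optional[int] = None  # optional field, only if priorities are used


@dataclass
class ReceiveLaneStatus:
    """Per-lane status kept at the receiving node (hypothetical names)."""
    in_ptr: int = 0   # where the next arriving flit is written
    out_ptr: int = 0  # where the next departing flit is read
    state: ChannelState = ChannelState.FREE


def make_channel_status(lanes, buffers_per_lane):
    """Build the transmit and receive status for one physical channel."""
    tx = [TransmitLaneStatus(free_flit_buffers=buffers_per_lane)
          for _ in range(lanes)]
    rx = [ReceiveLaneStatus() for _ in range(lanes)]
    return tx, rx
```

Calling `make_channel_status(4, 4)` would model a channel with four lanes of four flit buffers each, i.e. 16 flit buffers in total.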

Virtual Channels State Logic

VC State Logic Storage Overhead
- Number of bits of storage required depends on l lanes, b flit buffers, and pri priority bits (the formula is not preserved in this transcript)
- Typical scenario (b=16, l=4, pri=0) requires:
  - 36 bits of overhead with virtual channels
  - 17 bits with no virtual channels
- Small compared to total storage of 512 bits
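To put the figures on this slide in proportion, the overhead can be expressed as a fraction of the 512 bits of total storage. This is plain arithmetic on the numbers quoted above, not a formula from the paper:

```latex
\frac{36}{512} \approx 7.0\%, \qquad
\frac{17}{512} \approx 3.3\%, \qquad
\frac{36 - 17}{512} = \frac{19}{512} \approx 3.7\%
```

So in this scenario, adding four lanes costs about 19 extra bits of state, under 4% of the buffer storage.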

VC Operation
- Packet arrives at node
- Assigned output channel by routing algorithm
  - Based on destination and output channel status
- Assigned to any free virtual channel (lane)
  - Blocks if none are available
- Flit advanced by flow control
  - Must gain access to a path through the switch, and
  - Access to the physical channel to the input of the next node
- Lane is deallocated when last flit leaves node
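The lane lifecycle described on this slide (allocate any free lane, block if none, deallocate on the last flit) might look like the following minimal sketch. The `OutputChannel` class and its method names are assumptions made for illustration. Routing and switch arbitration are left out.

```python
class OutputChannel:
    """Tracks which packet holds each lane of one output channel (hypothetical)."""

    def __init__(self, num_lanes):
        # owner[i] is the id of the packet holding lane i, or None if free.
        self.owner = [None] * num_lanes

    def allocate_lane(self, packet_id):
        """Give the packet any free lane; return None if it must block."""
        for lane, owner in enumerate(self.owner):
            if owner is None:
                self.owner[lane] = packet_id
                return lane
        return None

    def forward_flit(self, lane, is_last_flit):
        """Record a flit leaving the node; free the lane after the last flit."""
        if self.owner[lane] is None:
            raise ValueError("lane is not allocated")
        if is_last_flit:
            self.owner[lane] = None


channel = OutputChannel(num_lanes=2)
lane_a = channel.allocate_lane("A")       # takes lane 0
lane_b = channel.allocate_lane("B")       # takes lane 1
lane_c = channel.allocate_lane("C")       # None: no free lane, packet C blocks
channel.forward_flit(lane_a, is_last_flit=True)  # last flit of A frees lane 0
lane_c_retry = channel.allocate_lane("C")        # now succeeds on lane 0
```

With a single lane, packet B would have had to wait behind A even if A were itself blocked downstream; the second lane is what lets B make progress.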

Allocation Policies
- Allocate physical channel bandwidth to lanes that:
  - Have a flit ready to transmit
  - Have room for the flit at the receiving end
- Can use any arbitration algorithm
  - Random, round-robin, priority
  - Deadline scheduling (schedule by age)
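As one concrete instance of the policies listed here, a round-robin arbiter over eligible lanes could be sketched as below. The function name and the boolean-list interface are assumptions for illustration; the slide only states the two eligibility conditions and names round-robin as one option.

```python
def round_robin_pick(has_flit, has_room, last_granted):
    """Pick the next eligible lane after `last_granted`, or None.

    A lane is eligible when it has a flit ready to transmit and the
    receiving end has room for that flit.
    """
    num_lanes = len(has_flit)
    for offset in range(1, num_lanes + 1):
        lane = (last_granted + offset) % num_lanes
        if has_flit[lane] and has_room[lane]:
            return lane
    return None


has_flit = [True, False, True, True]
has_room = [True, True, False, True]
first = round_robin_pick(has_flit, has_room, last_granted=0)       # lane 3
second = round_robin_pick(has_flit, has_room, last_granted=first)  # lane 0
```

Starting the search just after the last granted lane is what keeps one busy lane from starving the others. A priority or deadline policy would replace the loop order with a sort on priority or packet age.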

VC Implementation Issues
- Integration design changes
  - Replace FIFO buffers with multilane buffers
  - Modify switch for larger number of inputs and outputs
  - Flow control protocol modification
- Switch complexity
- Added complexity to ACK when free buffer space opens up (identify lane = additional bits)
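The extra ACK cost mentioned on this slide can be illustrated with a small sketch. It assumes a credit-style scheme in which the sender keeps a free-buffer count per lane and each ACK names the lane whose buffer was freed. The slide does not specify the protocol, so `CreditCounter` and `lane_id_bits` are hypothetical names.

```python
def lane_id_bits(num_lanes):
    """Bits needed in an ACK to identify one of `num_lanes` lanes."""
    return (num_lanes - 1).bit_length()


class CreditCounter:
    """Sender-side count of free flit buffers per lane (assumed scheme)."""

    def __init__(self, num_lanes, buffers_per_lane):
        self.free = [buffers_per_lane] * num_lanes

    def send_flit(self, lane):
        """Consume one downstream buffer in `lane` when a flit is sent."""
        if self.free[lane] == 0:
            raise RuntimeError("no free buffer in lane")
        self.free[lane] -= 1

    def receive_ack(self, lane):
        """An ACK naming `lane` reports that one of its buffers opened up."""
        self.free[lane] += 1


credits = CreditCounter(num_lanes=4, buffers_per_lane=4)
credits.send_flit(2)
credits.send_flit(2)
credits.receive_ack(2)
```

With four lanes, `lane_id_bits(4)` gives 2, so each ACK carries two additional bits. With a single lane it gives 0, which matches the case without virtual channels.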

Virtual Channel Analysis
- Some assumptions:
  - Packet destinations are uniformly randomly distributed
  - An arriving packet is consumed without waiting
  - Single flit buffer for each lane
  - Packet blocking probabilities are independent
- Lots of math…

VC Analysis Results

Experimental Results
- Simulator (C program)
  - Various topologies and VC depths
- Throughput and latency analysis matches predicted performance
- Better to have more lanes with less depth than vice versa
- Scheduling algorithms show the performance possible given priorities or deadlines

Experimental Results

Conclusion and Questions
- Network throughput and latency are improved by decoupling physical channels from buffers
- Is it worth the added complexity?
- Under which systems/network topologies would it be useful?
- Where would it not be so useful?