Dan Marconett Next-Generation Networking Systems Lab

Presentation transcript:

A Survey of Architectural Design and Implementation Tradeoffs in Network on Chip Systems
Dan Marconett
Next-Generation Networking Systems Lab
University of California, Davis
dmarconett@ucdavis.edu
Slide_1

Overview
- Introduction
- SoC/NoC Architectures
- Routing Strategies
- Energy Dissipation
- Conclusion
Slide_2

SoC
What is System-on-Chip (SoC)?
- Integration of multiple computer components (e.g., microcontroller, memory blocks, timers) onto a single silicon chip
- Each on-chip component is referred to as a block
- The block abstraction enables component-level design of an SoC containing multiple proprietary elements
Slide_3

NoC
What is Network-on-Chip (NoC)?
- Leverages existing computer networking principles to improve communication between components on the same SoC
- Each on-chip component is connected by a switch to one or more communication wires
- Improves on standard bus-based interconnections for SoC architectures in terms of throughput
Slide_4

Architectures: CLICHÉ
- CLICHÉ: Chip-Level Integration of Communicating Heterogeneous Elements
- Two-dimensional mesh network layout for NoC design
- Each switch is connected to its four nearest neighboring switches and to its own resource block; switches on the edge of the layout have fewer neighbors
- Each connection consists of two unidirectional links
Slide_6
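The mesh connectivity rule can be sketched in a few lines. This is a minimal illustration, assuming switches are addressed by hypothetical (row, column) coordinates; the function name and the 4x4 size are not from the slides.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the switches directly linked to (row, col) in a rows x cols mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    # Switches on the edge of the layout simply have fewer neighbors.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(len(mesh_neighbors(1, 1, 4, 4)))  # interior switch
    print(len(mesh_neighbors(0, 1, 4, 4)))  # edge switch
    print(len(mesh_neighbors(0, 0, 4, 4)))  # corner switch
```

An interior switch gets four neighbors, an edge switch three, and a corner switch two.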

Architectures: Folded Torus
- Similar to mesh-based architectures
- Wires wrap around from the top component to the bottom and from the rightmost to the leftmost
- Smaller hop count
- Higher bandwidth
- Decreased contention
- Increased chip space usage
Slide_7
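The hop-count claim can be checked with a small sketch. It assumes a square k x k layout and minimal routing, and it treats the folded torus as an ordinary torus, since folding changes only the physical wire layout and not which switches are linked. The function names are illustrative.

```python
from itertools import product


def mesh_hops(a, b):
    """Minimal hop count between two switches in a mesh (Manhattan distance)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def torus_hops(a, b, k):
    """Minimal hop count in a k x k torus, where each dimension wraps around."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    return min(dr, k - dr) + min(dc, k - dc)


def worst_case(k, hops):
    """Largest minimal hop count over all pairs of switches."""
    nodes = list(product(range(k), repeat=2))
    return max(hops(a, b) for a in nodes for b in nodes)


if __name__ == "__main__":
    k = 4
    print("mesh :", worst_case(k, mesh_hops))
    print("torus:", worst_case(k, lambda a, b: torus_hops(a, b, k)))
```

For k = 4 the worst case drops from 6 hops in the mesh to 4 hops in the torus.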

Architectures: BFT
- BFT: Butterfly Fat Tree
- Each node in the tree has coordinates (level, position), where level is the depth and position runs from left to right
- Leaves are component blocks; interior nodes are switches
- Each switch has four child ports and two parent ports
- log_4(n) levels; the i-th level has n/2^(i+1) switches, where n is the number of leaves (blocks)
- Traffic aggregation is used to reduce congestion
Slide_8
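As a worked example of the level structure, the sketch below reads the slide's switch-count expression as n/2^(i+1) with levels numbered from 1, and assumes the number of blocks is a power of four. Both are assumptions, although they agree with four child ports and two parent ports per switch. The function name is hypothetical.

```python
def bft_switches_per_level(n_blocks):
    """Switch count at each level of a butterfly fat tree with n_blocks leaves.

    Assumes n_blocks is a power of 4 and that level i (counted from 1)
    holds n_blocks / 2**(i + 1) switches.
    """
    levels = 0
    remaining = n_blocks
    while remaining > 1:
        if remaining % 4:
            raise ValueError("n_blocks must be a power of 4")
        remaining //= 4
        levels += 1
    return [n_blocks // 2 ** (i + 1) for i in range(1, levels + 1)]


if __name__ == "__main__":
    per_level = bft_switches_per_level(64)
    print(per_level, sum(per_level))
```

With 64 blocks this gives 16, 8, and 4 switches across three levels, 28 in total. The 16 first-level switches each serve four blocks, and their 32 parent ports feed the 8 second-level switches.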

Architectures: SPIN
- SPIN: Scalable, Programmable, Integrated Network
- Leverages the Butterfly Fat Tree design
- Every level now has the same number of switches
- Network grows like (N log N)/8
- Trades area overhead and decreased power efficiency for higher throughput
- Illustrates the trade-off between performance and power consumption
Slide_9
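To make the growth rate concrete, the sketch below compares switch counts for SPIN and BFT. It assumes the logarithm in (N log N)/8 is base 2, which matches N/4 switches on each of log_4(N) levels, and it assumes the BFT has N/2^(i+1) switches at level i. Both readings are assumptions, and the function names are illustrative.

```python
import math


def spin_switches(n_blocks):
    """SPIN switch count, assuming (N * log2(N)) / 8."""
    return n_blocks * math.log2(n_blocks) / 8


def bft_switches(n_blocks):
    """Total BFT switch count, assuming N / 2**(i + 1) switches at level i."""
    levels = round(math.log(n_blocks, 4))
    return sum(n_blocks // 2 ** (i + 1) for i in range(1, levels + 1))


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, spin_switches(n), bft_switches(n))
```

Under these assumptions, 256 blocks need 256 SPIN switches but only 120 BFT switches, which is the area overhead the slide refers to.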

Architectures: Octagon
- Standard model: 8 components, 12 interconnects
- Design complexity increases linearly with the number of nodes
- Largest packet travel distance is two hops
- High throughput
- Shortest-path routing is easy to implement
Slide_10
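The link count and the two-hop bound can be verified with a short sketch. It assumes the usual Octagon wiring, where each node links to its two ring neighbors and to the node directly across; the slide itself gives only the counts. The function names are hypothetical.

```python
from collections import deque

NODES = 8


def octagon_links():
    """Build the link set: 8 ring links plus 4 cross links."""
    links = set()
    for i in range(NODES):
        links.add(frozenset((i, (i + 1) % NODES)))           # ring neighbor
        links.add(frozenset((i, (i + NODES // 2) % NODES)))  # node directly across
    return links


def max_hops(links):
    """Largest shortest-path distance between any two nodes (breadth-first search)."""
    neighbors = {i: set() for i in range(NODES)}
    for link in links:
        a, b = tuple(link)
        neighbors[a].add(b)
        neighbors[b].add(a)
    worst = 0
    for src in range(NODES):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    links = octagon_links()
    print(len(links), max_hops(links))
```

The script reports 12 links and a maximum distance of 2 hops, matching the figures on the slide.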

Routing: Circuit/Packet Switching
Circuit Switching
- A dedicated path, or circuit, is established over which data packets will travel
- Naturally lends itself to time-sensitive guaranteed service, because resources are allocated in advance
- Reservation of bandwidth decreases overall throughput and increases average delays
Packet Switching
- Intermediate routers route each packet individually through the network, rather than all packets following a single pre-established path
- Provides so-called best-effort service
Slide_12

Routing: Wormhole/Virtual Cut Through
Wormhole Switching
- A message is divided into smaller, fixed-length flow units called flits
- Only the first flit contains routing information; subsequent flits follow it
- Buffer size is significantly reduced, because only a limited number of flits needs to be buffered at any given time
Virtual Cut Through Switching
- Much like wormhole switching
- The header flit can travel ahead and undergo processing while the remaining flits are still navigating the network
- Higher acceptance rates and lower latencies than wormhole switching
Slide_13
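The flit structure described for wormhole switching can be sketched as follows. This is an illustration only: the dictionary layout, the 4-byte flit size, the zero padding, and the "tail" marker on the last flit are assumptions and do not come from the slides.

```python
def to_flits(payload, dest, flit_size=4):
    """Split a message into a header flit plus fixed-length data flits.

    Only the header flit carries routing information (the destination);
    the data flits carry payload alone and follow the header's path.
    """
    flits = [{"type": "head", "dest": dest}]
    for start in range(0, len(payload), flit_size):
        chunk = payload[start:start + flit_size]
        # Pad the final chunk so every flit has the same length.
        flits.append({"type": "body", "data": chunk.ljust(flit_size, b"\x00")})
    if len(flits) > 1:
        flits[-1]["type"] = "tail"  # marks the end of the message
    return flits


if __name__ == "__main__":
    for flit in to_flits(b"abcdefghij", dest=(2, 3)):
        print(flit)
```

A 10-byte message becomes one header flit and three 4-byte data flits, so a router never needs to buffer the whole message at once.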

Routing: Contention
- Contention occurs when routers or IP blocks attempt to send data over the same link at the same time
- For circuit switching, contention is resolved when the connection is set up
- For packet switching, contention is resolved at a much finer granularity, by routers buffering and scheduling individual packets
- Packet-switched networks offer better overall performance, at the cost of service guarantees
Slide_14
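One way a packet-switched router might buffer and schedule competing packets is round-robin arbitration over its input queues. The slides do not name a scheduling policy, so round-robin is an assumption chosen for illustration, and the function and packet names are hypothetical.

```python
from collections import deque


def arbitrate(queues):
    """Grant a shared output link to one buffered packet per cycle, round-robin.

    Each entry in `queues` is the buffer of one input port. Returns the
    order in which the packets leave on the contended link.
    """
    order = []
    pointer = 0
    count = len(queues)
    while any(queues):
        for offset in range(count):
            idx = (pointer + offset) % count
            if queues[idx]:
                order.append(queues[idx].popleft())
                pointer = (idx + 1) % count  # next cycle starts after the winner
                break
    return order


if __name__ == "__main__":
    inputs = [deque(["A1", "A2"]), deque(["B1"]), deque(["C1", "C2"])]
    print(arbitrate(inputs))
```

No input is starved, but no input is promised a delivery time either, which is the best-effort behavior the slide contrasts with circuit switching.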

Energy Dissipation: Architectures
- Two sources of dissipation: switches and wire segments
- Many parameters in the architectural design phase affect the key trade-off of performance vs. power dissipation:
  - Length of physical wires
  - Switching techniques
  - Buffer allocation
  - Types of guaranteed service
  - The topology itself
Slide_16

Energy Dissipation: Architectures (2)
- Pande et al. [10] used a simulator to investigate various metrics, including energy dissipation, for the five main architectures
- Metric: average dynamic energy dissipated per event, with each layout containing 256 functional blocks
- Energy dissipation increases linearly with the number of virtual channels for all five architectures
- A small number (4) of virtual channels keeps energy dissipation low without giving up throughput
- When traffic load was analyzed, energy dissipation reached an upper limit once throughput was maximized
- Architectures with more elaborate topologies, and therefore higher degrees of connectivity (such as SPIN and Octagon), have much greater energy dissipation on average (roughly 250-350 nJ, compared with about 60 nJ)
Slide_17

Energy Dissipation: Switching
- Problem: how to route information from block A to block B while staying within the constraints on energy consumption
- Banerjee et al. [9] address this issue through a modeling approach based on a 4x4 mesh layout
- Virtual Cut Through (VCT) switching versus wormhole switching:
  - For both techniques, energy dissipation rises linearly with the injection rate of data packets until the network is fully congested, after which it is constant
  - Both techniques yield the same power consumption
  - VCT switching produces higher acceptance rates and lower latencies than wormhole switching, so VCT is preferred
Slide_18
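The reported trend, linear growth up to congestion and then a plateau, can be written as a simple piecewise model. This is a sketch of the shape only: the slope and saturation rate used below are made-up placeholders, not values from Banerjee et al. [9].

```python
def energy_dissipation(injection_rate, saturation_rate, slope):
    """Piecewise-linear model: energy grows with injection rate, then flattens.

    `saturation_rate` is the injection rate at which the network becomes
    fully congested; `slope` is energy per unit of injection rate.
    Both parameters are hypothetical.
    """
    return slope * min(injection_rate, saturation_rate)


if __name__ == "__main__":
    for rate in (0.25, 0.5, 0.75, 1.0):
        print(rate, energy_dissipation(rate, saturation_rate=0.5, slope=100.0))
```

Because both switching techniques follow the same curve, energy alone does not separate them; the preference for VCT rests on acceptance rate and latency.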

Conclusion
- More elaborate layouts with higher degrees of connectivity (SPIN and Octagon) showed much higher energy dissipation, but they also yield increased throughput
- Elaborate architectures also take up more space on the silicon chip
- VCT is preferred to wormhole switching because of its lower latency, though both have the same energy dissipation for a given traffic load
- Decide on priorities: communication reliability, energy efficiency, increased throughput, or decreased latency?
Slide_20

References
[1] E. Rijpkema, K. Goossens, A. Radulescu, J. Dielissen, J. van Meerbergen, P. Wielage, and E. Waterlander, "Trade-offs in the Design of a Router with Both Guaranteed and Best-Effort Services for Networks on Chip," IEE Proceedings Computers and Digital Techniques, vol. 150, no. 5, pp. 294-302, Sept. 2003.
[2] W. Dally and C. Seitz, "Deadlock-free Message Routing in Multiprocessor Interconnection Networks," IEEE Transactions on Computers, vol. C-36, no. 5, pp. 547-553, May 1987.
[3] S. Kumar, A. Jantsch, J. Soininen, M. Forsell, M. Millberg, J. Oberg, K. Tiensyrja, and A. Hemani, "A Network on Chip Architecture and Design Methodology," Proceedings International Symposium VLSI (ISVLSI), pp. 117-124, 2002.
[4] W. J. Dally and B. Towles, "Route Packets, Not Wires: On-Chip Interconnection Networks," Proceedings Design and Automation Conference (DAC), pp. 683-689, 2001.
[5] P. P. Pande, C. Grecu, A. Ivanov, and R. Saleh, "Design of a Switch for Network on Chip Applications," Proceedings International Symposium on Circuits and Systems (ISCAS), vol. 5, pp. 217-220, May 2003.
[6] P. Guerrier and A. Greiner, "A Generic Architecture for On-Chip Packet-Switched Interconnections," Proceedings Design, Automation and Test in Europe (DATE), pp. 250-256, Mar. 2000.
[7] F. Karim, A. Nguyen, and S. Dey, "An Interconnect Architecture for Networking Systems on Chips," IEEE Micro, vol. 22, no. 5, pp. 36-45, Sept./Oct. 2002.
[8] Arteris, "A Comparison of Network-on-Chip and Buses," http://www.arteris.com/noc_whitepaper.pdf.
[9] N. Banerjee, P. Vellanki, and K. S. Chatha, "A Power and Performance Model for Network-on-Chip Architectures," Proceedings of the Design, Automation, and Test in Europe Conference and Exhibition (DATE), p. 21250, 2004.
[10] P. P. Pande, C. Grecu, M. Jones, A. Ivanov, and R. Saleh, "Performance Evaluation and Design Trade-Offs for Network-on-Chip Interconnect Architectures," IEEE Transactions on Computers, vol. 54, no. 8, pp. 1025-1040, Aug. 2005.
Slide_21