Published by Everett Harvey. Modified over 9 years ago.
1
Performance and Power Efficient On-Chip Communication Using Adaptive Virtual Point-to-Point Connections
M. Modarressi, H. Sarbazi-Azad, and A. Tavakkol
Computer Engineering Department, Sharif University of Technology, Tehran, Iran
modarressi@ce.sharif.edu
2
Outline
  Introduction and Motivations
  Virtual Point-to-Point (VIP) Connections
  Static VIP Construction Scheme
  Dynamic VIP Construction Scheme
  Setup Network
  Evaluation Results
  Conclusions and Future Work
3
On-Chip Communication Mechanisms
  Packet-Switched NoCs
    Good Resource Utilization
    Modest Design Effort/Time Due to Structured and Predictable Links
    Some Power and Performance Overheads Due to Multi-Stage Pipelined Routers
  Dedicated Point-to-Point Links
    Ideal Power and Performance
    Poor Scalability: Significant Area Overhead for Large Systems
    Significant Design Effort/Time Due to Non-Predictable Link Properties
  Virtual Point-to-Point Connections in a Packet-Switched NoC
4
VIP Connections
  VIP: VIrtual Point-to-point Connections
    Over One VC (Virtual Channel) of Each Physical Channel
    Bypass Some Router Pipeline Stages
  Inexpensive Extensions to a Traditional Wormhole Router
    Router Control Unit, Arbiter, Buffer of the VIP Virtual Channels
5
Router Architecture
  Buffer at the VIP Virtual Channels Is Replaced by a Register (1-Flit Buffer)
  VIP Paths Are Kept by VIP Allocator Units at Output Ports
    Determines Which Input Is Connected to This Port Along the VIP
    Allocates Output Port to VIP When Control Signals Indicate That the VIP Has an Incoming Flit to Forward
  A Flow-Control Mechanism Prevents Starvation in Packet-Switched Flits
6
VIP Connections
  A VIP Is Constructed by Chaining the VIP Registers in the Routers Between the Source and Destination Nodes of a Communication Flow
  Provides a Virtual Dedicated Pipelined Link With 1-Flit VIP Buffers as Staging Registers
  Flits Only Travel Over the Crossbars and Links Which Cover the Actual Physical Distance Between Their Source and Destination Nodes
    Skip Through Buffer Read, Buffer Write, and Allocation Operations
7
VIP Connections
  VIPs Are Not Allowed to Share a Common Link
    To Remove Buffering, Arbitration, …
    A Limited Number of VIPs in a Network
  But VIPs Cover a Significant Portion of On-Chip Traffic Due to Communication Locality
    In Most Multi-Core SoC Applications Each Core Communicates With a Few Other Cores
    In CMP Workloads Each Node Tends to Have a Small Number of Favored Destinations for Its Messages
8
VIP Construction Algorithm - Static
  Based on Application Traffic Pattern
  Input Applications Are Described by a Task-Graph (TG)
  A Heuristic Algorithm
    Map the TG Cores into the Nodes of a Mesh-Based NoC
    Construct VIP for TG Edges in Order of Their Communication Volumes
    Find a Path Through Packet-Switched Network for a TG Edge If There Are Not Sufficient Free Resources to Build a VIP for It
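The VIP-construction step of the heuristic above can be sketched roughly as follows. This is only an illustration under stated assumptions: the core-to-node mapping is taken as already given (`placement`), only a single dimension-ordered (X then Y) route is tried per edge, and all function and parameter names are hypothetical rather than taken from the slides.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]      # (x, y) coordinates in the mesh
Link = Tuple[Node, Node]    # directed link between two adjacent nodes


def xy_path(src: Node, dst: Node) -> List[Link]:
    """Links of the X-then-Y route (an assumed routing choice)."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


def build_static_vips(edges: Dict[Tuple[str, str], float],
                      placement: Dict[str, Node]):
    """Greedy pass: task-graph edges are served in decreasing volume order."""
    used: Set[Link] = set()          # links already owned by some VIP
    vips: Dict[Tuple[str, str], List[Link]] = {}
    packet_switched: List[Tuple[str, str]] = []
    for (s, d), _volume in sorted(edges.items(), key=lambda kv: -kv[1]):
        path = xy_path(placement[s], placement[d])
        if all(link not in used for link in path):
            used.update(path)        # VIPs may not share a link
            vips[(s, d)] = path
        else:
            # Not enough free resources: the edge stays packet-switched.
            packet_switched.append((s, d))
    return vips, packet_switched
```

For example, with cores `a`, `c`, `b` placed in a row at (0,0), (1,0), (2,0), the heaviest edge `a -> b` claims both links of the row, so lighter edges `a -> c` and `c -> b` fall back to packet switching.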
9
VIPs for the VOPD Application
  VIPs Cover 100% of the On-Chip Traffic for This Application
Static VIP Construction Scheme:
  Benchmarks: VOPD, MWD, MPEG, MP3+H263
  Up to 58% Reduction in Message Latency (39% on Average)
  Up to 65% Reduction in Power Consumption (49% on Average)
10
VIPs vs. Physical Point-to-Point Connections
  VIPs Offer:
    Power and Performance Close to Dedicated Physical Point-to-Point Connections
    More Flexibility
      Dynamically Reconfigurable Based on the Traffic Pattern of the Running Application
    Less Design Effort
      Customized Dedicated Connections Over Regular Components
11
Dynamic VIP Construction
  An Alternative VIP Construction Scheme
  Dynamically Changes the VIP Connections in Response to Communication Requirements Imposed by the Running Application
    Monitoring the NoC Traffic
    Detecting High-Volume Communications and Constructing a VIP for Them
    Selecting the Best Route for a VIP Using a Simple Setup Network
12
Setup Network
  Setup Network Structure
    A Light-Weight Control Network
    Simple Node Structure and Small Bit-Width
    The Same Topology as the Main Data Network
  Setup Network Operation
    Keep Track of the Number and Destination of Packets Sent by Each Node
    Select Traffic Flows Weighing More Than a Threshold (bit/sec.)
    Find a Path Along One of the Shortest Paths Between the Source and Destination Nodes of the Traffic Flow to Construct a VIP
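The monitoring step described above (count packets per destination, then pick flows whose weight exceeds a threshold) might look like the sketch below. The class name, the fixed measurement window, and the way the weight is computed (bits sent divided by window length) are assumptions for illustration; the slides do not specify them.

```python
from collections import defaultdict


class FlowMonitor:
    """Per-node traffic counter (hypothetical helper, not from the slides)."""

    def __init__(self, threshold_bps: float, window_s: float):
        self.threshold_bps = threshold_bps   # selection threshold in bit/s
        self.window_s = window_s             # assumed measurement window
        self.packets = defaultdict(int)      # destination -> packet count
        self.bits = defaultdict(int)         # destination -> bits sent

    def record(self, dst, packet_bits: int) -> None:
        """Account for one packet sent to `dst`."""
        self.packets[dst] += 1
        self.bits[dst] += packet_bits

    def heavy_flows(self) -> dict:
        """Return {destination: weight in bit/s} for flows above the threshold."""
        weights = {d: b / self.window_s for d, b in self.bits.items()}
        return {d: w for d, w in weights.items() if w > self.threshold_bps}

    def reset(self) -> None:
        """Start a new measurement window."""
        self.packets.clear()
        self.bits.clear()
```

Each flow returned by `heavy_flows()` would then become a candidate VIP setup request.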
13
Dynamic VIP Construction
  Establishing a New VIP May Tear Down Some Existing VIPs
  Cost of a VIP: The Cumulative Weight (bit/sec.) of the VIPs That Will Be Torn Down by This New VIP
  Setup Network:
    Finds the Path With Minimum Cost
    Sends the Cost to the Source Node to Decide on Establishing the New VIP
  A New VIP Is Established If the Cumulative Weight of the Torn-Down VIPs Is Less Than the Weight of the Requesting Traffic Flow
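The acceptance rule above can be written down in a few lines. In this sketch the per-path costs are assumed to be already known (in the scheme itself they are computed by the setup network), and the function name and dictionary layout are hypothetical.

```python
def choose_vip_path(flow_weight: float, path_costs: dict):
    """Pick the cheapest candidate path and apply the acceptance rule.

    path_costs maps a candidate path (any hashable label) to the cumulative
    weight, in bit/s, of the existing VIPs it would tear down.
    Returns (path, cost) if the new VIP should be established, else None.
    """
    best = min(path_costs, key=path_costs.get)
    cost = path_costs[best]
    # Establish only if the torn-down VIPs weigh strictly less than the flow.
    return (best, cost) if cost < flow_weight else None
```

With candidate costs of 12 and 9, a flow of weight 10 is granted a VIP on the second path, while a flow of weight 9 is not, because the cost is not strictly smaller than the flow's weight.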
14
Setup Network
  VIP Setup Procedure:
    Arbitrating Among VIP Setup Requests
    Running the Distributed VIP Setup Algorithm
    Setting Up a VIP in the Data Network by Configuring the VIP Allocator of the Nodes Along the VIP Path
    Tearing Down Conflicting VIPs
  Each Setup Network Node Contains the Configuration Information of Its Corresponding Data Network Node
  Due to the Distributed Nature of the Algorithm
  Short Reconfiguration Time
15
[Figure: example of the distributed VIP setup algorithm between a source node S and a destination node D; ports are annotated with numeric costs.]
  Port Cost (Weight of the VIP Using It)
  1. Add the Received Cost (4) to the Weight of Ports Along the Shortest Path (the W and N Ports) toward the Destination Node
  2. Send the New Costs (9 and 12) to the Neighboring Nodes Along the Destination Node
  Select the Minimum Cost and Keep the Port from Which the Smaller Cost Is Received
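The cost propagation in the steps above can be emulated sequentially. The sketch below is not the distributed hardware algorithm itself: it is a centralized dynamic-programming pass over the nodes that lie on shortest paths in a 2D mesh, assuming each output port's cost is stored in a dictionary keyed by `(node, next_node)` with free ports costing 0. All names are hypothetical.

```python
def min_cost_vip_path(src, dst, port_weight):
    """Cheapest shortest path from src to dst in a 2D mesh.

    port_weight[(node, next_node)] is the weight of the VIP currently using
    that output port; missing entries are treated as free ports (cost 0).
    Returns (total_cost, list_of_nodes_from_src_to_dst).
    """
    sx = 1 if dst[0] >= src[0] else -1   # step direction along x
    sy = 1 if dst[1] >= src[1] else -1   # step direction along y
    cost = {src: 0}
    came_from = {}
    # Visit only nodes inside the src/dst bounding box (minimal routes).
    for x in range(src[0], dst[0] + sx, sx):
        for y in range(src[1], dst[1] + sy, sy):
            node = (x, y)
            if node == src:
                continue
            candidates = []
            for prev in ((x - sx, y), (x, y - sy)):
                if prev in cost:
                    # Add the received cost to the weight of the port used.
                    total = cost[prev] + port_weight.get((prev, node), 0)
                    candidates.append((total, prev))
            # Keep the minimum cost and the neighbor it came from.
            cost[node], came_from[node] = min(candidates)
    path = [dst]
    while path[-1] != src:
        path.append(came_from[path[-1]])
    path.reverse()
    return cost[dst], path
```

On a 2x2 example where the route through (1,0) crosses ports of weight 5 and 0 and the route through (0,1) crosses ports of weight 2 and 4, the function returns cost 5 and the path through (1,0).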
16
Dynamic VIP Construction
  The Setup Network Operates in Parallel with Packet Transmission in the Packet-Switched Network
    Hides the Setup Time
  The Setup Network Has a Small Bit-Width and Operates Infrequently (Only When a High-Volume Flow Is Detected)
    Negligible Power and Area Overhead
17
Evaluation Results
  XMulator NoC Simulator (www.xmulator.org)
    A C#-Based Simulator
    Orion Power Library
  Comparison with a Conventional NoC (5-Stage Pipelined Wormhole Switch)
  Multi-Core SoC Traffic: H.263 Decoder + MP3 Decoder, H.263 Decoder + MP3 Encoder, MP3 Decoder + MP3 Encoder
    38% Reduction in Message Latency, 46% Reduction in Power Consumption
18
Evaluation Results
  Synthetic Traffic:
    N-Hot Traffic: 80% of Messages to Exactly N Destinations, 20% to Randomly Chosen Nodes
  [Plots: Power (nJ/Cycle); Message Latency (Cycles for 8-Flit Packets)]
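A destination generator for the N-hot pattern described above could be sketched as follows. Several details are assumptions, since the slide only gives the 80%/20% split: the N hot destinations are drawn uniformly at random per source node, hot messages are spread uniformly over them, and the random 20% may also land on a hot node. The function name and parameters are hypothetical.

```python
import random


def make_n_hot_picker(node, num_nodes, n, hot_fraction=0.8, rng=None):
    """Return (hot_destinations, pick) for one source node.

    pick() returns a destination: with probability hot_fraction one of the
    n hot destinations, otherwise any other node chosen uniformly.
    """
    if rng is None:
        rng = random.Random()
    others = [d for d in range(num_nodes) if d != node]  # never send to self
    hot = rng.sample(others, n)

    def pick():
        if rng.random() < hot_fraction:
            return rng.choice(hot)
        return rng.choice(others)

    return hot, pick
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible.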
19
Summary and Future Work
  Adaptable Virtual Point-to-Point Connections in a Packet-Switched NoC
    Benefit from the Advantages of Both Communication Methods
    Two VIP Construction Schemes: Static and Dynamic
    Significant Power/Latency Reduction
  Future Work
    Comparing the Method with Related Work: Express Virtual Channels, Single-Cycle Routers, …
    Precise Area/Power Results by Implementing the NoC in Hardware
      Analytical Models Show Small Area Overhead
20
Questions?
modarressi@ce.sharif.edu