Presenter : Cheng_Ta Wu Masoumeh Ebrahimi, Masoud Daneshtalab, N P Sreejesh, Pasi Liljeberg, Hannu Tenhunen Department of Information Technology, University.

Presentation transcript:

Presenter: Cheng_Ta Wu
Masoumeh Ebrahimi, Masoud Daneshtalab, N P Sreejesh, Pasi Liljeberg, Hannu Tenhunen
Department of Information Technology, University of Turku, Turku, Finland
NORCHIP 2009

- Abstract
- What's the problem
- Related works
- The proposed method
- Experiment results

In this paper, we present a novel network interface architecture for on-chip networks to increase memory parallelism and improve resource utilization. The proposed architecture exploits the AXI transaction-based protocol to remain compatible with existing IP cores. Experimental results with a synthetic test case demonstrate that the proposed architecture outperforms the conventional architecture in terms of latency.

According to our observation, the utilization of the reorder buffer in NIs is very low. Traditional buffer management is therefore not efficient enough for NIs.

- [6] Transaction ID renaming
- [10] Moving the reorder buffer resources from the NI into the network routers
- [7] Supporting shared-memory abstraction and flexible network configuration
- Drawbacks noted for these approaches: with global synchronization the performance might be degraded and the hardware overhead is too high; latency increases
- [5] NISAR (network interface architecture supporting adaptive routing): low buffer utilization and no support for burst transactions

[Figures: master-side NI architecture; slave-side NI architecture]

Both NIs are partitioned into two paths:
- Forward path: transfers requests to the network
  - AXI-Queue, Packetizer unit, Reorder unit
- Reverse path: receives responses from the network
  - Packet-Queue, Depacketizer unit, Reorder unit

AXI-Queue:
- Arbitrates between the write and read transaction channels and stores requests in the write or read request buffers.
- If the reorder unit admits the request, the request message is sent to the packetizer unit.
Packetizer:
- Converts incoming messages from the AXI-Queue into header and data flits.
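The packetizer step can be illustrated with a minimal Python sketch. The flit fields used here (`dest`, `t_id`, `seq_num`, `length`, `word`) and the `packetize` function are hypothetical names chosen for illustration; the slides do not give the actual flit format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Flit:
    kind: str      # "header" or "data"
    payload: dict  # hypothetical field layout, not the paper's flit format


def packetize(t_id: int, seq_num: int, dest: int, data_words: List[int]) -> List[Flit]:
    """Turn one request message into a header flit followed by data flits."""
    header = Flit("header", {
        "dest": dest,            # assumed routing field
        "t_id": t_id,            # AXI transaction ID
        "seq_num": seq_num,      # sequence number prepared by the reorder unit
        "length": len(data_words),
    })
    body = [Flit("data", {"word": w}) for w in data_words]
    return [header] + body


flits = packetize(t_id=3, seq_num=0, dest=12, data_words=[0xA, 0xB, 0xC])
```

The header carries the transaction ID and sequence number so that the receiving side can detect out-of-order arrival.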

Packet-Queue:
- Receives packets from the router.
- If a packet is out of order (according to its sequence number), it is transmitted to the reorder buffer; otherwise it is delivered directly to the Depacketizer unit.
Depacketizer:
- Restores packets coming from either the reorder buffer or the Packet-Queue into the original data format of the AXI master core.

The reorder unit includes a Status-Register, a Status-Table, a Reorder-Buffer, and a Reorder-Table.
In the forward path:
- This unit prepares the sequence number for the corresponding transaction ID and uses an admittance mechanism to avoid overflow of the reorder buffer.
In the reverse path:
- It determines where outstanding packets from the Packet-Queue should be transmitted (reorder buffer or Depacketizer) and when packets in the reorder buffer can be released to the Depacketizer.

Status-Register and Status-Table:
- Status-Register: an n-bit register in which each bit corresponds to one of the AXI transaction IDs. It records whether one or more messages with that transaction ID have been issued.
- Status-Table: each entry covers the messages that share one transaction ID and includes a valid tag (v), the transaction ID (T-ID), the number of outstanding transactions (N-T), and the expected sequence number (E-S).
- Size_nm: size of the new message
- Size_AOM: size of all outstanding messages
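A minimal Python sketch of the forward-path bookkeeping follows. The slides name Size_nm and Size_AOM but do not state the admittance condition, so the rule used here (admit only if Size_AOM + Size_nm fits in the reorder buffer) is an assumption, as are the class name `ForwardReorderUnit`, the `next_seq` helper field, and the `complete` method.

```python
from dataclasses import dataclass


@dataclass
class StatusEntry:
    valid: bool = False  # v
    t_id: int = 0        # T-ID
    n_t: int = 0         # N-T: number of outstanding transactions
    e_s: int = 0         # E-S: expected sequence number (reverse path, unused here)
    next_seq: int = 0    # hypothetical helper: next sequence number to hand out


class ForwardReorderUnit:
    def __init__(self, num_ids: int, buffer_capacity: int):
        self.status_register = [0] * num_ids  # one bit per AXI transaction ID
        self.status_table = {}                # t_id -> StatusEntry
        self.buffer_capacity = buffer_capacity
        self.size_aom = 0                     # Size_AOM

    def admit(self, t_id: int, size_nm: int):
        """Return a sequence number if the request is admitted, else None."""
        # ASSUMED rule: responses to all outstanding messages plus the new
        # one must fit in the reorder buffer in the worst case.
        if self.size_aom + size_nm > self.buffer_capacity:
            return None
        entry = self.status_table.get(t_id)
        if entry is None or not entry.valid:
            entry = StatusEntry(valid=True, t_id=t_id)
            self.status_table[t_id] = entry
        seq = entry.next_seq
        entry.next_seq += 1
        entry.n_t += 1
        self.size_aom += size_nm
        self.status_register[t_id] = 1
        return seq

    def complete(self, t_id: int, size: int):
        """Hypothetical hook: one transaction with this ID has finished."""
        entry = self.status_table[t_id]
        entry.n_t -= 1
        self.size_aom -= size
        if entry.n_t == 0:
            entry.valid = False
            self.status_register[t_id] = 0


unit = ForwardReorderUnit(num_ids=4, buffer_capacity=8)
a = unit.admit(1, 4)   # admitted, first sequence number for ID 1
b = unit.admit(1, 4)   # admitted, second sequence number for ID 1
c = unit.admit(2, 1)   # rejected: the buffer would overflow
unit.complete(1, 4)
d = unit.admit(2, 1)   # admitted now that space was freed
```

Rejected requests would stay in the AXI-Queue and retry later; that retry policy is outside this sketch.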

Reorder-Table and Reorder-Buffer:
- Each row of the Reorder-Table corresponds to an out-of-order packet stored in the reorder buffer.
- The Reorder-Table includes the valid tag (v), the transaction ID (T-ID), the sequence number (S-N), and the head pointer (P).
- Whenever an in-order packet is delivered to the depacketizer unit, the depacketizer controller checks the Reorder-Table for a valid stored packet with the same transaction ID and the next sequence number. If one exists, the stored packet is released from the reorder unit to the depacketizer unit.
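The reverse-path behavior described above can be sketched in Python as follows. This is a simplified model under stated assumptions: the Reorder-Table and Reorder-Buffer are merged into one dictionary keyed by (T-ID, S-N), the valid tag and head pointer are omitted, and the class and method names are hypothetical.

```python
class ReverseReorderUnit:
    def __init__(self):
        self.expected = {}       # t_id -> expected sequence number (E-S)
        self.reorder_table = {}  # (t_id, seq_num) -> packet held in the reorder buffer
        self.delivered = []      # packets handed to the depacketizer, in delivery order

    def receive(self, t_id, seq_num, packet):
        """Handle one packet arriving from the Packet-Queue."""
        if seq_num != self.expected.get(t_id, 0):
            # Out of order: park the packet in the reorder buffer.
            self.reorder_table[(t_id, seq_num)] = packet
            return
        # In order: deliver it directly to the depacketizer.
        self._deliver(t_id, seq_num, packet)
        # Then release any stored packets that have become in-order.
        nxt = self.expected[t_id]
        while (t_id, nxt) in self.reorder_table:
            self._deliver(t_id, nxt, self.reorder_table.pop((t_id, nxt)))
            nxt = self.expected[t_id]

    def _deliver(self, t_id, seq_num, packet):
        self.delivered.append(packet)
        self.expected[t_id] = seq_num + 1


unit = ReverseReorderUnit()
unit.receive(5, 1, "B")  # arrives early, stored
unit.receive(5, 2, "C")  # arrives early, stored
unit.receive(5, 0, "A")  # in order, triggers release of B and C
```

After the three calls, `unit.delivered` holds the packets in sequence order and the reorder table is empty.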

To avoid losing the order of the header information carried by arriving requests, a FIFO is used.

- In the first configuration (A), out of 25 nodes, ten nodes are assumed to be processors (master cores with master NIs) and the other fifteen nodes are memories (slave cores with slave NIs).
- In the second configuration (B), each node is considered to have a processor and a memory (a master core with a master NI and a slave core with a slave NI).
- Latency is defined as the number of cycles between the initiation of a request operation issued by a master and the time when the response is completely delivered to the master from the memory.
- The request rate is defined as the ratio of successful read/write request injections into the NI to the total number of injection attempts.
- The baseline architecture follows references [5] and [6].
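The two metrics defined above reduce to simple arithmetic. The Python sketch below uses hypothetical function names and made-up cycle counts purely to illustrate the definitions; the numbers are not results from the paper.

```python
def latency_cycles(issue_cycle: int, response_done_cycle: int) -> int:
    """Cycles from request initiation at the master to full delivery of the response."""
    return response_done_cycle - issue_cycle


def request_rate(successful_injections: int, injection_attempts: int) -> float:
    """Successful request injections into the NI divided by all injection attempts."""
    if injection_attempts == 0:
        raise ValueError("no injection attempts recorded")
    return successful_injections / injection_attempts


# Illustrative values only.
example_latency = latency_cycles(issue_cycle=100, response_done_cycle=148)
example_rate = request_rate(successful_injections=30, injection_attempts=40)
```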

- Automatic Interface Synthesis based on the Classification of Interface Protocols of IPs
- Protocol Transducer Synthesis using Divide and Conquer approach
- Efficient Network Interface Architecture for Network-on-chip