RAMP Common Interface Krste Asanovic Derek Chiou Joel Emer.


General Requirements
Provide a language-agnostic environment that facilitates sharing of modules
Provide a modeling standard that represents time in the modeled target system independently of the host cycle time
Provide a reusable set of ‘unmodel’ services that can be used by different projects
Provide an underlying communication standard that can be used to specify standard interfaces
Facilitate the creation of a specific set of modules that can be shared and that communicate via standard interfaces

Key infrastructure components
Modeling core architecture
Modeling time
Implementing inter-module data communication
Simulation control and support infrastructure (unModel)
  o simulation control
      - communication to front-end or control processor
  o simulation support
      - stats, events, assertions, knobs...
Virtual Platform
  o Local memory access
  o Shared memory access
  o Host to FPGA communication channel

Target and Host RTL
[Figure: diagram with labels Target RTL, Model RTL, Unmodel RTL, Host RTL, Platform RTL]

Translation from Target RTL to Model RTL
Start (conceptually) with final RTL
Partition design into units and channels
  o All inter-unit communication goes over channels
  o Channels have fixed latency
      - they are a systolic pipeline
      - latency set by what was mapped into the channel
Representation as a bipartite graph
[Figure: bipartite graph with Unit and Channel nodes]

Translation from Target to Model (2)
Change representation of time from edges to tokens
  o Encapsulate data sent on an edge into a timing token
      - data on the timing channel is a 1-1 mapping of the original data signals
  o Replace each channel with a timing token channel
      - timing channel is a FIFO that transports timing tokens, e.g., A-ports
  o Convert unit to sink and source tokens by abiding by the following:
      - Unit waits for tokens on all inputs and reads them
      - Performs same computation as it did
      - Dequeues all input tokens
      - Sends a token on all outputs
  o Note: channel must be initialized
Proof of equivalence to be provided

Distributed Timing Example
[Figure: Target: Unit A and Unit B joined by a channel of latency L carrying data D. Host: Unit A and Unit B with Start/Done, RDY, ENQ, and DEQ signals and data D.]
Pipeline target channel implemented as distributed FIFO with at least L buffers
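
The latency behaviour described above can be sketched in Python. This is only an illustration under stated assumptions: the slides say the channel must be initialized but not how, so pre-filling the FIFO with L placeholder tokens is an assumption, and the names `make_timing_channel` and `run_channel` are hypothetical.

```python
from collections import deque

def make_timing_channel(latency, initial_token=None):
    """Build a token FIFO for a target channel with a fixed latency.

    Assumption: the channel starts out holding `latency` initial tokens,
    standing in for the reset contents of the target pipeline registers.
    """
    if latency < 1:
        raise ValueError("latency must be at least one target cycle")
    return deque([initial_token] * latency)

def run_channel(latency, cycles):
    """Fire a producer and a consumer once per target cycle.

    The producer sends its own target cycle number as the token payload,
    so the received list shows which producer cycle each consumer cycle saw.
    """
    channel = make_timing_channel(latency)
    received = []
    for cycle in range(cycles):
        channel.append(cycle)               # producer firing: enqueue one token
        received.append(channel.popleft())  # consumer firing: dequeue one token
    return received

if __name__ == "__main__":
    # With latency 2, the consumer sees the producer's value two cycles late.
    print(run_channel(latency=2, cycles=5))
```

In this sketch the consumer's firing n always observes the token the producer wrote in firing n - L, regardless of how many host steps either side takes.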

Retiming to simplify host model
A shift register in the RTL can be converted into a timing token channel with the same latency.
A perfectly systolic computation in the RTL can be converted into a timing token channel with the same latency, provided the functionality of the pipeline is moved into the 'unit'.
In general, any retiming that exposes a series of shift registers allows one to convert the shift registers into a timing token channel.
[Figure: Tokenized Target containing a Multiply stage, and Retimed Tokenized Target]

Definition: firing
A token-machine unit firing corresponds to the modeling of a single target machine cycle in that unit.
A token-machine unit firing comprises:
  o Reading one token from each input channel
  o Computing based on tokens and internal state
  o Writing one token to each output channel
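
A minimal Python sketch of the firing rule follows. The class name `TokenUnit`, the `compute` callback signature, and the bounded output capacity are assumptions made for illustration; the slides define only the three steps of a firing.

```python
from collections import deque

class TokenUnit:
    """A token-machine unit: one firing models one target cycle."""

    def __init__(self, inputs, outputs, compute, state=0, capacity=4):
        self.inputs = inputs        # list of deques (input channels)
        self.outputs = outputs      # list of deques (output channels)
        self.compute = compute      # (state, tokens) -> (new_state, results)
        self.state = state
        self.capacity = capacity    # assumed bound on each output channel
        self.target_cycle = 0

    def can_fire(self):
        # A firing needs a token on every input and room on every output.
        inputs_ready = all(len(q) > 0 for q in self.inputs)
        outputs_ready = all(len(q) < self.capacity for q in self.outputs)
        return inputs_ready and outputs_ready

    def fire(self):
        if not self.can_fire():
            return False
        tokens = [q.popleft() for q in self.inputs]          # read one per input
        self.state, results = self.compute(self.state, tokens)
        for queue, result in zip(self.outputs, results):     # write one per output
            queue.append(result)
        self.target_cycle += 1
        return True

def accumulate(state, tokens):
    """Example computation: a running sum of everything received."""
    new_state = state + sum(tokens)
    return new_state, [new_state]

def demo():
    a, b, out = deque([1, 2, 3]), deque([10, 20]), deque()
    unit = TokenUnit([a, b], [out], accumulate)
    fired = [unit.fire() for _ in range(3)]
    return fired, list(out), unit.target_cycle
```

In `demo`, the third call does not fire because input `b` has run out of tokens, so the unit's target cycle count stays at 2 even though the host tried three times.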

Multi-cycle host units
The reads of all input tokens and the writes of all output tokens can each occur in different host cycles (while still reading each input and writing each output once per modeled cycle).
[Figure: Tokenized Target, Host, Multi-cycle host]
A firing can be implemented by reading all token inputs, computing, and writing all token outputs using multiple host cycles
  o This is an example of a 'multi-cycle firing' and is what allows target cycle accounting to be independent of host cycles.
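
One way to picture a multi-cycle firing is the Python sketch below. The schedule (one input read per host cycle, one compute cycle, one output write per host cycle) is an assumption chosen for illustration, as are the names `MultiCycleUnit` and `host_step`; output channels are treated as unbounded to keep the sketch short.

```python
from collections import deque

class MultiCycleUnit:
    """Spreads one firing over several host cycles.

    Assumes at least one output channel and that `compute` returns
    exactly one result per output channel.
    """

    def __init__(self, inputs, outputs, compute):
        self.inputs = inputs
        self.outputs = outputs
        self.compute = compute      # list of tokens -> list of results
        self.read_tokens = []       # tokens read so far in this firing
        self.results = None         # None until the compute step has run
        self.host_cycles = 0
        self.target_cycles = 0

    def host_step(self):
        self.host_cycles += 1
        if self.results is None:
            index = len(self.read_tokens)
            if index < len(self.inputs):
                # Read phase: take one input token, or stall if it is missing.
                if self.inputs[index]:
                    self.read_tokens.append(self.inputs[index].popleft())
                return
            # Compute phase: all inputs for this target cycle are in hand.
            self.results = deque(self.compute(self.read_tokens))
            self.read_tokens = []
            return
        # Write phase: emit one output token per host cycle.
        index = len(self.outputs) - len(self.results)
        self.outputs[index].append(self.results.popleft())
        if not self.results:
            self.results = None
            self.target_cycles += 1   # the firing is complete

def demo():
    a, b, out = deque([1, 2]), deque([10, 20]), deque()
    unit = MultiCycleUnit([a, b], [out], lambda tokens: [sum(tokens)])
    for _ in range(8):
        unit.host_step()
    return list(out), unit.host_cycles, unit.target_cycles
```

Here each firing costs four host cycles, yet the target cycle counter advances by exactly one per firing, which is the independence the slide describes.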

Pipelined Host Units
Multiple firings of a single token-machine unit can be overlapped (e.g., pipelined) so long as:
  o the token firing rules are maintained and
  o any inter-firing data dependencies internal to the token-machine unit are also maintained.
The consequence is that multiple target cycles are in flight in a host unit at the same time.

Multiplexed host units
Firings from distinct target units can be multiplexed on a single host unit
  o The multiplexed unit has a distinct copy of state for each target unit being modeled
  o The multiplexed unit must read tokens from channels associated with the proper target unit.
  o This might be accomplished by multiplexing the channels themselves. Probably simple if all communication in each target unit is to the same token-machine unit port
[Figure: Tokenized Target (Unit 1, Unit 2, Channel) and Host]
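
The multiplexing idea can be sketched in Python as follows. Round-robin selection, the single input and output channel per target unit, and the name `MultiplexedHostUnit` are all assumptions for illustration; the slides do not prescribe a scheduling policy.

```python
from collections import deque

class MultiplexedHostUnit:
    """One host unit that models several target units, one firing at a time."""

    def __init__(self, compute, target_names, initial_state=0):
        self.compute = compute                      # (state, token) -> (state, result)
        self.names = list(target_names)
        # A distinct copy of state and channels for each target unit.
        self.state = {name: initial_state for name in self.names}
        self.inputs = {name: deque() for name in self.names}
        self.outputs = {name: deque() for name in self.names}
        self.next_index = 0

    def host_step(self):
        """Fire the next target unit (round-robin) that has an input token."""
        for offset in range(len(self.names)):
            index = (self.next_index + offset) % len(self.names)
            name = self.names[index]
            if self.inputs[name]:
                token = self.inputs[name].popleft()
                self.state[name], result = self.compute(self.state[name], token)
                self.outputs[name].append(result)
                self.next_index = (index + 1) % len(self.names)
                return name
        return None   # no target unit could fire this host step

def running_sum(state, token):
    new_state = state + token
    return new_state, new_state

def demo():
    host = MultiplexedHostUnit(running_sum, ["unit1", "unit2"])
    host.inputs["unit1"].extend([1, 2])
    host.inputs["unit2"].append(100)
    fired = [host.host_step() for _ in range(4)]
    return fired, list(host.outputs["unit1"]), list(host.outputs["unit2"])
```

Because each target unit keeps its own state dictionary entry and its own queues, the running sums of `unit1` and `unit2` never mix even though one `compute` function serves both.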

Basic channel interface
A FIFO interface…
  o Send:
      out notFull;
      in [n:0] enq_data;
      in enq_en;
  o Recv:
      out notEmpty;
      out [n:0] first;
      in deq;
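
A behavioural model of this interface can be sketched in Python. The method names mirror the signal names on the slide, but the class `ChannelFifo`, the fixed `depth` parameter, and the choice to raise errors on misuse are assumptions, not part of the RAMP interface definition.

```python
from collections import deque

class ChannelFifo:
    """Software stand-in for the Send/Recv FIFO interface."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be positive")
        self.depth = depth
        self._items = deque()

    # Send side
    @property
    def not_full(self):
        return len(self._items) < self.depth

    def enq(self, data):
        # Corresponds to driving enq_data with enq_en asserted.
        if not self.not_full:
            raise RuntimeError("enq while full")
        self._items.append(data)

    # Recv side
    @property
    def not_empty(self):
        return len(self._items) > 0

    def first(self):
        if not self.not_empty:
            raise RuntimeError("first while empty")
        return self._items[0]

    def deq(self):
        if not self.not_empty:
            raise RuntimeError("deq while empty")
        self._items.popleft()
```

A sender would check `not_full` before calling `enq`, and a receiver would check `not_empty` before reading `first` and calling `deq`, which is the same handshake the signal list implies.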

Channel Interface Variants
Parallel channels (same source and destination, and same latency) can be combined into a single timing channel
  o this reduces flow control overhead
Communication on wide channels might be fragmented or packetized across multiple host cycles and internally reassembled into one token.
Unit sees flow control at fragment level, but channel guarantees delivery at the token level.
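
Fragmentation and reassembly of a wide token can be sketched in Python with plain integers. Sending the least significant fragment first is an assumption, and the function names `fragment` and `reassemble` are hypothetical.

```python
def fragment(token, token_width, fragment_width):
    """Split a token of `token_width` bits into fragments, low bits first."""
    count = -(-token_width // fragment_width)   # ceiling division
    mask = (1 << fragment_width) - 1
    return [(token >> (i * fragment_width)) & mask for i in range(count)]

def reassemble(fragments, fragment_width):
    """Rebuild the original token from fragments received in order."""
    token = 0
    for i, piece in enumerate(fragments):
        token |= piece << (i * fragment_width)
    return token

if __name__ == "__main__":
    pieces = fragment(0xDEADBEEF, token_width=32, fragment_width=8)
    print([hex(p) for p in pieces])
    print(hex(reassemble(pieces, 8)))
```

Each fragment would cross the physical link in its own host cycle, while the receiving unit only ever observes the reassembled token.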

Multiple clock domains
Simple cross-clock-domain communication can be handled with rate matchers at the fast end of the channel.
[Figure: Unit A (100 MHz) connected through a Channel to Unit B (66 MHz)]

Channel No Message
Often, as part of the process of abstracting a design into a model, there is a situation where a communication is viewed as not happening…
[Figure: example with a data signal and an enable signal]
To accommodate this situation, a channel may include explicit transmission of a 'no message' token
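
The 'no message' token can be illustrated with a short Python sketch. It assumes the target interface is a data wire qualified by an enable wire; the sentinel `NO_MESSAGE` and the helper names are hypothetical.

```python
NO_MESSAGE = object()   # sentinel token: nothing was sent in this target cycle

def to_token(enable, data):
    """Turn one target cycle of (enable, data) wires into a timing token."""
    return data if enable else NO_MESSAGE

def consume(tokens):
    """Count target cycles and collect only the real messages."""
    target_cycles = 0
    messages = []
    for token in tokens:
        target_cycles += 1              # every token advances target time
        if token is not NO_MESSAGE:
            messages.append(token)
    return target_cycles, messages

# Three target cycles; the data value in the middle cycle is ignored
# because enable is low.
wire_trace = [(1, 5), (0, 99), (1, 7)]
tokens = [to_token(enable, data) for enable, data in wire_trace]
```

The point of the sketch is that a token still flows every target cycle, so the receiver can keep firing and keep time even when no message is present.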

Interface Layers
[Figure: layer diagram. Layers: Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing. Domains: unModel domain, Model domain, communication domain. Other labels: Point-to-point, Ring, Tree, Bus; Point-to-point, One-to-many, Many-to-one; Intra-FPGA, Inter-FPGA, CPU-to-FPGA; dedicated channel, TDM (multithreaded) channel; Direct + Client/Server One-way Client/Server; Servers, Units, Services.]

Multi-layer implementations
[Figure: layer stack (Presentation, Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing) annotated with: RDL channels, Units, FAST connectors, A-ports, “Soft connections”]

Logical Topology Semantics
Represents host-level inter-module communication
Supports both model and unmodel traffic
Latency may be more than one host cycle
Multiple patterns to be supported
  o One-to-one
  o One-to-many
  o Many-to-one
Must be expressible in multiple languages
  o Bluespec, Verilog...

Pattern Examples
1-to-1
  o Timing channels
1-to-many
  o “run” command broadcast from controller
Many-to-one
  o assertion violation reporting
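
The one-to-many and many-to-one patterns can be sketched in Python with queues standing in for endpoints. The function names `broadcast` and `gather`, the unit names, and the message strings are illustrative assumptions only.

```python
from collections import deque

def broadcast(token, destinations):
    """One-to-many: copy a token to every destination endpoint."""
    for queue in destinations:
        queue.append(token)

def gather(sources):
    """Many-to-one: drain every source endpoint, tagging tokens by source."""
    merged = []
    for name, queue in sources.items():
        while queue:
            merged.append((name, queue.popleft()))
    return merged

def demo():
    # Controller broadcasts "run" to three units.
    command_queues = [deque(), deque(), deque()]
    broadcast("run", command_queues)

    # Two of the units report assertion violations back to one collector.
    reports = {
        "unit0": deque(["assertion A violated"]),
        "unit1": deque(),
        "unit2": deque(["assertion B violated"]),
    }
    return [list(q) for q in command_queues], gather(reports)
```

Tagging each gathered token with its source name is one simple way for a many-to-one receiver to tell reporters apart.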

Logical Topology Endpoint Interface
Endpoints are simply FIFOs
  o Send:
      out notFull;
      in [n:0] enq_data;
      in enq_en;
  o Recv:
      out notEmpty;
      out [n:0] first;
      in deq;
Clocking
  o endpoint has same clock as module connected to it
  o cross host clock domain communication must be supported
Configuration Meta-information
  o connection name
  o connection direction
  o connection pattern

Logical Topologies/Physical Interconnect
Example: shared ring
  o A_s at station 1 communicates with A_d at station 2
  o B_s at station 2 communicates with B_d at station 4
[Figure: ring of stations with endpoints A_s, A_d, B_s, B_d, joined by intra-FPGA links]
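
The shared-ring example can be sketched in Python as two logical connections riding on one physical ring. The ring size of four stations, the unidirectional forwarding, and the absence of link contention are assumptions for illustration; `deliver_on_ring` is a hypothetical helper.

```python
def deliver_on_ring(num_stations, packets):
    """Forward packets around a unidirectional ring, one hop per step.

    `packets` is a list of (name, source_station, destination_station),
    with stations numbered 1..num_stations. Returns hops taken per packet.
    """
    for _, src, dst in packets:
        if not (1 <= src <= num_stations and 1 <= dst <= num_stations):
            raise ValueError("station out of range")

    in_flight = [{"name": n, "at": s, "dst": d, "hops": 0} for n, s, d in packets]
    delivered = {}
    while in_flight:
        remaining = []
        for packet in in_flight:
            if packet["at"] == packet["dst"]:
                delivered[packet["name"]] = packet["hops"]
            else:
                # Move to the next station, wrapping from the last to station 1.
                packet["at"] = packet["at"] % num_stations + 1
                packet["hops"] += 1
                remaining.append(packet)
        in_flight = remaining
    return delivered

if __name__ == "__main__":
    # A: station 1 -> station 2, B: station 2 -> station 4.
    print(deliver_on_ring(4, [("A", 1, 2), ("B", 2, 4)]))
```

The logical endpoints only see their own point-to-point connection; the differing hop counts are a property of the physical network underneath.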

Interface Layers
[Figure: layer diagram. Layers: Logical Topology, Physical Network, Physical Link, Flow Control, Buffering, Timing. Domains: unModel domain, Model domain, communication domain. Other labels: Point-to-point, Ring, Tree, Bus; Point-to-point, One-to-many, Many-to-one; Intra-FPGA, Inter-FPGA, CPU-to-FPGA; dedicated channel, TDM (multithreaded) channel; Connections One-way Client/Server; Servers, Units, Services.]

Physical Network Characteristics
Host-level communication fabric
Reliable transmission
Deadlock-free
Includes buffering for meeting the above requirements
Additional buffering is provided at higher layers

Physical Link Interface Semantics
Host-level communication channel
FIFO-style interface
Decoupled input/output
Error-free (reliable delivery)
Uni-directional
Point-to-point
Packet description (TBD)
Indeterminate (but finite) latency

UnModel Support Services
Run control
  o Units can be commanded to start, stop, etc.
Dynamic parameters
  o Units can be configured at runtime
Statistics
  o Unit can collect and report event counts
Event logging
  o Unit can log a series of events for each cycle
Assertions
  o Unit can do runtime checks of invariants and report violations

Service Organization
[Figure: organization diagram with labels Stat, Dynamic Param, Local Control, Unit, Global Controller, Host CPU, Global Control, Param Controller, Stat Controller]

Servers and services interface
Service interface is implemented via separate input and output channels that handle requests and responses
Each input/output pair forms a service, which implements multiple methods
Request/response is in-order for a single service
Synchronization between calls to different services must be provided by clients.
We provide serializability of operations.
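
A service built from a request channel and a response channel can be sketched in Python. The statistics service, its `incr` and `read` methods, and the tuple message format are hypothetical examples and are not taken from the RAMP interface itself.

```python
from collections import deque

class StatServer:
    """A service: one request channel, one response channel, several methods."""

    def __init__(self):
        self.requests = deque()     # input channel: (method, counter_name)
        self.responses = deque()    # output channel: (status, counter_name, value)
        self.counters = {}

    def serve_one(self):
        """Handle the oldest request; responses leave in request order."""
        if not self.requests:
            return False
        method, name = self.requests.popleft()
        if method == "incr":
            self.counters[name] = self.counters.get(name, 0) + 1
            self.responses.append(("ok", name, self.counters[name]))
        elif method == "read":
            self.responses.append(("value", name, self.counters.get(name, 0)))
        else:
            self.responses.append(("error", name, None))
        return True

def demo():
    server = StatServer()
    server.requests.extend([
        ("incr", "hits"),
        ("incr", "hits"),
        ("read", "hits"),
        ("read", "misses"),
    ])
    while server.serve_one():
        pass
    return list(server.responses)
```

Because a single FIFO feeds the server and a single FIFO carries its replies, ordering within this one service comes for free; ordering across two such servers would still be the client's job, as the slide states.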

Build process
Handling logical endpoint connections
  o Would like to avoid requiring parents to specify connections
  o Bluespec: use static elaboration, e.g., “soft connections”
  o Verilog: use TBD preprocessor
Who maps logical connections to physical networks?
  o Locally
  o Globally
'Static' build parameters
'Dynamic' run parameters

Backup