Onchip Interconnect Exploration for Multicore Processors Utilizing FPGAs Graham Schelle and Dirk Grunwald University of Colorado at Boulder.

Outline
- Network on Chip (NoC) defined
- Current onchip interconnect tools
- NoCem (NoC Emulator) specification
- What else is needed before release
- We want it to be used… and cited
- Conclusions

Network on Chip Defined (in 1 slide!)
- Power/design concerns in modern processors lead to multicore chips
- Transistors are seen as “free”, allowing more transistors for non-computational tasks
- Network on Chip
  - Networking scales to an arbitrarily large number of access points and is well understood
  - High-speed clocking means signals cannot propagate across the chip in a single cycle

Onchip Interconnects for FPGAs
- Existing buses on FPGAs
  - PLB, OPB, FSL
  - Can have multiple masters (e.g. processors)
  - Scale well for current uses of FPGAs
- Existing NoCs
  - Research projects
  - Proprietary projects
  - Application specific (streaming…)
  - Not built for parameterization; they have some other VALID focus

NoCem Specification
- Synthesizable VHDL
  - Heavy use of generics / generate statements
  - Requires minimal Xilinx IP (FIFOs…)
- To modify anything
  - Change generics; everything is automatically generated
  - E.g. to go from a 2x2 mesh with 16b datawidth to a 4x4 torus with 8b datawidth, change 3 lines of code!
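The parameter-driven generation described on this slide can be illustrated with a small software model. This is a minimal Python sketch, not NoCem's VHDL: the names `NocConfig` and `build_links` are hypothetical, and the way links are enumerated is an assumption about what a generate statement would instantiate for a mesh versus a torus.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Node = Tuple[int, int]  # (row, column) of an access point


@dataclass(frozen=True)
class NocConfig:
    """Hypothetical stand-in for the top-level VHDL generics."""
    rows: int
    cols: int
    topology: str   # "mesh" or "torus"
    datawidth: int  # datapath width in bits


def build_links(cfg: NocConfig) -> List[Tuple[Node, Node]]:
    """Enumerate the undirected node-to-node links implied by a configuration."""
    if cfg.topology not in ("mesh", "torus"):
        raise ValueError(f"unknown topology: {cfg.topology}")
    wrap = cfg.topology == "torus"
    links = set()
    for r in range(cfg.rows):
        for c in range(cfg.cols):
            # Link to the east neighbour, or wrap around on a torus.
            if c + 1 < cfg.cols:
                links.add(((r, c), (r, c + 1)))
            elif wrap and cfg.cols > 2:  # with 2 columns the wrap link already exists
                links.add(((r, c), (r, 0)))
            # Link to the south neighbour, or wrap around on a torus.
            if r + 1 < cfg.rows:
                links.add(((r, c), (r + 1, c)))
            elif wrap and cfg.rows > 2:
                links.add(((r, c), (0, c)))
    return sorted(links)


# The slide's example: only the configuration changes, the structure is regenerated.
small = NocConfig(rows=2, cols=2, topology="mesh", datawidth=16)
large = replace(small, rows=4, cols=4, topology="torus", datawidth=8)
```

In this model the 2x2 mesh yields 4 links and the 4x4 torus yields 32, with no other code touched, which mirrors the "change the generics, regenerate everything" workflow in spirit only.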

NoCem Interface
- FIFO-ish
  - Enqueue and dequeue path for every access point
- Packet control and data paths
  - Meaning of those paths depends on NoC configuration
- Datapath
  - Only the width is variable; the length of a packet is determined by packet control
- Packet control: src, dest, packet length
  - Underlying network reads the toplevel packet structure, reading the correct fields at the correct times
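The split between packet control (src, dest, packet length) and a variable-width datapath can be sketched in software as follows. This is an illustrative Python model only: `PacketControl` and `packetize` are hypothetical names, and the byte-multiple datawidth, big-endian packing, and zero padding of the last word are assumptions, not details taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class PacketControl:
    """Control-path fields named on the slide."""
    src: int
    dest: int
    length: int  # number of datapath words in the packet


def packetize(src: int, dest: int, payload: bytes,
              datawidth: int) -> Tuple[PacketControl, List[int]]:
    """Split a payload into datawidth-bit words plus one control record.

    Assumptions (not from the source): datawidth is a multiple of 8,
    words are packed big-endian, and the final word is zero padded.
    """
    if datawidth <= 0 or datawidth % 8 != 0:
        raise ValueError("this sketch only handles byte-multiple datawidths")
    step = datawidth // 8
    words = [
        int.from_bytes(payload[i:i + step].ljust(step, b"\x00"), "big")
        for i in range(0, len(payload), step)
    ]
    # The datapath carries only words; the packet length travels in the control path.
    return PacketControl(src=src, dest=dest, length=len(words)), words


control, words = packetize(src=0, dest=3, payload=b"\x01\x02\x03\x04\x05", datawidth=16)
```

Here the same five-byte payload becomes three words on a 16b datapath and five words on an 8b datapath, while the control record keeps the same three fields in both cases.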

NoCem Bridges
- Use existing buses, bridge to NoC
  - Integration into existing Xilinx tool flows
  - NoC can look like memory, SoC, …
- Use IPIF interface
  - PLB, OPB
  - Different bus widths…
  - But processors both 32b

How Big is NoCem?

NoC Dimensions   Datawidth   LUTs     xc2vp30 LUTs used
2x2              16b         4,086    14%
3x3              16b         11,693   42%
4x4              16b         21,570   78%
2x2              32b         5,822    21%
3x3              32b         16,394   59%
4x4              32b         34,370   125%

Configuration: mesh, 16-deep channel FIFOs, RR arbitration
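The utilization column can be reproduced from the LUT counts with a short calculation. The sketch below is a worked check under one assumption: a capacity of 27,392 LUTs for the xc2vp30. That figure is not stated on the slide; it is used here because it reproduces every percentage in the table when the result is truncated to a whole number. The names `MEASUREMENTS` and `utilization_percent` are hypothetical.

```python
# Assumed xc2vp30 capacity (not given on the slide); it matches the
# table's percentages when the result is truncated, not rounded.
XC2VP30_LUTS = 27_392

# (NoC dimensions, datawidth in bits, LUTs used), copied from the table.
MEASUREMENTS = [
    ("2x2", 16, 4_086),
    ("3x3", 16, 11_693),
    ("4x4", 16, 21_570),
    ("2x2", 32, 5_822),
    ("3x3", 32, 16_394),
    ("4x4", 32, 34_370),
]


def utilization_percent(luts_used: int, device_luts: int = XC2VP30_LUTS) -> int:
    """Return LUT utilization as a whole percentage, truncated toward zero."""
    return luts_used * 100 // device_luts


def fits(luts_used: int, device_luts: int = XC2VP30_LUTS) -> bool:
    """True when the design needs no more LUTs than the device provides."""
    return luts_used <= device_luts


if __name__ == "__main__":
    for dims, width, luts in MEASUREMENTS:
        status = "fits" if fits(luts) else "does not fit"
        print(f"{dims} {width}b: {utilization_percent(luts)}% ({status})")
```

Under that assumption, the 125% entry for the 4x4 mesh at 32b means the configuration needs more LUTs than the device has, so it cannot be placed on a single xc2vp30.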

Example Uses
- Memory architecture (in paper)
  - Various distributed cache configurations
- Asymmetric processor configuration
  - Using MicroBlaze, PowerPC
- Special processor offloads
  - Floating point, network processing
All can be emulated over NoC using NoCem…

For Release
- We want NoCem to be used!
  - Already in use at CU Boulder
  - Full source will be made available online
- To do for release
  - Clean/zip up code
  - Some documentation
  - ETA: April 2006

Conclusions
- NoCem as a research tool
  - Open source
  - Non-proprietary
  - Not application specific
- NoCem for multicore processor research
  - Allows NoC exploration
  - Easy integration into Xilinx EDK flow
  - Useful for a variety of research topics in this space

Any Questions?