The Microprocessor is no longer General Purpose. Design Gap.

Presentation transcript:

The Microprocessor is no longer General Purpose

Design Gap

Problems with the Fine Grained Approach (FPGAs)
Area inefficient
  – Percentage of chip area used for wiring is far too high
Too slow
  – Unavoidable critical paths are too long
Routing and placement are very complex

Problems with Fine Grained FPGAs

Coarse Grained Reconfigurable Computing
Uses reconfigurable arrays with path widths greater than 1 bit
More area-efficient than fine grained FPGAs
Massive reduction in configuration memory and configuration time
Drastic reduction in the complexity of placement and routing

Coarse Grained Architectures: Classification
Mesh based
Linear array based
Cross-bar based

Mesh Based Architectures
Arrange PEs in a 2-D array
Encourage nearest neighbor links between adjacent PEs
E.g. KressArray, Matrix, RAW, CHESS
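
A minimal Python sketch of nearest neighbor connectivity in a 2-D PE mesh. It assumes four-way links (north, south, west, east) and no wrap-around at the array edges; the function name and both of these choices are illustrative only, since each of the architectures listed above defines its own link pattern.

```python
def mesh_neighbors(row, col, rows, cols):
    """Return the coordinates of the PEs directly adjacent to (row, col).

    Assumption: a rows x cols mesh with 4-way links and no wrap-around.
    """
    candidates = [
        (row - 1, col),  # north
        (row + 1, col),  # south
        (row, col - 1),  # west
        (row, col + 1),  # east
    ]
    # Keep only the candidates that fall inside the array.
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(mesh_neighbors(0, 0, 4, 4))  # corner PE: two neighbors
    print(mesh_neighbors(1, 1, 4, 4))  # interior PE: four neighbors
```

Under these assumptions a corner PE has two links, an edge PE three, and an interior PE four.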

Matrix – Mesh based Architecture

Architectures Based on Linear Arrays
Aimed at mapping pipelines onto linear arrays
If the pipeline has forks, longer lines spanning all or part of the array are used
E.g. RaPiD, PipeRench
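
The point about forks can be illustrated with a small Python sketch. It assumes each pipeline stage has already been assigned a position along the linear array, and it splits the pipeline's edges into those served by a link between adjacent positions and those that need a longer line. The function name, the stage names, and the position map are hypothetical and not taken from RaPiD or PipeRench.

```python
def classify_links(stage_position, edges):
    """Split pipeline edges into adjacent links and longer lines.

    stage_position: dict mapping stage name -> index along the linear array
    edges: list of (producer, consumer) stage pairs
    Assumption: data only flows toward higher positions.
    """
    adjacent, long_lines = [], []
    for src, dst in edges:
        span = stage_position[dst] - stage_position[src]
        if span <= 0:
            raise ValueError(f"edge {src}->{dst} does not flow forward")
        if span == 1:
            adjacent.append((src, dst))
        else:
            long_lines.append((src, dst))
    return adjacent, long_lines


if __name__ == "__main__":
    positions = {"a": 0, "b": 1, "c": 2, "d": 3}
    # Stage "a" forks: it feeds both "b" and "d".
    pipeline = [("a", "b"), ("b", "c"), ("a", "d"), ("c", "d")]
    print(classify_links(positions, pipeline))
```

In this example only the forked edge from "a" to "d" spans more than one position, so it is the only one that would need a longer line.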

PipeRench – Linear Array based architecture

Cross-bar Based Architectures
Communication network is easy to route
Use restricted cross-bars with hierarchical interconnect to save area
E.g. PADDI-1, PADDI-2, Pleiades

PADDI-2 – Cross-bar based architecture

Coarse Grained Architectures

EGRA
Architectural template to enable design space exploration
Cells execute expressions as opposed to single operations
Supports heterogeneous cells and various memory interfaces

EGRA

Evolution of fine grained and coarse grained architectures

EGRA – at Cell Level

Architectural Exploration

EGRA vs CGRA vs FPGA

EGRA – at array level
Organized as a mesh of cells of three types
  – RACs
  – Memories
  – Multipliers
Cells are connected using both nearest neighbor links and horizontal-vertical buses
Each cell has an I/O interface, a context memory and a core
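
A minimal Python sketch of the array-level organization described on this slide: a mesh of heterogeneous cells, each carrying its own context memory. The class names, the string labels, and the example layout are hypothetical, and the I/O interface and core are deliberately left out; this is a data-structure illustration, not the EGRA implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class CellType(Enum):
    RAC = "rac"
    MEMORY = "memory"
    MULTIPLIER = "multiplier"


@dataclass
class Cell:
    kind: CellType
    # Configuration words for this cell (placeholder representation).
    context_memory: list = field(default_factory=list)


def build_array(layout):
    """Build a 2-D mesh of cells from a grid of type labels."""
    return [[Cell(CellType(label)) for label in row] for row in layout]


def count_cells(array):
    """Count how many cells of each type the mesh contains."""
    counts = {kind: 0 for kind in CellType}
    for row in array:
        for cell in row:
            counts[cell.kind] += 1
    return counts


if __name__ == "__main__":
    layout = [
        ["rac", "rac", "memory"],
        ["rac", "multiplier", "rac"],
    ]
    print(count_cells(build_array(layout)))
```

Varying the layout grid is one simple way to describe different cell mixes when comparing candidate arrays.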

Control Unit

EGRA Operation
DMA mode
  – Used to transfer data in bursts to the EGRA
  – Used to program cells and to read/write from scratchpad memories
Execution mode
  – Control unit orchestrates data flow between cells

EGRA – at array level

Experimental Results

EGRA Memory Interface
Data register at the output of computational cells
Memory cells can be scattered around in the array
A scratchpad memory outside the reconfigurable mesh

Architectural exploration - Area

Architectural exploration - Delay

MORA

The reconfigurable Cell

Operating modes of RC

Interconnection Topology
Hierarchical
  – Level 1 is used within a 4x4 quadrant to provide nearest neighbor connectivity
  – Interleaved horizontal and vertical connectivity of length two
  – Each RC can receive data from at most two other RCs and send data to at most four other RCs
  – Data and control transfer across quadrants is guaranteed over the Level 2 interconnection
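
The limits stated on this slide can be checked with a short Python sketch. It assumes RCs are addressed by (row, column), that quadrants are aligned 4x4 blocks, and, as a simplification, that any link inside one quadrant counts as Level 1 while any link between quadrants counts as Level 2. All names are hypothetical; the limits of two inputs and four outputs per RC come from the slide.

```python
from collections import Counter

QUADRANT_SIZE = 4   # 4x4 quadrants, as stated on the slide
MAX_FAN_IN = 2      # an RC receives data from at most two other RCs
MAX_FAN_OUT = 4     # an RC sends data to at most four other RCs


def quadrant_of(rc):
    """Return the (row, col) index of the quadrant containing an RC."""
    row, col = rc
    return (row // QUADRANT_SIZE, col // QUADRANT_SIZE)


def link_level(src, dst):
    """Simplified rule: same quadrant -> Level 1, otherwise Level 2."""
    return 1 if quadrant_of(src) == quadrant_of(dst) else 2


def check_fan_limits(links):
    """Return a list of (rc, 'fan-in' | 'fan-out', count) violations."""
    fan_in = Counter(dst for _, dst in links)
    fan_out = Counter(src for src, _ in links)
    violations = []
    for rc, count in fan_in.items():
        if count > MAX_FAN_IN:
            violations.append((rc, "fan-in", count))
    for rc, count in fan_out.items():
        if count > MAX_FAN_OUT:
            violations.append((rc, "fan-out", count))
    return violations


if __name__ == "__main__":
    links = [((0, 1), (0, 0)), ((1, 0), (0, 0)), ((1, 1), (0, 0))]
    print(check_fan_limits(links))        # RC (0, 0) has three inputs
    print(link_level((3, 3), (4, 4)))     # crosses a quadrant boundary
```

A mapping tool could run a check of this kind after placement to reject configurations that exceed the per-RC input or output limit.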

Interconnection Topology

Computational Strategies
Temporal computational load balancing
Spatial computational load balancing