CprE / ComS 583 Reconfigurable Computing

Slides:



Advertisements
Similar presentations
FPGA (Field Programmable Gate Array)
Advertisements

+ CS 325: CS Hardware and Software Organization and Architecture Internal Memory.
Lecture 2: Field Programmable Gate Arrays I September 5, 2013 ECE 636 Reconfigurable Computing Lecture 2 Field Programmable Gate Arrays I.
ENGIN112 L38: Programmable Logic December 5, 2003 ENGIN 112 Intro to Electrical and Computer Engineering Lecture 38 Programmable Logic.
CS294-6 Reconfigurable Computing Day 8 September 17, 1998 Interconnect Requirements.
The Spartan 3e FPGA. CS/EE 3710 The Spartan 3e FPGA  What’s inside the chip? How does it implement random logic? What other features can you use?  What.
Lecture 3: Field Programmable Gate Arrays II September 10, 2013 ECE 636 Reconfigurable Computing Lecture 3 Field Programmable Gate Arrays II.
Registers  Flip-flops are available in a variety of configurations. A simple one with two independent D flip-flops with clear and preset signals is illustrated.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 13: February 26, 2007 Interconnect 1: Requirements.
CS294-6 Reconfigurable Computing Day 2 August 27, 1998 FPGA Introduction.
CS 151 Digital Systems Design Lecture 38 Programmable Logic.
General FPGA Architecture Field Programmable Gate Array.
CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #3 – FPGA.
EE 261 – Introduction to Logic Circuits Module #8 Page 1 EE 261 – Introduction to Logic Circuits Module #8 – Programmable Logic & Memory Topics A.Programmable.
EGRE 427 Advanced Digital Design Figures from Application-Specific Integrated Circuits, Michael John Sebastian Smith, Addison Wesley, 1997 Chapter 7 Programmable.
Caltech CS184 Winter DeHon 1 CS184a: Computer Architecture (Structure and Organization) Day 13: February 4, 2005 Interconnect 1: Requirements.
Lecture 2: Field Programmable Gate Arrays September 13, 2004 ECE 697F Reconfigurable Computing Lecture 2 Field Programmable Gate Arrays.
FPGA and CADs Presented by Peng Du & Xiaojun Bao.
DSD Presentation Introduction of Actel FPGA. page 22015/9/11 Presentation Outline  Overview  Actel FPGA Characteristic  Actel FPGA Architecture  Actel.
EGRE 427 Advanced Digital Design Figures from Application-Specific Integrated Circuits, Michael John Sebastian Smith, Addison Wesley, 1997 Chapter 4 Programmable.
J. Christiansen, CERN - EP/MIC
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR FPGA Fabric n Elements of an FPGA fabric –Logic element –Placement –Wiring –I/O.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
Programmable Logic Devices
Sept. 2005EE37E Adv. Digital Electronics Lesson 1 CPLDs and FPGAs: Technology and Design Features.
Field Programmable Gate Arrays (FPGAs) An Enabling Technology.
EE3A1 Computer Hardware and Digital Design
CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #4 – FPGA.
FPGA-Based System Design: Chapter 1 Copyright  2004 Prentice Hall PTR Moore’s Law n Gordon Moore: co-founder of Intel. n Predicted that number of transistors.
FPGA-Based System Design: Chapter 3 Copyright  2004 Prentice Hall PTR Topics n FPGA fabric architecture concepts.
A Survey of Fault Tolerant Methodologies for FPGA’s Gökhan Kabukcu
Programmable Logic Devices
1 Introduction to Engineering Fall 2006 Lecture 17: Digital Tools 1.
Field Programmable Gate Arrays
Memory and Programmable Logic
ETE Digital Electronics
Prof. Hsien-Hsin Sean Lee
Sequential Programmable Devices
Sequential Logic Design
Memories.
COMP211 Computer Logic Design
Reconfigurable Architectures
Topics SRAM-based FPGA fabrics: Xilinx. Altera..
Recap DRAM Read Cycle DRAM Write Cycle FAST Page Access Mode
Give qualifications of instructors: DAP
Chapter 7.2 Computer Architecture
Overview The Design Space Programmable Implementation Technologies
Instructor: Dr. Phillip Jones
From Silicon to Microelectronics Yahya Lakys EE & CE 200 Fall 2014
Memory Units Memories store data in units from one to eight bits. The most common unit is the byte, which by definition is 8 bits. Computer memories are.
Chapter 11 Sequential Circuits.
CprE / ComS 583 Reconfigurable Computing
We will be studying the architecture of XC3000.
Multiple Drain Transistor-Based FPGA Architectures
CprE / ComS 583 Reconfigurable Computing
The Xilinx Virtex Series FPGA
Physical Implementation Manufactured IC Technologies
Topics Antifuse-based FPGA fabrics: Flash-based FPGAs Actel.
Semiconductor Memories
Topics Circuit design for FPGAs: Logic elements. Interconnect.
Islamic University - Gaza
Lecture 11 Logistics Last lecture Today HW3 due now
CprE / ComS 583 Reconfigurable Computing
The Xilinx Virtex Series FPGA
Give qualifications of instructors: DAP
ECE 352 Digital System Fundamentals
FIGURE 5-1 MOS Transistor, Symbols, and Switch Models
CprE / ComS 583 Reconfigurable Computing
Programmable logic and FPGA
Reconfigurable Computing (EN2911X, Fall07)
Presentation transcript:

CprE / ComS 583 Reconfigurable Computing Prof. Joseph Zambreno Department of Electrical and Computer Engineering Iowa State University Lecture #3 – FPGA Basics

CprE 583 – Reconfigurable Computing Quick Points HW #1 is out Due Thursday, September 7 (12pm) Submission via WebCT Possible strategies Standard Preferred Effort Level Assigned Due August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Recap FPGAs – spatial computation CPU – temporal computation FPGAs are by their nature more computationally “dense” than CPU In terms of number of computations / time / area Can be quantitatively measured and compared Capacity, cost, ease of programming still important issues Numerous challenges to reconfiguration August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Outline Recap FPGA Taxonomy Lookup Tables and Digital Logic Interconnect / Routing Structures FPGA Architectural Issues August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing FPGA Taxonomy Programming technology – how is the FPGA programmed? Where does it store configuration bits? SRAM Anti-fuse EPROM Flash memory (EEPROM) Logic cell architecture – what is the granularity of configurable component? Tradeoff between complexity and versitility Transistors Gates PAL/PLAs LUTs CPUs Interconnect architecture – how do the logic cells communicate? Tiled Hierarchical Local August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Anti-Fuse Technology Dielectric that prevents current flow Applying a voltage melts the dielectric One time programmable – not really reconfigurable computing August 29, 2006 CprE 583 – Reconfigurable Computing

Anti-Fuse Technology (cont.) Negligible programming overhead Fast routing Nonvolatile (hold data after power off) High level of security: No bitstream can be intercepted in the field Need a Scanning Electron Microscope (SEM) to figure out the anti-fuse states © Actel CprE 583 – Reconfigurable Computing August 29, 2006

CprE 583 – Reconfigurable Computing E(E)PROM Technology To program a transistor, a voltage differential between the access/floating gates accelerates electrons from the source fast enough to travel across the insulator to the floating gate Electrons prevent the access gate from closing EPROM – Erasable Programmable Read-Only Memory Nonvolatile Can be erased using UV light EEPROM – Electrically Erasable Programmable Read-Only Memory Removes the electrons by reversing the voltage differential Limited number of erases possible Precursor to Flash technology August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Flash / EEPROM Devices Migrated from early PLD technology Traditionally based on AND/OR architecture High ratio of logic-to-registers Future of Flash devices? Logic elements (LUTs and flip flops) Segmented routing Low logic to register ratio Actel ProASIC Plus logic cell CprE 583 – Reconfigurable Computing August 29, 2006

CprE 583 – Reconfigurable Computing SRAM Technology SRAM – Static Random Access Memory SRAM cells are larger (6 transistors) than anti-fuse or EEPROM Slower Less computational power per λ2 SRAM bits can be programmed many times August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Which Devices to Study? SRAM – ~95%, Flash/Anti-fuse – ~5% August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Lookup Tables (LUTs) What is a Lookup Table (LUT)? In most generic terms, a pre-loaded memory Great way of implementing some function without calculation – a cheat sheet int table[256] = {1,7,8,9,…14}; int Q, addr; ... Q = table[addr]; Data Addr Q R_W Answer Key . A(Q7) = 42 Q7: What is the answer to life, the universe, and everything? August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing LUTs and Digital Logic I1 I2 … Ik Cin carry logic Cout k-LUT DFF OUT Each k-LUT operates on k one-bit inputs Output is one data bit Can perform any Boolean function of k inputs August 29, 2006 CprE 583 – Reconfigurable Computing

LUTs and Digital Logic (cont.) k inputs  2k possible input values k-LUT corresponds to 2k x 1 bit memory How many different functions? Two-input (A1,A2) case F(A1,A2) = {0, A1 nor A2, Ā1 and A2, Ā1, A1 and Ā2, Ā2, A1 xor A2, A1 nand A2, A1 and A2, A1 xnor A2, A2, Ā1 or A2, A1, A1 or Ā2, A1 or A2, 1} = 16 different possibilities What to store in the LUT? 0 0 0 1 1 0 1 1 A1 A2 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 CprE 583 – Reconfigurable Computing August 29, 2006

CprE 583 – Reconfigurable Computing LUTs and Digital Logic F(input) can be 0 or 1 independently for each of the 2k bits 22k possible functions Not all patterns are unique (k! permutations of the inputs) Approximately 22k / k! unique functions Is this efficient? How to select k value? 0 0 0 1 1 0 1 1 A1 A2 1 1 2 3 1 4 5 6 7 8 9 10 11 12 13 14 15 1 1 1 1 1 1 1 1 1 1 1 1 1 August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing LUT Mapping How many gates in a 2-LUT + flip-flop? D S Q 1 0 0 0 1 1 0 1 1 A1 A2 x0 x1 x2 x3 f DFF Q’ R ~8 0.5 1.5 4 CprE 583 – Reconfigurable Computing August 29, 2006

CprE 583 – Reconfigurable Computing LUT Mapping (cont.) Implement the following logic function with k-LUTs, for k={2, 3, 4}: F = A0A1A3 + A1A2Ā3 + Ā0 Ā1 Ā2 A0 A1 A2 A3 F A0 A3 A2 A1 F A0 A2 A3 F A1 CprE 583 – Reconfigurable Computing August 29, 2006

Logic Block Granularity Each k-LUT requires 2k configuration bits: The 2-LUT implementation requires 22 x 7 = 28 bits The 3-LUT needs 23 x 3 = 24 bits The 4-LUT needs just 24 x 1 = 16 bits Using configuration bits as area measure (area cost), the 4-LUT implementation achieves minimum logic area August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Interconnect Problem: Thousands of operators producing results Each taking as outputs the results of other bit operators Initial assumptions – have to connect them all simultaneously August 29, 2006 CprE 583 – Reconfigurable Computing

Interconnect Design Issues Flexibility – route anything (within reason?) Bisection bandwidth Simultaneous routes Area Switches Delay (and Power) Switches in path Wire length Routability – difficulty of finding a route August 29, 2006 CprE 583 – Reconfigurable Computing

General Routing Architecture A wire segment is a wire unbroken by programmable switches Typically one switch is attached to each end of a wire segment A track is a sequence of one or more wire segments in a line A routing channel is a group of parallel tracks A connection block provides connectivity from the inputs and outputs of a logic block to the wire segments in the channels August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Attempt 1 – Crossbar Any operator may want to take as input the output of any other operator Let n be the number of LUTs, and l the length of wire Delay: Parasitic loads = kn Switches/path = l Wire length = O(kn) Delay = O(kn) Area: Bisection BW = n Switches = kn2 Area = O(n2) August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Attempt 2 – Mesh Put connected components together Transitive fan-in grows faster than the close sites: Let w be the channel width Delay: Switches/path = l Wire length = O(l) Best Delay = O(w) Worst Delay = O(n½w) Area: Bisection BW = wn½ Switches = O(nw2) Area = O(nw2) August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Segmentation X Y Length 4 Length 2 Length 1 Segmentation distribution: how many of each length? Longer length Better performance? Reduced routability? CprE 583 – Reconfigurable Computing August 29, 2006

Interconnect Area/Delay Interconnect area = ~10x configuration memory = ~100x logic The interconnect constitutes ~90% of the total area and 70-80% of the total delay See [Deh96A] for details August 29, 2006 CprE 583 – Reconfigurable Computing

Architectural Issues [AhmRos04A] What values of N, I, and K minimize the following parameters? Area Delay Area-delay product Assumptions All routing wires length 4 Fully populated IMUX Wiring is half pass transistor, half tri-state 180 nm Routing performed with Wmin + 30% tracks August 29, 2006 CprE 583 – Reconfigurable Computing

CprE 583 – Reconfigurable Computing Summary Three basic types of FPGA devices Antifuse EEPROM SRAM Key issues for SRAM FPGA are logic cluster, connection box, and switch box. Most tasks have structure, which can be exploited by LUT arrays with programmable interconnect (FPGAs) Cannot afford full interconnect, or make all interconnects local August 29, 2006 CprE 583 – Reconfigurable Computing