UNIT VI SUBSYSTEM DESIGN PROCESSES AND ILLUSTRATION

INTRODUCTION
Objectives:
- Design considerations, problems and solutions
- Design processes
- Basic digital processor structure
- Datapath
- Bus architecture
- Design of a 4-bit shifter
- Design of an ALU subsystem
- Adders
- Multipliers

GENERAL CONSIDERATIONS
- Lower unit cost
- Higher reliability
- Lower power dissipation, lower weight and lower volume
- Better performance
- Enhanced repeatability
- Possibility of reduced design/development periods

SOME PROBLEMS
1. How to design complex systems in a reasonable time and with reasonable effort.
2. The nature of architectures best suited to take full advantage of VLSI and the technology.
3. The testability of large/complex systems once implemented on silicon.

SOME SOLUTIONS
Problems 1 and 3 are greatly reduced if two aspects of standard practice are accepted:
1. A top-down design approach with adequate CAD tools to do the job, which involves:
   a) partitioning the system sensibly
   b) aiming for simple interconnections
   c) high regularity within subsystems
   d) generating and then verifying each section of the design
2. Devoting a significant portion of the total chip area to test and diagnostic facilities.
Problem 2 is addressed by selecting architectures that allow the design objectives to be met with high regularity in realization.

ILLUSTRATION OF DESIGN PROCESSES
- Structured design begins with the concept of hierarchy.
- Any complex function can be divided into less complex subfunctions, down to the level of leaf cells. This process is known as top-down design.
- As a system's complexity increases, its organization changes as different factors become relevant to its creation.
- Coupling can be used as a measure of how much submodules interact.
- Components that interact with high frequency must be physically close to each other, since long, high-bandwidth interconnects carry severe penalties.
- Concurrency should be exploited: it is desirable that all gates on the chip do useful work most of the time.
- Because technology changes so fast, adaptation to a new process must occur in a short time.

Approaches used at Different Stages
- Conventional circuit symbols
- Logic symbols
- Stick diagrams
- Any mixture of logic symbols and stick diagrams that is convenient at a given stage
- Mask layouts
- Architectural block diagrams and floor plans

General Arrangement of a 4-bit Arithmetic Processor
Figure 6.1: Basic digital processor structure

Figure 6.2: Communication strategy for the datapath

Figure 6.3: Subunits and basic interconnection for datapath

Figure 6.4: One-bus architecture
Sequence:
1. The first operand is sent from the registers to the ALU and stored there.
2. The second operand is sent from the registers to the ALU and added.
3. The result is passed through the shifter and stored in the registers.

Figure 6.5: Two-bus architecture
Sequence:
1. The two operands (A and B) are sent from the register(s) to the ALU and operated upon; the result (S) is held in the ALU.
2. The result is passed through the shifter and stored in the registers.

Figure 6.6: Three-bus architecture
Sequence: The two operands (A and B) are sent from the registers and operated upon, and the shifted result (S) is returned to another register, all in the same clock period.

Figure 6.7: Tentative floor plan for 4-bit datapath

Points to be noted for design:
- Metal can cross poly or diffusion.
- Poly crossing diffusion forms a transistor.
- Whenever lines touch on the same level, an interconnection is formed.
- Simple contacts can be used to join diffusion or poly to metal.
- Buried contacts or butting contacts can be used to join diffusion and poly.
- Some processes use a second metal layer.
- First and second metal layers may be joined using a via.
- Each layer has particular electrical properties which must be taken into account.
- For CMOS layouts, p- and n-diffusion wires must not directly join each other, nor may they cross either a p-well or an n-well boundary.

Design of a 4-bit Shifter
Any general-purpose n-bit shifter should be able to shift incoming data by up to (n - 1) places in a right-shift or left-shift direction. If we further specify that all shifts are on an end-around basis, so that any bit shifted out at one end of a data word is shifted in at the other end of the word, then the problem of right shift or left shift is greatly eased.
The shifter must have:
- input from a four-line parallel data bus
- four output lines for the shifted data
- a means of transferring input data to output lines with any shift from 0 to 3 bits
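The end-around requirement means that a left shift by k places is the same as a right shift by n - k places, so a single shift direction is sufficient. The following is a minimal Python sketch of that observation, assuming the data word is held as an integer; the function names are illustrative and do not come from the slides.

```python
def rotate_right(word, shift, n=4):
    """End-around right shift of an n-bit word by `shift` places."""
    mask = (1 << n) - 1
    shift %= n
    word &= mask
    # Bits that fall off the right end re-enter at the left end.
    return ((word >> shift) | (word << (n - shift))) & mask


def rotate_left(word, shift, n=4):
    """End-around left shift, expressed as a right shift by n - shift."""
    return rotate_right(word, (n - shift) % n, n)


if __name__ == "__main__":
    for k in range(4):
        print(k, format(rotate_right(0b0001, k), "04b"),
              format(rotate_left(0b0001, k), "04b"))
```

With a 4-bit word, shifts of 0 to 3 places in one direction therefore cover every left and right end-around shift.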

Figure 6.8: 4 × 4 crossbar switch using MOS

Figure 6.9: 4 × 4 barrel shifter
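The relation between a crossbar switch and a barrel shifter can be sketched behaviourally. In the Python model below, a crossbar is a 4 × 4 matrix of independent switches, while the barrel shifter closes one diagonal group of switches per shift amount, so only four shift lines are needed. The wiring convention (which diagonal corresponds to which shift line, and the shift direction) is an assumption made for illustration, not read off the figures.

```python
def crossbar(inputs, switches):
    """Route inputs to outputs; switches[i][j] closed connects input i to output j."""
    n = len(inputs)
    outputs = []
    for j in range(n):
        drivers = [inputs[i] for i in range(n) if switches[i][j]]
        if len(drivers) != 1:
            raise ValueError("each output needs exactly one closed switch")
        outputs.append(drivers[0])
    return outputs


def barrel_switches(shift, n=4):
    """Switch pattern when only shift line `shift` is active (assumed wiring).

    Switch (i, j) is closed when i == (j + shift) mod n, i.e. one diagonal.
    """
    return [[i == (j + shift) % n for j in range(n)] for i in range(n)]


def barrel_shift(inputs, shift):
    """End-around shift built from the crossbar with diagonal-coupled switches."""
    return crossbar(inputs, barrel_switches(shift, len(inputs)))


if __name__ == "__main__":
    for s in range(4):
        print(s, barrel_shift(["in0", "in1", "in2", "in3"], s))
```

The general crossbar needs 16 separate control signals, whereas the diagonal coupling reduces this to one control line per shift amount.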

Summary of Design Processes
1. Set out the specifications.
2. Partition the architecture into subsystems.
3. Set a tentative floor plan.
4. Determine the interconnects.
5. Choose layers for the bus and control lines.
6. Conceive a regular architecture.
7. Develop stick diagrams.
8. Produce mask layouts for the standard cells.
9. Cascade and replicate standard cells as required to complete the design.

COMPUTATIONAL ELEMENTS
Design of an ALU Subsystem
Figure 6.10: 4-bit data path for processor

Design of a 4-bit adder
From the table, one form of the equations is:
Sum: $S_k = H_k C_{k-1}' + H_k' C_{k-1}$
New carry: $C_k = A_k B_k + H_k C_{k-1}$
where the half sum is $H_k = A_k' B_k + A_k B_k'$ and a prime denotes the complement.
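As a check on the sum, carry and half-sum equations, the sketch below evaluates them bit by bit in Python and ripples the carry through four stages. The function names and the LSB-first bit ordering are illustrative choices, not part of the original material.

```python
def full_adder(a, b, c_prev):
    """One adder stage using the half-sum form of the equations."""
    h = ((1 - a) & b) | (a & (1 - b))            # H_k = A_k'B_k + A_k B_k'
    s = (h & (1 - c_prev)) | ((1 - h) & c_prev)  # S_k = H_k C_{k-1}' + H_k' C_{k-1}
    c = (a & b) | (h & c_prev)                   # C_k = A_k B_k + H_k C_{k-1}
    return s, c


def to_bits(value, n=4):
    """Integer to list of n bits, least significant bit first."""
    return [(value >> k) & 1 for k in range(n)]


def add_4bit(a, b, c_in=0):
    """Ripple the carry through four adder stages; returns (sum, carry_out)."""
    carry = c_in
    total = 0
    for k, (ak, bk) in enumerate(zip(to_bits(a), to_bits(b))):
        s, carry = full_adder(ak, bk, carry)
        total |= s << k
    return total, carry


if __name__ == "__main__":
    print(add_4bit(9, 8))  # 4-bit sum and carry out
```

Checking every pair of 4-bit operands against ordinary integer addition confirms that the equations describe a binary adder.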

Adder element requirement
The table reveals that the adder requirement may be stated as:
If $A_k = B_k$ then $S_k = C_{k-1}$, else $S_k = C_{k-1}'$.
And for the carry $C_k$:
If $A_k = B_k$ then $C_k = A_k = B_k$, else $C_k = C_{k-1}$.
Thus the standard 1-bit adder element is as shown in Figure 6.11.
Figure 6.11: Adder element
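The if/else statement of the adder requirement is what makes a multiplexer-based element possible: the comparison of the two operand bits selects between the incoming carry, its complement, and the operand bit itself. A small Python sketch (the function name is illustrative) checks this formulation against arithmetic addition for all eight input combinations.

```python
def adder_element(a, b, c_prev):
    """1-bit adder stated as selections controlled by whether a equals b."""
    if a == b:
        s = c_prev       # sum follows the incoming carry
        c = a            # carry is known without looking at c_prev
    else:
        s = 1 - c_prev   # sum is the complement of the incoming carry
        c = c_prev       # incoming carry passes straight through
    return s, c


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            for c_prev in (0, 1):
                s, c = adder_element(a, b, c_prev)
                print(a, b, c_prev, "->", s, c)
```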

Figure 6.12: Multiplexer-based adder

Figure 6.13: CMOS-based adder

Standard cells required for adder
Figure 6.14: Multiplexer cell with or without cut

Figure 6.15: NMOS (butting contact) inverters

Figure 6.16: NMOS (buried contact) inverters

Figure 6.17: CMOS inverter design

Adder element bounding box
Figure 6.18: Approximate bounding box and floor plan for CMOS adder element

Figure 6.19: 4-bit adder element

Implementing ALU functions with an adder
The adder equations are:
Sum: $S_k = H_k C_{k-1}' + H_k' C_{k-1}$
New carry: $C_k = A_k B_k + H_k C_{k-1}$
Half sum: $H_k = A_k' B_k + A_k B_k'$
Consider the sum output first. If the previous carry is at logical 0, then
$S_k = H_k \cdot 1 + H_k' \cdot 0 = H_k = A_k' B_k + A_k B_k'$, an exclusive-OR operation.
If $C_{k-1}$ is at logical 1, then
$S_k = H_k \cdot 0 + H_k' \cdot 1 = H_k'$, an exclusive-NOR operation.
Next, consider the carry output of each element. If $C_{k-1}$ is held at logical 0, then
$C_k = A_k B_k + H_k \cdot 0 = A_k B_k$, an AND operation.
If $C_{k-1}$ is held at logical 1, then
$C_k = A_k B_k + H_k \cdot 1 = A_k B_k + A_k' B_k + A_k B_k' = A_k + B_k$, an OR operation.
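The four logic functions obtained by forcing the carry input can be confirmed with a short Python sketch. It evaluates the adder equations with the previous carry held at 0 and at 1 and compares the outputs with the bitwise operators; the function names and the returned dictionary are illustrative.

```python
def adder_element(a, b, c_prev):
    """Sum and carry of one adder element, from the half-sum equations."""
    h = a ^ b                                    # half sum H_k
    s = (h & (1 - c_prev)) | ((1 - h) & c_prev)  # S_k
    c = (a & b) | (h & c_prev)                   # C_k
    return s, c


def logic_ops(a, b):
    """Logic functions available from one element by forcing its carry input."""
    s0, c0 = adder_element(a, b, 0)   # carry input held at logical 0
    s1, c1 = adder_element(a, b, 1)   # carry input held at logical 1
    return {"xor": s0, "xnor": s1, "and": c0, "or": c1}


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, logic_ops(a, b))
```

In a 4-bit ALU this implies that the carry inputs of the individual elements must be controllable independently of the ripple chain when a logic operation is selected.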

Figure 6.20: 1-bit adder element and 4-bit ALU

Further Consideration of Adders
Generation: the principle of generation allows the system to take advantage of the occurrences of $A_k = B_k$. In that case the output carry $C_k$ equals $A_k$ whatever the input carry.
Propagation: if we can localize a chain of bits $A_k A_{k+1} \dots A_{k+p}$ and $B_k B_{k+1} \dots B_{k+p}$ for which $A_i \ne B_i$ for every $i$ in $[k, k+p]$, then the output carry bit of this chain is equal to the input carry bit of the chain.
These remarks constitute the principle of generation and propagation used to speed up the addition of two numbers. All adders that use this principle calculate, in a first stage:
$P_k = A_k \oplus B_k$
$G_k = A_k B_k$
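A short Python sketch can make the propagate and generate signals concrete. It computes P and G in a first stage and then forms each carry with the standard recurrence C_k = G_k + P_k C_{k-1}, which is stated here as a known identity rather than taken from the slides. Bit lists are LSB first and all names are illustrative.

```python
def propagate_generate(a_bits, b_bits):
    """First stage: P_k = A_k xor B_k and G_k = A_k and B_k for every bit."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    g = [a & b for a, b in zip(a_bits, b_bits)]
    return p, g


def carries(a_bits, b_bits, c_in=0):
    """Carry out of every stage using C_k = G_k + P_k * C_{k-1}."""
    p, g = propagate_generate(a_bits, b_bits)
    c = c_in
    out = []
    for pk, gk in zip(p, g):
        c = gk | (pk & c)
        out.append(c)
    return out


if __name__ == "__main__":
    # Every bit pair differs, so the whole word is one propagate chain.
    a, b = [1, 0, 1, 0], [0, 1, 0, 1]
    print(carries(a, b, 0)[-1], carries(a, b, 1)[-1])
```

When every bit pair in a chain differs, the carry leaving the chain simply copies the carry entering it, which is the property that carry skip schemes exploit.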

Figure 6.21: CMOS adder element using the pass/generate concept

The Manchester Carry Chain
If the carry path is precharged to VDD, the transmission gate reduces to a simple NMOS transistor. In the same way, the PMOS transistors of the carry generation circuit are removed. A single Manchester cell is very fast, but a long chain of cascaded cells is slow: the distributed RC effect and the body effect make the propagation time grow with the square of the number of cells.
Figure 6.22: Manchester carry-chain element

Figure 6.23: Cascaded Manchester carry-chain elements with buffering
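The quadratic growth of the Manchester chain delay, and the reason for inserting buffers, can be illustrated with a first-order RC ladder. The sketch below uses the Elmore delay approximation with one identical resistance and capacitance per cell. That model, the unit values and the buffer delay are assumptions chosen for illustration; the body effect is ignored.

```python
def elmore_delay(n_cells, r=1.0, c=1.0):
    """Elmore delay to the end of a chain of n identical pass-transistor cells.

    The capacitance at node i is charged through i series resistances,
    so the total is r*c*(1 + 2 + ... + n) = r*c*n*(n + 1)/2.
    """
    return sum(i * r * c for i in range(1, n_cells + 1))


def buffered_delay(n_cells, segment, t_buf, r=1.0, c=1.0):
    """Delay when a buffer of delay t_buf follows every `segment` cells."""
    full, rest = divmod(n_cells, segment)
    delay = full * (elmore_delay(segment, r, c) + t_buf)
    if rest:
        delay += elmore_delay(rest, r, c)
    return delay


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, elmore_delay(n), buffered_delay(n, 4, t_buf=2.0))
```

Under these assumptions, doubling the unbuffered chain length roughly quadruples the delay, while the buffered chain grows linearly with the number of segments.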

Adder Enhancement Techniques
Carry select adders
Figure 6.24: Carry select adder structure

Figure 6.25: Carry select adder structure (6-bit)

Optimization of the carry select adder
Computation time: $T = k_1 n$, where $k_1$ is the delay through one adder cell.
Dividing the adder into blocks with 2 parallel paths: $T = k_1 n/2 + k_2$, where $k_2$ is the time needed by the multiplexer of the next block to select the actual output carry.
For an n-bit adder of M blocks, each containing P adder cells in series: $T = P k_1 + (M - 1) k_2$, with $n = M P$.
The minimum value of T occurs when $M = \sqrt{k_1 n / k_2}$.
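The optimum block count follows from substituting P = n/M into the delay expression, which gives T(M) = n k1/M + (M - 1) k2; setting dT/dM = -n k1/M^2 + k2 to zero yields M = sqrt(k1 n / k2). The Python sketch below compares that closed form with a brute-force search over block counts. The delay values k1 = k2 = 1 and the function names are arbitrary illustrative choices.

```python
import math


def carry_select_delay(n, m, k1, k2):
    """T = P*k1 + (M - 1)*k2 with P = n / M adder cells per block."""
    return (n / m) * k1 + (m - 1) * k2


def optimal_blocks(n, k1, k2):
    """Closed-form optimum M = sqrt(k1 * n / k2)."""
    return math.sqrt(k1 * n / k2)


def best_divisor_blocks(n, k1, k2):
    """Brute-force search over block counts that divide n evenly."""
    candidates = [m for m in range(1, n + 1) if n % m == 0]
    return min(candidates, key=lambda m: carry_select_delay(n, m, k1, k2))


if __name__ == "__main__":
    n, k1, k2 = 64, 1.0, 1.0
    print(optimal_blocks(n, k1, k2), best_divisor_blocks(n, k1, k2))
```

In practice M must be an integer that divides n (or the blocks must have unequal sizes), so the closed form serves as a starting point rather than a final answer.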

Carry skip adders
Figure 6.26: Carry skip adder structure

Figure 6.27: Carry skip adder structure

Figure 6.28: Carry skip adder structure (24-bit)

Optimization of the carry skip adder
Suppose the total adder is made of n adder cells, arranged as M blocks of P adder cells each, so that $n = M P$.
The time T needed by the carry signal to propagate through P adder cells is $T = k_1 P$.
The time T' needed by the carry signal to skip through M adder blocks is $T' = k_2 M$.
The problem is to minimize the worst-case delay:
$T_{worst} = 2(P - 1) k_1 + (M - 2) k_2$, where $P = n/M$.
$T_{worst}$ is minimum when $M = \sqrt{2 n k_1 / k_2}$.
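With P = n/M the worst-case delay becomes T(M) = 2(n/M - 1) k1 + (M - 2) k2, and setting dT/dM = -2 n k1/M^2 + k2 to zero gives M = sqrt(2 n k1 / k2). The Python sketch below checks this numerically; the skip term is multiplied by k2 as in the definition of T', and the parameter values and function names are illustrative assumptions.

```python
import math


def carry_skip_worst_delay(n, m, k1, k2):
    """T_worst = 2*(P - 1)*k1 + (M - 2)*k2 with P = n / M."""
    p = n / m
    return 2 * (p - 1) * k1 + (m - 2) * k2


def optimal_blocks(n, k1, k2):
    """Closed-form optimum M = sqrt(2 * n * k1 / k2)."""
    return math.sqrt(2 * n * k1 / k2)


if __name__ == "__main__":
    n, k1, k2 = 32, 1.0, 1.0
    m_opt = optimal_blocks(n, k1, k2)
    print(m_opt, carry_skip_worst_delay(n, m_opt, k1, k2))
    for m in (4, 8, 16):
        print(m, carry_skip_worst_delay(n, m, k1, k2))
```

The factor of 2 reflects the carry rippling through the first block and again through the last block, with the intermediate blocks skipped.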

Figure 6.29: Worst-case carry propagation in a carry skip adder

Figure 6.30: Block propagation in a carry skip adder

Carry look-ahead (CLA) adders
Figure 6.31: Carry look-ahead adder structure

Figure 6.32: Carry look-ahead and ripple through compromise

Figure 6.33: 4-bit carry look-ahead adder unit

Figure 6.34: 16-bit, 4 × 4 block carry look-ahead adder unit

Figure 6.35: Generation of carry out (from 4 bits and carry in)

Figure 6.36: Four-cell Manchester carry chain

Introduction to CPLDs and FPGAs

CPLD Families

CPLD Block Diagram
- Function block (FB): roughly a PLA with one output that can be registered in a flip-flop (FF).
- Crossbar switch: a programmable switch for interconnecting the various FBs, with inputs (I/Ps) and outputs (O/Ps).
- An individual switch in a crossbar is a diamond switch.

CPLD Function Block
- Literal inputs (e.g., a, b, c) feed a PLA-like AND array.
- Extra function inputs (e.g., g, h) feed the OR term.
- Example function: f = ab + bc' + g + h
- The block also contains a 2:1 mux and a D-FF.

Field Programmable Gate Arrays (FPGAs)

FPGA Types (Anti-fuse technology)

FPGA Families

SRAM-type FPGA Interconnect Architecture
- CLB: Configuration Logic Block (programmable logic cell).
- Horizontal routing (interconnect) channels and vertical routing channels.
- PSM: Programmable Switch Matrix, for making connections between interconnects of different channels. The structure shown only allows i-to-i connections.
- Diamond switch.

SRAM-type FPGA Interconnect Architecture (contd)
Figure labels: Cell Connection Matrix (CCM), PSM

Configuration Logic Block (CLB)
A 5-input function is implemented using the G, F and H LUTs (Look-Up Tables) via Shannon's expansion:
$p(a,b,c,d,e) = a \cdot p(1,b,c,d,e) + a' \cdot p(0,b,c,d,e) = a \cdot q(b,c,d,e) + a' \cdot r(b,c,d,e)$
q is implemented using LUT G, r is implemented using LUT F, and $p = a \cdot q + a' \cdot r$ is implemented using LUT H.
The LUT outputs can go through a FF (for sequential circuit design) or bypass it for a combinational output.
This is called technology mapping: mapping the logic to CLB logic components.
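The Shannon expansion mapping can be sketched in Python by treating each LUT as a list of truth-table bits. Two 4-input tables hold the cofactors q and r, and a 3-input table combines them with a. The table layout, the helper names and the example function are assumptions for illustration and do not describe any particular device's configuration format.

```python
from itertools import product


def build_lut(func, n_inputs):
    """Truth table of func, with the first input as the most significant index bit."""
    return [func(*bits) for bits in product((0, 1), repeat=n_inputs)]


def lut_read(lut, *bits):
    """Look up the stored output for the given input bits."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return lut[index]


def map_to_clb(p):
    """Split a 5-input function p(a, b, c, d, e) across the G, F and H tables."""
    lut_g = build_lut(lambda b, c, d, e: p(1, b, c, d, e), 4)   # q = p with a = 1
    lut_f = build_lut(lambda b, c, d, e: p(0, b, c, d, e), 4)   # r = p with a = 0
    lut_h = build_lut(lambda a, q, r: (a & q) | ((1 - a) & r), 3)
    return lut_g, lut_f, lut_h


def clb_eval(luts, a, b, c, d, e):
    """Evaluate the mapped function by reading G and F, then H."""
    lut_g, lut_f, lut_h = luts
    q = lut_read(lut_g, b, c, d, e)
    r = lut_read(lut_f, b, c, d, e)
    return lut_read(lut_h, a, q, r)


if __name__ == "__main__":
    example = lambda a, b, c, d, e: (a & b) ^ (c | d) ^ e  # arbitrary test function
    luts = map_to_clb(example)
    print(all(clb_eval(luts, *bits) == example(*bits)
              for bits in product((0, 1), repeat=5)))
```

Note that the H table is the same 2:1 selection for every mapped function; only the G and F contents depend on p.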

Technology Mapping

Programming a CLB (contd)

Components of Modern FPGAs

Digital System: Implementation Spectrum
Microprocessor / Reconfigurable Hardware / ASIC, corresponding to Software / Firmware / Hardware.
- An ASIC gives high performance at the cost of inflexibility.
- A processor is very flexible but not tuned to the application.
- Reconfigurable hardware is a nice compromise.

Simplified FPGA Logic Element

High-level Compilers & FPGAs
- It is difficult to estimate hardware resources.
- Some parts of a program are more appropriate for a processor (hardware/software codesign).
- The compiler must parallelize computation across many resources.
- Engineers like to write in C/VHDL/Verilog rather than pushing little blocks around, for example:
  for (i = 0; i < n; i++) { c[i] = a[i] + b[i]; }
- Some success stories.

Translating a Design to an FPGA
RTL (C = A + B) → circuit (adder with inputs A, B and output C) → array
- CAD to translate a circuit from a text description to a physical implementation is well understood.
- Most current FPGA designers use register-transfer level specification (allocation and scheduling).
- Same basic steps as ASIC design.

Circuit Compilation & Implementation: Basic Steps
1. Technology mapping.
2. Placement: assign a logical LUT to a physical location.
3. Routing: select wire segments and switches for interconnection.
4. Convert all implementation "details" to FPGA programming info (configuration bits): LUT RAM bits, CCM & PSM FF/SRAM bits, etc.
The configuration bits can be stored on disk or ROM and loaded into the FPGA as needed. The FPGA can thus be used to implement multiple digital systems (at different times, or sometimes simultaneously in different FPGA partitions).

Technology Mapping: A Simple Example
A + B = D, made of full adders (FA) with inputs A, B, Ci and outputs S, Co.
The logic synthesis tool reduces the circuit to SOP form, and each output is mapped to one LUT with inputs A, B and Ci:
$S = A'B'C_i + A'BC_i' + AB'C_i' + ABC_i$
$C_o = A'BC_i + AB'C_i + ABC_i' + ABC_i$
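To illustrate what mapping a full adder onto two LUTs produces, the Python sketch below evaluates the standard sum-of-products forms of the full-adder sum and carry for every input combination and lists the resulting truth-table bits, which is what each 3-input LUT would store. The bit ordering (A as the most significant index bit) and the function names are assumptions for illustration.

```python
from itertools import product


def sop_sum(a, b, ci):
    """S = A'B'Ci + A'B Ci' + A B'Ci' + A B Ci."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci
    return (na & nb & ci) | (na & b & nci) | (a & nb & nci) | (a & b & ci)


def sop_carry(a, b, ci):
    """Co = A'B Ci + A B'Ci + A B Ci' + A B Ci."""
    na, nb, nci = 1 - a, 1 - b, 1 - ci
    return (na & b & ci) | (a & nb & ci) | (a & b & nci) | (a & b & ci)


def lut_bits(func):
    """Truth-table contents for a 3-input LUT, indexed by (A, B, Ci)."""
    return [func(a, b, ci) for a, b, ci in product((0, 1), repeat=3)]


if __name__ == "__main__":
    print("S  LUT:", lut_bits(sop_sum))
    print("Co LUT:", lut_bits(sop_carry))
```

Each full adder therefore occupies two LUTs that share the same three inputs, one holding the sum table and one holding the carry table.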

Processor + FPGA
Three possibilities:
1. The FPGA serves as a coprocessor for data-intensive applications (FPGA chip on a daughtercard, connected to the processor over a backplane bus, e.g. PCI). Possible project.
2. The FPGA serves as an embedded digital system for lower-latency processing (processor and FPGA chip).
"Reconfigurable Functional Unit"