Xilinx FPGAs: Evolution of Implementation Technologies


1 Xilinx FPGAs - 1 Evolution of Implementation Technologies (trend toward higher levels of integration)
- Discrete devices: relays, transistors (1940s-50s)
- Discrete logic gates (1950s-60s)
- Integrated circuits (1960s-70s)
  - e.g. TTL packages: Data Book for 100s of different parts
  - Map your circuit to the Data Book parts
- Gate Arrays (IBM 1970s)
  - "Custom" integrated circuit chips
  - Design using a library (like TTL)
  - Transistors are already on the chip
  - Place and route software puts the chip together automatically
  - + Large circuits on a chip
  - + Automatic design tools (no tedious custom layout)
  - - Only good if you want 1000s of parts

2 Xilinx FPGAs - 2 Gate Array Technology (IBM - 1970s)
- Simple logic gates
  - Use transistors to implement combinational and sequential logic
- Interconnect
  - Wires to connect inputs and outputs to logic blocks
- I/O blocks
  - Special blocks at periphery for external connections
- Add wires to make connections
  - Done when chip is fabbed
    - "mask-programmable"
  - Construct any circuit

3 Xilinx FPGAs - 3 Programmable Logic
- Disadvantages of the Data Book method
  - Constrained to parts in the Data Book
  - Parts are necessarily small and standard
  - Need to stock many different parts
- Programmable logic
  - Use a single chip (or a small number of chips)
  - Program it for the circuit you want
  - No reason for the circuit to be small

4 Xilinx FPGAs - 4 Programmable Logic Technologies
- Fuse and anti-fuse
  - Fuse makes or breaks link between two wires
  - Typical connections are 50-300 ohm
  - One-time programmable (testing before programming?)
  - Very high density
- EPROM and EEPROM
  - High power consumption
  - Typical connections are 2K-4K ohm
  - Fairly high density
- RAM-based
  - Memory bit controls a switch that connects/disconnects two wires
  - Typical connections are 0.5K-1K ohm
  - Can be programmed and re-programmed in the circuit
  - Low density

5 Xilinx FPGAs - 5 Programmable Logic
- Program a connection
  - Connect two wires
  - Set a bit to 0 or 1
- Regular structures for two-level logic (1960s-70s)
  - All rely on two-level logic minimization
  - PROM connections - permanent
  - EPROM connections - erase with UV light
  - EEPROM connections - erase electrically
  - PROMs
    - Program connections in the _____________ plane
  - PLAs
    - Program the connections in the ____________ plane
  - PALs
    - Program the connections in the ____________ plane

6 Xilinx FPGAs - 6 Making Large Programmable Logic Circuits
- Alternative 1: "CPLD"
  - Put a lot of PLDs on a chip
  - Add wires between them whose connections can be programmed
  - Use fuse/EEPROM technology
- Alternative 2: "FPGA"
  - Emulate gate array technology
  - Hence Field Programmable Gate Array
  - You need:
    - A way to implement logic gates
    - A way to connect them together

7 Xilinx FPGAs - 7 Field-Programmable Gate Arrays
- PALs, PLAs = 10 - 100 Gate Equivalents
- Field Programmable Gate Arrays = FPGAs
  - Altera MAX Family
  - Actel Programmable Gate Array
  - Xilinx Logical Cell Array
- 100 - 1000(s) of Gate Equivalents!

8 Xilinx FPGAs - 8 Field-Programmable Gate Arrays
- Logic blocks
  - To implement combinational and sequential logic
- Interconnect
  - Wires to connect inputs and outputs to logic blocks
- I/O blocks
  - Special logic blocks at periphery of device for external connections
- Key questions:
  - How to make logic blocks programmable?
  - How to connect the wires?
  - After the chip has been fabbed

9 Xilinx FPGAs - 9 Tradeoffs in FPGAs
- Logic block: how are functions implemented, as fixed functions (manipulate inputs) or programmable?
  - Support complex functions: need fewer blocks, but they are bigger, so fewer of them fit on the chip
  - Support simple functions: need more blocks, but they are smaller, so more of them fit on the chip
- Interconnect
  - How are logic blocks arranged?
  - How many wires will be needed between them?
  - Are wires evenly distributed across the chip?
  - Programmability slows wires down; are some wires specialized to long distances?
  - How many inputs/outputs must be routed to/from each logic block?
  - What utilization are we willing to accept? 50%? 20%? 90%?

10 Xilinx FPGAs - 10 Altera EPLD (Erasable Programmable Logic Devices)
- Historical Perspective
  - PALs: same technology as programmed-once bipolar PROM
  - EPLDs: CMOS erasable programmable ROM (EPROM) erased by UV light
- Altera building block = MACROCELL
[Figure: macrocell. Labels: 8 product term AND-OR array + programmable MUXes; programmable polarity; I/O pin; seq. logic block; programmable feedback]

11 Xilinx FPGAs - 11 Altera EPLD
Altera EPLDs contain 8 to 48 independently programmed macrocells
Personalized by EPROM bits:
Flipflop controlled by global clock signal; local signal computes output enable
Flipflop controlled by locally generated clock signal
+ Seq. logic: could be D, T, positive or negative edge triggered
+ Product term to implement clear function

12 Xilinx FPGAs - 12 Altera Multiple Array Matrix (MAX)
AND-OR structures are relatively limited
Cannot share signals/product terms among macrocells
Logic Array Blocks (similar to macrocells)
Global Routing: Programmable Interconnect Array
EPM5128: 8 fixed inputs, 52 I/O pins, 8 LABs, 16 macrocells/LAB, 32 expanders/LAB

13 Xilinx FPGAs - 13 LAB Architecture
Expander terms shared among all macrocells within the LAB
[Figure: LAB architecture. Labels: inputs, macrocell array, expander product term array, I/O block, I/O pad, PIA]

14 Xilinx FPGAs - 14 Supports large number of product terms per output Latches and muxes associated with output pins P22V10 PAL

15 Xilinx FPGAs - 15 Actel Programmable Gate Arrays
Rows of programmable logic building blocks + rows of interconnect
Anti-fuse technology: program once
8-input, single-output combinational logic blocks
FFs constructed from discrete cross-coupled gates
Use anti-fuses to build up long wiring runs from short segments
[Figure: chip layout. Labels: I/O buffers, programming and test logic; logic module; wiring tracks]

16 Xilinx FPGAs - 16 Actel Logic Module
Basic module is a modified 4:1 multiplexer
Example: implementation of S-R latch

17 Xilinx FPGAs - 17 Interconnection Fabric Actel Interconnect

18 Xilinx FPGAs - 18 Actel Routing Example
Jogs cross an anti-fuse
Minimize the number of jogs for speed-critical circuits
2 - 3 hops for most interconnections

19 Xilinx FPGAs - 19 Xilinx Programmable Gate Arrays
- CLB - Configurable Logic Block
  - 5-input, 1 output function
  - or 2 4-input, 1 output functions
  - optional register on outputs
- Built-in fast carry logic
- Can be used as memory
- Three types of routing
  - direct
  - general-purpose
  - long lines of various lengths
- RAM-programmable
  - can be reconfigured
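
One way to picture a RAM-programmable function generator that "can be used as memory" is as a small lookup table: the inputs form an address and the stored bits are the function's truth table. The sketch below assumes that model; the names `make_lut` and `lut_eval` are hypothetical and chosen only for illustration, not taken from any Xilinx tool.

```python
# Sketch: a k-input function generator modeled as a lookup table (LUT).
# make_lut and lut_eval are hypothetical names used only for illustration.

def make_lut(func, k):
    """Precompute the 2**k configuration bits for a k-input Boolean function."""
    bits = []
    for index in range(2 ** k):
        # Input i is bit i of the table index.
        inputs = [(index >> i) & 1 for i in range(k)]
        bits.append(int(bool(func(*inputs))))
    return bits


def lut_eval(bits, inputs):
    """Evaluate the function: the inputs form the address into the stored bits."""
    index = sum(bit << i for i, bit in enumerate(inputs))
    return bits[index]


# "Reconfiguring" means loading different bits into the same structure.
and4 = make_lut(lambda a, b, c, d: a & b & c & d, 4)
xor4 = make_lut(lambda a, b, c, d: a ^ b ^ c ^ d, 4)
```

Under this model any function of the given number of inputs costs the same, which is consistent with the later remark that there is no limitation on function complexity.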

20 Programmable Interconnect I/O Blocks (IOBs) Configurable Logic Blocks (CLBs)

21 Xilinx FPGAs - 21 The Xilinx 4000 CLB

22 Xilinx FPGAs - 22 Two 4-input functions, registered output

23 Xilinx FPGAs - 23 5-input function, combinational output

24 Xilinx FPGAs - 24 CLB Used as RAM

25 Xilinx FPGAs - 25 Fast Carry Logic

26 Xilinx FPGAs - 26 Xilinx 4000 Interconnect

27 Xilinx FPGAs - 27 Switch Matrix

28 Xilinx FPGAs - 28 Xilinx 4000 Interconnect Details

29 Xilinx FPGAs - 29 Global Signals - Clock, Reset, Control

30 Xilinx FPGAs - 30 Xilinx 4000 IOB

31 Xilinx FPGAs - 31 Xilinx FPGA Combinational Logic Examples
- Key: General functions are limited to 5 inputs
  - (4 even better - 1/2 CLB)
  - No limitation on function complexity
- Example: 2-bit comparator, A B = C D and A B > C D, implemented with 1 CLB
  (GT) F = A C' + A B D' + B C' D'
  (EQ) G = A'B'C'D' + A'B C'D + A B'C D' + A B C D
- Can implement some functions of > 5 inputs
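
The two comparator equations can be checked exhaustively against integer comparison. This is a minimal sketch that assumes A and C are the most significant bits of the two 2-bit numbers; the function names `gt`, `eq`, and `mismatches` are hypothetical.

```python
from itertools import product


def gt(a, b, c, d):
    """(GT) F = A C' + A B D' + B C' D'"""
    return (a & (1 - c)) | (a & b & (1 - d)) | (b & (1 - c) & (1 - d))


def eq(a, b, c, d):
    """(EQ) G = A'B'C'D' + A'B C'D + A B'C D' + A B C D"""
    na, nb, nc, nd = 1 - a, 1 - b, 1 - c, 1 - d
    return (na & nb & nc & nd) | (na & b & nc & d) | (a & nb & c & nd) | (a & b & c & d)


def mismatches():
    """Return every input combination where an equation disagrees with arithmetic."""
    bad = []
    for a, b, c, d in product((0, 1), repeat=4):
        left, right = 2 * a + b, 2 * c + d  # assumption: A and C are the MSBs
        if gt(a, b, c, d) != int(left > right) or eq(a, b, c, d) != int(left == right):
            bad.append((a, b, c, d))
    return bad
```

Both functions use only four inputs, so each fits one 4-input function generator, which matches the slide's claim of a single CLB.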

32 Xilinx FPGAs - 32 Xilinx FPGA Combinational Logic
- Examples
  - N-input majority function: 1 whenever n/2 or more inputs are 1
  - N-input parity functions: 5 inputs/1 CLB; 2 levels yield 25 inputs!
[Figures: 5-input majority circuit (CLB); 7-input majority circuit (CLBs); 9-input parity logic (CLBs)]
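
The "2 levels yield 25 inputs" figure follows from feeding up to five first-level parity blocks into one second-level block. The sketch below assumes each block computes the parity of at most five bits; `parity5` and `parity_two_level` are hypothetical names.

```python
from functools import reduce
from operator import xor


def parity5(bits):
    """One block: parity (XOR) of at most 5 input bits."""
    if len(bits) > 5:
        raise ValueError("a single block takes at most 5 inputs")
    return reduce(xor, bits, 0)


def parity_two_level(bits):
    """Two levels of 5-input blocks: up to 5 * 5 = 25 inputs."""
    if len(bits) > 25:
        raise ValueError("two levels cover at most 25 inputs")
    # First level: split the inputs into groups of up to 5 bits.
    groups = [bits[i:i + 5] for i in range(0, len(bits), 5)]
    first_level = [parity5(group) for group in groups]
    # Second level: combine the (at most 5) partial parities.
    return parity5(first_level)
```

Parity decomposes this way because XOR is associative; a function such as majority does not split as cleanly.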

33 Xilinx FPGAs - 33 Xilinx FPGA Adder Example
- Example
  - 2-bit binary adder - inputs: A1, A0, B1, B0, CIN; outputs: S0, S1, Cout
[Figures: full adder, 4 CLB delays to final carry out; 2 x two-bit adders (3 CLBs each) yields 2 CLBs to final carry out]
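
A 2-bit adder slice suits 5-input function generators because each of its three outputs depends on at most the five inputs A1, A0, B1, B0, CIN. The sketch below writes the outputs as ordinary ripple-carry logic (an assumption; the slide's figure is not reproduced in the transcript) and checks them against integer addition. `two_bit_adder` and `adder_is_correct` are hypothetical names.

```python
from itertools import product


def two_bit_adder(a1, a0, b1, b0, cin):
    """Return (s0, s1, cout) for the 2-bit sum A + B + CIN."""
    s0 = a0 ^ b0 ^ cin
    c1 = (a0 & b0) | (cin & (a0 ^ b0))  # internal carry out of bit 0
    s1 = a1 ^ b1 ^ c1
    cout = (a1 & b1) | (c1 & (a1 ^ b1))
    return s0, s1, cout


def adder_is_correct():
    """Compare the logic against integer addition for all 32 input combinations."""
    for a1, a0, b1, b0, cin in product((0, 1), repeat=5):
        s0, s1, cout = two_bit_adder(a1, a0, b1, b0, cin)
        if s0 + 2 * s1 + 4 * cout != (2 * a1 + a0) + (2 * b1 + b0) + cin:
            return False
    return True
```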

34 Xilinx FPGAs - 34 Computer-Aided Design
- Can't design FPGAs by hand
  - Way too much logic to manage, hard to make changes
- Hardware description languages
  - Specify functionality of logic at a high level
- Validation: high-level simulation to catch specification errors
  - Verify pin-outs and connections to other system components
  - Low-level to verify mapping and check performance
- Logic synthesis
  - Process of compiling HDL program into logic gates and flip-flops
- Technology mapping
  - Map the logic onto elements available in the implementation technology (LUTs for Xilinx FPGAs)

35 Xilinx FPGAs - 35 CAD Tool Path (cont'd)
- Placement and routing
  - Assign logic blocks to functions
  - Make wiring connections
- Timing analysis - verify paths
  - Determine delays as routed
  - Look at critical paths and ways to improve
- Partitioning and constraining
  - If design does not fit or is unroutable as placed, split into multiple chips
  - If design is too slow, prioritize critical paths, fix placement of cells, etc.
  - Few tools to help with these tasks exist today
- Generate programming files - bits to be loaded into chip for configuration

36 Xilinx FPGAs - 36 Xilinx CAD Tools
- Verilog (or VHDL) used to specify logic at a high level
  - Combine with schematics, library components
- Synopsys
  - Compiles Verilog to logic
  - Maps logic to the FPGA cells
  - Optimizes logic
- Xilinx APR - automatic place and route (simulated annealing)
  - Provides controllability through constraints
  - Handles global signals
- Xilinx Xdelay - measure delay properties of mapping and aid in iteration
- Xilinx XACT - design editor to view final mapping results

37 Xilinx FPGAs - 37 Applications of FPGAs
- Implementation of random logic
  - Easier changes at system-level (one device is modified)
  - Can eliminate need for full-custom chips
- Prototyping
  - Ensemble of gate arrays used to emulate a circuit to be manufactured
  - Get more/better/faster debugging done than with simulation
- Reconfigurable hardware
  - One hardware block used to implement more than one function
  - Functions must be mutually-exclusive in time
  - Can greatly reduce cost while enhancing flexibility
  - RAM-based only option
- Special-purpose computation engines
  - Hardware dedicated to solving one problem (or class of problems)
  - Accelerators attached to general-purpose computers

38 Xilinx FPGAs - 38 Implementation Strategies
ROM-based Design Example: BCD to Excess 3 Serial Converter

BCD   Excess 3 Code
0000  0011
0001  0100
0010  0101
0011  0110
0100  0111
0101  1000
0110  1001
0111  1010
1000  1011
1001  1100

Conversion Process
Bits are presented in bit-serial fashion starting with the least significant bit
Single input X, single output Z

39 Xilinx FPGAs - 39 State Transition Table Derived State Diagram Implementation Strategies

40 Xilinx FPGAs - 40 Implementation Strategies
ROM-based Implementation
[Figures: Truth Table/ROM I/Os; Circuit Level Realization]
74175 = 4 x positive edge triggered D FFs
In ROM-based designs, no need to consider state assignment

41 Xilinx FPGAs - 41 Implementation Strategies
Timing behavior for input strings 0 0 0 0 (0) and 1 1 1 0 (7), LSB first:
Input X:   0 0 0 0   1 1 1 0
Output Z:  1 1 0 0   0 1 0 1

42 Xilinx FPGAs - 42 Implementation Strategies
PLA-based Design: State Assignment with NOVA

NOVA input file (input, present state, next state, output):
0 S0 S1 1
1 S0 S2 0
0 S1 S3 1
1 S1 S4 0
0 S2 S4 0
1 S2 S4 1
0 S3 S5 0
1 S3 S5 1
0 S4 S5 1
1 S4 S6 0
0 S5 S0 0
1 S5 S0 1
0 S6 S0 1

NOVA derived state assignment (9 product term implementation):
S0 = 000
S1 = 001
S2 = 011
S3 = 110
S4 = 100
S5 = 111
S6 = 101
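
The state table in the NOVA input file can be exercised directly to confirm that it performs the BCD to Excess 3 conversion. The sketch below copies the thirteen transitions from that table and feeds each BCD digit least significant bit first; the names `TRANSITIONS` and `bcd_to_excess3` are hypothetical.

```python
# (present state, input X) -> (next state, output Z), copied from the NOVA input file.
TRANSITIONS = {
    ("S0", 0): ("S1", 1), ("S0", 1): ("S2", 0),
    ("S1", 0): ("S3", 1), ("S1", 1): ("S4", 0),
    ("S2", 0): ("S4", 0), ("S2", 1): ("S4", 1),
    ("S3", 0): ("S5", 0), ("S3", 1): ("S5", 1),
    ("S4", 0): ("S5", 1), ("S4", 1): ("S6", 0),
    ("S5", 0): ("S0", 0), ("S5", 1): ("S0", 1),
    ("S6", 0): ("S0", 1),  # S6 with X = 1 never occurs for valid BCD digits
}


def bcd_to_excess3(digit):
    """Run one 4-bit BCD digit through the FSM, LSB first, and collect the output bits."""
    state = "S0"
    result = 0
    for i in range(4):
        x = (digit >> i) & 1
        state, z = TRANSITIONS[(state, x)]
        result |= z << i  # output bits also arrive LSB first
    return result
```

For the two input strings in the timing slide, digit 0 produces 3 and digit 7 produces 10, matching the Excess 3 table.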

43 Xilinx FPGAs - 43 Implementation Strategies

Espresso Inputs:
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 16
0 000 001 1
1 000 011 0
0 001 110 1
1 001 100 0
0 011 100 0
1 011 100 1
0 110 111 0
1 110 111 1
0 100 111 1
1 100 101 0
0 111 000 0
1 111 000 1
0 101 000 1
1 101 --- -
0 010 --- -
1 010 --- -
.e

Espresso Outputs:
.i 4
.o 4
.ilb x q2 q1 q0
.ob d2 d1 d0 z
.p 9
0001 0100
10-0 0100
01-0 0100
1-1- 0001
-0-1 1000
0-0- 0001
-1-0 1000
--10 0100
---0 0010
.e
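
The 9-term Espresso output can be checked against the specified rows of the Espresso input by evaluating it the way a PLA would: each output is the OR of every product term whose input cube matches. This is a minimal sketch; `COVER`, `SPEC`, `cube_matches`, and `evaluate` are hypothetical names, and the three don't-care rows are omitted from `SPEC`.

```python
# Product terms from the Espresso output: (x q2 q1 q0 cube, d2 d1 d0 z outputs).
COVER = [
    ("0001", "0100"), ("10-0", "0100"), ("01-0", "0100"),
    ("1-1-", "0001"), ("-0-1", "1000"), ("0-0-", "0001"),
    ("-1-0", "1000"), ("--10", "0100"), ("---0", "0010"),
]

# Fully specified rows from the Espresso input: x q2 q1 q0 -> d2 d1 d0 z.
SPEC = {
    "0000": "0011", "1000": "0110", "0001": "1101", "1001": "1000",
    "0011": "1000", "1011": "1001", "0110": "1110", "1110": "1111",
    "0100": "1111", "1100": "1010", "0111": "0000", "1111": "0001",
    "0101": "0001",
}


def cube_matches(cube, inputs):
    """A cube matches when every non-dash position equals the input bit."""
    return all(c in ("-", bit) for c, bit in zip(cube, inputs))


def evaluate(inputs):
    """OR together the output parts of every matching product term."""
    out = [0, 0, 0, 0]
    for cube, outputs in COVER:
        if cube_matches(cube, inputs):
            out = [o | int(bit) for o, bit in zip(out, outputs)]
    return "".join(str(bit) for bit in out)
```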

44 Xilinx FPGAs - 44 Implementation Strategies
D2 = Q2' Q0 + Q2 Q0'
D1 = X' Q2' Q1' Q0 + X Q2' Q0' + X' Q2 Q0' + Q1 Q0'
D0 = Q0'
Z = X Q1 + X' Q1'

45 Xilinx FPGAs - 45 Implementation Strategies
10H8 PAL: 10 inputs, 8 outputs, 2 product terms per OR gate
D1 = D11 + D12
D11 = X' Q2' Q1' Q0 + X Q2' Q0'
D12 = X' Q2 Q0' + Q1 Q0'
Product term rows:
0. Q2' Q0
1. Q2 Q0'
8. X' Q2' Q1' Q0
9. X Q2' Q0'
16. X' Q2 Q0'
17. Q1 Q0'
24. D11
25. D12
32. Q0'
33. not used
40. X Q1
41. X' Q1'

46 Xilinx FPGAs - 46 Implementation Strategies

47 Xilinx FPGAs - 47 Registered PAL Architecture Buffered Input or product term Negative Logic Feedback D2 = Q2 Q0 + Q2 Q0 D1 = X Q2 Q1 Q0 + X Q2 + X Q0 + Q2 Q0 + Q1 Q0 D0 = Q0 Z = X Q1 + X Q1 Implementation Strategies

48 Xilinx FPGAs - 48 Programmable Output Polarity/XOR PALs Buried Registers: decouple FF from the output pin Advantage of XOR PALs: Parity and Arithmetic Operations Implementation Strategies

49 Xilinx FPGAs - 49 Example of XOR PAL Example of Registered PAL Implementation Strategies

50 Xilinx FPGAs - 50 Specifying PALs with ABEL
P10H8 PAL: explicit equations for partitioned output functions

module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p10h8';

"Input Pins
X,Q2,Q1,Q0,D11i,D12i pin 1,2,3,4,5,6;
"Output Pins
D2,D11o,D12o,D1,D0,Z pin 19,18,17,16,15,14;

INSTATE = [Q2, Q1, Q0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];

equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = D11i # D12i;
D11o = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0);
D12o = (!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;

51 Xilinx FPGAs - 51 Specifying PALs with ABEL
P12H6 PAL: simpler equations

module bcd2excess3
title 'BCD to Excess 3 Code Converter State Machine'
u1 device 'p12h6';

"Input Pins
X, Q2, Q1, Q0 pin 1, 2, 3, 4;
"Output Pins
D2, D1, D0, Z pin 17, 18, 16, 15;

INSTATE = [Q2, Q1, Q0];
OUTSTATE = [D2, D1, D0];
S0in = [0, 0, 0]; S0out = [0, 0, 0];
S1in = [0, 0, 1]; S1out = [0, 0, 1];
S2in = [0, 1, 1]; S2out = [0, 1, 1];
S3in = [1, 1, 0]; S3out = [1, 1, 0];
S4in = [1, 0, 0]; S4out = [1, 0, 0];
S5in = [1, 1, 1]; S5out = [1, 1, 1];
S6in = [1, 0, 1]; S6out = [1, 0, 1];

equations
D2 = (!Q2 & Q0) # (Q2 & !Q0);
D1 = (!X & !Q2 & !Q1 & Q0) # (X & !Q2 & !Q0) # (!X & Q2 & !Q0) # (Q1 & !Q0);
D0 = !Q0;
Z = (X & Q1) # (!X & !Q1);
end bcd2excess3;

52 Xilinx FPGAs - 52 Specifying PALs with ABEL
P16R4 PAL

module bcd2excess3
title 'BCD to Excess 3 Code Converter'
u1 device 'p16r4';

"Input Pins
Clk, Reset, X, !OE pin 1, 2, 3, 11;
"Output Pins
D2, D1, D0, Z pin 14, 15, 16, 13;

SREG = [D2, D1, D0];
S0 = [0, 0, 0];
S1 = [0, 0, 1];
S2 = [0, 1, 1];
S3 = [1, 1, 0];
S4 = [1, 0, 0];
S5 = [1, 1, 1];
S6 = [1, 0, 1];

state_diagram SREG
state S0: if Reset then S0
          else if X then S2 with Z = 0
          else S1 with Z = 1
state S1: if Reset then S0
          else if X then S4 with Z = 0
          else S3 with Z = 1
state S2: if Reset then S0
          else if X then S4 with Z = 1
          else S4 with Z = 0
state S3: if Reset then S0
          else if X then S5 with Z = 1
          else S5 with Z = 0
state S4: if Reset then S0
          else if X then S6 with Z = 0
          else S5 with Z = 1
state S5: if Reset then S0
          else if X then S0 with Z = 1
          else S0 with Z = 0
state S6: if Reset then S0
          else if !X then S0 with Z = 1
end bcd2excess3;

53 Xilinx FPGAs - 53 FSM Design with Counters
Synchronous Counters: CLR, LD, CNT
Four kinds of transitions for each state:
(1) to State 0 (CLR)
(2) to next state in sequence (CNT)
(3) to arbitrary next state (LD)
(4) loop in current state
Careful state assignment is needed to reflect basic sequencing of the counter

54 Xilinx FPGAs - 54 Excess 3 Converter Revisited Note the sequential nature of the state assignments FSM Design with Counters

55 Xilinx FPGAs - 55 Excess 3 Converter CLR signal dominates LD which dominates Count FSM Design with Counters
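
The four transition kinds and the stated priority (CLR dominates LD, which dominates Count) can be summarized as a next-state function. This is a minimal sketch of a generic synchronous counter, not the specific TTL part used in the deck; `counter_next` and its `modulus` parameter are hypothetical.

```python
def counter_next(state, clr, ld, cnt, load_value, modulus=8):
    """Next state of a synchronous counter with CLR > LD > CNT priority.

    modulus is a hypothetical parameter; 8 covers a 3-bit state register.
    """
    if clr:
        return 0                         # (1) to State 0
    if ld:
        return load_value % modulus      # (3) to arbitrary next state
    if cnt:
        return (state + 1) % modulus     # (2) to next state in sequence
    return state                         # (4) loop in current state
```

With this priority, a state assignment that follows the counting sequence lets most transitions use CNT, so LD is needed only for the jumps.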

56 Xilinx FPGAs - 56 Implementing FSMs with Counters
Excess 3 Converter

Espresso Input File:
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 17
1---- -0-----
00000 1111---
00001 1111---
00010 0111---
00011 00-----
00100 0111---
00101 110-011
00110 10-----
00111 -------
01000 010-100
01001 010-101
01010 1111---
01011 10-----
01100 1111---
01101 0111---
01110 -------
01111 -------
.e

Espresso Output File:
.i 5
.o 7
.ilb res x q2 q1 q0
.ob z clr ld en c b a
.p 10
0-001 0101101
-0-01 1000000
-11-0 1000000
0-0-0 0101100
-000- 1010000
-0--0 0010000
0-10- 0101011
--11- 1000000
-11-- 0010000
-1-1- 1010000
.e

57 Xilinx FPGAs - 57 Excess 3 Converter Schematic Synchronous Output Register FSM Implementation with Counters

58 Xilinx FPGAs - 58 Implementation Strategies
Xilinx LCA Architecture: Implementing the BCD to Excess 3 FSM
Q2+ = Q2' Q0 + Q2 Q0'
Q1+ = X' Q2' Q1' Q0 + X Q2' Q0' + X' Q2 Q0' + Q1 Q0'
Q0+ = Q0'
Z = X Q1 + X' Q1'
No function more complex than 4 variables
4 FFs implies 2 CLBs
Synchronous Mealy Machine
Global Reset to be used
Place Q2+, Q0+ in one CLB; Q1+, Z in second CLB
Maximize use of direct & general-purpose interconnections

59 Xilinx FPGAs - 59 Implementing the BCD to Excess 3 FSM

60 Xilinx FPGAs - 60 Design Case Study
Traffic Light Controller: decomposition into primitive subsystems
Controller FSM: next state/output functions, state register
Short time/long time interval counter
Car Sensor
Output Decoders and Traffic Lights

61 Xilinx FPGAs - 61 Traffic Light Controller Block Diagram Design Case Study

62 Xilinx FPGAs - 62 Subsystem Logic Car Detector Light Decoders Interval Timer Design Case Study

63 Xilinx FPGAs - 63 Next State Logic State Assignment: HG = 00, HY = 10, FG = 01, FY = 11 P1 = C TL Q1 + TS Q1 Q0 + C Q1 Q0 + TS Q1 Q0 P0 = TS Q1 Q0 + Q1 Q0 + TS Q1 Q0 ST = C TL Q1 + C Q1 Q0 + TS Q1 Q0 + TS Q1 Q0 HL[1] = TS Q1 Q0 + Q1 Q0 + TS Q1 Q0 HL[0] = TS Q1 Q0 + TS Q1 Q0 FL[1] = Q0 FL[0] = TS Q1 Q0 + TS Q1 Q0 PAL/PLA Implementation: 5 inputs, 7 outputs, 8 product terms PAL 22V10 -- 11 inputs, 10 prog. IOs, 8 to 14 prod terms per OR ROM Implementation: 32 word by 8-bit ROM (256 bits) Reset may double ROM size Design Case Study

64 Xilinx FPGAs - 64 Counter-based Implementation ST = Count TTL Implementation with MUX and Counter Can we reduce package count by using an 8:1 MUX? 2 x 4:1 MUX Design Case Study

65 Xilinx FPGAs - 65 Counter-based Implementation Dispense with direct output functions for the traffic lights Why not simply decode from the current state? ST is a Synchronous Mealy Output Light Controllers are Moore Outputs Design Case Study

66 Xilinx FPGAs - 66 Design Case Study
LCA-Based Implementation, Discrete Gate Method
None of the functions exceed 5 variables
P1, ST are 5 variable (1 CLB each)
P0, HL1, HL0, FL0 are 3 variable (1/2 CLB each)
FL1 is 1 variable (1/2 CLB)
4 1/2 CLBs total!

67 Xilinx FPGAs - 67 LCA-Based Implementation Placement of functions selected to maximize the use of direct connections Design Case Study

68 Xilinx FPGAs - 68 Design Case Study
LCA-Based Implementation, Counter/Multiplexer Method: 4:1 MUX, 2-bit upcounter
MUX: six variables (4 data, 2 control), but this is the kind of 6-variable function that can be implemented in 1 CLB!
2nd CLB to implement TL C and TL + C'
But note that ST/Cnt is really a function of TL, C, TS, Q1, Q0: 1 CLB to implement this function of 5 variables!
2-bit counter: 2 functions of 3 variables (2-bit state + count), also implemented in one CLB
Traffic light decoders: functions of 2 variables (Q1, Q0), 2 per CLB = 3 CLBs for the six lights
Total count = 5 CLBs

