Jongsok Choi, M.A.Sc. Candidate, University of Toronto
Overview
o TSMC 0.35 um technology
o Cadence tools
o Less than 2 mm x 2 mm die area
o Design time = 1 month
o Tile-based approach
o Each tile contains a Logic Block, 2 Connection Blocks, and a Switch Box
o Pass transistor approach
References
o Architecture and CAD for Deep-Submicron FPGAs
Presentation Outline
Schematics
o Base Cells – Pass transistor, SRAM, Multiplexer
o Logic Block – LUT, Set/Reset Logic, D flip-flop
o Connection Box – Right, Bottom
o Switch Box
o Tile 2x2
o Programming Circuitry – Row, Column
o FPGA 4x4 – Programming a multiplier
o FPGA 32x16 – Full schematic
Layouts
o Base Cells – SRAM, Multiplexer, Pull-up Buffer
o Logic Block – LUT, Set/Reset Logic, D flip-flop
o Connection Box – Right, Bottom
o Programming Circuitry – Row, Column
o Tile – Single tile, Tile 2x2
o FPGA 4x4 – Post-layout simulation of programmed multiplier
o FPGA 32x16 – Floor plan, full layout
o Clock tree – H-tree implemented
o Complete layout with padframe
DRC, LVS Results
Employed Layout Techniques and Conclusions
Schematics
Base Cells
o Pass transistor
[Figures: schematic, simulation]
Note: The red boxes highlighted in the top right-hand corner indicate where each cell is used (e.g. the pass transistor is used in the logic element, connection boxes 1 and 2, and the switch block).

Base Cells
o SRAM cell: programs the FPGA with the required functionality
[Figures: schematic, simulation]

Base Cells
o 2-to-1 multiplexer
[Figures: schematic, simulation]
Base Cells
o 4-to-1 multiplexer: chooses between the four SRAM bits in the LUT
[Figures: schematic, simulation]

Sel2 Sel1 | out
 1    1   | IN_1
 1    0   | IN_2
 0    1   | IN_3
 0    0   | IN_4
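The select table for the 4-to-1 multiplexer can be sketched behaviorally in Python. This is only an illustrative model, not the transistor-level design: the function names `mux4` and `lut2`, and the choice that LUT input `a` drives Sel2 while `b` drives Sel1, are assumptions made for the example.

```python
def mux4(sel2, sel1, in_1, in_2, in_3, in_4):
    """Behavioral 4-to-1 mux: 11 -> IN_1, 10 -> IN_2, 01 -> IN_3, 00 -> IN_4."""
    table = {(1, 1): in_1, (1, 0): in_2, (0, 1): in_3, (0, 0): in_4}
    return table[(sel2, sel1)]


def lut2(sram_bits, a, b):
    """2-input LUT: four SRAM bits (IN_1..IN_4) selected by the mux.

    Assumption: LUT input a drives Sel2 and input b drives Sel1.
    """
    return mux4(a, b, *sram_bits)


# Hypothetical SRAM contents, listed in the order (IN_1, IN_2, IN_3, IN_4).
AND_BITS = (1, 0, 0, 0)  # only select 11 reads a 1
XOR_BITS = (0, 1, 1, 0)  # selects 10 and 01 read a 1

for a in (0, 1):
    for b in (0, 1):
        print(a, b, lut2(AND_BITS, a, b), lut2(XOR_BITS, a, b))
```

Storing a different 4-bit pattern in the SRAM cells changes the function the LUT computes without changing the multiplexer itself.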
Logic Block
[Figure: top-level schematic]

Logic Block – LUT
[Figures: schematic, simulation]

Logic Block – Set/Reset Logic
[Figures: schematic, simulation]
o When SRAM 1 and 2 are set to 1 => Set = 1
o When SRAM 1 and 2 are set to 0 => Reset = 1

Logic Block – D flip-flop
[Figures: schematic, simulation]
Connection Box – Right
o Functionality: connects vertical tracks to the logic element
o Track1 is selected when the SRAM is set to 1
o Track2 is selected when the SRAM is set to 0
[Figures: schematic, simulation]
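The track selection can be modeled with idealized pass transistors. This is a rough sketch under assumptions the slide does not state: the helper `pass_transistor`, the use of the SRAM bit and its complement as the two gate signals, and `None` as a stand-in for a floating node are all hypothetical.

```python
def pass_transistor(gate, value):
    """Idealized pass transistor: passes `value` when gate is 1, else floats (None)."""
    return value if gate == 1 else None


def connection_box_right(sram_bit, track1, track2):
    """Select one vertical track to feed the logic element.

    Assumption: Track1's pass transistor is gated by the SRAM bit and
    Track2's by its complement, so exactly one of them conducts.
    """
    from_track1 = pass_transistor(sram_bit, track1)
    from_track2 = pass_transistor(1 - sram_bit, track2)
    return from_track1 if from_track1 is not None else from_track2


print(connection_box_right(1, track1=1, track2=0))  # SRAM = 1 selects Track1
print(connection_box_right(0, track1=1, track2=0))  # SRAM = 0 selects Track2
```

The model ignores the voltage drop across a real NMOS pass transistor, which is the reason the design includes a pull-up buffer.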
Connection Box – Bottom
[Figure: top-level schematic, showing the output from the connection box to the tracks and the input to the connection box from the tracks]

Switch Box
[Figure: schematic]

Tile 2x2
[Figure: schematic, with labels V1 to V4 and H1 to H4]
o Each tile has different connections at the switch box
o Segmented and staggered routing structure for the FPGA
o Segment length of 2

Programming Circuitry – Programming Column
[Figures: schematic, simulation]

Programming Circuitry – Programming Row
[Figures: schematic, simulation]

FPGA 4x4
[Figure: schematic]

FPGA 4x4
[Figure: FPGA mapping and programming bits for a 2-by-2 multiplier]
o The table shows the manually created bitstream that programs the multiplier using 4x4 tiles with programming circuits
Simulation
[Figure: simulation waveforms for Input 1, Input 2, and output bits Bit[3], Bit[2], Bit[1], Bit[0]; the numbers show the total output]
o 2-by-2 multiplier correctly implemented
o Shows correct output for all possible inputs
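The "all possible inputs" check can be mirrored in software. The sketch below builds a 2-by-2 multiplier from eight 2-input LUTs and compares it against ordinary multiplication for all 16 input pairs. The decomposition into AND and XOR LUTs and the LUT bit ordering are assumptions chosen for illustration; they are not the manually created bitstream or mapping from the slides.

```python
from itertools import product


def lut2(bits, a, b):
    """2-input LUT: bits[(a << 1) | b] is the stored output (bit order is an assumption)."""
    return bits[(a << 1) | b]


AND = (0, 0, 0, 1)
XOR = (0, 1, 1, 0)


def multiply_2x2(a, b):
    """Multiply two 2-bit numbers using only 2-input LUTs (8 LUTs in total)."""
    a1, a0 = (a >> 1) & 1, a & 1
    b1, b0 = (b >> 1) & 1, b & 1
    # Four partial products.
    p00 = lut2(AND, a0, b0)
    p10 = lut2(AND, a1, b0)
    p01 = lut2(AND, a0, b1)
    p11 = lut2(AND, a1, b1)
    # Half adder for the middle column, then the top two bits.
    carry = lut2(AND, p10, p01)
    bit0 = p00
    bit1 = lut2(XOR, p10, p01)
    bit2 = lut2(XOR, p11, carry)
    bit3 = lut2(AND, p11, carry)
    return (bit3 << 3) | (bit2 << 2) | (bit1 << 1) | bit0


def check_all_inputs():
    """True if the LUT network matches a * b for every pair of 2-bit inputs."""
    return all(multiply_2x2(a, b) == a * b for a, b in product(range(4), repeat=2))


print(check_all_inputs())
```

Eight LUTs fit comfortably within the 16 logic blocks of a 4x4 tile array, which is consistent with the multiplier being mapped onto that array.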
FPGA 32x16 – Full Schematic

Layouts
Base Cells
o SRAM cell: programs the FPGA with the required functionality
[Figures: schematic, layout]

Base Cells
o 4-to-1 multiplexer: chooses between the four SRAM bits in the LUT
[Figures: schematic, layout]

Sel2 Sel1 | out
 1    1   | IN_1
 1    0   | IN_2
 0    1   | IN_3
 0    0   | IN_4

Base Cells
o Pull-up buffer: pulls the degraded signal back up to VDD
[Figure: layout]
Logic Block
[Figure: top-level schematic]

Logic Block – LUT
[Figures: schematic, layout]

Logic Block – Set/Reset Logic
[Figures: schematic, layout]

Logic Block – D flip-flop
[Figures: schematic, layout]

Logic Block
[Figure: layout, with labeled regions: LUT, Set/Reset, Buffer_inverter for clock, Pull-up Buffer, D flip-flop]
Connection Box – Right
[Figures: schematic, layout]

Connection Box – Bottom
[Figure: top-level schematic, showing the output from the connection box to the tracks]
Programming Circuitry – Programming Column
[Figures: schematic, layout]

Programming Circuitry – Programming Column

Programming Circuitry – Programming Row
[Figures: schematic, layout]

Programming Circuitry – Programming Row
Tile
[Figure: schematic]

Tile – Layout
[Figure: layout, with labeled regions: Logic Element, Right Connection Box, Bottom Connection Box, Switch Box]

Tile 2x2 – Layout

FPGA 4x4 – Layout
FPGA 4x4 – Post-Layout Simulation
[Figure: FPGA mapping and programming bits for a 2-by-2 multiplier]
o The table shows the manually created bitstream that programs the multiplier using 4x4 tiles with programming circuits

FPGA 4x4 – Post-Layout Simulation
[Figure: post-layout simulation waveforms for Input 1, Input 2, and output bits Bit[3], Bit[2], Bit[1], Bit[0]; the numbers show the total output]
o 2-by-2 multiplier correctly implemented
o Shows correct output for all possible inputs
o Matches schematic simulations
32x16 Tiles FPGA Floorplan
[Figure: floorplan composed of 4x4 Tile blocks with the Programming Row and Programming Column; one dimension is labeled 1.25 mm]
FPGA 32x16 – Layout
Clock Tree
o H-tree structure
o Perfectly symmetrical in every direction to reduce clock skew
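The reason a symmetrical H-tree limits skew is that every leaf sits at the same wire distance from the clock root. The sketch below generates an idealized H-tree and records the wire length to each leaf. The geometry (`half_span`, the number of levels, halving the span at each level) is an assumption for illustration and does not reflect the actual dimensions of this layout.

```python
def h_tree_leaves(x, y, half_span, levels, length=0.0):
    """Return (leaf_x, leaf_y, wire_length_from_root) for an idealized H-tree.

    At each level the wire runs horizontally by half_span and then
    vertically by half_span to reach the four corners of an "H".
    """
    if levels == 0:
        return [(x, y, length)]
    leaves = []
    for dx in (-half_span, half_span):      # horizontal bar of the H
        for dy in (-half_span, half_span):  # vertical strokes of the H
            leaves += h_tree_leaves(
                x + dx, y + dy, half_span / 2, levels - 1,
                length + abs(dx) + abs(dy),
            )
    return leaves


leaves = h_tree_leaves(0.0, 0.0, half_span=4, levels=2)
wire_lengths = {length for _, _, length in leaves}
print(len(leaves), wire_lengths)
```

With two levels the tree reaches 16 leaves, and the set of wire lengths contains a single value, so in this idealized model the clock edge arrives at every leaf after the same wire delay.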
Complete Layout with Padframe

DRC - Passed

LVS - Passed
Layout Techniques Employed
General Techniques
o Cell pitch of 6 um used; layouts optimized for area to match the pitch
o Shared sources/drains where possible to minimize area
o Shared VDD and ground rails between rows
Hierarchical Layout
o Bigger cells composed of multiple smaller cells
o Orthogonal metal routing using M3 and M4; local routing using M1 and M2
o Blocks made to abut well
o Wider tracks for power rails to provide enough power
o Wider horizontal tracks, vertical tracks, and clock tree for increased drive strength
Conclusions
o Designed a fully functional FPGA
o Can implement up to 512 gates
o Consists of 8,704 SRAMs
o 148,448 transistors without padframe
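The headline numbers can be cross-checked with a little arithmetic. The sketch below assumes one logic block (one "gate") per tile, as stated in the overview, and assumes that all 8,704 SRAM cells sit inside the tiles; the second assumption is not stated in the slides and is only suggested by the division coming out even.

```python
TILE_ARRAY = (32, 16)       # FPGA 32x16
TOTAL_SRAMS = 8_704
TOTAL_TRANSISTORS = 148_448  # without padframe

tiles = TILE_ARRAY[0] * TILE_ARRAY[1]

# Assumption: every SRAM cell belongs to a tile.
srams_per_tile, leftover_srams = divmod(TOTAL_SRAMS, tiles)

# Average only: the total also covers circuitry outside the tiles.
avg_transistors_per_tile = TOTAL_TRANSISTORS / tiles

print(tiles, srams_per_tile, leftover_srams, avg_transistors_per_tile)
```

The tile count matches the 512-gate figure, and the SRAM total divides evenly into 17 cells per tile. The transistor average is not a whole number, which fits the total including circuitry shared across tiles.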
Questions