A Lithography-friendly Structured ASIC Design Approach


1 A Lithography-friendly Structured ASIC Design Approach
By: Salman Goplani*, Rajesh Garg#, Sunil P Khatri#, Mosong Cheng#
* National Instruments, Austin, TX
# Department of ECE, Texas A&M University, College Station, TX

2 Outline
- Motivation
  - Mask costs increasing
  - Systematic process variations increasing
- Previous Work
- Our Approach
  - NAND2 based circuit implementation methodology
- Experimental Results
- Conclusions

3 Motivation – Mask Costs
[Table: Process (microns) | Single Mask Cost ($K) | # of Masks | Mask Set Cost ($K); values not preserved in this transcript]
- A full set of lithography masks can cost between $1M and $3M.
- Roughly 25% reduction in ASIC design starts in the past 7 years [Sematech Annual Report 2002], [A. Sangiovanni-Vincentelli, “The Tides of EDA”, keynote talk, DAC 2003].
- Need an approach in which different designs share a set of masks.

4 Motivation - Variations
- Process variations can be classified as
  - Random variations
  - Systematic variations
- Random variations are unpredictable
  - Caused by random fluctuations such as the number of dopant atoms
- Systematic variations
  - Predictable variation trends across a chip
  - Caused by spatial dependencies during device processing
    - Chemical-mechanical polishing (CMP)
    - Optical proximity effects (OPE)
- Changes in poly shapes translate into channel length variations
  - These impact circuit performance more severely than metal variations

5 Motivation – Structured ASICs
- Standard cell based design approach (ASIC)
  - Severely affected by OPEs due to lack of regularity in design
  - Optical proximity correction (OPC) is performed to deal with OPEs
  - OPC needs to be performed on all layers for each new ASIC design
    - Computationally expensive process
- Need a circuit design approach that
  - Allows us to share a majority of fabrication masks across different designs
  - Allows us to share the OPC computation for some layers, across different designs
- Our approach achieves these goals

6 Previous Work
- Jayakumar et al. proposed a structured ASIC approach using a network of fixed (medium) sized PLAs
  - Large delay (area) overhead of ~260% (~240%)
- Gulati et al. reported a pass transistor logic (PTL) based structured ASIC approach
  - Delay and area overheads are ~50% and ~240%
- Pileggi et al. reported that FPGAs are typically ~25X slower than ASICs
- Our approach provides a structured ASIC solution with small area (~10%) and delay (~35%) overheads

7 Our Solution
- Use a regular array of 2-input NAND cells as the underlying circuit structure, and customize only METAL and VIA masks
  - NAND2 is functionally complete
  - Stock such arrays pre-processed until the metallization step
  - Or, use previously generated masks for all other layers and use new masks for only METAL, VIA layers
- To create an ASIC for a given design, technology-map this design to the smallest available NAND2 array
  - Only METAL and VIA masks require changes
  - Easier to fix bugs, since only METAL and VIA masks change
- Optimize poly layer mask for maximum yield
  - Perform aggressive OPC on the poly layer
  - Required to be done only once
  - Beneficial since performance is highly sensitive to channel length variations
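The functional completeness of NAND2 claimed on this slide can be checked with a short sketch. The helper names (`inv`, `and2`, `or2`, `xor2`) are illustrative and not from the presentation; each is built only from `nand2` and verified exhaustively against its truth table.

```python
from itertools import product


def nand2(a: int, b: int) -> int:
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)


def inv(a: int) -> int:
    # A NAND2 with both inputs tied together acts as an inverter.
    return nand2(a, a)


def and2(a: int, b: int) -> int:
    # AND = inverted NAND (2 cells).
    return inv(nand2(a, b))


def or2(a: int, b: int) -> int:
    # De Morgan: a OR b = NAND(NOT a, NOT b) (3 cells).
    return nand2(inv(a), inv(b))


def xor2(a: int, b: int) -> int:
    # Classic 4-cell NAND2 realization of XOR.
    n = nand2(a, b)
    return nand2(nand2(a, n), nand2(b, n))


def check_all() -> bool:
    """Exhaustively compare each NAND2-only gate with Python's bit operators."""
    for a, b in product((0, 1), repeat=2):
        assert inv(a) == 1 - a
        assert and2(a, b) == (a & b)
        assert or2(a, b) == (a | b)
        assert xor2(a, b) == (a ^ b)
    return True
```

Since NOT, AND and OR are all expressible this way, any combinational function can in principle be mapped onto the array; the cost is the extra cells, which is where the area overhead reported later comes from.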

8 NAND2 Cell Array
- NAND2 cells are placed to create a rectangular array of cells
- Some space is left between two rows of NAND2 cells
  - Used for routing

9 NAND2 Cell
- Size: 1.6 µm × 2.6 µm
- Input/output pins on Metal1
- Symmetrical along vertical axis up to poly layer
  - Placer can map to original or flipped cell orientation, thereby reducing area
  - Poly and diffusion layers unchanged if a cell is flipped, hence same masks used for either orientation
- Layout of NAND2 cell is lithography-friendly
  - No bends in poly
  - Poly on a fixed pitch (as required in more recent fabrication processes)
  - Good for manufacturability reasons

10 Circuit Mapping to NAND2 Array
- Library L consists of 1X, 2X, 3X and 4X NAND2 cells
  - 2X, 3X and 4X NAND2 cells are implemented by connecting 2, 3 and 4 NAND2 cells in parallel
- Mapping flow:
  1. Start from combinational circuit N in blif format
  2. Technology-independent optimization of N, giving N*
  3. Map N* with L for area or delay, giving N1
  4. Replace all 2X, 3X or 4X NAND2 cells in N1 by 2, 3 or 4 1X NAND2 cells, giving N2
  5. Place N2 using QPLACE (SEDSM) and route using WROUTE
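The step that turns N1 into N2 can be sketched as a simple netlist rewrite. The `Gate` record and the `_p0`, `_p1`, ... naming scheme below are hypothetical illustrations; the presentation does not describe its netlist data structures. The only behavior taken from the slide is that a kX cell becomes k 1X cells connected in parallel, i.e. sharing the same input nets and the same output net.

```python
from typing import List, NamedTuple, Tuple


class Gate(NamedTuple):
    name: str
    drive: int                 # 1, 2, 3 or 4 for a 1X..4X NAND2 cell
    inputs: Tuple[str, str]    # the two input nets
    output: str                # the output net


def expand_to_1x(n1: List[Gate]) -> List[Gate]:
    """Replace every kX NAND2 in N1 by k parallel 1X NAND2 cells (N2)."""
    n2: List[Gate] = []
    for gate in n1:
        if not 1 <= gate.drive <= 4:
            raise ValueError(f"unsupported drive strength: {gate.drive}X")
        for i in range(gate.drive):
            # Parallel connection: same inputs and same output for each copy.
            n2.append(Gate(f"{gate.name}_p{i}", 1, gate.inputs, gate.output))
    return n2


# Example: a 3X NAND2 driving a 1X NAND2 becomes four 1X cells.
n1 = [Gate("g0", 3, ("a", "b"), "n1"), Gate("g1", 1, ("n1", "c"), "y")]
n2 = expand_to_1x(n1)
```

The number of cells in N2 is the sum of the drive strengths in N1, which is the quantity that determines the smallest NAND2 array the design fits into.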

11 Characterization of NAND2 Array
- Delay (D) is obtained using the sense package in SIS
  - sense reports the largest sensitizable delay of the circuit (excludes any false paths)
  - We use gate netlist N1 with 1X, 2X, 3X and 4X NAND2 cells
- Power: the dynamic power of a circuit is [equation not preserved in this transcript], where
  - f (= 1/D) is the operating frequency of the circuit
  - C_eff is the total switching capacitance [equation not preserved in this transcript], where
    - C_k is the capacitance of node k
    - [symbol not preserved] is the probability of transition of node k

12 Characterization of NAND2 Array
- Transition probability of node k is given by [equation not preserved in this transcript], where p_k is the probability that node k is at logic “1”
- Probability p_k is obtained using the approach of Gulati et al.
  - p_k = 0.5 for primary inputs
  - For any node, obtain p_k by propagating input probabilities based on node functionality
- Area is obtained by placing and routing N2 using SEDSM tools from Cadence
  - All benchmark circuits are routed using up to 4 Metal layers
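The power characterization on slides 11 and 12 can be illustrated with a minimal sketch. The slide equations did not survive in the transcript, so every formula below is a textbook assumption rather than the authors' exact expression: inputs of a gate are treated as independent (so a NAND2 output has p_out = 1 - p_a * p_b), the transition probability of a node is taken as 2 * p_k * (1 - p_k), C_eff is the sum of C_k weighted by transition probability, and dynamic power is taken as C_eff * VDD^2 * f with f = 1/D. The node names and capacitance values are made up for the example.

```python
def signal_probabilities(gates, primary_inputs):
    """Propagate logic-1 probabilities through NAND2 gates.

    gates: list of (output_net, input_a, input_b) in topological order.
    Assumes gate inputs are statistically independent.
    """
    p = {net: 0.5 for net in primary_inputs}   # p_k = 0.5 for primary inputs
    for out, a, b in gates:
        p[out] = 1.0 - p[a] * p[b]              # NAND2: output is 0 only if both are 1
    return p


def transition_probability(pk):
    # Assumed form: probability that the node differs between two
    # independent consecutive cycles.
    return 2.0 * pk * (1.0 - pk)


def effective_capacitance(p, cap):
    """C_eff as the transition-probability-weighted sum of node capacitances."""
    return sum(cap[k] * transition_probability(p[k]) for k in cap)


def dynamic_power(c_eff, vdd, delay):
    # Assumed form P = C_eff * VDD^2 * f, with f = 1/D as on the slide.
    return c_eff * vdd ** 2 / delay


# Two-gate example: n1 = NAND(a, b), y = NAND(n1, c).
probs = signal_probabilities([("n1", "a", "b"), ("y", "n1", "c")], ["a", "b", "c"])
c_eff = effective_capacitance(probs, {"n1": 1e-15, "y": 2e-15})  # hypothetical farads
power = dynamic_power(c_eff, vdd=1.2, delay=1e-9)                # hypothetical D = 1 ns
```

The independence assumption ignores reconvergent fanout, so on real netlists these probabilities are approximations.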

13 Characterization of NAND2 Array
- OPC and lithographical simulations
  - Used the Calibre tool from Mentor Graphics
  - We used an optical model with λ = 193 nm
  - A constant threshold resist model was used
- We perform OPC on the poly and metal layers (referred to as M) of the placed and routed N2 design. The resulting layers are referred to as M_OPC
- Lithographical simulations are then performed on all layers in M_OPC to obtain the resulting layers M_SIM
- Error is the area of layer E_M, which is given by
  - E_M = XOR(M, M_SIM)
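The error metric E_M = XOR(M, M_SIM) can be illustrated on rasterized layers. This is only a sketch: the presentation computes the XOR on layout geometry with Calibre, whereas the function below (a hypothetical name, `xor_error_area`) works on equally sized 0/1 pixel grids and multiplies the number of mismatched pixels by an assumed pixel area.

```python
def xor_error_area(drawn, simulated, pixel_area=1.0):
    """Area of XOR(drawn, simulated) for two equally sized 0/1 grids."""
    if len(drawn) != len(simulated) or any(
        len(row_d) != len(row_s) for row_d, row_s in zip(drawn, simulated)
    ):
        raise ValueError("layers must have identical grid dimensions")
    mismatched = sum(
        d ^ s
        for row_d, row_s in zip(drawn, simulated)
        for d, s in zip(row_d, row_s)
    )
    return mismatched * pixel_area


# Drawn layer M: a vertical wire two pixels wide.
m = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
# Simulated layer M_SIM: line-end pullback at the top, a bulge at the bottom.
m_sim = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
]
error = xor_error_area(m, m_sim)
```

XOR counts both missing material (pullback) and extra material (bulges) as error, so the metric penalizes any deviation of the printed shape from the drawn shape.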

14 Experimental Results
- Designed NAND2 cell library L using 100 BPTM with VDD = 1.2V
- Also implemented standard cell library L_STD
  - L contains 1X, 2X, 3X and 4X NAND2 cells
  - L_STD consists of INV and NAND, NOR, AND & OR gates (with 2 and 3 inputs)
- Implemented several ISCAS and MCNC benchmark circuits using our approach and the ASIC approach
  - We mapped these designs for both area and delay optimality

15 Area, Delay and Power
- Average results for several circuits implemented using our NAND2 structured ASIC approach and the traditional ASIC approach
  - Detailed results in paper
[Table: ratio (NAND2/ASIC) of area, delay and power for area-mapped and delay-mapped designs; values not preserved in this transcript]

16 Lithography Simulation
- Ratio of lithographical error for poly and Metal1-4 layers for both approaches
- Errors on poly and Metal1 for our approach are lower than for the ASIC approach
  - Poly error translates into channel length variations
  - Sheet resistivity of Metal1 is higher than that of Metal2-4
  - Wires in these layers are largely restricted to within the cell alone
- Our approach uses more wiring on Metal2-4 due to an overall area increase, resulting in an increase in error on these layers
[Table: error ratios E_P, E_M1, E_M2, E_M3, E_M4 for area-mapped and delay-mapped designs; values not preserved in this transcript]

17 Conclusions
- With increasing cost of masks and process variations
  - Need to implement circuits using regular structures
- We presented a new structured ASIC approach
  - Implements circuits using a regular array of 2-input NAND gates
- Our approach has small overheads compared to the standard cell (ASIC) based design approach
  - Area - 12%
  - Delay - 40%
  - Power - 7%
- Lithographical errors of our approach are lower on poly and Metal1 layers by 7% and 24% compared to the ASIC approach
  - Our approach is lithography friendly

18 Thank You!!

19 Backup Slides

20 AREA

21 Delay

22 Power

23 Lithographical Error

24 Implementing Sequential Circuits
- A flip-flop can be implemented using NAND2 gates [figure not preserved in this transcript]
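Since the slide's figure is not available, the following is one possible NAND2-only flip-flop, not necessarily the circuit the authors showed. It is a sketch under stated assumptions: a positive-edge-triggered master-slave D flip-flop built from two gated D latches (four NAND2 cells each) plus one NAND2 used as a clock inverter, simulated with zero gate delay by iterating the cross-coupled pair until it settles. The class names `DLatch` and `DFlipFlop` are illustrative.

```python
def nand2(a: int, b: int) -> int:
    """2-input NAND on 0/1 values."""
    return 1 - (a & b)


class DLatch:
    """Gated D latch made of four NAND2 cells."""

    def __init__(self):
        self.q, self.qb = 0, 1   # assumed power-up state

    def step(self, d: int, en: int) -> int:
        s_n = nand2(d, en)       # active-low set
        r_n = nand2(s_n, en)     # active-low reset
        # Settle the cross-coupled NAND pair (a few passes suffice).
        for _ in range(4):
            q = nand2(s_n, self.qb)
            qb = nand2(r_n, q)
            if (q, qb) == (self.q, self.qb):
                break
            self.q, self.qb = q, qb
        return self.q


class DFlipFlop:
    """Master-slave D flip-flop: 4 + 4 + 1 = 9 NAND2 cells."""

    def __init__(self):
        self.master, self.slave = DLatch(), DLatch()

    def step(self, d: int, clk: int) -> int:
        clk_n = nand2(clk, clk)              # NAND2 used as an inverter
        m = self.master.step(d, clk_n)       # master is transparent while clk = 0
        return self.slave.step(m, clk)       # slave is transparent while clk = 1


# Drive a short (d, clk) sequence and record Q after each step.
ff = DFlipFlop()
outputs = [ff.step(d, clk) for d, clk in [(1, 0), (1, 1), (0, 1), (0, 0), (0, 1)]]
```

In this sequence Q changes only on the steps where the clock rises, and a change of D while the clock is high is ignored, which is the edge-triggered behavior expected of a flip-flop.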

25 Circuit Mapping to NAND2 Array
- Library L: 1X, 2X, 3X and 4X NAND2 cells
  - 2X, 3X and 4X NAND2 cells are implemented by connecting 2, 3 and 4 NAND2 cells in parallel
- Circuit mapping (performed in SIS):
  1. Start from combinational circuit N in blif format
  2. Technology-independent optimization of N, giving N*
  3. Map N* with L for area and delay, giving N1
  4. Replace all 2X, 3X and 4X NAND2 cells by 2, 3 and 4 1X NAND2 cells, giving N2
  - Result: mapped circuit N2 using only 1X NAND2 cells