Yuxi Liu The Chinese University of Hong Kong Circuit Timing Problem Driven Optimization.

Presentation transcript:

Yuxi Liu The Chinese University of Hong Kong Circuit Timing Problem Driven Optimization

Outline
- Timing problem
- Background
- Optimization for hardware cost
- Optimization for high performance

Reliability Problem
- Ever-increasing uncertainties with technology scaling
  - Static and dynamic variations
  - Environmental fluctuations
  - Less effective manufacturing test
- Transistor feature size is continuously shrinking
  - 22nm transistors by 2011, with 14nm to follow
- The circuit reliability problem becomes more severe!

Timing Error Problem
- Timing speculation
  - Allow infrequent timing errors
  - Online detection and recovery
  - Better throughput, higher energy efficiency
  - Representative technique: Razor (Ernst et al. [MICRO 2003])
- Conventional solution: embed a design guardband
  - Guarantees timing correctness
  - Diminishes the benefits of scaling
- Circuits are more vulnerable to timing problems
  - Delay uncertainty

Razor FF & Timing Speculation
[Figure: Razor FF with a main flip-flop clocked by clk, a shadow latch clocked by clk_del, a 0/1 input multiplexer, and an error comparator driving Error_L; data input D1, output Q1]
- Double-latch the input data
- Detect an error when the latched data disagree
- Correct when an error is detected: flush and replay
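The double-latching idea on this slide can be illustrated with a small behavioral sketch. It is not the Razor circuit itself: the function name `razor_capture`, the `RazorSample` record, and the single-transition data model are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RazorSample:
    main_q: int    # value captured by the main flip-flop at the clk edge
    shadow_q: int  # value captured by the shadow latch at the clk_del edge
    error: bool    # True when the two captured values disagree


def razor_capture(old_value: int, new_value: int, arrival: float,
                  t_clk: float, t_delay: float) -> RazorSample:
    """Model one capture of a Razor flip-flop (hypothetical simplification).

    The data input switches from old_value to new_value at time `arrival`,
    measured from the launching clock edge. The main flip-flop samples at
    t_clk; the shadow latch samples at t_clk + t_delay.
    """
    main_q = new_value if arrival <= t_clk else old_value
    shadow_q = new_value if arrival <= t_clk + t_delay else old_value
    # The comparator raises the error signal when the two samples differ.
    return RazorSample(main_q, shadow_q, main_q != shadow_q)
```

In this model, data arriving after clk but before clk_del is caught by the comparator, while data arriving after clk_del goes undetected, which is why the detection window has to cover the longest path delay that can occur.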

Cost for Timing Speculation
[Figure: timing diagram of clock and clock_del, marking t_delay, t_hold, the minimum path delay, the intended path and the short path]
- Short-path constraint: min. path delay > t_delay + t_hold
- Hardware cost
- Throughput loss from the error rate
  - Clock cycle T, error penalty r, probability of correct operation P
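The two costs named on this slide can be sketched numerically. The short-path check follows the stated constraint directly. The throughput formula is not given in the transcript, so `effective_throughput` below is an assumed model: every operation takes one cycle of length T, and a fraction 1-P of operations pays r extra recovery cycles. Both function names are hypothetical.

```python
def short_path_shortfall(min_path_delays, t_delay, t_hold):
    """Return {path: shortfall} for paths violating
    min path delay > t_delay + t_hold.

    The padding added to a violating path must exceed its shortfall.
    """
    bound = t_delay + t_hold
    return {name: bound - d for name, d in min_path_delays.items() if d <= bound}


def effective_throughput(T, r, P):
    """Operations per unit time under the ASSUMED penalty model.

    T: clock cycle, r: error penalty in cycles (assumption),
    P: probability that an operation completes without a timing error.
    """
    return 1.0 / (T * (1.0 + (1.0 - P) * r))
```

Under this model, a path whose delay equals t_delay + t_hold still counts as a violation, because the slide states a strict inequality, and throughput falls back to 1/T when P = 1 or r = 0.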

Optimization for Hardware Cost
- The linear programming approach is not scalable
- Heuristic method
  - Focus on suspicious FFs and the related FFs
- Traditional retiming: reduce the maximum path delay
- Our target: reduce the number of critical paths
  - Reduce the number of SFFs
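As a rough illustration of the optimization target, the sketch below counts critical paths and the flip-flops they end at. It assumes that SFF stands for suspicious FF, that a path is critical when its delay exceeds a chosen threshold, and that an FF is suspicious when at least one critical path ends at it; none of these definitions are spelled out in the transcript, and `suspicious_ffs` is a hypothetical helper.

```python
def suspicious_ffs(path_delays, threshold):
    """Count critical paths and collect the FFs they terminate at.

    path_delays: iterable of (endpoint_ff, delay) pairs.
    threshold:   delay above which a path is treated as critical (assumption).
    Returns (number_of_critical_paths, set_of_suspicious_ffs).
    """
    critical = [(ff, delay) for ff, delay in path_delays if delay > threshold]
    return len(critical), {ff for ff, _ in critical}
```

For example, with paths `[("ff1", 9), ("ff1", 11), ("ff2", 12), ("ff3", 5)]` and a threshold of 10, two paths are critical and the suspicious set is `{"ff1", "ff2"}`. A retiming move that pulls the `ff2` path under the threshold would remove one FF from the set, which is the kind of reduction the slide targets.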

Optimization for Throughput
- Throughput is determined by the timing error rate
  - Reducing the error rate 1-P can increase throughput
- Reduce the error rate
  - Reduce the sensitization probability of critical paths
  - Shorten critical paths with high sensitization probability
- Consideration during logic synthesis is promising
  - The circuit structure is expected to change
  - More flexibility compared with post-synthesis methods
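To see why both levers on this slide lower the error rate, consider a minimal sketch that assumes a timing error occurs whenever at least one critical path is sensitized and that paths are sensitized independently. Both assumptions, and the function name `error_rate`, are illustrative and not taken from the presentation.

```python
import math


def error_rate(sensitization_probs):
    """Estimate the error rate 1-P from per-path sensitization probabilities.

    ASSUMPTION: critical paths are sensitized independently, and any
    sensitized critical path causes a timing error.
    """
    no_error = math.prod(1.0 - p for p in sensitization_probs)
    return 1.0 - no_error
```

With two critical paths each sensitized half of the time, `error_rate([0.5, 0.5])` gives 0.75. Shortening one of them so that it is no longer critical removes it from the list and leaves `error_rate([0.5])`, which is 0.5; lowering a path's sensitization probability has a similar effect.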

Timing Optimization
- Traditional timing optimization
  - Minimize the worst-case path delay
  - Balance the delay among different paths
- Optimization for timing error probability
  - Aware of error rate information during the process
  - Optimization is performed on each super-gate
  - Associative transform: a(bc) = (ab)c = (ac)b
  - Reorder all fanins of the super-gates

Timing Optimization
[Figure: super-gate with fanins a, b and c]
- Optimization is performed at every super-gate
  - Does not balance the delay of different paths
  - Place fanins with more critical paths closer to the output
  - Shortens all critical paths through such a fanin
  - Lengthens paths that are less critical
- Consideration for technology mapping
  - More accurate timing information
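The fanin reordering can be sketched as follows, assuming the super-gate is decomposed into a chain of 2-input gates and that each fanin carries a single criticality score. The chain decomposition, the scores, and all function names are assumptions made for illustration.

```python
from functools import reduce


def gates_to_output(n_fanins):
    """Gate levels each fanin position crosses in a chain of 2-input gates.

    Positions 0 and 1 feed the first gate; the last position feeds the
    gate that drives the super-gate output.
    """
    return [n_fanins - 1] + [n_fanins - i for i in range(1, n_fanins)]


def reorder_fanins(criticality):
    """Order fanin names least critical first, so that the most critical
    fanin lands closest to the output (hypothetical scoring)."""
    return sorted(criticality, key=criticality.get)


def chain_and(values):
    """Evaluate a chain of 2-input AND gates over the given fanin values."""
    return reduce(lambda x, y: x & y, values)
```

For three fanins, `gates_to_output(3)` is `[2, 2, 1]`, so the last position crosses one gate instead of two. Calling `reorder_fanins({"a": 0.1, "b": 0.7, "c": 0.2})` returns `["a", "c", "b"]`, which corresponds to the (ac)b form of the associative transform: paths through b get shorter while the less critical paths through a and c get longer. Because AND is associative and commutative, `chain_and` returns the same value for any ordering of the same inputs.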

Thank you very much!