-1- Statistical Analysis and Modeling for Error Composition in Approximate Computation Circuits 1 123 Wei-Ting Jonas Chan 1, Andrew B. Kahng 1, Seokhyeong.

Presentation transcript:

-1- Statistical Analysis and Modeling for Error Composition in Approximate Computation Circuits Wei-Ting Jonas Chan 1, Andrew B. Kahng 1, Seokhyeong Kang 1, Rakesh Kumar 2, and John Sartori 3 1 VLSI CAD LABORATORY, UC San Diego 2 PASSAT GROUP, Univ. of Illinois 3 Univ. of Minnesota

-2- Why Approximate Computation?
Threats to traditional IC design approach:
- Extreme variations: PVT variation uncertainty leads to design overhead
- Reliability issues: hard errors (NBTI, latchup), soft errors (α-particle)
- Cost: cost (power/performance) of perfect accuracy is too high!
Approximate computation: relaxing the requirement of correctness can dramatically reduce costs of the design
What is the square root of 10? “a little more than three” “ ”
Approximation could be faster and more powerful

-3- Reduce Design Cost with Approximations
- Approach: insert approximate hardware modules on critical paths
- Simplified critical paths, but with errors
[Figure: accurate hardware vs. approximate hardware]
What is the output quality of this circuit?

-4- Building Blocks: Approximate Hardware Modules (Zhu et al., TVLSI 2010)
ETAI: accurate part + inaccurate part
- Reduces error size
- Error rate is high
ETAIIM: limited carry-chain run-length
- Extra protection hardware
- Reduces error rate and significance

-5- Result Quality Estimation of Approximate Computation
Image smoothing (addition operations executed by different approximate adders):
(a) Original image (b) Accurate adder (c) ACA (d) ETAI (e) ETAII (f) LU
(c)-(f) have 50% of the power of the accurate adder (b), BUT...
How can system designers estimate result quality metrics for circuits containing approximate adders?

-6- Problem: Result Quality Estimation
Goal: quantify degradation of result accuracy after approximate hardware modules are inserted
Given:
- Input statistical properties
- Hardware configurations
- Topologies of circuits
Output: estimated error metrics
[Figure: arithmetic hardware replacement; accurate hardware gives correct results, approximate hardware gives approximate results]
Question addressed by this work: how to compose errors at circuit level?

-7- Outline
- Related Work
- Problem Modeling and Proposed Approaches
- Results and Conclusions

-8- Related Work
Categories of related work:
- Category: Gate level; Rounding; Approximate Arithmetic; VDD scaling
- Manipulated Elements: Logic cell; Arithmetic; Multiple Levels
- Error Source: Appx. HW; Rounding; Appx. HW; Over-scaled VDD
- Probabilistic Errors: N; N; N; Y
[HuangLR12]:
- Intensively characterizes error distributions over different intervals
- Propagates distributions with interval arithmetic
Our work:
- Propagates error metrics
- Improves estimation accuracy and runtime

-9- Related Work [HuangLR12]
- Intensively characterizes error distributions over different intervals
- Single intervals represent multiple values in log scale → quantization inaccuracy
- If the inputs are out of range, there will be extra inaccuracy
[Figure: PDF and PMF of positive and negative errors; axis: abs(log(Probability))]

-10- Related Work [HuangLR12]
- Source of estimation inaccuracy: quantization errors from interval representation
- Accuracy does not scale with characterization runtime
- For better accuracy, an alternative approach is required

-11- Error Metrics for Quality Estimation
- Error rate (ER): measures the frequency of error occurrences
- Error significance (ES): measures the magnitude of errors
- Average relative error significance (ARES): measures the ratio between error magnitude and signal magnitude
- Mean square error (MSE): common metric in signal processing
- Signal to Noise Ratio (SNR): common metric for quality of image processing
- Max error (MAXE): measures the upper bound of errors
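The slide names the metrics but does not give their formulas. The sketch below computes them from paired exact and approximate outputs under assumed, commonly used definitions: ES as mean absolute error, ARES as mean of |error| / |exact value|, and SNR as a power ratio in dB. The function name `error_metrics` and these definitions are illustrative assumptions, not taken from the presentation.

```python
import math

def error_metrics(exact, approx):
    """Illustrative error metrics for paired exact/approximate outputs.

    The formulas are assumed, commonly used definitions, not the authors'.
    """
    if not exact or len(exact) != len(approx):
        raise ValueError("need two non-empty sequences of equal length")
    errors = [a - e for e, a in zip(exact, approx)]
    abs_err = [abs(d) for d in errors]
    n = len(errors)

    er = sum(1 for d in errors if d != 0) / n          # fraction of wrong outputs
    es = sum(abs_err) / n                              # mean error magnitude
    # Relative error; samples whose exact value is zero are skipped.
    rel = [ad / abs(e) for ad, e in zip(abs_err, exact) if e != 0]
    ares = sum(rel) / len(rel) if rel else 0.0
    mse = sum(d * d for d in errors) / n
    signal_power = sum(e * e for e in exact) / n
    if mse == 0:
        snr_db = math.inf                              # no noise at all
    elif signal_power == 0:
        snr_db = -math.inf                             # noise but no signal
    else:
        snr_db = 10 * math.log10(signal_power / mse)
    maxe = max(abs_err)                                # largest observed error
    return {"ER": er, "ES": es, "ARES": ares,
            "MSE": mse, "SNR_dB": snr_db, "MAXE": maxe}

# Two of the four sample outputs are wrong (by -2 and +4).
metrics = error_metrics([10, 20, 30, 40], [10, 18, 30, 44])
```

Here MAXE is the largest error observed in the sample, which only approximates a true upper bound when the sample covers the worst-case inputs.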

-12- Outline
- Related Work
- Problem Modeling and Proposed Approaches
- Results and Conclusions

-13- Our Quality Estimation Approach
Stage 1: Hardware characterization
- Pre-characterized STD tables
- Pre-characterized EM_in tables
Stage 2: Composition of EMs
- Traverse the design to propagate statistical property
- Look up EM_in in pre-characterized library
- Compute EM at output by propagations
Passed between stages: statistical property, information of EMs
STD: standard deviation; EM_in: intrinsic error metric

-14- Our Quality Estimation Approach
Stage 1: Hardware characterization
- Pre-characterized STD tables
- Pre-characterized EM_in tables
Stage 2: Composition of EMs
- Traverse the design to propagate statistical property
- Look up EM_in in pre-characterized library
- Compute EM at output by propagations
Passed between stages: statistical property, information of EMs

-15- Hardware Characterization: Observation #1
Observation #1: EMs of approximate hardware depend on input patterns
- Input patterns decide whether carry chain will lose bits
- EM_in = f(k, STD_A, STD_B)
[Figure: ETAIIM adder built from CLA and RCA blocks; labels: ‘0’, k guard blocks for MSB, MSB, {A, B}]
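To make the dependence EM_in = f(k, STD_A, STD_B) concrete, the sketch below characterizes the error rate of a deliberately simplified limited-carry adder by simulation. This is not the ETAIIM design from Zhu et al.: the block structure, the rule that the carry into each block is speculated from only the k blocks below it, the use of unsigned magnitudes of zero-mean Gaussian samples, and the names `limited_carry_add` and `characterize_er` are all assumptions made for illustration.

```python
import random

def limited_carry_add(a, b, width=16, block=4, k=1):
    """Add two unsigned ints; the carry into each block sees only k lower blocks.

    Hypothetical stand-in for a limited carry-chain adder, not ETAIIM itself.
    """
    mask = (1 << block) - 1
    result = 0
    for i in range(width // block):
        lo = max(0, i - k)                      # lowest block the carry logic sees
        span_bits = (i - lo) * block
        span_mask = (1 << span_bits) - 1
        a_span = (a >> (lo * block)) & span_mask
        b_span = (b >> (lo * block)) & span_mask
        carry = (a_span + b_span) >> span_bits  # speculated carry-in (0 or 1)
        s = ((a >> (i * block)) & mask) + ((b >> (i * block)) & mask) + carry
        result |= (s & mask) << (i * block)
    return result

def characterize_er(std, k, samples=2000, width=16, block=4, seed=0):
    """Estimate the intrinsic error rate for inputs drawn with a given STD."""
    rng = random.Random(seed)
    limit = (1 << width) - 1
    wrong = 0
    for _ in range(samples):
        # Assumption: use magnitudes of zero-mean normal samples, clipped to width.
        a = min(limit, abs(round(rng.gauss(0, std))))
        b = min(limit, abs(round(rng.gauss(0, std))))
        exact = (a + b) & limit
        if limited_carry_add(a, b, width, block, k) != exact:
            wrong += 1
    return wrong / samples

# One column of a characterization table: ER versus k at a fixed input STD.
er_by_k = {k: characterize_er(std=2000.0, k=k) for k in (1, 2, 3)}
```

In this toy model a wider carry window can only remove errors, so the error rate is non-increasing in k, and with k = 3 a 16-bit adder made of 4-bit blocks is exact.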

-16- Hardware Characterization: Observation #2
Observation #2: EMs in ETAIIM-type adders depend on input distribution and hardware configuration
- k = number of guard blocks to mitigate errors
[Figure: Log(ES) vs. input STDs and ER vs. input STDs, for k = 1, 2, 3, 4]

-17- Hardware Characterization: Our Solution
Generate lookup tables (libraries) to store pre-characterized EMs:
- STD_Z tables: STD_A, STD_B, hardware configurations
- EM_in tables: STD_A, STD_B, hardware configurations
[Figure: EMs vs. input STDs]
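A pre-characterized table can be queried between grid points by interpolation. The sketch below uses plain bilinear interpolation over a (STD_A, STD_B) grid for one hardware configuration. The grid values, the function name `interpolate_2d`, and the choice of bilinear interpolation with clamping at the grid edges are assumptions for illustration; the presentation does not specify the interpolation scheme.

```python
from bisect import bisect_right

def interpolate_2d(xs, ys, table, x, y):
    """Bilinear interpolation on a grid; queries are clamped to the grid range.

    xs and ys must be strictly increasing with at least two points each;
    table[i][j] holds the value at (xs[i], ys[j]).
    """
    def locate(axis, v):
        v = min(max(v, axis[0]), axis[-1])                  # clamp to the table
        hi = min(max(bisect_right(axis, v), 1), len(axis) - 1)
        lo = hi - 1
        t = (v - axis[lo]) / (axis[hi] - axis[lo])          # position in the cell
        return lo, hi, t

    i0, i1, tx = locate(xs, x)
    j0, j1, ty = locate(ys, y)
    at_x0 = table[i0][j0] * (1 - ty) + table[i0][j1] * ty
    at_x1 = table[i1][j0] * (1 - ty) + table[i1][j1] * ty
    return at_x0 * (1 - tx) + at_x1 * tx

# Hypothetical ER table for one configuration (values are made up).
std_a_axis = [100.0, 200.0]
std_b_axis = [100.0, 200.0]
er_table = [[0.1, 0.2],
            [0.3, 0.4]]

er_mid = interpolate_2d(std_a_axis, std_b_axis, er_table, 150.0, 150.0)
```

A full library would hold one such table per error metric and per hardware configuration k; metrics that span several orders of magnitude might be better interpolated on a log scale.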

-18- Our Quality Estimation Approach
Stage 1: Hardware characterization
- Pre-characterized STD tables
- Pre-characterized EM_in tables
Stage 2: Composition of EMs
- Traverse the design to propagate statistical property
- Look up EM_in in pre-characterized library
- Compute EM at output by propagations
Passed between stages: statistical property, information of EMs

-19- Composition of EMs: Error Propagation
Key issue: enable error propagation in circuit topology
- EM_in: EM generated by approximate hardware
- {STD_{A,B}, EM_{A,B}}: propagated standard deviations / EMs from previous stages
- {EM_Z, STD_Z}: EMs and STDs at output nodes
[Figure: tree of approximate adders (+*) with inputs {STD_A, EM_A} and {STD_B, EM_B}, intrinsic EM_in, and output {STD_Z, EM_Z}]

-20- Composition of EMs: Observation
Observation: EM (e.g., rate, magnitude) at a node depends on both intrinsic and propagated EMs
- Pass rate: ER_Z = 1 − (1 − ER_in) · (1 − ER_A) · (1 − ER_B)
- ES_Z = ES_in + ES_A + ES_B (assume no cancellations between all error sources)
[Figure: approximate adders (+*) combining ER_A, ER_B, ER_in into ER_Z and ES_A, ES_B, ES_in into ES_Z]
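One way to read the two composition rules is sketched below. The derivation assumes that the three error events (intrinsic, from input A, from input B) are statistically independent and that any error reaching the adder makes the output wrong. It also assumes ES is a mean absolute error, with e_in, e_A, e_B denoting the signed error contributions. These assumptions are not stated on the slide.

```latex
\begin{align*}
\Pr[\text{output } Z \text{ correct}] &= (1 - ER_{in})\,(1 - ER_A)\,(1 - ER_B) \\
ER_Z &= 1 - (1 - ER_{in})\,(1 - ER_A)\,(1 - ER_B) \\[4pt]
|e_Z| = |e_{in} + e_A + e_B| &\le |e_{in}| + |e_A| + |e_B| \\
\Rightarrow\quad ES_Z &\le ES_{in} + ES_A + ES_B
\end{align*}
```

Equality in the last line holds when the contributions never cancel, which is the slide's "no cancellations" assumption. As a worked example with made-up values ER_in = 0.10, ER_A = 0.20, ER_B = 0.05: ER_Z = 1 − 0.90 · 0.80 · 0.95 = 1 − 0.684 = 0.316.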

-21- Composition of EMs: Our Method
Our method:
- Traverse the circuit and propagate STDs in its topology
- EMs are looked up in the pre-characterized libraries
Example: Function = ((A+B)+(C+D))+(E+F), with inputs A, B, C, D, E, F
For each node, EMs are propagated as follows:
- ER_Z = 1 − (1 − ER_in) · (1 − ER_A) · (1 − ER_B)
- EM_Z = EM_in + EM_A + EM_B (for EMs other than ER)
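The traversal can be sketched as a recursive walk over the expression tree of the example function. Several parts are stand-ins: `lookup_em_in` returns constant made-up values instead of querying a pre-characterized library, and the output STD is computed as sqrt(STD_A² + STD_B²), which holds for exact addition of independent inputs and replaces the pre-characterized STD tables of the presentation. All names are hypothetical.

```python
import math

def lookup_em_in(std_a, std_b):
    """Hypothetical stand-in for the pre-characterized EM_in tables."""
    return {"er": 0.1, "es": 2.0}   # made-up constants for illustration

def propagate(node, input_std):
    """Return (STD, ER, ES) at the output of `node`.

    A node is either the name of a primary input (assumed error-free)
    or a pair (left, right) representing one approximate adder.
    """
    if isinstance(node, str):
        return input_std[node], 0.0, 0.0
    left, right = node
    std_a, er_a, es_a = propagate(left, input_std)
    std_b, er_b, es_b = propagate(right, input_std)
    em_in = lookup_em_in(std_a, std_b)
    # Assumption: STD of a sum of independent inputs, in place of STD tables.
    std_z = math.hypot(std_a, std_b)
    # Composition rules from the slide.
    er_z = 1 - (1 - em_in["er"]) * (1 - er_a) * (1 - er_b)
    es_z = em_in["es"] + es_a + es_b
    return std_z, er_z, es_z

# Function = ((A+B)+(C+D))+(E+F)
tree = ((("A", "B"), ("C", "D")), ("E", "F"))
stds = {name: 1.0 for name in "ABCDEF"}
std_out, er_out, es_out = propagate(tree, stds)
```

With a constant intrinsic error rate p per adder, the five adders compose to ER = 1 − (1 − p)^5 at the root, and the ES contributions simply add up.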

-22- Outline
- Related Work
- Problem Modeling and Proposed Approaches
- Results and Conclusions

-23- Results: Table-Lookup Approach
- Testcase: 5-node adder tree
- Input distributions: zero-mean normal distributions with different STDs
- Different configurations of ETAIIMs
- Compared with Monte Carlo simulation
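The comparison against Monte Carlo simulation can be illustrated with a toy error-injection model; it is not the authors' experiment. Each of the five adders is assumed to inject an error of fixed magnitude and random sign with probability `p_err`. Because an adder tree is linear, the output error is taken as the sum of the injected errors. The function names and the model itself are assumptions.

```python
import random

def monte_carlo_tree(p_err, magnitude, n_adders=5, samples=20000, seed=1):
    """Simulate output ER and ES of an adder tree under random error injection."""
    rng = random.Random(seed)
    wrong = 0
    abs_sum = 0
    for _ in range(samples):
        total = 0
        for _ in range(n_adders):
            if rng.random() < p_err:
                # Random sign, so errors from different adders may cancel.
                total += rng.choice((-magnitude, magnitude))
        if total != 0:
            wrong += 1
        abs_sum += abs(total)
    return wrong / samples, abs_sum / samples

def composed_estimate(p_err, magnitude, n_adders=5):
    """Closed-form composition with per-adder ER = p_err, ES = p_err * magnitude."""
    er = 1 - (1 - p_err) ** n_adders
    es = n_adders * p_err * magnitude
    return er, es

er_mc, es_mc = monte_carlo_tree(p_err=0.2, magnitude=4)
er_est, es_est = composed_estimate(p_err=0.2, magnitude=4)
```

In this toy setting the composed estimates are pessimistic for both ER and ES, because opposite-sign errors cancel in simulation while the composition rules assume they never do.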

-24- Experimental Results: FIR Filter
Approximate FIR:
- Adders are approximate
- Multipliers are accurate
[Figure: FIR filter with coefficients C1 = 0.1, C2 = 0.2, C3 = 0.3, C4 = 0.4 and nets NET1 to NET11]
Estimation inaccuracies (%) at each node for different error metrics:
Net    Type      ER     ES     ARES    MSE    SNR     MAXE
NET9   ETAIIMIN  0.3%   6.4%   17.0%   6.4%   19.1%   0.0%
NET10  ETAIIMIN  1.3%   2.6%   61.9%   3.3%   10.7%   0.0%
NET11  ETAIIMIN  1.0%   6.3%   419.6%  6.2%   6.1%    0.0%
NET11  ETAIIMP13.4%     5.8%   692.3%  5.8%   436.4%  0.7%

-25- Experimental Results: MAC
Approximate MAC (multiply-accumulate):
- Adders are approximate
- Multipliers are accurate
- 14 levels of MAC are tested
- 20 testcases for each number of levels
[Figure: MAC chain with inputs C0, A0, A1, C1, ..., C_i, A_i across level 1 to level i, producing Output]

-26- MAC: Comparison with HuangLR12
- [HuangLR12]: relative inaccuracy = 10^9 beyond the lower bound of characterization
- Our method interpolates continuously changing EM in lookup table
[Figure: ES and ER comparison]

-27- MAC: Speedup and Accuracy Improvement
- Speedup = 8.4x
- Accuracy improvement = 3.75x
- Faster runtime → allows the designer to evaluate more design combinations
- Better accuracy → reduces the iterations due to misprediction

-28- Conclusions
- We propose an approach for output quality estimation of approximate designs
- Our approach achieves 8.4× runtime improvement for error composition and 3.75× average accuracy improvement for ES compared to previous (DAC-2012) work of Huang et al.
- We demonstrate results on FIR filter and MAC circuits with up to 30 nodes

-29- Future Work
- Improve accuracy of EM estimation for relative error metrics (e.g., ARES and SNR)
- Extend our approach to other approximate modules, including multipliers
- Develop a synthesis flow for approximate circuits using our EM analysis approach
- Generalize our approach to arbitrary input distributions

-30- Thank You!

-31- Backup Slides

-32- Experiment and Results
Approximate circuit: randomly generated circuits
- Netlists are randomly generated with accurate multipliers and different ETAIIM approximate adders

-33- Regression Study of EM Composition
- We also tried to generalize our propagation model with parameter regression
- General form of error propagation models:
- Simulated EM results from different hardware configurations and input distributions/EMs are used for regression
- Parameters in the models are fitted with simulation data

-34- Regression Study of EM Composition
Results of parameter regression
[Table: regression parameters and estimation inaccuracy (w/o Reg., with Reg.) for ER, ES, ARES, MSE, SNR, MAXE]
