Optimization of Data-Flow Computations Using Canonical TED Representation (Muhammad Noman Ashraf, Electrical and Computer Engineering)

Presentation transcript:

Optimization of Data-Flow Computations Using Canonical TED Representation
Muhammad Noman Ashraf, Electrical and Computer Engineering
ECE 667 Synthesis and Verification of Digital Systems, Spring 2011
M. Ciesielski, D. Gomez-Prado, Q. Ren, J. Guillot, and Emmanuel Boutillon, “Optimization of Data-Flow Computations Using Canonical TED Representation,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Slides adapted from D. Gomez-Prado, Q. Ren, M. Ciesielski, J. Guillot, and Emmanuel Boutillon, “Optimizing Data Flow Graphs to Minimize Hardware Implementations,” DATE (2009)

2 Overview
- Motivation
- TED Review
- Related Work
- TED Decomposition System
- TED Linearization
- Product-Term Extraction
- Sum-Term Extraction
- Reordering
- DFG Generation
- Replacing constant multipliers by shifters
- Conclusion
- References

3 Motivation
- F = a⋅(f⋅(g + d⋅c) + c⋅e⋅g): minimum number of operations, 5 MPY, 2 ADD
- F = a⋅f⋅g + a⋅f⋅d⋅c + a⋅c⋅e⋅g: 8 MPY, 2 ADD
- F = (a⋅f)⋅(g + d⋅c) + (a⋅c)⋅e⋅g: 6 MPY, 2 ADD
- DFG figure annotations: Res: 2 MPY, 1 ADD; L = 3 MPY + 1 ADD; L = 3 MPY + 2 ADD
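The operation counts on this slide can be reproduced mechanically. The following is a minimal sketch, not part of TDS: it parses each form with Python's standard `ast` module and counts `*` and `+` operators. The helper name `count_ops` is an assumption introduced for illustration.

```python
import ast

def count_ops(expr):
    """Return (multiplications, additions) found in an infix expression."""
    tree = ast.parse(expr, mode="eval")
    mpy = add = 0
    for node in ast.walk(tree):
        if isinstance(node, ast.BinOp):
            if isinstance(node.op, ast.Mult):
                mpy += 1
            elif isinstance(node.op, ast.Add):
                add += 1
    return mpy, add

forms = {
    "factored": "a*(f*(g+d*c)+c*e*g)",
    "expanded": "a*f*g+a*f*d*c+a*c*e*g",
    "regrouped": "(a*f)*(g+d*c)+(a*c)*e*g",
}
for name, expr in forms.items():
    print(name, count_ops(expr))
```

The counts treat every binary operator as one operation and do not share common subexpressions, which matches how the slide tallies MPY and ADD.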

4 TED Review [Construction]
- [Figure: TED construction; sub-expression labels zu, qw, (zu+qw), x(zu+qw), pw, yw]
- Canonical for the given order: x, z, u, q, p, y, w
- Notation: the diagram contains a w^2 edge, so the TED is NON-LINEAR

5 Related Work
- HDL compilers and high-level synthesis systems (Cyber, Spark, Catapult C): lack local optimality
- Kernel-based decomposition [Hosangadi et al., “Optimizing Polynomial Expressions by Algebraic Factorization and CSE,” IEEE Transactions, 2005]: lacks canonicity
- Cut-based decomposition (TED-based) [Askar et al., “Data-flow transformations using Taylor expansion diagrams,” in Proc. Des. Autom. Test Eur., 2007]: limitation, only applicable to TEDs with the disjoint decomposition property

6 Cut-based decomposition (Related Work)
- Top-down approach
- Apply a series of cuts (additive and multiplicative) to the edges of the TED such that each cut separates it into two disjoint sub-graphs
- Different sequences of cuts result in different DFGs
- Sequence: A3, A1, M1, A2

7 Cut-based decomposition (Related Work), continued
- Two cut sequences shown side by side: A1, A3, M1, A2 and A3, A1, M1, A2

8 TED decomposition [TDS]
- The cut-based decomposition mentioned earlier only works for TEDs with the disjoint decomposition property
  - Many TEDs do not have this property
- New approach: bottom-up
  - Identify algebraic operations and extract them from the graph
  - Also works for TEDs without the disjoint decomposition property
  - TED-based factorization, CSE, and decomposition are jointly referred to as TED decomposition
- Systematically involves:
  - Linearization
  - Product-term extraction
  - Sum-term extraction
  - Reordering
  - DFG generation

9 TDS System Overview
[Figure: TDS flow alongside the HLS flow. Labels recovered from the diagram:]
- Inputs: matrix transforms, polynomials; C, behavioral HDL
- DFG extraction; original DFG
- Functional TED; structural elements; structural DFG
- TED-based transformations: TED linearization, variable ordering, TED factorization & decomposition, constant multiplication & shifter generation, common subexpression elimination (CSE)
- DFG-based transformations: static timing analysis, latency optimization, resource constraints
- Behavioral transformations; design objectives; design constraints
- Optimized DFG; TDS netlist
- HLS flow: High Level Synthesis (GAUT); RTL VHDL

10 TED Linearization
- A TED naturally represents a polynomial in its factored form
- This efficiency is lost for non-linear expressions
- Example: in F = a^2·c + a·b·c, a could be factored out
- Split a^2 into a1 and a2: F = a1·(a2 + b)·c
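As a quick sanity check of the example above, the sketch below evaluates the original and the linearized form on a small grid of integers, with a1 and a2 both bound to a. The function names are illustrative assumptions, not TDS identifiers.

```python
from itertools import product

def original(a, b, c):
    # F = a^2*c + a*b*c as written: 4 multiplications, 1 addition
    return a * a * c + a * b * c

def linearized(a1, a2, b, c):
    # F = a1*(a2 + b)*c after splitting a^2: 2 multiplications, 1 addition
    return a1 * (a2 + b) * c

def forms_agree(values=range(-3, 4)):
    """Check both forms on every (a, b, c) triple, with a1 = a2 = a."""
    return all(
        original(a, b, c) == linearized(a, a, b, c)
        for a, b, c in product(values, repeat=3)
    )

print(forms_agree())
```

Agreement on a finite grid is only a spot check; the identity itself follows from distributivity.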

11 TED Decomposition: TED Linearization [back to previous example]
- Split w^2 into w1 and w2

12 TED Linearization [Concept]
- [Figure: a node x with edges ^0, ^1, ..., ^n leading to F_0, F_1, ..., F_n is replaced by a chain of nodes x_1, x_2, ..., x_n, each with a ^0 edge and a ^1 edge, leading to F_0, F_1, ..., F_{n-1}, F_n]
- Split x^k = x_1·x_2·x_3·...·x_k, where x_i = x_j for all i, j
- Iteratively perform the splitting on high-order nodes
- The above substitution results in Horner form, which contains the minimum number of multiplications
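The Horner form mentioned above can be illustrated with a short sketch. It assumes the F_i are plain numeric coefficients (in a TED they are sub-functions), and the name `horner` is introduced here for illustration only.

```python
def horner(coeffs, x):
    """Evaluate F_0 + F_1*x + ... + F_n*x^n as F_0 + x*(F_1 + x*(... + x*F_n)).

    coeffs is [F_0, F_1, ..., F_n]; returns (value, multiplications used).
    """
    acc = coeffs[-1]
    mults = 0
    for c in reversed(coeffs[:-1]):
        acc = c + x * acc   # one multiplication per remaining coefficient
        mults += 1
    return acc, mults

value, mults = horner([1, 2, 3], 2)   # 1 + 2*x + 3*x^2 at x = 2
print(value, mults)
```

A degree-n polynomial evaluated this way uses n multiplications, compared with the repeated products needed when each power x^k is formed separately.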

13 Product Term Extraction
- Extractable product term: a product of variables that appears in the expression only once
  - Can be extracted from the TED without duplicating any of its variables
- In the TED: a set of nodes connected by a series of multiplicative edges
  - Only the starting and ending nodes can have incident additive edges
  - Starting and ending nodes can have more than one incoming or outgoing multiplicative edge
  - The ending node can be the terminal node 1
- [TDS] recursively identifies such terms by traversing the graph in a bottom-up fashion
  - For each node, use a depth-first approach for including nodes in the product term
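The idea can be sketched on a toy linearized TED. This is a simplified illustration under stated assumptions, not the TDS algorithm: each node is a tuple `(variable, multiplicative_child, additive_child)` representing `additive_child + variable * multiplicative_child`, edge weights are ignored, and a node may join a chain only if it has exactly one incoming edge and no outgoing additive edge. The names `ONE` and `extract_product_terms` are hypothetical.

```python
ONE = "1"   # terminal node 1 (assumed encoding)

def extract_product_terms(nodes):
    """nodes: {name: (var, mul_child, add_child)}; children are names, ONE, or None."""
    incoming = {name: 0 for name in nodes}
    for _var, mul, add in nodes.values():
        for child in (mul, add):
            if child in incoming:
                incoming[child] += 1

    def continues_chain(name):
        # Interior/ending node: single parent and no outgoing additive edge.
        return name in nodes and incoming[name] == 1 and nodes[name][2] is None

    # Nodes that extend a chain begun at their multiplicative parent.
    continuations = {mul for _v, mul, _a in nodes.values() if continues_chain(mul)}

    terms = []
    for name in nodes:
        if name in continuations:
            continue                      # not a chain start
        chain, cur = [nodes[name][0]], nodes[name][1]
        while continues_chain(cur):       # follow multiplicative edges only
            chain.append(nodes[cur][0])
            cur = nodes[cur][1]
        if len(chain) >= 2:
            terms.append(tuple(chain))
    return terms

# F = z*u + q*w with variable order z, u, q, w
ted = {
    "z": ("z", "u", "q"),
    "u": ("u", ONE, None),
    "q": ("q", "w", None),
    "w": ("w", ONE, None),
}
print(extract_product_terms(ted))
```

The sketch scans top-down for brevity, whereas the slide describes a bottom-up, depth-first traversal; on this small example both find the terms z·u and q·w.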

14 Product-Term Extraction [back to example]
- Start at u
  - u has only one * parent ... YES
  - u has only one child path ... YES: CONTINUE
- Move up to z
  - z has only one * parent ... YES
  - z has only one * child path ... NO: BACKTRACK
- [Figure labels: zu, P1, P2]

15 Sum Term Extraction
- Extractable sum term: a sum of variables that appears in the expression only once
  - Can be extracted from the TED without duplicating any of its variables
- “Set of nodes incident to multiplicative edges joined at a single common node, such that nodes in question are connected by a chain of additive edges only”
- [TDS] recursively identifies such terms by traversing the graph in a bottom-up fashion
  - For each node, make a list of incident nodes and extract the nodes from the list if they are connected by additive edges only
- [TDS] uses the associativity property of addition
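A matching sketch for sum terms, again a simplified illustration rather than the TDS implementation. It assumes the same toy encoding, `(variable, multiplicative_child, additive_child)`, ignores edge weights, and groups nodes that are linked by additive edges and whose multiplicative edges all reach the same common node. The names `ONE` and `extract_sum_terms` are hypothetical.

```python
ONE = "1"   # terminal node 1 (assumed encoding)

def extract_sum_terms(nodes):
    """nodes: {name: (var, mul_child, add_child)}; children are names, ONE, or None."""
    incoming = {name: 0 for name in nodes}
    for _var, mul, add in nodes.values():
        for child in (mul, add):
            if child in incoming:
                incoming[child] += 1

    def continues_chain(name, common):
        # Next node on the additive chain must have a single parent and
        # its multiplicative edge must join the same common node.
        return name in nodes and incoming[name] == 1 and nodes[name][1] == common

    continuations = {
        add for _v, mul, add in nodes.values() if continues_chain(add, mul)
    }

    terms = []
    for name in nodes:
        if name in continuations:
            continue                      # not a chain start
        var, common, cur = nodes[name]
        chain = [var]
        while continues_chain(cur, common):   # follow additive edges only
            chain.append(nodes[cur][0])
            cur = nodes[cur][2]
        if len(chain) >= 2:
            terms.append(tuple(chain))
    return terms

# F = (a + b)*c with variable order a, b, c
ted = {
    "a": ("a", "c", "b"),
    "b": ("b", "c", None),
    "c": ("c", ONE, None),
}
print(extract_sum_terms(ted))
```

Here a and b both multiply into the common node c and are joined by an additive edge, so a + b is reported as one sum term.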

16 Sum-Term Extraction [back to example]
- [Figure labels: start; S1; keep support (irreducible)]


18 Example to illustrate associativity*
- [Figure labels: S1 = b + d, S2 = a + c]

19 Sum-Term Extraction [cont., back to example]
- If sum-term extraction results in more product terms, go back to product-term extraction
- Stop when the TED is irreducible
- Then generate the DFG (to be explained later)

20 Reordering [back to previous example -> iteration 2 extraction]
- [Figure labels: P3, P4, P5, S2, S3]
- Stop when the TED is irreducible

21 Normal Factored Form*
F = S3 = P5 + P4
  = x·S2 + w1·S1
  = x·(P1 + P3) + w1·(P2 + y)
  = x·(z·u + q·w1) + w1·(p·w2 + y)
  = x·(z·u + q·w) + w·(p·w + y)
Total: 5 MPY, 3 ADD
- The factored form associated with a TED is called the NFF for that TED if the order of variables in the factored form is compatible with the order in the given TED
- Theorem: the NFF derived from a linear TED is unique (canonical)

22 DFG Generation and Optimization
- Transform each irreducible TED into a simple DFG
  - Additive edge -> addition operation
  - Multiplicative edge -> multiplication operation
  - Break multiple-operand operations into a chain of operations
- [TDS] maintains a hash table of DFG nodes keyed by the corresponding function
  - Helps in reusing a node if the same function/expression is found again
  - Captures redundancy due to a poor variable order during factorization
- The DFG is not unique
  - It can be restructured and balanced to minimize cost
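The node-reuse idea can be sketched with a small hash-consing builder. This is an illustration under assumptions, not the TDS data structure: nodes are keyed by their operator and sorted operand ids (a structural stand-in for "the corresponding function"), and the class and method names are hypothetical.

```python
from functools import reduce

class DFG:
    """Toy data-flow graph that reuses structurally identical nodes."""

    def __init__(self):
        self.nodes = []    # node id -> key
        self.table = {}    # key -> node id (the hash table)

    def _intern(self, key):
        if key not in self.table:
            self.table[key] = len(self.nodes)
            self.nodes.append(key)
        return self.table[key]

    def var(self, name):
        return self._intern(("var", name))

    def op(self, kind, left, right):
        lo, hi = sorted((left, right))   # + and * are commutative
        return self._intern((kind, lo, hi))

    def chain(self, kind, operands):
        # Break a multi-operand operation into a chain of binary operations.
        return reduce(lambda acc, x: self.op(kind, acc, x), operands)

    def count(self, kind):
        return sum(1 for key in self.nodes if key[0] == kind)

# Build w*(q + p*w + y) + x*z*u
g = DFG()
w, q, p, y, x, z, u = (g.var(n) for n in "wqpyxzu")
pw = g.op("*", p, w)
left = g.op("*", w, g.chain("+", [q, pw, y]))
right = g.chain("*", [x, z, u])
f = g.op("+", left, right)
again = g.op("*", w, p)      # same product requested a second time
print(pw == again, g.count("*"), g.count("+"))
```

Requesting `w*p` after `p*w` returns the existing node, so the graph keeps 4 multiplications and 3 additions for this expression.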

23 Data Flow Graph
- [Figure: DFG for the example; total: 5 MPY, 3 ADD]
- Reordering cost: L = 2 MPY + 2 ADD; Req: 3 MPY, 2 ADD

24 Reordering [-> iteration 3 extraction]
- [Figure labels: S2, P3, P4, S3; L = 2 MPY + 2 ADD; Req: 3 MPY, 2 ADD]
- Cost involves:
  - Reordering of variables
  - Extraction
  - DFG generation
  - Annotating latency and resource requirements

25 Generating and evaluating new Data Flow Graph [Iteration 3]
F = S3 = P4 + P3
  = w⋅S2 + x⋅P1
  = w⋅(q + S1) + x⋅(z⋅u)
  = w⋅(q + P2 + y) + x⋅z⋅u
  = w⋅(q + p⋅w + y) + x⋅z⋅u
Total: 4 MPY, 3 ADD
- Figure annotations (reordering cost): L = 2 MPY + 2 ADD; L = 2 MPY + 3 ADD; Req: 1 MPY, 1 ADD; L = 2 MPY + 2 ADD, Req: 2 MPY, 1 ADD
- Previous cost: L = 2 MPY + 2 ADD, Req = 3 MPY, 2 ADD
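The two latency figures for this expression, 2 MPY + 3 ADD and 2 MPY + 2 ADD, are consistent with chaining versus rebalancing the three-operand addition. The sketch below is one plausible reading, not the TDS cost model: it assumes unlimited resources, treats the expression as a tree, and uses hypothetical delays of 2 for a multiplier and 1 for an adder. The name `critical_path` is introduced for illustration.

```python
import ast

def critical_path(expr, d_mpy=2, d_add=1):
    """Return (delay, n_mpy, n_add) along the slowest path of the expression tree."""
    def visit(node):
        if isinstance(node, ast.BinOp):
            # The slower operand determines when this operation can start.
            delay, mpy, add = max(visit(node.left), visit(node.right))
            if isinstance(node.op, ast.Mult):
                return delay + d_mpy, mpy + 1, add
            if isinstance(node.op, ast.Add):
                return delay + d_add, mpy, add + 1
            raise ValueError("only + and * are supported")
        return 0, 0, 0   # a variable: available at time 0
    return visit(ast.parse(expr, mode="eval").body)

chained = "w*(q+p*w+y)+x*z*u"        # (q + p*w) + y: additions in a chain
balanced = "w*((q+y)+p*w)+x*z*u"     # q + y computed in parallel with p*w
print(critical_path(chained))
print(critical_path(balanced))
```

Both forms use the same 4 multiplications and 3 additions; only the shape of the addition tree changes, which is the kind of restructuring the DFG-generation slide refers to.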

26 Design Space Exploration
- Reordering [-> iteration 4 extraction, DFG generation]
- Through reordering, all cases can be obtained

27 Replacing constant multipliers*
- By shifters: transform constant multiplications into shifters, while considering factorization involving shifters
- Steps:
  - Represent the constant in CSD format, using a shift variable L^i (instead of 2^i) for shifting by i bits
  - Generate the TED with shift variables, linearize it, and perform decomposition
  - Replace terms involving shift variables (L^i) by i-bit shifters
- Example: 7a + 6b = (L^3 - 1)·a + (L^3 - L)·b = L^3·(a + b) - L·b - a = ((a + b) << 3) - (a + (b << 1))
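The example above can be checked with a short sketch. It computes canonical signed digit (CSD) digits for a positive integer constant, evaluates a constant multiplication with shifts only, and compares the slide's factored shifter form against 7a + 6b. The function names are assumptions introduced for illustration.

```python
def csd_digits(n):
    """CSD digits of a positive integer, least significant first, each in {-1, 0, 1}."""
    digits = []
    while n:
        if n & 1:
            d = 2 - (n & 3)   # +1 when n % 4 == 1, -1 when n % 4 == 3
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits

def shift_add(x, digits):
    """Multiply x by the constant encoded in digits using shifts and add/subtract."""
    return sum(d * (x << i) for i, d in enumerate(digits) if d)

def slide_form(a, b):
    """The factored shifter form from the slide: ((a + b) << 3) - (a + (b << 1))."""
    return ((a + b) << 3) - (a + (b << 1))

print(csd_digits(7), csd_digits(6))
print(slide_form(3, 5), 7 * 3 + 6 * 5)
```

The digits for 7 and 6 correspond to 2^3 - 1 and 2^3 - 2, that is L^3 - 1 and L^3 - L in the slide's notation, and factoring out the shared L^3 leaves a single 3-bit shift of (a + b).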

28 TDS: TED Decomposition System (Recap)
- Read in the CDFG file (cdfg) or a polynomial expression (poly), or use pre-coded DSP transforms (tr)
- Translate into a functional TED (dfg2ted) and structural elements (comparators, etc.)
- Linearize its data path (linearize)
- Iterate:
  - Product-term extraction
  - Sum-term extraction
  - Reorder to minimize latency (reorder)
- Result: a set of irreducible TEDs
- Produce the final DFG (ted2dfg) and annotate back the CDFG file (write)
- Target: data-flow and computation-intensive designs (DSP)
- Design space exploration

29 Conclusion
- Results in the paper show a 15% latency improvement and a 7% area reduction when using the DFG generated by TDS instead of KBD (kernel-based decomposition)
  - Far better results when compared to the original DFG
- TDS serves as a front end to GAUT
- Fundamental limitation: decomposition depends on variable reordering, which is an expensive operation

30 References
- M. Ciesielski, D. Gomez-Prado, Q. Ren, J. Guillot, and Emmanuel Boutillon, “Optimization of Data-Flow Computations Using Canonical TED Representation,” in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- M. Ciesielski, S. Askar, D. Gomez-Prado, J. Guillot, and E. Boutillon, “Data-flow transformations using Taylor expansion diagrams,” in Proc. Des. Autom. Test Eur., 2007, pp. 455–460
- TDS - TED-Based Dataflow Decomposition System, Univ. Massachusetts, Amherst, MA. [Online]. Available:

31 Questions?

32 Experiment Setup*
- [Figure: the TDS flow and HLS flow diagram from the TDS System Overview slide, with three DFG sources labeled KBD, ORIGINAL, and TED]

33 Results*
- [Figure: results; legend includes KBD]

34 Results: Quintic spline*
- [Figure: results; legend includes KBD]

35 Results: Quartic spline*
- [Figure: results; legend includes KBD]

36 Improvement over KBD and Original*
- [Figure: improvement chart; legend includes KBD]
