Penn ESE535 Spring 2009 -- DeHon 1 ESE535: Electronic Design Automation Day 2: January 21, 2009 C  RTL.

Slides:



Advertisements
Similar presentations
Data-Flow Analysis II CS 671 March 13, CS 671 – Spring Data-Flow Analysis Gather conservative, approximate information about what a program.
Advertisements

8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
Synopsys University Courseware Copyright © 2012 Synopsys, Inc. All rights reserved. Compiler Optimization and Code Generation Lecture - 3 Developed By:
7. Optimization Prof. O. Nierstrasz Lecture notes by Marcus Denker.
Lecture 11: Code Optimization CS 540 George Mason University.
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
Chap. 6 Dataflow Modeling
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Architecture-dependent optimizations Functional units, delay slots and dependency analysis.
ECE Synthesis & Verification - Lecture 2 1 ECE 667 Spring 2011 ECE 667 Spring 2011 Synthesis and Verification of Digital Circuits High-Level (Architectural)
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Jeffrey D. Ullman Stanford University. 2  A never-published Stanford technical report by Fran Allen in  Fran won the Turing award in  Flow.
Course Outline Traditional Static Program Analysis –Theory Compiler Optimizations; Control Flow Graphs Data-flow Analysis – today’s class –Classic analyses.
EECC551 - Shaaban #1 Fall 2003 lec# Pipelining and Exploiting Instruction-Level Parallelism (ILP) Pipelining increases performance by overlapping.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
1 Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
Components of representation Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements occur in right order Data.
Program Representations. Representing programs Goals.
1 Data flow analysis Goal : collect information about how a procedure manipulates its data This information is used in various optimizations For example,
1 CS 201 Compiler Construction Lecture 5 Code Optimizations: Copy Propagation & Elimination.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 23, 2008 C  RTL.
Representing programs Goals. Representing programs Primary goals –analysis is easy and effective just a few cases to handle directly link related things.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
1 Intermediate representation Goals: –encode knowledge about the program –facilitate analysis –facilitate retargeting –facilitate optimization scanning.
EECC551 - Shaaban #1 Winter 2002 lec# Pipelining and Exploiting Instruction-Level Parallelism (ILP) Pipelining increases performance by overlapping.
EECC551 - Shaaban #1 Spring 2006 lec# Pipelining and Instruction-Level Parallelism. Definition of basic instruction block Increasing Instruction-Level.
EECC551 - Shaaban #1 Fall 2005 lec# Pipelining and Instruction-Level Parallelism. Definition of basic instruction block Increasing Instruction-Level.
Penn ESE 535 Spring DeHon 1 ESE535: Electronic Design Automation Day 22: April 23, 2008 FSM Equivalence Checking.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Class canceled next Tuesday. Recap: Components of IR Control dependencies: sequencing of operations –evaluation of if & then –side-effects of statements.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 11, 2009 Dataflow.
Intermediate Code. Local Optimizations
Topic 6 -Code Generation Dr. William A. Maniatty Assistant Prof. Dept. of Computer Science University At Albany CSI 511 Programming Languages and Systems.
EECC551 - Shaaban #1 Spring 2004 lec# Definition of basic instruction blocks Increasing Instruction-Level Parallelism & Size of Basic Blocks.
Machine-Independent Optimizations Ⅰ CS308 Compiler Theory1.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 18, 2009 Static Timing Analysis and Multi-Level Speedup.
Penn ESE Spring DeHon 1 ESE (ESE534): Computer Organization Day 5: January 24, 2007 ALUs, Virtualization…
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 8: February 13, 2008 Retiming.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
Ryan Chu. Arithmetic Expressions Arithmetic expressions consist of operators, operands, parentheses, and function calls. The purpose is to specify an.
What’s in an optimizing compiler?
Basic Operators. What is an operator? using expression is equal to 9. Here, 4 and 5 are called operands and + is the operator Python language supports.
Predicated Static Single Assignment (PSSA) Presented by AbdulAziz Al-Shammari
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 16: March 25, 2015 C  RTL.
1 Data Flow Analysis Data flow analysis is used to collect information about the flow of data values across basic blocks. Dominator analysis collected.
Modern VLSI Design 3e: Chapter 8 Copyright  1998, 2002 Prentice Hall PTR Topics n Basics of register-transfer design: –data paths and controllers; –ASM.
CS412/413 Introduction to Compilers Radu Rugina Lecture 18: Control Flow Graphs 29 Feb 02.
1 Control Flow Graphs. 2 Optimizations Code transformations to improve program –Mainly: improve execution time –Also: reduce program size Can be done.
3/2/2016© Hal Perkins & UW CSES-1 CSE P 501 – Compilers Optimizing Transformations Hal Perkins Autumn 2009.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 15: March 13, 2013 High Level Synthesis II Dataflow Graph Sharing.
©SoftMoore ConsultingSlide 1 Code Optimization. ©SoftMoore ConsultingSlide 2 Code Optimization Code generation techniques and transformations that result.
ESE532: System-on-a-Chip Architecture
Code Optimization Code produced by compilation algorithms can often be improved (ideally optimized) in terms of run-time speed and the amount of memory.
Code Optimization Overview and Examples
High-level optimization Jakub Yaghob
Code Optimization.
ESE532: System-on-a-Chip Architecture
Optimization Code Optimization ©SoftMoore Consulting.
ESE535: Electronic Design Automation
Optimizing Transformations Hal Perkins Autumn 2011
1. Reaching Definitions Definition d of variable v: a statement d that assigns a value to v. Use of variable v: reference to value of v in an expression.
Optimizing Transformations Hal Perkins Winter 2008
Code Optimization Overview and Examples Control Flow Graph
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
ESE535: Electronic Design Automation
Presentation transcript:

Penn ESE535 Spring DeHon 1 ESE535: Electronic Design Automation Day 2: January 21, 2009 C  RTL

Penn ESE535 Spring DeHon 2 Today Straight-line Code If-conversion Memory Basic Blocks and Control Flow Looping Hyperblocks Common Optimizations

Penn ESE535 Spring DeHon 3 DOMAIN SPECIFIC RTL GATE TRANSISTOR BEHAVIORAL Design Productivity by Approach a b s q 0 1 d clk GATES/WEEK (Dataquest) K - 2K 2K - 10K 8K - 12K Source: Keutzer (UCB EE 244) Day 1

Penn ESE535 Spring DeHon 4 Today See how get from a language (C) to RTL level –…everyone here knows C, right? –VHDL or Verilog?

Penn ESE535 Spring DeHon 5 Arithmetic Operators Unary Minus (Negation) -a Addition (Sum) a + b Subtraction (Difference) a - b Multiplication (Product) a * b Division (Quotient) a / b Modulus (Remainder) a % b Things might have an a hardware operator for…

Penn ESE535 Spring DeHon 6 Bitwise Operators Bitwise Left Shift a << b Bitwise Right Shift a >> b Bitwise One's Complement ~a Bitwise AND a & b Bitwise OR a | b Bitwise XOR a ^ b Things might have an a hardware operator for…

Penn ESE535 Spring DeHon 7 Comparison Operators Less Than a < b Less Than or Equal To a <= b Greater Than a > b Greater Than or Equal To a >= b Not Equal To a != b Equal To a == b Logical Negation !a Logical AND a && b Logical OR a || b Things might have an a hardware operator for…

Penn ESE535 Spring DeHon 8 Build complex expressions a*x*x+b*x+c a*(x+b)*x+c ((a+10)*b < 100) A connected set of operators  Graph of operators

Penn ESE535 Spring DeHon 9 C Assignment Basic assignment statement Location = expression F=a*x*x+b*x+c

Penn ESE535 Spring DeHon 10 Straight-line code Just a sequence of assignments What does this mean? g=a*x; h=b+g; i=h*x; j=i+c;

Penn ESE535 Spring DeHon 11 Variable Reuse Variables (locations) define flow between computations Locations (variables) are reusable t=a*x; r=t*x; t=b*x; r=r+t; r=r+c;

Penn ESE535 Spring DeHon 12 Variable Reuse Variables (locations) define flow between computations Locations (variables) are reusable t=a*x; r=t*x; t=b*x; r=r+t; r=r+c; Sequential assignment semantics tell us which definition goes with which use. –Use gets most recent preceding definition.

Penn ESE535 Spring DeHon 13 Dataflow Can turn sequential assignments into dataflow graph through def  use connections t=a*x; r=t*x; t=b*x; r=r+t; r=r+c; ** * + + a x b c

Penn ESE535 Spring DeHon 14 Dataflow Height t=a*x; r=t*x; t=b*x; r=r+t; r=r+c; Height (delay) of DF graph may be less than # sequential instructions. ** * + + a x b c

Penn ESE535 Spring DeHon 15 Simple Control Flow If (cond) { … } else { …} Assignments become conditional In simplest cases, can treat as dataflow node cond select thenelse

Penn ESE535 Spring DeHon 16 Simple Conditionals if (a>b) c=b*c; else c=a*c; a>bb*ca*c c

Penn ESE535 Spring DeHon 17 Simple Conditionals v=a; if (b>a) v=b; If not assigned, value flows from before assignment b>a ba v

Penn ESE535 Spring DeHon 18 Simple Conditionals max=a; min=a; if (a>b) min=b; c=1; else max=b; c=0; May (re)define many values on each branch. a>b b a min 10 maxc

Penn ESE535 Spring DeHon 19 C Memory Model One big linear address space of locations Most recent definition to location is value Sequential flow of statements Addr New value Current value

Penn ESE535 Spring DeHon 20 C Memory Operations Read/Use a=*p; a=p[0] a=p[c*10+d] Write/Def *p=2*a+b; p[0]=23; p[c*10+d]=a*x+b;

Penn ESE535 Spring DeHon 21 Memory Operation Challenge Memory just a location But memory expressions can refer to variable locations –Does *q and *p refer to same location? –*p and p[c*10+d]? –p[0] and p[c*10+d]? –p[f(a)] and p[q(b)] ?

Penn ESE535 Spring DeHon 22 Pitfall P[i]=23 P[j]=17 r=10+P[i] s=P[j]*12 Could do: P[i]=23; P[j]=17; r=10+P[i]; s=P[j]*12 ….unless i==j

Penn ESE535 Spring DeHon 23 C Pointer Pitfalls *p=23 *q=17 r=10+*p; s=*q*12; Similar limit if p==q

Penn ESE535 Spring DeHon 24 C Memory/Pointer Sequentialization Must preserve ordering of memory operations –A read cannot be moved before write to memory which may redefine the location of the read Conservative: any write to memory Sophisticated analysis may allow us to prove independence of read and write –Writes which may redefine the same location cannot be reordered

Penn ESE535 Spring DeHon 25 Consequence Expressions and operations through variables (whose address is never taken) can be executed at any time –Just preserve the dataflow Memory assignments must execute in strict order –Ideally: partial order –Conservatively: strict sequential order of C

Penn ESE535 Spring DeHon 26 Forcing Sequencing Demands we introduce some discipline for deciding when operations occur –Could be a FSM –Could be an explicit dataflow token –Callahan uses control register Other uses for timing control –Variable delay blocks –Looping –Complex control

Penn ESE535 Spring DeHon 27 Scheduled Memory Operations Source: Callahan

Penn ESE535 Spring DeHon 28 Basic Blocks Sequence of operations with –Single entry point –Once enter execute all operations in block –Set of exits at end Can dataflow schedule operations within a basic block –As long as preserve memory ordering

Penn ESE535 Spring DeHon 29 Connecting Basic Blocks Connect up basic blocks by routing control flow token –May enter from several places –May leave to one of several places

Penn ESE535 Spring DeHon 30 Basic Blocks for if/then/else Source: Callahan

Penn ESE535 Spring DeHon 31 Loops sum=0; for (i=0;i<imax;i++) sum+=i; r=sum<<2; sum=0; i=0; i<imax sum+=i; i=i+1; r=sum<<2;

Penn ESE535 Spring DeHon 32 Beyond Basic Blocks Basic blocks tend to be limiting Runs of straight-line code are not long For good hardware implementation –Want more parallelism

Penn ESE535 Spring DeHon 33 Hyperblocks Can convert if/then/else into dataflow –If/mux-conversion Hyperblock –Single entry point –No internal branches –Internal control flow provided by mux conversion –May exit at multiple points

Penn ESE535 Spring DeHon 34 Basic Blocks  Hyperblock Source: Callahan

Penn ESE535 Spring DeHon 35 Hyperblock Benefits More code  typically more parallelism –Shorter critical path Optimization opportunities –Reduce work in common flow path –Move logic for uncommon case out of path Makes smaller faster

Penn ESE535 Spring DeHon 36 Common Case Height Reduction Source: Callahan

Penn ESE535 Spring DeHon 37 Common-Case Flow Optimization Source: Callahan

Penn ESE535 Spring DeHon 38 Optimizations Constant propagation: a=10; b=c[a]; Copy propagation: a=b; c=a+d;  c=b+d; Constant folding: c[10*10+4];  c[104]; Identity Simplification: c=1*a+0;  c=a; Strength Reduction: c=b*2;  c=b<<1; Dead code elimination Common Subexpression Elimination: –C[x*100+y]=A[x*100+y]+B[x*100+y] –t=x*100+y; C[t]=A[t]+B[t]; Operator sizing: for (i=0; i<100; i++) b[i]=(a&0xff+i);

Penn ESE535 Spring DeHon 39 Flow Review

Penn ESE535 Spring DeHon 40 Concerns Parallelism in hyperblock –Especially if memory sequentialized Disambiguate memories? Allow multiple memory banks? Only one hyperblock active at a time –Share hardware between blocks? Data only used from one side of mux –Share hardware between sides? Most logic in hyperblock idle? –Couldn’t we pipeline execution?

Penn ESE535 Spring DeHon 41 Pipelining for (i=0;i<MAX;i++) o[i]=(a*x[i]+b)*x[i]+c; If know memory operations independent i<MAX * + * + a b c +read write o x i

Penn ESE535 Spring DeHon 42 Summary Language (here C) defines meaning of operations Dataflow connection of computations Sequential precedents constraints to preserve Create basic blocks Link together Merge into hyperblocks with if-conversion Result is logic and registers  RTL

Penn ESE535 Spring DeHon 43 Admin Assignment 1 out Reading for Monday

Penn ESE535 Spring DeHon 44 Big Ideas: Dataflow Mux-conversion Specialization Common-case optimization