1
Chap 10: Optimization
Prof. Steven A. Demurjian, Computer Science & Engineering Department, The University of Connecticut, 371 Fairfield Way, Unit 2155, Storrs, CT (860)
Material for course thanks to: Laurent Michel, Aggelos Kiayias, Robert LeBarre
2
Overview Motivation and Background Code Level Optimization
Common Sub-expression elimination Copy Propagation Dead-code elimination Peephole optimization Load/Store elimination Unreachable code Flow of Control Optimization Algebraic simplification Strength Reduction Concluding Remarks/Looking Ahead
3
Motivation What we achieved We have working machine code
What is missing Code generation does not see the “big” picture We can generate poor instruction sequences What we need A simple way to locally improve the code quality Goal: Transition from “Lousy” Intermediate Code to More Effective and Efficient Code Response Time, Performance (Algorithms), Memory Usage Measured in terms of Number of Variables Saved, Operands Saved, Memory Accesses, etc.
4
Where can Optimization Occur?
Source Program Front End LA, Parse, Int. Code Int. Code Code Generator Target Program Software Engineer can: Profile Program Change Algorithm Data Transform/Improve Loops Compiler Can: Improve Loops/Proc Calls Calculate Addresses Use Registers Selected Instructions Perform Peephole Opt. All are Optimizations 1st is User Controlled and Defined At Intermediate Code Level by Compiler At Assembly Level for Target Architecture (to take advantage of different machine features)
5
Code Level Optimization
First Look at Optimization Section 9.4 in 1st Edition Introduce and Discuss Basic Blocks Requirements for Optimization Section 10.1 in 1st Edition Basic Blocks, Flow Graphs Indepth Examination of Optimization Section 10.2 in 1st Edition Function Preserving Transformations Loop Optimizations
6
First Look at Optimization
Optimization Applied to 3 Address Coding (3AC) Version of Source Program - Example: a + b[i] * c becomes
  t1 = b[i]
  t2 = t1 * c
  t3 = a + t2
7
First Look at Optimization
Once Code has been Generated in 3AC, an Algorithm can be Applied to: Identify each Basic Block which Represents a set of Three Address Statements where Execution Enters at Top and Leaves at Bottom No Branches within Code Represent the Control Flow Dependencies Among and Between Basic Blocks Defines what is Termed a “Flow Graph” Let’s see an Example
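The block-identification step can be sketched in Python using the usual "leader" rules: the first statement, every jump target, and every statement following a jump each start a new basic block. The `(label, text, target)` tuple format and the function names are assumptions made for this illustration, not part of the course material.

```python
def find_leaders(code):
    """Return sorted indices of statements that start a basic block.

    `code` is a list of (label, text, target) tuples (illustrative format):
    label is the statement's label or None, target is the label a jump
    goes to, or None for statements that do not jump.
    """
    label_index = {label: i for i, (label, _, _) in enumerate(code) if label}
    leaders = {0} if code else set()            # rule 1: first statement
    for i, (_, _, target) in enumerate(code):
        if target is not None:
            leaders.add(label_index[target])    # rule 2: target of a jump
            if i + 1 < len(code):
                leaders.add(i + 1)              # rule 3: statement after a jump
    return sorted(leaders)


def basic_blocks(code):
    """Split code into basic blocks, one list of statements per block."""
    leaders = find_leaders(code)
    bounds = leaders + [len(code)]
    return [code[bounds[k]:bounds[k + 1]] for k in range(len(leaders))]


program = [
    (None, "i := 1", None),
    ("L1", "t1 := 4 * i", None),
    (None, "prod := prod + t1", None),
    (None, "i := i + 1", None),
    (None, "if i <= 20 goto L1", "L1"),
    (None, "halt", None),
]
blocks = basic_blocks(program)
```

Here the loop body from `L1` through the conditional jump forms one block; an edge in the flow graph would then connect that block back to itself and forward to the `halt` block.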
8
First Look at Optimization
Steps 1 to 12 from two Slides Back Represented as: Optimization Works with Basic Blocks and Flow Graph to Perform Transformations that: Generate Equivalent Flow Graph w/Improved Perf.
9
First Look at Optimization
Optimization will Perform Transformations on Basic Blocks/Flow Graph Resulting Graph(s) Passed Through to Final Code Generation to Obtain More Optimal Code Two Fold Goal of Optimization Reduce Time Reduce Space Optimization Used to Come at a Cost: In “Old Days” Turning on Optimizer Could Double the Compilation Time From 2 hours to 4 hours Is this an Issue Today?
10
First Look at Optimization
Two Types of Transformations
Structure Preserving: Inherent Structure and Implicit Functionality of Basic Blocks is Unchanged
Algebraic:
  Elimination of Useless Expressions: x = x or y = y * 1
  Replace Expensive Operators: Change x = y ** 2 to x = y * y. Why?
We’ll Focus on Both …
11
Structure Preserving Transformations
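Common Sub-Expression Elimination: How can Following Code be Improved?
  a = b + c
  b = a - d
  c = b + c
  d = a - d
The Second and Fourth Statements both Compute a - d, and neither a nor d Changes in Between, so the Fourth Statement can Become d = b
What Must we Make Sure Doesn’t Happen? (b + c in the First and Third Statements is not Common, since b is Redefined in Between)
Dead-Code Elimination: If x is not Used in Block, Can it be Removed?
  x = y + z
What are the Possible Ramifications if so?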
12
Structure Preserving Transformations
Renaming Temporary Variables: Consider the code
  t = b + c
Can be Changed to
  u = b + c
May Reduce the Number of temporaries. Make Change from all t’s to all u’s
Interchange of Statements: Consider
  t1 = b + c
  t2 = x + y
and Change to:
  t2 = x + y
  t1 = b + c
This can Occur as Long as: x and y are not t1, and b and c are not t2. What Do you have to Check?
13
Requirements for Optimization
Identify Frequently Executed Portions of Code and Make them Perform Better Rule-of-Thumb - Most Programs spend 80% of their Time in 20% of Code – Is this True? We Focus on Loops since Every Gain in Space or Time is Multiplied by Loop Iterations Reduce Loop’s Code and Improve Performance What Other Programming Technique Should be a Major Concern for Optimization?
14
Requirements for Optimization
Criteria for Transformations Preserve Meaning of Code Don’t Change Output, Introduce Errors, etc. Speed up Programs by Measurable Amount (on Average for Entire Code) Must be Worth the Effort Stick to Meaningful, Useful Transformations Provide Different Versions of Compiler Non-Optimizing Optimizing Extra Optimization on Demand
15
Requirements for Optimization
Beware that Some Optimization Directives are Ignored! In C, Define variable as “register int I;” While a Feature of Language, cc States that these Instructions are Ignored and Compiler Controls Use of Registers
16
The Overall Optimization Process
Advantages Intermediate Code has Explicit Operations and Their Identification Promotes Optimization Intermediate Code is Relatively Machine Independent Therefore, Optimization Doesn’t Impact Final Code Generation
17
Example Source Code
18
Generated Three Address Coding
19
Flow Graph of Basic Blocks
20
Indepth Examination of Optimization
Code-Transformation Techniques: Local – within a “Basic Block” Global – between “Basic Blocks” Data Flow Dependencies Determined by Inspection what do i, a, and v refer to? Dependent in Another Basic Block Scoping is Very Critical
21
Indepth Examination of Optimization
Function Preserving Transformations: Common Subexpressions, Copy Propagation, Dead Code Elimination
Loop Optimizations: Code Motion, Induction Variables, Strength Reduction
22
Common Sub-Expressions
E is a Common Sub-Expression if E was Previously Computed and the Values of the Variables in E are Unchanged since the Previous Computation
What Can be Saved in B5? t6 and t7 are the same computation; t8 and t10 are the same computation
Save: Remove 2 temp variables, Remove 2 multiplications, Remove 4 variable accesses, Remove 2 assignments
B5 before:
  t6 := 4 * i
  x := a[t6]
  t7 := 4 * i
  t8 := 4 * j
  t9 := a[t8]
  a[t7] := t9
  t10 := 4 * j
  a[t10] := x
  goto B2
B5 after:
  t6 := 4 * i
  x := a[t6]
  t8 := 4 * j
  t9 := a[t8]
  a[t6] := t9
  a[t8] := x
  goto B2
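The B5 rewrite can be reproduced with a small local pass, sketched here in Python. The `(kind, dst, a, b)` tuple encoding and the name `local_cse` are assumptions for illustration; the sketch also assumes each temporary is assigned only once inside the block, as in the slide's code.

```python
ARITH = ("+", "-", "*")


def local_cse(block):
    """Eliminate repeated arithmetic expressions inside one basic block.

    Statements are (kind, dst, a, b) tuples (illustrative encoding):
      ("*", "t6", "4", "i")        t6 := 4 * i
      ("load", "x", "a", "t6")     x := a[t6]
      ("store", "a", "t7", "t9")   a[t7] := t9
    """
    available = {}   # (op, a, b) -> temporary already holding that value
    alias = {}       # removed temporary -> temporary that replaces it
    out = []
    for kind, dst, a, b in block:
        a, b = alias.get(a, a), alias.get(b, b)      # use surviving names
        if kind in ARITH and (kind, a, b) in available:
            alias[dst] = available[(kind, a, b)]     # reuse; drop statement
            continue
        out.append((kind, dst, a, b))
        if kind != "store":
            # dst gets a new value, so expressions that read it are stale
            available = {k: v for k, v in available.items()
                         if dst not in (k[1], k[2], v)}
            if kind in ARITH and dst not in (a, b):
                available[(kind, a, b)] = dst
    return out


b5 = [
    ("*", "t6", "4", "i"), ("load", "x", "a", "t6"),
    ("*", "t7", "4", "i"), ("*", "t8", "4", "j"),
    ("load", "t9", "a", "t8"), ("store", "a", "t7", "t9"),
    ("*", "t10", "4", "j"), ("store", "a", "t10", "x"),
]
optimized = local_cse(b5)
```

Only arithmetic expressions are reused in this sketch; array loads are left alone because a store to `a` between two loads could change the loaded value.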
23
Common Sub-Expressions
What about B6? t11 and t12 are the same computation; t13 and t15 are the same computation. Similar Savings as in B5
B6 before:
  t11 := 4 * i
  x := a[t11]
  t12 := 4 * i
  t13 := 4 * n
  t14 := a[t13]
  a[t12] := t14
  t15 := 4 * n
  a[t15] := x
B6 after:
  t11 := 4 * i
  x := a[t11]
  t13 := 4 * n
  t14 := a[t13]
  a[t11] := t14
  a[t13] := x
24
Common Sub-Expressions
What else Can be Accomplished? Where is Variable j Determined? In B3, and when Control Drops through B3 to B4 and into B5, no Change Occurs to j! What Does B5 Become? Are we done? No: t9 is the same as t5! Again savings in access, variables, operations, etc.
B3:
  j := j - 1
  t4 := 4 * j
  t5 := a[t4]
  if t5 > v goto B3
B4
B5 from the previous step:
  t6 := 4 * i
  x := a[t6]
  t8 := 4 * j
  t9 := a[t8]
  a[t6] := t9
  a[t8] := x
  goto B2
B5 after reusing t4 for t8:
  t6 := 4 * i
  x := a[t6]
  t9 := a[t4]
  a[t6] := t9
  a[t4] := x
  goto B2
B5 after reusing t5 for t9:
  t6 := 4 * i
  x := a[t6]
  a[t6] := t5
  a[t4] := x
  goto B2
25
Common Sub-Expressions
Are we done yet? Where is “i” defined? Any Values we can Leverage? Yes! t2 = 4*i Defined in B2 and is unchanged as it arrives at B5; t3 = a[t2] in B3 and B2 and also unchanged as it arrives
Result Below Saves: From 9 statements down to 4; 4 Multiplications are Gone; 4 addr/array offsets are only 2
B5 before:
  t6 := 4 * i
  x := a[t6]
  a[t6] := t5
  a[t4] := x
  goto B2
B5 after:
  x := t3
  a[t2] := t5
  a[t4] := x
  goto B2
26
Common Sub-Expressions
B6 is Similarly Changed …
B6 before:
  t11 := 4 * i
  x := a[t11]
  t13 := 4 * n
  t14 := a[t13]
  a[t11] := t14
  a[t13] := x
B6 after:
  x := t3
  t14 := a[t1]
  a[t2] := t14
  a[t1] := x
27
Resulting Flow Diagram
28
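Copy Propagation
Introduce a Common Copy Statement to Replace an Arithmetic Calculation with Assignment. Regardless of the Path Chosen, the use of an Assignment Saves Time and Space
Before: one path computes a := d + e, the other path computes b := d + e, and both reach c := d + e
After: the first path becomes t := d + e; a := t, the second path becomes t := d + e; b := t, and the join becomes c := t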
29
Copy Propagation In our Example for B5 and B6 Below:
Since x is t3, we can replace the use of x on right hand side as below. We’ll come back to this shortly!
B5 before:
  x := t3
  a[t2] := t5
  a[t4] := x
  goto B2
B6 before:
  x := t3
  t14 := a[t1]
  a[t2] := t14
  a[t1] := x
B5 after:
  x := t3
  a[t2] := t5
  a[t4] := t3
  goto B2
B6 after:
  x := t3
  t14 := a[t1]
  a[t2] := t14
  a[t1] := t3
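A minimal Python sketch of this replacement within one block follows. It assumes the same illustrative `(kind, dst, a, b)` tuple encoding as a teaching device, with `("copy", "x", "t3", None)` standing for `x := t3`; the function name is hypothetical.

```python
def copy_propagate(block):
    """Replace uses of a copied variable by its source inside one block.

    Statements are (kind, dst, a, b) tuples (illustrative encoding):
      ("copy", "x", "t3", None)    x := t3
      ("load", "t14", "a", "t1")   t14 := a[t1]
      ("store", "a", "t4", "x")    a[t4] := x
    """
    copies = {}   # variable -> variable it is currently a copy of
    out = []
    for kind, dst, a, b in block:
        a, b = copies.get(a, a), copies.get(b, b)   # rewrite the uses
        out.append((kind, dst, a, b))
        if kind != "store":
            # dst changes here, so copies to or from dst no longer hold
            copies = {k: v for k, v in copies.items() if dst not in (k, v)}
            if kind == "copy" and a != dst:
                copies[dst] = a
    return out


b5 = [
    ("copy", "x", "t3", None),
    ("store", "a", "t2", "t5"),
    ("store", "a", "t4", "x"),
]
propagated = copy_propagate(b5)
```

After the pass, `x` is still assigned but no longer read inside the block, which is what makes the copy a candidate for removal afterwards.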
30
Dead Code Elimination
Variable is “Dead” if its Value will never be Utilized Again Subsequently; Otherwise, Variable is “Live”
What’s True about B5 and B6? Can Any Statements be Eliminated? Which Ones? Why?
B5 and B6 are Now Optimized: B5 has 9 Statements Reduced to 3; B6 has 8 Statements Reduced to 3
B5:
  x := t3
  a[t2] := t5
  a[t4] := t3
  goto B2
B6:
  x := t3
  t14 := a[t1]
  a[t2] := t14
  a[t1] := t3
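One way to answer "which ones?" mechanically is a backward scan over the block that tracks live variables. The Python sketch below assumes the illustrative `(kind, dst, a, b)` encoding and a caller-supplied `live_on_exit` set; in a real compiler that set would come from global data-flow analysis, which is not shown here.

```python
def eliminate_dead_code(block, live_on_exit):
    """Drop assignments whose result is never read afterwards.

    Statements are (kind, dst, a, b) tuples (illustrative encoding);
    "store" statements write memory and are always kept.
    """
    live = set(live_on_exit)
    kept = []
    for stmt in reversed(block):
        kind, dst, a, b = stmt
        if kind != "store":
            if dst not in live:
                continue          # dead: the value is never read later
            live.discard(dst)     # this statement defines dst
        live.update(v for v in (a, b) if v is not None)   # operands are read
        kept.append(stmt)
    kept.reverse()
    return kept


b5 = [
    ("copy", "x", "t3", None),
    ("store", "a", "t2", "t5"),
    ("store", "a", "t4", "t3"),
]
trimmed = eliminate_dead_code(b5, live_on_exit=set())
```

With `x` not live on exit, `x := t3` disappears and only the two stores remain (plus the `goto B2`, which this sketch does not model).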
31
Loop Optimizations
Three Types: Code Motion, Induction Variables, and Strength Reduction
Code Motion: Remove Invariant Operations from Loop (valid when limit does not change inside the loop)
  while (limit * 2 > i) do
Replaced by:
  t = limit * 2
  while (t > i) do
Induction Variables: Identify Which Variables are Interdependent or in Step
  j = j - 1
  t4 = 4 * j
Here t4 = 4 * j is Replaced by the Statement below, with t4 Initialized before the Loop
  t4 = t4 - 4
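The induction-variable rewrite can be checked with a small Python experiment: both loops below produce the same sequence of `t4` values, but the second needs no multiplication inside the loop. The function names and the `steps` parameter are made up for this illustration.

```python
def offsets_with_multiply(j, steps):
    """Original form: t4 is recomputed as 4 * j on every iteration."""
    seen = []
    for _ in range(steps):
        j = j - 1
        t4 = 4 * j
        seen.append(t4)
    return seen


def offsets_with_subtract(j, steps):
    """Rewritten form: t4 is initialized once, then decremented by 4."""
    seen = []
    t4 = 4 * j               # initialization placed before the loop
    for _ in range(steps):
        j = j - 1
        t4 = t4 - 4          # stays equal to 4 * j because j drops by 1
        seen.append(t4)
    return seen
```

For example, starting from `j = 10` both versions yield 36, 32, 28, and so on, since each decrement of `j` by 1 lowers `4 * j` by exactly 4.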
32
Loop Optimizations Strength Reduction
Replace an Expensive Operation (Such as Multiply) with a Cheaper Operation (Such as Add). In B4, i and j can be replaced with t2 and t4. This Eliminates the need for Variables i and j
33
Final Optimized Flow Graph – Done?
34
Turn to Prof. Michel’s Slides …
Motivation Rewrite the basic block to eliminate sub-expressions Technique Change the representation Move to a tree!
35
Example L1: t1 := 4 * i; t2 := a[t1]; t3 := 4 * i; t4 := b[t3];
t5 := t2 * t4; t6 := prod + t5; prod := t6; t7 := i + 1; i := t7; if i <= 20 then goto L1
45
Example
L1: t1 := 4 * i; t2 := a[t1]; t3 := 4 * i; t4 := b[t3]; t5 := t2 * t4; t6 := prod + t5; prod := t6; t7 := i + 1; i := t7; if i <= 20 then goto L1
What we have: Common sub-expressions are known; Used variables are known (leaves); Live on exit are known
46
Peephole Optimization
Simple Idea: Slide a window over the code. Optimize code in the window only.
Optimizations are: Local [still no big picture], Semantic preserving, Cheap to implement
Usually, one can repeat the peephole several times! Each pass can create new opportunities for more
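The sliding window and the repeated passes can be sketched as a small Python driver. Everything here is an assumption made for illustration: instructions are plain tuples, a rule returns `(replacement, consumed)` or `None`, and the single sample rule removes a `push r` directly followed by `pop r`. Rules are assumed to shrink the code, otherwise the loop might not terminate.

```python
def peephole(code, rules, window=2):
    """Apply rules over a sliding window until no rule fires any more."""
    changed = True
    while changed:                       # repeat: a pass can expose new work
        changed = False
        out, i = [], 0
        while i < len(code):
            chunk = tuple(code[i:i + window])
            for rule in rules:
                result = rule(chunk)
                if result is not None:
                    replacement, consumed = result
                    out.extend(replacement)
                    i += consumed
                    changed = True
                    break
            else:                        # no rule matched at this position
                out.append(code[i])
                i += 1
        code = out
    return code


def drop_push_pop(chunk):
    """push r ; pop r leaves r and the stack pointer unchanged."""
    if (len(chunk) == 2 and chunk[0][0] == "push" and chunk[1][0] == "pop"
            and chunk[0][1] == chunk[1][1]):
        return [], 2
    return None


code = [("push", "eax"), ("push", "ebx"), ("pop", "ebx"), ("pop", "eax"), ("ret",)]
result = peephole(code, [drop_push_pop])
```

The first pass removes only the inner `push ebx` / `pop ebx` pair; that brings `push eax` and `pop eax` next to each other, so a second pass removes them too.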
47
Peephole Optimizer
block_3:
  mov [esp-4],ebp
  mov ebp,esp
  sub esp,28
  mov eax,[ebp+8]
  cmp eax,0
  mov eax,0
  sete ah
  jz block_5
block_4:
  mov eax,1
  jmp block_6
block_5:
  sub eax,1
  push eax
  mov eax,[ebp+4]
  mov eax,[eax]
  call eax
  add esp,8
  mov ebx,[ebp+8]
  imul ebx,eax
  mov eax,ebx
block_6:
  mov esp,[ebp-8]
  mov ebp,[ebp-4]
  ret
48
Peephole Optimizations
A Few Simple Techniques [in a nutshell]
Load/Store elimination: Get rid of redundant operations
Unreachable code: Get rid of code guaranteed to never execute
Flow of Control Optimization: Simplify jump sequences
Algebraic simplification: Use rules of algebra to rewrite some basic operations
Strength Reduction: Replace expensive instructions by equivalent ones (yet cheaper)
Machine Idioms: Replace expensive instructions by equivalent ones (for a given machine)
49
Load / Store Sequences Imagine the following sequence
“a” is a label for a memory location, e.g. a variable in memory or on the stack. If “a” is on the stack, it would look like ebp(k) [k == constant]
  mov a,eax
  mov eax,a
What is guaranteed to be true after the first instruction? Corollary....
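After the first move, `a` and `eax` hold the same value, so the second move copies a value onto itself and can be dropped. A Python sketch of that rule follows; the `(label, op, dst, src)` tuple format is an assumption, and the sketch further assumes that `a` is a plain memory slot whose address does not depend on the moved register and that it is not volatile.

```python
def drop_redundant_moves(code):
    """Remove `mov y,x` when it directly follows `mov x,y`.

    Instructions are (label, op, dst, src) tuples (illustrative format).
    A labelled second instruction is kept, because a jump could reach it
    without the first move having executed.
    """
    out = []
    for instr in code:
        label, op, dst, src = instr
        if out and op == "mov" and label is None:
            _, prev_op, prev_dst, prev_src = out[-1]
            if prev_op == "mov" and (prev_dst, prev_src) == (src, dst):
                continue                  # both locations already agree
        out.append(instr)
    return out


pair = [(None, "mov", "a", "eax"), (None, "mov", "eax", "a")]
labelled = [(None, "mov", "a", "eax"), ("L1", "mov", "eax", "a")]
```

The check is symmetric in the two operands, so it applies whichever operand order the assembler syntax uses for source and destination.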
50
Unreachable Code
What is it? A situation that arises because of:
  Conditional compilation
  Previous optimizations that “created/exposed” dead code
Example
  #define debug 0
  ....
  if (debug) { printf("This is a trace message\n"); }
51
Example
The Generated code looks like....
  ....
  if (debug == 0) goto L2
  printf("This is a trace message\n");
  L2: ....
If we know that debug == 0, Then the condition 0 == 0 always evaluates to 1:
  ....
  if (0 == 0) goto L2
  printf("This is a trace message\n");
  L2: ....
52
Example
Final transformation. Given this code:
  ....
  goto L2
  printf("This is a trace message\n");
  L2: ....
There is no way to branch “into” the blue block (the printf call). The last instruction (goto L2) jumps over the blue block. The blue block is never used. Get rid of it!
53
Unreachable Code Example
Bottom Line: Now L2 is the instruction after goto... So get rid of goto altogether!
Before:
  ....
  goto L2
  L2: ....
After:
  ....
  L2: ....
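The two steps of this example (delete the code that follows an unconditional goto and has no label, then delete a goto to the immediately following label) can be sketched in Python. The `(label, op, arg)` tuple format and both function names are assumptions; the sketch conservatively treats every label as a possible jump target.

```python
def remove_unreachable(code):
    """Drop unlabelled statements that directly follow an unconditional goto."""
    out = []
    skipping = False
    for label, op, arg in code:
        if label is not None:
            skipping = False          # a label may be the target of a jump
        if not skipping:
            out.append((label, op, arg))
        if op == "goto":
            skipping = True           # nothing falls through a goto
    return out


def remove_jump_to_next(code):
    """Drop `goto L` when L labels the very next statement."""
    out = []
    for i, (label, op, arg) in enumerate(code):
        nxt = code[i + 1] if i + 1 < len(code) else None
        if op == "goto" and label is None and nxt is not None and nxt[0] == arg:
            continue
        out.append((label, op, arg))
    return out


code = [
    (None, "goto", "L2"),
    (None, "call", "printf"),        # the trace message, never executed
    ("L2", "nop", None),
]
step1 = remove_unreachable(code)
step2 = remove_jump_to_next(step1)
```

The second rule only fires because the first one ran, which is the sense in which one peephole pass creates opportunities for the next.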
54
Flow of Control Optimization
Situation: We can have chains of jumps, Direct to conditional or vice-versa
Objective: Avoid extra jumps. Why? [a.k.a. motivation....]
Example
  if (x relop y) goto L2
  ....
  L2: goto L4
  L3: ....
  L4: L4_BLOCK
55
Flow of Control
What can be done: Collapse the chain
  if (x relop y) goto L4
  ....
  L2: goto L4
  L3: ....
  L4: L4_BLOCK
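Collapsing the chain amounts to following each jump target through any labels that hold nothing but another goto. A Python sketch, assuming an illustrative `(label, text, target)` tuple format in which a bare unconditional jump has the text `"goto"`:

```python
def collapse_jump_chains(code):
    """Retarget every jump to the final destination of its goto chain."""
    # label -> next target, for labels whose statement is a bare goto
    forward = {label: target for label, text, target in code
               if label is not None and text == "goto"}

    def resolve(target):
        seen = set()                     # guards against goto cycles
        while target in forward and target not in seen:
            seen.add(target)
            target = forward[target]
        return target

    return [(label, text, resolve(target) if target is not None else None)
            for label, text, target in code]


code = [
    (None, "if x relop y", "L2"),
    (None, "....", None),
    ("L2", "goto", "L4"),
    ("L3", "....", None),
    ("L4", "L4_BLOCK", None),
]
collapsed = collapse_jump_chains(code)
```

The statement at `L2` is left in place here; it can only be deleted later if no other jump still targets `L2` and control cannot fall into it.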
56
Algebraic Simplification
Simple Idea: Use algebraic rules to rewrite some code
Examples
  x := y + 0   becomes   x := y
  x := y * 1   becomes   x := y
  x := y * 0   becomes   x := 0
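These three rules can be written as a small rewrite function, sketched in Python below. The `(dst, op, a, b)` tuple encoding, the `"copy"` and `"const"` kinds, and the function name are assumptions; the rules are stated for integer arithmetic, since `y * 0` is not always `0` for IEEE floating-point values such as NaN or infinity.

```python
def simplify(stmt):
    """Apply integer algebraic identities to one statement.

    stmt is (dst, op, a, b) meaning dst := a op b (illustrative encoding);
    variables are strings and constants are Python ints.
    """
    dst, op, a, b = stmt
    if op == "+" and b == 0:
        return (dst, "copy", a, None)       # y + 0  ->  y
    if op == "+" and a == 0:
        return (dst, "copy", b, None)       # 0 + y  ->  y
    if op == "*" and b == 1:
        return (dst, "copy", a, None)       # y * 1  ->  y
    if op == "*" and a == 1:
        return (dst, "copy", b, None)       # 1 * y  ->  y
    if op == "*" and 0 in (a, b):
        return (dst, "const", 0, None)      # y * 0  ->  0
    return stmt
```

For instance, `simplify(("x", "+", "y", 0))` returns `("x", "copy", "y", None)`, which copy propagation may then remove entirely.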
57
Strength Reduction
Idea: Replace expensive operations by semantically equivalent cheaper ones
Examples: Multiplication by 2 is equivalent to a left shift. Left shift is much faster
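As a sketch of the multiply-to-shift example, the Python function below rewrites a multiplication by a positive power of two into a left shift. The `(dst, op, a, b)` tuple encoding and the function name are assumptions for illustration, and the equivalence is stated for integers only.

```python
def reduce_strength(stmt):
    """Rewrite dst := a * 2**k as dst := a << k.

    stmt is (dst, op, a, b) (illustrative encoding); b must be an int
    constant for the rule to apply, otherwise stmt is returned unchanged.
    """
    dst, op, a, b = stmt
    is_power_of_two = isinstance(b, int) and b > 0 and b & (b - 1) == 0
    if op == "*" and is_power_of_two:
        return (dst, "<<", a, b.bit_length() - 1)   # 2 -> 1, 8 -> 3
    return stmt
```

So `reduce_strength(("x", "*", "y", 2))` gives `("x", "<<", "y", 1)`, while a multiplication by 6 or by another variable is left untouched.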
58
Hardware Idiom
Idea: Replace expensive instructions by equivalent instructions that are optimized for the platform
Example
  add eax,1   becomes   inc eax
59
Concluding Remarks/Looking Ahead
Optimization Techniques/Concepts are Not Only Relevant to Programming Languages
Database Systems do Optimization to Reduce Access to Secondary Storage
Concern when: Asking for too Much Data; Joining Three or More Tables at Once; Doing a Cartesian Product Instead of a Join
Doing Selections before Joins
Termed Query Optimization
Looking Ahead: Review Machine Code Generation (if time); Final Exam Review