EECS 583 – Class 2 Control Flow Analysis

Presentation transcript:

EECS 583 – Class 2 Control Flow Analysis University of Michigan September 10, 2018

Announcements & Reading Material
  eecs583a, eecs583b.eecs.umich.edu servers are ready
    Everyone has a home directory and login
  HW 0 – Nominally due on Wednesday, but nothing to turn in
    Please get this done ASAP; talk to Ze if you have problems
    Needed for HW 1, which goes out on Wednesday
  Reading
    Today's class
      Ch 9.4, 10.4 (6.6, 9.6) from Compilers: Principles, Techniques, and Tools, Ed 1 (Ed 2)
      “Trace Selection for Compiling Large C Applications to Microcode”, Chang and Hwu, MICRO-21, 1988
    Next class
      “The Superblock: An Effective Technique for VLIW and Superscalar Compilation”, Hwu et al., Journal of Supercomputing, 1993

From Last Time: Dominator (DOM)
  Defn: Dominator – Given a CFG(V, E, Entry, Exit), a node x dominates a node y if every path from the Entry block to y contains x
  3 properties of dominators
    Each BB dominates itself
    If x dominates y, and y dominates z, then x dominates z
    If x dominates z and y dominates z, then either x dominates y or y dominates x
  Intuition
    Given some BB, which blocks are guaranteed to have executed prior to executing the BB

Dominator Analysis
  Compute dom(BBi) = set of BBs that dominate BBi
  Initialization
    dom(entry) = {entry}
    dom(everything else) = all nodes
  Iterative computation
    while change, do
      change = false
      for each BB (except the entry BB)
        tmp(BB) = {BB} + {intersection of dom of all predecessor BBs}
        if (tmp(BB) != dom(BB))
          dom(BB) = tmp(BB)
          change = true
  [Figure: example CFG with Entry, BB1–BB7, Exit]
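
The iterative computation above can be sketched in Python as follows. This is a minimal illustration, not the course's reference code: the function name `compute_dominators`, the `preds` dictionary layout, and the seven-block CFG are all assumptions made for the example (the CFG drawn on the slide did not survive in the transcript).

```python
def compute_dominators(preds, entry):
    """Return {block: set of blocks that dominate it} for a CFG.

    preds maps every block to the list of its predecessors.
    """
    nodes = set(preds)
    dom = {n: set(nodes) for n in nodes}   # dom(everything else) = all nodes
    dom[entry] = {entry}                   # dom(entry) = {entry}
    change = True
    while change:
        change = False
        for bb in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[bb]]
            common = set.intersection(*pred_doms) if pred_doms else set()
            tmp = {bb} | common
            if tmp != dom[bb]:
                dom[bb] = tmp
                change = True
    return dom


# Hypothetical CFG (an assumption, not the slide's figure):
# a diamond 1 -> {2, 3} -> 4 followed by a second diamond 4 -> {5, 6} -> 7.
preds = {1: [], 2: [1], 3: [1], 4: [2, 3], 5: [4], 6: [4], 7: [5, 6]}
dom = compute_dominators(preds, entry=1)
print(sorted(dom[7]))  # [1, 4, 7]
```

Neither BB5 nor BB6 dominates BB7 because each can be bypassed through the other, which is exactly what the intersection over predecessors captures.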

Immediate Dominator
  Defn: Immediate dominator (idom) – Each node n other than Entry has a unique immediate dominator m that is the last dominator of n on any path from the initial node to n
  Closest node (other than n itself) that dominates n
  [Figure: example CFG with Entry, BB1–BB7, Exit]

Dominator Tree
  First BB is the root node; each node dominates all of its descendants

  BB   DOM
  1    1
  2    1,2
  3    1,3
  4    1,4
  5    1,4,5
  6    1,4,6
  7    1,4,7

  Dom tree
    BB1
      BB2
      BB3
      BB4
        BB5
        BB6
        BB7
  [Figure: CFG with BB1–BB7 alongside its dominator tree]
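
As a rough sketch of how the tree follows from the DOM sets, the snippet below picks each block's immediate dominator and then inverts that mapping into parent/child lists. The helper names `immediate_dominators` and `dominator_tree` are made up for this example, and it assumes every block is reachable from the entry. The DOM sets are the ones listed in the table.

```python
def immediate_dominators(dom, entry):
    """Map each non-entry block to its immediate dominator.

    The strict dominators of a block form a chain, so the closest one is
    the strict dominator that itself has the most dominators.
    """
    idom = {}
    for bb, doms in dom.items():
        if bb != entry:
            strict = doms - {bb}
            idom[bb] = max(strict, key=lambda d: len(dom[d]))
    return idom


def dominator_tree(idom):
    """Invert idom into {parent: sorted list of children}."""
    tree = {}
    for child, parent in idom.items():
        tree.setdefault(parent, []).append(child)
    return {parent: sorted(kids) for parent, kids in tree.items()}


# DOM sets copied from the table.
dom = {1: {1}, 2: {1, 2}, 3: {1, 3}, 4: {1, 4},
       5: {1, 4, 5}, 6: {1, 4, 6}, 7: {1, 4, 7}}
idom = immediate_dominators(dom, entry=1)
tree = dominator_tree(idom)
print(tree)  # {1: [2, 3, 4], 4: [5, 6, 7]}
```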

Class Problem
  Draw the dominator tree for the following CFG
  [Figure: CFG from Entry to Exit; blocks and edges not recoverable from the transcript]

Post Dominator (PDOM)
  Reverse of dominator
  Defn: Post Dominator – Given a CFG(V, E, Entry, Exit), a node x post dominates a node y if every path from y to the Exit contains x
  Intuition
    Given some BB, which blocks are guaranteed to have executed after executing the BB
  pdom(BBi) = set of BBs that post dominate BBi
  Initialization
    pdom(exit) = {exit}
    pdom(everything else) = all nodes
  Iterative computation
    while change, do
      change = false
      for each BB (except the exit BB)
        tmp(BB) = {BB} + {intersection of pdom of all successor BBs}
        if (tmp(BB) != pdom(BB))
          pdom(BB) = tmp(BB)
          change = true
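
The post dominator computation mirrors the dominator one, walking successors instead of predecessors and starting from the exit. A minimal sketch, assuming a hypothetical seven-block CFG and an illustrative function name `compute_post_dominators` (neither comes from the slides):

```python
def compute_post_dominators(succs, exit_bb):
    """Return {block: set of blocks that post dominate it}.

    succs maps every block to the list of its successors.
    """
    nodes = set(succs)
    pdom = {n: set(nodes) for n in nodes}  # pdom(everything else) = all nodes
    pdom[exit_bb] = {exit_bb}              # pdom(exit) = {exit}
    change = True
    while change:
        change = False
        for bb in nodes - {exit_bb}:
            succ_pdoms = [pdom[s] for s in succs[bb]]
            common = set.intersection(*succ_pdoms) if succ_pdoms else set()
            tmp = {bb} | common
            if tmp != pdom[bb]:
                pdom[bb] = tmp
                change = True
    return pdom


# Hypothetical CFG (an assumption): 1 -> {2, 3} -> 4 -> {5, 6} -> 7, exit = 7.
succs = {1: [2, 3], 2: [4], 3: [4], 4: [5, 6], 5: [7], 6: [7], 7: []}
pdom = compute_post_dominators(succs, exit_bb=7)
print(sorted(pdom[1]))  # [1, 4, 7]
```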

Post Dominator Examples
  [Figure: two example CFGs, one with BB1–BB4 and one with BB1–BB7, each from Entry to Exit]

Immediate Post Dominator
  Defn: Immediate post dominator (ipdom) – Each node n other than Exit has a unique immediate post dominator m that is the first post dominator of n on any path from n to the Exit
  Closest node (other than n itself) that post dominates n
  First breadth-first successor that post dominates a node
  [Figure: example CFG with Entry, BB1–BB7, Exit]

Why Do We Care About Dominators?
  Loop detection – next subject
  Dominator
    Guaranteed to execute before
    Redundant computation – an op is redundant if it is already computed in a dominating BB and its source operands are not redefined in between
    Most global optimizations use dominance info
  Post dominator
    Guaranteed to execute after
    Make a guess (i.e., 2 pointers do not point to the same location)
    Check they really do not point to one another in the post dominating BB
  [Figure: example CFG with Entry, BB1–BB7, Exit]

Natural Loops
  Cycle suitable for optimization
    Discuss optimizations later
  2 properties
    Single entry point called the header
      Header dominates all blocks in the loop
    Must be one way to iterate the loop (i.e., at least 1 path back to the header from within the loop) called a backedge
  Backedge detection
    Edge x → y where the target (y) dominates the source (x)

Backedge Example
  [Figure: example CFG with Entry, BB1–BB6, Exit]

Loop Detection
  Identify all backedges using Dom info
  Each backedge (x → y) defines a loop
    Loop header is the backedge target (y)
    Loop BB – basic blocks that comprise the loop
      All predecessor blocks of x for which control can reach x without going through y are in the loop
  Merge loops with the same header
    I.e., a loop with 2 continues
    LoopBackedge = LoopBackedge1 + LoopBackedge2
    LoopBB = LoopBB1 + LoopBB2
  Important property
    Header dominates all LoopBB
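
A possible Python rendering of these steps is sketched below: find every edge whose target dominates its source, walk predecessors backward from the source without passing the header, and merge loops that share a header. The function names and the six-block CFG (a loop with two "continue" paths) are assumptions for illustration only.

```python
from collections import defaultdict


def dominators(preds, entry):
    """Iterative dominator sets; preds maps block -> predecessor list."""
    nodes = set(preds)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    change = True
    while change:
        change = False
        for bb in nodes - {entry}:
            ps = [dom[p] for p in preds[bb]]
            tmp = {bb} | (set.intersection(*ps) if ps else set())
            if tmp != dom[bb]:
                dom[bb], change = tmp, True
    return dom


def natural_loops(preds, entry):
    """Return ({header: LoopBB set}, {header: list of backedges})."""
    dom = dominators(preds, entry)
    loop_bb = defaultdict(set)
    backedges = defaultdict(list)
    for y in preds:
        for x in preds[y]:
            if y in dom[x]:                  # x -> y is a backedge
                backedges[y].append((x, y))
                body = {y, x}
                work = [x] if x != y else []
                while work:                  # walk backward, stopping at y
                    b = work.pop()
                    for p in preds[b]:
                        if p not in body:
                            body.add(p)
                            work.append(p)
                loop_bb[y] |= body           # merge loops with the same header
    return dict(loop_bb), dict(backedges)


# Hypothetical CFG (an assumption): 1 -> 2 -> 3 -> {4, 5}, with 4 -> 2,
# 5 -> 2 (two backedges to the same header) and 2 -> 6 leaving the loop.
preds = {1: [], 2: [1, 4, 5], 3: [2], 4: [3], 5: [3], 6: [2]}
loops, backedges = natural_loops(preds, entry=1)
print(loops)  # {2: {2, 3, 4, 5}}
```

Because the header is placed in the body before the backward walk starts, the walk never continues past it, which matches the "without going through y" condition.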

Loop Detection Example
  [Figure: example CFG with Entry, BB1–BB6, Exit]

Important Parts of a Loop
  Header, LoopBB
  Backedges, BackedgeBB
  Exitedges, ExitBB
    For each LoopBB, examine each outgoing edge
    If the edge is to a BB not in LoopBB, then it is an exit edge
  Preheader (Preloop)
    New block before the header (falls through to header)
    Whenever you invoke the loop, the preheader is executed
    Whenever you iterate the loop, the preheader is NOT executed
    All edges entering the header
      Backedges – no change
      All others – retarget to the preheader
  Postheader (Postloop) – analogous
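
The exit-edge scan and the preheader retargeting rule can be illustrated with a short sketch. The names `exit_edges`, `add_preheader`, and `PreHdr`, as well as the small CFG, are hypothetical and chosen only for this example.

```python
def exit_edges(succs, loop_bb):
    """Edges from a block inside the loop to a block outside it."""
    return sorted((bb, s) for bb in loop_bb
                  for s in succs[bb] if s not in loop_bb)


def add_preheader(succs, header, loop_bb, name="PreHdr"):
    """Return a new successor map with a preheader in front of header.

    Edges into the header from inside the loop are backedges and stay as
    they are; every other edge into the header is retargeted.
    """
    new_succs = {}
    for bb, targets in succs.items():
        if bb in loop_bb:
            new_succs[bb] = list(targets)
        else:
            new_succs[bb] = [name if t == header else t for t in targets]
    new_succs[name] = [header]   # preheader falls through to the header
    return new_succs


# Hypothetical CFG (an assumption): loop {BB1, BB2, BB3, BB4} with header BB1.
succs = {
    "Entry": ["BB1"],
    "BB1": ["BB2", "Exit"],
    "BB2": ["BB3", "BB4"],
    "BB3": ["BB1"],
    "BB4": ["BB1"],
    "Exit": [],
}
loop_bb = {"BB1", "BB2", "BB3", "BB4"}
exits = exit_edges(succs, loop_bb)
with_preheader = add_preheader(succs, "BB1", loop_bb)
print(exits)                    # [('BB1', 'Exit')]
print(with_preheader["Entry"])  # ['PreHdr']
```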

Find the Preheaders for each Loop
  [Figure: example CFG with Entry, BB1–BB6, Exit; the preheader positions are left as "??" for the reader to fill in]

Characteristics of a Loop
  Nesting (generally within a procedure scope)
    Inner loop – Loop with no loops contained within it
    Outer loop – Loop contained within no other loops
    Nesting depth
      depth(outer loop) = 1
      depth = depth(parent or containing loop) + 1
  Trip count (average trip count)
    How many times (on average) does the loop iterate
    for (I=0; I<100; I++) → trip count = 100
    With profile info: Ave trip count = weight(header) / weight(preheader)
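
Both quantities are easy to compute once loops are known. The sketch below assumes loops are given as a map from header to LoopBB set (already merged per header) and that profile weights are per-block execution counts. The function names and all numbers are invented for illustration and are not taken from the trip count exercise in the slides.

```python
def nesting_depths(loops):
    """Depth of each loop: 1 for an outer loop, parent depth + 1 otherwise.

    loops maps header -> set of blocks in that loop. A loop is contained
    in another when its blocks are a subset of the other's, so the depth
    is the number of loops (itself included) that contain it.
    """
    return {
        header: sum(1 for other in loops.values() if body <= other)
        for header, body in loops.items()
    }


def average_trip_count(weight, header, preheader):
    """Ave trip count = weight(header) / weight(preheader)."""
    return weight[header] / weight[preheader]


# Hypothetical loops and profile weights (assumptions for this example).
loops = {
    "BB2": {"BB2", "BB3", "BB4", "BB5"},   # outer loop
    "BB3": {"BB3", "BB4"},                 # loop nested inside it
}
weight = {"PreHdr": 20, "BB2": 2000}

depths = nesting_depths(loops)
trips = average_trip_count(weight, header="BB2", preheader="PreHdr")
print(depths)  # {'BB2': 1, 'BB3': 2}
print(trips)   # 100.0
```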

Trip Count Calculation Example
  Calculate the trip counts for all the loops in the graph
  [Figure: CFG with Entry, BB1–BB6, Exit, annotated with profile weights; the placement of the weights is not recoverable from the transcript]

Reducible Flow Graphs
  A flow graph is reducible if and only if we can partition the edges into 2 disjoint groups, often called forward and back edges, with the following properties
    The forward edges form an acyclic graph in which every node can be reached from the Entry
    The back edges consist only of edges whose destinations dominate their sources
  More simply – Take a CFG, remove all the backedges (x → y where y dominates x), and you should have a connected, acyclic graph
  [Figure: non-reducible example with bb1, bb2, bb3]
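
The "more simply" test translates directly into code: drop every edge whose target dominates its source and check that what remains has no cycle. The sketch below assumes every block is reachable from the entry and appears as a key in `succs`. The function names are illustrative, and the three-block edge list is an assumed reading of the non-reducible figure (bb1 branches to both bb2 and bb3, which branch to each other).

```python
def dominators(succs, entry):
    """Iterative dominator sets computed from a successor map."""
    preds = {n: [] for n in succs}
    for n, targets in succs.items():
        for t in targets:
            preds[t].append(n)
    dom = {n: set(succs) for n in succs}
    dom[entry] = {entry}
    change = True
    while change:
        change = False
        for bb in set(succs) - {entry}:
            ps = [dom[p] for p in preds[bb]]
            tmp = {bb} | (set.intersection(*ps) if ps else set())
            if tmp != dom[bb]:
                dom[bb], change = tmp, True
    return dom


def is_reducible(succs, entry):
    dom = dominators(succs, entry)
    # Keep only forward edges: drop x -> y whenever y dominates x.
    forward = {x: [y for y in ys if y not in dom[x]]
               for x, ys in succs.items()}
    state = {}  # 1 = on the DFS stack, 2 = finished

    def has_cycle(n):
        state[n] = 1
        for s in forward[n]:
            if state.get(s) == 1 or (s not in state and has_cycle(s)):
                return True
        state[n] = 2
        return False

    return not any(n not in state and has_cycle(n) for n in forward)


# Assumed edges for the non-reducible figure: a two-entry cycle.
irreducible = {"bb1": ["bb2", "bb3"], "bb2": ["bb3"], "bb3": ["bb2"]}
# A simple natural loop for comparison (hypothetical).
simple_loop = {"bb1": ["bb2"], "bb2": ["bb1", "bb3"], "bb3": []}

print(is_reducible(irreducible, "bb1"))  # False
print(is_reducible(simple_loop, "bb1"))  # True
```

In the first graph neither bb2 nor bb3 dominates the other, so no edge of the bb2/bb3 cycle qualifies as a backedge and the cycle survives.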

Regions
  Region: A collection of operations that are treated as a single unit by the compiler
  Examples
    Basic block
    Procedure
    Body of a loop
  Properties
    Connected subgraph of operations
    Control flow is the key parameter that defines regions
    Hierarchically organized
  Problem
    Basic blocks are too small (3-5 operations)
      Hard to extract sufficient parallelism
    Procedure control flow too complex for many compiler transformations
      Plus only parts of a procedure are important (90/10 rule)

Regions (2)
  Want
    Intermediate sized regions with simple control flow
    Bigger basic blocks would be ideal!
    Separate important code from less important
    Optimize frequently executed code at the expense of the rest
  Solution
    Define new region types that consist of multiple BBs
    Profile information used in the identification
    Sequential control flow (sorta)
    Pretend the regions are basic blocks

Region Type 1 - Trace
  Trace - Linear collection of basic blocks that tend to execute in sequence
    “Likely control flow path”
    Acyclic (outer backedge ok)
  Side entrance – branch into the middle of a trace
  Side exit – branch out of the middle of a trace
  Compilation strategy
    Compile assuming path occurs 100% of the time
    Patch up side entrances and exits afterwards
  Motivated by scheduling (i.e., trace scheduling)
  [Figure: CFG of BB1–BB6 annotated with profile edge counts]

Linearizing a Trace
  [Figure: the trace laid out linearly. Labels in reading order: 10 (entry count), BB1, 20 (side exit), 80, BB2, BB3, 80, 20 (side entrance), BB4, 10 (side exit), BB5, 90, 10 (side entrance), BB6, 10 (exit count)]

Intelligent Trace Layout for Icache Performance
  Intraprocedural code placement
  Procedure positioning
  Procedure splitting
  [Figure: procedure view vs. trace view of BB1–BB6, laid out as trace 1, trace 2, trace 3, then the rest]

Issues With Selecting Traces
  Acyclic
    Cannot go past a backedge
  Trace length
    Longer = better?
    Not always!
  On-trace / off-trace transitions
    Maximize on-trace
    Minimize off-trace
    Compile assuming on-trace is 100% (i.e., single BB)
    Penalty for off-trace
  Tradeoff (heuristic)
    Length
    Likelihood of remaining within the trace
  [Figure: CFG of BB1–BB6 annotated with profile edge counts]

Trace Selection Algorithm
  i = 0
  mark all BBs unvisited
  while (there are unvisited nodes) do
    seed = unvisited BB with largest execution freq
    trace[i] += seed
    mark seed visited
    current = seed
    /* Grow trace forward */
    while (1) do
      next = best_successor_of(current)
      if (next == 0) then break
      trace[i] += next
      mark next visited
      current = next
    endwhile
    /* Grow trace backward analogously */
    i++
  endwhile

Best Successor/Predecessor
  Node weight vs edge weight
    Edge weight is more accurate
  THRESHOLD controls off-trace probability
    60-70% found best
  Notes on this algorithm
    BB only allowed in 1 trace
    Cumulative probability ignored
    Min weight for seed to be chosen (i.e., executed 100 times)

  best_successor_of(BB)
    e = control flow edge with highest probability leaving BB
    if (e is a backedge) then
      return 0
    endif
    if (probability(e) <= THRESHOLD) then
      return 0
    endif
    d = destination of e
    if (d is visited) then
      return 0
    endif
    return d
  end procedure
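
To make the interplay between the seed loop and `best_successor_of` concrete, here is a hedged Python sketch of the forward-growth half only (backward growth is omitted for brevity). The data layout, the use of `None` in place of 0, and the default threshold of 0.6 are choices made for this example. The edge counts are an assumed reading of the six-block trace figure, including a 90-count backedge from BB6 to BB1, which the transcript does not state explicitly.

```python
def best_successor_of(bb, edges, backedges, visited, threshold):
    """Most likely successor of bb, or None if the trace should stop."""
    out = edges.get(bb, {})
    total = sum(out.values())
    if total == 0:
        return None
    dest = max(out, key=out.get)          # edge with highest probability
    if (bb, dest) in backedges:
        return None
    if out[dest] / total <= threshold:
        return None
    if dest in visited:                   # a BB is only allowed in 1 trace
        return None
    return dest


def select_traces(weight, edges, backedges, threshold=0.6):
    """Greedy trace selection, growing each trace forward from its seed."""
    visited, traces = set(), []
    while len(visited) < len(weight):
        unvisited = (b for b in weight if b not in visited)
        seed = max(unvisited, key=weight.get)
        trace, current = [seed], seed
        visited.add(seed)
        while True:
            nxt = best_successor_of(current, edges, backedges,
                                    visited, threshold)
            if nxt is None:
                break
            trace.append(nxt)
            visited.add(nxt)
            current = nxt
        traces.append(trace)
    return traces


# Assumed profile for the BB1-BB6 example (block weights and edge counts).
weight = {"BB1": 100, "BB2": 80, "BB3": 20, "BB4": 100, "BB5": 10, "BB6": 100}
edges = {
    "BB1": {"BB2": 80, "BB3": 20},
    "BB2": {"BB4": 80},
    "BB3": {"BB4": 20},
    "BB4": {"BB5": 10, "BB6": 90},
    "BB5": {"BB6": 10},
    "BB6": {"BB1": 90, "Exit": 10},
}
backedges = {("BB6", "BB1")}

traces = select_traces(weight, edges, backedges)
print(traces)  # [['BB1', 'BB2', 'BB4', 'BB6'], ['BB3'], ['BB5']]
```

Ties between equally hot seeds are broken by dictionary order here, which is an arbitrary choice; a production compiler would likely also apply the minimum seed weight mentioned in the notes.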

Class Problems
  Find the traces. Assume a threshold probability of 60%.
  [Figure: two CFGs annotated with profile weights, one with BB1–BB8 and one with BB1–BB9; the placement of the weights is not recoverable from the transcript]