Improving Data-flow Analysis with Path Profiles ● Glenn Ammons & James R. Larus ● University of Wisconsin-Madison ● 1998 ● Presented by Jessica Friis.

Problem and Approach ● Some paths in a CFG are never executed ● A small number of paths often accounts for a large portion of the runtime ● By duplicating some frequently executed paths, we can make the analysis of them more precise

Outline ● Identify “hot paths,” the frequently run paths in a CFG, by doing a training run of the program ● Duplicate the hot paths to form a new CFG, the Hot Path Graph, or HPG ● The extra paths allow more precise analysis. Constant propagation is the analysis examined here ● Reduce the graph to preserve only the valuable solutions for the rest of compilation ● Implementation and Results

Example CFG ● Hot paths taken during sample executions: ● A, B, C, E, F, H, I, X ● A, B, D, E, F, H ● B, D, E, G, H ● B, D, E, F, H, I, X ● Uses Ball-Larus path profiles ● The paths were chosen so that they cover 97% of execution time
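The selection step on this slide can be sketched as a greedy cut over a path profile. This is a minimal illustration, not the authors' implementation: the function name `select_hot_paths`, the profile layout (a dict from path tuples to execution counts), and the sample counts are all assumptions, and raw counts stand in for the execution-time weighting the slide mentions.

```python
def select_hot_paths(path_counts, coverage=0.97):
    """Pick the most frequent paths until they account for `coverage`
    of the total profile weight. `path_counts` maps a path (a tuple of
    CFG vertex names) to its weight in the training run (hypothetical layout)."""
    total = sum(path_counts.values())
    chosen, covered = [], 0
    # Visit paths from heaviest to lightest.
    for path, count in sorted(path_counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= coverage * total:
            break
        chosen.append(path)
        covered += count
    return chosen


# Made-up profile for illustration only.
profile = {("A", "B", "C"): 60, ("A", "B", "D"): 38, ("A", "E"): 2}
hot = select_hot_paths(profile)
```

With this made-up profile the two heaviest paths already cover 98% of the weight, so the third path is left out as cold.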

Finite Automaton for the path profile ● The retrieval tree is used as a simple representation of the automaton ● CFG edges label the transitions ● From states 13, 15, or 16, a B takes you back to state 0 ● Any transition not shown goes to an error state ● These extra states and edges are left out of the figure for readability
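A retrieval tree (trie) over the hot paths can be sketched as follows. This is a simplified illustration under stated assumptions: the names `build_retrieval_tree`, `step`, and `ERROR` are hypothetical, states are numbered in creation order (so they will not match the numbering in the slide's figure), and the transitions that return to state 0 are omitted.

```python
ERROR = -1  # hypothetical marker for the automaton's error state


def build_retrieval_tree(hot_paths):
    """Build a trie whose transitions are labeled by CFG edges (v, w).
    transitions[s] maps an edge to the next state; state 0 is the root."""
    transitions = [{}]
    for path in hot_paths:
        state = 0
        for edge in zip(path, path[1:]):
            nxt = transitions[state].get(edge)
            if nxt is None:
                transitions.append({})
                nxt = len(transitions) - 1
                transitions[state][edge] = nxt
            state = nxt
    return transitions


def step(transitions, state, edge):
    """Follow one CFG edge; any unlabeled transition leads to the error state."""
    if state == ERROR:
        return ERROR
    return transitions[state].get(edge, ERROR)


# The four hot paths listed on the example slide.
hot_paths = [
    ("A", "B", "C", "E", "F", "H", "I", "X"),
    ("A", "B", "D", "E", "F", "H"),
    ("B", "D", "E", "G", "H"),
    ("B", "D", "E", "F", "H", "I", "X"),
]
tree = build_retrieval_tree(hot_paths)
```

Paths that share a prefix share states, which is why the first two paths split only after the edge from A to B.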

Trace CFG and FA to get HPG ● Uses Holley and Rosen's data-flow tracing algorithm ● Each HPG node is a tuple {v, q}, where v is a vertex from the original CFG and q is a state in the automaton ● A worklist algorithm starts with {r, q_ε}, where r is the entry vertex of the CFG and q_ε is the start state of the automaton. It follows the edges of the CFG and the automaton together to create the new HPG
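The worklist construction described above amounts to building the reachable part of the product of the CFG and the automaton. The sketch below is an assumption-laden illustration, not Holley and Rosen's algorithm as published: `trace`, `delta`, and the small diamond-shaped CFG with one hot path A, B, D are invented for the example, and HPG nodes are written as Python tuples (v, q).

```python
from collections import deque

ERROR = -1  # hypothetical error state of the automaton


def trace(cfg_succ, entry, start_state, delta):
    """Return the HPG as a dict from (vertex, state) to its successor pairs.
    `cfg_succ` maps a vertex to its CFG successors; `delta(q, edge)` is the
    automaton's transition function."""
    start = (entry, start_state)
    hpg = {start: []}
    worklist = deque([start])
    while worklist:
        v, q = worklist.popleft()
        for w in cfg_succ.get(v, []):
            target = (w, delta(q, (v, w)))
            hpg[(v, q)].append(target)
            if target not in hpg:  # first visit: create the node and queue it
                hpg[target] = []
                worklist.append(target)
    return hpg


# Toy CFG: A branches to B and C, which both rejoin at D.
cfg_succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
# Toy automaton recognizing the single hot path A, B, D.
table = {(0, ("A", "B")): 1, (1, ("B", "D")): 2}


def delta(q, edge):
    return table.get((q, edge), ERROR)


hpg = trace(cfg_succ, "A", 0, delta)
```

In the result, D appears twice: once as ("D", 2) on the hot path and once as ("D", ERROR) for the cold path through C, so facts from the two paths no longer merge at D.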

New HPG ● The diagonally striped nodes (A0 and B0) represent the beginnings of the forward paths ● The shaded nodes represent the error states in the automaton (paths that are not “hot”) ● No information from the original CFG has been lost

New Knowledge with HPG ● At H14, a+b=6 ● At H12, H15, a+b=5 ● At H13, a+b=4 ● At H14 & H15, i++ = 1 ● At I17, n is 1 ● These are new constant results
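Why duplication exposes these constants can be seen from the meet operation of constant propagation. The sketch below is illustrative only: `meet` and `NOT_CONSTANT` are hypothetical names, and it assumes that the a+b values listed on this slide are exactly the facts that would all flow into the single vertex H of the original CFG.

```python
NOT_CONSTANT = object()  # hypothetical marker for "value varies across paths"


def meet(values):
    """Combine the values arriving at a vertex (assumed non-empty):
    the result is a constant only if every incoming path agrees."""
    distinct = set(values)
    return distinct.pop() if len(distinct) == 1 else NOT_CONSTANT


# Original CFG: every path merges at the single vertex H.
merged_at_h = meet([6, 5, 5, 4])

# HPG: each copy of H sees only the paths that lead to it.
per_copy = {"H14": meet([6]), "H12": meet([5]), "H15": meet([5]), "H13": meet([4])}
```

In the merged case the values disagree, so a+b is not a constant at H, while each duplicated copy keeps its own constant.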

Eliminate unneeded vertices ● Heuristic algorithm for identifying the most valuable duplicated vertices. ● Others can be reduced/combined to form rHPG for further compilation.

Implementation ● Implemented as two new passes in the SUIF compiler, PP and PW ● The program is compiled into intermediate form ● The PP pass instruments the intermediate form for path profiling ● Running the instrumented program gives us the path profile ● Next, the intermediate code and the path profile are run through PW ● PW generates the HPG, discovers new constants, and generates the rHPG ● The output is compiled into an executable

Speedup of SPEC95 benchmarks ● The benchmarks with the most new constants found sped up; the others slowed down ● The increase in program size is the suspected cause of the slowdown

Program Costs ● Cost of duplication in size, CFG → HPG – Go increased 184% – All others increased an average of 32% ● Cost of duplication in size, CFG → rHPG – Go increased 77% – All others increased less than 10% ● Analysis time – Go took 6 times longer – All others took an average of 61% longer

Contributions ● Shows improvements in the precision of data-flow analysis through profile-guided duplication ● Describes how to reduce hot path graphs ● Preserves path profiling information through the transformation from CFG to HPG to rHPG ● Applies the technique to constant propagation to show a performance increase

Issues ● The cost of analysis is significant – Should only be used just before release ● The increased graph size slows down the running time – Specific reasons are not known, but larger program size is a possibility ● They only tested with a single optimization – Adding other optimizations may give larger speedups

Questions?