More Examples of Abstract Interpretation: Register Allocation

1) Constant Propagation

A special case of interval analysis in which every interval is a single point:

D = { [a,a] | a ∈ {-M, -(M-1), …, -2, -1, 0, 1, 2, 3, …, M-1} } ∪ { ⊥, ⊤ }

Write [a,a] simply as a. So the values are:
– a : the variable is a known constant a at this point
– ⊤ : "we could not show it is constant"
– ⊥ : "we did not reach this program point"

Convergence is fast: the lattice has small height.
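A minimal Java sketch of this flat lattice and of the transfer function for +, writing BOT for ⊥ and TOP for ⊤. All class and method names here are illustrative, not from the lecture code:

```java
// Flat constant-propagation lattice: BOT < every constant < TOP.
final class ConstVal {
    static final ConstVal BOT = new ConstVal(0, "BOT");  // point not reached
    static final ConstVal TOP = new ConstVal(0, "TOP");  // not known constant
    final int v; final String kind;                      // kind: BOT, TOP, CONST
    private ConstVal(int v, String kind) { this.v = v; this.kind = kind; }
    static ConstVal of(int v) { return new ConstVal(v, "CONST"); }

    // Least upper bound (join) of two abstract values.
    ConstVal join(ConstVal o) {
        if (this == BOT) return o;
        if (o == BOT) return this;
        if (kind.equals("CONST") && o.kind.equals("CONST") && v == o.v) return this;
        return TOP;
    }

    // Transfer function for x = y + z: the "table for +".
    static ConstVal plus(ConstVal y, ConstVal z) {
        if (y == BOT || z == BOT) return BOT;  // unreachable stays unreachable
        if (y == TOP || z == TOP) return TOP;  // unknown operand: result unknown
        return of(y.v + z.v);                  // both constants: fold at analysis time
    }
}
```

Note that ⊥ is absorbing for the transfer function but neutral for the join, which is exactly what makes unreachable code stay at ⊥ until some path reaches it.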

Transfer Function for Plus (in pseudo-Scala)

x = y + z

For each variable (x, y, z) we store a constant, ⊥, or ⊤:

abstract class Element
case object Top extends Element
case object Bot extends Element
case class Const(v: Int) extends Element

var facts : Map[Nodes, Map[VarNames, Element]]

What executes during analysis of an edge (v1, x = y + z, v2):

val oldY = facts(v1)("y")
val oldZ = facts(v1)("z")
val newX = tableForPlus(oldY, oldZ)
facts(v2) = facts(v2) join facts(v1).updated("x", newX)

Table for +:

def tableForPlus(y: Element, z: Element) = (y, z) match {
  case (Const(cy), Const(cz)) => Const(cy + cz)
  case (Bot, _) => Bot
  case (_, Bot) => Bot
  case (Top, _) => Top
  case (_, Top) => Top
}

2) Initialization Analysis

A variable starts out uninitialized and becomes initialized at its first initialization.

What does javac say to this?

class Test {
  static void test(int p) {
    int n;
    p = p - 1;
    if (p > 0) {
      n = 100;
    }
    while (n != 0) {
      System.out.println(n);
      n = n - p;
    }
  }
}

Test.java:8: variable n might not have been initialized
    while (n != 0) {
           ^
1 error

Program that compiles in Java:

class Test {
  static void test(int p) {
    int n;
    p = p - 1;
    if (p > 0) {
      n = 100;
    } else {
      n = -100;
    }
    while (n != 0) {
      System.out.println(n);
      n = n - p;
    }
  }
}
// Try using if (p > 0) a second time.

We would like variables to be initialized on all execution paths. Otherwise, the program execution could be undesirably affected by the value that was in the variable initially. We can enforce such a check using initialization analysis.

Initialization Analysis

⊤ indicates the presence of flow from states where the variable was not initialized:
– if the variable is possibly uninitialized, we use ⊤
– otherwise (initialized, or unreachable): ⊥

class Test {
  static void test(int p) {
    int n;
    p = p - 1;
    if (p > 0) {
      n = 100;
    } else {
      n = -100;
    }
    while (n != 0) {
      System.out.println(n);
      n = n - p;
    }
  }
}
// Try using if (p > 0) a second time.

If a variable occurs anywhere but on the left-hand side of an assignment and has value ⊤, report an error.
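This check can be sketched as a forward data-flow analysis over a two-element lattice, where the fact true stands for "possibly uninitialized" (⊤) and false for "initialized or unreachable" (⊥). A hedged Java sketch, with an illustrative CFG encoding that is not from the lecture:

```java
// Possibly-uninitialized analysis for one variable over a tiny CFG.
// Edge e goes from edges[e][0] to edges[e][1] and carries a statement:
// "def" initializes the variable, anything else leaves it unchanged.
class InitAnalysis {
    static boolean[] analyze(int nodes, int entry, int[][] edges, String[] stmts) {
        boolean[] possiblyUninit = new boolean[nodes];
        possiblyUninit[entry] = true;          // the variable starts uninitialized
        boolean changed = true;
        while (changed) {                      // fixpoint iteration
            changed = false;
            for (int e = 0; e < edges.length; e++) {
                int from = edges[e][0], to = edges[e][1];
                // Transfer function: "def" makes the variable initialized.
                boolean out = stmts[e].equals("def") ? false : possiblyUninit[from];
                if (out && !possiblyUninit[to]) {  // join for a may-analysis is OR
                    possiblyUninit[to] = true;
                    changed = true;
                }
            }
        }
        return possiblyUninit;
    }
}
```

On a diamond where only one branch initializes the variable, as in the first Test example above, the merge point gets ⊤, which is exactly why javac rejects that program.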

3) Liveness Analysis

A variable is dead before its first initialization, live from an initialization up to its last use, and dead again afterwards. A variable is dead at a given point if its current value will not be used in the future: if there are no uses before it is reassigned or the execution ends, then the variable is surely dead at that point.

Example: in
x = y + x
if (x > y)
what is written and what is read?

Purpose: register allocation, that is, finding a good way to decide which variable should go into which register at which point in time.

How Transfer Functions Look

Initialization: Forward Analysis

while (there was change) {
  pick edge (v1, statmt, v2) from CFG such that facts(v1) has changed
  facts(v2) = facts(v2) join transferFun(statmt, facts(v1))
}

Liveness: Backward Analysis

while (there was change) {
  pick edge (v1, statmt, v2) from CFG such that facts(v2) has changed
  facts(v1) = facts(v1) join transferFun(statmt, facts(v2))
}
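As a concrete instance of the backward scheme, here is a hedged Java sketch of liveness analysis in which each CFG edge is summarized by the sets of variables its statement reads and writes (the encoding is illustrative):

```java
import java.util.*;

// Backward liveness: for every CFG edge (v1, stmt, v2),
//   live(v1) ⊇ (live(v2) \ written(stmt)) ∪ read(stmt)
// and the join is set union, iterated to a fixpoint.
class Liveness {
    static List<Set<String>> analyze(int nodes, int[][] edges,
                                     List<Set<String>> reads,
                                     List<Set<String>> writes) {
        List<Set<String>> live = new ArrayList<>();
        for (int i = 0; i < nodes; i++) live.add(new HashSet<>());
        boolean changed = true;
        while (changed) {
            changed = false;
            for (int e = 0; e < edges.length; e++) {
                int v1 = edges[e][0], v2 = edges[e][1];
                Set<String> in = new HashSet<>(live.get(v2)); // facts after stmt
                in.removeAll(writes.get(e));                  // kill written vars
                in.addAll(reads.get(e));                      // gen read vars
                if (live.get(v1).addAll(in)) changed = true;  // join = union
            }
        }
        return live;
    }
}
```

For the edge carrying x = y + x followed by an edge that only reads x, the fixpoint yields live = {x, y} before the assignment and {x} after it.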

Example:
x = m[0]
y = m[1]
xy = x * y
z = m[2]
yz = y * z
xz = x * z
res1 = xy + yz
m[3] = res1 + xz

4) Data Representation Overview

Original and Target Program Have Different Views of Program State

Original program:
– local variables given by names (any number of them)
– each procedure execution has fresh space for its variables (even if it is recursive)
– fields given by names

Java Virtual Machine:
– local variables given by slots (0, 1, 2, …), any number
– intermediate values stored in the operand stack
– each procedure call gets fresh slots and a fresh stack
– fields given by names and object references

Machine code: program state is a large array of bytes and a finite number of registers.

Compilation Performs Automated Data Refinement

Source program P:
x = y + 7 * z

Compiled code [[P]] (x, y, z in slots 1, 2, 3):
iload 2
iconst 7
iload 3
imul
iadd
istore 1

Interpreting P on state s (x ↦ 0, y ↦ 5, z ↦ 8) yields s' (x ↦ 61, y ↦ 5, z ↦ 8). Running [[P]] on state c (1 ↦ 0, 2 ↦ 5, 3 ↦ 8) yields c' (1 ↦ 61, 2 ↦ 5, 3 ↦ 8).

The data refinement function R maps s to c and s' to c': if interpreting program P leads from s to s', then running the compiled code [[P]] leads from R(s) to R(s').

Inductive Argument for Correctness

Source execution:
s1 (x ↦ 0, y ↦ 5, z ↦ 8)
  x = y + 7 * z
s2 (x ↦ 61, y ↦ 5, z ↦ 8)
  y = z + 1
s3 (x ↦ 61, y ↦ 9, z ↦ 8)

Compiled execution (x, y, z in slots 0, 1, 2):
c1 (0 ↦ 0, 1 ↦ 5, 2 ↦ 8)
  [[x = y + 7 * z]]
c2 (0 ↦ 61, 1 ↦ 5, 2 ↦ 8)
  iload 2; iconst 1; iadd; istore 1
c3 (0 ↦ 61, 1 ↦ 9, 2 ↦ 8)

R maps s1 to c1, s2 to c2, and s3 to c3. (R may need to be a relation, not just a function.)

A Simple Theorem

P : S → S is the program meaning function
Pc : C → C is the meaning function for the compiled program
R : S → C is the data representation function

Let s_{n+1} = P(s_n), n = 0, 1, … be the interpreted execution.
Let c_{n+1} = Pc(c_n), n = 0, 1, … be the compiled execution.

Theorem: If
– c_0 = R(s_0)
– for all s, Pc(R(s)) = R(P(s))
then c_n = R(s_n) for all n.

Proof: by induction; the inductive step is
c_{n+1} = Pc(c_n) = Pc(R(s_n)) = R(P(s_n)) = R(s_{n+1}).

R is often called a simulation relation.

Example of a Simple R

Let the receiver, the parameters, and the local variables, in their order of declaration, be x_1, x_2, …, x_n. Then R maps a program state holding only integers like this: the state

x_1 ↦ v_1, x_2 ↦ v_2, x_3 ↦ v_3, …, x_n ↦ v_n

is mapped by R to the slot assignment

0 ↦ v_1, 1 ↦ v_2, 2 ↦ v_3, …, (n-1) ↦ v_n
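A hedged Java sketch of this simplest R, assuming the state is given as a name-to-value map together with an explicit declaration order (the names are illustrative):

```java
import java.util.*;

// Simplest data representation function R: the i-th declared variable
// (1-based) is stored in JVM-style slot i-1.
class Repr {
    static long[] R(List<String> declOrder, Map<String, Long> state) {
        long[] slots = new long[declOrder.size()];
        for (int i = 0; i < declOrder.size(); i++)
            slots[i] = state.get(declOrder.get(i));   // x_{i+1} goes to slot i
        return slots;
    }
}
```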

R for Booleans

Let the receiver, the parameters, and the local variables, in their order of declaration, be x_1, x_2, …, x_n. Then R maps a program state where x_1 and x_2 are integers but x_3 and x_4 are Booleans (true ↦ 1, false ↦ 0) like this: the state

x_1 ↦ 3, x_2 ↦ 9, x_3 ↦ true, x_4 ↦ false

is mapped by R to the slot assignment

0 ↦ 3, 1 ↦ 9, 2 ↦ 1, 3 ↦ 0

R that Depends on the Program Point

def main(x: Int) {
  var res, y, z: Int
  if (x > 0) {
    y = x + 1
    res = y
  } else {
    z = -x - 10
    res = z
  }
  …
}

A single R for the whole procedure maps x ↦ v_1, res ↦ v_2, y ↦ v_3, z ↦ v_4 to 0 ↦ v_1, 1 ↦ v_2, 2 ↦ v_3, 3 ↦ v_4. With one R per program point we can do better: in the then-branch, R_1 maps x, res, y to slots 0, 1, 2 (z is not used there); in the else-branch, R_2 maps x, res, z to slots 0, 1, 2. Map y and z to the same slot. Consume fewer slots!

Packing Variables into Memory

If values are not used at the same time, we can store them in the same place. This technique arises in:
– register allocation: store frequently used values in a bounded number of fast registers
– 'malloc' and 'free' manual memory management: free releases memory to be reused for later objects
– garbage collection, e.g. for the JVM and .NET, as well as for languages that run on top of them (e.g. Scala)

Register Machines

Better for most purposes than stack machines:
– closer to modern CPUs (RISC architecture)
– closer to control-flow graphs
– simpler than a stack machine

Example: the ARM architecture.

The machine has a few fast registers R0, R1, …, R31 and directly addressable RAM (large, gigabytes, but slow).

Basic Instructions of Register Machines

Ri ← Mem[Rj]    (load)
Mem[Rj] ← Ri    (store)
Ri ← Rj * Rk    (compute, for an operation *)

Efficient register machine code uses as few loads and stores as possible.
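To make the instruction set concrete, here is a hedged Java sketch of an interpreter for exactly these three instruction forms, with ADD as a second sample compute operation; the textual encoding of instructions is illustrative:

```java
// A tiny interpreter for the three register-machine instruction forms:
//   LOAD  r a    : R[r]      <- Mem[R[a]]
//   STORE a r    : Mem[R[a]] <- R[r]
//   MUL/ADD r a b: R[r]      <- R[a] * R[b]   (sample compute operations)
class RegMachine {
    static void run(String[][] prog, long[] regs, long[] mem) {
        for (String[] ins : prog) {
            switch (ins[0]) {
                case "LOAD":  regs[r(ins[1])] = mem[(int) regs[r(ins[2])]]; break;
                case "STORE": mem[(int) regs[r(ins[1])]] = regs[r(ins[2])]; break;
                case "MUL":   regs[r(ins[1])] = regs[r(ins[2])] * regs[r(ins[3])]; break;
                case "ADD":   regs[r(ins[1])] = regs[r(ins[2])] + regs[r(ins[3])]; break;
                default: throw new IllegalArgumentException("unknown op " + ins[0]);
            }
        }
    }
    private static int r(String s) { return Integer.parseInt(s); }
}
```

Note that loads and stores go through an address held in a register, so keeping values in registers between computations is what avoids memory traffic.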

State Mapped to a Register Machine

Both the dynamically allocated heap and the stack expand:
– the heap need not be contiguous; the program can request more memory from the OS if needed
– the stack grows downwards

The heap is more general: we can allocate, read/write, and deallocate in any order. A garbage collector does deallocation automatically; it must be able to find free space among used blocks and group free blocks into larger ones (compaction).

The stack is more efficient: allocation is a simple increment or decrement. The top-of-stack pointer (SP) is often a register. If the stack grows towards smaller addresses:
– to allocate N bytes on the stack (push): SP := SP - N
– to deallocate N bytes from the stack (pop): SP := SP + N

A typical layout, from low to high addresses: constants, static globals, heap, free memory, stack, with SP near the top (e.g. at 1 GB) growing downwards. The exact picture may depend on the hardware and the OS.

JVM vs General Register Machine Code

JVM:
imul

Register machine:
R1 ← Mem[SP]
SP = SP + 4
R2 ← Mem[SP]
R2 ← R1 * R2
Mem[SP] ← R2

5) Register Allocation

How many variables? x, y, z, xy, yz, xz, res1

x = m[0]
y = m[1]
xy = x * y
z = m[2]
yz = y * z
xz = x * z
res1 = xy + yz
m[3] = res1 + xz

Do we need 7 distinct registers if we wish to avoid loads and stores? No: we can reuse variables once they are dead:

x = m[0]
y = m[1]
xy = x * y
z = m[2]
yz = y * z
y = x * z    // reuse y
x = xy + yz  // reuse x
m[3] = x + y

Idea of Graph Coloring

Register Interference Graph (RIG):
– the graph indicates whether there exists a point in time where both variables are live
– if so, we draw an edge between them
– we will then assign different registers to these variables
– finding an assignment of variables to K registers corresponds to coloring the graph using K colors!
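A hedged Java sketch that builds such an interference graph from per-program-point live sets; the representation is illustrative:

```java
import java.util.*;

// Build a Register Interference Graph: variables a and b interfere
// (get an edge) if some program point's live set contains both.
class RIG {
    static Map<String, Set<String>> build(List<Set<String>> liveSets) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (Set<String> live : liveSets) {
            for (String a : live) {
                graph.computeIfAbsent(a, k -> new HashSet<>());
                for (String b : live)
                    if (!a.equals(b)) graph.get(a).add(b);  // draw edge a-b
            }
        }
        return graph;
    }
}
```

Feeding in the live sets computed by liveness analysis for the running example produces exactly the edges that forbid sharing a register.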

Graph Coloring Algorithm

Simplify: if there is a node with fewer than K neighbors, we will always be able to color it, so we can remove it from the graph. This reduces the graph size. (The criterion is incomplete: every planar graph can be colored with at most 4 colors, yet it can have nodes with 100 neighbors.)

Spill: if every node has K or more neighbors, we pick one of them, mark it as a node for potential spilling, then remove it and continue.

Select: assign colors backwards, re-adding the nodes that were removed. If we reach a node that was marked for spilling, we check whether we are lucky and can color it anyway. If yes, continue. If no, insert instructions to save and load its value from memory and restart with the new graph (the new graph is easier to color, since we removed a variable).
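The three phases can be sketched in Java as follows; this is a hedged, unoptimized version of the simplify/spill/select scheme, with illustrative names:

```java
import java.util.*;

// Chaitin-style coloring: returns a map var -> color in [0, K);
// variables that truly cannot be colored are added to 'spilled'.
class Coloring {
    static Map<String, Integer> color(Map<String, Set<String>> graph, int k,
                                      Set<String> spilled) {
        Map<String, Set<String>> g = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : graph.entrySet())
            g.put(e.getKey(), new HashSet<>(e.getValue()));
        Deque<String> stack = new ArrayDeque<>();
        Set<String> remaining = new HashSet<>(g.keySet());
        while (!remaining.isEmpty()) {
            // Simplify: prefer a node with fewer than k remaining neighbors.
            String pick = null;
            for (String n : remaining) {
                int deg = 0;
                for (String m : g.get(n)) if (remaining.contains(m)) deg++;
                if (deg < k) { pick = n; break; }
            }
            if (pick == null) pick = remaining.iterator().next(); // potential spill
            stack.push(pick);
            remaining.remove(pick);
        }
        // Select: pop and assign the lowest color unused by colored neighbors.
        Map<String, Integer> colors = new HashMap<>();
        while (!stack.isEmpty()) {
            String n = stack.pop();
            boolean[] used = new boolean[k];
            for (String m : g.get(n))
                if (colors.containsKey(m)) used[colors.get(m)] = true;
            int c = -1;
            for (int i = 0; i < k; i++) if (!used[i]) { c = i; break; }
            if (c == -1) spilled.add(n);   // not lucky: this variable must spill
            else colors.put(n, c);
        }
        return colors;
    }
}
```

A node pushed as a potential spill may still get colored during select; only when all K colors are taken by its already-colored neighbors does it become an actual spill.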