Memory Allocation and circular-arc graphs Presented By Mohammed Alali ST: STRUCTURED GRAPHS AND THEIR APPLICATIONS - CS 6/75995 Dr. Dragan.



Introduction Problem formulation [13]: We have a program P and a number of available registers N. Can each of the temporaries/variables of P be mapped to one of the N registers such that variables with interfering live ranges are assigned to different registers? This decision problem is NP-complete.

Register Allocation: Definition Register allocation = assigning registers to values (e.g., variables, constants, ...) [12]. Done by the compiler. Crucial low-level optimization: registers are 2x-7x faster than cache [12], so good allocation brings big performance improvements. Optimality = minimizing the number of registers [8]: a more compact design, which increases register utilization.

Register Allocation: Definition The register allocation phase of the compiler stands between [19]: the optimization phase and the final code assembly and emission phase. The Intermediate Language (IL) assumes an unlimited number of registers. The optimization phase eliminates references to storage: more data in registers. The register allocation phase maps the unlimited symbolic registers to the roughly 32 machine registers. When necessary: spill to memory and reload later. Pipeline: Optimization Phase -> Register Allocation Phase -> Code Assembly Phase -> Emission/Executable

Register Allocation: Process Description It mainly depends on the concept of variable liveness. A variable is "live" if it holds a value that may be used in the future [6]. If two variables are live simultaneously, they cannot be allocated to the same register. Control Flow Graph (CFG) analysis: a CFG represents all paths that might be traversed through a program during its execution. It is essential to compiler optimizations and static analysis. Node coalescing: the process of combining two nodes/values. [Figure [20]: a. If-then-else. b. While loop.]
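The liveness idea above can be sketched as a backward data-flow computation over a CFG. This is a minimal illustration, not the method of any cited source: the block names B0..B3, the `liveness` function, and the tiny while-loop CFG (i = 0; while i < n: i = i + 1; return i) are all assumptions made for the example.

```python
def liveness(blocks, succ):
    """Iterate live_in/live_out to a fixed point.

    blocks: block name -> (use set, def set), where `use` holds variables
            read before being written inside the block.
    succ:   block name -> list of successor block names.
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b, (use, defs) in blocks.items():
            # live_out is the union of live_in over all successors
            out = set()
            for s in succ.get(b, []):
                out |= live_in[s]
            # live_in = use + (live_out - def)
            inn = use | (out - defs)
            if out != live_out[b] or inn != live_in[b]:
                live_out[b], live_in[b] = out, inn
                changed = True
    return live_in, live_out


# Hypothetical while-loop CFG: B0 -> B1 -> {B2 -> B1, B3}
BLOCKS = {
    "B0": (set(), {"i"}),        # i = 0
    "B1": ({"i", "n"}, set()),   # if i < n
    "B2": ({"i"}, {"i"}),        # i = i + 1
    "B3": ({"i"}, set()),        # return i
}
SUCC = {"B0": ["B1"], "B1": ["B2", "B3"], "B2": ["B1"], "B3": []}

LIVE_IN, LIVE_OUT = liveness(BLOCKS, SUCC)
```

In this sketch i and n are both live around the loop, so they would interfere and need different registers.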

Register Allocation: Process Description Nodes Coalescing Example: Not always effective: May increase chromatic number (more registers). [21]

Register Allocation: Process Description Basic Example [12]: We have a program P with 6 variables:
    a = c + d
    e = a + b
    f = e - 1
Let's assume that a and e die after being used. Then the locations of a and e can be reused. Therefore, we can allocate a, e and f to one register (r1):
    r1 = r2 + r3
    r1 = r1 + r4
    r1 = r1 - 1
Number of registers used = 4 for 6 variables.
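The reuse argument in this example can be checked with a small sketch. It assumes straight-line code only, and the function name `allocate_straight_line` and the `live_out` parameter (variables that must survive the block) are hypothetical. Register numbers may differ from the slide's r1..r4; only the sharing pattern and the count are the point.

```python
def allocate_straight_line(instrs, live_out):
    """Assign registers to straight-line code, reusing a register once
    the variable holding it has had its last use.

    instrs:   list of (dest, [source variables]); constants are omitted.
    live_out: variables that stay live after the last instruction.
    """
    last_use = {}
    for i, (_, srcs) in enumerate(instrs):
        for s in srcs:
            last_use[s] = i

    reg_of, free = {}, []
    next_reg = 1

    def fresh():
        nonlocal next_reg
        if free:
            return free.pop()
        r = next_reg
        next_reg += 1
        return r

    for i, (dest, srcs) in enumerate(instrs):
        for s in srcs:                      # inputs seen for the first time
            if s not in reg_of:
                reg_of[s] = fresh()
        for s in dict.fromkeys(srcs):       # sources that die here free their register
            if last_use[s] == i and s not in live_out:
                free.append(reg_of[s])
        reg_of[dest] = fresh()              # dest may reuse a just-freed register
    return reg_of


PROGRAM = [("a", ["c", "d"]), ("e", ["a", "b"]), ("f", ["e"])]
REGS = allocate_straight_line(PROGRAM, live_out={"b", "c", "d", "f"})
```

Under these assumptions a, e and f end up in the same register, and four registers cover all six variables.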

Register Allocation: Traditional Solution Chaitin et al. (1981) basic steps [19]: 1. Compute variable liveness to get the number of names. 2. Build the interference graph. 3. Perform node coalescing when possible. 4. Attempt to find a 32-coloring of the graph. 5. If one cannot be found, modify the program (spilling) and its graph until a 32-coloring is obtained.
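The coloring and spilling steps can be sketched in the simplify/select style usually associated with Chaitin-type allocators. This is a simplified illustration under stated assumptions: coalescing is omitted, a spilled node is simply dropped from the graph instead of rewriting the program and rebuilding, and the function name `chaitin_color` and the six-node graph are hypothetical.

```python
def chaitin_color(graph, k):
    """Simplify/select coloring sketch.

    graph: node -> set of neighbours (undirected interference graph).
    Returns (coloring, spilled), where coloring maps node -> color in 0..k-1.
    """
    work = {n: set(adj) for n, adj in graph.items()}
    stack, spilled = [], []
    while work:
        # Simplify: a node with fewer than k neighbours can always be colored later.
        node = next((n for n in sorted(work) if len(work[n]) < k), None)
        if node is None:
            # No such node: pick a spill candidate (here: highest degree).
            node = max(sorted(work), key=lambda n: len(work[n]))
            spilled.append(node)
        else:
            stack.append(node)
        for m in work.pop(node):
            work[m].discard(node)

    # Select: pop nodes and give each the lowest color unused by its neighbours.
    coloring = {}
    for node in reversed(stack):
        used = {coloring[m] for m in graph[node] if m in coloring}
        coloring[node] = min(c for c in range(k) if c not in used)
    return coloring, spilled


def from_edges(edges):
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


# Hypothetical six-variable interference graph.
GRAPH = from_edges([("a", "c"), ("a", "f"), ("b", "c"), ("b", "e"), ("b", "f"),
                    ("c", "d"), ("c", "e"), ("c", "f"), ("d", "e"), ("d", "f"),
                    ("e", "f")])

COLORS4, SPILLED4 = chaitin_color(GRAPH, 4)
COLORS3, SPILLED3 = chaitin_color(GRAPH, 3)
```

With k = 4 nothing is spilled in this sketch; with k = 3 one node has to be spilled before the rest can be colored.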

Register Allocation: Traditional Solution Example [12] : Compute variable liveness

Register Allocation: Traditional Solution Example [12]: Construct the interference graph, where: Nodes = variables. Edges = pairs of variables whose lifetimes overlap. Formally [8]: Let V be the set of variables in the program. An interference graph G = (V, E) is defined, where (u, v) ∈ E indicates that variables u and v interfere, i.e., their lifetimes overlap and thus require separate storage resources. Two variables can be allocated to the same register if no edge connects them. [Figure: interference graph on nodes a, b, c, d, e, f, shown with the live sets {b,c,f}, {a,c,f}, {c,d,f}, {c,d,e,f}, {c,e}, {b,c,e,f}, {c,f}, {b}]
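The construction can be sketched directly from the live sets listed on this slide: any two variables that appear in the same live set get an edge. The function name `build_interference_graph` is an assumption for illustration, and treating "simultaneously live" as the interference criterion follows the slide's definition rather than any particular compiler.

```python
from itertools import combinations


def build_interference_graph(live_sets):
    """Return node -> set of neighbours; two variables interfere
    when they occur together in at least one live set."""
    graph = {}
    for live in live_sets:
        for v in live:
            graph.setdefault(v, set())
        for u, v in combinations(sorted(live), 2):
            graph[u].add(v)
            graph[v].add(u)
    return graph


# Live sets as listed on the slide.
LIVE_SETS = [
    {"b", "c", "f"}, {"a", "c", "f"}, {"c", "d", "f"}, {"c", "d", "e", "f"},
    {"c", "e"}, {"b", "c", "e", "f"}, {"c", "f"}, {"b"},
]

IG = build_interference_graph(LIVE_SETS)
EDGE_COUNT = sum(len(adj) for adj in IG.values()) // 2
```

For instance, a only ever appears together with c and f, so a could share a register with b, d or e.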

Register Allocation: Traditional Solution Example [12]: Coloring: Colors = registers. K-colorable: K = number of machine registers. If the IG is k-colorable, there is a register assignment that uses no more than k registers. 4 colors are needed. [Figure: interference graph on nodes a, b, c, d, e, f, shown with the live sets {b,c,f}, {a,c,f}, {c,d,f}, {c,d,e,f}, {c,e}, {b,c,e,f}, {c,f}, {b}]
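The claim that 4 colors are needed can be checked by brute force on a graph this small. The sketch below assumes the interference graph derived from the slide's live sets (edges between variables that share a live set); the function name `find_k_coloring` is hypothetical, and exhaustive search is only reasonable for tiny graphs.

```python
from itertools import product


def find_k_coloring(graph, k):
    """Try every assignment of k colors; return a proper coloring or None."""
    nodes = sorted(graph)
    for colors in product(range(k), repeat=len(nodes)):
        assign = dict(zip(nodes, colors))
        if all(assign[u] != assign[v] for u in graph for v in graph[u]):
            return assign
    return None


# Interference graph assumed from the live sets of the example.
IG = {
    "a": {"c", "f"},
    "b": {"c", "e", "f"},
    "c": {"a", "b", "d", "e", "f"},
    "d": {"c", "e", "f"},
    "e": {"b", "c", "d", "f"},
    "f": {"a", "b", "c", "d", "e"},
}

THREE = find_k_coloring(IG, 3)
FOUR = find_k_coloring(IG, 4)
```

The live set {c,d,e,f} is a clique of size 4, which is why no 3-coloring exists while a 4-coloring does.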

Register Allocation: Traditional Solution Example [12]: [Figure: interference graph on nodes a, b, c, d, e, f]

Register Allocation: Traditional Solution Complexity? [12][19][21]: NP-complete. Chaitin et al. [19] proved that the problem is NP-hard by reduction from the graph coloring problem on general graphs, because every graph is the interference graph of some program. Heuristics based on elimination and spilling can help [21]. Still a powerful technique.

Register Allocation: Traditional Solution Some other known algorithms/approaches [15]: Some use integer linear programming and may run in worst-case exponential time, such as the algorithm of Appel and George [14]. Others use polynomial-time heuristics, such as the algorithm of Briggs, Cooper, and Torczon [16], or the Linear Scan algorithm of Poletto and Sarkar [17]. And many others ...
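As one illustration of a polynomial-time heuristic, here is a minimal sketch in the spirit of linear scan: live intervals are visited by increasing start point, expired intervals return their registers, and when no register is free the interval that ends last is spilled. The function name `linear_scan`, the interval representation, and the sample intervals are assumptions; it also assumes at least one register and is not a faithful reproduction of the published algorithm.

```python
def linear_scan(intervals, num_regs):
    """intervals: name -> (start, end). Returns (register map, spilled names)."""
    free = list(range(num_regs))
    active = []                      # (end, name) pairs, kept sorted by end point
    reg, spilled = {}, []
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Expire intervals that ended before this one starts.
        while active and active[0][0] < start:
            _, old = active.pop(0)
            free.append(reg[old])
        if free:
            reg[name] = free.pop(0)
            active.append((end, name))
            active.sort()
        else:
            last_end, last = active[-1]
            if last_end > end:
                # Spill the interval that ends furthest away and take its register.
                reg[name] = reg.pop(last)
                spilled.append(last)
                active[-1] = (end, name)
                active.sort()
            else:
                spilled.append(name)
    return reg, spilled


# Hypothetical live intervals and a machine with two registers.
INTERVALS = {"a": (1, 4), "b": (2, 6), "c": (3, 10), "d": (5, 7), "e": (8, 9)}
REG, SPILLED = linear_scan(INTERVALS, 2)
```

The design trades coloring quality for speed: no interference graph is built, only a sorted list of intervals.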

Question: Can we do better? Yes. Two important properties should be satisfied: Static Single Assignment (SSA) form, and circular-arc models (chordal graphs).

Circular-arc graphs : Definition A circular-arc graph is the intersection graph of a set of arcs on the circle [1][9]. A graph G is called a circular-arc graph if [9]: there exists a family F = {A_1, ..., A_n} of arcs on a circle such that G = G(F) = (V, E), where V = {v_1, ..., v_n} and (v_i, v_j) ∈ E if and only if i ≠ j and A_i ∩ A_j ≠ ∅. G(F) is called a proper circular-arc graph if no arc is contained in any other [5][1]. Circular-arc graphs are a natural generalization of interval graphs [2]. [9]
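The definition can be made concrete with a short sketch that builds the intersection graph of a family of arcs. Arcs are assumed to be closed, given as (start, end) angles in degrees measured in one fixed direction, and never the whole circle; the function names and the four sample arcs are hypothetical. Two such arcs intersect exactly when the start point of one lies on the other.

```python
def contains(arc, p):
    """True if angle p lies on the closed arc running from arc[0] to arc[1]."""
    s, e = arc
    return (p - s) % 360 <= (e - s) % 360


def arcs_intersect(a, b):
    return contains(a, b[0]) or contains(b, a[0])


def circular_arc_graph(arcs):
    """arcs: name -> (start, end). Returns name -> set of intersecting arc names."""
    graph = {name: set() for name in arcs}
    names = sorted(arcs)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if arcs_intersect(arcs[u], arcs[v]):
                graph[u].add(v)
                graph[v].add(u)
    return graph


# Four arcs that overlap only with their neighbours around the circle;
# arc D wraps past 0 degrees.
ARCS = {"A": (0, 100), "B": (90, 190), "C": (180, 280), "D": (270, 10)}
CAG = circular_arc_graph(ARCS)
```

These four arcs yield a chordless 4-cycle A-B-C-D, a graph that has a circular-arc model but no interval model.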

Circular-arc graphs : Definition General assumptions [1] : All arcs are closed (contain both endpoints). No arc consists of the whole circle. The families of arcs are all finite. Some Applications [5] : Compiler design and optimization. Allocating bandwidth in all-optical WDM. Scheduling. [9]

Circular-arc graphs : Complexity Recognition [7][11]: Booth initially conjectured that recognition is NP-complete [3]. Tucker disproved this with an O(n^3) algorithm [1]. Hsu improved this to O(nm), where m is the number of edges [6]. Eschen and Spinrad further improved this to O(n^2) [4]. McConnell (2003) gave the first linear-time O(n+m) recognition algorithm [7]. Colorability: NP-complete (Garey et al. [2]). 3-colorability: polynomial (Garey et al. [2]). [9]

Circular-arc graphs : Challenges? Circular-arc models do NOT always produce chordal graphs. Therefore, we have to transform program P to SSA form [13]. [Figure: graph classes shown: Conflict Graphs, Circular-arc Graphs, Chordal Graphs, Interval Graphs] [9][22]

Static Single Assignment (SSA): Definition SSA is an intermediate representation used in many compilers, such as gcc 4 [13]. If a program is in SSA form, then every variable is assigned exactly once, and each use refers to exactly one definition [13]. An SSA construction algorithm is used to transform non-SSA code to SSA form, e.g., Cytron et al. [23]. Simple example [25]: [Figure: non-SSA code next to its SSA form]
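The single-assignment property can be illustrated for straight-line code, where renaming alone is enough. This sketch is not an SSA construction algorithm in the sense of Cytron et al.: it handles no branches and inserts no φ-functions. The function name `to_ssa` and the `name_version` naming scheme are assumptions for illustration.

```python
def to_ssa(instrs):
    """Rename straight-line code so every variable is assigned exactly once.

    instrs: list of (dest, [source variables]).
    Variables never assigned in the block (inputs) keep their names.
    """
    version = {}
    out = []
    for dest, srcs in instrs:
        # Each use refers to the most recent definition of its variable.
        new_srcs = [f"{s}_{version[s]}" if s in version else s for s in srcs]
        version[dest] = version.get(dest, 0) + 1
        out.append((f"{dest}_{version[dest]}", new_srcs))
    return out


# x = a + b; x = x + 1; y = x
NON_SSA = [("x", ["a", "b"]), ("x", ["x"]), ("y", ["x"])]
SSA = to_ssa(NON_SSA)
```

After renaming, the two assignments to x become x_1 and x_2, and the use in the last statement refers unambiguously to x_2.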

SSA and Chordality: Author’s Claims Bouchez and Hack (2006) proved that strict programs in SSA form have chordal interference graphs (Hack’s Ph.D. dissertation) [13][21]. Chordal graphs can be colored in linear time [13][26]. Hack used the dominance property to redefine liveness of variables. Let Dv denote the definition of variable v. Dv dominates a program point p if every path from the program entry to p contains Dv. A program P is strict if each use of a variable v is dominated by Dv.

SSA and Chordality : Steps Summary "Before a value v is added to a PEO, add all values whose definitions are dominated by v. A post-order walk of the dominance tree defines a PEO. A pre-order walk of the dominance tree yields a coloring sequence. IGs of SSA-form programs can be colored optimally in O(k · |V|), without constructing the interference graph itself." [27][21]
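Why a perfect elimination order (PEO) gives an optimal coloring can be sketched on an explicit graph. Note the swap: instead of the dominance-tree walk quoted above, this example uses the generic chordal-graph technique of maximum cardinality search (MCS), whose reverse visit order is a PEO when the graph is chordal. All function names and both sample graphs are assumptions for illustration.

```python
def mcs_order(graph):
    """Maximum cardinality search: repeatedly visit the unvisited vertex
    with the most already-visited neighbours."""
    weight = {v: 0 for v in graph}
    unvisited = set(graph)
    order = []
    while unvisited:
        v = max(sorted(unvisited), key=lambda u: weight[u])
        order.append(v)
        unvisited.remove(v)
        for w in graph[v]:
            if w in unvisited:
                weight[w] += 1
    return order


def is_peo(graph, peo):
    """Check that each vertex's later neighbours form a clique."""
    pos = {v: i for i, v in enumerate(peo)}
    for v in peo:
        later = [w for w in graph[v] if pos[w] > pos[v]]
        if any(b not in graph[a] for i, a in enumerate(later) for b in later[i + 1:]):
            return False
    return True


def greedy_color(graph, order):
    """Give each vertex the lowest color not used by colored neighbours."""
    color = {}
    for v in order:
        used = {color[w] for w in graph[v] if w in color}
        color[v] = next(c for c in range(len(graph)) if c not in used)
    return color


# A chordal six-node graph and a chordless 4-cycle (both hypothetical inputs).
CHORDAL = {
    "a": {"c", "f"}, "b": {"c", "e", "f"}, "c": {"a", "b", "d", "e", "f"},
    "d": {"c", "e", "f"}, "e": {"b", "c", "d", "f"}, "f": {"a", "b", "c", "d", "e"},
}
CYCLE4 = {"p": {"q", "s"}, "q": {"p", "r"}, "r": {"q", "s"}, "s": {"r", "p"}}

ORDER = mcs_order(CHORDAL)
COLORS = greedy_color(CHORDAL, ORDER)      # color in reverse-PEO order
```

Coloring in reverse PEO order means the already-colored neighbours of each vertex form a clique, so the greedy choice never needs more colors than the largest clique.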

SSA and Chordality : Simple Example [27]

SSA and Chordality : Simple Example [27] The circular-arc model will look something like this: [Figure: two diagrams, each labelled with a, b, c, d, e]

SSA and Chordality : Simple Example [27] How could a 4-cycle {a, c, d, e} be created? Only by redefining a, which violates the SSA property.

SSA and Chordality : Simple Example 2 [27] The φ-function breaks cycles in the IG.

Remarks: This leads to a single-pass register allocator architecture with no iterations [21]. [Figure: allocator architecture] SSA separates spilling from coalescing [27]. Both remain challenging. Implementation: [21]

References
1. A. Tucker. Coloring a family of circular arcs. SIAM J. Appl. Math., 29 (1975), pp. 493–
2. M.R. Garey, D.S. Johnson, G.L. Miller, and C.H. Papadimitriou. The complexity of coloring circular arcs and chords. SIAM J. Alg. Disc. Meth., 1(2), June
3. K.S. Booth. PQ-Tree Algorithms. Ph.D. thesis, Department of Computer Science, University of California, Berkeley, CA.
4. E.M. Eschen and J.P. Spinrad. An O(n^2) algorithm for circular-arc graph recognition. Proceedings of the Fourth Annual ACM–SIAM Symposium on Discrete Algorithms, pp. 128–137.
5. M. Valencia-Pabon. Revisiting Tucker’s algorithm to color circular arc graphs. SIAM Journal on Computing, 32(4).
6. W.L. Hsu. O(mn) algorithms for the recognition and isomorphism problems on circular-arc graphs. SIAM J. Comput., 24:411–439.
7. Ross McConnell. Linear-time recognition of circular-arc graphs. Algorithmica 37 (2): 93–147, 2003.
8. P. Brisk, F. Dabiri, R. Jafari, and M. Sarrafzadeh. Optimal register sharing for high-level synthesis of SSA form programs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 25, no. 5, pp. 772–779, May 2006.
Register Allocation via Graph Coloring Lecture. John Cavazos. University of Delaware.
13. Register Allocation Lecture. Ajay Mathew. Carnegie Mellon University.
14. Andrew W. Appel and Lal George. Optimal spilling for CISC machines with few registers. In International Conference on Programming Languages Design and Implementation, pages 243–253. ACM Press.
15. N. Fritz, P. Lucas, and P. Slusallek. CGiS, a new language for data-parallel GPU programming. In B. Girod, H.-P. Seidel, and M. Magnor, editors, Proceedings of “Vision, Modeling, and Visualization”, pages 241–248.
16. Preston Briggs, Keith D. Cooper, and Linda Torczon. Improvements to graph coloring register allocation. ACM Transactions on Programming Languages and Systems (TOPLAS), 16(3):428–455.
17. Massimiliano Poletto and Vivek Sarkar. Linear scan register allocation. ACM Transactions on Programming Languages and Systems, 21(5):895–913.
18. M. Valencia-Pabon. Revisiting Tucker's algorithm to color circular arc graphs. SIAM J. Comput., 32 (2003), pp. 1067–
19. Gregory Chaitin. Register allocation and spilling via graph coloring. SIGPLAN Notices, ACM Special Interest Group on Programming Languages, 39 (4), April 2004.
Sebastian Hack, Daniel Grund, and Gerhard Goos. Register allocation for programs in SSA-Form. Proceedings of the 15th International Conference on Compiler Construction, March 30-31, 2006, Vienna, Austria.
22. D.L. Springer and D.E. Thomas. Exploiting the special structure of conflict and compatibility graphs in high-level synthesis. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 13, no. 7, pp. 843–856, Jul 1994.
23. Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems (TOPLAS), vol. 13, no. 4, Oct
24. Maw-Shang Chang. Efficient Algorithms for the Domination Problems on Interval and Circular-Arc Graphs. SIAM Journal on Computing, vol. 27, no. 6, Dec

Thank you