CR18: Advanced Compilers L02: Dependence Analysis Tomofumi Yuki


Today's Agenda: Legality of Loop Transformations (dependences; legality of loop parallelization; legality of loop permutation); Dependence Tests (how to find dependences? conservative tests; exact methods); Polyhedral Representations.

Loop Parallelism A "simple" transformation, but not so simple to reason about: legality, performance impacts, and more complicated cases where we must transform the loops to expose parallelism. for (i=0; i<N; i++) S;  =>  forall (i=0; i<N; i++) S;
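As a concrete illustration (my own sketch; OpenMP stands in for the forall above): the first loop has no loop-carried dependence and can be parallelized, the second cannot.

    void scale(int n, double *a, const double *b) {
        /* independent iterations: each one writes its own a[i] */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            a[i] = 2.0 * b[i];
    }

    void prefix(int n, double *a) {
        /* a[i] reads a[i-1] written by the previous iteration:
           running this loop as a forall would change the result */
        for (int i = 1; i < n; i++)
            a[i] = a[i - 1] + 1.0;
    }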

Legality of Transformations First rule of compilers: preserve the original semantics. Many complications: loops, parameters, array accesses, branches, pointers, random numbers. We restrict ourselves to a regular subset of programs.

Preserving Semantics Preserving the order of operations is one "easy" way to ensure preservation; dependences define a partial order on operations. Exceptions?

Dependences Express relations between statements: flow (true) dependence, RAW (a = ...; ... = a); anti-dependence, WAR (... = a; a = ...); output dependence, WAW (a = ...; a = ...); input dependence, RAR (... = a; ... = a).
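A minimal C sketch of the four kinds (variable and statement names are mine):

    int deps(void) {
        int a, x, y;
        a = 1;       /* S1: write a                                            */
        x = a;       /* S2: flow (RAW) on a from S1                            */
        a = 2;       /* S3: anti (WAR) on a w.r.t. S2; output (WAW) w.r.t. S1  */
        y = a + x;   /* S4: flow (RAW) on a from S3; input (RAR) on a with S2  */
        return y;
    }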

Flow vs Anti Dependence Why is flow the "true" dependence? Flow is value-based; anti is memory-based and can be removed by renaming: for i { a[i] = ...; ... = a[i]; } (flow)  for i { ... = a[i]; a[i] = ...; } (anti)  for i { ... = a[i]; b[i] = ...; } (anti removed by writing to b)

Dependence Abstractions Distance Vector: distance between write and read, [i,j] + c, e.g., [0,1]. Direction Vector: direction of the instance that uses the value; each entry is one of <, ≤, >, ≥, =, *; e.g., [0,<]; less precise, but sometimes sufficient.

Direction Vector Example 1 for (i=0; i<N; i++) for (j=0; j<M; j++) A[i][j] = A[i][0] + B[i][j]; distance vectors [0,1], [0,2], [0,3]; direction vector [0,<]

Direction Vector Example 2 for (i=1; i<N; i++) for (j=0; j<M; j++) A[i][j] = A[i-1][j+1] + B[i][j]; distance vector [1,-1]; direction vector [<,>]

So what do these vectors tell us? Parallelism is clear from distance vectors (same for direction vectors). Loop-carried dependence: the loop at depth d carries a dependence if at least one distance/direction vector has its first non-zero entry at position d. [0,0,1] [0,1,0] [1,1,0]  |  [0,0,1] [0,1,1] [0,1,-1]  |  [1,0,0] [1,1,0] [1,-1,0]

Loop Carried Dependence Is any of the loops parallel? What are the distance vectors? for i for j A[j] = foo(A[j], A[j+1])
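A worked sketch of the answer (my own derivation; the slide leaves it as an exercise): the write to A[j] at (i,j) is read as A[j] at (i+1,j) and as A[j+1] at (i+1,j-1), and the read of A[j+1] at (i,j) is overwritten at (i,j+1), giving

\[
\text{flow: } (1,0),\ (1,-1); \qquad \text{anti: } (0,1); \qquad \text{output: } (1,0).
\]

The i loop carries (1,0) and (1,-1) and the j loop carries (0,1), so neither loop can be marked parallel as written.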

Legality of Loop Permutation Another application of distance vectors. Which ones can you permute? for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i-1][j-1] + B[i][j];  [1,1]  |  for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i][j-1] + B[i][j];  [0,1]  |  for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i-1][j+1] + B[i][j];  [1,-1]

Legality of Loop Permutation Another application of distance vectors. Which ones can you permute? for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i-1][j-1] + B[i][j];  [1,1]

Legality of Loop Permutation Another application of distance vectors. Which ones can you permute? for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i][j-1] + B[i][j];  [0,1]  |  for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i+1][j-1] + B[i][j];

Legality of Loop Permutation Another application of distance vectors. Which ones can you permute? for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i-1][j+1] + B[i][j];  [1,-1]  Fully permutable: [≤,...,≤]

Legality of Loop Reversal Is this transformation legal? for (i=1; i<N; i++) for (j=1; j<M; j++) A[i][j] = A[i-1][j+1] + B[i][j];  [1,-1]  =>  for (i=1; i<N; i++) for (j=M-1; j>0; j--) A[i][j] = A[i-1][j+1] + B[i][j];  [?,?]
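A worked answer (my own sketch): reversing the j loop negates the second component of every distance vector,

\[
[1,-1] \;\longrightarrow\; [1,+1] \;\succ\; \vec{0},
\]

so the vector stays lexicographically positive: the producer still executes before the consumer, and the reversal is legal.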

Today's Agenda: Legality of Loop Transformations (dependences; legality of loop parallelization; legality of loop permutation); Dependence Tests (how to find dependences? conservative tests; exact methods).

How to Find the Vectors Easy case: for (i=1; i<N; i++) for (j=0; j<M; j++) A[i][j] = A[i-1][j+1] + B[i][j];  Not too easy: for (i=1; i<N; i++) for (j=0; j<M; j++) A[i] = A[i] + B[i][j];  and  for (i=1; i<N; i++) for (j=0; j<M; j++) A[i] = A[2*i-j+3] + B[i][j];

How to Find the Vectors Really difficult: no general solution; the polynomial case is undecidable; can work for linear accesses; wide range of precision even for the linear case. for (i=1; i<N; i++) for (j=0; j<M; j++) { A[i*i+j*j-i*j] = A[i] + B[i][j]; A[i*j*j-i*j*3] = A[i] + B[i][j]; }

Dependence: Affine Case Given two accesses f(i,j) and g(x,y), the two accesses are in conflict if they touch the same location, f(i,j) = g(x,y), and one of them is a write. Let f and g be affine: a0 + a1·i + a2·j = b0 + b1·x + b2·y. The last write to a conflicting location is the producer.
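Collecting the pieces into one system (the bounds 0 ≤ i,x < N and 0 ≤ j,y < M are an assumption matching the earlier examples): a flow dependence from write instance (i,j) to read instance (x,y) exists iff the following has an integer solution,

\[
a_0 + a_1 i + a_2 j \;=\; b_0 + b_1 x + b_2 y,\qquad
0 \le i, x < N,\qquad 0 \le j, y < M,\qquad (i,j) \prec (x,y),
\]

and the producer of the value read at (x,y) is the lexicographic maximum (i,j) satisfying it.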

It is just solving a linear system; theoretically it is not that "hard". Two directions: Polyhedral: use PIP and get the exact solution. Others: less expensive solutions that work in practice.

Exact Method: Polyhedral Model Array Dataflow Analysis [Feautrier 1991]: given read and write statement instances r, w, find w as a function of r such that r and w are in conflict, w happens-before r, and w is the most recent such write. Works when everything is affine. Main engine: Parametric Integer Linear Programming.

Exact Dependence Analysis Who produced the value read at A[j]? Powerful but expensive. for (i=0; i<N; i++) for (j=0; j<M; j++) S: A[i] = A[j] + B[i][j]; Constraints: 0≤i,i'<N; 0≤j,j'<M; i'=j; (i',j') ≪ (i,j); objective: lexmax (i',j'). Result: the producer of A[j] at (i,j) is S(j,M-1) if i>j; S(i,i-1) if i=j and i>0; the initial A[j] if j>i or i=j=0.

ADA Example 1 What is the PIP problem? for (i = 0; i<=N; i++) for (j = i; j<=M; j++) A[j] = foo(A[j], A[j+1])
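One way to set it up (my sketch), taking the read of A[j+1] at instance (i,j) as the example; the read of A[j] is analogous. The write access is A[j'], so the producer is

\[
\operatorname{lexmax}\ \{\, (i',j') \ \mid\ 0 \le i' \le N,\ \ i' \le j' \le M,\ \ j' = j+1,\ \ (i',j') \prec (i,j) \,\},
\]

computed by PIP with (i, j, N, M) as parameters.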

ADA Example 2 What is the PIP problem? for (i = 0; i<=N; i++) { B[j] = foo(...); for (j = i; j<=M; j++) A[j] = bar(B[j], B[j-1]); }

Digression: Multiple Statements Within a domain, the order of execution is given by lex. order What do you do when you have multiple statements? 27

2d+1 Notation A convention to encode statement ordering, called by many different names; the original ADA paper simply said to "use the textual order". For a d-dimensional loop nest, use d+1 constant dimensions. for i for j S1; for j S2; S3; dom(S1) = {0,i,0,j,0|...} dom(S2) = {0,i,1,j,0|...} dom(S3) = {0,i,1,j,1|...}
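A small worked comparison using the domains above: an instance of S1 precedes an instance of S2 exactly when their 2d+1 vectors compare lexicographically,

\[
S1(i,j) \prec S2(i',j')
\iff (0,i,0,j,0) \prec_{\text{lex}} (0,i',1,j',0)
\iff i \le i',
\]

because when i = i' the third component (0 vs 1) decides, reproducing the textual order within one iteration of the i loop.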

ADA Example 2 What is the PIP problem? for (i = 0; i<=N; i++) { B[j] = foo(...); for (j = i; j<=M; j++) A[j] = bar(B[j], B[j-1]); }

ADA Example 3 What is the PIP problem? for (t=0; t<=T; t++) { for (i=0; i<=N; i++) A[i] = foo(B[j]); for (j=0; j<=M; j++) B[j] = foo(A[i]); }

The Omega Test Another variant of ADA, by William Pugh (1991), based on Fourier-Motzkin elimination extended to integers (Presburger arithmetic). Two slightly different branches: one in the US, the other in France; we mostly talk about the French work, but a similar evolution took place with Omega.

So what is wrong? Can’t we just use this powerful method all the time? 32

Dependence Tests Same setting (conflicting memory accesses): f(i,j) = g(x,y). Let f and g be affine: a0 + a1·i + a2·j = b0 + b1·x + b2·y, a linear Diophantine equation; an integer solution exists iff gcd(a1,a2,b1,b2) divides (b0 - a0).

GCD Test for (i=1; i<N; i++) for (j=0; j<M; j++) A[3*i] = A[6*i-3*j+2] + B[i][j];  3i = 6x-3y+2  =>  3i-6x+3y = 2; gcd(3,6,3) = 3, which does not divide 2, so no dependence. for (i=1; i<N; i++) for (j=0; j<M; j++) A[2*i] = A[4*i-2*j+2] + B[i][j];  2i = 4x-2y+2  =>  2i-4x+2y = 2; gcd(2,4,2) = 2, which divides 2, so a dependence is possible.
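A small C sketch of the test (the helper names and the driver are mine, not from the slides):

    #include <stdio.h>
    #include <stdlib.h>

    /* gcd of two non-negative integers */
    static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    /* GCD test for a0 + a1*i + a2*j == b0 + b1*x + b2*y:
       an integer solution exists iff gcd(a1,a2,b1,b2) divides (b0 - a0).
       Loop bounds are ignored, so 1 only means "dependence possible". */
    static int gcd_test(int a0, int a1, int a2, int b0, int b1, int b2) {
        int g = gcd(gcd(abs(a1), abs(a2)), gcd(abs(b1), abs(b2)));
        int c = b0 - a0;
        return g == 0 ? c == 0 : c % g == 0;
    }

    int main(void) {
        /* A[3*i] vs A[6*i-3*j+2]: gcd(3,0,6,3)=3 does not divide 2 -> prints 0 */
        printf("%d\n", gcd_test(0, 3, 0, 2, 6, -3));
        /* A[2*i] vs A[4*i-2*j+2]: gcd(2,0,4,2)=2 divides 2 -> prints 1 */
        printf("%d\n", gcd_test(0, 2, 0, 2, 4, -2));
        return 0;
    }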

GCD vs ADA ADA is clearly much more precise (exact). What can ADA say for the following? for (i=1; i<N; i++) for (j=0; j<i*i; j++) A[i] = foo(A[i])...

Why is GCD Test Inexact? When does the GCD test give a false positive? What happens when GCD=1? for (i=0; i<N; i++) for (j=N; j<M; j++) A[i] = A[j] + B[i][j]; GCD test: i = j has a trivial solution. Main problem: the space is completely unconstrained.

Exact vs Exact Array Dataflow Analysis: "exact" dependence analysis. GCD Test: inexact dependence test. Exact dependence tests: no false positives/negatives, but do not necessarily give the producer.

Banerjee Test [Banerjee 1976] Making it slightly better: there may be a dependence if min(f(i,j)-g(x,y)) ≤ 0 and 0 ≤ max(f(i,j)-g(x,y)). for (i=0; i<N; i++) for (j=N; j<M; j++) { A[i] = A[j] + B[i][j]; }  Here min(i-j) = 0-(M-1) = 1-M and max(i-j) = (N-1)-N = -1 < 0, so there is no dependence.
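A sketch of the interval computation in C (the helper name and interface are my own): minimize and maximize the affine difference over the rectangular bounds by picking, for each variable, the bound that matches the sign of its coefficient.

    /* Banerjee-style check for c0 + sum_k c[k]*x[k] = 0 with l[k] <= x[k] <= u[k].
       Returns 1 if 0 lies in [min, max] of the expression, i.e. a dependence
       cannot be ruled out. Sketch only: integrality and directions are ignored. */
    int banerjee_may_depend(int n, const int c[], int c0,
                            const int l[], const int u[]) {
        long min = c0, max = c0;
        for (int k = 0; k < n; k++) {
            if (c[k] >= 0) { min += (long)c[k] * l[k]; max += (long)c[k] * u[k]; }
            else           { min += (long)c[k] * u[k]; max += (long)c[k] * l[k]; }
        }
        return min <= 0 && 0 <= max;
    }

For the loop above the difference is i - j with 0 ≤ i ≤ N-1 and N ≤ j ≤ M-1, so c = {1,-1}, l = {0,N}, u = {N-1,M-1}; the function computes min = 1-M and max = -1 and returns 0, ruling out the dependence.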

Banerjee Test Intuition interval of 2 functions 39

Banerjee Test Exact or Inexact? Weakness? 40

What happens with 2D arrays? How to formulate, given read A[i][j] and write A[x+1][y+2]?  for (i=0; i<N; i++) A[i][j] = A[i+1][j+2];  How to formulate, given read A[i][i] and write A[x+1][x+2]?  for (i=0; i<N; i++) A[i][i] = A[i+1][i+2];

Dimension-by-Dimension Simple extension, also called subscript-by-subscript. Given A[f1(i), f2(i), ..., fn(i)] and B[g1(j), g2(j), ..., gn(j)] (i, j are iteration vectors), check the feasibility of each equation f1 = g1, f2 = g2, ..., fn = gn separately; if any single one is infeasible, the accesses are independent.

Limitations of Dim-by-Dim Is there parallelism in this loop nest? for (i=0; i<N; i++) for (j=0; j<M; j++) { A[i][j] =... A[2*j][i] =... } The subscripts are "coupled": we need to check the feasibility of f1 = g1 ∧ f2 = g2 ∧ ... ∧ fn = gn jointly.

Lambda Test [Li et al. 1989] Multi-dimensional Banerjee. Given A[f1(i), f2(i), ..., fn(i)] and B[g1(j), g2(j), ..., gn(j)], check whether the n subscript equations can hold simultaneously, by applying Banerjee-style bounds to linear combinations of them.

How to get Direction Vectors Pick a direction vector and then test it! Only test vectors relevant to the legality question; testing lexicographically negative vectors can return true, but makes no sense. What makes sense for the following? for (i=0; i<N; i++) for (j=0; j<M; j++) { A[i][j] =... A[2*j][i] =... }

Lambda Test Let's try [=,<]: for (i=0; i<N; i++) for (j=0; j<M; j++) { A[i][j] =... A[2*j][i] =... } (figure: iteration space over i,i' and j,j' with constraint lines ψ1 and ψ2)

Delta Test [Goff et al. 1991] Further extensions for multiple indices; a pragmatic approach. Key observation: real programs are not that complicated when it comes to array accesses. 1st step: classify array access pairs as ZIV (Zero Index Variable), SIV (Single Index Variable), or MIV (Multiple Index Variables).

Delta Test Classifications ZIV: e.g., A[N], A[10], ...; loop invariant. SIV: e.g., A[i], A[j], A[i+2], ...; only one loop iterator. MIV: e.g., A[i+j], A[2*i-j], A[i*j], ...; when two or more iterators are involved.
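A minimal C sketch of just the strong-SIV case (subscripts of the form a*i + c in a loop L ≤ i ≤ U); the function name and interface are mine, following my reading of Goff et al.:

    /* Strong SIV: write A[a*i + c1], read A[a*i + c2], loop index L <= i <= U.
       The dependence distance is d = (c1 - c2)/a; a dependence exists iff
       d is an integer and |d| <= U - L. Returns 1 and sets *dist if so. */
    int strong_siv_test(int a, int c1, int c2, int L, int U, int *dist) {
        int diff = c1 - c2;
        if (a == 0) return diff == 0;        /* degenerates to a ZIV test   */
        if (diff % a != 0) return 0;         /* distance is not an integer  */
        int d = diff / a;
        int ad = d < 0 ? -d : d;
        if (ad > U - L) return 0;            /* distance exceeds iteration range */
        if (dist) *dist = d;
        return 1;
    }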

Array Access Patterns What do they look like in “real-life”? 1D, 2D, 3D+ arrays coupled, separate ZIV, SIV, MIV 52

Delta Test Algorithm 1. Classify accesses. 2. Solve the easy cases: if a separable ZIV/SIV test proves independence, done. 3. Solve the harder cases, BUT reuse information from Step 2: constraint intersection/propagation.

Constraint Intersection It is sometimes easy to show that multiple constraints cannot be satisfied at the same time. If you have coupled SIV accesses, e.g., A[i,i] = A[i+1, i+2]: analyzing each dimension separately gives i' = i+1 and j' = i+2, but you also know that the valid space satisfies i' = j' (each equal to i plus a constant). Intersecting everything gives the empty set.

Constraint Propagation Like intersection, SIV gives partial information, e.g., A[i,i+j] = A[i+1, i+j]: i' = i+1 is derived from the 1st dimension; substituting it into the 2nd dimension turns A[i',i'+j'] into A[i+1, i+1+j']; reformulating the 2nd dimension gives i+1+j' = i+j, which yields j' = j-1.

Putting it All Together The Delta test aims to take advantage of various properties of how the code is written; it is a collection of many small tricks. It is probably closer to what is in actual compilers than the polyhedral model.

transition 57

Back to Array Dataflow Result Another view: PRDG, the Polyhedral Reduced Dependence Graph (reduced vs extended, recall L01). Node: statement domain. Edge: dependence (domain + function). for (i=0; i<N; i++) { for (j=0; j<P; j++) S0: A[j] = B[j]+B[j+1]; for (j=0; j<Q; j++) S1: B[j] = A[j]; } Nodes: S0 with domain 0≤i<N, 0≤j<P and S1 with domain 0≤i<N, 0≤j<Q.

Polyhedral Objects We will usually use ISL syntax. Set: [params] -> { [indices] : constraints }, e.g., [N,M]->{ [i,j] : 0<=i<N and 0<=j<M }. Relation: [params] -> { [in] -> [out] : constraints }, e.g., [N,M]->{ [i,j] -> [x,y] : x=i+1 }. A function is a special case of a relation (I often write it with →).

Additional Conventions You can name each tuple; the following are NOT equivalent: [N,M] -> { S0[i,j] : 0<=i<N and 0<=j<M } vs [N,M] -> { S1[i,j] : 0<=i<N and 0<=j<M }. Index names DO NOT matter; the following are equivalent: [N,M] -> { [i,j] : 0<=i<N and 0<=j<M } and [N,M] -> { [x,y] : 0<=x<N and 0<=y<M }. Names of parameters DO matter.

Set vs Relations They are not really different: [N]->{ [i,j] -> [x,y] : i=x and j=y } vs [N]->{ [i,j,x,y] : i=x and j=y }. The distinction is mostly for convenience when representing program information. Ex1. Dependence: S0[i,j] -> S1[i',j']. Ex2. Array access: S0[i,j] -> A[i].

Matrix Representation Polyhedral objects are often encoded as matrices: Ax + b ≥ 0, where A is the linear part (matrix), x the indices (symbolic vector), and b the constant part (constant vector). Write Ax + Pp + b ≥ 0 to explicitly separate the parameters p, and simply Ax + b for functions. Algebraic properties of A are often used.

Matrix Form Example { [i,j] : 0≤i<10 and 0≤j<i }
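The matrix itself is not in the transcript; one consistent way to write this set in the Ax + b ≥ 0 form of the previous slide is

\[
A=\begin{pmatrix}1&0\\-1&0\\0&1\\1&-1\end{pmatrix},\qquad
b=\begin{pmatrix}0\\9\\0\\-1\end{pmatrix},\qquad
A\begin{pmatrix}i\\j\end{pmatrix}+b \ \ge\ 0,
\]

where the rows encode i ≥ 0, i ≤ 9, j ≥ 0, and j ≤ i-1.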

Integer Set Library Tool for manipulating sets and relations, mostly by Sven Verdoolaege. It kind of does everything now: manipulating sets/relations, scheduling, code generation, PIP, counting integer points, ...

ISL Demo Online interface bin/dtai/barvinok.cgi 65

PRDG Example (Dataflow Edge) for (i=0; i<N; i++) { for (j=0; j<P; j++) S0: A[j] = foo(A[j]); for (j=0; j<Q; j++) S1: A[j] = bar(A[j]); } Edges: S0[i,j]→S0[i+1,j] : j≥Q; S1[i,j]→S0[i+1,j] : j<Q; S0[i,j]→S1[i,j] : j<P; S1[i,j]→S1[i+1,j] : j≥P

PRDG Example (Dep. Polyhedra) for (i=0; i<N; i++) { for (j=0; j<P; j++) S0: A[j] = foo(A[j]); for (j=0; j<Q; j++) S1: A[j] = bar(A[j]); } Node domains: S0: 0≤i<N, 0≤j<P; S1: 0≤i<N, 0≤j<Q. Dependence polyhedra: S0→S0: i'=i+1, j'=j, j≥Q; S1→S0: i'=i+1, j'=j, j<Q; S0→S1: i'=i, j'=j, j<P; S1→S1: i'=i+1, j'=j, j≥P.

Uniform vs Affine Dependence Uniform dependences: constant offset, (i,j) → (i,j) + c; can be described with distance vectors. Affine dependences: any affine function (i,j) → A·(i,j) + b; uniform when A = I. When do we need affine dependences?
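Two small instances (the transposed access is my own illustration): the dependence of A[i][j] = A[i-1][j+1] seen earlier is uniform, whereas an access like A[i][j] = A[j][i] needs a general affine function,

\[
(i,j)\ \mapsto\ I\begin{pmatrix}i\\j\end{pmatrix}+\begin{pmatrix}-1\\ 1\end{pmatrix}
\qquad\text{vs}\qquad
(i,j)\ \mapsto\ \begin{pmatrix}0&1\\1&0\end{pmatrix}\begin{pmatrix}i\\j\end{pmatrix},
\]

and the same holds for broadcasts like A[i][0], where the distance to the producer is not a constant.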

PRDG + Expressions = ??? The PRDG is an abstraction of dependences; what each statement does is lost. You may want the expressions in your analysis, typically when semantic properties are useful. Polyhedral Equational Model.

Alpha Language Equational language, or PRDG + expressions, or Systems of Affine Recurrence Equations, or dynamic single assignment code. Basic structure: declaration of the domain of each equation, plus affine equations that define the computation performed at each iteration point.

Alpha Example for (i=1; i<=N; i++) { S0: A[i,i] = foo(); for (j=i+1; j<=M; j++) S1: A[i,j] = A[i,j-1] * A[i,i]; } S0 : [N,M] -> { [i] : 1<=i<=N } S1 : [N,M] -> { [i,j] : 1<=i<=N and i<j<=M } S0[i] = foo(); S1[i,j] = case { : j=i+1} : A[i,j-1] * S0[i]; { : j>i+1} : S1[i,j-1] * S0[i]; esac;

Role of Alpha in this Course The Polyhedral Equational Model is not popular, even within the already niche polyhedral model community. I know A LOT about it because my advisor is the main person working on it. It is a good IR for looking at both dependences and expressions, and it is also suited for teaching some of these aspects. I sometimes use it in place of the PRDG, but keep in mind that it is a different view of it.

Next Time Transforming polyhedral representations Tiling 73