Optimizing Compilers for Modern Architectures Copyright, 1996 © Dale Carnegie & Associates, Inc. Dependence Testing Allen and Kennedy, Chapter 3 thru Section.

Slides:



Advertisements
Similar presentations
Copyright © 2003 Pearson Education, Inc. Slide 1.
Advertisements

The Game of Algebra or The Other Side of Arithmetic The Game of Algebra or The Other Side of Arithmetic © 2007 Herbert I. Gross by Herbert I. Gross & Richard.
Optimizing Compilers for Modern Architectures Compiler Improvement of Register Usage Chapter 8, through Section 8.4.
Chapter 2 Section 3.
College of Information Technology & Design
The simplex algorithm The simplex algorithm is the classical method for solving linear programs. Its running time is not polynomial in the worst case.
Complexity Analysis (Part II)
Optimizing Compilers for Modern Architectures Coarse-Grain Parallelism Chapter 6 of Allen and Kennedy.
1 ECE734 VLSI Arrays for Digital Signal Processing Loop Transformation.
Optimizing Compilers for Modern Architectures Allen and Kennedy, Chapter 13 Compiling Array Assignments.
Introduction to arrays
Introduction to Neural Networks Computing
CHAPTER 3 Graphs of Liner Equations Slide 2Copyright 2011, 2007, 2003, 1999 Pearson Education, Inc. 3.1Graphs and Applications of Linear Equations 3.2More.
Enforcing Sequential Consistency in SPMD Programs with Arrays Wei Chen Arvind Krishnamurthy Katherine Yelick.
Optimizing Compilers for Modern Architectures Preliminary Transformations Chapter 4 of Allen and Kennedy.
SECTION 3.6 COMPLEX ZEROS; COMPLEX ZEROS; FUNDAMENTAL THEOREM OF ALGEBRA FUNDAMENTAL THEOREM OF ALGEBRA.
Practical Dependence Test Gina Goff, Ken Kennedy, Chau-Wen Tseng PLDI ’91 presented by Chong Liang Ooi.
Stanford University CS243 Winter 2006 Wei Li 1 Data Dependences and Parallelization.
1 Lecture 9  Arrays  Declaration  Initialization  Applications  Pointers  Declaration  The & and * operators  NULL pointer  Initialization  Readings:
CMPUT Compiler Design and Optimization1 CMPUT680 - Winter 2006 Topic B: Loop Restructuring José Nelson Amaral
Dependence Testing Optimizing Compilers for Modern Architectures, Chapter 3 Allen and Kennedy Presented by Rachel Tzoref and Rotem Oshman.
Optimizing Compilers for Modern Architectures Dependence: Theory and Practice Allen and Kennedy, Chapter 2 pp
Mrs.Volynskaya Real Numbers
Optimizing Compilers for Modern Architectures Dependence: Theory and Practice Allen and Kennedy, Chapter 2.
1 1.1 © 2012 Pearson Education, Inc. Linear Equations in Linear Algebra SYSTEMS OF LINEAR EQUATIONS.
Exercise problems for students taking the Programming Parallel Computers course. Janusz Kowalik Piotr Arlukowicz Tadeusz Puzniakowski Informatics Institute.
Chapter 6.  Pg. 364 – 369  Obj: Learn how to solve systems of equations by graphing and analyze special systems.  Content Standard: A.REI.6.
Systems of Linear Equation and Matrices
Chapter Relations & Functions 1.2 Composition of Functions
Basic Concepts of Algebra
Equations of Lines Chapter 8 Sections
HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 2.3.
HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 2.5.
Array Dependence Analysis COMP 621 Special Topics By Nurudeen Lameed
Chapter 4 Section 1 Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley.
Copyright © 2013, 2009, 2005 Pearson Education, Inc. 1 5 Systems and Matrices Copyright © 2013, 2009, 2005 Pearson Education, Inc.
© The McGraw-Hill Companies, Inc., Chapter 6 Prune-and-Search Strategy.
 2006 Pearson Education, Inc. All rights reserved Searching and Sorting.
Chapter 3 Polynomial and Rational Functions Copyright © 2014, 2010, 2007 Pearson Education, Inc Zeros of Polynomial Functions.
Carnegie Mellon Lecture 14 Loop Optimization and Array Analysis I. Motivation II. Data dependence analysis Chapter , 11.6 Dror E. MaydanCS243:
Chapter 4 Section 1. Objectives 1 Copyright © 2012, 2008, 2004 Pearson Education, Inc. Solving Systems of Linear Equations by Graphing Decide whether.
Slide 7- 1 Copyright © 2012 Pearson Education, Inc.
Copyright © 2010 Pearson Education, Inc. All rights reserved Sec
Copyright © 2009 Pearson Education, Inc. CHAPTER 4: Polynomial and Rational Functions 4.1 Polynomial Functions and Models 4.2 Graphing Polynomial Functions.
Copyright © 2009 Pearson Education, Inc. CHAPTER 9: Systems of Equations and Matrices 9.1 Systems of Equations in Two Variables 9.2 Systems of Equations.
Slide Copyright © 2008 Pearson Education, Inc. Publishing as Pearson Addison-Wesley A set of equations is called a system of equations. The solution.
1 © 2010 Pearson Education, Inc. All rights reserved © 2010 Pearson Education, Inc. All rights reserved Chapter 3 Polynomial and Rational Functions.
 2008 Pearson Education, Inc. All rights reserved. 1 Arrays and Vectors.
Chapter 3 Polynomial and Rational Functions Copyright © 2014, 2010, 2007 Pearson Education, Inc Zeros of Polynomial Functions.
HAWKES LEARNING SYSTEMS Students Matter. Success Counts. Copyright © 2013 by Hawkes Learning Systems/Quant Systems, Inc. All rights reserved. Section 8.2.
1. Searching The basic characteristics of any searching algorithm is that searching should be efficient, it should have less number of computations involved.
Optimizing Compilers for Modern Architectures Enhancing Fine-Grained Parallelism Part II Chapter 5 of Allen and Kennedy.
Slide Copyright © 2009 Pearson Education, Inc. 7.2 Solving Systems of Equations by the Substitution and Addition Methods.
Tuesday, October 15, 2013 Do Now:. 3-1 Solving Systems of Equations by Graphing Objectives: 1)solve systems of linear equations by graphing 2) Determine.
LINEAR MODELS AND MATRIX ALGEBRA
Copyright © 2014, The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
Copyright © 2014, 2010, 2007 Pearson Education, Inc.
Applied Discrete Mathematics Week 2: Functions and Sequences
Copyright 2012, 2008, 2004, 2000 Pearson Education, Inc.
CS314 – Section 5 Recitation 13
Data Dependence, Parallelization, and Locality Enhancement (courtesy of Tarek Abdelrahman, University of Toronto)
Copyright © 2014, 2010, 2007 Pearson Education, Inc.
Solving Systems of Linear Equations
Copyright © 2014, 2010, 2007 Pearson Education, Inc.
Preliminary Transformations
Copyright © 2014, 2010, 2007 Pearson Education, Inc.
Algebraic Expression A number sentence without an equal sign. It will have at least two terms and one operations.
Chapter 5 Review.
Presentation transcript:

Optimizing Compilers for Modern Architectures Copyright, 1996 © Dale Carnegie & Associates, Inc. Dependence Testing Allen and Kennedy, Chapter 3 thru Section 3.3.2

Optimizing Compilers for Modern Architectures Main Theme Determining whether dependencies exist between two subscripted references to the same array in a loop nest Several tests to detect these dependencies

Optimizing Compilers for Modern Architectures Basics: Indices and Subscripts Index: Index variable for some loop surrounding a pair of references Subscript: A PAIR of subscript positions in a pair of array references For Example: A(I,j) = A(I,k) + C I,Iis the first subscript j,kis the second subscript

Optimizing Compilers for Modern Architectures Basics: Conservative Testing Consider only linear subscript expressions Finding integer solutions to system of linear Diophantine Equations is NP-Complete Most common approximation is Conservative Testing See if you can assert No dependence exists between two subscripted references of the same array Never incorrect, may be less than optimal

Optimizing Compilers for Modern Architectures The General Problem DO i 1 = L 1, U 1 DO i 2 = L 2, U 2... DO i n = L n, U n S 1 A(f 1 (i 1,...,i n ),...,f m (i 1,...,i n )) =... S 2... = A(g 1 (i 1,...,i n ),...,g m (i 1,...,i n )) ENDDO... ENDDO f i ( ) = g i ) for all i, 1 i m

Optimizing Compilers for Modern Architectures Basics: Complexity A subscript is said to be ZIV if it contains no index SIV if it contains only one index MIV if it contains more than one index For Example: A(5,I+1,j) = A(1,I,k) + C First subscript is ZIV Second subscript is SIV Third subscript is MIV

Optimizing Compilers for Modern Architectures Basics: Separability A subscript is separable if its indices do not occur in other subscripts If two different subscripts contain the same index they are coupled For Example: A(I+1,j) = A(k,j) + C Both subscripts are separable A(I,j,j) = A(I,j,k) + C Second and third subscripts are coupled

Optimizing Compilers for Modern Architectures Basics:Coupled Subscript Groups Why are they important? Coupling can cause imprecision in dependence testing DO I = 1, 100 S1A(I+1,I) = B(I) + C S2D(I) = A(I,I) * E ENDDO

Optimizing Compilers for Modern Architectures Dependence Testing: Overview Partition subscripts of a pair of array references into separable and coupled groups

Optimizing Compilers for Modern Architectures Dependence Testing: Overview Partition subscripts of a pair of array references into separable and coupled groups Classify each subscript as ZIV, SIV or MIV

Optimizing Compilers for Modern Architectures Dependence Testing: Overview Partition subscripts of a pair of array references into separable and coupled groups Classify each subscript as ZIV, SIV or MIV For each separable subscript apply single subscript test. If not done goto next step

Optimizing Compilers for Modern Architectures Dependence Testing: Overview For each coupled group apply multiple subscript test like Delta Test

Optimizing Compilers for Modern Architectures Dependence Testing: Overview For each coupled group apply multiple subscript test like Delta Test If still not done, merge all direction vectors computed in the previous steps into a single set of direction vectors

Optimizing Compilers for Modern Architectures Step 1: Subscript Partitioning Partitions the subscripts into separable and minimal coupled groups Notations // S is a set of m subscript pairs S 1, S 2,...S m each enclosed in // n loops with indexes I 1, I 2,... I n, which is to be // partitioned into separable or minimal coupled groups. // P is an output variable, containing the set of partitions // n p is the number of partitions

Optimizing Compilers for Modern Architectures Subscript Partitioning Algorithm procedure partition(S,P, n p ) n p = m; for i := 1 to m do P i {S i }; for i := 1 to n do begin k : for each remaining partition P j do if there exists s P j such that s contains I i then if k = then k j; else begin P k P k P j ; discard P j ; n p = n p – 1; end end end partition

Optimizing Compilers for Modern Architectures Step 2: Classify as ZIV/SIV/MIV Easy step Just count the number of different indices in a subscript

Optimizing Compilers for Modern Architectures Step 3: Applying Single Subscript Tests ZIV Test SIV Test Strong SIV Test Weak SIV Test –Weak-zero SIV –Weak Crossing SIV SIV Tests in Complex Iteration Spaces

Optimizing Compilers for Modern Architectures ZIV Test DO j = 1, 100 SA(e1) = A(e2) + B(j) ENDDO e1,e2 are constants or loop invariant symbols If (e1-e2)!=0 No Dependence exists

Optimizing Compilers for Modern Architectures Strong SIV Test Strong SIV subscripts are of the form For example the following are strong SIV subscripts

Optimizing Compilers for Modern Architectures Strong SIV Test Example DO k = 1, 100 DO j = 1, 100 S1A(j+1,k) =... S2... = A(j,k) + 32 ENDDO

Optimizing Compilers for Modern Architectures Strong SIV Test Dependence exists if

Optimizing Compilers for Modern Architectures Weak SIV Tests Weak SIV subscripts are of the form For example the following are weak SIV subscripts

Optimizing Compilers for Modern Architectures Weak-zero SIV Test Special case of Weak SIV where one of the coefficients of the index is zero The test consists merely of checking whether the solution is an integer and is within loop bounds

Optimizing Compilers for Modern Architectures Weak-zero SIV Test

Optimizing Compilers for Modern Architectures Weak-zero SIV & Loop Peeling DO i = 1, N S 1 Y(i, N) = Y(1, N) + Y(N, N) ENDDO Can be loop peeled to... Y(1, N) = Y(1, N) + Y(N, N) DO i = 2, N-1 S 1 Y(i, N) = Y(1, N) + Y(N, N) ENDDO Y(N, N) = Y(1, N) + Y(N, N)

Optimizing Compilers for Modern Architectures Weak-crossing SIV Test Special case of Weak SIV where the coefficients of the index are equal in magnitude but opposite in sign The test consists merely of checking whether the solution index is 1. within loop bounds and is 2. either an integer or has a non-integer part equal to 1/2

Optimizing Compilers for Modern Architectures Weak-crossing SIV Test

Optimizing Compilers for Modern Architectures Weak-crossing SIV & Loop Splitting DO i = 1, N S 1 A(i) = A(N-i+1) + C ENDDO This loop can be split into... DO i = 1,(N+1)/2 A(i) = A(N-i+1) + C ENDDO DO i = (N+1)/2 + 1, N A(i) = A(N-i+1) + C ENDDO

Optimizing Compilers for Modern Architectures Complex Iteration Spaces Till now we have applied the tests only to rectangular iteration spaces These tests can also be extended to apply to triangular or trapezoidal loops Triangular: One of the loop bounds is a function of at least one other loop index Trapezoidal: Both the loop bounds are functions of at least one other loop index

Optimizing Compilers for Modern Architectures Complex Iteration Spaces For example consider this special case of a strong SIV subscript DO I = 1,N DO J = L 0 + L 1 *I, U 0 + U 1 *I S1A(J + d) = S2= A(J) + B ENDDO

Optimizing Compilers for Modern Architectures Complex Iteration Spaces Strong SIV test gives dependence if Unless this inequality is violated for all values of I in its iteration range, we must assume a dependence in the loop

Optimizing Compilers for Modern Architectures Index Set Splitting DO I = 1,100 DO J = 1, I S1A(J+20) = A(J) + B ENDDO For values of there is no dependence

Optimizing Compilers for Modern Architectures Index Set Splitting This condition can be used to partially vectorize S1 by Index set splitting as shown DO I = 1,20 DO J = 1, I S1aA(J+20) = A(J) + B ENDDO

Optimizing Compilers for Modern Architectures Index Set Splitting DO I = 21,100 DO J = 1, Ix S1bA(J+20) = A(J) + B ENDDO Now the inner loop for the first nest can be vectorized

Optimizing Compilers for Modern Architectures Coupling makes these tests imprecise DO I = 1,100 DO J = 1, I S1 A(J+20,I) = A(J,19) + B ENDDO We will report dependence even if there isnt any But such cases are very rare

Optimizing Compilers for Modern Architectures Breaking Conditions Consider the following example DO I = 1, L S 1 A(I + N) = A(I) + B ENDDO If L<=N, then there is no dependence from S 1 to itself L<=N is called the Breaking Condition

Optimizing Compilers for Modern Architectures Using Breaking Conditions Using breaking conditions the vectorizer can generate alternative code IF (L<=N) THEN A(N+1:N+L) = A(1:L) + B ELSE DO I = 1, L S 1 A(I + N) = A(I) + B ENDDO ENDIF

Optimizing Compilers for Modern Architectures Whats next... MIV Tests Tests in Coupled groups