Software Group © 2005 IBM Corporation Compiler Technology October 17, 2005 Array privatization in IBM static compilers -- technical report CASCON 2005
Compiler technology © 2005 IBM Corporation October 17, 2005 Guansong Zhang, Erik Charlebois and Roch Archambault Compiler Development Team IBM Toronto Lab, Canada Authors
Compiler technology © 2005 IBM Corporation October 17, 2005 Overview Introduction and motivation Array data flow analysis Array data privatization Performance results Future work Possible usage
Compiler technology © 2005 IBM Corporation October 17, 2005 Expose limitations Compare SPEC2000FP and SPECOMP SPECOMP achieves good performance and scalability –Compare between explicit and auto-parallelization Expose missed opportunities 10 common benchmarks –Compare on a loop-to-loop basis
Compiler technology © 2005 IBM Corporation October 17, 2005 Improved auto-parallelization performance
Compiler technology © 2005 IBM Corporation October 17, 2005 Array privatization example
Compiler technology © 2005 IBM Corporation October 17, 2005 Basic loop parallelizer
Compiler technology © 2005 IBM Corporation October 17, 2005 Pre-parallelization Phase Induction variable identification Scalar Privatization --- only scalar ! Reduction finding Loop transformations favoring parallelism
Compiler technology © 2005 IBM Corporation October 17, 2005 The concept of data privatization Data is local to each loop iteration Do I = 1, 10 Temp = = … Temp … Enddo Purpose: eliminating loop carried dependences.
Compiler technology © 2005 IBM Corporation October 17, 2005 The concept of data privatization (cont.) Array as temp data do J = 1, 10 do I = 1, 10 Temp(I) =... end do do I = 1, = … Temp (I) … enddo
Compiler technology © 2005 IBM Corporation October 17, 2005 Array data flow and its structure Similar to data flow –MayDef: array elements that may be written. –MustDef: array elements that are definitely written. –UpExpUse: array elements that may have an upward exposed use a use not preceded by a definition along a path from the loop header –LiveOnExit: array elements that are used after the loop region. GARs: Guarded Array Regions (GARs). –A GAR is a tuple(G,D), D is a bounded Regular Section Descriptor (RSD) for the accessed array section, G is a guard that specifies the condition under which D is accessed Notes: many papers discussed the issue
Compiler technology © 2005 IBM Corporation October 17, 2005 Array privatization algorithm
Compiler technology © 2005 IBM Corporation October 17, 2005 Loop normalization and array data flow Normalized loop
Compiler technology © 2005 IBM Corporation October 17, 2005 Alias analysis and array data flow Ideal situation: no alias at all. –Other wise, you can not tell what is the precise intersection of the two array section involved Alias coming from: –Structural members, e.g. scalar replacement –Function parameters, array is a shadow (not mapped data, alias to any global array) Procedure summary may help –Alias as fall back
Compiler technology © 2005 IBM Corporation October 17, 2005 Possible parallelization results
Compiler technology © 2005 IBM Corporation October 17, 2005 Real case
Compiler technology © 2005 IBM Corporation October 17, 2005 NAS MG (-O3 –qhot –q64)
Compiler technology © 2005 IBM Corporation October 17, 2005 Summary Challenges – Compilation time Work with other optimizations – Loop unroll Graph complexity – Number of branches Array section caculation accuracy – Memory usage Managing and reusing
Compiler technology © 2005 IBM Corporation October 17, 2005 Summary (cont.) Further improvement: – Inter-procedural array data flow Procedure summary More accurate section information instead of using alias – Symbolic range analysis Expression simplifier: lot of room to be improved – Compilation efficiency Possible usage – Auto parallelization – Array contraction – Array coalescing