Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning

Milind Chabbi, John Mellor-Crummey, Keith Cooper
Rice University, Department of Computer Science

This work is funded by the Defense Advanced Research Projects Agency (DARPA) through the Air Force Research Lab (AFRL).

Compiler Optimization Phase-Ordering Problem
- The order in which compiler optimizations are applied drastically changes measured performance
  - Kulkarni et al. [CGO '06] show 38% average code size reduction
  - Zhao et al. [CGO '09] show up to 32% speedup
- Production compilers still use a fixed order
- Exascale systems multiply the cost of poor node performance
Figure credit: Zhao et al. [CGO '09]

Phase-Order Selection Is Hard
- Selecting the best phase order is non-trivial
  - Program dependent
  - Relations between optimizations are complex: one optimization can enable or disable another
- Exhaustive empirical exploration is expensive and unrealistic
  - 20 optimizations give 20! ≈ 2.4 × 10^18 possible optimization sequences
  - "Exhaustive optimization phase order space exploration." [Kulkarni et al. CGO '06]: many optimization orders lead to structurally identical function instances
- Approaches
  - Analytically modeling code and the effects of optimizations is non-trivial and still in its infancy: "A framework for exploring optimization properties." [Zhao et al. CC '09]
  - Other techniques have been tried and proven effective, e.g. genetic algorithms [Cooper et al. SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems 1999]

Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional sampling model
We show effectiveness on the sample numerical program FMIN throughout the discussion, with dynamic instruction count (DIC) as our optimization metric.

Interaction Is Significant Between Pairs
- Interaction is significant between pairs
- Capture the ordering of pairs without regard to their absolute positions
[Figure: example sequences with a before b and with b before a, labeled Good and Bad]

Pruning Using Pairwise Constraints
[Figure: example sequences containing a and b in both relative orders]
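The pruning idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: `satisfies` and `prune` are hypothetical helper names, and a constraint `(x, y)` is assumed to mean "x must appear somewhere before y", regardless of absolute position.

```python
from itertools import permutations

def satisfies(seq, constraints):
    """True if every (x, y) constraint has x somewhere before y in seq."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return all(pos[x] < pos[y] for x, y in constraints)

def prune(opts, constraints):
    """Enumerate all orderings of opts and keep those honoring the constraints."""
    return ["".join(p) for p in permutations(opts) if satisfies(p, constraints)]

# One constraint (a before b) removes half of the 3! orderings.
survivors = prune("abc", [("a", "b")])
print(survivors)
```

Each independent pairwise constraint halves the remaining space, which is why a modest number of constraints shrinks the search by orders of magnitude.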

Background And Effectiveness Of Pairwise Pruning
- Used by the testing community
- In software testing, multiple input variables taking multiple values cause a combinatorial explosion
- Pairwise (a.k.a. all-pairs) testing is based on the observation that most faults are caused by interactions of at most two factors
- Pairwise-generated test suites cover all combinations of two factors, so they are much smaller than exhaustive suites yet still very effective at finding defects
K. Burr and W. Young [STAR '98]
D. R. Wallace and D. R. Kuhn [International Journal of Reliability, Quality and Safety Engineering, 2001]

Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional sampling model

Graph Model
- Nodes represent optimizations, e.g. {a, b, c}
- Directed edges represent optimization orders
- Graph construction
  - Empirically evaluate all pairs to add edges
    - ab < ba → edge (a,b)
    - ac < ca → edge (a,c)
    - cb < bc → edge (c,b)
  - Add weights to edges based on profitability, e.g. (ab) vs. (ba) has a profit of 20%
- The graph may be cyclic or acyclic
[Figure: graph over nodes a, b, c with edge weights 20, 15, 30]
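A minimal sketch of the graph construction step follows. The function name `build_graph`, the `measure` callback, and the cost table are assumptions made for illustration; the cost values are invented, not FMIN measurements. Weights here are absolute cost differences, and lower cost is taken to be better.

```python
from itertools import combinations

def build_graph(opts, measure):
    """Compare both orders of every pair and record the preferred order.

    measure(seq) is a hypothetical callback returning a cost (lower is better),
    e.g. a dynamic instruction count for the two-pass sequence seq.
    Returns {(first, second): weight}, where weight is the cost saved by
    running `first` before `second`. Ties produce no edge.
    """
    edges = {}
    for x, y in combinations(opts, 2):
        xy, yx = measure(x + y), measure(y + x)
        if xy < yx:
            edges[(x, y)] = yx - xy
        elif yx < xy:
            edges[(y, x)] = xy - yx
    return edges

# Invented costs chosen so that ab < ba, ac < ca, and cb < bc.
costs = {"ab": 80, "ba": 100, "ac": 85, "ca": 100, "cb": 70, "bc": 100}
graph = build_graph("abc", costs.__getitem__)
print(graph)
```

Only n(n-1) two-pass sequences need to be measured to build the graph, compared with n! full-length sequences for exhaustive search.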

Phase Order Selection For Acyclic Graphs
- Topologically sort the graph nodes to get a sequence
- Such a sequence (if it exists) maintains all pair-ordering constraints
[Figure: graph over nodes a, b, c with edge weights 20, 15, 30, annotated "model found best sequence"]
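The topological sort can be sketched with Kahn's algorithm. The name `topo_order` is a hypothetical helper, and the edge list reuses the example orderings (a,b), (a,c), (c,b) from the graph-construction slide.

```python
from collections import deque

def topo_order(nodes, edges):
    """Return a sequence honoring every (before, after) edge, or None on a cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for before, after in edges:
        successors[before].append(after)
        indegree[after] += 1

    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in successors[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)

    # If some node was never released, the graph contains a cycle.
    return "".join(order) if len(order) == len(nodes) else None

print(topo_order("abc", [("a", "b"), ("a", "c"), ("c", "b")]))
```

A `None` result signals that the pairwise preferences are cyclic and some edge has to be dropped before a sequence can be produced.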

Phase Order Selection For Graphs With Cycles
- Cyclic ordering constraints:
  - ab < ba → edge (a,b)
  - bc < cb → edge (b,c)
  - ca < ac → edge (c,a)
- Select an edge to break in each cycle
  - Select edges to minimize the total weight of deleted edges (minimizes the cost of pair-ordering constraint violations)
  - E.g. break edge (c,a)
- Optimal sequence is: abc
[Figure: cyclic graph over nodes a, b, c with edge weights 20, 15, 30]
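For small graphs, the minimum-weight choice of edges to break can be found by brute force: try every ordering and keep the one whose violated edges weigh least. This is a sketch of that idea, not necessarily how the authors selected edges. The helper names are hypothetical, and the assignment of the weights 20, 30, 15 to specific edges is an assumption, since the transcript does not preserve which edge carries which weight.

```python
from itertools import permutations

def violation_cost(seq, edges):
    """Total weight of edges (x, y) whose preference 'x before y' seq violates."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return sum(w for (x, y), w in edges.items() if pos[x] > pos[y])

def best_order(nodes, edges):
    """Exhaustively pick the ordering that violates the least total edge weight."""
    candidates = ("".join(p) for p in permutations(nodes))
    return min(candidates, key=lambda s: violation_cost(s, edges))

# Assumed weights: (c,a) is the cheapest edge in the cycle a -> b -> c -> a.
cyclic = {("a", "b"): 20, ("b", "c"): 30, ("c", "a"): 15}
order = best_order("abc", cyclic)
print(order, violation_cost(order, cyclic))
```

Exhaustive search over orderings is only practical for a handful of nodes; larger graphs would need a heuristic for choosing which edges to delete.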

Graph Model On FMIN

Performance Estimation
- Want to predict the performance of any given sequence
  - Useful to ensure that a sequence optimized for one objective function does not dramatically worsen another objective
  - E.g. speed vs. code size
- Provides an analytical model for performance prediction

Graph Model For Performance Estimation
- The graph model has a built-in ability to estimate the performance of a given sequence
- To estimate the performance of an arbitrary sequence:
  - Perform a walk on the graph using the given sequence
  - Add the weights of the ordering preferences violated along the walk to the performance number of the model-found best sequence (already known)

Example Graph Model For Performance Estimation
- Let the observed performance of the model-found best sequence (abcd) be 1200 instructions
- Estimated performance of sequence dacb is: 1200 + 30 + 40 + 50 + 20 = 1340
- Edges are decorated with absolute differences, not relative percentages
[Figure: graph over nodes a, b, c, d with edge weights 120, 20, 30, 40, 60, 50; the walk d, a, c, b incurs 30, 40, 50, 20]
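The estimate can be reproduced with a short sketch. The function name `estimate` is hypothetical, and the mapping of the six weights to specific edges is an assumption: the transcript keeps the weight values but not their placement. Any placement in which the four violated preferences carry 30, 40, 50, and 20 gives the same total.

```python
def estimate(seq, edges, best_cost):
    """Best-sequence cost plus the weight of every ordering preference seq violates."""
    pos = {opt: i for i, opt in enumerate(seq)}
    penalty = sum(w for (x, y), w in edges.items() if pos[x] > pos[y])
    return best_cost + penalty

# Edge (x, y) means "x before y is better by this many instructions".
# Which weight sits on which edge is assumed for illustration.
edges = {
    ("a", "b"): 120, ("a", "c"): 60, ("a", "d"): 30,
    ("b", "c"): 20, ("b", "d"): 40, ("c", "d"): 50,
}

# dacb violates a-before-d, b-before-c, b-before-d, and c-before-d.
print(estimate("dacb", edges, best_cost=1200))
```

The best sequence abcd violates nothing, so its estimate equals its measured cost of 1200.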

Performance Estimation With Graph Model On FMIN
- 6 optimizations, i.e. 720 sequences
- Plot annotation: divergence + phase mismatch

Issues With Graph Model
- Considered only pairs of optimizations (sequences of length 2)
- Neglected the global behavior of optimizations
- Assumed the weights or behaviors of pairs to be context-insensitive (i.e. the same even in a full-length sequence)
- Want a model that is context-sensitive

Getting Context Sensitive With Regression Model
- Take the context of the pairs into account by sampling full-length sequences
- Represent sequences by regression equations
  - Represent all possible pairs as a parameter vector
  - Presence / absence of pairs in a sequence as input variables
  - Observed performance of a sequence as the measured value
- Input variables × Parameter vector = Measured value

Example Linear Regression Model
- Optimizations: {a, b, c}
- One equation per sequence; the columns form the parameter vector

Equation | Sequence | X_ab  X_ba  X_ac  X_ca  X_bc  X_cb | Measured value
X_abc    | abc      |  1     0     1     0     1     0   | 1045
X_bac    | bac      |  0     1     1     0     1     0   | 1050
...      | ...      |  ...                               | ...
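The input-variable row for a sequence can be generated mechanically. The sketch below only covers the encoding step; `pair_columns` and `encode` are hypothetical names, and the column order ab, ba, ac, ca, bc, cb follows the slide. Solving for the parameter vector would be a separate least-squares step that is not shown here.

```python
def pair_columns(opts):
    """Ordered-pair columns in the slide's order: ab, ba, ac, ca, bc, cb."""
    cols = []
    for i, x in enumerate(opts):
        for y in opts[i + 1:]:
            cols += [x + y, y + x]
    return cols

def encode(seq, cols):
    """1 where the column's first optimization precedes its second in seq, else 0."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return [1 if pos[c[0]] < pos[c[1]] else 0 for c in cols]

cols = pair_columns("abc")
for seq in ("abc", "bac"):
    print(seq, encode(seq, cols))
```

Because exactly one of each column pair (for example X_ab and X_ba) is set in every row, the columns are linearly dependent, so a solver has to tolerate a rank-deficient system.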

Analytical Model For A Sequence cba

Regression Model On FMIN
- Sequences of length 6
- 6! = 720 total sequences
- No phase mismatch, less divergence

Analysis of Regression Equation: Optimization Grouping Effect
- Sequences of length 6
- 6! = 720 total sequences
- Plot annotations: gn, ln, mn; lg, lm

Refined Regression Model
- 100% sampling to solve the regression equation
- Superior projections, perfect correlation

Regression Model With Reduced Sampling Rate
- 12% sampling

Properties of Pairs Across Phase Shifts
- Pair frequencies on the two sides of the phase shift:
  - (m,n): 0% vs. 66.6%
  - (l,n): 0% vs. 66.6%
  - (g,n): 0% vs. 66.6%

Properties of Pairs Across Phase Shifts
- Plot annotation: mn, ln, gn shift
- Pair frequencies on the two sides of the phase shift:
  - (l,g): 0% vs. 75%
  - (l,m): 0% vs. 75%

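Properties of Pairs Across Phase Shifts
- Plot annotations: mn, ln, gn shift; lm, lg shift
- Pair frequencies on the two sides of the phase shift:
  - (c,d): 0% vs. 100%
  - Remaining annotations (pair labels not preserved): 0% vs. 100%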

Conditional Sampling Model
- Sample k << n! full-length sequences that satisfy a set of pairwise ordering constraints C
  - Initially C = {}
  - We sampled 100 sequences in our implementation
- Identify the largest phase shift
- Obtain the pattern on either side of the largest phase shift
  - E.g. pairs present with 100% or 0% frequency on one side
- Add pairwise constraints favoring better performance to C
- Repeat sampling and refining C until we reach a performance plateau
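The loop above can be sketched as follows. This is one possible reading of the slide, not the authors' code. Assumptions: the "largest phase shift" is taken to be the biggest jump between consecutive costs once samples are sorted, a pair becomes a constraint only if it appears in 100% of the better side and not in 100% of the worse side, and the plateau test is "no jump left or no new constraint". All function names and the cost function `toy_cost` are invented for illustration.

```python
import random

def sample(opts, constraints, k, rng):
    """Rejection-sample k random orderings that satisfy every (x, y) constraint."""
    out = []
    while len(out) < k:
        seq = list(opts)
        rng.shuffle(seq)
        pos = {o: i for i, o in enumerate(seq)}
        if all(pos[x] < pos[y] for x, y in constraints):
            out.append(tuple(seq))
    return out

def pair_freq(seqs, x, y):
    """Fraction of sequences in which x precedes y."""
    return sum(s.index(x) < s.index(y) for s in seqs) / len(seqs)

def refine(opts, measure, k=100, rounds=5, seed=0):
    """Grow the constraint set C until sampled performance stops improving."""
    rng = random.Random(seed)
    constraints = set()
    for _ in range(rounds):
        seqs = sorted(sample(opts, constraints, k, rng), key=measure)
        costs = [measure(s) for s in seqs]
        gaps = [costs[i + 1] - costs[i] for i in range(k - 1)]
        cut = max(range(k - 1), key=gaps.__getitem__) + 1
        if gaps[cut - 1] == 0:
            break  # no phase shift left: performance plateau
        good, bad = seqs[:cut], seqs[cut:]
        new = {(x, y) for x in opts for y in opts if x != y
               and pair_freq(good, x, y) == 1.0 and pair_freq(bad, x, y) < 1.0}
        if not new - constraints:
            break
        constraints |= new
    return constraints

def toy_cost(seq):
    """Invented stand-in for a measured dynamic instruction count."""
    cost = 1000
    if seq.index("o") > seq.index("d"):
        cost += 300  # assumed penalty when o does not precede d
    if seq.index("v") > seq.index("d"):
        cost += 100  # assumed penalty when v does not precede d
    return cost

found = refine("abdov", toy_cost)
print(sorted(found))
```

Rejection sampling is the simplest way to honor C, but its acceptance rate drops as constraints accumulate, so a larger optimization set would likely need a constructive sampler instead.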

Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Pair frequencies on the two sides of the largest phase shift: (o,d) = 100% vs. (o,d) = 17%
- Conditions: od

Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Pair frequencies on the two sides of the largest phase shift: (v,d) = 100% vs. (v,d) = 60%
- Conditions: od; vd

Conditional Sampling On FMIN
- Pair frequencies on the two sides of the largest phase shift:
  - One side: an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov = 100%
  - Other side: an = 39%, cn = 39%, dn = 43%, gn = 37%, ln = 37%, ol = 79%, mn = 40%, on = 71%, qn = 37%, oq = 79%, ov = 100%, tn = 37%, oa = 80%, bn = 46%, vn = 13%, zn = 61%
- Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov

Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Pair frequencies on the two sides of the largest phase shift: (c,d) = 100% vs. (c,d) = 0%; (c,v) = 100% vs. (c,v) = 0%
- Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov; cd, cv

Conditional Sampling On FMIN
- Required 500 samples, i.e. 8 × 10^-6 % sampling
- Conditions: od; vd; an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov; cd, cv
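The sampling percentage follows from the size of the search space, assuming each of the 13 optimizations is applied exactly once so that there are 13! orderings. The helper name `sampling_percent` is hypothetical.

```python
import math

def sampling_percent(samples, n_opts):
    """Percentage of the n_opts! orderings covered by the given number of samples."""
    return 100.0 * samples / math.factorial(n_opts)

# 13! = 6,227,020,800 orderings; 500 samples cover roughly 8 x 10^-6 percent.
print(math.factorial(13), f"{sampling_percent(500, 13):.2e} %")
```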

Summary
- The order of application of compiler optimizations has a dramatic effect on performance
- "Pairwise pruning" reduces the empirical search space by several orders of magnitude, yet remains effective
- Three models of pairwise pruning
  - Context-insensitive graph model
  - Context-sensitive regression model
  - Context-sensitive conditional sampling model
- Initial results are encouraging
- The technique can be used to augment other search-space pruning techniques

Backup slides

Challenges And Opportunities
- Not a silver-bullet strategy
- Sometimes patterns may not be as distinct as 0% or 100%; we may have to choose a pattern based on the higher percentage on one side, e.g. 90% on the left vs. 30% on the right
- In our experiments we always took 100 samples; this number can be tuned with various techniques
  - Vuduc et al. [International Journal of High Performance Computing Applications, 2004] suggest a statistical early-stopping criterion that indicates when sampling can be stopped

Graph Model On FMIN
- Six optimizations: {c, d, g, l, m, n}
- Model-found optimal sequence: cndgml
- The model-found sequence had a dynamic instruction count of 1221, the best among all 720 possible sequences
