Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning Milind Chabbi John Mellor-Crummey Keith Cooper RICE UNIVERSITY DEPARTMENT OF.


1 Efficiently Exploring Compiler Optimization Sequences With Pairwise Pruning
Milind Chabbi, John Mellor-Crummey, Keith Cooper
Rice University, Department of Computer Science
This work is funded by the Defense Advanced Research Projects Agency (DARPA) through the Air Force Research Lab (AFRL).

2 Compiler Optimization Phase-Ordering Problem
- The order in which compiler optimizations are applied drastically changes measured performance
  - Kulkarni et al. [CGO '06] show 38% average code size reduction
  - Zhao et al. [CGO '09] show up to 32% speedup
- Production compilers still use a fixed order
- Exascale systems multiply the cost of poor node performance
[Figure credit: Zhao et al. [CGO '09]]

3 Phase-Order Selection Is Hard
- Selecting the best phase order is non-trivial
  - Program dependent
  - Relations between optimizations are complex: one optimization can enable or disable another
- Exhaustive empirical exploration is expensive and unrealistic
  - 20 optimizations ⇒ 20! ≈ 2.4 × 10^18 possible optimization sequences
  - "Exhaustive optimization phase order space exploration." [Kulkarni et al. CGO '06]: many optimization orders lead to structurally identical function instances
- Approaches
  - Analytically modeling code and the effects of optimizations is non-trivial and still in its infancy: "A framework for exploring optimization properties." [Zhao et al. CC '09]
  - Other techniques have been tried and proven effective: genetic algorithms [Cooper et al. SIGPLAN Workshop on Languages, Compilers, and Tools for Embedded Systems 1999]
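The size of the search space can be checked with a few lines of Python. This is a minimal sketch that assumes every optimization is applied exactly once per sequence, so the count is n!; the helper name `num_orderings` is made up for illustration.

```python
import math

def num_orderings(n: int) -> int:
    """Number of sequences that apply each of n optimizations exactly once."""
    return math.factorial(n)

print(num_orderings(6))    # 720
print(num_orderings(20))   # 2432902008176640000, on the order of 10^18
```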

4 Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional Sampling model
- We show effectiveness on the sample numerical program FMIN throughout the discussion, with dynamic instruction count (DIC) as our optimization metric

5 Interaction Is Significant Between Pairs
- Interaction is significant between pairs
- Capture the ordering of pairs without regard to their absolute positions
[Figure: sequences containing a and b in different relative orders, labeled Good and Bad]

6 Pruning Using Pairwise Constraints
[Figure: orderings of a and b pruned using pairwise constraints]

7 Background And Effectiveness Of Pairwise Pruning
- Used by the testing community
- In software testing, multiple input variables taking multiple values cause combinatorial explosion
- Pairwise (a.k.a. all-pairs) testing is based on the observation that most faults are caused by interactions of at most two factors
- Pairwise-generated test suites cover all combinations of two factors, so they are much smaller than exhaustive suites yet still very effective at finding defects
References: K. Burr and W. Young [STAR '98]; D. R. Wallace and D. R. Kuhn [International Journal of Reliability, Quality and Safety Engineering, 2001]

8 Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional Sampling model

9 Graph Model
- Nodes represent optimizations, e.g. {a, b, c}
- Directed edges represent optimization orders
- Graph construction
  - Empirically evaluate all pairs to add edges
    - ab < ba ⇒ edge (a,b)
    - ac < ca ⇒ edge (a,c)
    - cb < bc ⇒ edge (c,b)
  - Add weights to edges based on profitability, e.g. (ab) vs. (ba) has a profit of 20%
- The graph may be cyclic or acyclic
[Figure: graph on nodes a, b, c with edge weights 20, 15, 30]
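The graph construction can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: `measure` is a hypothetical callback that returns the cost (for example, dynamic instruction count) of compiling with a given two-optimization sequence, the toy costs are invented, and edge weights are stored as absolute cost differences rather than percentages.

```python
from itertools import combinations

def build_graph(opts, measure):
    """Return {(u, v): weight}, where edge (u, v) means 'u before v is cheaper'."""
    edges = {}
    for a, b in combinations(opts, 2):
        ab, ba = measure(a + b), measure(b + a)
        if ab < ba:
            edges[(a, b)] = ba - ab      # profit of running a before b
        elif ba < ab:
            edges[(b, a)] = ab - ba      # profit of running b before a
        # equal costs: no ordering preference, so no edge
    return edges

# Invented pairwise costs, chosen only so that ab < ba, ac < ca, cb < bc.
toy_costs = {"ab": 80, "ba": 100, "ac": 85, "ca": 100, "bc": 100, "cb": 70}
edges = build_graph("abc", toy_costs.__getitem__)
print(edges)   # {('a', 'b'): 20, ('a', 'c'): 15, ('c', 'b'): 30}
```

Each unordered pair costs two measurements, so n optimizations need n(n-1) pairwise runs instead of n! full-sequence runs.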

10 Phase Order Selection For Acyclic Graphs
- Topologically sort the graph nodes to get a sequence
- Such a sequence (if it exists) maintains all pair-ordering constraints
- Model found best sequence
[Figure: graph on nodes a, b, c with edge weights 20, 15, 30]
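For an acyclic preference graph, the topological sort can be sketched with Python's standard `graphlib` module (Python 3.9+). The edge set below is the one from the graph-model example (a before b, a before c, c before b); the function name `phase_order` is an illustrative choice.

```python
from graphlib import TopologicalSorter, CycleError

def phase_order(edges):
    """Topologically sort optimizations; edge (u, v) means u must precede v."""
    ts = TopologicalSorter()
    for u, v in edges:
        ts.add(v, u)                 # u is a predecessor of v
    return list(ts.static_order())   # raises CycleError if the graph is cyclic

order = phase_order([("a", "b"), ("a", "c"), ("c", "b")])
print(order)   # ['a', 'c', 'b']
```

If the preferences contain a cycle, `static_order()` raises `CycleError`, which is the case the next slide addresses.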

11 Phase Order Selection For Graphs With Cycles
- Cyclic ordering constraints:
  - ab < ba ⇒ edge (a,b)
  - bc < cb ⇒ edge (b,c)
  - ca < ac ⇒ edge (c,a)
- Select an edge to break in each cycle
  - Select edges so as to minimize the total weight of deleted edges (this minimizes the cost of pair-ordering constraint violations)
  - E.g. break edge (c,a)
  - The optimal sequence is then abc
[Figure: cyclic graph on nodes a, b, c with edge weights 20, 15, 30]
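The slide does not say how the edges to break are chosen algorithmically. One simple way to realize "minimize the total weight of deleted edges" on a small graph is to try every ordering and keep the one whose violated edges weigh the least; this brute-force sketch is only practical for a handful of optimizations. The assignment of the weights 20, 30, 15 to particular edges is an assumption, picked so that (c,a) is the cheapest edge to break.

```python
from itertools import permutations

def violated_weight(seq, edges):
    """Total weight of edges (u, v) whose preference 'u before v' seq violates."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return sum(w for (u, v), w in edges.items() if pos[u] > pos[v])

def best_order(opts, edges):
    """Ordering that violates the least total edge weight (brute force)."""
    best = min(permutations(opts), key=lambda s: violated_weight(s, edges))
    return "".join(best)

# Hypothetical weights on the cyclic constraints a->b, b->c, c->a.
cyclic = {("a", "b"): 20, ("b", "c"): 30, ("c", "a"): 15}
print(best_order("abc", cyclic))        # abc
print(violated_weight("abc", cyclic))   # 15, the weight of the broken edge (c,a)
```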

12 Graph Model On FMIN

13 Performance Estimation
- We want to predict the performance of an arbitrary sequence
- Useful to ensure that a sequence optimized for one objective function does not dramatically worsen another objective
  - E.g. speed vs. code size
- Provides an analytical model for performance prediction

14 Graph Model For Performance Estimation
- The graph model has a built-in ability to estimate the performance of a given sequence
- To estimate the performance of an arbitrary sequence:
  - Perform a walk on the graph using the given sequence
  - Add the weights of the ordering preferences violated along the walk to the performance number of the best sequence found by the model (already known)

15 Example Graph Model For Performance Estimation
- Let the observed performance of the best sequence found by the model (abcd) be 1200 instructions
- Estimated performance of sequence dacb is: 1200 + 30 + 40 + 50 + 20 = 1340 (the added terms are the weights of the ordering preferences that dacb violates)
- Edges are decorated with absolute differences, not relative percentages
[Figure: graph on nodes a, b, c, d with edge weights 120, 20, 30, 40, 60, 50]
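The estimate for dacb can be reproduced with a short sketch. The figure's mapping of weights to edges did not survive in this transcript, so the assignment below is an assumption: it uses the six weights shown on the slide and places 30, 40, 50 and 20 on the four preferences that dacb violates. The function name `estimate` is illustrative.

```python
def estimate(seq, edges, best_perf):
    """Best-sequence performance plus the weight of every violated preference."""
    pos = {opt: i for i, opt in enumerate(seq)}
    penalty = sum(w for (u, v), w in edges.items() if pos[u] > pos[v])
    return best_perf + penalty

# Assumed edge weights (absolute instruction-count differences);
# every edge agrees with the model's best sequence abcd.
edges = {
    ("a", "b"): 120, ("a", "c"): 60, ("a", "d"): 30,
    ("b", "c"): 20,  ("b", "d"): 50, ("c", "d"): 40,
}

print(estimate("abcd", edges, 1200))   # 1200: nothing violated
print(estimate("dacb", edges, 1200))   # 1340 = 1200 + 30 + 20 + 50 + 40
```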

16 Performance Estimation With Graph Model On FMIN
- 6 optimizations, i.e. 6! = 720 sequences
- Divergence and phase mismatch

17 Issues With Graph Model
- Considered only sequences of length 2 (pairs of optimizations)
- Neglected global behavior of optimizations
- Assumed weights or behaviors of pairs to be context-insensitive (i.e. the same even in a full-length sequence)
- Want a model that is context-sensitive

18 Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional Sampling model

19 Getting Context Sensitive With Regression Model
- Take into account the context of the pairs by sampling full-length sequences
- Represent sequences by regression equations
  - Represent all possible pairs as a parameter vector
  - Presence or absence of pairs in a sequence as input variables
  - Observed performance of a sequence as the measured value
- Input variables × Parameter vector = Measured value

20 Example Linear Regression Model
- Optimizations: {a, b, c}
- Parameter vector: (X_ab, X_ba, X_ac, X_ca, X_bc, X_cb)
- One equation per sampled sequence; the input variables mark which ordered pairs the sequence contains:

  Sequence   X_ab  X_ba  X_ac  X_ca  X_bc  X_cb   Measured value
  abc         1     0     1     0     1     0     1045
  bac         0     1     1     0     1     0     1050
  ...        ...   ...   ...   ...   ...   ...    ...
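The input-variable rows of the regression example can be generated mechanically. The sketch below encodes a sequence as 0/1 indicators over ordered pairs, in the column order ab, ba, ac, ca, bc, cb, and evaluates a fitted model as a dot product. The function names and the parameter values in `params` are made up for illustration; fitting the parameters from many sampled sequences would need a least-squares solver, which is not shown.

```python
from itertools import combinations

def pair_names(opts):
    """Ordered pairs in the column order ab, ba, ac, ca, bc, cb, ..."""
    names = []
    for u, v in combinations(opts, 2):
        names += [u + v, v + u]
    return names

def pair_features(seq, opts):
    """1 if the pair's first optimization precedes its second in seq, else 0."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return [1 if pos[p[0]] < pos[p[1]] else 0 for p in pair_names(opts)]

def predict(seq, opts, params):
    """Input variables times parameter vector."""
    return sum(x * w for x, w in zip(pair_features(seq, opts), params))

print(pair_names("abc"))             # ['ab', 'ba', 'ac', 'ca', 'bc', 'cb']
print(pair_features("abc", "abc"))   # [1, 0, 1, 0, 1, 0]
print(pair_features("bac", "abc"))   # [0, 1, 1, 0, 1, 0]

# Hypothetical parameter values, chosen only to reproduce the two measured rows.
params = [345, 350, 350, 360, 350, 365]
print(predict("abc", "abc", params))   # 1045
print(predict("bac", "abc", params))   # 1050
```

Because exactly one of each pair's two indicators is set in any sequence, the columns are linearly dependent, so the parameter vector is not unique; a minimum-norm or regularized least-squares fit is one way to handle that.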

21 Analytical Model For A Sequence cba

22 Regression Model On FMIN
- Sequence of length 6 ⇒ 6! = 720 total sequences
- No phase mismatch, less divergence

23 Analysis Of Regression Equation: Optimization Grouping Effect
- Sequence of length 6 ⇒ 6! = 720 total sequences
[Figure labels: gn, ln, mn; lg, lm]

24 Refined Regression Model
- 100% sampling to solve the regression equations
- Superior projections, perfect correlation

25 Regression Model With Reduced Sampling Rate
- 12% sampling

26 Roadmap
- Phase order selection using pairwise constraints between optimizations
  - Graph model
  - Regression model
  - Conditional Sampling model

27 Properties Of Pairs Across Phase Shifts
[Figure annotations: (m,n) = 0% vs. (m,n) = 66.6%; (l,n) = 0% vs. (l,n) = 66.6%; (g,n) = 0% vs. (g,n) = 66.6%]

28 Properties Of Pairs Across Phase Shifts
[Figure annotations: mn, ln, gn shift; (l,g) = 0% vs. (l,g) = 75%; (l,m) = 0% vs. (l,m) = 75%]

29 Properties Of Pairs Across Phase Shifts
[Figure annotations: mn, ln, gn shift; lm, lg shift; (c,d) = 0% vs. (c,d) = 100%]

30 Conditional Sampling Model
- Sample k << n! full-length sequences that satisfy a set of pairwise ordering constraints C
  - Initially C = {}
  - We sampled 100 sequences in our implementation
- Identify the largest phase shift
- Obtain the pattern on either side of the largest phase shift
  - E.g. pairs present with 100% or 0% frequency on one side
- Add the pairwise constraints favoring better performance to C
- Repeat sampling and refining C until we reach a performance plateau
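The loop above can be sketched in Python. This is an illustration under several assumptions, not the authors' code: `measure` is a hypothetical cost callback (lower is better), the "largest phase shift" is taken to be the largest jump between consecutive costs once the samples are sorted, only strict 100% vs. 0% patterns are accepted, the plateau test is approximated by "no new constraint was found", and sequences are drawn by plain rejection sampling, which becomes slow as C grows.

```python
import random
from itertools import permutations

def satisfies(seq, constraints):
    """True if every (u, v) in constraints has u before v in seq."""
    pos = {opt: i for i, opt in enumerate(seq)}
    return all(pos[u] < pos[v] for u, v in constraints)

def sample(opts, constraints, k, rng):
    """Draw k random full-length sequences that satisfy the constraints."""
    out = []
    while len(out) < k:
        seq = list(opts)
        rng.shuffle(seq)
        if satisfies(seq, constraints):
            out.append("".join(seq))
    return out

def refine(samples, measure, opts):
    """Pairs ordered u-before-v in 100% of the samples on the better side of
    the largest phase shift and in 0% of the samples on the worse side."""
    ranked = sorted(samples, key=measure)
    costs = [measure(s) for s in ranked]
    cut = max(range(1, len(costs)), key=lambda i: costs[i] - costs[i - 1])
    good, bad = ranked[:cut], ranked[cut:]
    return {
        (u, v)
        for u in opts for v in opts
        if u != v
        and all(satisfies(s, [(u, v)]) for s in good)
        and not any(satisfies(s, [(u, v)]) for s in bad)
    }

def conditional_sampling(opts, measure, k=100, max_rounds=10, seed=0):
    """Repeat sampling and refining C until no new constraint appears."""
    rng = random.Random(seed)
    constraints = set()
    for _ in range(max_rounds):
        samples = sample(opts, constraints, k, rng)
        found = refine(samples, measure, opts) - constraints
        if not found:
            break
        constraints |= found
    return constraints

def toy_cost(seq):
    """Invented cost: running d before c costs 200 extra instructions."""
    return 1000 + (200 if seq.index("d") < seq.index("c") else 0) + seq.index("a")

all_seqs = ["".join(p) for p in permutations("abcd")]
print(refine(all_seqs, toy_cost, "abcd"))   # {('c', 'd')}
```

On the toy cost, one refinement round over all 24 orderings recovers the single constraint "c before d", mirroring the (c,d) = 100% vs. 0% pattern discussed on the surrounding slides.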

31 Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Conditions: od
[Figure annotations: (o,d) = 100% vs. (o,d) = 17%]

32 Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Conditions: od, vd
[Figure annotations: (v,d) = 100% vs. (v,d) = 60%]

33 Conditional Sampling On FMIN
- Conditions: od, vd, an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov
[Figure annotations: an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov = 100% vs. an = 39%, oa = 80%, bn = 46%, cn = 39%, dn = 43%, gn = 37%, ln = 37%, ol = 79%, mn = 40%, on = 71%, qn = 37%, tn = 37%, vn = 13%, zn = 61%, oq = 79%, ov = 100%]

34 Conditional Sampling On FMIN
- 13 optimizations: {a, b, c, d, g, l, m, n, o, q, t, v, z}
- Conditions: od, vd, an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov, cd, cv
[Figure annotations: (c,d) = 100% vs. (c,d) = 0%; (c,v) = 100% vs. (c,v) = 0%]

35 Conditional Sampling On FMIN
- Conditions: od, vd, an, oa, bn, cn, dn, gn, ln, ol, mn, on, qn, tn, vn, zn, oq, ov, cd, cv
- Required 500 samples, i.e. 8 × 10^-6 % sampling
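The sampling percentage follows from the 13! possible orderings of the 13 optimizations, as this short check shows (variable names are illustrative):

```python
import math

total = math.factorial(13)              # 6227020800 possible sequences
fraction_percent = 500 / total * 100    # share of the space that was sampled

print(total)                       # 6227020800
print(f"{fraction_percent:.1e}")   # 8.0e-06
```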

36 Summary
- The order of application of compiler optimizations has a dramatic effect on performance
- "Pairwise pruning" reduces the empirical search space by several orders of magnitude, yet remains effective
- Three models of pairwise pruning
  - Context-insensitive graph model
  - Context-sensitive regression model
  - Context-sensitive Conditional Sampling model
- Initial results are encouraging
- The technique can be used to augment other search space pruning techniques

37 Backup slides

38 Challenges And Opportunities
- Not a silver bullet strategy
- Sometimes patterns may not be as distinct as 0% or 100%; we may have to choose a pattern based on a higher percentage on one side, e.g. 90% on the left vs. 30% on the right
- In our experiments we always took 100 samples; this can be tuned with various techniques
  - Vuduc et al. [International Journal of High Performance Computing Applications, 2004] suggest a statistical early stopping criterion that indicates when sampling can be stopped

39 Graph Model On FMIN
- Six optimizations: {c, d, g, l, m, n}
- Model found optimal sequence: cndgml
- The sequence found by the model had a dynamic instruction count of 1221, which was the best among all 720 possible sequences

