Interprocedural Symbolic Range Propagation for Optimizing Compilers
Hansang Bae and Rudolf Eigenmann Purdue University
Outline
- Motivation
- Symbolic Range Propagation
- Interprocedural Symbolic Range Propagation
- Experiments
- Conclusions

4/8/2019 Interprocedural Symbolic Range Propagation
Motivation
- Symbolic analysis is key to static analysis: X=Y+1 is better than X=?
- Range analysis has been successful: X=[0,10] is better than X=?
- Relevant questions:
  - How much can we achieve interprocedurally?
  - Can interprocedural range propagation outperform other alternatives?
Symbolic Range Propagation (Background)
- Has been effective for compiler analyses (a form of abstract interpretation)
- Provides lower/upper bounds for variables
- Sources of information:
  - Variable definitions
  - IF conditionals
  - Loop variables
- Intersect with each new source of information; union at merge points
SRP – Example

Example code:

  X = 1
  IF (X.LE.N) THEN
    X = 2*X
  ELSE
    X = X+2
  ENDIF
  ...

Ranges:
- X=[-INF,INF] before any definition
- X=[1,1] after X = 1
- X=[2,2] after X = 2*X (THEN branch)
- X=[3,3] after X = X+2 (ELSE branch)
- X=[2,3] after ENDIF (union of both branches)
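The example above can be sketched with numeric intervals. This is a minimal illustration, not the Polaris implementation: the real analysis keeps symbolic lower/upper-bound expressions, and the `intersect`/`union` helper names here are assumptions.

```python
INF = float("inf")

def intersect(a, b):
    """Narrow a range with a new source of information (e.g. an IF condition)."""
    return (max(a[0], b[0]), min(a[1], b[1]))

def union(a, b):
    """Widen at a control-flow merge: the value may arrive from either path."""
    return (min(a[0], b[0]), max(a[1], b[1]))

x = (-INF, INF)                # before any definition: X = [-INF, INF]
x = (1, 1)                     # X = 1    -> X = [1, 1]
x_then = (2 * x[0], 2 * x[1])  # X = 2*X  -> X = [2, 2]
x_else = (x[0] + 2, x[1] + 2)  # X = X+2  -> X = [3, 3]
x = union(x_then, x_else)      # after ENDIF, the two paths merge -> X = [2, 3]
print(x)  # (2, 3)
```

Intersection would apply on entering the THEN branch under the condition X.LE.N; with a symbolic N the bounds would stay expressions rather than numbers.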
Interprocedural Symbolic Range Propagation (ISRP)
- Propagates ranges across procedure calls
- Collects ISRs at important program points:
  - Entry to a subroutine
  - After a call site
- Uses SRP as the source of information
- Iterative algorithm
- Context sensitivity by procedure cloning
ISRP – Terminology
- Symbolic Range: mapping from a variable to its value range, V = [LB, UB], where LB and UB are expressions
- Interprocedural Symbolic Range (ISR): symbolic range valid at the relevant program points - subroutine entries (forward) and call sites (backward)
- Jump Function: set of symbolic ranges expressed in terms of input variables to a called subroutine (actual parameters, global variables)
- Return Jump Function: set of symbolic ranges expressed in terms of return variables to a calling subroutine (formal parameters, global variables)

(Diagram: the caller sends a jump function, i.e. a forward ISR, to the callee; the callee sends a return jump function, i.e. a backward ISR, back to the caller.)
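One plausible data shape for these terms, as an illustrative sketch (the function names and representation are assumptions, not the paper's implementation): a symbolic range as a mapping from a variable to (LB, UB) expression strings, and a jump function as that mapping restricted to the callee's input variables.

```python
def make_jump_function(ranges, input_vars):
    """Keep only ranges over the callee's inputs (actual parameters, globals)."""
    return {v: r for v, r in ranges.items() if v in input_vars}

def make_return_jump_function(ranges, return_vars):
    """Keep only ranges over variables returned to the caller."""
    return {v: r for v, r in ranges.items() if v in return_vars}

# Caller's symbolic ranges just before CALL A(X, Y); TMP is a local that
# does not flow into A and is therefore discarded.
caller_ranges = {"X": ("1", "1"), "Y": ("2", "2"), "TMP": ("0", "N")}
jump = make_jump_function(caller_ranges, {"X", "Y"})
print(jump)  # {'X': ('1', '1'), 'Y': ('2', '2')}
```

The same restriction applied on the way back (formal parameters and globals visible to the caller) yields the return jump function.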
ISRP – Algorithm

  Propagate_Interprocedural_Ranges() {
    Initialize_Call_Graph()
    while (there is any change in ISR) {
      foreach Subroutine (bottom-up) {
        Get_Backward_Interprocedural_Ranges()
        Compute_Jump_Functions()
        Compute_Return_Jump_Functions()
      }
      Get_Forward_Interprocedural_Ranges()
    }
  }
ISRP – Algorithm (steps)
- Get_Backward_Interprocedural_Ranges(): transforms return jump functions into ISRs; does nothing for leaf nodes of the call graph
- Compute_Jump_Functions(): computes intraprocedural ranges; discards variables that are not inputs to the callee
- Compute_Return_Jump_Functions(): discards variables that are not returned to the caller
- Get_Forward_Interprocedural_Ranges(): transforms jump functions into ISRs; clones procedures if necessary; keeps track of any changes
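The fixed-point loop above can be sketched as follows. This is a schematic under stated assumptions: the call graph is given as a bottom-up routine order, and each routine's backward/jump/return-jump passes are abstracted into a single transfer function; none of these names come from Polaris.

```python
def propagate_interprocedural_ranges(routines, transfer):
    """Iterate over the routines (bottom-up) until no ISR changes."""
    isr = {r: {} for r in routines}   # interprocedural symbolic ranges
    changed = True
    while changed:                    # "while there is any change in ISR"
        changed = False
        for r in routines:            # routines listed in bottom-up order
            new = transfer(r, isr)    # stand-in for the per-routine passes
            if new != isr[r]:
                isr[r] = new
                changed = True
    return isr

# Toy demo: MAIN defines X=[1,1]; A and B see it only through calls, so the
# range needs extra iterations to reach the leaf routine B.
order = ["B", "A", "MAIN"]
def transfer(r, isr):
    if r == "MAIN":
        return {"X": (1, 1)}
    if r == "A":
        return dict(isr["MAIN"])      # A's entry sees MAIN's ranges
    return dict(isr["A"])             # B's entry sees A's ranges

result = propagate_interprocedural_ranges(order, transfer)
print(result["B"])  # {'X': (1, 1)}
```

Because information flows top-down through calls while the traversal is bottom-up, the outer while loop is what carries ranges down the call chain.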
ISRP – Example (1st iteration)

For each subroutine, bottom-up: Get_Backward_ISRs, Compute_Jump_Functions, Compute_Return_Jump_Functions; then Get_Forward_ISRs over the call graph.

Example code (call sites marked α and β):

  PROGRAM MAIN
  INTEGER X, Y
  X = 1
  Y = 2
  α: CALL A(X, Y)
  END

  SUBROUTINE A(T, U)
  INTEGER N, T, U
  DO N = 10, 40
  β: CALL B(T, U, N)
  ENDDO

  SUBROUTINE B(V, W, M)
  INTEGER M, V, W
  V = W + M

Intraprocedural ranges: X=[1]; X=[1],Y=[2] in MAIN; N=[10,40]; N=[10,40],T=[U+N] in A; V=[W+M] in B.

After the 1st iteration:
- J at α: X=[1], Y=[2]; ISR at entry of A: T=[1], U=[2]
- J at β: N=[10,40]; ISR at entry of B: M=[10,40]
- RJ of B: V=[W+M]; backward ISR after β: T=[U+N]
ISRP – Example (2nd iteration)

After the 2nd iteration (same example code and passes):
- Intraprocedural ranges in MAIN: X=[1]; X=[1],Y=[2]; Y=[2]
- J at α: X=[1], Y=[2]; backward ISR after α: Y=[2]; ISR at entry of A: T=[1], U=[2]
- Intraprocedural ranges in A: T=[1],U=[2]; T=[1],U=[2],N=[10,40]; N=[10,40],U=[2]; U=[2]
- J at β: U=[2], N=[10,40]; backward ISR after β: N=[10,40], T=[U+N]; RJ of A: U=[2]
- ISR at entry of B: M=[10,40], W=[2]; intraprocedural ranges in B: M=[10,40]; M=[10,40],V=[W+M]; RJ of B: M=[10,40], V=[W+M]
ISRP – Example (3rd iteration)

After the 3rd iteration the ranges reach a fixed point:
- J at α: X=[1], Y=[2]; backward ISR after α: Y=[2]; ISR at entry of A: T=[1], U=[2]
- J at β: U=[2], N=[10,40]; backward ISR after β: N=[10,40], U=[2], T=[U+N]; RJ of A: U=[2]
- ISR at entry of B: M=[10,40], W=[2]; RJ of B: M=[10,40], W=[2], V=[W+M]
- Intraprocedural ranges in B: M=[10,40],W=[2]; M=[10,40],W=[2],V=[W+M]
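The three-iteration convergence above can be reproduced with a small fixed-point sketch. The per-routine effects are hand-coded from the example (the real analysis derives them from the Fortran source), and range bounds are kept as strings to mimic symbolic expressions; all names here are illustrative.

```python
def step(isr):
    """One bottom-up pass recomputing entry ISRs from the current state."""
    new = {}
    # MAIN: X = 1; Y = 2; CALL A(X, Y)  -> A's entry sees T=[1], U=[2]
    new["A"] = {"T": ("1", "1"), "U": ("2", "2")}
    # A: DO N = 10, 40; CALL B(T, U, N) -> B's entry sees M=[10,40], plus
    # W's range once U's range is known at A's entry.
    new["B"] = {"M": ("10", "40")}
    u = isr["A"].get("U")
    if u is not None:
        new["B"]["W"] = u
    return new

isr = {"A": {}, "B": {}}
iterations = 0
while True:
    iterations += 1
    nxt = step(isr)
    if nxt == isr:   # fixed point reached: last pass changed nothing
        break
    isr = nxt
print(isr["B"], iterations)  # {'M': ('10', '40'), 'W': ('2', '2')} 3
```

The third pass is needed only to confirm that nothing changed, matching the slide sequence: B's entry range W=[2] appears in the second pass, once U=[2] has reached A's entry.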
Experiments
- Efficacy of ISRP for an optimizing compiler (Polaris):
  - Test elision and dead-code elimination
  - Data dependence analysis
  - Other optimizations for parallelization
- 21 Fortran codes from SPEC CFP and Perfect
- Baseline (Base): best available optimizations in Polaris:
  - Interprocedural expression propagation
  - Forward substitution
  - Intraprocedural symbolic range propagation
  - Automatic partial inlining
Result – Test Elision

(Table: number of elided tests per code, Base vs. ISRP, for ARC2D, BDNA, DYFESM, FLO52Q, MDG, MIGRATION, OCEAN, QCD2, SPEC77, TRACK, TRFD, applu, apsi, fpppp, hydro2d, mgrid, su2cor, swim, tomcatv, turb3d, wupwise, and TOTAL; the transcript gives the values 4 15 18 2 67 3 5 26 1 6 72 10 8 12 9 7 19 176 27 242 with the column alignment lost.)

- ISRP found more or the same number of cases for 20 of the 21 codes
- Base made an aggressive decision with hard-wired test elision for fpppp
Result – Data Dependence Analysis

Remaining data dependences (Base / ISRP):

  ARC2D       801 /  459
  BDNA       1124 / 1081
  DYFESM     1388 / 1108
  FLO52Q      263 /  279
  MDG        1120 /  908
  MIGRATION  7180 / 6219
  OCEAN       433 /  370
  QCD2      17257 / 3579
  SPEC77     2392 / 1652
  TRACK      2003 / 1781
  TRFD         93 /   40

(For applu, apsi, fpppp, hydro2d, mgrid, su2cor, swim, tomcatv, turb3d, wupwise, and TOTAL the transcript gives 677 11395 22064 110 83 10915 45 371 2013 81727 / 8222 20006 66 19 8712 342 1071 56636 with the column alignment lost.)

- ISRP disproved more data dependences for 20 of the 21 codes
- Base benefits from forward substitution for FLO52Q
- Data dependence analysis also benefits from other improved optimizations
Result – Other Optimizations

The slide shows Polaris-generated OpenMP Fortran for a benchmark loop nest (triangular loops over xrspq, xrsiq, xrsij), with alternative code versions selected by symbolic range tests over num. Outer structure of the listing (the full generated code is garbled in this transcript):

  IF (((-num)+(-num**2))/2.LE.0 .AND. (-num).LE.0) THEN
    ALLOCATE (xrsiq00(1:morb, 1:num, 1:numthreads))
  !$OMP PARALLEL
  !$OMP+IF(6+((-1)*num+(-1)*num**2)/2.LE.0)
  !$OMP+DEFAULT(SHARED)
  !$OMP+PRIVATE(MY_CPU_ID,MRS,MRSIJ1,MI0,MJ0,VAL,MQ,MP,XIJ00,MI,MJ)
  !$OMP DO
    DO mrs = 1, (num*(1+num))/2
      ...
    ENDDO
  !$OMP END DO NOWAIT
  !$OMP END PARALLEL
    DEALLOCATE (xrsiq00)
  ELSE
    ...
  ENDIF

Optimizations involved:
- Induction variable substitution
- Reduction translation

Answering "yes" to the motivation questions helps the compiler generate better code; ISRP helped the compiler make better decisions for 5 codes.
Conclusions
- Interprocedural analysis of symbolic ranges:
  - Based on intraprocedural analysis
  - Iterative algorithm
- ISRP enhances other optimizations
- Compilation time increases by up to 150% (exceptions: OCEAN and TRACK)
Thank you.