Comparison of Array Operation Synthesis and Straightforward Compilation

Code by straightforward compilation:

    FORALL (I=1:N:1; J=1:N:1)
      IF (1<=I<=N-1) and (1<=J<=N) THEN
        T1(I,J)=A(I+1,J)
      ELSE
        T1(I,J)=0
    ENDFORALL
    FORALL (I=1:N:1; J=1:N:1)
      T2(I,J)=T1(J,I)
    ENDFORALL
    FORALL (I=1:N:1; J=1:N:1)
      IF (1<=I<=N-1) and (1<=J<=N) THEN
        B(I,J)=T2(I+1,J)
      ELSE
        B(I,J)=T2(I-N+1,J)
    ENDFORALL

Code with Array Operation Synthesis:

    FORALL (I=1:N-1:1; J=1:N-1:1)
      B(I,J)=A(J+1,I+1)
    ENDFORALL
    FORALL (I=1:N-1:1; J=N:N:1)
      B(I,J)=0
    ENDFORALL
    FORALL (I=N:N:1; J=1:N-1:1)
      B(I,J)=A(J+1,I-N+1)
    ENDFORALL
    FORALL (I=N:N:1; J=N:N:1)
      B(I,J)=0
    ENDFORALL
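The equivalence of the two codes can be checked mechanically. The following is a minimal Python sketch, not part of the original slides: it models the three-pass code (with the wrap-around index read as I-N+1) and the four-region synthesized code (with the zero column read as J=N) on nested lists. The function names `straightforward` and `synthesized` are illustrative assumptions.

```python
def straightforward(A):
    """Three-pass version with temporaries T1 and T2 (1-based indices emulated)."""
    n = len(A)
    at = lambda X, i, j: X[i - 1][j - 1]  # 1-based element access
    idx = range(1, n + 1)
    # T1(I,J) = A(I+1,J) for I <= N-1, else 0 (end-off shift along dimension 1)
    T1 = [[at(A, i + 1, j) if i <= n - 1 else 0 for j in idx] for i in idx]
    # T2(I,J) = T1(J,I) (transpose)
    T2 = [[at(T1, j, i) for j in idx] for i in idx]
    # B(I,J) = T2(I+1,J) for I <= N-1, else T2(I-N+1,J) (circular shift)
    return [[at(T2, i + 1, j) if i <= n - 1 else at(T2, i - n + 1, j)
             for j in idx] for i in idx]


def synthesized(A):
    """Single-pass version: B is written once, directly from A, in four regions."""
    n = len(A)
    B = [[None] * n for _ in range(n)]
    for i in range(1, n):                # I = 1:N-1
        for j in range(1, n):            # J = 1:N-1
            B[i - 1][j - 1] = A[j][i]    # A(J+1,I+1) in 1-based terms
        B[i - 1][n - 1] = 0              # J = N
    for j in range(1, n):                # I = N, J = 1:N-1
        B[n - 1][j - 1] = A[j][0]        # A(J+1,I-N+1) with I = N
    B[n - 1][n - 1] = 0                  # I = N, J = N
    return B


if __name__ == "__main__":
    A = [[10 * i + j for j in range(1, 5)] for i in range(1, 5)]
    print(straightforward(A) == synthesized(A))
```

The synthesized version never materializes T1 or T2, which is where the saving in loads and stores comes from.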
Synthesis Anomaly

SPREAD: one-to-many data movement function

    REAL A(N),B(N),C(N),T(N,M),D(N,M)
    A=SIN(SQRT(B+0.5)+COS(C))              ! Statement S1
    T=SPREAD(A,DIM=2,NCOPIES=M)+D          ! Statement S2

Synthesis of array expressions S1 and S2 separately:

    FORALL (I=1:N:1)
      A(I)=SIN(SQRT(B(I)+0.5)+COS(C(I)))
    END FORALL
    FORALL (I=1:N:1,J=1:M:1)
      T(I,J)=A(I)+D(I,J)
    END FORALL

    Cost: N evaluations each of SIN, SQRT, COS; 2*N+N*M additions; N+N*M assignments

Synthesis of array expressions S1 and S2 collectively:

    FORALL (I=1:N:1,J=1:M:1)
      T(I,J)=SIN(SQRT(B(I)+0.5)+COS(C(I)))+D(I,J)
    END FORALL

    Cost: N*M evaluations each of SIN, SQRT, COS; 3*N*M additions; N*M assignments
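The operation counts on this slide can be reproduced with a small instrumented model. This is a Python sketch under stated assumptions, not the authors' tool: `separate` and `collective` are hypothetical names, and the counter key `sin_sqrt_cos` records how many times each of the three functions is evaluated.

```python
import math


def separate(B, C, D):
    """S1 and S2 synthesized separately: A is computed once per row, then spread."""
    n, m = len(D), len(D[0])
    ops = {"sin_sqrt_cos": 0, "add": 0, "assign": 0}
    A = [0.0] * n
    for i in range(n):
        A[i] = math.sin(math.sqrt(B[i] + 0.5) + math.cos(C[i]))
        ops["sin_sqrt_cos"] += 1   # one SIN, one SQRT, one COS
        ops["add"] += 2            # B(I)+0.5 and SQRT(...)+COS(...)
        ops["assign"] += 1
    T = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            T[i][j] = A[i] + D[i][j]
            ops["add"] += 1
            ops["assign"] += 1
    return T, ops


def collective(B, C, D):
    """S1 substituted into S2: the transcendental work is repeated for every J."""
    n, m = len(D), len(D[0])
    ops = {"sin_sqrt_cos": 0, "add": 0, "assign": 0}
    T = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            T[i][j] = math.sin(math.sqrt(B[i] + 0.5) + math.cos(C[i])) + D[i][j]
            ops["sin_sqrt_cos"] += 1
            ops["add"] += 3
            ops["assign"] += 1
    return T, ops


if __name__ == "__main__":
    B, C = [0.5, 1.5, 2.5], [0.0, 1.0, 2.0]
    D = [[float(i + j) for j in range(4)] for i in range(3)]
    print(separate(B, C, D)[1])
    print(collective(B, C, D)[1])
```

Both versions produce the same T, but the collective form trades N+N*M assignments for N*M at the price of M times as many SIN, SQRT, and COS evaluations, which is the anomaly.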
Synthesis Anomaly (Cont'd)

    We propose a polynomial-time Synthesis Anomaly Prevention algorithm
    Loop interchange for more synthesis
Analysis of Array Operation Synthesis

    We prove that Array Operation Synthesis:
      reduces the number of stores
      reduces the number of loads
      does not increase the required computation
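As an illustration of the load and store claims (not the proof itself), the shift-transpose-shift example from the comparison slide can be counted directly. This Python sketch makes simplifying assumptions: it counts array-element reads and writes only, ignores caches and registers, and the function names are hypothetical.

```python
def count_straightforward(n):
    """Element loads/stores for the three-pass code with temporaries T1 and T2."""
    loads = stores = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i <= n - 1:
                loads += 1      # T1(I,J) = A(I+1,J); the ELSE branch stores a constant
            stores += 1         # write T1(I,J)
            loads += 1          # T2(I,J) = T1(J,I)
            stores += 1         # write T2(I,J)
            loads += 1          # B(I,J) reads one element of T2
            stores += 1         # write B(I,J)
    return loads, stores


def count_synthesized(n):
    """Element loads/stores for the single-pass synthesized code."""
    loads = stores = 0
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if j <= n - 1:
                loads += 1      # B(I,J) reads one element of A; column J = N is zero
            stores += 1         # write B(I,J)
    return loads, stores


if __name__ == "__main__":
    for n in (256, 512):
        print(n, count_straightforward(n), count_synthesized(n))
```

Under these assumptions the three-pass code performs 3*N*N - N loads and 3*N*N stores, while the synthesized code performs N*N - N loads and N*N stores.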
Advanced Techniques

    Optimization for Segmentation Descriptors with Coupled Index Functions
    Synthesis of Array Reduction and Location intrinsic operations
    Synthesis of WHERE CONSTRUCT
Contributions

    The first scheme that can synthesize the following Fortran 90 intrinsic array operations:
      Array Section Movement, SPREAD, TRANSPOSE, EOSHIFT, CSHIFT, MERGE
      WHERE CONSTRUCT, Array Reduction Functions (ALL, COUNT, MAXVAL)
      Array Location Functions (MAXLOC, MINLOC)
SYNTOOL

    An implementation of array operation synthesis as a web-based tool
      Kernel implemented in C
      A Web page + CGI program
    Performs source-to-source Array Operation Synthesis and returns a Fortran 90 or HPF program
    Available on WWW at
SYNTOOL Test Beds

    Sequent S27 with 10 identical processors
    SGI Power Challenge with 10 identical processors
    Seven Fortran 90 test suites are used; the last four program fragments are from real application codes

    [Figure: Synthesis on Shared-Memory Systems. Four CPUs, each with its own cache, connected to main memory by a shared bus.]
Experimental Results on Sequent (N=256)

    [Figure panels: Code Fragment 1 (CSHIFT, TRANSPOSE, ADDITION, RESHAPE); Code Fragment 2 (WHERE construct)]

Experimental Results on Sequent (N=256)

    [Figure panels: Code Fragment 3 (EOSHIFT, MERGE, RESHAPE, ADDITION); Code Fragment 4 (Purdue-set Problem 9)]

Experimental Results on Sequent (N=256)

    [Figure panels: Code Fragment 5 (APULE routine, electromagnetic scattering problem); Code Fragment 6 (Sandia Wave)]

Experimental Results on Sequent (N=256)

    [Figure panel: Code Fragment 7 (Linear Equation Solve)]
Experimental Results on SGI Power Challenge (N=512)

    [Figure panels: Code Fragment 4 (Purdue-set Problem 9); Code Fragment 5 (APULE routine, electromagnetic scattering problem)]

Experimental Results on SGI Power Challenge (N=512)

    [Figure panels: Code Fragment 6 (Sandia Wave); Code Fragment 7 (Linear Equation Solve)]