Loop Optimization
- “Programs spend 90% of time in loops”
- Loop optimizations are well studied
- “Simple” optimizations
- “Loop mangling”
Loop Invariant Code Motion
- Identify code that computes the same value during each iteration
- Move loop-invariant code above the loop
- “Standard” optimization in most compilers
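A minimal source-level sketch of the idea, assuming a made-up scaling routine (the function names and the expression x * y + 1.0 are illustrative, not from the slides). The second function shows roughly what a compiler's hoisting would produce from the first.

```c
#include <stddef.h>

/* Before: x * y + 1.0 is recomputed on every iteration, although
   neither x nor y changes inside the loop. */
void scale_naive(double *out, const double *in, size_t n, double x, double y)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * (x * y + 1.0);
}

/* After: the invariant expression is computed once, above the loop. */
void scale_hoisted(double *out, const double *in, size_t n, double x, double y)
{
    const double factor = x * y + 1.0;  /* loop invariant, hoisted */
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * factor;
}
```

Both functions write the same values to out; only the number of multiplies and adds executed per iteration differs.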
Loop Invariant Example

    for (i = 0; i < N; i++)
        for (j = 0; j < N; j++) {
            c[i][j] = 0;
            for (k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
        }
Example (cont.)
“Assembler” for innermost (k) loop

    L1: t1  = i * N
        t2  = t1 + j
        t3  = t2 * 4
        t4  = &c + t3
        t12 = t1 + k
        t13 = t12 * 4
        t14 = &a + t13
        t21 = k * N
        t22 = t21 + j
        t23 = t22 * 4
        t24 = &b + t23
        t31 = *t14 * *t24
        *t4 = *t4 + t31
        k   = k + 1
        if (k < N) goto L1
Example (cont.)
After loop invariant code motion: t1 through t4 do not depend on k, so they move above L1

        t1  = i * N
        t2  = t1 + j
        t3  = t2 * 4
        t4  = &c + t3
    L1: t12 = t1 + k
        t13 = t12 * 4
        t14 = &a + t13
        t21 = k * N
        t22 = t21 + j
        t23 = t22 * 4
        t24 = &b + t23
        t31 = *t14 * *t24
        *t4 = *t4 + t31
        k   = k + 1
        if (k < N) goto L1
Induction Variables
- Change by a constant amount per iteration
- Often used in array address computation
- Simplification of induction variables
- Strength reduction: convert * to +
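A small sketch of strength reduction on an induction variable, assuming a row-major matrix stored in a flat array (the function names and the column-sum task are illustrative, not from the slides). The index k * cols + j changes by the constant cols on each iteration, so the multiply can be replaced by a running add.

```c
#include <stddef.h>

/* Before: the index expression costs one multiply per iteration. */
long sum_column_mul(const long *m, size_t rows, size_t cols, size_t j)
{
    long sum = 0;
    for (size_t k = 0; k < rows; k++)
        sum += m[k * cols + j];      /* k * cols recomputed every time */
    return sum;
}

/* After: idx is an induction variable that tracks k * cols + j and is
   advanced by the constant stride cols. */
long sum_column_add(const long *m, size_t rows, size_t cols, size_t j)
{
    long sum = 0;
    size_t idx = j;                  /* value of k * cols + j at k = 0 */
    for (size_t k = 0; k < rows; k++) {
        sum += m[idx];
        idx += cols;                 /* multiply replaced by an add */
    }
    return sum;
}
```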
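Example (cont.)
After induction variable simplification and strength reduction: t14 and t24 start at the k = 0 addresses of a[i][k] and b[k][j], t33 marks the end of row i of a, and k is no longer needed

        t1  = i * N
        t2  = t1 + j
        t3  = t2 * 4
        t4  = &c + t3
        t5  = t1 * 4
        t14 = &a + t5
        t6  = j * 4
        t24 = &b + t6
        t32 = N * 4
        t33 = t14 + t32
    L1: t31 = *t14 * *t24
        *t4 = *t4 + t31
        t14 = t14 + 4
        t24 = t24 + t32
        if (t14 < t33) goto L1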
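For comparison, a source-level sketch of the same reduced inner loop, assuming n x n row-major int matrices passed as flat arrays (the function name and signature are hypothetical). It uses element offsets instead of raw byte addresses, so that stepping past the end of b stays well defined in C; ia and ib play the roles of t14 and t24, and end plays the role of t33.

```c
#include <stddef.h>

/* Dot product of row i of a with column j of b (both n x n, row-major).
   No multiply remains in the loop's address arithmetic. */
int dot_row_col(const int *a, const int *b, size_t n, size_t i, size_t j)
{
    size_t ia  = i * n;       /* offset of a[i][0], the k = 0 start   */
    size_t ib  = j;           /* offset of b[0][j], the k = 0 start   */
    size_t end = ia + n;      /* one past the offset of a[i][n-1]     */
    int acc = 0;

    while (ia < end) {
        acc += a[ia] * b[ib];
        ia += 1;              /* next element along the row of a      */
        ib += n;              /* next element down the column of b    */
    }
    return acc;
}
```

Calling dot_row_col for every (i, j) pair and storing the result in c[i][j] reproduces the original triple loop.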
Loop Transformations
- More sophisticated
- Relatively few compilers include them
- Loop interchange – for nested loops
- Unroll and jam – for nested loops
- Loop fusion
- Loop distribution
- Loop unrolling
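Two of the listed transformations, sketched by hand in C under assumed names (none of these functions appear in the slides): loop interchange on a row-major matrix sum, and loop unrolling by a factor of 4 with a cleanup loop. Fusion, distribution, and unroll and jam are not shown.

```c
#include <stddef.h>

/* Loop interchange, before: the inner loop walks down a column, so
   consecutive accesses are cols elements apart in row-major storage. */
long sum_col_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t j = 0; j < cols; j++)
        for (size_t i = 0; i < rows; i++)
            sum += m[i * cols + j];
    return sum;
}

/* Loop interchange, after: the two loops are swapped, so the inner
   loop walks along a row with stride 1. */
long sum_row_order(const int *m, size_t rows, size_t cols)
{
    long sum = 0;
    for (size_t i = 0; i < rows; i++)
        for (size_t j = 0; j < cols; j++)
            sum += m[i * cols + j];
    return sum;
}

/* Loop unrolling by 4: four elements per trip, so the compare and
   branch run about a quarter as often. */
long sum_unrolled(const int *v, size_t n)
{
    long sum = 0;
    size_t i = 0;
    for (; i + 3 < n; i += 4)
        sum += (long)v[i] + v[i + 1] + v[i + 2] + v[i + 3];
    for (; i < n; i++)            /* cleanup for the last n % 4 items */
        sum += v[i];
    return sum;
}
```

Interchange is legal here because the additions can be done in either order without changing the integer result; in general the compiler has to check data dependences before swapping loops.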