Advanced Computer Architecture CSE 8383 Ranya Alawadhi
Compilers for Instruction-Level Parallelism
Schlansker, M., Conte, T. M., Dehnert, J., Ebcioglu, K., Fang, J. Z., and Thompson, C. L. Compilers for Instruction-Level Parallelism. Computer 30, 12 (Dec. 1997), 63-69.
Agenda
- Instruction Level Parallelism (ILP)
- ILP Compiler Roles
- Areas of interest to ILP compilers
- Conclusion
- Questions?
Instruction Level Parallelism (ILP)
ILP allows a sequence of instructions derived from a sequential program to be parallelized for execution on multiple, pipelined functional units.
Advantages:
- Improves performance
- The programmer is not required to rewrite existing applications
- Works with current software programs
Implementation can be hardware-centric or software-centric (a sketch of ILP-friendly code follows).
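A minimal C sketch (my example, not from the paper) of code an ILP compiler can exploit: the three multiplies are mutually independent, so they can be issued to separate functional units in the same cycle.

```c
/* Illustrative only: the three products have no data dependences on
 * one another, so an ILP compiler can schedule them on separate
 * functional units in parallel, while the final adds form the
 * critical path and must wait for the products. */
double dot3(const double *a, const double *b)
{
    double p0 = a[0] * b[0];   /* independent multiply */
    double p1 = a[1] * b[1];   /* independent multiply */
    double p2 = a[2] * b[2];   /* independent multiply */
    return (p0 + p1) + p2;     /* dependent adds: the critical path */
}
```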
ILP Compiler Roles
- Enhance performance
- Eliminate the complex processing needed to parallelize code
- Accelerate the nonlooping codes prevalent in most applications
Optimization Criteria
Operation count vs. processor model:
- Traditional compilers optimize by minimizing the number of operations executed.
- ILP compilers must instead evaluate code against a model of the target processor, since a shorter schedule sometimes requires executing more operations (see the sketch below).
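A hedged C sketch (illustrative names, not the paper's example) of why operation count alone is a poor criterion: the if-converted version executes more operations, yet it removes a hard-to-predict branch and exposes straight-line code that a wide processor can schedule freely. Only a processor model tells the compiler when the trade is worthwhile.

```c
/* Branchy version: fewer operations per call, but the branch
 * serializes the schedule when it is hard to predict. */
int clamp_branchy(int x, int hi)
{
    if (x > hi)
        return hi;
    return x;
}

/* If-converted version: both candidate results feed an arithmetic
 * select, so more operations execute but no branch remains. */
int clamp_selected(int x, int hi)
{
    int take_hi = (x > hi);                   /* predicate: 0 or 1 */
    return take_hi * hi + (1 - take_hi) * x;  /* branch-free select */
}
```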
Statistical Compilation
Statistical information is used to:
- Predict the outcome of conditional branches
- Improve program optimization and scheduling
- Improve the performance of frequently taken paths
Examples of statistical information (one way to convey it is sketched below):
- The location of operands in the cache
- The probability of a memory alias
- The likelihood that an operand has a specific value
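A minimal sketch, assuming a GCC/Clang-style toolchain: __builtin_expect is one concrete channel through which branch statistics reach the compiler; profile-guided compilation gathers the same kind of information automatically rather than from a hand-written hint.

```c
#include <stddef.h>

/* Assumes GCC/Clang: __builtin_expect passes a branch-probability
 * hint, letting the compiler lay out and schedule the frequently
 * taken path (the positive case here) for speed. */
int sum_positive(const int *v, size_t n)
{
    int sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (__builtin_expect(v[i] > 0, 1))   /* expected almost always true */
            sum += v[i];
    }
    return sum;
}
```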
ILP Scheduling
- To achieve high performance, ILP compilers must jointly schedule multiple basic blocks (see the sketch below).
- The formation of scheduling regions is best performed using control flow statistics.
- ILP schedulers address complex trade-offs using heuristics based on approximations.
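A before/after C sketch (my own, expressed in source form) of joint scheduling across basic blocks: a long-latency load from the likely path is speculatively hoisted above the branch so its latency overlaps with the test, which is legal only when analysis proves the load is safe on both paths.

```c
/* Before: the load lives in its own basic block and cannot start
 * until the branch resolves. */
int lookup_before(const int *table, int idx, int use_table)
{
    if (use_table)
        return table[idx];        /* load begins after the branch */
    return idx;
}

/* After: what a global scheduler might produce, written as C.
 * The load is hoisted above the branch so it overlaps with the test;
 * its result is simply discarded on the other path. */
int lookup_after(const int *table, int idx, int use_table)
{
    int speculative = table[idx]; /* speculative, possibly unused load */
    if (use_table)
        return speculative;
    return idx;
}
```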
Dynamic Compilation
Static compilation:
- Tunes code to a single implementation of a specific processor architecture.
Dynamic compilation:
- Transparently customizes an executable file during execution.
- Uses information not known when the software was distributed (one such idea is sketched below).
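A hedged C sketch of one idea behind dynamic optimization, runtime value specialization: here the specialized routine is written by hand purely to show the payoff, whereas a dynamic compiler would observe the hot value and generate the specialized code transparently while the program runs.

```c
#include <stddef.h>

/* General version: the unknown stride limits unrolling and scheduling. */
static long sum_general(const long *v, size_t n, size_t stride)
{
    long s = 0;
    for (size_t i = 0; i < n; i += stride)
        s += v[i];
    return s;
}

/* Specialized for the observed common case stride == 1: the constant
 * stride lets the compiler unroll and schedule far more aggressively. */
static long sum_stride1(const long *v, size_t n)
{
    long s = 0;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

long sum(const long *v, size_t n, size_t stride)
{
    if (stride == 1)                   /* dispatch on the hot value */
        return sum_stride1(v, n);
    return sum_general(v, n, stride);
}
```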
Program Analysis
Especially memory analysis (see the aliasing sketch below).
Benefits:
- Improves program schedules
- Improves code quality
- Better use of the cache hierarchy
Challenges:
- Analysis techniques derived for sequential processors may produce poor results on ILP processors.
- Performing analysis over large amounts of code can be unacceptably slow and consume too much memory.
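A minimal C sketch (my example) of why memory analysis matters to an ILP compiler: if dst and src might alias, every store can feed a later load and the iterations must stay in order; the restrict-qualified version states the guarantee that an alias-analysis pass tries to prove automatically.

```c
#include <stddef.h>

/* Possible aliasing: the compiler must assume dst and src may
 * overlap, so loads and stores cannot be freely reordered or
 * overlapped across iterations. */
void scale_maybe_alias(float *dst, const float *src, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* 'restrict' (C99) asserts the arrays do not overlap, the same fact a
 * memory-analysis pass would try to prove; loads can now be hoisted
 * and iterations scheduled in parallel. */
void scale_no_alias(float *restrict dst, const float *restrict src,
                    float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}
```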
Program Transformation
Representations used to find fine-grained parallelism:
- Program graph
- Machine model
Transformations that support ILP (unrolling and reassociation are sketched below):
- Expression reassociation
- Loop unrolling
- Tail duplication
- Register renaming
- Procedure inlining
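A hedged C sketch (illustrative, not from the paper) of two of the listed transformations working together: the loop is unrolled by four and the additions are reassociated into independent partial sums, so the reduction no longer serializes on a single accumulator.

```c
#include <stddef.h>

/* Original: one accumulator, so every add depends on the previous
 * one and the loop runs at the latency of a single adder. */
float sum_simple(const float *v, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Unrolled by 4 with reassociated partial sums: the four accumulators
 * are independent, so a wide processor can issue the adds in parallel.
 * (Reassociating floating-point adds can change rounding slightly.) */
float sum_unrolled(const float *v, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; i++)             /* remainder iterations */
        s0 += v[i];
    return (s0 + s1) + (s2 + s3);
}
```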
Increasing Hardware Parallelism
- Current processor designs attempt to use more functional units to provide increased hardware parallelism.
- Compilers take on increasingly complex responsibilities to ensure efficient use of hardware resources.
- The number of operations “in flight” measures the amount of parallelism the compiler must provide to keep an ILP processor busy (a worked example follows).
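A worked example with assumed numbers (mine, not the paper's): by Little's law, the operations in flight needed to keep the machine busy are roughly the issue width times the average operation latency.

```latex
% Assumed figures for illustration: an 8-issue processor whose
% operations take 2 cycles on average.
\[
  \text{operations in flight} \;\approx\; \text{issue width} \times \text{average latency}
  \;=\; 8 \times 2 \;=\; 16
\]
```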
Architectures & Compilers
- To assess new architectures, compilers must incorporate the proposed architectural features.
- Compilers are the only way to evaluate an architecture’s performance on real applications.
Promising Areas of Research
- ILP compiler techniques are evolving from a scientific computing technology into a broadly useful scalar technology.
- Obstacles still inhibit the efficient use of hardware parallelism.
- Even beneficial techniques can generate unwanted side effects.
Techniques to Reduce Compile Time
New ILP compilation strategies result in long compile times.
To speed compilation:
- Careful application partitioning
- Better algorithms for analysis and optimization
Conclusion
- ILP represents a paradigm shift that redefines the traditional field of compilation.
- ILP compilation presents challenges not addressed in traditional compilers.
- As we scale up the amount of hardware parallelism, compilers take on increasingly complex responsibilities to ensure efficient use of hardware resources.
Questions?