PLC '06 Experience in Testing Compiler Optimizers Using Comparison Checking Masataka Sassa and Daijiro Sudo Dept. of Mathematical and Computing Sciences.

Slides:



Advertisements
Similar presentations
CSC 4181 Compiler Construction Code Generation & Optimization.
Advertisements

Comparison and Evaluation of Back Translation Algorithms for Static Single Assignment Form Masataka Sassa #, Masaki Kohama + and Yo Ito # # Dept. of Mathematical.
8. Static Single Assignment Form Marcus Denker. © Marcus Denker SSA Roadmap  Static Single Assignment Form (SSA)  Converting to SSA Form  Examples.
School of EECS, Peking University “Advanced Compiler Techniques” (Fall 2011) SSA Guo, Yao.
7. Optimization Prof. O. Nierstrasz Lecture notes by Marcus Denker.
Course Outline Traditional Static Program Analysis Software Testing
Chapter 9 Code optimization Section 0 overview 1.Position of code optimizer 2.Purpose of code optimizer to get better efficiency –Run faster –Take less.
1 CS 201 Compiler Construction Lecture 3 Data Flow Analysis.
Loops or Lather, Rinse, Repeat… CS153: Compilers Greg Morrisett.
Data-Flow Analysis Framework Domain – What kind of solution is the analysis looking for? Ex. Variables have not yet been defined – Algorithm assigns a.
Chapter 10 Code Optimization. A main goal is to achieve a better performance Front End Code Gen Intermediate Code source Code target Code user Machine-
1 Translation Validation: From Simulink to C Michael RyabtsevOfer Strichman Technion, Haifa, Israel Acknowledgement: sponsored by a grant from General.
Compilers and Language Translation
Chapter 7: User-Defined Functions II
Optimizing Compilers for Modern Architectures Preliminary Transformations Chapter 4 of Allen and Kennedy.
Preliminary Transformations Chapter 4 of Allen and Kennedy Harel Paz.
ISBN Chapter 3 Describing Syntax and Semantics.
1 Semantic Description of Programming languages. 2 Static versus Dynamic Semantics n Static Semantics represents legal forms of programs that cannot be.
6/9/2015© Hal Perkins & UW CSEU-1 CSE P 501 – Compilers SSA Hal Perkins Winter 2008.
CIS 101: Computer Programming and Problem Solving Lecture 8 Usman Roshan Department of Computer Science NJIT.
1 CS 201 Compiler Construction Lecture 5 Code Optimizations: Copy Propagation & Elimination.
Automatically Proving the Correctness of Compiler Optimizations Sorin Lerner Todd Millstein Craig Chambers University of Washington.
CS 536 Spring Intermediate Code. Local Optimizations. Lecture 22.
9. Optimization Marcus Denker. 2 © Marcus Denker Optimization Roadmap  Introduction  Optimizations in the Back-end  The Optimizer  SSA Optimizations.
4/25/08Prof. Hilfinger CS164 Lecture 371 Global Optimization Lecture 37 (From notes by R. Bodik & G. Necula)
CMPUT Compiler Design and Optimization1 CMPUT680 - Winter 2006 Topic H: SSA for Predicated Code José Nelson Amaral
Validating High-Level Synthesis Sudipta Kundu, Sorin Lerner, Rajesh Gupta Department of Computer Science and Engineering, University of California, San.
Chapter 3 Program translation1 Chapt. 3 Language Translation Syntax and Semantics Translation phases Formal translation models.
Chapter 6. 2 Objectives You should be able to describe: Function and Parameter Declarations Returning a Single Value Pass by Reference Variable Scope.
Compiler Construction A Compulsory Module for Students in Computer Science Department Faculty of IT / Al – Al Bayt University Second Semester 2008/2009.
Describing Syntax and Semantics
Data Flow Analysis Compiler Design Nov. 8, 2005.
PSUCS322 HM 1 Languages and Compiler Design II IR Code Optimization Material provided by Prof. Jingke Li Stolen with pride and modified by Herb Mayer PSU.
ENGR. SHOAIB ASLAM Computer Programming I Lecture 02.
Homework Reading Programming Assignments
Compiler Code Optimizations. Introduction Introduction Optimized codeOptimized code Executes faster Executes faster efficient memory usage efficient memory.
Invitation to Computer Science 5th Edition
5.3 Machine-Independent Compiler Features
REFACTORING Lecture 4. Definition Refactoring is a process of changing the internal structure of the program, not affecting its external behavior and.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
© 2008, Renesas Technology America, Inc., All Rights Reserved 1 Purpose  This training course describes how to configure the the C/C++ compiler options.
Chapter 10: Compilers and Language Translation Invitation to Computer Science, Java Version, Third Edition.
Chapter 13 Programming in the Large Dr. Jiung-yao Huang Dept. Comm. Eng. Nat. Chung Cheng Univ. TA: 鄭筱親 陳昱豪.
Testing and Debugging Version 1.0. All kinds of things can go wrong when you are developing a program. The compiler discovers syntax errors in your code.
1 Code optimization “Code optimization refers to the techniques used by the compiler to improve the execution efficiency of the generated object code”
Compiler Principles Fall Compiler Principles Lecture 0: Local Optimizations Roman Manevich Ben-Gurion University.
Static Single Assignment Form in the COINS Compiler Infrastructure Masataka Sassa, Toshiharu Nakaya, Masaki Kohama, Takeaki Fukuoka and Masahito Takahashi.
The Fail-Safe C to Java translator Yuhki Kamijima (Tohoku Univ.)
Chapter 3 Part II Describing Syntax and Semantics.
Principle of Programming Lanugages 3: Compilation of statements Statements in C Assertion Hoare logic Department of Information Science and Engineering.
User-Defined Functions II TK1914: C++ Programming.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Compiler Construction CPCS302 Dr. Manal Abdulaziz.
CS412/413 Introduction to Compilers and Translators April 2, 1999 Lecture 24: Introduction to Optimization.
Proving Optimizations Correct using Parameterized Program Equivalence University of California, San Diego Sudipta Kundu Zachary Tatlock Sorin Lerner.
Dr. Hussien Sharaf Dr Emad Nabil. Dr. Hussien M. Sharaf 2 position := initial + rate * Lexical analyzer 2. Syntax analyzer id 1 := id 2 + id 3 *
Credible Compilation With Pointers Martin Rinard and Darko Marinov Laboratory for Computer Science Massachusetts Institute of Technology.
Code Optimization.
User-Written Functions
Weakest Precondition of Unstructured Programs
Proof Carrying Code and Proof Preserving Program Transformations
Static Single Assignment
Machine-Independent Optimization
Chapter 6 Intermediate-Code Generation
Preliminary Transformations
Compiler Code Optimizations
Intermediate Code Generation
Chapter 10: Compilers and Language Translation
CSE P 501 – Compilers SSA Hal Perkins Autumn /31/2019
Code Optimization.
Presentation transcript:

PLC '06 Experience in Testing Compiler Optimizers Using Comparison Checking Masataka Sassa and Daijiro Sudo Dept. of Mathematical and Computing Sciences Tokyo Institute of Technology

PLC '06 Background Compiler Optimization Improves time efficiency of object code Optimization is a program transformation Should not change the meaning of programs Algorithm error Implementation error Testing Optimizers Incorrect optimizations cause different output Difficult to identify the place of incorrect optimization Even incorrect optimizations may not change the output (seems correct at first glance) It will cause incorrect result in other untested programs

PLC '06 Contribution Systematic test whether the optimizations are correct Not the proof of optimizers Confirm that the optimizers are correct in fine granularity and with relatively high reliability Compare the programs before and after optimization Values of all the left-hand sides of assignment statements, etc. are output to trace file Compare the trace file In case of error, error messages are given together with the intermediate code shown in C-style, which makes finding the cause of errors easily We found four bugs through experiments, including two unknown bugs Time for testing may be long, but we regard it as not important

PLC '06 Testing Optimizers through Comparison Checking First proposed by Jaramillo [2002], but no enough experience Our paper focuses on its experience in a real development of an optimizing compiler COINS ( around 230 kloc) to test its SSA optimizershttp://coins-project.org

PLC '06 Outline of the method 1. Duplicate intermediate code (no. of optimization + 1)

PLC '06 Static Single Assignment ( SSA ) Form Definitions of variables are unique in the program text. Each variable definition is distinguished by attaching a suffix. Introduce the  -function at the join point where different definitions of a variable reach. a:= :=a a2:=a1:= a3:=(a1,a2) :=a3 a:=x+y a:=a+3 b:=x+y normal form SSA form a1:=x0+y0 a2:=a1+3 b1:=x0+y0

PLC '06 2. Loop transformation on natural loops a>5 L1 b>10 L2 c=a+b c=3 d=a+b a=0 L3 a=1 L4 printf(c) L5 a=10 b=20 c=0 d=0 L0 a>5 c=a+b c=3 d=a+b a=0 a=1 printf(c) b>10 L1 L2 L3 L4 L5 a=10 b=20 c=0 d=0 L0 Lt

PLC '06 3. SSA translation a>5 c=a+b c=3 d=a+b a=0 a=1 printf(c) b>10 L1 L2 L3 L4 L5 a=10 b=20 c=0 d=0 L0 Lt a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 c_3=3 d_2=a_1+b_0 a_2=0 a_3=1 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt

PLC '06 4. SSA form optimization (1) Loop invariant code motion a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 d_2=a_1+b_0 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 c_3=3 a_2=0 a_3=1 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 c_3=3 d_2=a_1+b_0 a_2=0 a_3=1 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt

PLC '06 4. SSA form optimization (2) Common subexpression elimination a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 d_2=a_1+b_0 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 c_3=3 a_2=0 a_3=1 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 d_2=c_2 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 c_3=3 a_2=0 a_3=1 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt

PLC '06 4. SSA form optimization (3) Dead code elimination a_1=  (a_0,a_4) c_1=  (c_0,c_4) a_1>5 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 c_3=3 a_2=0 a_3=1 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) Lt a_1=  (a_0,a_4) c_1=  (c_0,c_4) d_1=  (d_0,d_3) a_1>5 c_2=a_1+b_0 d_2=c_2 printf(c_1) b_0>10 L1 L2 L3 L4 L5 a_0=10 b_0=20 c_0=0 d_0=0 c_3=3 a_2=0 a_3=1 L0 a_4=  (a_2,a_3) c_4=  (c_3,c_1) d_3=  (d_2,d_1) Lt

PLC '06 Outline of the method (cont.) 5.Loop analysis 6.Insertion of statements for outputting trace  The value of the left-hand side variable after each assignment statement  The values of both sides of relational operator before each conditional expression  The values of parameters before and after each function call

PLC '06 Outline of the method (cont.) 7.SSA back-translation and C-style program output ◆ Output intermediate code (in SSA form) in a C-style program 8.Execution  Give each duplicated compiler the same input and execute

PLC '06 8. Execution

PLC '06 Example of trace file 0_1:a_0:10 0_2:b_0:20 0_3:c_0:0 0_4:d_0:0 5_1:CALL_printf_1_1_bef:3 5_1:CALL_printf_1_1_aft:3 (i) basic block number (ii) instruction number in the basic block (iii) variable name, parameters etc. (iv) value (of variable name etc.).

PLC '06 Output of intermediate code in C-style (part)

PLC '06 9. Comparison Checking Result of trace (Error messages are output if the values do not match) (a)no optimization (d)dead code elimination (b)loop invariant code motion (c)common subexpresssion elimination 0_1:a_0:10 0_2:b_0:20 0_3:c_0:0 0_4:d_0:0 0_5:c_3:3 0_6:a_2:0 0_7:a_3:1 5_1:CALL_printf_1_1_bef:3 5_1:CALL_printf_1_1_aft:3 1_1:a_1:10 1_2:c_1:0 1_3:d_1:0 1_4:JUMP_l:10 1_4:JUMP_r:5 2_1:JUMP_l:20 2_1:JUMP_r:10 3_1:c_2:30 3_2:d_2:30 t_1:a_4:0 t_2:c_4:3 t_3:d_3:30 1_1:a_1:0 1_2:c_1:3 1_3:d_1:30 1_4:JUMP_l:0 1_4:JUMP_r:5 Loop Lv.1

PLC '06 How to compare instructions moved by code motion optimization In loop invariant code motion and partial redundancy elimination, code is moved. The system compares instructions moved out of the loop after optimization with all instructions in the corresponding loop before optimization, by looking at the tags made by the loop analysis Ex: in comparing figures (a) and (b), c_3 and a_2 are moved out of the loop

PLC '06 9. Comparison Checking Result of trace (Error messages are output if the values do not match) (a)no optimization (d)dead code elimination (b)loop invariant code motion (c)common subexpresssion elimination 0_1:a_0:10 0_2:b_0:20 0_3:c_0:0 0_4:d_0:0 0_5:c_3:3 0_6:a_2:0 0_7:a_3:1 5_1:CALL_printf_1_1_bef:3 5_1:CALL_printf_1_1_aft:3 1_1:a_1:10 1_2:c_1:0 1_3:d_1:0 1_4:JUMP_l:10 1_4:JUMP_r:5 2_1:JUMP_l:20 2_1:JUMP_r:10 3_1:c_2:30 3_2:d_2:30 t_1:a_4:0 t_2:c_4:3 t_3:d_3:30 1_1:a_1:0 1_2:c_1:3 1_3:d_1:30 1_4:JUMP_l:0 1_4:JUMP_r:5 Loop Lv.1

PLC '06 How to compare instructions deleted or added by optimization We do not compare variables deleted or added by optimization. We compare only those variables that retain correspondence. Ex: in comparing figures (c) and (d), d_0, d_1, d_2, d_3, c_2 are deleted by dead code elimination, and are not compared.

PLC '06 9. Comparison Checking Result of trace (Error messages are output if the values do not match) (a)no optimization (d)dead code elimination (b)loop invariant code motion (c)common subexpresssion elimination 0_1:a_0:10 0_2:b_0:20 0_3:c_0:0 0_4:d_0:0 0_5:c_3:3 0_6:a_2:0 0_7:a_3:1 5_1:CALL_printf_1_1_bef:3 5_1:CALL_printf_1_1_aft:3 1_1:a_1:10 1_2:c_1:0 1_3:d_1:0 1_4:JUMP_l:10 1_4:JUMP_r:5 2_1:JUMP_l:20 2_1:JUMP_r:10 3_1:c_2:30 3_2:d_2:30 t_1:a_4:0 t_2:c_4:3 t_3:d_3:30 1_1:a_1:0 1_2:c_1:3 1_3:d_1:30 1_4:JUMP_l:0 1_4:JUMP_r:5 Loop Lv.1

PLC '06 Example trace output with errors 0_1:a:1 0_2:t_0:1 0_3:b:3 0_4:t_1:3 0_5:t_2:1 0_6:t_3:0 0_7:t_4:0 0_8:c:3 0_9:t_5:1 0_10:t_6:3 0_11:t_7:3 0_12:t_8:6 0_13:t_9:1 0_14:t_10:6 0_15:x:7 0_1:a:1 0_2:t_0:1 0_3:b:3 0_4:t_1:3 0_5:t_2:1 0_6:t_3:0 0_7:t_4:0 0_8:c:3 0_9:t_5:1 0_10:t_6:3 0_11:t_7:0 0_12:t_8:3 0_13:t_9:1 0_14:t_10:3 0_15:x:4 Errors in t_7, t_8, t_10, x before optimization after optimization (CSEQP)

PLC '06 Example of error message -----NG!----- opt::LoopLevel=0,BlkID=0,InstrNumber=11,t_7:0 unopt::LoopLevel=0,BlkID=0,InstrNumber=11,t_7: NG!----- opt::LoopLevel=0,BlkID=0,InstrNumber=12,t_8:3 unopt::LoopLevel=0,BlkID=0,InstrNumber=12,t_8: NG!----- opt::LoopLevel=0,BlkID=0,InstrNumber=14,t_10:3 unopt::LoopLevel=0,BlkID=0,InstrNumber=14,t_10: NG!----- opt::LoopLevel=0,BlkID=0,InstrNumber=15,x:4 unopt::LoopLevel=0,BlkID=0,InstrNumber=15, x:

PLC '06 Experiments We used the COINS C compiler. We implemented our system in Java. Approximately 5000 lines were added, of which 2000 lines are for comparison checker.

PLC '06 Tested optimizations Tested optimizations are: copy propagation, constant propagation, common subexpression elimination, CSEQP, partial redundancy elimination, loop invariant code motion, strength reduction and test replacement of induction variables, dead code elimination, removal of redundant  -functions, and removal of empty basic blocks The test programs used are: 700 small test programs 181.mcf from SPEC CINT2000.

PLC '06 Results We found four bugs in the COINS' SSA optimizers: Two for PRE (partial redundancy elimination) (unknown bugs) One for CSEQP (common subexpression elimination based on question propagation) One for operator strength reduction and test replacement of induction variable

PLC '06 A bug in partial redundancy elimination

PLC '06 False alarm False alarm is a situation where comparison checking displays errors although the optimization is correct. Major cause: Values representing addresses that are dynamically allocated at runtime. They may change if optimizations are applied. Errors concerning source level integers usually have a small number of digits. We provided an option to let the user specify the limiting number of digits for checking integer values. By specifying this limit as seven digits, the ratio of false alarm becomes approximately 5%. In practice, values of addresses that are a false alarm can be easily recognized.

PLC '06 Conclusions Experience in testing compiler optimizers using comparison checking Almost all program values before and after optimizations are output as a trace and checked Applied to the SSA optimizers of a real project COINS Could find bugs in optimizers Four bugs of optimizers including two unknown bugs The false alarm can be made as low as approx. 5% Can find erroneous optimization point easily Error message indicates the critical point Intermediate code in C-style helps locate the point Quite beneficial in practice

PLC '06

Difference with Jaramillo’s work 1.Alternate execution is not used and no breakpoints are used 2.Trace files are generated 3.The intermediate code is displayed in C-style program in the case of error 4.False alarms by pointers are well managed 5.SSA form is used 6.Other implementation issues, such as handling of natural loops, checking conditional expressions, and handling of each type of optimization are described

PLC '06 Related work Method by Necula et al. Validation using symbolic execution Found bugs of gcc loop expansion, register allocation Method by Zuck, Pnueli et al. Make correspondence to computation model Confirm using validation tool Method by Okuma and Minamide Prove definition of each pass using a theorem prover Generate compiler programs automatically

PLC '06 Trace file A trace of the values of variables etc. called “Variable information” is taken at execution time before and after optimization (i) basic block number (ii) instruction number in the basic block (iii) variable name etc. (iv) value (of variable name etc.). Values of left-hand side of assignment statements Values of both sides of the comparison of conditional expressions Values of parameters of function calls

PLC '06 _ L2: ((int)(*(((int *)(unsigned char *)&(a))))) = ((int)( 1)); divexI32_0 = ((int)(((int)(*(((int *)(unsigned char *)&(a))))))); ((int)(*(((int *)(unsigned char *)&(b))))) = ((int)(((int)(divexI32_0 + 2)))); divexI32_1 = ((int)(((int)(*(((int *)(unsigned char *)&(b))))))); divexI32_2 = ((int)(((int)(*(((int *)(unsigned char *)&(a))))))); divexI32_3 = ((int)(((int)(*(((int *)(unsigned char *)&(c))))))); divexI32_4 = ((int)(((int)(divexI32_2 * divexI32_3)))); ((int)(*(((int *)(unsigned char *)&(c))))) = ((int)(((int)(divexI32_1 + divexI32_4)))); divexI32_5 = ((int)(((int)(*(((int *)(unsigned char *)&(a))))))); divexI32_6 = ((int)(((int)(*(((int *)(unsigned char *)&(b))))))); divexI32_7 = ((int)(((int)(*(((int *)(unsigned char *)&(c))))))); divexI32_8 = ((int)(((int)(divexI32_6 + divexI32_7)))); divexI32_9 = ((int)(((int)(*(((int *)(unsigned char *)&(a))))))); divexI32_10 = ((int)(((int)(divexI32_8 * divexI32_9)))); ((int)(*(((int *)(unsigned char *)&(x))))) = ((int)(((int)(divexI32_5 + divexI32_10)))); _L2: ((int)(*(((int *)(unsigned char *)&(a))))) = ((int)( 1)); divexI32_0 = ((int)(((int)(*(((int *)(unsigned char *)&(a))))))); ((int)(*(((int *)(unsigned char *)&(b))))) = ((int)(((int)(divexI32_0 + 2)))); divexI32_1 = ((int)(((int)(*(((int *)(unsigned char *)&(b))))))); divexI32_2 = ((int)(divexI32_0)); divexI32_3 = ((int)(((int)(*(((int *)(unsigned char *)&(c))))))); divexI32_4 = ((int)(((int)(divexI32_2 * divexI32_3)))); ((int)(*(((int *)(unsigned char *)&(c))))) = ((int)(((int)(divexI32_1 + divexI32_4)))); divexI32_5 = ((int)(divexI32_0)); divexI32_6 = ((int)(divexI32_1)); divexI32_7 = ((int)(divexI32_3)); divexI32_8 = ((int)(((int)(divexI32_6 + divexI32_7)))); divexI32_9 = ((int)(divexI32_0)); divexI32_10 = ((int)(((int)(divexI32_8 * divexI32_9)))); ((int)(*(((int *)(unsigned char *)&(x))))) = ((int)(((int)(divexI32_5 + divexI32_10))));