Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California,

Slides:



Advertisements
Similar presentations
A Polynomial-Time Algorithm for Global Value Numbering SAS 2004 Sumit Gulwani George C. Necula.
Advertisements

1 CIS 461 Compiler Design and Construction Fall 2014 Instructor: Hugh McGuire slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module.
Intermediate Code Generation
Introduction to Assembly language
1 Compiler Construction Intermediate Code Generation.
1 Programming Languages (CS 550) Lecture Summary Functional Programming and Operational Semantics for Scheme Jeremy R. Johnson.
Computer Architecture CSCE 350
CS1104 – Computer Organization PART 2: Computer Architecture Lecture 4 Assembly Language Programming 2.
Programming Languages Marjan Sirjani 2 2. Language Design Issues Design to Run efficiently : early languages Easy to write correctly : new languages.
Abstract Interpretation with Alien Expressions and Heap Structures Bor-Yuh Evan ChangK. Rustan M. Leino University of California, BerkeleyMicrosoft Research.
Ross Tate, Juan Chen, Chris Hawblitzel. Typed Assembly Languages Compilers are great but they make mistakes and can introduce vulnerabilities Typed assembly.
Coolaid: Debugging Compilers with Untrusted Code Verification Bor-Yuh Evan Chang with George Necula, Robert Schneck, and Kun Gao May 14, 2003 OSQ Retreat.
CPSC Compiler Tutorial 8 Code Generator (unoptimized)
Extensible Verification of Untrusted Code Bor-Yuh Evan Chang, Adam Chlipala, Kun Gao, George Necula, and Robert Schneck May 14, 2004 OSQ Retreat Santa.
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 52 Database Systems I Relational Algebra.
Typed Assembly Languages COS 441, Fall 2004 Frances Spalding Based on slides from Dave Walker and Greg Morrisett.
Slides prepared by Rose Williams, Binghamton University Chapter 1 Getting Started 1.1 Introduction to Java.
Under the Hood of the Open Verifier Bor-Yuh Evan Chang, Adam Chlipala, Kun Gao, George Necula, and Robert Schneck October 21, 2003 OSQ Group Meeting.
Honors Compilers The Course Project Feb 28th 2002.
CS 536 Spring Code generation I Lecture 20.
Translation Validation for an Optimizing Compiler Guy Erez Based on George C. Necula article (ACM SIGPLAN 2000) Advanced Programming Languages Seminar,
Cse321, Programming Languages and Compilers 1 6/19/2015 Lecture #18, March 14, 2007 Syntax directed translations, Meanings of programs, Rules for writing.
Program Analysis Using Randomization Sumit Gulwani, George Necula (U.C. Berkeley)
CS1104 Assembly Language: Part 1
The Structure of the GNAT Compiler. A target-independent Ada95 front-end for GCC Ada components C components SyntaxSemExpandgigiGCC AST Annotated AST.
A Polynomial-Time Algorithm for Global Value Numbering SAS 2004 Sumit Gulwani George C. Necula.
1 Type Type system for a programming language = –set of types AND – rules that specify how a typed program is allowed to behave Why? –to generate better.
1 ES 314 Advanced Programming Lec 2 Sept 3 Goals: Complete the discussion of problem Review of C++ Object-oriented design Arrays and pointers.
Extensible Untrusted Code Verification Robert Schneck with George Necula and Bor-Yuh Evan Chang May 14, 2003 OSQ Retreat.
David Evans CS201j: Engineering Software University of Virginia Computer Science Lecture 18: 0xCAFEBABE (Java Byte Codes)
© The McGraw-Hill Companies, 2006 Chapter 1 The first step.
Topic #10: Optimization EE 456 – Compiling Techniques Prof. Carl Sable Fall 2003.
COMP2011 Assembly Language Programming and Introduction to WRAMP.
Adapted from Prof. Necula UCB CS 1641 Overview of COOL ICOM 4029 Lecture 2 ICOM 4029 Fall 2008.
1 COMP 3438 – Part II-Lecture 1: Overview of Compiler Design Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ.
Type Systems CS Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.
C++ Programming Language Lecture 2 Problem Analysis and Solution Representation By Ghada Al-Mashaqbeh The Hashemite University Computer Engineering Department.
CIS-165 C++ Programming I CIS-165 C++ Programming I Bergen Community College Prof. Faisal Aljamal.
How to select superinstructions for Ruby ZAKIROV Salikh*, CHIBA Shigeru*, and SHIBAYAMA Etsuya** * Tokyo Institute of Technology, dept. of Mathematical.
College Board A.P. Computer Science A Topics Program Design - Read and understand a problem's description, purpose, and goals. Procedural Constructs.
A Certifying Compiler and Pointer Logic Zhaopeng Li Software Security Lab. Department of Computer Science and Technology, University of Science and Technology.
Introduction to Compilers. Related Area Programming languages Machine architecture Language theory Algorithms Data structures Operating systems Software.
Programming Languages
Random Interpretation Sumit Gulwani UC-Berkeley. 1 Program Analysis Applications in all aspects of software development, e.g. Program correctness Compiler.
By Mr. Muhammad Pervez Akhtar
Type Systems CSE 340 – Principles of Programming Languages Fall 2015 Adam Doupé Arizona State University
The Hashemite University Computer Engineering Department
Language Implementation Overview John Keyser Spring 2016.
©SoftMoore ConsultingSlide 1 Structure of Compilers.
Java Bytecode Verification Types Chris Male, David J. Pearce, Alex Potanin and Constantine Dymnikov Victoria University of Wellington, New.
Announcements Assignment 2 Out Today Quiz today - so I need to shut up at 4:25 1.
Objective of the course Understanding the fundamentals of the compilation technique Assist you in writing you own compiler (or any part of compiler)
A Single Intermediate Language That Supports Multiple Implemtntation of Exceptions Delvin Defoe Washington University in Saint Louis Department of Computer.
CS 404 Introduction to Compiler Design
CS 3304 Comparative Languages
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
6.001 SICP Compilation Context: special purpose vs. universal machines
CS 536 / Fall 2017 Introduction to programming languages and compilers
Chapter 9 :: Subroutines and Control Abstraction
Introduction to the C Language
Instructions - Type and Format
Chap. 8 :: Subroutines and Control Abstraction
Chap. 8 :: Subroutines and Control Abstraction
Ada – 1983 History’s largest design effort
Chapter 6 Intermediate-Code Generation
Algorithm Correctness
Types and Type Checking (What is it good for?)
6.001 SICP Interpretation Parts of an interpreter
Some Assembly (Part 2) set.html.
ICOM 4029 Fall 2003 Lecture 2 (Adapted from Prof. Necula UCB CS 164)
Presentation transcript:

Type-Based Verification of Assembly Language for Compiler Debugging Bor-Yuh Evan ChangAdam Chlipala George C. NeculaRobert R. Schneck University of California, Berkeley TLDI 2005 Long Beach, California January 10, 2005

2 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Problem and Motivation assembly code existingType check assembly code produced by existing compilers for a Java-like language Validate that certifying compilation aids compiler debugging –using undergraduate compilers course as a test bed Introduce undergraduates to using compiler technology for software quality

3 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Problem and Motivation Starting Point: Bytecode verification –Effective, simple, and very accessible –But does not work for assembly code Why assembly code? –Native code compilers are more complex –Also debug JITs Why existing compilers? –Force to make the verifier flexible –Better accessibility to students (i.e., require little or no annotations) –More test cases to validate “certified compilation for debugging”

4 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Debugging Compilers with Verification Run Compile Run Compiler Test Case Compiled Program Test Inputs Stressed Compilers Student Compile Verify Compiler Test Case Relaxed Compilers Student Ð^K^V^H^P^L^V^H0^L^V^ Hp^L^V^H°^L^V^Hð^L^V^ H^P^M^V^H0^M^V^Hp^M^ V^H^Ð^M^V^Hð^M^V^ Line 2147: Type error – argument -8(SP0) : unknown, which is not <= nonnull String.

5 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Challenges class P { int f; int m() { … } } class C extends P { int m() { … } } … P p = new P(); P c = new C(); c.m(); … invokevirtual P.m() branch (= r c 0) L abort r tmp := m[r c + 8] r tmp := m[r tmp + 12] r arg0 := r c r ra := & L ret jump [r tmp ] L ret : invokevirtual P.m()

6 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Challenges class P { int f; int m() { … } } class C extends P { int m() { … } } … P p = new P(); P c = new C(); c.m(); … invokevirtual P.m() branch (= r c 0) L abort r tmp := m[r c + 8] r tmp := m[r tmp + 12] r arg0 := r c r ra := & L ret jump [r tmp ] L ret : h r c : P, … i h r c : nonnull P, … i h r tmp : disp(P), … i h r tmp : meth(P,12), … i h r arg0 : P, … i h r rv : int, … i

7 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Challenges class P { int f; int m() { … } } class C extends P { int m() { … } } … P p = new P(); P c = new C(); c.m(); … invokevirtual P.m() branch (= r c 0) L abort r tmp := m[r c + 8] r tmp := m[r tmp + 12] r p r arg0 := r p r ra := & L ret jump [r tmp ] L ret : h r c : P, … i h r c : nonnull P, … i h r tmp : disp(P), … i h r tmp : meth(P,12), … i P h r arg0 : P, … i h r rv : int, … iunsound

8 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Challenges class P { int f; int m() { … } } class C extends P { int m() { … } } … P p = new P(); P c = new C(); int f = p.f; p.m(); x = f + 1; branch (= r p 0) L abort r tmp := r p + 12 r f := m[r tmp ] branch (= r p 0) L abort r tmp := m[r p + 8] r tmp := m[r tmp + 12] r arg0 := r p r ra := & L ret jump [r tmp ] L ret : r x := r f + 1 reordering and optimization

9 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Challenges branch (= r p 0) L abort r tmp := r p + 12 r f := m[r tmp ] r x := r f + 1 branch (= r p 0) L abort r tmp := m[r tmp - 4] r tmp := m[r tmp + 12] r arg0 := r p r ra := & L ret jump [r tmp ] L ret : h r tmp : int ptr, … i “funny” pointer arithmetic class P { int f; int m() { … } } class C extends P { int m() { … } } … P p = new P(); P c = new C(); int f = p.f; p.m(); x = f + 1;

10 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Outline Overview Abstract State –Types –Join algorithm Verification Procedure Educational Experience Concluding Remarks

11 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Overview To maximize accessibility (i.e., minimize annotations) –infer types at intermediate program points using abstract interpretation –willingness to have a limited amount of specialization to a compilation strategy To handle assembly code –use (simple) dependent types To be less sensitive to the compilation strategy –assign types to intermediate values lazily

12 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Abstract State Keep intermediate expressions in symbolic form Symbolic values Symbolic values abstract irrelevant expressions keeping only type information Intermediate expressions track many dependencies between registers

13 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Why Symbolic Values? Used to track dependencies between registers Value state form convenient for abstract interpretation r 1 = r 2 Æ r 1 = r r 1 =  + 4, r 2 =  + 4, r 3 =   r 1 := r 2  [r 1 !  (r 2 )]

14 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Values Define a normalization of expressions to values Values give the expressions of interest –For our case, address computation and comparisons with constants –Would vary depending on the source language or compiler of interest –Not all equal expressions normalize to the same value, so value forms must be chosen carefully Registers are equal if they map to the same values (after normalization)

15 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Types Primitive Types and Object References Types for Run-time Structures Tracks that the value is the dispatch table of object 

16 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Join h r 1 =  +4, r 2 =  +4, r 3 =  D  : disp(  ),  : nonnull exactly C i h r 1 =  +4, r 2 =  +8, r 3 =  D  : disp(  ),  : nonnull C i h r 1 =  +4, r 2 = , r 3 =  D  : disp(  ),  : nonnull C,  : >i ( ,  ) !  ( ,  ) ! 

17 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Join Joining of type state basically given by subtyping –subtyping lattice still fairly simple – dictated mostly by the class inheritance hierachy Values are (fancy) labels for equivalence classes of registers –lattice of conjunctions of equalities on registers –first normalize expressions to values for join –each distinct pair of values in the input state gives an equivalence class in the result state

18 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Outline Overview Abstract State –Types –Join algorithm Verification Procedure Educational Experience Concluding Remarks

19 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Verification Example class P { int f; int m() { … } } class C extends P { … } branch (= r p 0) L abort r tmp := r p + 12 r f := m[r tmp ] r x := r f + 1 r tmp := m[r tmp - 4] r tmp := m[r tmp + 12] r arg0 := r p r ra := & L ret jump [r tmp ] L ret : Typing Rule h r tmp =  + 12 D … i h r p =  D  : P i h … D  : nonnull P i h r f =  D  : int i h r x =  + 1 D  : int i

20 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Verification Example class P { int f; int m() { … } } class C extends P { … } branch (= r p 0) L abort r tmp := r p + 12 r f := m[r tmp ] r x := r f + 1 r tmp := m[r tmp - 4] r tmp := m[r tmp + 12] r arg0 := r p r ra := & L ret jump [r tmp ] L ret : h r tmp =  + 12 D … i h r p =  D  : P i h … D  : nonnull P i Typing Rule h r f =  D  : int i h r x =  + 1 D  : int i h r tmp =  D  : disp(  ) i h r tmp =  D  : meth( ,12) i h r arg0 =  D … i h r rv =  D  : int i Calling meth( ,12) checks r arg0 =   + 12 – 4   + 8

21 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Outline Overview Abstract State –Types –Join algorithm Verification Procedure Educational Experience Concluding Remarks

22 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Comparing Compiler Development Good compilations increase: 53% to 75% Without Coolaid With Coolaid

23 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Comparing Compiler Development Type errors decrease: 44% to 19% –even 29% to 15% if only consider those that are visible in testing Without Coolaid With Coolaid

24 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Comparing Compiler Development False alarms: 6% to 1% –price for ease of use Without Coolaid With Coolaid

25 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Student Perspective 0:“counter-productive” 6:“can’t imagine being without it” A favorite comment: “I would be totally lost without Coolaid. I learn best when I am using it hands-on …. I was able to really understand stack conventions and optimizations and to appreciate them.” Traditional testing procedure –49 test cases / inputs Mean scores: –33 (67%) in 2002 –34 (69%) in 2003 –39 (79%) in 2004

26 1/10/2005Chang et al.: Type-Based Verification of Assembly Language for Compiler Debugging Conclusion Extended data-flow based type inference of intermediate languages to assembly code –Able to retrofit existing compilers –Minimal annotations for accessibility Provided experimental confirmation that certifying compilation helps early compiler debugging Introduced undergraduates to the idea of improving software quality with program analysis