Modular Machine Code Verification. Zhaozhong Ni. Advisor: Zhong Shao. Committee: Zhong Shao, Paul Hudak, Carsten Schürmann, David Walker. Department of Computer Science, Yale University.

Presentation transcript:

Modular Machine Code Verification
Zhaozhong Ni
Advisor: Zhong Shao
Committee: Zhong Shao, Paul Hudak, Carsten Schürmann, David Walker
Department of Computer Science, Yale University
Nov. 29, 2006
PhD Thesis Defense

2 19 Lines of Code on Every PC
swapcontext:
    ; store old context
    mov eax, [esp+4]
    mov [eax+0], OK
    mov [eax+4], ebx
    mov [eax+8], ecx
    mov [eax+12], edx
    mov [eax+16], esi
    mov [eax+20], edi
    mov [eax+24], ebp
    mov [eax+28], esp
    ; load new context
    mov eax, [esp+8]
    mov esp, [eax+28]
    mov ebp, [eax+24]
    mov edi, [eax+20]
    mov esi, [eax+16]
    mov edx, [eax+12]
    mov ecx, [eax+8]
    mov ebx, [eax+4]
    mov eax, [eax+0]
    ret

3 19 Lines of Code in Every ms swapcontext: Runs thousands of times per second Used by assembly, C, MSIL, JVML, etc. Basis of multi-tasking, OS, and software Safety and correctness taken for granted

4 19 Lines of Code Looks Simple [Figure: diagram of swapcontext acting on the registers eax, ebx, ecx, edx, esi, edi, ebp, esp, with values a1..a8 and b1..b8, the old and new contexts, the constant OK, and the return pointers retp and retp' around "call swapcontext".]

5 19 Lines of Code Proven Hard
swapcontext: Simple code, complex reasoning!
    stack / heap / memory mutation
    procedure call / first-class code pointer
    protection / polymorphism
Lacks specification and verification that are
    formal (machine checkable in a sound logic)
    general (allows all possible usage of context)
    realistic (usable from assembly and C level)

6 Outline Introduction The XCAP Framework Mini Thread Library Connect XCAP to TAL Conclusion

7 Software Reliability Bugs are costly Especially important for mission-critical software consumer electronics software internet software

8 Test-Patch Approach Works most of the time Gives no guarantee Could make things worse [Flowchart labels: test, "pre-release?" (yes / no), create patch, debug]

9 Language-based Approach Uses types and other formal specifications Excludes all bugs in certain categories illegal command, overflow, dangling pointer, etc. Successful and popular ML, Java, C#, etc. Reached virtual machine code level JVML, MSIL, TIL, TAL, etc. Meta-theorems can make guarantees

10 Traditional Assumptions
Types are for application software: you cannot write an OS without (void *)
Types are for high-level languages: not much to talk about B CD 15
Types are only for “no blue screen”: how about “variable x is a prime number”?
Type safety is bad for performance: turn off array-bound checking before release

11 Program Specification
bool prime (int n) {
    assert (n > 0);
    for (int i = 2; i < n; i ++)
        // n mod 2,…,i-1 ≠ 0
        if (n % i == 0) return false;
    // n mod 2,…,n-1 ≠ 0
    return true;
}
syntactic types; machine-logical specifications; meta-logical specifications
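The three kinds of specification on this slide can be made concrete with a small sketch. The version below keeps the slide's code but turns the two comments into runtime checks; the helper no_divisor_below and the name prime_checked are hypothetical names introduced only for this illustration. It is a sketch of the idea, not the formal specification used in the thesis, where such properties are proved statically rather than tested.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when no k in [2, bound) divides n.
   This is the property written as "n mod 2,...,bound-1 != 0" on the slide. */
static bool no_divisor_below(int n, int bound) {
    for (int k = 2; k < bound; k++)
        if (n % k == 0) return false;
    return true;
}

/* Same control flow as the slide's prime(), with the comments made executable. */
bool prime_checked(int n) {
    assert(n > 0);                        /* precondition from the slide */
    for (int i = 2; i < n; i++) {
        assert(no_divisor_below(n, i));   /* loop invariant */
        if (n % i == 0) return false;
    }
    assert(no_divisor_below(n, n));       /* holds whenever true is returned */
    return true;
}
```

The types bool and int are the syntactic level, the assert calls play the role of logical specifications, and a proof that the asserts never fail would be the meta-logical level. Like the slide's code, this sketch returns true for n = 1.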

12 Machine Code Verification Motivations everything goes down to binary high-level safety efforts lost in compilation critical code directly written in low level Challenges Expressiveness Modularity Goals both user and system level code modular specification + certification

13 Proof-Carrying Code Proposed 10 years ago [Necula & Lee] machine code, machine checkable proof [Diagram labels: Code, Proof, Checker, Meta theory, Specification]

14 Foundational PCC Proposed by [Appel] mathematical logic checker, mathematical logic theory [Diagram labels: Code, Proof, Checker, Meta theory, Specification]

15 Approaches to PCC
Type-based PCC
    TAL [Morrisett98], Touchstone PCC [Colby00], Syntactic FPCC [Hamid02], FTAL [Crary03], LTAL [Chen03], …
    Modular; Generate proof easily; Type safety
Logic-based PCC
    Original PCC [Necula98], Semantic FPCC [Appel01], CAP [Yu03], Open Verifier [Chang05], CCAP/CMAP [Yu04, Feng05], …
    Expressive; Advanced properties; Good interoperability

16 PCC After 10 Years In principle, can verify any machine code! In reality, many programs are not verified. For some code, we do not know HOW! [Diagram labels: Code, Proof, Checker, Meta theory, Specification]

17 User-level Code: List Append Adapted from [Reynolds02] ……

18 User-level Code: List Append Adapted from [Reynolds02] ……

19 User-level Code: List Append (adapted from [Reynolds02])
Feature                                                                  Type-based   Logic-based
Inductive definitions (correctness of list append)                       -            +
Strong update (Separation logic) (allocation, de-allocation, mutation)   -            +
Embedded code pointers (continuation)                                    +            -
Impredicative polymorphisms (closure)                                    +            -

20 ECP Problem w. Hoare Logic Embedded code pointers (ECP) Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses “… are difficult to describe in … Hoare logic” [Reynolds02] Previous approaches Ignore ECP [Necula98, Yu04] Limit ECP specifications to types [Hamid04] Sacrifice modularity [Yu03] Use complex indexed semantic models [Appel01]

21 Outline Introduction The XCAP Framework Mini Thread Library Connect XCAP to TAL Conclusion

22 The XCAP Framework [POPL’06] A logic-based PCC framework modular verification of machine code supports ECP without compromise Support both system and user code Consists of target machine (not fixed) assertion language (consistency) inference rules (soundness)

23 Target Machine

24 Dynamic Semantics

25 Hoare logic in CPS Use general predicate logic for assertions example: Mechanized in a proof assistant (Coq) Extensions made: CCAP, CMAP, etc. Certified Assembly Programming [Yu03, Hamid04, Yu04, Feng05]

26 How CAP Certifies Instructions

27 How CAP Certifies Programs …

28 The ECP Problem cptr(f, a) = ?

29 Internalize Hoare-derivation for ECP Previous Approach Circularity! Stratification [OHearn97, Naumann01] Works for simple case Hard for assembly Hard for polymorphism Step-Indexing [Appel01, Appel02, Schneck03] Works for polymorphism Heavyweight Not standard Hoare logic

30 CAP’s Approach Specify ECP by checking against code spec Verify all code specs are indeed valid Modularity problem

31 The XCAP Approach Specify ECP independent of code spec Check ECP against global code spec Verify global code spec is indeed valid
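The slide's formulas did not survive in this transcript. As a hedged reconstruction of the idea only, the three bullets can be read as follows: cptr(f, a) is written as a purely syntactic proposition that does not mention any code specification, and it only acquires meaning when interpreted against the global code heap specification. The symbol names below (Ψ for the global specification, f for a code label, a for the assertion expected at f) follow the usual presentation of XCAP and should be treated as assumptions about the notation rather than a transcription of the slide.

```latex
% \Psi : global code heap specification, mapping code labels to assertions
% f    : code label;  a : assertion (precondition) expected at f
[\![\, \mathsf{cptr}(f, a) \,]\!]_{\Psi}
  \;\triangleq\;
  f \in \mathrm{dom}(\Psi) \;\wedge\; \Psi(f) = a
```

Because the proposition itself never names Ψ, a module can be specified and checked against its local view, and the global specification is supplied only when modules are linked.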

32 Extended Propositions

33 XCAP Rules

34 How XCAP Works with ECP (SEQ) (ECP) (JMP) (JD)

35 Verification of append()

36 Impredicative Polymorphisms Important for ECP Naïve interpretation function fails

37 New Interpretation Soundness of interpretation Interpretation Consistency

38 Recursive Specification Simple recursive data structures linked list, queue, stack, tree, etc. supported via inductive definition of Prop Complex recursive structures with ECP object (self refers to the entire object) threading invariant (each thread assumes others) Recursive specification

39 Memory Mutation Strong update special conjunction (p * q) in separation logic directly definable in Prop and PropX explicit alias control, popular in system level Weak update (general reference) mutable reference (int ref) in ML managed data pointers (int __gc*) in.NET rely on GC to recycle memory popular in user level
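For reference, the special conjunction mentioned under strong update is the standard separating conjunction of separation logic. The definition below is the textbook one, stated over heap predicates; the names H, H_1, H_2 are chosen here for illustration, and the exact encoding used in the Coq development may differ.

```latex
% p, q : predicates over heaps;  H_1 \uplus H_2 : union of heaps with disjoint domains
(p * q)\,H
  \;\triangleq\;
  \exists H_1, H_2.\;
    H = H_1 \uplus H_2 \;\wedge\; p\,H_1 \;\wedge\; q\,H_2
```

Since p and q describe disjoint parts of the heap, updating a cell described by p cannot invalidate q, which is what gives the explicit alias control mentioned on the slide.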

40 Weak Update Reference cell Interpretation Record macro

41 Implementation in Coq
PropX can share similar tactics with Prop
Target machine                             341 lines
PropX, interpretation, and consistency    1733 lines
XCAP with soundness                        444 lines
CAP with soundness                         402 lines
CAP to XCAP translation with proof         543 lines
Separation logic and lemmas                300 lines
append() example                          1718 lines

42 Outline Introduction The XCAP Framework Mini Thread Library Connect XCAP to TAL Conclusion

43 Why Thread Library? Concurrent verification primitives’ correctness is assumed primitives are not really “primitive”! poor portability due to lack of formal spec Core of OS kernel assignment 1 of OS course written in C and Assembly requires both safety and efficiency

44

45 A Mini Thread Library Modeled after Pth Non-preemptive user level threads Written in (subset of) x86 assembly

46 Threading Model

47 Modules and Interfaces

48 Verify That 19 Lines of Code Step 1: specify machine context Step 2: specify function call/return Step 3: specify swapcontext() Step 4: prove it!

49 Machine Context
typedef struct mctx_st *mctx_t;
struct mctx_st { int eax; int ebx; int ecx; int edx; int esi; int edi; int ebp; int esp; };
[Figure labels: retv, bx, cx, dx, si, di, bp, sp, cs, mctx, public, private, ret]
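The field order of the context structure matters because the assembly for swapcontext addresses the fields by fixed byte offsets ([eax+0] through [eax+28]). A minimal sketch of that correspondence is shown below. It assumes 32-bit fields, so int32_t is used in place of int to make the sizes explicit; that substitution is an assumption of this sketch, not something stated on the slide.

```c
#include <stddef.h>
#include <stdint.h>

/* Machine context with explicitly 32-bit fields (assumption: the slide's
   "int" is 4 bytes, as on 32-bit x86). */
struct mctx_st {
    int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp;
};
typedef struct mctx_st *mctx_t;

/* Compile-time checks that each field sits at the offset the assembly uses. */
_Static_assert(offsetof(struct mctx_st, eax) == 0,  "eax at [ctx+0]");
_Static_assert(offsetof(struct mctx_st, ebx) == 4,  "ebx at [ctx+4]");
_Static_assert(offsetof(struct mctx_st, ecx) == 8,  "ecx at [ctx+8]");
_Static_assert(offsetof(struct mctx_st, edx) == 12, "edx at [ctx+12]");
_Static_assert(offsetof(struct mctx_st, esi) == 16, "esi at [ctx+16]");
_Static_assert(offsetof(struct mctx_st, edi) == 20, "edi at [ctx+20]");
_Static_assert(offsetof(struct mctx_st, ebp) == 24, "ebp at [ctx+24]");
_Static_assert(offsetof(struct mctx_st, esp) == 28, "esp at [ctx+28]");
```

If a field were reordered or widened, the build would fail instead of the context switch silently saving registers into the wrong slots.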

50 Function Call / Return local storage return address argument 1 argument 2 … argument n caller frames excess space esp

51 swapcontext()
void swapcontext (mctx_t old, mctx_t new);
    mov eax, [esp+4]
    mov [eax+ 0], OK
    mov [eax+ 4], ebx
    mov [eax+ 8], ecx
    mov [eax+12], edx
    mov [eax+16], esi
    mov [eax+20], edi
    mov [eax+24], ebp
    mov [eax+28], esp
    mov eax, [esp+8]
    mov esp, [eax+28]
    mov ebp, [eax+24]
    mov edi, [eax+20]
    mov esi, [eax+16]
    mov edx, [eax+12]
    mov ecx, [eax+ 8]
    mov ebx, [eax+ 4]
    mov eax, [eax+ 0]
    ret
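To make the effect of these 19 instructions easier to follow, here is a small executable model in C. It is only a sketch: the machine_t type, the ld/st helpers, the word-addressed memory, and the numeric value chosen for OK are all assumptions made for this illustration, and the model ignores everything about x86 except the registers and memory cells the routine touches.

```c
#include <stdint.h>
#include <string.h>

#define OK 1u             /* assumed value; the slides only name the constant */
#define MEM_WORDS 256u    /* models byte addresses 0 .. 1023, word aligned */

typedef struct {
    uint32_t eax, ebx, ecx, edx, esi, edi, ebp, esp, eip;
    uint32_t mem[MEM_WORDS];
} machine_t;

static uint32_t ld(const machine_t *m, uint32_t a) { return m->mem[a / 4]; }
static void st(machine_t *m, uint32_t a, uint32_t v) { m->mem[a / 4] = v; }

/* One C statement per instruction of swapcontext, in the same order. */
void model_swapcontext(machine_t *m) {
    m->eax = ld(m, m->esp + 4);          /* eax = old */
    st(m, m->eax + 0, OK);
    st(m, m->eax + 4, m->ebx);
    st(m, m->eax + 8, m->ecx);
    st(m, m->eax + 12, m->edx);
    st(m, m->eax + 16, m->esi);
    st(m, m->eax + 20, m->edi);
    st(m, m->eax + 24, m->ebp);
    st(m, m->eax + 28, m->esp);
    m->eax = ld(m, m->esp + 8);          /* eax = new */
    m->esp = ld(m, m->eax + 28);
    m->ebp = ld(m, m->eax + 24);
    m->edi = ld(m, m->eax + 20);
    m->esi = ld(m, m->eax + 16);
    m->edx = ld(m, m->eax + 12);
    m->ecx = ld(m, m->eax + 8);
    m->ebx = ld(m, m->eax + 4);
    m->eax = ld(m, m->eax + 0);
    m->eip = ld(m, m->esp);              /* ret: pop the return address */
    m->esp += 4;
}

/* Builds one concrete state and returns 1 if the expected result holds. */
int check_swap_model(void) {
    static machine_t m;
    const uint32_t old_ctx = 100, new_ctx = 200, cur_sp = 400, new_sp = 600;
    memset(&m, 0, sizeof m);
    m.ebx = 1; m.ecx = 2; m.edx = 3; m.esi = 4; m.edi = 5; m.ebp = 6;
    m.esp = cur_sp;
    st(&m, cur_sp, 0x1111);              /* caller's return address */
    st(&m, cur_sp + 4, old_ctx);         /* argument 1: old */
    st(&m, cur_sp + 8, new_ctx);         /* argument 2: new */
    st(&m, new_ctx + 0, OK);
    for (uint32_t i = 1; i <= 6; i++)    /* saved ebx..ebp = 11..16 */
        st(&m, new_ctx + 4 * i, 10 + i);
    st(&m, new_ctx + 28, new_sp);
    st(&m, new_sp, 0x2222);              /* return address of the new thread */
    model_swapcontext(&m);
    return m.eip == 0x2222 && m.esp == new_sp + 4 && m.eax == OK
        && m.ebx == 11 && m.ecx == 12 && m.edx == 13
        && m.esi == 14 && m.edi == 15 && m.ebp == 16
        && ld(&m, old_ctx) == OK && ld(&m, old_ctx + 4) == 1
        && ld(&m, old_ctx + 24) == 6 && ld(&m, old_ctx + 28) == cur_sp;
}
```

The check illustrates why the reasoning is subtle: after the final ret, control continues at a return address taken from the new thread's stack, not from the caller's, so the routine "returns" into a different thread than the one that called it.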

52 Other Context Routines void loadcontext (mctx_t mctx); void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);

53 Thread Control Block
typedef struct mth_st *mth_t;
struct mth_st { mth_t next; mth_state_t state; struct mctx_st mctx; };
[Figure labels: mth; fields next, state, machine context; queue q of linked thread control blocks ending in NULL]
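The next pointer in the thread control block suggests that threads are kept in singly linked queues. The sketch below shows one plausible way such a ready queue could be manipulated. The queue type mth_queue_t, the functions mth_enqueue and mth_dequeue, and the enumerators of mth_state_t are hypothetical names introduced for this illustration; the slides do not show the library's actual queue code.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { MTH_READY, MTH_RUNNING, MTH_DEAD } mth_state_t;  /* assumed states */

struct mctx_st { int32_t eax, ebx, ecx, edx, esi, edi, ebp, esp; };

typedef struct mth_st *mth_t;
struct mth_st {
    mth_t next;               /* link to the next thread in the queue */
    mth_state_t state;
    struct mctx_st mctx;      /* saved machine context */
};

/* Hypothetical FIFO queue of thread control blocks. */
typedef struct { mth_t head, tail; } mth_queue_t;

void mth_enqueue(mth_queue_t *q, mth_t t) {
    t->next = NULL;
    if (q->tail) q->tail->next = t; else q->head = t;
    q->tail = t;
}

mth_t mth_dequeue(mth_queue_t *q) {
    mth_t t = q->head;
    if (t) {
        q->head = t->next;
        if (!q->head) q->tail = NULL;
        t->next = NULL;
    }
    return t;                 /* NULL when the queue is empty */
}

/* Returns 1 if threads come out in the order they were put in. */
int mth_queue_selfcheck(void) {
    struct mth_st a = {0}, b = {0};
    mth_queue_t q = { NULL, NULL };
    mth_enqueue(&q, &a);
    mth_enqueue(&q, &b);
    return mth_dequeue(&q) == &a && mth_dequeue(&q) == &b
        && mth_dequeue(&q) == NULL && q.tail == NULL;
}
```

In a verified setting, each of these pointer updates is a strong update on the heap, which is why the queue invariant is naturally stated with separation logic.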

54 Threading Invariant scheduler context mctx_sched sched st mth_cur cur ready threads mth_rq …

55 Threading Routines void mth_yield (void); mth_t mth_spawn (int stacksize, void *(*func)(void *), void *arg); void mth_scheduler (void);
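The interplay between yielding and the scheduler can be illustrated with the POSIX ucontext API (getcontext, makecontext, swapcontext from <ucontext.h>), which offers the same kind of context switch as the routines in this deck. The sketch below is not the thesis's thread library: the names worker, yield_to_scheduler and run_yield_demo, the fixed two threads, and the static stacks are all assumptions made to keep the example short, and it requires a platform that still provides ucontext (for example Linux with glibc).

```c
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)
#define NTHREADS 2

static ucontext_t sched_ctx, thread_ctx[NTHREADS];
static char stacks[NTHREADS][STACK_SIZE];
static int finished[NTHREADS];
static int current;                 /* index of the running thread */
static char trace[64];
static int trace_len;

/* Analogue of mth_yield: save this thread, resume the scheduler. */
static void yield_to_scheduler(void) {
    swapcontext(&thread_ctx[current], &sched_ctx);
}

static void worker(int id) {
    for (int step = 0; step < 3; step++) {
        trace[trace_len++] = (char)('A' + id);   /* record who ran */
        yield_to_scheduler();
    }
    finished[id] = 1;               /* returning resumes uc_link (scheduler) */
}

/* Analogue of spawn plus scheduler loop: round-robin until all are done. */
const char *run_yield_demo(void) {
    trace_len = 0;
    for (int i = 0; i < NTHREADS; i++) {
        finished[i] = 0;
        getcontext(&thread_ctx[i]);
        thread_ctx[i].uc_stack.ss_sp = stacks[i];
        thread_ctx[i].uc_stack.ss_size = STACK_SIZE;
        thread_ctx[i].uc_link = &sched_ctx;
        makecontext(&thread_ctx[i], (void (*)(void))worker, 1, i);
    }
    int remaining = NTHREADS;
    while (remaining > 0) {
        for (current = 0; current < NTHREADS; current++) {
            if (finished[current]) continue;
            swapcontext(&sched_ctx, &thread_ctx[current]);
            if (finished[current]) remaining--;
        }
    }
    trace[trace_len] = '\0';
    return trace;
}
```

Calling run_yield_demo() returns the string "ABABAB": each thread runs one step and then gives control back, which is the non-preemptive behaviour described for the mini thread library.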

56 Implementation
40,000 lines of Coq code
Where does the complexity come from?
    lemma library: large and reusable
    x86 machine: finite integer
    embedding: de Bruijn indices
    engineering: limited proof re-use
    target code: this is the kernel of software!

57 Outline Introduction The XCAP Framework Mini Thread Library Connect XCAP to TAL Conclusion

58 Typed Assembly Language TAL [Morrisett et al] Top-level typing judgment Target of type-preserving compilation For user and simple system level code

59 TAL to XCAP Translation (1) Translation of value types

60 TAL to XCAP Translation (2) Translation of preconditions Translation of code heap types Translation of data heap types

61 Typing Preservation

62 Application Scenario device driver OS kernel firmware user application library TAL XCAP

63 Outline Introduction The XCAP Framework Mini Thread Library Connect XCAP to TAL Conclusion

64 Summarizing XCAP
Supports user-level machine code: demonstrated by type-preserving translation
Supports system-level machine code: demonstrated by mini thread library
Supports modular machine code verification: as modular as types, as expressive as logic

65 Other Work A syntactic approach to FPCC [LICS’02] Simple type safety, no need of indexed model Stack-based control abstractions [PLDI’06] utilizes the fixed ECP pattern to simplify things An open framework for FPCC [TLDI’07] allows different verification styles in a system

66 Some Future Directions
Add logic power to higher level languages: C and C#, certifying compilation
Certify safe “unsafe” code: garbage collector, preemptive thread library, device driver, etc.
Consider other properties: correctness, liveness, security, etc.
Build tools for productivity: concrete syntax and parser, large lemma libraries, etc.

67 Thank You!