Ranjit Jhala, Rupak Majumdar: Bit-level Types for High-level Reasoning

Presentation transcript:


The Problem
Bit-level operators in low-level systems code
Why?
– Interact with hardware
– Reduce memory footprint
Inscrutable to humans, optimizers, verifiers
    mget(u32 p) {
      if ((p & 0x1) == 0) {
        error("permission");
      }
      pte = (p & 0xFFFFF000) >> 12;
      b = tab[pte] & 0xFFFFFFFC;
      o = p & 0xFFC;
      return m[(b + o) >> 2];
    }

What's going on?
    mget(u32 p) {
      if ((p & 0x1) == 0) {
        error("permission");
      }
      pte = (p & 0xFFFFF000) >> 12;
      b = tab[pte] & 0xFFFFFFFC;
      o = p & 0xFFC;
      return m[(b + o) >> 2];
    }
[Figure: bit layouts of the 32-bit word p and of pte, tab[pte], b, and o]

Q: How can we infer this complex information flow in order to understand, optimize, and verify the code?

Plan
Motivation
Approach

Our approach: (1) Bit-level Types
Bit-level types: sequences of {name,size} pairs
p   : {idx,20}{addr,10}{wr,1}{rd,1}
pte : {∅,12}{idx,20}
b   : {addr,30}{∅,2}
o   : {∅,20}{addr,10}{∅,2}
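A minimal sketch of how a bit-level type could be represented, assuming a plain array of {name, size} blocks listed from the most significant block down. The names Block, P_TYPE, type_width, and field_range are hypothetical and are not taken from the authors' implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One block of a bit-level type: a field name and its width in bits.
 * (Hypothetical representation, for illustration only.) */
typedef struct {
    const char *name;
    unsigned size;
} Block;

/* The type of p from the slide, most significant block first. */
static const Block P_TYPE[] = {
    {"idx", 20}, {"addr", 10}, {"wr", 1}, {"rd", 1}
};
enum { P_TYPE_LEN = sizeof P_TYPE / sizeof P_TYPE[0] };

/* Total width of a type: the sum of its block sizes. */
unsigned type_width(const Block *t, size_t n) {
    unsigned w = 0;
    for (size_t i = 0; i < n; i++) w += t[i].size;
    return w;
}

/* Find the bit range [hi:lo] occupied by a named field.
 * Returns 1 and fills hi/lo on success, 0 if the field is absent. */
int field_range(const Block *t, size_t n, const char *name,
                unsigned *hi, unsigned *lo) {
    assert(t != NULL && name != NULL && hi != NULL && lo != NULL);
    unsigned top = type_width(t, n);   /* one past the highest bit */
    for (size_t i = 0; i < n; i++) {
        if (strcmp(t[i].name, name) == 0) {
            *hi = top - 1;
            *lo = top - t[i].size;
            return 1;
        }
        top -= t[i].size;
    }
    return 0;
}
```

With this layout the four blocks of p add up to 32 bits, and addr occupies bits 11 through 2, which is the range selected by the mask 0xFFC in mget.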

Our approach: (2) Translation
Expressions → Records
Bit-ops → Field accesses
p   : {idx,20}{addr,10}{wr,1}{rd,1}
pte : {∅,12}{idx,20}
b   : {addr,30}{∅,2}
o   : {∅,20}{addr,10}{∅,2}
Each bit-level statement of mget is translated to a field access:
    if ((p & 0x1) == 0) {            →  if (p.rd == 0) {
    pte = (p & 0xFFFFF000) >> 12;    →  pte.idx = p.idx;
    b = tab[pte] & 0xFFFFFFFC;       →  b.addr = tab[pte.idx].addr;
    o = p & 0xFFC;                   →  o.addr = p.addr;
    return m[(b + o) >> 2];          →  return m[b.addr + o.addr];

Our approach
Bit-level types + translation: low-level operations eliminated
    mget(p) {
      if (p.rd == 0) {
        error("permission");
      }
      pte.idx = p.idx;
      b.addr = tab[pte.idx].addr;
      o.addr = p.addr;
      return m[b.addr + o.addr];
    }
Program can be understood, optimized, verified
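The equivalence between the bit-level code and its translation can be checked on concrete values. The sketch below is only an illustration under stated assumptions: PRecord, unpack_p, mget_bits, mget_fields, and the tiny table sizes TAB_LEN and MEM_LEN are hypothetical, error("permission") is modeled as a 0 return value, and, as on the slides, arithmetic overflow in b + o is assumed not to occur.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record for p : {idx,20}{addr,10}{wr,1}{rd,1}. */
typedef struct {
    uint32_t idx;   /* bits 31..12 */
    uint32_t addr;  /* bits 11..2  */
    uint32_t wr;    /* bit 1       */
    uint32_t rd;    /* bit 0       */
} PRecord;

/* Decode a packed word into its fields. */
PRecord unpack_p(uint32_t p) {
    PRecord r;
    r.idx  = p >> 12;
    r.addr = (p >> 2) & 0x3FFu;
    r.wr   = (p >> 1) & 0x1u;
    r.rd   = p & 0x1u;
    return r;
}

enum { TAB_LEN = 4, MEM_LEN = 16 };  /* tiny tables, for illustration */

/* Bit-level form. Returns 0 on a permission error, 1 on success. */
int mget_bits(uint32_t p, const uint32_t tab[TAB_LEN],
              const uint32_t m[MEM_LEN], uint32_t *out) {
    if ((p & 0x1u) == 0) return 0;           /* read bit is clear */
    uint32_t pte = (p & 0xFFFFF000u) >> 12;
    assert(pte < TAB_LEN);
    uint32_t b = tab[pte] & 0xFFFFFFFCu;
    uint32_t o = p & 0xFFCu;
    uint32_t i = (b + o) >> 2;
    assert(i < MEM_LEN);
    *out = m[i];
    return 1;
}

/* Translated form: the same computation over named fields. */
int mget_fields(uint32_t p, const uint32_t tab[TAB_LEN],
                const uint32_t m[MEM_LEN], uint32_t *out) {
    PRecord pr = unpack_p(p);
    if (pr.rd == 0) return 0;
    assert(pr.idx < TAB_LEN);
    uint32_t b_addr = tab[pr.idx] >> 2;      /* b.addr = tab[pte.idx].addr */
    uint32_t o_addr = pr.addr;               /* o.addr = p.addr */
    uint32_t i = b_addr + o_addr;
    assert(i < MEM_LEN);
    *out = m[i];
    return 1;
}
```

For a word with idx = 1, addr = 2, and the read bit set, both versions index the same memory cell; clearing the read bit makes both report the permission error.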

Plan
Motivation
Approach
– Bit-level types + Translation
Key: Bit-level type Inference
Experiences
Related work

Constraint-based Type Inference
Algorithm:
0. Variables for unknowns
1. Generate constraints on vars
2. Solve constraints
Remember these: If Alice doubles her age, she would still be 10 years younger than Bob, who was born in … How old are Alice and Bob?
0. Variables: Alice's age: a, Bob's age: b
1. Constraints: 2a = b − 10, b = …
2. Solution: a = 22, b = 54

Constraint-based Type Inference
Algorithm:
0. Variables for unknown bit-level types of all program expressions
1. Generate constraints on vars
2. Solve constraints

Plan
Motivation
Approach
– Bit-level types + Translation
Key: Bit-level type Inference
– Constraint Generation
– Constraint Solving
Experiences
Related work

Constraint Generation
Type variables for each expression:
p : τ_p
p&0x1 : τ_{p&0x1}
pte : τ_pte
…

Generating Zero Constraints
Mask:
τ_{p&0xFFC}[31:12] = ∅
τ_{p&0xFFC}[1:0] = ∅

Generating Zero Constraints
Shift:
τ_{e>>12}[31:20] = ∅, where e is the shifted expression (p & 0xFFFFF000)
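The zero constraints for masks and shifts follow mechanically from the constant operand. The sketch below assumes 32-bit words; Range, mask_zero_ranges, and shift_zero_range are hypothetical helper names and do not come from the authors' tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* An inclusive bit range [hi:lo]. */
typedef struct { unsigned hi, lo; } Range;

/* Collect the maximal runs of 0 bits in a 32-bit mask, highest run first.
 * Each run [hi:lo] corresponds to one zero constraint on (e & mask).
 * Returns the number of runs written to out (at most cap). */
size_t mask_zero_ranges(uint32_t mask, Range *out, size_t cap) {
    assert(out != NULL);
    size_t n = 0;
    int bit = 31;
    while (bit >= 0) {
        if ((mask >> bit) & 1u) { bit--; continue; }   /* skip 1 bits */
        int hi = bit;
        while (bit >= 0 && !((mask >> bit) & 1u)) bit--;
        if (n < cap) {
            out[n].hi = (unsigned)hi;
            out[n].lo = (unsigned)(bit + 1);
            n++;
        }
    }
    return n;
}

/* e >> k on a 32-bit word zeroes the top k bits: range [31 : 32-k].
 * Requires 1 <= k <= 31. */
Range shift_zero_range(unsigned k) {
    assert(k >= 1 && k <= 31);
    Range r = {31u, 32u - k};
    return r;
}
```

For the mask 0xFFC this yields the ranges [31:12] and [1:0], and for a shift by 12 it yields [31:20], matching the constraints on the slides.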

Why are zeros special?
Consider the assignment x = e (value flows from e to x)
Should x and e have the same bit-level type?
– Common idiom: k-bit values are a special case of (k+δ)-bit values
– Equality results in unnecessary breaks
Zeros enable precise subtyping (≤)
Inequality constraint: τ_x ≥ τ_e

Generating Inequality Constraints
Mask:
τ_{p&0xFFC}[11:2] ≥ τ_p[11:2]

Generating Inequality Constraints
Shift:
τ_{e>>12}[19:0] ≥ τ_e[31:12]
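The ranges in the shift constraint can be sanity-checked on concrete words: bits [19:0] of e >> 12 carry exactly the value of bits [31:12] of e. The following is a small sketch assuming 32-bit unsigned words; extract, Flow, and shift_right_flow are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [hi:lo] of a 32-bit word, right-aligned. */
uint32_t extract(uint32_t word, unsigned hi, unsigned lo) {
    assert(hi <= 31 && lo <= hi);
    unsigned width = hi - lo + 1;
    uint32_t mask = (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (word >> lo) & mask;
}

/* Source and destination ranges linked by one inequality constraint. */
typedef struct { unsigned dst_hi, dst_lo, src_hi, src_lo; } Flow;

/* For e >> k (1 <= k <= 31): bits [31:k] of e flow to bits [31-k:0]. */
Flow shift_right_flow(unsigned k) {
    assert(k >= 1 && k <= 31);
    Flow f = {31u - k, 0u, 31u, k};
    return f;
}
```

The same helper confirms the mask case: bits [11:2] of e & 0xFFC equal bits [11:2] of e, so the value in that range flows through unchanged.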

Generating Inequality Constraints
Assignment:
τ_o ≥ τ_{p&0xFFC}
that is, τ_o[31:0] ≥ τ_{p&0xFFC}[31:0]


Constraint Solutions
Solution is an assignment A: type variables → bit-level types
A(τ)[i:j] = subsequence of A(τ) from bit i through j
A(τ_p) = {idx,20}{addr,10}{wr,1}{rd,1}
A(τ_p)[11:1] = {addr,10}{wr,1}
A(τ_p)[31:2] = {idx,20}{addr,10}
A(τ_p)[31:5] = undefined
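Whether A(τ)[i:j] is defined comes down to whether both ends of the range fall on block boundaries. A minimal sketch of that check, assuming blocks are stored most significant first and every block has size at least 1; Block and subsequence are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    unsigned size;
} Block;

/* Check whether the range [hi:lo] is defined for type t, that is, whether
 * it covers a whole number of consecutive blocks.
 * On success returns 1 and reports the covered blocks via first/count. */
int subsequence(const Block *t, size_t n, unsigned hi, unsigned lo,
                size_t *first, size_t *count) {
    assert(t != NULL && first != NULL && count != NULL && hi >= lo);
    unsigned top = 0;                      /* one past the highest bit */
    for (size_t i = 0; i < n; i++) top += t[i].size;
    int started = 0;
    for (size_t i = 0; i < n; i++) {
        unsigned blk_hi = top - 1;
        unsigned blk_lo = top - t[i].size;
        if (!started && blk_hi == hi) {    /* range starts at this block */
            started = 1;
            *first = i;
            *count = 0;
        }
        if (started) {
            (*count)++;
            if (blk_lo == lo) return 1;    /* range ends on a boundary */
            if (blk_lo < lo) return 0;     /* lo falls inside this block */
        }
        top = blk_lo;
    }
    return 0;
}
```

For the type of p, the blocks {addr,10}{wr,1} occupy bits 11 through 1 and {idx,20}{addr,10} occupy bits 31 through 2, so both ranges are defined, while [31:5] ends in the middle of addr and is rejected.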

Constraint Solving Overview
Solution is an assignment A: type variables → bit-level types
A(τ)[i:j] = subsequence from bit i through j
A satisfies:
– zero constraint τ[i:j] = ∅, if A(τ)[i:j] = {∅, i−j+1}
– inequality constraint τ[i:j] ≤ τ'[i':j'], if A(τ)[i:j] ≤ A(τ')[i':j']
In both cases, A(τ)[i:j] must be defined

Constraint Solving Algorithm
Input: zero constraints {z_1,…,z_m}, inequality constraints {c_1,…,c_n}
Output: assignment satisfying all constraints
    A = A_0
    for i in [1…n]: A = refine(A, c_i)
    return A
A_0 = initial assignment satisfying the zero constraints (details in paper)
refine(A, c_i) adjusts A such that:
– c_i becomes satisfied
– earlier constraints stay satisfied
refine is built using Split, Unify

Refine: Split(A, τ, k)
Example: A' = Split(A, τ, 12) takes A(τ) = {p,32} to A'(τ) = {e,20}{f,12}
Throughout A, substitute {p, 12+δ} with {e, δ}{f, 12}, where e, f are fresh
and substitute {p, 12−δ} with {f, 12−δ}

Refine: Split(A, τ, k)
Used to ensure A(τ)[i:j] is defined
Example: ensure A(τ)[11:2] is defined, starting from A(τ) = {p,32}
A' = Split(A, τ, 12): A'(τ) = {e,20}{f,12}
A'' = Split(A', τ, 2): A''(τ) = {e,20}{g,10}{h,2}
A''(τ)[11:2] is defined
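A simplified sketch of the splitting step, under the assumption that names are integer ids and that only a single type is updated. The Split on the slides substitutes throughout the whole assignment A, which this sketch does not do. Block, Type, MAX_BLOCKS, and split_at are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int name; unsigned size; } Block;  /* names as integer ids */

enum { MAX_BLOCKS = 32 };
typedef struct { Block blocks[MAX_BLOCKS]; size_t len; } Type;

/* Ensure type t has a block boundary just below bit k (0 < k < width).
 * If bit k lies inside a block, that block is replaced by two blocks with
 * fresh names taken from *next_name: an upper part ending at bit k and a
 * lower part of the remaining bits.
 * Returns 1 if a block was split, 0 if the boundary already existed. */
int split_at(Type *t, unsigned k, int *next_name) {
    assert(t != NULL && next_name != NULL);
    unsigned top = 0;                       /* one past the highest bit */
    for (size_t i = 0; i < t->len; i++) top += t->blocks[i].size;
    assert(k > 0 && k < top);
    for (size_t i = 0; i < t->len; i++) {
        unsigned lo = top - t->blocks[i].size;
        if (lo == k) return 0;              /* boundary already present */
        if (lo < k) {                       /* k lies strictly inside block i */
            assert(t->len < MAX_BLOCKS);
            /* Shift later blocks right by one slot to make room. */
            for (size_t j = t->len; j > i + 1; j--)
                t->blocks[j] = t->blocks[j - 1];
            t->blocks[i].name = (*next_name)++;
            t->blocks[i].size = top - k;    /* bits [top-1 : k] */
            t->blocks[i + 1].name = (*next_name)++;
            t->blocks[i + 1].size = k - lo; /* bits [k-1 : lo] */
            t->len++;
            return 1;
        }
        top = lo;
    }
    return 0;
}
```

Starting from a single 32-bit block, splitting at 12 and then at 2 produces blocks of sizes 20, 10, and 2, which is the shape {e,20}{g,10}{h,2} needed for the range [11:2] to be defined.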

Refine: Unify(A, p, q)
Throughout A, substitute {p, δ} with {q, δ}
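Unification can be read as a renaming applied to every type in the assignment. The sketch below assumes integer name ids and an assignment stored as an array of types; Block, Type, MAX_BLOCKS, and unify are hypothetical names, and the handling of zero blocks and subtyping described in the paper is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int name; unsigned size; } Block;  /* names as integer ids */

enum { MAX_BLOCKS = 32 };
typedef struct { Block blocks[MAX_BLOCKS]; size_t len; } Type;

/* Replace name p by name q in every type of the assignment.
 * Returns the number of blocks that were renamed. */
size_t unify(Type *types, size_t ntypes, int p, int q) {
    assert(types != NULL || ntypes == 0);
    size_t renamed = 0;
    for (size_t t = 0; t < ntypes; t++) {
        for (size_t i = 0; i < types[t].len; i++) {
            if (types[t].blocks[i].name == p) {
                types[t].blocks[i].name = q;
                renamed++;
            }
        }
    }
    return renamed;
}
```

Because the renaming is applied to all types at once, any constraint that already related the old name to another block keeps holding for the new name, which is the reason the slides give for substituting throughout A.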

Refine(A, τ[31:12] ≤ τ'[19:0])
[Figure: worked example over the blocks {p,32} and {r,12}{∅,10}{q,10}. A(τ')[19:0] is undefined, so A' = Split(A, τ', 19+1); then A'' = Unify(A', q, t); A'' satisfies the constraint.]

Constraint Solving
Input: constraints
Output: assignment satisfying all constraints
    A = A_0
    for i in [1…n]: A = refine(A, c_i)
    return A
Substitution (in Split, Unify) ensures:
– earlier constraints stay satisfied
– most general solution found
Efficiently implemented using graphs


Experiences
Implemented bit-level type inference for C
– pmap: a kernel virtual memory system; implements the code for our running example
– mondrian: a memory protection system
– scull: a Linux device driver
Sizes: 1-3 Kloc
Inference/Translation takes less than 1s

Mondrian [Witchel et al.]
Bit packing for memory and permission bits
– 2600 lines of code, generated 775 constraints
– Translated to program without bit-operations
– 18 different bit-packed structures
10 assertions provided by programmer
– After translation, assertions verified using BLAST
– 6 safe: all require bit-level reasoning; previously, verification was not possible
– 4 false positives: imprecise modeling of arrays

Cop outs (i.e. Future Work)
1. Truly binary bit-vector operations
– x << y, x && y
– Currently: value-flow analysis to infer constants flowing to y; break into a switch statement
2. Flow-sensitivity
– Currently: SSA renaming
3. Arithmetic overflow: does a k-bit value "spill over"?
– Currently: assume no overflow
4. Path-sensitivity (value-dependent types)
– Type of suffix depends on value of first field
– e.g. instruction decoder for architecture simulator: number/type of operands depends on opcode


Related Work
O'Callahan and Jackson [ICSE 97]
– Type inference
Gupta et al. [POPL 03, CC 02]
– Dataflow analyses for packing bit-sections
Ramalingam et al. [POPL 99]
– Aggregate structure inference for COBOL

Conclusions
(Automatic) reasoning about bit-operations is hard
Structure: bit-operations pack data into one word
Structure inferred via bit-level type inference
Structure exploited via translation to fields
Precise, efficient reasoning about bit-operations

Thank you

Previous approaches model bitwise ops by:
1. Uninterpreted functions: imprecise
2. Logical axioms: inefficient
3. Bit-blasting terms into 32/64 bits: lose high-level relationships
Q: How to infer complex information flow to understand, optimize, verify code?

Refine
Two basic operations: split, unify
Split(A, τ, [i:j]): ensures A(τ)[i:j] is defined
Example: Split(A, τ, [11:2])
– A' = in A, substitute {p, δ+(11+1)} with {e, δ}{f, 11+1}, where e, f are fresh; A'(τ) = {e,20}{f,12}
– A'' = in A', substitute {f, δ+2} with {g, δ}{h, 2}, where g, h are fresh; A''(τ) = {e,20}{g,10}{h,2}

Generating Zero Constraints
Mask: all but the 1st bit are zero
τ_{p&0x1}[31:1] = ∅
