New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security Michael Franz University of California at Irvine UC Irvine – project.


New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security
Michael Franz
University of California at Irvine
UC Irvine – project transprose: transporting programs securely

Technical Objective (1)
• design the third line of defense in a mobile-code system
  – first line of defense: access control (physical, logical), against intrusion
  – second line of defense: authentication, against false authentication
  – new third line of defense: against a malicious mobile program, prevent execution unless provably secure

Technical Objective (2)
• make this “third line of defense” a pervasive property of every computer system, not just a luxury good afforded by only a few expensive ultra-secure high-end installations
• rather than simply demonstrating the viability of mobile-code security, also make it practical across a wide spectrum of applications
• in this context, practical means scalable to large applications, with excellent final code quality, at reasonable just-in-time compilation speed and cost

Existing Practice: Java
• “Java” is the de-facto standard format for distributing mobile programs
• when we speak of “distributing mobile programs using Java”, we in fact usually mean “using the Java Virtual Machine”
• the JVM has an instruction set that has been designed specifically for representing Java programs
  – interestingly enough, there still are JVM programs for which no legal equivalent Java source program exists

Existing Practice: Java Security
• although the Java programming environment is type-safe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit

    class MyLibrary { public void NoSecret(); private void ASecret(); }
    class MyClient { import MyLibrary; {... MyLibrary.NoSecret(); ...} }

  JVM-code stream: ... call MyLibrary.NoSecret ...

Existing Practice: Java Security
• although the Java programming environment is type-safe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit

    class MyLibrary { public void NoSecret(); private void ASecret(); }
    class MyClient { import MyLibrary; {... MyLibrary.NoSecret(); ...} }

  corrupted JVM-code stream: ... call MyLibrary.ASecret ...
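The re-check on arrival can be pictured with a deliberately tiny sketch. It assumes a textual instruction stream and a hard-coded list of exported members. The class name `CallTargetCheck`, the `call` mnemonic, and the `EXPORTED` set are hypothetical and are not how the JVM verifier is implemented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy check, not the real JVM verifier.
final class CallTargetCheck {
    // Members that MyLibrary exports; ASecret is private and therefore absent.
    static final Set<String> EXPORTED =
            new HashSet<>(Arrays.asList("MyLibrary.NoSecret"));

    // Returns true if the instruction may run, false if it must be rejected.
    static boolean accept(String instruction) {
        String prefix = "call ";
        if (!instruction.startsWith(prefix)) {
            return true; // this toy check only constrains call instructions
        }
        String target = instruction.substring(prefix.length()).trim();
        return EXPORTED.contains(target);
    }

    public static void main(String[] args) {
        System.out.println(accept("call MyLibrary.NoSecret"));
        System.out.println(accept("call MyLibrary.ASecret"));
    }
}
```

The point of the sketch is only that the receiver cannot rely on the sender's compiler having enforced `private`: the check has to be repeated on the received stream.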

Existing Practice: Java Security
• Java’s byte-code security model requires time-consuming static verification and/or dynamic checking while the code is being executed
• systematic study of security issues is still in its infancy

  IF ... THEN ... ELSE MyLibrary.ASecret()

Existing Practice: JVM Performance
• upon arrival at a target machine, most JVM code is translated into the appropriate native code “just-in-time”
• performance resulting from “just-in-time” compilation is not competitive with off-line compilers
  – compilation systems such as Sun’s HotSpot are incredibly complex and haven’t delivered on their promise
• JVM approach is unlikely to scale to large programs requiring top-level performance

Raising JVM Performance
• raising the performance of JVM-code has been addressed by “annotating” the byte-code stream with compiler back-end related information
• “annotated” class-files run much faster if an annotation-aware byte-code compiler is available on the target platform
• security is lost: the “annotations” are not optional to the annotation-aware compiler; if an adversary falsifies them, the compiler will create a program that may be unsafe!

Emerging Practice: PCC
• ship a native program along with a “proof” that it doesn’t violate a given security policy
• although more general security policies are imaginable, current PCC systems essentially use type safety (and concomitant memory safety) as their security policy (our approach does the same)
• PCC drastically reduces the size of the trusted computing base

Emerging Practice: PCC – Problems
• PCC is based on native code
  – (otherwise the trusted computing base would become larger again, defeating the main advantage of PCC)
• PCC has the performance advantages of fully optimized code, but requires multiple versions for multiple platforms
• also, in the long run, dynamically generated code (using feedback from dynamic profiling) will generally outperform native code

Our Technical Approach
• study the interaction of security-related information, optimization-enhancing information, and compression, rather than considering them separately
  – use syntax-directed compression as a means of obtaining guaranteed referential integrity
  – transport compiler-related annotations to obtain top-level performance on the eventual target machine
  – use a proof-based approach to guard the compiler-related annotations from falsification in transit

Our Technical Approach (2)
• no single focus on security, code-quality, or encoding density, but attempt to study their interaction and make progress along all three dimensions
• preliminary evidence suggests that these three topics are strongly interrelated and that representations based on adaptive compression of syntax trees are ideally suited for transporting mobile programs
• this research is orthogonal and complementary to work on authentication and security policies

Our Policy Assumptions
• type safety using the typing model of the source language
  – all of the host’s library routines are guaranteed to be called with parameters of the correct types
  – capabilities (object pointers) owned by the host can be manipulated by the mobile client application only as specified in the host’s interface definition (private, protected, …) and cannot be forged
• type safety is guaranteed by our mobile code transportation scheme

Compression vs. Security
• code compression and security may often be complementary
• idea: choose an encoding that can express only legal programs
• example:

    int i, j, k, l; float r, s, t, u;
    { i = j }

    operator: :=
    first operand (1 out of 8): i
    second operand (1 out of 3 or 4!): j

• higher-level encodings: enumerate all legal assignments = at most $2 \cdot \binom{n}{2}$ possibilities
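A minimal sketch of the "only legal programs" idea, using the eight variables from the example. It assumes that only same-type assignments are legal and that self-assignment is allowed, so the second operand is always 1 out of 4. The class `LegalAssignmentCodec` and its methods are hypothetical names, not part of the project's encoder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; the same-type-only rule is an assumption.
final class LegalAssignmentCodec {
    // Declared variables in declaration order, as in the example.
    static final String[] NAMES = {"i", "j", "k", "l", "r", "s", "t", "u"};
    static final String[] TYPES =
            {"int", "int", "int", "int", "float", "float", "float", "float"};

    // Indices of all variables whose type equals the destination's type.
    static List<Integer> candidates(int dst) {
        List<Integer> result = new ArrayList<>();
        for (int v = 0; v < NAMES.length; v++) {
            if (TYPES[v].equals(TYPES[dst])) {
                result.add(v);
            }
        }
        return result;
    }

    static int indexOf(String name) {
        for (int v = 0; v < NAMES.length; v++) {
            if (NAMES[v].equals(name)) {
                return v;
            }
        }
        throw new IllegalArgumentException("undeclared variable: " + name);
    }

    // Encode "dst = src" as {index of dst, rank of src among the candidates}.
    static int[] encode(String dst, String src) {
        int d = indexOf(dst);
        int rank = candidates(d).indexOf(indexOf(src));
        if (rank < 0) {
            throw new IllegalArgumentException(
                    "ill-typed assignment: " + dst + " = " + src);
        }
        return new int[] {d, rank};
    }

    // Every in-range code decodes to a well-typed assignment.
    static String decode(int[] code) {
        int d = code[0];
        int s = candidates(d).get(code[1]);
        return NAMES[d] + " = " + NAMES[s];
    }

    public static void main(String[] args) {
        System.out.println(decode(encode("i", "j")));
    }
}
```

Under these assumptions the code space is 8 × 4 pairs, and none of them denotes an ill-typed assignment such as `i = r`. An adversary who flips bits within range can change which legal program is transmitted, but cannot produce an illegal one.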

Virtual Machines vs. Graphs
• information is lost when compiling to the “flat” representation of virtual machines
• many native code optimizations require this information to be re-discovered

[Figure: Virtual Machine Representation (flat code, e.g. “bra +2 ...”) next to Graph-Based Representation]
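The loss of information can be seen in a small sketch that flattens a conditional from tree form into stack-machine-style code. The node classes and the mnemonics `push`, `brz`, and `bra` are assumptions chosen for illustration; they are neither JVM instructions nor the project's format.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: node classes and mnemonics are assumptions.
final class FlattenDemo {
    interface Node {}

    static final class Const implements Node {
        final int value;
        Const(int value) { this.value = value; }
    }

    static final class If implements Node {
        final Node cond;
        final Node thenBranch;
        final Node elseBranch;
        If(Node cond, Node thenBranch, Node elseBranch) {
            this.cond = cond;
            this.thenBranch = thenBranch;
            this.elseBranch = elseBranch;
        }
    }

    // Turn the tree into a flat instruction list with relative branches.
    static List<String> flatten(Node root) {
        List<String> out = new ArrayList<>();
        emit(root, out);
        return out;
    }

    private static void emit(Node n, List<String> out) {
        if (n instanceof Const) {
            out.add("push " + ((Const) n).value);
        } else if (n instanceof If) {
            If i = (If) n;
            emit(i.cond, out);
            int brz = out.size();
            out.add(null); // placeholder, patched once the else target is known
            emit(i.thenBranch, out);
            int bra = out.size();
            out.add(null); // placeholder, patched once the end is known
            out.set(brz, "brz +" + (out.size() - brz));
            emit(i.elseBranch, out);
            out.set(bra, "bra +" + (out.size() - bra));
        } else {
            throw new IllegalArgumentException("unknown node type");
        }
    }

    public static void main(String[] args) {
        Node tree = new If(new Const(1), new Const(10), new Const(20));
        System.out.println(flatten(tree));
    }
}
```

In the tree, the extent of each branch is explicit. In the flat list it survives only as branch offsets, so a consumer that wants to optimize must first rebuild the control structure from those offsets.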

Performance-Enhancing Information
• compiler-related information intended for improving code quality re-introduces redundancy that can be exploited by an adversary
• for example, a program can be encoded with guaranteed referential integrity using a grammar close to the semantics of the source language
• but in order to allow optimizations, the grammar needs to be relaxed
• the “holes” in the relaxed grammar need to be guarded by other means based on proof-carrying code concepts

General Approach Taken
• use encoding-inherent security wherever possible (a well-formedness property of the encoding itself)
• use proof-based security where necessary to support optimizations
  – transporting results of alias analysis
  – removing range or type checks
• this approach applies regardless of the semantic level on which the program is being transported
• but the correct choice of such a semantic level must also be considered!
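As a toy stand-in for the proof-based part, the sketch below shows a code consumer that re-validates a shipped claim before acting on it. It assumes an annotation claiming that all indices used by a loop lie in a half-open range. The names `GuardedAnnotationDemo`, `IndexRangeClaim`, `holds`, and `selectCodePath` are hypothetical, and a real proof checker would be far richer than this single comparison.

```java
// Hypothetical toy example; not the project's proof format or checker.
final class GuardedAnnotationDemo {
    // Claim shipped with the code: every index i used by the loop satisfies lo <= i < hi.
    static final class IndexRangeClaim {
        final int lo;
        final int hi;
        IndexRangeClaim(int lo, int hi) {
            this.lo = lo;
            this.hi = hi;
        }
    }

    // The consumer checks the claim against the actual array length
    // instead of trusting the annotation.
    static boolean holds(IndexRangeClaim claim, int arrayLength) {
        return claim.lo >= 0 && claim.lo <= claim.hi && claim.hi <= arrayLength;
    }

    // Range checks are dropped only when the claim has been validated.
    static String selectCodePath(IndexRangeClaim claim, int arrayLength) {
        return holds(claim, arrayLength) ? "unchecked" : "checked";
    }

    public static void main(String[] args) {
        System.out.println(selectCodePath(new IndexRangeClaim(0, 8), 8));
        System.out.println(selectCodePath(new IndexRangeClaim(0, 9), 8));
    }
}
```

A falsified annotation therefore costs performance, because the consumer falls back to the checked path, but it does not cost safety.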

Highest-Level Encoding
• simple and easily understood security policy based on type-safety
• ultra-compact representation using grammar-based compression
• guaranteed referential integrity provided essentially “for free” by the encoding
  – relatively small amount of proof-based security required only for additional performance-enhancing annotations
  – e.g., exceptions, alias analysis, escape analysis, dynamic type safety
• time required for dynamic compilation may be a problem

Project Workflow: “High Level Thread”
[Workflow diagram with three stages: Compression, Encoding, Proofs. Components shown: P2K, Java abstract grammar, JAG, annotated JAG, efficient annotated JAG, arithmetic encoding, dictionary encoding, combination heuristics, theorem prover, (enhanced) static semantics.]
1. guarantee complete static semantics through encoding
2. compression of Java programs
3. reduced verification effort due to abstract grammar encoding well-formedness

Lowest-Level Encoding
• compiler-oriented intermediate representation
• goal is to provide much better code quality with far less effort at the code consumer’s site
• requires more proof-based security than the “high-level” approach, but still far less than the “original PCC idea” where the goal is to reduce the TCB
• more voluminous transportation format
• could be more difficult to reason about safety because further removed from the source language

Project Workflow: “Low Level Thread”
[Workflow diagram with three stages: Compression, Encoding, Proofs. Components shown: typed SSA, annotated typed SSA, secure annotated typed SSA, SSA-directed encoding, annotation encoding, theorem prover.]
1. universal (source-language neutral) abstract syntax tree representation
2. UAST after performing all target-machine independent optimizations
3. encoding for the proofs required to guard the TASSA
4. provably secure target-machine independent low-level representation

Third Way: Core Calculus
• two-stage mapping of the mobile code
  – source constructs are mapped to the core calculus
  – mapping may be transported as well, or assumed to be globally shared knowledge
• simple and easily understood security policy
• only approach that is easily extensible even by third parties
• not clear if this approach will yield adequate native code quality at the consumer’s site
• the relative trade-offs are as yet unknown

Current Status and Rationale
• developed a comprehensive library of stream compressors in Java
• “high-level” encoding prototype is up and running
  – working on a contribution to PLDI 2001 on Java compression
• “low-level” encoding and “core calculus” prototypes will be operational over the summer
• the relative trade-offs (encoding density vs. decoding/dynamic compilation speed vs. code quality) can only be determined by collecting experience with actual prototypes

Quantitative Metrics
• security
  – publish complete design specification and rationale and open the design to public scrutiny and external validation
• efficiency
  – measure by comparing generated code quality with that of existing on-the-fly compilers
• code density
  – measure by comparing with competing proof-carrying code and mobile-code distribution formats

Expected Major Achievements
• demonstrate that graph-based encoding formats are superior to virtual machines
• explore the relative trade-offs that can only be determined by building an actual prototype
  – encoding density/network transfer speed vs.
  – decoding/dynamic compilation speed vs.
  – code quality, especially when using the core-calculus approach
• publish a design rationale that can form the basis of a subsequent standardization effort

Long-Term Impact
• enable an educated choice of a replacement technology at the end of the Java Virtual Machine’s life-cycle
• royalty-free and free of particular proprietary intellectual property claims
• developed under the scrutiny of and in dialogue with the security community

Task Schedule
• Y1 Milestones:
  – source-level representation => Java compression
  – low-level representation
  – core calculus representation
• Y2 Milestones:
  – 3 system prototypes
  – trade-off analysis
  – encoding format comprehensive definition
• End of Project:
  – system deliverable
  – comprehensive documentation
• investigate:
  – multiple source languages
  – graph-based encoding schemes
  – proof-carrying code
• investigate:
  – requirements of optimizing code generators
  – integration of security vs. compiler-related data
• investigate:
  – mutual interaction of security, efficiency, and compression density
  – security of system

Transition of Technology
• the final design rationale document will provide enough detail that unrelated third parties will be able to replicate our code-transportation scheme(s)
• our prototype implementation(s) will be made available in source form
• the graduate students involved in this work are likely to transfer into the industrial sector

Thank You