New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security
Michael Franz
University of California at Irvine
UC Irvine – project transprose: transporting programs securely

Technical Objective (1)
design the third line of defense in a mobile-code system
first line of defense: access control (physical, logical); threat: intrusion
second line of defense: authentication; threat: false authentication
new third line of defense: prevent execution unless provably secure; threat: malicious mobile program

Technical Objective (2)
make this “third line of defense” a pervasive property of every computer system, not just a luxury good afforded by only a few expensive ultra-secure high-end installations
rather than simply demonstrating the viability of mobile-code security, also make it practical across a wide spectrum of applications
in this context, practical means scalable to large applications, with excellent final code quality, at reasonable just-in-time compilation speed and cost

Existing Practice: Java
“Java” is the de facto standard format for distributing mobile programs
when we speak of “distributing mobile programs using Java”, we in fact usually mean “using the Java Virtual Machine”
the JVM has an instruction set that has been designed specifically for representing Java programs
  –interestingly enough, there are still JVM programs for which no legal equivalent Java source program exists

Existing Practice: Java Security
although the Java programming environment is type-safe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit
    class MyLibrary { public void NoSecret(); private void ASecret(); }
    class MyClient { import MyLibrary; {...MyLibrary.NoSecret();…} }
JVM-code stream: call MyLibrary.NoSecret...
corrupted JVM-code stream: call MyLibrary.ASecret...

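The re-check on arrival can be illustrated with a minimal Python sketch. It is not the JVM verifier: the visibility table, the `(caller, callee class, method)` call records, and the function names are all hypothetical, chosen only to show why the corrupted `call MyLibrary.ASecret` has to be rejected at the receiving end.

```python
# Hypothetical visibility table for the two methods on the slide.
VISIBILITY = {
    ("MyLibrary", "NoSecret"): "public",
    ("MyLibrary", "ASecret"): "private",
}

def check_call(caller_class, callee_class, method):
    """Return True if the call respects the declared visibility."""
    visibility = VISIBILITY.get((callee_class, method))
    if visibility is None:
        return False  # unknown target: reject
    if visibility == "public":
        return True
    # private members may only be called from inside their own class
    return caller_class == callee_class

def verify_stream(calls):
    """Accept a code stream only if every call in it is legal."""
    return all(check_call(*call) for call in calls)

# Assumed encoding of the two streams shown on the slide.
intact = [("MyClient", "MyLibrary", "NoSecret")]
corrupted = [("MyClient", "MyLibrary", "ASecret")]
```

Here `verify_stream(intact)` accepts the stream and `verify_stream(corrupted)` rejects it, even though the source compiler would never have emitted the second stream.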
Existing Practice: Java Security
Java’s byte-code security model requires time-consuming static verification and/or dynamic checking while the code is being executed
systematic study of security issues is still in its infancy
example: IF THEN... ELSE MyLibrary.ASecret()

Existing Practice: JVM Performance
upon arrival at a target machine, most JVM code is translated into the appropriate native code “just-in-time”
performance resulting from “just-in-time” compilation is not competitive with off-line compilers
  –compilation systems such as Sun’s HotSpot are incredibly complex and haven’t delivered on their promise
JVM approach is unlikely to scale to large programs requiring top-level performance

Raising JVM Performance
raising the performance of JVM-code has been addressed by “annotating” the byte-code stream with compiler back-end related information
“annotated” class-files run much faster if an annotation-aware byte-code compiler is available on the target platform
security is lost: the “annotations” are not optional to the annotation-aware compiler; if an adversary falsifies them, the compiler will create a program that may be unsafe!

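The danger of a falsified annotation can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the flat `MEMORY` list stands in for the host's address space, and `in_range` is a hypothetical annotation telling the compiler that a bounds check may be omitted.

```python
# Cells 0..3 hold the array; cell 4 holds unrelated host data.
MEMORY = [10, 20, 30, 40, 999]
ARRAY_LEN = 4

def compile_load(index, annotations):
    """Return a function that loads array[index].

    The hypothetical annotation "in_range" claims the index lies inside
    the array, so an annotation-aware compiler omits the check.
    """
    if "in_range" in annotations:
        return lambda: MEMORY[index]  # unchecked fast path, trusts the claim

    def checked():
        if not 0 <= index < ARRAY_LEN:
            raise IndexError("array index out of range")
        return MEMORY[index]

    return checked

honest = compile_load(2, {"in_range"})     # claim is true
falsified = compile_load(4, {"in_range"})  # claim is a lie
unannotated = compile_load(4, set())       # no claim, check is kept
```

Calling `falsified()` reads the cell past the end of the array, while `unannotated()` raises `IndexError`: the annotation alone decides whether the program is safe.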
Emerging Practice: PCC
ship a native program along with a “proof” that it doesn’t violate a given security policy
although more general security policies are imaginable, current PCC systems essentially use type safety (and concomitant memory safety) as their security policy (our approach does the same)
PCC drastically reduces the size of the trusted computing base

Emerging Practice: PCC - Problems
PCC is based on native code
  –(otherwise the trusted computing base would become larger again, defeating the main advantage of PCC)
PCC has the performance advantages of fully optimized code, but requires multiple versions for multiple platforms
also, in the long run, dynamically generated code (using feedback from dynamic profiling) will generally outperform native code

Our Technical Approach
study the interaction of security-related information, optimization-enhancing information, and compression, rather than considering them separately
  –use syntax-directed compression as a means of obtaining guaranteed referential integrity
  –transport compiler-related annotations to obtain top-level performance on the eventual target machine
  –use a proof-based approach to guard the compiler-related annotations from falsification in transit

Our Technical Approach (2)
no single focus on security, code-quality, or encoding density, but attempt to study their interaction and make progress along all three dimensions
preliminary evidence suggests that these three topics are strongly interrelated and that representations based on adaptive compression of syntax trees are ideally suited for transporting mobile programs
this research is orthogonal and complementary to work on authentication and security policies

Our Policy Assumptions
type safety using the typing model of the source language
  –all of the host’s library routines are guaranteed to be called with parameters of the correct types
  –capabilities (object pointers) owned by the host can be manipulated by the mobile client application only as specified in the host’s interface definition (private, protected, …) and cannot be forged
type safety is guaranteed by our mobile code transportation scheme

Compression vs. Security
code compression and security may often be complementary
idea: choose an encoding that can express only legal programs
example: int i, j, k, l; float r, s, t, u; { i = j }
  –operator: :=
  –first operand (1 out of 8): i
  –second operand (1 out of 3 or 4!): j
higher-level encodings: enumerate all legal assignments = at most $2 \cdot (n/2)^2$ possibilities, where n is the number of declared variables (here n = 8)

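The idea of an encoding that can express only legal programs can be sketched in Python for the slide's example. This is an illustration under stated assumptions, not the project's actual format: only same-type assignments are treated as legal (no int-to-float conversion), self-assignment is allowed (the "4" in "3 or 4"), and the function names are hypothetical.

```python
import math

# Declarations from the slide: four int and four float variables.
DECLS = {"i": "int", "j": "int", "k": "int", "l": "int",
         "r": "float", "s": "float", "t": "float", "u": "float"}
NAMES = list(DECLS)

def candidates(target):
    """Variables that may legally appear on the right of `target = ...`."""
    return [v for v in NAMES if DECLS[v] == DECLS[target]]

def encode_assign(target, source):
    """Encode `target = source` as (1 out of 8, 1 out of 4).

    Raises ValueError for an ill-typed assignment: it has no code at all.
    """
    return NAMES.index(target), candidates(target).index(source)

def decode_assign(code):
    """Every in-range code decodes to a well-typed assignment."""
    first, second = code
    target = NAMES[first]
    return target, candidates(target)[second]

def legal_assignments():
    """Higher-level encoding: enumerate all legal assignments."""
    return [(t, s) for t in NAMES for s in candidates(t)]

# 2 * (8/2)^2 = 32 legal assignments, so 5 bits suffice,
# versus 3 + 3 = 6 bits for two unconstrained variable indices.
BITS_ENUMERATED = math.log2(len(legal_assignments()))
```

Because the second operand is an index into the set of type-compatible variables, a corrupted stream can change which legal assignment is decoded, but it cannot produce `i = r`.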
Virtual Machines vs. Graphs
information is lost when compiling to the “flat” representation of virtual machines
many native code optimizations require this information to be re-discovered
[Figure: Virtual Machine Representation (bra +2...) vs. Graph-Based Representation]

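The loss of information can be made concrete with a small Python sketch. The tuple-based tree and the stack-machine instruction names (`load`, `store`, `lt`, `brfalse`, `bra`) are hypothetical and are not JVM byte-code; the point is only that the explicit then/else structure of the tree turns into relative branch offsets.

```python
# A tiny structured program: if (a < b) x = a else x = b
TREE = ("if", ("lt", "a", "b"),
        ("assign", "x", "a"),
        ("assign", "x", "b"))

def flatten(node):
    """Lower the tree to a flat list of stack-machine instructions."""
    kind = node[0]
    if kind == "lt":
        return [("load", node[1]), ("load", node[2]), ("lt",)]
    if kind == "assign":
        return [("load", node[2]), ("store", node[1])]
    if kind == "if":
        cond = flatten(node[1])
        then = flatten(node[2])
        other = flatten(node[3])
        # relative branches replace the explicit then/else structure
        return (cond + [("brfalse", len(then) + 1)] + then
                + [("bra", len(other))] + other)
    raise ValueError(f"unknown node kind: {kind}")

FLAT = flatten(TREE)
```

In `TREE` the two arms of the conditional are directly visible; in `FLAT` a code generator has to reconstruct them from the `brfalse` and `bra` offsets before it can apply most optimizations.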
Performance-Enhancing Information
compiler-related information intended for improving code quality re-introduces redundancy that can be exploited by an adversary
for example, a program can be encoded with guaranteed referential integrity using a grammar close to the semantics of the source language
but in order to allow optimizations, the grammar needs to be relaxed
the “holes” in the relaxed grammar need to be guarded by other means based on proof-carrying code concepts

General Approach Taken
use encoding-inherent security wherever possible (a well-formedness property of the encoding itself)
use proof-based security where necessary to support optimizations
  –transporting results of alias analysis
  –removing range or type checks
this approach applies regardless of the semantic level on which the program is being transported
but the correct choice of such a semantic level must also be considered!

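As a loose illustration of guarding a "remove the range check" annotation, the Python sketch below validates the claim before acting on it. It is a simplified stand-in for a proof check, not a proof-carrying code implementation: the `(start, stop)` loop description and both function names are hypothetical.

```python
def check_range_claim(loop, array_len):
    """Validate the claim that every index the loop produces is in bounds.

    `loop` is a hypothetical (start, stop) pair describing
    `for i in range(start, stop)`; an empty loop is trivially safe.
    """
    start, stop = loop
    return start >= stop or (0 <= start and stop <= array_len)

def compile_loop(loop, array_len, claims_in_range):
    """Choose the unchecked loop body only when the claim was validated."""
    if claims_in_range and check_range_claim(loop, array_len):
        return "unchecked"
    return "checked"
```

With this shape, a falsified claim such as `compile_loop((0, 5), 4, True)` costs only performance: the consumer falls back to the checked body instead of producing unsafe code.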
Highest-Level Encoding
simple and easily understood security policy based on type-safety
ultra-compact representation using grammar-based compression
guaranteed referential integrity provided essentially “for free” by the encoding
  –relatively small amount of proof-based security required only for additional performance-enhancing annotations
  –e.g., exceptions, alias analysis, escape analysis, dynamic type safety
time required for dynamic compilation may be a problem

Project Workflow “High Level Thread”
[Workflow diagram with three lanes: Encoding, Compression, Proofs. Boxes: P2K, Java abstract grammar, JAG, annotated JAG, efficient annotated JAG, arithmetic encoding, dictionary encoding, combination heuristics, theorem prover, (enhanced) static semantics, well formedness]
1. guarantee complete static semantics through encoding
2. compression of Java programs
3. reduced verification effort due to abstract grammar encoding

Lowest-Level Encoding
compiler-oriented intermediate representation
goal is to provide much better code quality with far less effort at the code consumer’s site
requires more proof-based security than the “high-level” approach, but still far less than the “original PCC idea” where the goal is to reduce the TCB
more voluminous transportation format
could be more difficult to reason about safety because further removed from the source language

Project Workflow “Low Level Thread”
[Workflow diagram with three lanes: Encoding, Compression, Proofs. Boxes: typed SSA, annotated typed SSA, secure annotated typed SSA, SSA-directed encoding, annotation encoding, theorem prover]
1. universal (source-language neutral) abstract syntax tree representation
2. UAST after performing all target-machine independent optimizations
3. encoding for the proofs required to guard the TASSA
4. provably secure target-machine independent low-level representation

Third Way: Core Calculus
two-stage mapping of the mobile code
  –source constructs are mapped to the core calculus
  –mapping may be transported as well, or assumed global shared knowledge
simple and easily understood security policy
only approach that is easily extensible even by third parties
not clear if this approach will yield adequate native code quality at the consumer’s site
the relative trade-offs are as yet unknown

Current Status and Rationale
developed a comprehensive library of stream compressors in Java
“high-level” encoding prototype is up and running
  –working on a contribution to PLDI 2001 on Java compression
“low-level” encoding and “core calculus” prototypes will be operational over the summer
the relative trade-offs (encoding density vs. decoding/dynamic compilation speed vs. code quality) can only be determined by collecting experience with actual prototypes

Quantitative Metrics
security
  –publish complete design specification and rationale and open the design to public scrutiny and external validation
efficiency
  –measure by comparing generated code quality with that of existing on-the-fly compilers
code density
  –measure by comparing with competing proof-carrying code and mobile-code distribution formats

Expected Major Achievements
demonstrate that graph-based encoding formats are superior to virtual machines
explore the relative trade-offs that can only be determined by building an actual prototype
  –encoding density/network transfer speed vs.
  –decoding/dynamic compilation speed vs.
  –code quality, especially when using the core-calculus approach
publish a design rationale that can form the basis of a subsequent standardization effort

Long-Term Impact
enable an educated choice of a replacement technology at the end of the Java Virtual Machine’s life-cycle
royalty-free and free of particular proprietary intellectual property claims
developed under the scrutiny of and in dialogue with the security community

Task Schedule
Y1 Milestones:
  –source-level representation => Java compression
  –low-level representation
  –core calculus representation
Y2 Milestones:
  –3 system prototypes
  –trade-off analysis
  –encoding format comprehensive definition
End of Project:
  –system deliverable
  –comprehensive documentation
investigate:
  –multiple source languages
  –graph-based encoding schemes
  –proof-carrying code
investigate:
  –requirements of optimizing code generators
  –integration of security vs. compiler-related data
investigate:
  –mutual interaction of security, efficiency, and compression density
  –security of system

Transition of Technology
the final design rationale document will provide enough detail that unrelated third parties will be able to replicate our code-transportation scheme(s)
our prototype implementation(s) will be made available in source form
the graduate students involved in this work are likely to transfer into the industrial sector

Thank You