Scalable Certification for Typed Assembly Language Dan Grossman (with Greg Morrisett) Cornell University 2000 ACM SIGPLAN Workshop on Types in Compilation.

1 Scalable Certification for Typed Assembly Language Dan Grossman (with Greg Morrisett) Cornell University 2000 ACM SIGPLAN Workshop on Types in Compilation AFTER

2 September 2000 TIC00 Montreal 2 Types After Compilation -- Why?
Verifying object code is “well-behaved” means we needn’t trust the code producer
Producer-supplied types guide verification
Encourages compiler robustness
Promises efficient untrusted plug-ins
To maximize benefit, we want...

3 Certified Code Design Goals
Low-level target language
  avoids performance / trusted computing base trade-off
Source-language & compiler independent
  avoids hacks, promotes re-use, the object-code way
Permit efficient object code
  otherwise, just interpret or monitor at run time
Small Certificates and Fast Verification
  otherwise, only small programs are possible
Still learning how to balance these needs in practice

4 State of the Art

5 Scalable Certification in 15 mins
Classification of Approaches
Why Compiler Independence Makes Scalability Harder
Techniques that Make TAL Work
Experimental Results
Summary of some lessons learned
See the paper for much, much more

6 Approach #1 -- Bake It In
If you allow only one way, no annotations needed and it’s trivial to check
Examples:
  Grouping code into procedures
  Function prologues
  Installing exception handlers
The type system is at a different level of abstraction
An analogy: RISC vs. CISC

7 Approach #2 -- Don’t Optimize
Optimizations that are expensive to prove safe are expensive to certify
Examples:
  Dynamic type tests
  Arithmetic (division by zero, array-bounds elimination)
  Memory initialized before use
Better code can make a system look worse
A new factor for where to optimize?

8 Approach #3 -- Reconstruct
Don’t write down what the verifier can easily determine
Examples:
  Don’t put types on every instruction/operand
  Omit proof steps where inversion suffices
  Re-verify target code at each “call” site (virtual inlining)
Can trade time for space or get a win/win
Analogy: source-level type inference w/o the human factor
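The first example on this slide (no types on every instruction or operand) can be illustrated with a toy checker. This is only a sketch under loud assumptions: the two-instruction language, the single `int` type, and the name `infer_types` are all hypothetical and far simpler than TAL. It shows the idea that a block's pre-condition is enough for the verifier to recompute every intermediate register type itself.

```python
def infer_types(precondition, instructions):
    """Recompute register types through a straight-line block.

    precondition: dict mapping register name to type name (hypothetical format).
    instructions: list of (opcode, destination, source) tuples, where the
    source is either a register name or an integer immediate.
    Returns the register typing that holds after the last instruction.
    """
    regs = dict(precondition)  # copy so the caller's pre-condition is untouched
    for op, dst, src in instructions:
        if isinstance(src, int):
            src_type = "int"           # immediates are integers in this toy model
        elif src in regs:
            src_type = regs[src]       # type is known from earlier inference
        else:
            raise TypeError(f"{src} is read before it is initialized")
        if op == "MOV":
            regs[dst] = src_type       # destination takes the source's type
        elif op == "ADD":
            if regs.get(dst) != "int" or src_type != "int":
                raise TypeError(f"ADD needs int operands, got {dst} and {src}")
        else:
            raise ValueError(f"unknown opcode {op}")
    return regs


# Only the block entry carries an annotation; nothing is written per instruction.
block = [("MOV", "EBX", "EAX"), ("ADD", "EBX", 1)]
final_types = infer_types({"EAX": "int"}, block)
```

The producer pays nothing for the intermediate types; the verifier pays one pass over the block.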

9 Approach #4 -- Compress
Let gzip and domain-specific tricks solve our problems
For annotation size, no reason not to compress
Easy to pipeline decompression, but certification is not I/O bound
Then again, object code compresses too
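To see why repetitious annotations compress well, here is a small sketch using Python's standard `gzip` module. The annotation text is modeled on the register-file types shown later in the talk; the label names, the number of copies, and the helper `compression_report` are assumptions made for illustration, not measurements from the TAL implementation.

```python
import gzip

# Hypothetical annotation text, repeated once per label as a compiler might emit it.
ANNOTATION = ("{ESP:int::r1@{EAX:exn,ESP:r2,M:e2}::r2, "
              "EBP:{EAX:exn,ESP:r2,M:e2}::r2, EBX:a, ESI:b, EDI:c, M:e1+e2}")


def compression_report(annotation, copies):
    """Return (raw size, gzip size) in bytes for `copies` labelled annotations."""
    lines = [f"label{i}: {annotation}" for i in range(copies)]
    raw = "\n".join(lines).encode("ascii")
    packed = gzip.compress(raw, compresslevel=9)
    # Round-trip check: decompression must give back the exact annotations.
    if gzip.decompress(packed) != raw:
        raise RuntimeError("gzip round trip failed")
    return len(raw), len(packed)


raw_size, packed_size = compression_report(ANNOTATION, 200)
print(f"raw: {raw_size} bytes, gzip: {packed_size} bytes")
```

Compression shrinks what is shipped, but the verifier still has to process the decompressed annotations, which is why the slide notes that certification is not I/O bound.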

10 Approach #5 -- Abbreviate
Give the code producer type-level tools for parameterization and re-use
Just (terminating) functions at the type level
Usually easy for the code producer
Improves certificate size, but may hurt certification time
Not much harder than implementing the lambda-calculus

11 Approaches Summary
Bake it in
Don’t optimize
Reconstruct
Compress
Abbreviate
Now let’s get our hands dirty...

12 An Example – Code Pre-condition
int foo(int x) { return x; }
foo:
  MOV EAX, [ESP+0]
  RETN
Pre-condition describes calling convention: where are the arguments, results, return address, exception handler (what’s an exception anyway),...

13 Bake it in...
int foo(int x) { return x; }
foo: int → int
  MOV EAX, [ESP+0]
  RETN
Pre-condition describes calling convention: where are the arguments, results, return address, exception handler (what’s an exception anyway),...

14 Really bake it in...
int foo(int x) { return x; }
foo_Fii:
  MOV EAX, [ESP+0]
  RETN
Pre-condition describes calling convention: where are the arguments, results, return address, exception handler (what’s an exception anyway),...

15 Or spell it all out...
int foo(int x) { return x; }
foo: ∀a:T,b:T,c:T,r1:S,r2:S,e1:C,e2:C.
  {ESP: {ESP:int::r1@{EAX:exn,ESP:r2,M:e2}::r2,
         EAX:int, EBX:a, ESI:b, EDI:c, M:e1+e2,
         EBP: {EAX:exn,ESP:r2,M:e2}::r2}
        ::int::r1@{EAX:exn,ESP:r2,M:e2}::r2,
   EBP: {EAX:exn,ESP:r2,M:e2}::r2,
   EBX:a, ESI:b, EDI:c, M:e1+e2}
  MOV EAX, [ESP+0]
  RETN
Pre-condition describes calling convention: arguments, results, return address pre-condition, callee-save registers, exception handler,...

16 What to do?
∀a:T,b:T,c:T,r1:S,r2:S,e1:C,e2:C.
  {ESP: {ESP:int::r1@{EAX:exn,ESP:r2,M:e2}::r2,
         EAX:int, EBX:a, ESI:b, EDI:c, M:e1+e2,
         EBP: {EAX:exn,ESP:r2,M:e2}::r2}
        ::int::r1@{EAX:exn,ESP:r2,M:e2}::r2,
   EBP: {EAX:exn,ESP:r2,M:e2}::r2,
   EBX:a, ESI:b, EDI:c, M:e1+e2}
Compress (compiler invariants are very repetitious)
Don’t optimize (fewer invariants)
Abbreviate: foo: F [int] int
  where F is a type-level function of args and results that expands to the full pre-condition
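The abbreviation idea can be sketched as an ordinary function that builds the long pre-condition from the argument and result types. Everything here is an assumption for illustration: the Python function `F`, the string encoding of types, and the field order are hypothetical, and the quantifier prefix is omitted. The real abbreviations are type-level functions in TAL, not string templates.

```python
# The exception-handler stack type that recurs throughout the pre-condition.
HANDLER = "{EAX:exn,ESP:r2,M:e2}::r2"


def F(args, result):
    """Expand a hypothetical abbreviation `F [args] result` into a full pre-condition."""
    # Stack on entry: the arguments, then the caller's frame above the handler.
    stack = "::".join(args + [f"r1@{HANDLER}"])
    # Registers and capability that must be preserved across the call.
    callee_save = f"EBP:{HANDLER}, EBX:a, ESI:b, EDI:c, M:e1+e2"
    # Pre-condition of the return address: result in EAX, everything else restored.
    return_addr = f"{{ESP:{stack}, EAX:{result}, {callee_save}}}"
    return f"{{ESP:{return_addr}::{stack}, {callee_save}}}"


abbreviated = "F [int] int"
expanded = F(["int"], "int")
print(len(abbreviated), "characters abbreviated,", len(expanded), "expanded")
```

The producer writes the short form once per function, and the verifier expands it on demand. That expansion step is where the extra certification time mentioned on the Abbreviate slide comes from.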

17 And Reconstruction Too
If we elide a pre-condition, the verifier can re-verify the block for each predecessor
Restrict to forward jumps to prevent loops
Beware exponential blowup
Bad news: Optimal type placement appears intractable
Good news: Naive heuristics save significant space
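The blowup warning can be made concrete with a small counting model. This is a sketch under assumptions: the control-flow graph is a dictionary of forward jumps listed in topological order, a block with a stated pre-condition is checked once, and a block with an elided pre-condition is re-checked once for every check of each predecessor. The names `verification_counts` and `diamond_chain` are hypothetical.

```python
def verification_counts(succs, annotated):
    """Count how often each block is verified.

    succs: dict mapping block -> list of successor blocks. Only forward jumps
    are allowed, and keys must appear in topological order.
    annotated: set of blocks that carry an explicit pre-condition.
    """
    counts = {block: 0 for block in succs}
    for block in succs:
        if block in annotated:
            counts[block] = 1  # checked once against its stated pre-condition
        for target in succs[block]:
            if target not in annotated:
                # An elided target is re-verified for every check of this block.
                counts[target] += counts[block]
    return counts


def diamond_chain(n):
    """Build n if/else diamonds in sequence: join_i -> left_i, right_i -> join_(i+1)."""
    succs = {}
    for i in range(n):
        succs[f"join{i}"] = [f"left{i}", f"right{i}"]
        succs[f"left{i}"] = [f"join{i + 1}"]
        succs[f"right{i}"] = [f"join{i + 1}"]
    succs[f"join{n}"] = []
    return succs


graph = diamond_chain(10)
all_elided = verification_counts(graph, annotated={"join0"})
joins_annotated = verification_counts(graph, annotated={f"join{i}" for i in range(11)})
print(all_elided["join10"], "checks of the last block with everything elided")
print(sum(joins_annotated.values()), "checks in total with every join annotated")
```

In this model, annotating only the join points removes the path explosion, which is the kind of naive heuristic the slide reports as saving significant space without ruining verification time.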

18 A real application
A bootstrapping compiler from Popcorn to TAL
Popcorn:
  “Java w/o objects, w/ polymorphism and limited pattern-matching”
  “ML w/o closures or modules, w/ C-like core syntax”
  “Safe C – pointerful, garbage collection, exceptions”
Compiler: Conventional
  Graph-coloring register allocation, null-check elimination
Verifier: OCaml 2.04
System: Pentium II, 266MHz, 64MB, NT4.0

19 Bottom line – it works
Source code: 18KLOC, 39 files
Target code: 816 Kb (335 Kb after strip)
Target types: 419 Kb
Compilation: 40 secs
Assembly: 20 secs
Verification: 34.5 secs
And proportional to file size

20 The engineering matters
(Recall: 419Kb of types, 34.5 secs to verify)
Without abbreviations: 2041Kb
Without pre-condition elision: 550Kb
Without either: 4500Kb
As much elision as legal: 402Kb, 740 secs
gzip reduces the 419Kb to 163Kb
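The figures on this slide are easier to compare as ratios against the 419Kb, 34.5 sec baseline. The short calculation below uses only the numbers from the slide; the helper name `relative_to_baseline` and the dictionary layout are assumptions made for the sketch.

```python
BASELINE_KB = 419      # annotation size with abbreviations and elision
BASELINE_SECS = 34.5   # verification time for the baseline

# Annotation sizes in Kb for each configuration listed on the slide.
SIZES_KB = {
    "without abbreviations": 2041,
    "without pre-condition elision": 550,
    "without either": 4500,
    "as much elision as legal": 402,
    "baseline after gzip": 163,
}


def relative_to_baseline(value, baseline):
    """Return value as a multiple of baseline."""
    return value / baseline


for name, kb in SIZES_KB.items():
    print(f"{name}: {relative_to_baseline(kb, BASELINE_KB):.2f}x the baseline size")

# Maximal elision saves a little space but costs far more verification time.
slowdown = relative_to_baseline(740, BASELINE_SECS)
print(f"maximal elision verifies {slowdown:.1f}x slower")
```

Read this way, dropping both techniques makes the annotations more than ten times larger, while pushing elision to its legal limit saves about 4% of the space at more than twenty times the verification time.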

21 Also studied...
Differences among code styles
Techniques for speeding up the verifier
Other forms of reconstruction
Being “gzip-friendly”

22 Some engineering lessons
Compiler-independence produces large repetitious annotations.
Abbreviations are easy and space-effective, but not time-effective.
Overhead should never be proportional to the number of loop-free paths in the code.
Certification bottlenecks often do not appear in small, simple programs.

