Trustless Grid Computing in ConCert. Bor-Yuh Evan Chang, Karl Crary, Margaret DeLap, Robert Harper, Jason Liszka, Tom Murphy VII, Frank Pfenning

Presentation transcript:

Trustless Grid Computing in ConCert. Bor-Yuh Evan Chang, Karl Crary, Margaret DeLap, Robert Harper, Jason Liszka, Tom Murphy VII, Frank Pfenning. 18 Nov 2002, GRID 2002, Baltimore MD

2 The ConCert Project Create a system and technologies for trustless grid computing in ad hoc, peer-to-peer networks. – Trust model based on code certification. – Grid framework using this model. – Advanced languages for grid computing. – Applications of trustless grid computing. Interplay between basic research in type theory and logic and programming practice. This talk: code certification, grid framework.

3 Why Peer-to-Peer? Symmetric view of the network (giant computer with many keyboards: any programmer can run tasks on the grid). Enables ad hoc collaboration. No single point of failure. Lots of hard research problems!

4 Establishing Trust Relationships Fundamental difficulty in peer-to-peer grid computing: establishing trust. Code may be malicious (or simply buggy). Cycle volunteers must trust that the code is safe to run. Native code is desirable: grid applications are cycle-bound.

5 Safety Policies The ConCert system is policy-based. “I only accept code that …” “… is memory safe.” “… does not write to my disk.” “… uses parsimonious resources.” “… comes from an educational institution.” etc.

6 Certifiable Policies Certifiable now: memory safety and control-flow safety; compliance with abstraction boundaries. From these, many others (by controlled access to APIs and system calls). Work in progress: resource usage (CPU, memory); privacy and information-flow properties. … how exactly are these certified?

7 Certification Mathematical certification of policies. Proof (“certificate”) that the donor’s policy is met. Based on intrinsic properties of code, not the code producer’s reputation. Proofs in a specific machine-checkable form. Basic technology: Certified Code.

8 Certified Code: Certifying Compilers [Diagram: code and certificate carried together through SML, IR, x86.] Start with a program in a safe language: Java, SML, Safe C. Safe for some reason. Transform the code and simultaneously the reason that it is safe. Finish with machine code and a checkable certificate. Doesn’t depend on compiler correctness. No extra burden on the app developer. (Bonus: great engineering benefits for compiler writers.)

9 Certified Code Several certified code systems. Proof-Carrying Code (PCC: Necula, Lee): compiler produces a safety proof in logic; verification consists of proof checking. Typed Assembly Language (TAL: Morrisett, Crary et al.): compiler produces type annotations for the machine code that imply safety; verification is type-checking. Both technologies work with native code: no expensive/complicated JIT compilation step; allows for hand-tuned/proved inner loops.

10 Typed Assembly Language A taste of TAL code:
    _fact:
        LABELTYPE
        MOV   EDX, DWORD PTR [ESP+4]
        MOV   EAX, subsume(,1)
        MOV   ECX, subsume(,2)
        FALLTHRU
    forTest4:
        LABELTYPE
        CMP   ECX, EDX
        JGE   forEnd6
        IMUL  EAX, ECX
        ADD   ECX, 1
        JMP   tapp(forTest4, )
    forEnd6:
        RETN
Source:
    int fact(int i) {
        int r = 1;
        for (int j = 2; j < i; j++) r *= j;
        return r;
    }

11 Typed Assembly Language A taste of TAL code:
    _fact:
        MOV   EDX, DWORD PTR [ESP+4]
        MOV   EAX, subsume(,1)
        MOV   ECX, subsume(,2)
        FALLTHRU
    forTest4:
        LABELTYPE
        CMP   ECX, EDX
        JGE   forEnd6
        IMUL  EAX, ECX
        ADD   ECX, 1
        JMP   tapp(forTest4, )
    forEnd6:
        RETN
Source:
    int fact(int i) {
        int r = 1;
        for (int j = 2; j < i; j++) r *= j;
        return r;
    }

12 Typed Assembly Language Size of certificates is a point of concern. For TAL, |certificate| ≈ |code|: lightharp.o (stripped) 122.5k (code); lightharp.to 92.3k (cert). Working on techniques to reduce this overhead. Code is cached; the certificate can be deleted after it is verified once.
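As a rough check of the two sizes quoted on this slide, the short Python calculation below works out the certificate-to-code ratio (about 0.75) and the total transferred the first time a cord is fetched. The variable names are illustrative, and "k" is taken at face value from the slide.

```python
# Sizes as quoted on the slide (units "k" as given there).
code_k = 122.5   # lightharp.o (stripped)
cert_k = 92.3    # lightharp.to

# Certificate size relative to the code it certifies.
ratio = cert_k / code_k

# First fetch carries both; once verified, only the code needs to stay cached.
total_k = code_k + cert_k

print(f"certificate/code = {ratio:.2f}")
print(f"first fetch = {total_k:.1f}k, cached after verification = {code_k:.1f}k")
```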

13 Checkpoint! A certified code system is: a way of supplying a proof that object code meets a safety policy; a way of verifying that proof. Next: a peer-to-peer grid framework based around this technology.

14 The ConCert Framework Difficult distributed computing task: thousands of nodes; trustless environment; high failure rate. Our engineering strategy: intensely simple network abstraction; programming languages provide more convenient abstractions on top of the network.

15 The ConCert Framework The ConCert network looks like this: a number of symmetric grid peers that serve and run the work; clients that submit the initial work and collect and display the results.

16 Cords Cords are the unit of work on the grid. Break up a program into smaller parts: can be scheduled more easily; can support failure recovery. Like a compiler’s “basic blocks”: split by communication structure, not jmps; usually containing significant computation. “… factor the number n.” “… evaluate this chess position 3 moves deep.”

17 Cords Cords can have dependencies on the results of other cords. Identified by MD5 hash of code, certificate, dependencies.
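One way to picture the identification scheme on this slide is the Python sketch below. The slide only says the hash covers code, certificate, and dependencies. The field order, the length prefixes, and the function name `cord_id` are assumptions made for illustration, not the ConCert wire format.

```python
import hashlib

def cord_id(code: bytes, certificate: bytes, dependencies: list[str]) -> str:
    """Hypothetical cord identifier: MD5 over code, certificate, and dependency IDs.

    `dependencies` holds the hex IDs of the cords whose results this cord needs.
    """
    h = hashlib.md5()
    for part in (code, certificate):
        # Length-prefix each field so field boundaries are unambiguous (assumed encoding).
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    for dep in dependencies:
        # Each dependency is itself a cord ID, so IDs chain through the dependency graph.
        h.update(bytes.fromhex(dep))
    return h.hexdigest()
```

Because the ID is derived from content, two peers holding the same code, certificate, and dependencies compute the same ID without coordinating, and any change to one of the three yields a different ID.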

18 Cords Cords are simplified by three rules: (1) Once a cord is ready to run, it does not block: no “waiting” for another cord’s result. (2) Cords are idempotent: failed cords can be re-run. (3) Cords don’t rely on effects of other cords: communication is explicit through dependencies.
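The three rules make scheduling simple enough to sketch in a few lines of Python. This is a single-process toy under stated assumptions, not the Conductor scheduler: the `Cord` class, `run_all`, and the `max_attempts` limit are hypothetical names introduced here.

```python
class Cord:
    """Toy cord: a name, the names of cords it depends on, and a pure function."""
    def __init__(self, name, deps, run):
        self.name, self.deps, self.run = name, deps, run

def run_all(cords, max_attempts=3):
    results = {}                                  # finished cord name -> result
    pending = {c.name: c for c in cords}
    attempts = {c.name: 0 for c in cords}
    while pending:
        # Rule 1: only start cords whose dependencies are all done, so none ever blocks.
        ready = [c for c in pending.values() if all(d in results for d in c.deps)]
        if not ready:
            raise RuntimeError("remaining cords have unsatisfiable dependencies")
        for c in ready:
            attempts[c.name] += 1
            try:
                # Rule 3: the only inputs are the results of declared dependencies.
                results[c.name] = c.run(*[results[d] for d in c.deps])
                del pending[c.name]
            except Exception:
                # Rule 2: cords are idempotent, so a failed cord is simply tried again.
                if attempts[c.name] >= max_attempts:
                    raise
    return results
```

A cord that fails once (for example, because the node running it disappeared) stays in `pending` and is picked up on the next pass; nothing else has to be rolled back.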

19 Cords Not as restrictive as they may seem: Cords can create new cords. (This is where certified code is really important!) Some styles of parallelism can be coded up: continuation-passing style ⇒ fork-join parallelism. The compiler should be able to do this for you. Not yet clear what grid apps require more. This is validated by our prototype applications.
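To illustrate how continuation-passing style yields fork-join parallelism without any cord blocking, here is a small Python sketch. A cord that wants to fork does not wait for its children: it finishes immediately and hands back the child cords plus a "join" continuation that becomes a new cord depending on their results. The `Fork` record, the `run` driver, and the `sum_range` example are all hypothetical; `run` evaluates children sequentially purely to simulate the grid.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

@dataclass
class Fork:
    """Returned by a cord that spawns children instead of producing a value."""
    children: Sequence[Callable[[], object]]   # new cords, independent of one another
    join: Callable[..., object]                # continuation run on the children's results

def run(cord):
    """Sequential stand-in for the grid: run a cord and everything it spawns."""
    out = cord()
    while isinstance(out, Fork):
        child_results = [run(child) for child in out.children]
        out = out.join(*child_results)         # the join may itself fork again
    return out

def sum_range(lo, hi):
    """Build a cord that sums the integers in [lo, hi) by divide and conquer."""
    def cord():
        if hi - lo <= 2:
            return sum(range(lo, hi))          # small enough: compute directly
        mid = (lo + hi) // 2
        return Fork(children=[sum_range(lo, mid), sum_range(mid, hi)],
                    join=lambda left, right: left + right)
    return cord
```

For example, `run(sum_range(0, 10))` evaluates to 45. On a real grid the children could run on different peers, and the join would be scheduled only once both results exist.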

20 A Grid Participant (the Conductor software) Locator: discover other Participants. Scheduler: maintain a set of cords and their dependencies; manage results returned by workers. Worker(s): contact local and remote Schedulers to find cords; download, verify the certificates, and run the code; return the result.
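The worker's cycle described on this slide (find a cord, verify its certificate, run it, return the result) can be sketched as follows. This is a toy model under stated assumptions: the `Scheduler` class and `worker_step` function are hypothetical, and `verify` and `execute` are caller-supplied placeholders standing in for a real certificate checker and native code execution.

```python
from collections import deque

class Scheduler:
    """Toy scheduler: a queue of (cord_id, code, certificate) and a result table."""
    def __init__(self):
        self.queue = deque()
        self.results = {}

    def submit(self, cord_id, code, certificate):
        self.queue.append((cord_id, code, certificate))

    def next_cord(self):
        return self.queue.popleft() if self.queue else None

    def report(self, cord_id, result):
        self.results[cord_id] = result

def worker_step(schedulers, verify, execute):
    """One worker iteration: returns (cord_id, accepted) or None if no work was found."""
    for scheduler in schedulers:               # local and remote schedulers alike
        job = scheduler.next_cord()
        if job is None:
            continue
        cord_id, code, certificate = job
        # Verify before running: the code is never executed unless the check passes.
        if not verify(code, certificate):
            return cord_id, False
        scheduler.report(cord_id, execute(code))
        return cord_id, True
    return None
```

The important ordering is that verification precedes execution, so the host's policy is enforced on intrinsic properties of the code regardless of which peer supplied it.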

21 Applications Several applications in the ConCert framework. Lightharp: ray tracer; trivial branching with depth = 1; external client “joins” on the cords it inserts. Iktara: theorem prover for linear logic; tougher: multiple results, functions as results; only runs on simulator now. Tempo: chess player; Jamboree algorithm (Joerg, Kuszmaul); fork-join style, depth > 1.

22 Related/Future: Programming Languages How to write grid applications? Language primitives for mobile code. Code transformations and compilation techniques. Compiler does the dirty work.

23 Related/Future: Answer Verification Certified code establishes trust in one direction. But what about malicious volunteers? Might always give the same, wrong answer. Might collude with other donors to coordinate attacks! Some problems have self-certifying results. Factorization: check that n * m = k. Theorem proving: proof checking is easy. For other problems, use cryptography and voting or other techniques (?). A work in progress!
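The two checking styles mentioned on this slide can be sketched in Python. The first function checks a self-certifying factorization result; rejecting the trivial factors 1 and k is an added assumption, since the slide only says to check n * m = k. The second is a plain majority vote over redundant answers, offered as one possible reading of "voting", with a hypothetical `quorum` parameter. Neither is presented as ConCert's actual mechanism.

```python
from collections import Counter

def check_factorization(k: int, n: int, m: int) -> bool:
    """Self-certifying result: one multiplication verifies a claimed factorization of k."""
    # Rejecting n == 1 or m == 1 is an assumption added here to rule out trivial answers.
    return n > 1 and m > 1 and n * m == k

def majority_vote(answers, quorum=0.5):
    """Return the answer given by more than `quorum` of the volunteers, else None."""
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count / len(answers) > quorum else None
```

Checking the factorization costs far less than finding it, which is what makes such results self-certifying. Voting, by contrast, only helps while colluding volunteers remain below the quorum.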

24 Conclusion Certified Code is the enabling technology for ad hoc peer-to-peer Grid computing. ConCert is a policy-based framework where code comes with a proof (certificate) of safety within that policy. Proofs can be generated automatically by the compiler. Cords are an appropriate basic unit of abstraction for such a network: They provide sufficient expressiveness while supporting failure recovery and straightforward scheduling algorithms.
