1 This One Time, at PL Camp... Summer School on Language-Based Techniques for Integrating with the External World University of Oregon Eugene, Oregon July 2007

2 Checking Type Safety of Foreign Function Calls Jeff Foster University of Maryland  Ensure type safety across languages OCaml/JNI – C  Multi-lingual type inference system Representational types  SAFFIRE Multi-lingual type inference system

3 Dangers of FFIs  In most FFIs, programmers write “glue code” Translates data between host and foreign languages Typically written in one of the languages  Unfortunately, FFIs are often easy to misuse Little or no checking done at language boundary Mistakes can silently corrupt memory One solution: interface generators

4 Example: “Pattern Matching”

type t = A of int | B | C of int * int | D

if (Is_long(x)) {
  if (Int_val(x) == 0) /* B */ ...
  if (Int_val(x) == 1) /* D */ ...
} else {
  if (Tag_val(x) == 0) /* A */ Field(x, 0) = Val_int(0)
  if (Tag_val(x) == 1) /* C */ Field(x, 1) = Val_int(0)
}

5 Garbage Collection. C FFI functions need to play nice with the GC: pointers from C to the OCaml heap must be registered.

value bar(value list) {
  CAMLparam1(list);
  CAMLlocal1(temp);
  temp = alloc_tuple(2);
  CAMLreturn(Val_unit);
}

Easy to forget; difficult to find this error with testing.

6 Multi-Lingual Types  Representational Types Embed OCaml types in C types and vice versa

7 SAFFIRE  Static Analysis of Foreign Function InteRfacEs

8 Programming Models for Distributed Computing Yannis Smaragdakis University of Oregon  NRMI: Natural programming model for distributed computing.  J-Orchestra: Execute unsuspecting programs over a network, using program rewriting.  Morphing: High-level language facility for safe program transformation.

9 NRMI: Identify all reachable objects. [object-graph diagram: client-side structure rooted at t, with aliases alias1 and alias2; server-side copy rooted at tree]

10 NRMI: Execute the remote procedure. [diagram: the server mutates its copy of the graph and links in a new object via tmp]

11 NRMI: Send back all reachable objects. [diagram: the modified graph is shipped back across the network]

12 NRMI: Match reachable maps. [diagram: objects in the returned graph are matched with the original client-side objects]

13 NRMI: Update original objects. [diagram: matched originals take on the new field values]

14 NRMI: Adjust links out of original objects. [diagram]

15 NRMI: Adjust links out of new objects. [diagram]

16 NRMI: Garbage collect. [diagram: temporary copies are discarded]

17 J-Orchestra. Automatic partitioning system. Works as a bytecode compiler: lots of indirection using proxies, interfaces, local and remote objects. Partitioned program is equivalent to the original.

18 Morphing. Ensure program generators are safe. Statically check the generator to determine the safety of any generated program, under all inputs: ensure that generated programs compile. Early approach – SafeGen, using theorem provers; MJ, using types.

19 Fault Tolerant Computing, David August and David Walker, Princeton University. Processors are becoming more susceptible to intermittent faults (Moore’s Law, radiation), which alter computation or state, resulting in incorrect program execution. Goal: build reliable systems from unreliable components.

20 Topics. Transient faults and mechanisms designed to protect against them (HW). The role languages and compilers may play in creating radiation-hardened programs. New opportunities made possible by languages that embrace potentially incorrect behavior.

21 Causes

22 Software/Compiler  Duplicate instructions and check at important locations (store) [SWIFT, EDDI]
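
A minimal sketch of the idea, not the actual SWIFT/EDDI transformation: the value is computed twice in independent variables and the copies are compared just before an important location such as a store.

/* Illustrative only: software-level instruction duplication.
   The computation is performed twice and the copies are compared
   immediately before the store. */
int checked_store(int a, int b, int *out) {
    int v1 = a * b + 7;   /* original computation   */
    int v2 = a * b + 7;   /* duplicated computation */
    if (v1 != v2)         /* copies disagree: a transient fault corrupted one */
        return -1;        /* report the fault instead of storing bad data */
    *out = v1;            /* the store is the checked "important location" */
    return 0;
}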

23 λ zap: a λ calculus with fault tolerance. Intermediate language for compilers; models a single fault; based on replication. The semantics model the type of faults. Example: the source program

let x = 2 in
let y = x + x in
out y

is compiled to the triplicated program

let x1 = 2 in  let x2 = 2 in  let x3 = 2 in
let y1 = x1 + x1 in  let y2 = x2 + x2 in  let y3 = x3 + x3 in
out [y1, y2, y3]

24 Testing

25 Typing Ad Hoc Data, Kathleen Fisher, AT&T Labs. PADS project (http://padsproj.org): Data Description Language (DDL), Data Description Calculus (DDC), automatic inference of PADS descriptions.

26 PADS. Declarative description of a data source: physical format information and semantic constraints.

type responseCode = { x : Int | 99 < x < 600 }

Pstruct webRecord {
  Pip ip;         " - - [";
  Pdate(':') date; ":";
  Ptime(']') time; "]";
  httpMeth meth;  " ";
  Puint8 code;    " ";
  Puint8 size;    " ";
};

Parray webLog { webRecord[] };

27 Learning. [architecture diagram: raw data in ad hoc forms (email, ASCII log files, binary traces) flows through a learning system to a data description, which drives end-user tools that produce standard formats and schemas (XML, CSV) and visual information] Problem: producing useful tools for ad hoc data takes a lot of time. Solution: a learning system to generate data descriptions and tools automatically.

28 Format Inference Engine. Pipeline: Input File(s) → Chunked Data → Tokenization → Structure Discovery → Format Refinement (guided by a Scoring Function) → IR-to-PADS Printer → PADS Description.

29 Multi-Staged Programming, Walid Taha, Rice University. Writing generic programs that do not pay a runtime overhead. Use program generators; ensure the generated code is syntactically well-formed and well-typed. MetaOCaml.

30 The Abstract View. [staging diagram: program P consumes its first-stage input I1 to generate a specialized program P2, which is then run repeatedly on second-stage inputs I2, in contrast to running P in batch on both inputs]

31 MetaOCaml. Brackets (.< .. >.) delay execution of an expression. Escape (.~) combines smaller delayed values to construct larger ones. Run (.!) compiles and executes the dynamically generated code.

32 Power Example

let rec power (n, x) =
  match n with
    0 → 1
  | n → x * (power (n-1, x));;

let power2 (x) = power (2, x);;
let power2 = fun x → power (2, x);;
let power2 (x) = 1*x*x;;

Staged version:

let rec power (n, x) =
  match n with
    0 → .<1>.
  | n → .<.~x * .~(power (n-1, x))>.;;

let power2 = .! .<fun x → .~(power (2, .<x>.))>.;;

33 Scalable Defect Detection, Manuvir Das, Daniel Wang, Zhe Yang, Microsoft Research. Program analysis at Microsoft scale: scalability, accuracy. Combination of a weak global analysis and a slow local one (for some regions of code). Programmers are required to add interface annotations; some automatic inference is available.

34 Web and Database Application Security Zhendong Su University of California-Davis  Static analyses for enforcing correctness of dynamically generated database queries.  Runtime checking mechanisms for detecting SQL injection attacks;  Static analyses for detecting SQL injection and cross-site scripting vulnerabilities.

35 XML and Web Application Programming, Anders Møller, University of Aarhus. Formal models of XML schemas: expressiveness of DTD, XML Schema, Relax NG. Type checking XML transformation languages: “Assuming that x is valid according to S_in, is T(x) valid according to S_out?” Web application frameworks: Java Servlets and JSP, JWIG, GWT.

36 Types for Safe C-Level Programming, Dan Grossman, University of Washington. Cyclone, a safe dialect of C, designed to prevent safety violations (buffer overflows, memory management errors, …). Mostly underlying theory: types, expressions, memory regions.

37 Analyzing and Debugging Software  Understanding Multilingual Software [Foster] Parlez vous OCaml?  Statistical Debugging [Liblit] you are my beta tester, and there’s lots of you  Scalable Defect Detection [Das, Wang, Yang] Microsoft programs have no bugs

38 Programming Models  Types for Safe C-Level Programming [Grossman] C without the ick factor  Staged Programming [Taha] Programs that produce programs that produce programs...  Prog. Models for Dist. Comp. [Smaragdakis] We’ve secretly replaced your centralized program with a distributed application. Can you tell the difference?

39 The Web  Web and Database Application Security [Su] How not to be pwn3d by 1337 haxxors  XML and Web Application Programming [Møller] X is worth 8 points in scrabble...let’s use it a lot

40 Other Really Important Stuff  Fault Tolerant Computing [August, Walker] Help, I’ve been hit by a cosmic ray!  Typing Ad Hoc Data [Fisher] Data, data, everywhere, but what does it mean?

41 Statistical Debugging Ben Liblit University Of Wisconsin-Madison

42  Statistical Debugging & Cooperative Bug Isolation Observe deployed software in the hands of real end users Build statistical models of success & failure Guide programmers to the root causes of bugs Make software suck less What’s This All About?

43 Motivation “There are no significant bugs in our released software that any significant number of users want fixed.” Bill Gates, quoted in FOCUS Magazine

44 Software Releases in the Real World [Disclaimer: this may be a caricature.]

45 Software Releases in the Real World 1.Coders & testers in tight feedback loop Detailed monitoring, high repeatability Testing approximates reality 2.Testers & management declare “Ship it!” Perfection is not an option Developers don’t decide when to ship

46 Software Releases in the Real World 3.Everyone goes on vacation Congratulate yourselves on a job well done! What could possibly go wrong? 4.Upon return, hide from tech support Much can go wrong, and you know it Users define reality, and it’s not pretty –Where “not pretty” means “badly approximated by testing”

47 Testing as Approximation of Reality. Microsoft’s Watson error reporting system: crash reports from 500,000 separate programs. x% of software causes 50% of bugs; care to guess what x is? 1% of software errors causes 50% of user crashes. Small mismatch ➙ big problems (sometimes). Big mismatch ➙ small problem? (sometimes!) Perfection is not an economically viable option.

48 Real Engineers Measure Things; Are Software Engineers Real Engineers?

49 Instrumentation Framework. “The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong, it usually turns out to be impossible to get at or repair.” (Douglas Adams, Mostly Harmless)

50 Bug Isolation Architecture. Program Source → Compiler (guided by the Sampler and the set of Predicates) → Shipping Application → predicate counts plus success/failure labels → Statistical Debugging → top bugs with likely causes.

51  Each behavior is expressed as a predicate P on program state at a particular program point.  Count how often “P observed true” and “P observed” using sparse but fair random samples of complete behavior. Model of Behavior
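
As a rough illustration (not the project's actual data structures), each instrumented predicate can be thought of as carrying a pair of counters:

/* Illustrative only: counters kept for one predicate P.
   "observed" counts sampled visits to P's site; "observed_true"
   counts how many of those visits found P true. */
struct predicate_counts {
    unsigned long observed;
    unsigned long observed_true;
};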

52 Predicate Injection: Guessing What’s Interesting. [same architecture diagram as slide 50, with the Predicates component highlighted]

53 Branch Predicates Are Interesting if (p) … else …

54 Branch Predicate Counts

if (p)
  // p was true (nonzero)
else
  // p was false (zero)

Syntax yields instrumentation site; site yields predicates on program behavior; exactly one predicate true per visit to site.
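
A hedged sketch of what the injected counting code could look like (counter and function names are made up, and the real instrumentation is sampled rather than unconditional):

/* Illustrative only: one branch site instrumented with two counters.
   Each visit bumps exactly one of them, so "P observed" is their sum. */
static unsigned long site_p_true, site_p_false;

void on_branch(int p) {
    if (p)
        site_p_true++;    /* predicate "p was true (nonzero)" */
    else
        site_p_false++;   /* predicate "p was false (zero)" */
}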

55 Returned Values Are Interesting n = fprintf(…);  Did you know that fprintf() returns a value?  Do you know what the return value means?  Do you remember to check it?

56 Returned Value Predicate Counts

n = fprintf(…);  // return value < 0 ?  == 0 ?  > 0 ?

Syntax yields instrumentation site; site yields predicates on program behavior; exactly one predicate true per visit to site.
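
In the same spirit, a sketch of the three counters per call site (names are illustrative):

/* Illustrative only: classify the sign of a returned value with
   three counters; exactly one is bumped per call. */
static unsigned long ret_negative, ret_zero, ret_positive;

int record_return(int n) {
    if (n < 0)
        ret_negative++;
    else if (n == 0)
        ret_zero++;
    else
        ret_positive++;
    return n;             /* pass the value through unchanged */
}

/* usage: n = record_return(fprintf(...)); */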

57 Pair Relationships Are Interesting int i, j, k; … i = …;

58 Pair Relationship Predicate Counts

int i, j, k;
…
i = …;
// compare new value of i with…
//   other vars: j, k, …
//   old value of i
//   “important” constants
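
A rough sketch of one injected comparison for the assignment above, assuming j is one of the in-scope variables (names are made up):

/* Illustrative only: after "i = ...", compare the new value of i
   with j.  Similar counter triples exist for k, for the old value
   of i, and for "important" constants. */
static unsigned long i_lt_j, i_eq_j, i_gt_j;

void record_pair(int i, int j) {
    if (i < j)
        i_lt_j++;
    else if (i == j)
        i_eq_j++;
    else
        i_gt_j++;
}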

59 Many Other Behaviors of Interest  Assert statements Perhaps automatically introduced, e.g. by CCured  Unusual floating point values Did you know there are nine kinds?  Coverage of modules, functions, basic blocks, …  Reference counts: negative, zero, positive, invalid  Kinds of pointer: stack, heap, null, …  Temporal relationships: x before/after y  More ideas? Toss them all into the mix!

60 Summarization and Reporting. Observation stream ➙ observation counts: how often is each predicate observed true? Removes the time dimension, for good or ill. Bump exactly one counter per observation; infer additional predicates (e.g. ≤, ≠, ≥) offline. Feedback report is: (1) vector of predicate counters, (2) success/failure outcome label. Still quite a lot to measure; what about performance?
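
As a sketch, a feedback report could be as small as a counter vector plus an outcome label (this layout is illustrative, not the project's actual report format):

/* Illustrative only: one feedback report from one run. */
#define NUM_PREDICATES 1710   /* hypothetical size; cf. the ccrypt study */

struct feedback_report {
    unsigned long counts[NUM_PREDICATES];  /* times each predicate observed true */
    int failed;                            /* 1 = run failed, 0 = run succeeded  */
};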

61 Fair Sampling Transformation. [same architecture diagram as slide 50, with the Sampler component highlighted]

62 Sampling the Bernoulli Way  Decide to examine or ignore each site… Randomly Independently Dynamically  Cannot be periodic: unfair temporal aliasing  Cannot toss coin at each site: too slow

63 Amortized Coin Tossing  Randomized global countdown Small countdown  upcoming sample  Selected from geometric distribution Inter-arrival time for biased coin toss How many tails before next head? Mean sampling rate is tunable parameter
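
A minimal sketch of drawing the next countdown, assuming a sampling density d (the function name is made up):

/* Illustrative only: the next countdown is drawn from a geometric
   distribution with mean 1/d, i.e. the number of coin tosses (sites)
   until the next "head" (sample). */
#include <math.h>
#include <stdlib.h>

long next_countdown(double d) {
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);  /* uniform in (0,1) */
    return 1 + (long)floor(log(u) / log(1.0 - d));
}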

64 Geometric Distribution. The countdown between samples is geometrically distributed with mean D; the expected sample density is then 1/D.

65 Weighing Acyclic Regions. Break the CFG into acyclic regions. Each region has a finite number of paths and a finite maximum number of instrumentation sites. Compute the maximum weight in a bottom-up pass. [CFG diagram annotated with per-node site counts and computed maximum path weights]

66 Weighing Acyclic Regions. Clone each acyclic region into a “fast” variant and a “slow” variant; choose between them at run time (is the countdown > 4?). Retain decrements on the fast path for now; stay tuned… [cloned-region diagram]
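
A sketch of the run-time choice, for a region whose heaviest path crosses 4 instrumentation sites (names are illustrative):

/* Illustrative only: choose between the cloned variants of one
   acyclic region at run time. */
extern long countdown;        /* hypothetical global sample countdown */

void region_fast(void);       /* sites only decrement the countdown   */
void region_slow(void);       /* sites decrement and may take samples */

void region_entry(void) {
    if (countdown > 4)        /* no site in this region can fire */
        region_fast();
    else
        region_slow();
}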

67 Path Balancing Optimization. Decrements on the fast path are a bummer; goal: batch them up, but some paths are shorter than others. Idea: add extra “ghost” instrumentation sites to pad out the shorter paths, so all paths are now equal. [CFG diagram with ghost sites added]

68 Path Balancing Optimization. Fast path is faster: one bulk counter decrement on entry; instrumentation sites have no code at all. Slow path is slower: more decrements, and it consumes more randomness. [balanced-CFG diagram]

69 Optimizations  Identify and ignore “weightless” functions / cycles  Cache global countdown in local variable  Avoid cloning  Static branch prediction at region heads  Partition sites among several binaries  Many additional possibilities…

70 What Does This Give Us?  Absolutely certain of what we do see Subset of dynamic behavior Success/failure label for entire run  Uncertain of what we don’t see  Given enough runs, samples ≈ reality Common events seen most often Rare events seen at proportionate rate

71 Playing the Numbers Game. [same architecture diagram as slide 50, with the Statistical Debugging stage highlighted]

72 Isolating a Deterministic Bug. Hunt for the crashing bug in ccrypt-1.2. Sample function return values: triple of counters per call site (< 0, == 0, > 0). Use process of elimination: look for predicates true on some bad runs, but never true on any good run.

73 Elimination Strategies. Universal falsehood: disregard P if |P| = 0 for all runs; likely a predicate that can never be true. Lack of failing coverage: disregard all predicates at site S if |S| = 0 for all failed runs; the site is not reached in failing executions. Lack of failing example: disregard P if |P| = 0 for all failed executions; P need not be true for a failure to occur. Successful counterexample: disregard P if |P| > 0 on at least one successful run; P can be true without causing failure.
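
A small sketch of this process of elimination over a batch of feedback reports (the data layout is made up for illustration):

/* Illustrative only: keep predicate p as a bug candidate if it was
   observed true in at least one failing run and never observed true
   in any successful run. */
#define NUM_RUNS  100
#define NUM_PREDS 1710

extern unsigned long counts[NUM_RUNS][NUM_PREDS];  /* observed-true counts, hypothetical */
extern int failed[NUM_RUNS];                       /* 1 = run crashed */

int is_candidate(int p) {
    int true_in_failed = 0;
    for (int r = 0; r < NUM_RUNS; r++) {
        if (counts[r][p] > 0) {
            if (!failed[r])
                return 0;        /* successful counterexample */
            true_in_failed = 1;
        }
    }
    return true_in_failed;       /* also rules out universal falsehood */
}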

74 Winnowing Down the Culprits  1710 counters 3 × 570 call sites  1569 zero on all runs 141 remain  139 nonzero on at least one successful run  Not much left! file_exists() > 0 xreadline() == 0

75 Multiple, Non-Deterministic Bugs  Strict process of elimination won’t work Can’t assume program will crash when it should No single common characteristic of all failures  Look for general correlation, not perfect prediction Warning! Statistics ahead!

76 Ranked Predicate Selection  Consider each predicate P one at a time Include inferred predicates (e.g. ≤, ≠, ≥)  How likely is failure when P is true? (technically, when P is observed to be true)  Multiple bugs yield multiple bad predicates

77 Some Definitions
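
A hedged reconstruction of the standard cooperative-bug-isolation definitions used on the next few slides (the published papers write Failure(P) for what the talk calls Bad(P)):

Bad(P) = F(P) / (F(P) + S(P)), where F(P) and S(P) are the numbers of failing and successful runs in which P was observed true.
Context(P) = F(P observed) / (F(P observed) + S(P observed)), the failure rate among runs that merely reached P's site.
Increase(P) = Bad(P) - Context(P).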

78 Are We Done? Not Exactly! Bad( f = NULL )= 1.0

79 Are We Done? Not Exactly!  Predicate ( x = 0 ) is innocent bystander Program is already doomed Bad( f = NULL )= 1.0 Bad( x = 0 )= 1.0

80 Crash Probability  Identify unlucky sites on the doomed path  Background risk of failure for reaching this site, regardless of predicate truth/falsehood

81 Isolate the Predictive Value of P  Does P being true increase the chance of failure over the background rate?  Formal correspondence to likelihood ratio testing

82 Increase Isolates the Predictor Increase( f = NULL )= 1.0 Increase( x = 0 )= 0.0

83 It Works! …for programs with just one bug.  Need to deal with multiple bugs How many? Nobody knows!  Redundant predictors remain a major problem Goal: isolate a single “best” predictor for each bug, with no prior knowledge of the number of bugs.

84 Multiple Bugs: Some Issues  A bug may have many redundant predictors Only need one, provided it is a good one  Bugs occur on vastly different scales Predictors for common bugs may dominate, hiding predictors of less common problems

85 Bad Idea #1: Rank by Increase(P)  High Increase but very few failing runs  These are all sub-bug predictors Each covers one special case of a larger bug  Redundancy is clearly a problem

86 Bad Idea #2: Rank by F(P)  Many failing runs but low Increase  Tend to be super-bug predictors Each covers several bugs, plus lots of junk

87 A Helpful Analogy  In the language of information retrieval Increase(P) has high precision, low recall F(P) has high recall, low precision  Standard solution: Take the harmonic mean of both Rewards high scores in both dimensions
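
Assuming the standard formulation from the cooperative bug isolation work (the exact normalization used in the talk may differ), the combined score is the harmonic mean of the precision-like and recall-like terms:

Importance(P) = 2 / ( 1/Increase(P) + 1/NormalizedF(P) ), where NormalizedF(P) is a recall-style term such as log(F(P)) / log(total number of failing runs).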

88 Rank by Harmonic Mean  Definite improvement Large increase, many failures, few or no successes  But redundancy is still a problem

89 Redundancy Elimination  One predictor for a bug is interesting Additional predictors are a distraction Want to explain each failure once  Similar to minimum set-cover problem Cover all failed runs with subset of predicates Greedy selection using harmonic ranking

90 Simulated Iterative Bug Fixing 1.Rank all predicates under consideration 2.Select the top-ranked predicate P 3.Add P to bug predictor list 4.Discard P and all runs where P was true Simulates fixing the bug predicted by P Reduces rank of similar predicates 5.Repeat until out of failures or predicates
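
A compact sketch of this loop (data and ranking function are illustrative; importance() is assumed to recompute the harmonic-mean rank over the runs still under consideration):

/* Illustrative only: simulated iterative bug fixing.  Repeatedly pick
   the top-ranked predicate, report it, then discard every run in
   which it was observed true, as if its bug had been fixed. */
#include <stdio.h>

#define NUM_RUNS  100
#define NUM_PREDS 1710

extern unsigned long counts[NUM_RUNS][NUM_PREDS];  /* observed-true counts    */
extern int live_run[NUM_RUNS];                     /* 1 = failure unexplained */
extern double importance(int p);                   /* rank over live runs     */

void isolate_bug_predictors(void) {
    for (;;) {
        int best = -1;
        for (int p = 0; p < NUM_PREDS; p++)
            if (best < 0 || importance(p) > importance(best))
                best = p;
        if (best < 0 || importance(best) <= 0.0)
            break;                                 /* out of failures or predicates */
        printf("bug predictor: predicate %d\n", best);
        for (int r = 0; r < NUM_RUNS; r++)
            if (live_run[r] && counts[r][best] > 0)
                live_run[r] = 0;                   /* "fix" the predicted bug */
    }
}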

91 Not Covered Today: Visualization of Bug Predictors. Simple visualization may help reveal trends. [predictor plot components: Increase(P) with an S(P) error bound, log(F(P) + S(P)), Context(P)]

92 Not Covered Today  Reconstruction of failing paths. Bug predictor is often the smoking gun, but not always. Want short, feasible path that exhibits bug. –“Just because it’s undecidable doesn’t mean we don’t need an answer.”

