Knowledge Modules in Software Synthesis

Presentation transcript:

Knowledge Modules in Software Synthesis
Douglas R. Smith, Kestrel Institute, Palo Alto, California
Notes: a word about Kestrel Institute.

Knowledge Modules in Software Synthesis
* What is software synthesis?
* Structuring specifications
* Structuring design knowledge
* Structuring derivations
Notes: expert designers know and exploit lots of design patterns.

Design by Instantiation
Translate requirements into an effective software design by constructing an instance of a class of designs.
* refinement: a formal specification of required properties over-approximates the desired behavior; refinement narrows the class of designs to a design instance
* machine learning: examples/training-data under-approximate the desired behavior
Notes: formal deductive approaches versus inductive and statistical model-fitting approaches. What they have in common is that the user conjectures the abstract form of the solution.

Classes of Designs
Common design patterns for formal refinement:
* optimization transformations
* datatype refinements
* algorithm theories (divide-and-conquer, global search, fixpoint iteration, …)
* system patterns (model-view-controller, tracking, control systems, ...)
* communication patterns (transport, publish-subscribe, mailbox, ...)
* ad-hoc patterns: sketches, templates
Common design patterns for machine learning:
* linear regression
* decision trees
* neural networks

Generating Correct-by-Construction Software by Refinement
specification0 (requirement) → specification1 → specification2 → specification3 → … → code
* transformation1: specification0 to specification1, with proof of correct refinement spec0 → spec1
* transformation2: specification1 to specification2, with proof of correct refinement spec1 → spec2
* transformation3: specification2 to specification3, with proof of correct refinement spec2 → spec3
metaprogram = transformation1 ; transformation2 ; transformation3 ; …
Why correctness proofs?
* reduce attack surface through reduced code vulnerabilities
* generates proof-carrying code
* focused locality for software evolution
* marginal cost of proof generation is near zero
Notes: What's manual vs automatic? Emphasize reuse across application domains. Our techniques were directly used in HACMS. What is the connection between GC, protocols, SAT solvers, …? It's the reusable transformations.
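The refinement chain above can be illustrated with a small sketch. This is not the Kestrel tooling: the names `refines`, `strengthen`, `metaprogram` and `spec0` are hypothetical, specifications are modeled as plain predicates over (input, output) pairs, and the "proof" of each refinement step is replaced by exhaustive checking over a tiny finite domain.

```python
from itertools import product

# Assumption: a tiny finite domain, so a refinement "proof" can be replaced
# by exhaustive enumeration. Real synthesis systems emit deductive proofs.
DOMAIN = range(0, 6)

def spec0(x, y):
    """Hypothetical requirement: the output is at least the input."""
    return y >= x

def refines(concrete, abstract):
    """concrete refines abstract if it allows only behaviors that abstract
    allows, and still allows at least one output for every input."""
    sound = all(abstract(x, y)
                for x, y in product(DOMAIN, DOMAIN) if concrete(x, y))
    feasible = all(any(concrete(x, y) for y in DOMAIN) for x in DOMAIN)
    return sound and feasible

def strengthen(extra):
    """Build a transformation that conjoins `extra` to a specification and
    checks the refinement obligation before returning the new spec."""
    def transformation(spec):
        def refined(x, y):
            return spec(x, y) and extra(x, y)
        if not refines(refined, spec):
            raise ValueError("refinement obligation failed")
        return refined
    return transformation

def metaprogram(*transformations):
    """Sequential composition: transformation1 ; transformation2 ; ..."""
    def run(spec):
        for t in transformations:
            spec = t(spec)
        return spec
    return run
```

For instance, `metaprogram(strengthen(lambda x, y: y <= x + 1), strengthen(lambda x, y: y == x))(spec0)` yields a specification that admits exactly one output per input, and each step is checked as it is applied, which mirrors the idea that proofs are co-generated by the transformations rather than found afterwards.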

Synthesis Applications
* planning and scheduling
* SAT solvers and other constraint solvers
* family of garbage collectors (one concurrent)
* family of secure transport-level protocol codes
* lots of sorting, searching, combinatorial algorithms

Deriving Common Garbage Collection Algorithms
Collector Spec
  maintain count of predecessors → Reference Count Collectors
  trace the set of live nodes → Tracing Collectors
    partitioned memory model → Copying Collectors
    monolithic memory model → Marking Collectors
Further collectors in the family tree: Generational Collectors, Mark & Compact, Mark & Sweep
Notes: Talk about RC collectors first: they are the simplest to derive, but have not found use in commercial systems. History: Mark & Sweep: John McCarthy, CACM, April 1960; Reference Counting: George Collins, CACM, Dec 1960; Copying: Minsky, 1963 Project MAC memo.

Deriving a Family of Transport-level Codes
Pub-Sub Reqt
  → (~14 xforms, with proofs) concurrent FlipFlop Buffers
  → (~10 xforms, with proofs) low-level RADL transport spec
  → (~63 xforms, with proofs) three variants: Intra-VM Comm, Inter-VM Comm, IP-based Comm
  → (C generator, with proofs) C Code for each variant
Firsts:
* pure reqt specs expressed logically, vs operationally
* generation of refinements via a heterogeneous collection of transformations, addressing protocol design schemes, algorithm theories, optimizations, datatype refinements, …
* emission of proofs from transformations
* generating a family tree of codes
* use of a generator of certified C
Summary: first generation of a software family with top-to-bottom proofs from reqt-level specifications; about 90 transformations, several hundred proof obligations discharged.

Knowledge Modules in Software Synthesis
* What is software synthesis?
* Structuring specifications
* Structuring design knowledge
* Structuring derivations

Structuring a Specification via Colimits
spec Triv
  type E
spec LINEAR ORDER
  type L
  op ≤ : L, L → Boolean
  ...
spec TIME
  type Time
  op time-le
  ...
Diagram: Triv maps into both LINEAR ORDER and GROUP; their pushout (po) is LINEAR ORDER + GROUP, which is extended to LINEARLY ORDERED GROUP and renamed to QUANTITY. LINEAR ORDER is mapped to TIME by {L ↦ Time, ≤ ↦ time-le}. The coproduct of the two results is TIME + QUANTITY.
Notes: Actually lo-group needs a monotonicity axiom at least, so the pushout isn't enough. Note the 2 copies of LINEAR-ORDER, which we explicitly do not identify: the coproduct takes the disjoint union of the two specs without sharing LINEAR-ORDER. Note the use of 1-SORT to identify the type in LINEAR-ORDER and GROUP.
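The difference between a pushout (shared part identified) and a coproduct (nothing shared) can be sketched on bare sets of symbol names. This is only an illustration of the set-level construction, not Specware's colimit operation: the function `pushout`, the tagging scheme, and the symbol names used for GROUP are assumptions, and axioms are ignored entirely.

```python
def pushout(shared, spec_a, spec_b, f, g):
    """Pushout of finite symbol sets for the span  spec_a <-f- shared -g-> spec_b.

    Symbols are tagged with their origin ("A" or "B") to form a disjoint
    union; the images of each shared symbol are then merged into one class.
    Returns (symbols, inj_a, inj_b), where inj_a and inj_b map the symbols
    of each spec to their representative in the result.
    """
    parent = {("A", s): ("A", s) for s in spec_a}
    parent.update({("B", s): ("B", s) for s in spec_b})

    def find(node):
        # Follow parent links to the representative of the class.
        while parent[node] != node:
            node = parent[node]
        return node

    for s in shared:
        rep_a, rep_b = find(("A", f[s])), find(("B", g[s]))
        if rep_a != rep_b:
            parent[rep_b] = rep_a  # identify the two images of s

    inj_a = {s: find(("A", s)) for s in spec_a}
    inj_b = {s: find(("B", s)) for s in spec_b}
    symbols = set(inj_a.values()) | set(inj_b.values())
    return symbols, inj_a, inj_b


# Hypothetical symbol sets; the slide does not list the contents of GROUP.
TRIV = {"E"}
LINEAR_ORDER = {"L", "le"}
GROUP = {"G", "plus", "zero", "neg"}
```

With `f = {"E": "L"}` and `g = {"E": "G"}` the sorts `L` and `G` become one symbol in the result, which is how the one-sort spec identifies the carrier type. Passing an empty shared set gives the coproduct, where both specs are copied without any sharing.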

TIME + QUANTITY
spec TASK
  type Task ...
spec RESOURCE
  type Resource ...
po: TASK-RESOURCE
spec RESERVATION
  type Reservation = Time*Task*Resource ...
spec SCHEDULER
  op Scheduler : Set(Task) * Set(Resource) → Set(Reservation) ...
Notes: Task theory is actually a meet-semilattice, so this illustrates parameterization on arbitrary theories, as contrasted with polymorphism, which is parametric on simply a sort and its equality. We get 3 copies of SET. This allows us to refine them independently, but we lack the ability to get a shared polymorphic-style implementation without identifying the 3 specs.

Taxonomy of Algorithm Theories
[Figure: hierarchy of algorithm theories and the algorithm classes they yield; entries listed in slide order]
* Problem Theory (D|I → R|O): generate-and-test
* Constraint Satisfaction (R = set of maps)
* Global Structure (R = set + recursive partition): global search, binary search, backtrack, branch-and-bound
* Local Structure (R = set + relation): genetic algorithms, local search, hill climbing, simulated annealing, tabu search
* Problem Reduction Structure
* Linear Programming: simplex method, interior point, primal dual
* Integer Linear Programming: 0-1 methods
* Divide-and-Conquer: divide-and-conquer
* Complement Reduction: sieves
* GS-CSP (R = recursively partitioned set of maps)
* Problem Reduction Generators: dynamic programming, branch-and-bound, game tree search
* Network Flow: specialized simplex, Ford-Fulkerson
* Local Poset Structure (R = set + partial order)
* Monotone Deflationary Function: fixed point iteration
* Local Semilattice Structure (R = semilattice)
* Transportation: NW algorithm
* GS-Horn-CSP (Horn-like Constraints): constraint propagation
* Assignment Problem: Hungarian method

Garbage Collection Spec Derivation Step
[Figure: colimit diagram. Fixpoint Problem maps into Garbage Collection Spec via GC_as_fixpoint and into Fixpoint Iteration Algorithm Scheme + correctness proof; the colimit is Collector with Iteration-based Tracing.]
Notes: The pushout operation composes the algorithm theory and the problem spec in Collector to get an initial algorithm design; simultaneously, we instantiate the abstract proof to get a concrete proof in Collector theory.
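As a rough illustration of what "iteration-based tracing" computes, the live set can be read as the least fixpoint of F(S) = roots ∪ successors(S). The sketch below is a generic workset iteration, not the derived collector from the talk; the function names and the dictionary-based heap representation are assumptions.

```python
def live_nodes(roots, successors):
    """Least fixpoint of  F(S) = roots ∪ successors(S), by workset iteration.

    `successors` maps each node to an iterable of the nodes it points to.
    The workset holds nodes known to be live whose successors have not
    been visited yet; the loop stops when no new live node can be added.
    """
    live = set()
    workset = set(roots)
    while workset:
        node = workset.pop()
        if node in live:
            continue
        live.add(node)
        workset.update(t for t in successors.get(node, ()) if t not in live)
    return live


def garbage(all_nodes, roots, successors):
    """Nodes that a collector may reclaim: everything not reachable from the roots."""
    return set(all_nodes) - live_nodes(roots, successors)
```

On a heap such as `{"a": ["b"], "b": ["a", "c"], "d": ["c"]}` with root `"a"`, the live set is `{"a", "b", "c"}` and `"d"` is garbage even though it points at a live node.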

Refinement Sequence for a Concurrent Mark & Sweep Garbage Collector
C1. Algorithm Design
C2. Simplification
OM1. Observer Maintenance: WS
Mem. rename {Heap +-> Memory}
OR0. Observer Refinement of payload
OR1. Observer Refinement: tgt → tgtIM
OR1a. Observer Refinement: outNodes → outNodesIM
OR2. Observer Refinement: roots → rootsL
OM2. Observer Maintenance: rootCount
OR3. Observer Refinement: nodes → nodesPair
Mut1. Import random mutator
Mut2. Simplify
OR4. Observer Refinement: supply → supplyL
OM3. Observer Maintenance: supplyCount
OR5. Observer Refinement: black → blackCM
OR6. Observer Refinement: WS → WL → WStack
Cot1. FinalizeCoType Memory
Cot2. Define initBlackCM, …
Iso1. Type Isomorphism: Memory → Memory'
DTR1. DataType Refinement: Maps → Vectors
DTR2. DataType Refinement: Stacks → Vectors
DTR3. DataType Refinement: Sets → Lists
G1. Globalize Memory
D. Simplifications
Cgen. Code Generation
Notes: ~25-30 transformations; of these, all should eventually co-generate proofs, except code generation.

Mark&Sweep Collector
C1. Algorithm Design: fixpoint iteration
C2. Simplification
OM1. Observer Maintenance: WS
OR0. Observer Refinement of payload
OR1. Observer Refinement: tgt
OR1a. Observer Refinement: outNodes
OR2. Observer Refinement: roots
OM2. Observer Maintenance: rootCount
OR3. Observer Refinement: nodes
Mut1. Import random mutator
Mut2. Simplify
OR4. Observer Refinement: supply
OM3. Observer Maintenance: supplyCount
OR5. Observer Refinement: black
OR6. Observer Refinement: WS → WStack
Cot1. FinalizeCoType Memory
Cot2. Define initBlackCM, …
Iso1. Type Isomorphism: Memory
DTR1. DataType Refinement: Maps
DTR2. DataType Refinement: Stacks
DTR3. DataType Refinement: Sets
G1. Globalize Memory
D. Simplifications
Cgen. Code Generation
Cheney Copying Collector
C1. Algorithm Design: fixpoint iteration
C2. Simplification
OM1. Observer Maintenance: WS
IM1. Maintain Isomorphism: graphIso
OR0. Observer Refinement of payload
OR1. Observer Refinement: tgt
OR1a. Observer Refinement: outNodes
OR2. Observer Refinement: roots
OM2. Observer Maintenance: rootCount
OR3. Observer Refinement: nodes
Mut1. Import random mutator
Mut2. Simplify
OR4. Observer Refinement: supply
OR6. Observer Refinement: WS → WStack
Cot1. FinalizeCoType Memory
Iso1. Type Isomorphism: Memory
DTR1. DataType Refinement: Maps
DTR2. DataType Refinement: Stacks
DTR3. DataType Refinement: Sets
G1. Globalize Memory
D. Simplifications
Cgen. Code Generation
Notes: ~25-30 transformations; 25 xforms shown: 1 new, 3 deleted, 15 unchanged, 6 modified.
Legend: unchanged, modified, added, deleted

Design by Instantiation
Translate requirements into effective designs by selecting an instance of a class of designs.
* refinement: a formal specification of required properties over-approximates the desired behavior; refinement selects a design instance from the class of designs
* machine learning: examples/training-data under-approximate the desired behavior
Differences:
* correctness vs handling noisy data
* knowledge-rich vs knowledge-impoverished, except for the conjecture of the abstract form of the solution

Extras

Problem Solving Structure * Solution space Candidate solutions Feasible solutions * Local structure = solution space + binary relation * Global structure = solution space + recursive partition

Global Search Theory
GlobalSearchTheory = spec
  type D                                input type
  type R                                output type
  op O : D * R → Boolean                input/output predicate
  op mkInitial : D → R
  op ⊑ : R * R → Boolean
  op Split : D * R * R → Boolean
  op Subspaces : D * R → List R
  op Extract : D * R → Option R
  axiom R, ⊔, ⊑ is a semilattice
  axiom fa(x:D, z:R) mkInitial(x) ⊑ z
  axiom fa(x:D, r:R, z:R) r ⊑ z = (Extract(x,r)=z ∨ ex (s:R) (Split(x,r,s) & s ⊑ z))
  axiom fa(x:D, r:R, s:R) Split(x,r,s) = member(s, Subspaces(x,r))
end-spec

GS Scheme
theorem: fa(x:D) O(x, f(x)), provable from the GS axioms
f (x:D) =
  case propagate(x, mkInitial(x)) of
  | none → none
  | some r → GS(x,r)
GS(x:D, r:Rhat | Phi(x,r)) : option R =
  case extract(x,r) of
  | some z → some z
  | none → GSAux(x, Subspaces(x,r))
GSAux(x:D, rs:List Rhat | fa(r:R) r∈rs ⇒ Phi(x,r)) : Option R =
  case rs of
  | nil → none
  | hd::tl → case propagate(x, hd) of
             | none → GSAux(x,tl)
             | some r → case GS(x,r) of ...
Notes: The scheme includes schematic loop invariants. Lead-in to next slide: this provides a simple model of refinement, but as it stands it doesn't scale up. We have extended it for the purposes of scaling this approach up. First, no spec is unstructured. What is the effect of structuring on this refinement approach?
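A possible Python reading of the scheme, instantiated for n-queens, is sketched below. Several things are assumptions rather than content from the slide: `None` stands in for `none`, the recursion of `GSAux` over the list is written as a loop, the final case of `GSAux` (not shown in the transcript) is taken to return the solution when `GS` finds one and to continue with the remaining subspaces otherwise, and the n-queens operators are a made-up instance in which a subspace is a partial placement of queens.

```python
def global_search(x, mk_initial, propagate, extract, subspaces):
    """Generic global search following the shape of f / GS / GSAux."""

    def gs(r):
        z = extract(x, r)
        if z is not None:
            return z
        return gs_aux(subspaces(x, r))

    def gs_aux(rs):
        for hd in rs:
            r = propagate(x, hd)
            if r is None:
                continue          # subspace pruned: no feasible solutions
            z = gs(r)
            if z is not None:     # assumed completion of the truncated case
                return z
        return None

    r0 = propagate(x, mk_initial(x))
    return None if r0 is None else gs(r0)


# Hypothetical instance: n-queens. A subspace descriptor is a tuple giving
# the column of the queen in each of the first len(r) rows.
def consistent(r):
    return all(r[i] != r[j] and abs(r[i] - r[j]) != j - i
               for i in range(len(r)) for j in range(i + 1, len(r)))

def queens(n):
    return global_search(
        n,
        mk_initial=lambda n: (),
        propagate=lambda n, r: r if consistent(r) else None,
        extract=lambda n, r: r if len(r) == n else None,
        subspaces=lambda n, r: [r + (c,) for c in range(n)],
    )
```

Here `propagate` only rejects inconsistent placements; a stronger instance would also tighten the descriptor, which is where constraint propagation enters the scheme.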

Problem Solving Structure * Solution space Candidate solutions Feasible solution * Local structure = solution space + binary relation * Global structure = solution space + recursive partition Where does the global or local structure come from? Inductive vs coinductive structure of the type

Global Search Problem Solving
[Figure: the space of candidate solutions with the feasible solutions inside it; split partitions the space into subspaces; cutting constraints prune off any subspace that contains no feasible solutions.]
Iterated cutting = constraint propagation: the fixpoint of the cutting process.
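The idea that constraint propagation is the fixpoint of repeated cutting can be sketched over finite domains. This is a generic arc-consistency style loop, not the propagation operator derived in the talk; the representation (a dictionary of value sets and binary constraints given as `(a, b, ok)` triples over two distinct variables) is an assumption made for the example.

```python
def propagate(domains, constraints):
    """Iterate cutting until nothing changes.

    domains: dict mapping a variable name to a set of candidate values.
    constraints: list of (a, b, ok) with ok(value_of_a, value_of_b) -> bool,
                 where a and b are distinct variables.
    Returns the reduced domains, or None if some domain becomes empty,
    meaning the subspace contains no feasible solutions.
    """
    domains = {v: set(d) for v, d in domains.items()}
    changed = True
    while changed:
        changed = False
        for a, b, ok in constraints:
            # Cut values of a that have no supporting value of b, and vice versa.
            keep_a = {va for va in domains[a]
                      if any(ok(va, vb) for vb in domains[b])}
            keep_b = {vb for vb in domains[b]
                      if any(ok(va, vb) for va in keep_a)}
            if keep_a != domains[a] or keep_b != domains[b]:
                domains[a], domains[b] = keep_a, keep_b
                changed = True
            if not keep_a or not keep_b:
                return None
    return domains
```

With domains `{1, 2, 3}` for `x`, `y`, `z` and the constraints `x < y` and `y < z`, repeated cutting reaches the fixpoint `x = {1}`, `y = {2}`, `z = {3}` without any splitting.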