1
Knowledge Modules in Software Synthesis
Douglas R. Smith
Kestrel Institute
Palo Alto, California
Note: a word about Kestrel Institute.
2
Knowledge Modules in Software Synthesis
* What is software synthesis?
* Structuring specifications
* Structuring design knowledge
* Structuring derivations
Note: expert designers know and exploit lots of design patterns.
3
Design by Instantiation
Translate requirements into an effective software design by constructing an instance of a class of designs.
* Refinement: a formal specification of required properties over-approximates the desired behavior and is refined to a design instance of a class of designs.
* Machine learning: examples/training data under-approximate the desired behavior and are fitted to a design instance of a class of designs.
Note: formal deductive approaches versus inductive and statistical model-fitting approaches. What they have in common is that the user conjectures the abstract form of the solution.
4
Classes of Designs
Common design patterns for formal refinement:
* optimization transformations
* datatype refinements
* algorithm theories (divide-and-conquer, global search, fixpoint iteration, …)
* system patterns (model-view-controller, tracking, control systems, ...)
* communication patterns (transport, publish-subscribe, mailbox, ...)
* ad-hoc patterns: sketches, templates
Common design patterns for machine learning:
* linear regression
* decision trees
* neural networks
5
Generating Correct-by-Construction Software by Refinement
Refinement chain:
* requirement
* specification0
* transformation1, with proof of correct refinement spec0 → spec1
* specification1
* transformation2, with proof of correct refinement spec1 → spec2
* specification2
* transformation3, with proof of correct refinement spec2 → spec3
* specification3
* …
* code
metaprogram = transformation1 ; transformation2 ; transformation3 ; …
Why correctness proofs?
* reduce attack surface through reduced code vulnerabilities
* generates proof-carrying code
* focused locality for software evolution
* marginal cost of proof generation is near zero
Notes: What's manual vs automatic? Emphasize reuse across application domains. Our techniques were directly used in HACMS. What is the connection between GC, protocols, SAT solvers, …? It's the reusable transformations.
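The metaprogram on this slide is a sequential composition of transformations, each paired with a refinement proof. A minimal Python sketch of that shape follows. It assumes a toy model in which a specification is just the set of outputs it permits and "refinement" means that set shrinks without becoming empty. The names `Spec`, `refines`, `run_metaprogram`, `keep_even`, and `keep_smallest` are illustrative and are not taken from the talk or from any Kestrel tool.

```python
from typing import Callable, FrozenSet, List, Tuple

Spec = FrozenSet[int]                    # toy model: a spec is the set of outputs it allows
Transformation = Callable[[Spec], Spec]


def refines(concrete: Spec, abstract: Spec) -> bool:
    """Toy refinement check: the concrete spec allows nothing the abstract
    spec forbids, and it still allows at least one behavior."""
    return bool(concrete) and concrete <= abstract


def run_metaprogram(spec0: Spec, steps: List[Transformation]) -> Tuple[Spec, List[str]]:
    """Apply transformation1 ; transformation2 ; ... and record one
    (toy) refinement certificate per step."""
    spec, log = spec0, []
    for i, step in enumerate(steps, start=1):
        refined = step(spec)
        if not refines(refined, spec):
            raise ValueError(f"transformation{i} is not a refinement")
        log.append(f"spec{i - 1} -> spec{i}: refinement checked")
        spec = refined
    return spec, log


def keep_even(s: Spec) -> Spec:          # hypothetical transformation1
    return frozenset(v for v in s if v % 2 == 0)


def keep_smallest(s: Spec) -> Spec:      # hypothetical transformation2
    return frozenset({min(s)})


spec0 = frozenset(range(1, 11))          # requirement: any output from 1 to 10
final_spec, proof_log = run_metaprogram(spec0, [keep_even, keep_smallest])
```

In this sketch the check is re-run at every step; in the approach described on the slide, each transformation instead emits a proof that the step is a correct refinement.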
6
Synthesis Applications
* planning and scheduling
* SAT solvers and other constraint solvers
* family of garbage collectors (one concurrent)
* family of secure transport-level protocol codes
* lots of sorting, searching, combinatorial algorithms
7
Deriving Common Garbage Collection Algorithms
Collector Spec
* maintain count of predecessors: Reference Count Collectors
* trace the set of live nodes: Tracing Collectors
  * partitioned memory model: Copying Collectors
  * monolithic memory model: Marking Collectors
Further collector families shown in the tree: Generational Collectors, Mark & Compact, Mark & Sweep
Notes: Talk about RC collectors first; they are the simplest to derive, but have not found use in commercial systems. Mark & Sweep: John McCarthy, CACM, April 1960. Reference counting: George Collins, CACM, Dec 1960. Copying: Minsky, 1963, Project MAC memo.
8
Deriving a Family of Transport-level Codes
Derivation pipeline (each stage emits proofs):
* Pub-Sub Reqt
* ~14 xforms → proofs
* concurrent FlipFlop Buffers
* ~10 xforms → proofs
* low-level RADL transport spec
* ~63 xforms → proofs
* three variants: Intra-VM Comm, Inter-VM Comm, IP-based Comm
* C generator → proofs (one per variant)
* C Code (one per variant)
Firsts:
* pure reqt specs expressed logically, vs operationally
* generation of refinements via a heterogeneous collection of transformations, addressing protocol design schemes, algorithm theories, optimizations, datatype refinements, …
* emission of proofs from transformations
* generating a family tree of codes
* use of a generator of certified C
Summary: first generation of a software family with top-to-bottom proofs from reqt-level specifications; about 90 transformations, several hundred proof obligations discharged.
9
Knowledge Modules in Software Synthesis
* What is software synthesis?
* Structuring specifications
* Structuring design knowledge
* Structuring derivations
10
Structuring a Specification via Colimits
Diagram elements:
* spec Triv: type E
* spec LINEAR ORDER: type L, op : L,L → Boolean, ...
* pushout (po) of LINEAR ORDER and GROUP: LINEAR ORDER + GROUP
* extend: LINEARLY ORDERED GROUP
* rename: QUANTITY
* spec TIME: type Time, op time-le, ... (morphism {L +-> Time, time-le})
* coproduct: TIME + QUANTITY
Notes:
* Actually, a linearly ordered group needs at least a monotonicity axiom, so the pushout alone isn't enough.
* Note the two copies of LINEAR-ORDER, which we explicitly do not identify: the coproduct takes the disjoint union of the two specs without sharing LINEAR-ORDER.
* Note the use of 1-SORT to identify the type in LINEAR-ORDER and GROUP.
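The difference between the pushout (shared part identified) and the coproduct (nothing shared) can be illustrated on bare symbol sets. The following Python sketch treats a spec as a list of symbol names only, ignoring types and axioms, and computes the pushout as the disjoint union with the images of the shared symbols merged. The GROUP symbols `G`, `plus`, `zero`, `neg` and the operator name `le` are assumptions for illustration; the slide does not give them.

```python
from typing import Dict, List, Tuple

Tagged = Tuple[str, str]   # (which spec, symbol name)


def pushout(shared: List[str], f: Dict[str, str], g: Dict[str, str],
            a_syms: List[str], b_syms: List[str]) -> List[List[Tagged]]:
    """Pushout of finite symbol sets  a_syms <-f- shared -g-> b_syms.

    Returns the equivalence classes of the disjoint union of a_syms and
    b_syms, where f(s) and g(s) are identified for every shared symbol s.
    """
    parent: Dict[Tagged, Tagged] = {("A", s): ("A", s) for s in a_syms}
    parent.update({("B", s): ("B", s) for s in b_syms})

    def find(x: Tagged) -> Tagged:
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for s in shared:
        ra, rb = find(("A", f[s])), find(("B", g[s]))
        if ra != rb:
            parent[rb] = ra

    classes: Dict[Tagged, set] = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())


linear_order = ["L", "le"]                     # "le" is a hypothetical operator name
group = ["G", "plus", "zero", "neg"]           # hypothetical GROUP signature

# Pushout over the one-sort spec: the carrier types L and G are identified.
po = pushout(["E"], {"E": "L"}, {"E": "G"}, linear_order, group)

# Coproduct: the same construction with nothing shared, so nothing is identified.
coproduct = pushout([], {}, {}, linear_order, group)
```

The pushout has five symbols (one merged carrier type plus four operators and the order), while the coproduct keeps all six apart, which is the behavior the notes describe for the two unshared copies of LINEAR-ORDER.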
11
type Reservation = Time*Task*Resource ...
Diagram elements:
* TIME + QUANTITY
* spec TASK: type Task ...
* spec RESOURCE: type Resource ...
* pushout (po): TASK-RESOURCE
* spec RESERVATION: type Reservation = Time*Task*Resource ...
* spec SCHEDULER: op Scheduler : Set(Task) * Set(Resource) → Set(Reservation) ...
Notes:
* The Task theory is actually a meet-semilattice, so this illustrates parameterization on arbitrary theories, as contrasted with polymorphism, which is parametric on simply a sort and its equality.
* We get 3 copies of SET. This allows us to refine them independently, but we lack the ability to get a shared polymorphic-style implementation without identifying the 3 specs.
12
Taxonomy of Algorithm Theories
Theories in the taxonomy, each with the algorithm classes shown next to it:
* Problem Theory (D|I → R|O): generate-and-test
* Constraint Satisfaction (R = set of maps)
* Global Structure (R = set + recursive partition): global search, binary search, backtrack, branch-and-bound
* Local Structure (R = set + relation): local search, hill climbing, simulated annealing, tabu search, genetic algorithms
* Problem Reduction Structure
* Divide-and-Conquer: divide-and-conquer
* Complement Reduction: sieves
* Problem Reduction Generators: dynamic programming, branch-and-bound, game tree search
* GS-CSP (R = recursively partitioned set of maps)
* GS-Horn-CSP (Horn-like Constraints): constraint propagation
* Linear Programming: simplex method, interior point, primal dual
* Integer Linear Programming: 0-1 methods
* Network Flow: specialized simplex, Ford-Fulkerson
* Transportation: NW algorithm
* Assignment Problem: Hungarian method
* Local Poset Structure (R = set + partial order)
* Monotone Deflationary Function: fixed point iteration
* Local Semilattice Structure (R = semilattice)
13
Garbage Collection Spec
Diagram: Derivation Step
* Fixpoint Problem
* GC_as_fixpoint
* Garbage Collection Spec
* Fixpoint Iteration Algorithm Scheme + correctness proof
* colimit: Collector with Iteration-based Tracing
Note: The pushout operation composes the algorithm theory and the problem spec in Collector to get an initial algorithm design; simultaneously, we instantiate the abstract proof to get a concrete proof in Collector theory.
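Viewing collection as a fixpoint problem can be made concrete: the live set is the least set that contains the roots and is closed under following references. The Python sketch below is a plain Kleene iteration under that assumption, on a small heap invented for illustration. It is not the derived collector from the talk, and the names `least_fixpoint`, `live_nodes`, and `heap` are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List

Node = str


def least_fixpoint(step: Callable[[FrozenSet[Node]], FrozenSet[Node]],
                   bottom: FrozenSet[Node] = frozenset()) -> FrozenSet[Node]:
    """Iterate a monotone function from the bottom element until nothing changes."""
    current = bottom
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt


def live_nodes(roots: Iterable[Node], succ: Dict[Node, List[Node]]) -> FrozenSet[Node]:
    """Live set = least fixpoint of  S |-> roots ∪ successors(S)."""
    root_set = frozenset(roots)

    def step(marked: FrozenSet[Node]) -> FrozenSet[Node]:
        return root_set | frozenset(m for n in marked for m in succ.get(n, ()))

    return least_fixpoint(step)


# Toy heap: a -> b -> c -> a is reachable from the root; d -> e is not.
heap = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["e"], "e": []}
live = live_nodes(["a"], heap)
garbage = set(heap) - live
```

Each round recomputes successors of the whole marked set, which is simple but wasteful; a practical tracer would only revisit newly marked nodes.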
14
Refinement Sequence for a Concurrent Mark & Sweep Garbage Collector
* C1. Algorithm Design
* C2. Simplification
* OM1. Observer Maintenance: WS
* Mem. rename {Heap +-> Memory}
* OR0. Observer Refinement of payload
* OR1. Observer Refinement: tgt → tgtIM
* OR1a. Observer Refinement: outNodes → outNodesIM
* OR2. Observer Refinement: roots → rootsL
* OM2. Observer Maintenance: rootCount
* OR3. Observer Refinement: nodes → nodesPair
* Mut1. Import random mutator
* Mut2. Simplify
* OR4. Observer Refinement: supply → supplyL
* OM3. Observer Maintenance: supplyCount
* OR5. Observer Refinement: black → blackCM
* OR6. Observer Refinement: WS → WL → WStack
* Cot1. FinalizeCoType Memory
* Cot2. Define initBlackCM, …
* Iso1. Type Isomorphism: Memory → Memory'
* DTR1. DataType Refinement: Maps → Vectors
* DTR2. DataType Refinement: Stacks → Vectors
* DTR3. DataType Refinement: Sets → Lists
* G. Globalize Memory
* D. Simplifications
* Cgen. Code Generation
Note: ~25-30 transformations; of these, all should eventually co-generate proofs, except code generation.
15
C1. Algorithm Design: fixpoint iteration C2. Simplification
Mark&Sweep Collector
* C1. Algorithm Design: fixpoint iteration
* C2. Simplification
* OM1. Observer Maintenance: WS
* OR0. Observer Refinement of payload
* OR1. Observer Refinement: tgt
* OR1a. Observer Refinement: outNodes
* OR2. Observer Refinement: roots
* OM2. Observer Maintenance: rootCount
* OR3. Observer Refinement: nodes
* Mut1. Import random mutator
* Mut2. Simplify
* OR4. Observer Refinement: supply
* OM3. Observer Maintenance: supplyCount
* OR5. Observer Refinement: black
* OR6. Observer Refinement: WS → WStack
* Cot1. FinalizeCoType Memory
* Cot2. Define initBlackCM, …
* Iso1. Type Isomorphism: Memory
* DTR1. DataType Refinement: Maps
* DTR2. DataType Refinement: Stacks
* DTR3. DataType Refinement: Sets
* G. Globalize Memory
* D. Simplifications
* Cgen. Code Generation
Cheney Copying Collector
* C1. Algorithm Design: fixpoint iteration
* C2. Simplification
* OM1. Observer Maintenance: WS
* IM1. Maintain Isomorphism: graphIso
* OR0. Observer Refinement of payload
* OR1. Observer Refinement: tgt
* OR1a. Observer Refinement: outNodes
* OR2. Observer Refinement: roots
* OM2. Observer Maintenance: rootCount
* OR3. Observer Refinement: nodes
* Mut1. Import random mutator
* Mut2. Simplify
* OR4. Observer Refinement: supply
* OR6. Observer Refinement: WS → WStack
* Cot1. FinalizeCoType Memory
* Iso1. Type Isomorphism: Memory
* DTR1. DataType Refinement: Maps
* DTR2. DataType Refinement: Stacks
* DTR3. DataType Refinement: Sets
* G. Globalize Memory
* D. Simplifications
* Cgen. Code Generation
Note: ~25-30 transformations; 25 xforms shown: 1 new, 3 deleted, 15 unchanged, 6 modified.
Legend: unchanged, modified, added, deleted
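The reuse counts on this slide can be checked from the two step lists alone. The short Python sketch below compares the step labels of the two derivations; it can recover which steps were added or deleted, but not which shared steps were modified, since that information is not in the labels.

```python
# Step labels copied from the two derivations listed on this slide.
mark_sweep = ["C1", "C2", "OM1", "OR0", "OR1", "OR1a", "OR2", "OM2", "OR3",
              "Mut1", "Mut2", "OR4", "OM3", "OR5", "OR6", "Cot1", "Cot2",
              "Iso1", "DTR1", "DTR2", "DTR3", "G", "D", "Cgen"]
cheney = ["C1", "C2", "OM1", "IM1", "OR0", "OR1", "OR1a", "OR2", "OM2", "OR3",
          "Mut1", "Mut2", "OR4", "OR6", "Cot1", "Iso1", "DTR1", "DTR2", "DTR3",
          "G", "D", "Cgen"]

added = [s for s in cheney if s not in mark_sweep]      # new in the Cheney derivation
deleted = [s for s in mark_sweep if s not in cheney]    # dropped from Mark&Sweep
shared = [s for s in mark_sweep if s in cheney]         # unchanged or modified
total_shown = len(set(mark_sweep) | set(cheney))
```

The result agrees with the slide's tally: one added step (IM1), three deleted steps (OM3, OR5, Cot2), and 21 shared steps, which the slide splits into 15 unchanged and 6 modified, for 25 transformations in total.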
16
Design by Instantiation
Translate requirements into effective designs by selecting an instance of a class of designs.
* Refinement: a formal specification of required properties over-approximates the desired behavior and is refined to a design instance of a class of designs.
* Machine learning: examples/training data under-approximate the desired behavior and are fitted to a design instance of a class of designs.
Differences:
* correctness vs handling noisy data
* knowledge-rich vs knowledge-impoverished, except for the conjecture of the abstract form of the solution
17
Extras
18
Problem Solving Structure
* Solution space
  * Candidate solutions
  * Feasible solutions
* Local structure = solution space + binary relation
* Global structure = solution space + recursive partition
19
Global Search Theory
GlobalSearchTheory = spec
  type D  (input type)
  type R  (output type)
  op O : D * R → Boolean  (input/output predicate)
  op mkInitial : D → R
  op ⊑ : R * R → Boolean
  op Split : D * R * R → Boolean
  op Subspaces : D * R → List R
  op Extract : D * R → Option R
  axiom R, ⊔, ⊑ is a semilattice
  axiom fa(x:D, z:R) mkInitial(x) ⊑ z
  axiom fa(x:D, r:R, z:R) r ⊑ z = (Extract(x,r)=z or ex(s:R) (Split(x,r,s) & s ⊑ z))
  axiom fa(x:D, r:R, s:R) Split(x,r,s) = member(s, Subspaces(x,r))
end-spec
20
GS Scheme
f (x:D) =
  case propagate(x, mkInitial(x)) of
    | none → none
    | some r → GS(x,r)
GS(x:D, r:Rhat | Phi(x,r)) : Option R =
  case extract(x,r) of
    | some z → some z
    | none → GSAux(x, Subspaces(x,r))
GSAux(x:D, rs:List Rhat | fa(r:R) r∈rs ⇒ Phi(x,r)) : Option R =
  case rs of
    | nil → none
    | hd::tl → case propagate(x, hd) of
        | none → GSAux(x,tl)
        | some r → case GS(x,r) of …
theorem: fa(x:D) O(x, f(x)) provable from GS axioms
Notes: The scheme includes schematic loop invariants. Lead-in to next slide: this provides a simple model of refinement, but as it stands it doesn't scale up. We have extended it for the purposes of scaling this approach up. First, no spec is unstructured… what is the effect of structuring on this refinement approach?
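To see the scheme run, it can be instantiated on a small problem. The Python sketch below follows the f / GS / GSAux structure above for subset sum over positive integers, where a subspace is a tuple of take-or-skip decisions for a prefix of the numbers. The problem choice and the helper names (`mk_initial`, `partial_sum`, and so on) are assumptions for illustration. The last case of GSAux is cut off in the slide text, so the behavior used here (return a solution if GS finds one, otherwise continue with the remaining subspaces) is likewise an assumption.

```python
from typing import List, Optional, Tuple

Problem = Tuple[Tuple[int, ...], int]   # (positive numbers, target sum)
Space = Tuple[bool, ...]                # take/skip decisions for a prefix of the numbers


def mk_initial(x: Problem) -> Space:
    return ()                           # no decisions made: the whole solution space


def partial_sum(x: Problem, r: Space) -> int:
    nums, _ = x
    return sum(n for n, take in zip(nums, r) if take)


def propagate(x: Problem, r: Space) -> Optional[Space]:
    """Prune: None when the subspace cannot contain a feasible solution."""
    return None if partial_sum(x, r) > x[1] else r


def extract(x: Problem, r: Space) -> Optional[Space]:
    """A fully decided subspace is a solution if it hits the target."""
    nums, target = x
    return r if len(r) == len(nums) and partial_sum(x, r) == target else None


def subspaces(x: Problem, r: Space) -> List[Space]:
    """Split on the next undecided number."""
    return [] if len(r) == len(x[0]) else [r + (True,), r + (False,)]


def gs(x: Problem, r: Space) -> Optional[Space]:
    z = extract(x, r)
    return z if z is not None else gs_aux(x, subspaces(x, r))


def gs_aux(x: Problem, rs: List[Space]) -> Optional[Space]:
    for hd in rs:                       # iterative form of the hd::tl recursion
        r = propagate(x, hd)
        if r is None:
            continue                    # pruned: move on to the tail
        z = gs(x, r)
        if z is not None:               # assumed completion of the truncated case
            return z
    return None


def f(x: Problem) -> Optional[Space]:
    r = propagate(x, mk_initial(x))
    return None if r is None else gs(x, r)


solution = f(((3, 5, 8, 2), 10))        # take 3, 5 and 2
no_solution = f(((4, 6), 5))
```

The pruning test is only sound because the numbers are assumed positive, so a partial sum above the target can never be repaired by later decisions.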
21
Problem Solving Structure
* Solution space
  * Candidate solutions
  * Feasible solutions
* Local structure = solution space + binary relation
* Global structure = solution space + recursive partition
Where does the global or local structure come from? Inductive vs coinductive structure of the type.
22
Global Search Problem Solving
Figure labels: candidate solutions, feasible solutions, split, cut, prune off subspace (contains no feasible solutions).
Cutting constraints: iterated cutting = constraint propagation, the fixpoint of the cutting process.
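Iterated cutting to a fixpoint can be sketched with interval domains. In the Python example below, each cut narrows the candidate intervals without removing any feasible solution, and the cuts are reapplied until nothing changes. The variables, the three constraints (x + 3 ≤ y, y ≤ 6, x ≥ 2), and all function names are invented for illustration and do not come from the talk.

```python
from typing import Callable, Dict, List, Tuple

Domains = Dict[str, Tuple[int, int]]    # variable -> (low, high) interval
Cut = Callable[[Domains], Domains]


def cut_x_plus_3_le_y(d: Domains) -> Domains:
    """Constraint x + 3 <= y: lower x's upper bound, raise y's lower bound."""
    (xl, xh), (yl, yh) = d["x"], d["y"]
    return {"x": (xl, min(xh, yh - 3)), "y": (max(yl, xl + 3), yh)}


def cut_y_le_6(d: Domains) -> Domains:
    yl, yh = d["y"]
    return {**d, "y": (yl, min(yh, 6))}


def cut_x_ge_2(d: Domains) -> Domains:
    xl, xh = d["x"]
    return {**d, "x": (max(xl, 2), xh)}


def propagate(d: Domains, cuts: List[Cut]) -> Domains:
    """Apply every cut repeatedly until a full round changes nothing."""
    while True:
        before = d
        for cut in cuts:
            d = cut(d)
        if d == before:
            return d


start = {"x": (0, 10), "y": (0, 10)}
result = propagate(start, [cut_x_plus_3_le_y, cut_y_le_6, cut_x_ge_2])
```

Starting from 0..10 for both variables, the fixpoint is x in 2..3 and y in 5..6. The sketch does not detect empty intervals; a fuller version would report a subspace with low > high as containing no feasible solutions.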