fun with object modelling
Daniel Jackson
Software Design Group
MIT Laboratory for Computer Science
Kansas State University · November 8, 1999
2 only code matters
  in most developments
    design = code sketch
    progress = lines of code
    product = code
    no specification
  what’s lost
    clarity of purpose
    separation of concerns
    radical design options
  what goes wrong
    throw code away
    changes are hard
    reuse is difficult
  Plan to throw one away. You will anyhow.
3 how did we get in this mess?
  code is real
    can’t deliver spec alone
  code is easier than spec/design
    to do, to evaluate, to teach
  no other way to talk about software
    spec/design languages heavy, mathematical, inert
4 lightweight & electric models
  object modelling
    captures structural properties of domain, behaviour, or implementation
    global viewpoint – not OO!
  Alloy: A Lightweight OM Notation
    declarative, partial, abstract
    simple and rigorous semantics
    graphical & textual
  Alcoa: Alloy Constraint Analyzer
    fully automatic, interactive
    animates models: sampling & checking
    concrete analysis of abstract model
  Plan to throw one away. You will anyhow. So make it a model, not code
5 a happier future?
  OM extraction
    Lackwit, Ajax, Womble
  OM analysis
    Nitpick, Alcoa
  [diagram labels: CODE, SPEC FRAGMENT, ANALYZE, DESIGN FRAGMENT, EXTRACT]
6 rest of talk
  CTAS
    a new air-traffic control system
    the aircraft assignment problem
  Alloy
    graphical & textual notations
    examples
  Alcoa
    a demonstration
    how it works
  summary
    experience with Alloy & Alcoa
    research challenges
    related work
7 CTAS
  what it is
    Center/TRACON Automation System
    suite of tools to manage traffic at big airports
    FAA’s choice for terminal area automation (1991)
    deployed in five US airports (1998)
    NASA Software of the Year (1998)
  what TMA tool does
    takes radar input, flight plans, aircraft models, weather, etc
    gives delay advisories
8 Communications Manager
  what it is
    central component of CTAS
    maintains aircraft database
    acts as message switch
    about 80kloc of C
  why we chose it
    CM is a bottleneck
    CM’s design has degraded
9 aircraft allocation problem
  aircraft states
    active: radar tracks received
    assigned: to a route analyzer
    waiting: to be assigned
  process states
    connected: doing analysis
  challenge
    aircraft arrive, become active/inactive
    processes fail and come back
    how to allocate aircraft to analyzers?
  what kind of problem?
    not algorithmic!
  challenges
    simple, flexible mechanism
    easy to argue that it’s right
10 alloy basic concepts (ABC)
  state declaration
    sets of atoms
    subset relationships
    relations between sets
  multiplicity
    ?  zero/one
    !  exactly one
    +  one or more
  [diagram: S, T]         T is a subset of S
  [diagram: S, T, U]      T, U disjoint subsets of S
  [diagram: S, T, m R n]  R is relation from S to T
                          each S maps to n T’s, m S’s map to each T
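The multiplicity markings can be read as a check on a concrete relation. A minimal Python sketch, assuming a relation is stored as a set of (s, t) pairs; the function names and the sample atoms are illustrative and not part of Alloy:

```python
# Illustrative check of the multiplicity markings on a concrete relation.
# Assumption: a relation is a Python set of (s, t) pairs; all names are hypothetical.

def count_ok(count, mark):
    """True if count satisfies the marking: '?' zero/one, '!' exactly one, '+' one or more."""
    if mark == "?":
        return count <= 1
    if mark == "!":
        return count == 1
    if mark == "+":
        return count >= 1
    raise ValueError(f"unknown multiplicity marking: {mark!r}")


def satisfies(rel, S, T, m, n):
    """Each atom of S must map to n atoms of T; m atoms of S must map to each atom of T."""
    targets_ok = all(count_ok(sum(1 for (s, _) in rel if s == a), n) for a in S)
    sources_ok = all(count_ok(sum(1 for (_, t) in rel if t == b), m) for b in T)
    return targets_ok and sources_ok


one_to_one = {("s1", "t1"), ("s2", "t2")}
shared_target = {("s1", "t1"), ("s2", "t1")}

print(satisfies(one_to_one, {"s1", "s2"}, {"t1", "t2"}, "!", "!"))
print(satisfies(shared_target, {"s1", "s2"}, {"t1", "t2"}, "?", "!"))
```

The second call fails the "?" marking because two atoms of S map to t1.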
11 description vs. instance
  object model
    describes set of configurations
    each has a value for each set and relation
  an odd instance
    my family
    [diagram: atoms tim, daniel, emily, claudia, judy; relations wife, mum]
12 textual constraints
  expressions
    all exprs denote sets
    e1 + e2    union
    e1 & e2    intersection
    e.r        navigation
    e.+r       transitive closure
  formulas
    e1 in e2   subset
    e1 = e2    equality
    all v: S | F    F true when any atom in S substituted for v
  no incest
    all p: Person | p.wife.mum != p.mum
  nobody’s her own (grand)mother
    no p: Woman | p in p.+mum
  our family
    some daniel, tim: Person |
      daniel.mum = tim.mum && daniel.wife.mum = tim.wife.mum
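How navigation and transitive closure evaluate can be sketched over one concrete instance. This is a minimal Python sketch, not Alcoa's evaluator. It assumes a relation is a set of (source, target) pairs. The edges below and the atom "rose" are hypothetical, since the family diagram's edges are not recoverable from the text. The closure check ranges over all persons because no Woman set is defined here, and the "our family" formula is evaluated for fixed atoms rather than with `some`.

```python
# Sketch of e.r and e.+r over one concrete instance.
# Assumptions: relations are sets of (source, target) pairs; the edges and the
# atom "rose" are hypothetical, chosen only to give the formulas something to act on.

def nav(atoms, rel):
    """e.r: atoms reached from `atoms` by one step of `rel`."""
    return {t for (s, t) in rel if s in atoms}


def closure_nav(atoms, rel):
    """e.+r: atoms reached from `atoms` by one or more steps of `rel`."""
    reached = set()
    frontier = nav(atoms, rel)
    while frontier - reached:          # stop once a step adds nothing new
        reached |= frontier
        frontier = nav(frontier, rel)
    return reached


Person = {"tim", "daniel", "emily", "claudia", "judy", "rose"}
wife = {("daniel", "claudia"), ("tim", "emily")}
mum = {("daniel", "judy"), ("tim", "judy"), ("claudia", "rose"), ("emily", "rose")}

# no p | p in p.+mum
no_own_ancestor = all(p not in closure_nav({p}, mum) for p in Person)

# daniel.mum = tim.mum && daniel.wife.mum = tim.wife.mum
our_family = (nav({"daniel"}, mum) == nav({"tim"}, mum)
              and nav(nav({"daniel"}, wife), mum) == nav(nav({"tim"}, wife), mum))

print(no_own_ancestor, our_family)
```

Every expression denotes a set, so `daniel.wife.mum` is just two `nav` steps starting from the singleton `{"daniel"}`.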
13 object model: state & invariants
  inv NoGhosts { Active in (Waiting + Assigned) }
  inv LiveAnalyzers { Assigned.analyzer in (RouteAnalyzer & Connected) }
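The two invariants can be read as predicates over a concrete state. A minimal Python sketch, assuming a state is a dict of sets plus an `analyzer` relation stored as (aircraft, process) pairs; the atoms a1, a2, p1, p2 and the dict layout are hypothetical:

```python
# Sketch: the two invariants as predicates over one concrete state.
# Assumptions: state layout and atom names are hypothetical, not taken from CTAS.

def no_ghosts(st):
    """inv NoGhosts { Active in (Waiting + Assigned) }"""
    return st["Active"] <= (st["Waiting"] | st["Assigned"])


def live_analyzers(st):
    """inv LiveAnalyzers { Assigned.analyzer in (RouteAnalyzer & Connected) }"""
    used = {p for (a, p) in st["analyzer"] if a in st["Assigned"]}
    return used <= (st["RouteAnalyzer"] & st["Connected"])


good = {
    "Active": {"a1", "a2"},
    "Waiting": {"a2"},
    "Assigned": {"a1"},
    "RouteAnalyzer": {"p1", "p2"},
    "Connected": {"p1"},
    "analyzer": {("a1", "p1")},
}
# Same state after p1 drops out: a1 is still assigned to a dead analyzer.
bad = dict(good, Connected=set())

print(no_ghosts(good), live_analyzers(good), live_analyzers(bad))
```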
14 object model: operation (1)
  implicit version
    doesn’t specify how aircraft are reassigned
    invariants are included implicitly
  op LoseAnalyzer (z: Process) {
    no z.load'
    Connected' = Connected - z
    Active = Active'
    Waiting = Waiting'
    Assigned = Assigned'
    }
15 object model: operation (2)
  explicit version A
    load is moved to another process
  op LoseAnalyzerA (z: Process) {
    LoseAnalyzer (z)
    ReassignLoadA (z)
    }
  op ReassignLoadA (z: Process) {
    some y: (RouteAnalyzer & Connected) - z |
      y.load' = y.load + z.load
      && (all p: RouteAnalyzer - (y + z) | p.load = p.load')
    }
16 object model: operation (3)
  explicit version B
    load is spread arbitrarily amongst other processes
  op LoseAnalyzerB (z: Process) {
    LoseAnalyzer (z)
    ReassignLoadB (z)
    }
  op ReassignLoadB (z: Process) {
    all a: Aircraft - (z.load) | a.analyzer = a.analyzer'
    z.load.analyzer' in (RouteAnalyzer & Connected')
    }
17 object model: analyses
  questions for Alcoa
    show me
      a sample of the state
      an execution of each variant of LoseAnalyzer
      an execution that leaves an analyzer with no load
      the difference between the versions
    do both versions preserve the invariant?
  note
    specifications are declarative
    don’t say how state is updated
    lets you be partial & abstract
    but makes analysis hard!
18 why it’s hard (1)
  what we’d like Alcoa to do
    given a formula
    find a solution or show there aren’t any
    solution is state or transition
    finding a solution = sampling or refuting
  theory says no!
    Alloy is undecidable (because of relations)
    so no decision procedure exists
  practice says yes!
    more important to find bugs than to show there aren’t any
    only consider instances in scope
    now a finite search: decidable
  [diagram: FORMULA -> ALCOA -> SOLUTION / NONE]
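Search within a scope can be sketched as plain enumeration. The Python below is a brute-force stand-in, not how Alcoa works. It tries every binary relation over k atoms and looks for a counterexample to a claim. The claim itself is hypothetical, chosen for illustration: "if no p is in p.mum, then no p is in p.+mum".

```python
# Sketch: refuting a claim by exhaustive search within a scope.
# Assumptions: brute force stands in for the analyzer; the claim is hypothetical.
from itertools import product


def all_relations(atoms):
    """Yield every binary relation over `atoms`: 2^(k^2) of them for k atoms."""
    pairs = [(a, b) for a in atoms for b in atoms]
    for bits in product([False, True], repeat=len(pairs)):
        yield {pair for pair, bit in zip(pairs, bits) if bit}


def reachable(start, rel):
    """start.+rel: atoms reached from `start` in one or more steps."""
    reached = set()
    frontier = {t for (s, t) in rel if s == start}
    while frontier - reached:
        reached |= frontier
        frontier = {t for (s, t) in rel if s in frontier}
    return reached


def claim_holds(atoms, mum):
    """If no p is in p.mum, then no p is in p.+mum."""
    premise = all((p, p) not in mum for p in atoms)
    conclusion = all(p not in reachable(p, mum) for p in atoms)
    return (not premise) or conclusion


def find_counterexample(scope):
    """Return a relation refuting the claim within `scope` atoms, or None."""
    atoms = [f"p{i}" for i in range(scope)]
    for mum in all_relations(atoms):
        if not claim_holds(atoms, mum):
            return mum
    return None


print(find_counterexample(1))
print(find_counterexample(2))
```

A scope of 1 finds nothing; a scope of 2 already yields a two-atom cycle. A result of None only means no counterexample exists in that scope, not that the claim is valid.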
19 small scope hypothesis
  an empirical hypothesis
    most invalid claims can be refuted by small counterexamples
  [plot: cumulative invalid assertions against smallest revealing scope; marked values 3 and 90%; regions labelled miss and catch]
20 why it’s hard (2)
  even in finite scope
    huge space of configurations
    add a relation in scope of k: increase by 2^(k^2)
    for transition, 2x components
  why search is needed
    language is declarative
    no recipe for after-states
  example: allocation problem
    7 sets, 1 relation in state
    for operation, 14 and 2
    scope of 3: 2^(14 x 3 + 2 x 9) = 2^60
    at 1M/sec: 10^12 secs ~ 300 centuries *
  * 1 nanocentury ≈ π seconds
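The arithmetic on this slide can be checked directly. A small Python sketch, assuming one bit per atom for each set, one bit per pair of atoms for each relation, and a 365.25-day year; the function name is illustrative:

```python
# Sketch: size of the search space for the allocation example.
# Assumptions: k bits per set, k*k bits per relation, 365.25-day year.

def state_bits(num_sets, num_relations, scope):
    """Boolean variables needed to describe one configuration in the given scope."""
    return num_sets * scope + num_relations * scope ** 2


bits = state_bits(14, 2, 3)            # 14*3 + 2*9
candidates = 2 ** bits                 # configurations to examine
seconds = candidates / 1_000_000       # at one million per second
years = seconds / (365.25 * 24 * 3600)

print(bits, f"{seconds:.2e} s", f"{years:,.0f} years")
```

With these numbers the exhaustive search needs about 10^12 seconds, which is tens of thousands of years.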
21 translating to SAT
  what you learned in CS
    SAT: 1st problem shown NP-c
    to show a problem is hard, reduce SAT to it
  what we know now
    SAT is usually easy
    to show a problem is easy, reduce it to SAT
  scheme
    given a design problem D
    construct SAT problem S, mapping M
    S has solution s => D has solution M(s)
  [diagram labels: DESIGN PROBLEM, reduce, SAT PROBLEM, SAT solver, SAT SOLUTION, MAPPING, map back, DESIGN ANALYSIS]
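The scheme can be sketched end to end on a toy problem. This Python is illustrative only and is not Alcoa's translation. The design problem D is hypothetical: find a relation from S to T in which each S maps to exactly one T and at most one S maps to each T. It is encoded as CNF clauses over one boolean variable per (s, t) pair, solved by a brute-force loop standing in for an off-the-shelf SAT solver, and mapped back.

```python
# Sketch of the reduce / solve / map-back scheme on a toy problem.
# Assumptions: the design problem, the encoding, and the brute-force "solver"
# are all illustrative; a real tool would call an off-the-shelf SAT solver.
from itertools import combinations, product


def encode(S, T):
    """Build CNF clauses (signed ints, DIMACS style) and the variable table."""
    var = {pair: i + 1 for i, pair in enumerate((s, t) for s in S for t in T)}
    clauses = []
    for s in S:
        clauses.append([var[s, t] for t in T])              # s maps to at least one t
        for t1, t2 in combinations(T, 2):
            clauses.append([-var[s, t1], -var[s, t2]])      # and to at most one
    for t in T:
        for s1, s2 in combinations(S, 2):
            clauses.append([-var[s1, t], -var[s2, t]])      # at most one s maps to t
    return clauses, var


def brute_force_sat(clauses, num_vars):
    """Try every assignment; return the first that satisfies all clauses, else None."""
    for bits in product([False, True], repeat=num_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return bits
    return None


def map_back(solution, var):
    """The mapping M: turn a boolean assignment back into a relation."""
    return {pair for pair, i in var.items() if solution[i - 1]}


def solve(S, T):
    clauses, var = encode(S, T)
    solution = brute_force_sat(clauses, len(var))
    return None if solution is None else map_back(solution, var)


print(solve(["s1", "s2"], ["t1", "t2"]))
print(solve(["s1", "s2", "s3"], ["t1", "t2"]))
```

The second call has no solution (three sources cannot map one-to-one into two targets), so the solver reports None and nothing is mapped back.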
22 architecture of Alcoa
  front end
    parse & syntax check
    type inference
    inline formulas
    convert to kernel
  middle end
    translate to bool
    scope-dependent
  backend
    exploit off-the-shelf solvers
    Alcoa treats as black box
    configured by generic mechanism
  [diagram labels: ALLOY MODEL, TRANSLATE, KERNEL, SCOPE, BOOL, SOLVE, STATE, TRANS]
23 Alcoa performance
  works well for problems with
    many solutions: eg, sampling
    few solutions: eg, finding bug
  complexity grows with
    # state components
    size of formula
  rule of thumb
    solvers bogged down for
      > 2000 boolean variables
      > 200 state space bits
  example: file system
    relations: naming, dir, links
    sets: file, dir, name
    36 (90) bits for scope of 3 (5)
  [plot axes: cost, P(solvable)]
24 ongoing work
  case studies
    CTAS: checking design conformance of Java version
    CAP: designing CTAS extension
    K42: object invariants in new IBM OS
    AT&T: looking at new conferencing/mobile feature
  analyzer performance
    larger models & scopes, faster analysis
  language development
    sequencing of operations
    more powerful compositions
  code analysis
    extracting Alloy models and analyzing with Alcoa
25 related work
  object modelling notations
    OCL expression language (Warmer et al)
    pUML group working on semantics
    semantic data models (eg, Hammer & McLeod, 1981)
  interacting state machines
    extract automata, apply model checking
    Bandera project (KSU, UMass, Hawaii)
    focus on concurrency & event sequencing
  design patterns
    fragmentary design idioms for OO programs
    pictorial representation of code
  software architecture
    protocols connecting components
    topological constraints nicely expressed in Alloy
26 conclusion
  what makes it useful
    addresses complexity of structure
    declarative & partial
    big code/model ratio
  what makes it possible
    lightweight, tractable notation
    small scope hypothesis
    show existence, not absence
  help from others
    Craig Damon, Somesh Jha, Ilya Shlyakhter, Ian Schechter
    SAT solvers
    Moore: , 3 hrs -> 1 sec
  Alcoa! Free while supplies last!