CSE503: SOFTWARE ENGINEERING COMPLEXITY, PROVING ADTS David Notkin Spring 2011.

Slides:



Advertisements
Similar presentations
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
Advertisements

Linear Lists – Array Representation
Identity and Equality Based on material by Michael Ernst, University of Washington.
Composition CMSC 202. Code Reuse Effective software development relies on reusing existing code. Code reuse must be more than just copying code and changing.
Annoucements  Next labs 9 and 10 are paired for everyone. So don’t miss the lab.  There is a review session for the quiz on Monday, November 4, at 8:00.
David Evans CS201J: Engineering Software University of Virginia Computer Science Lecture 6: Reasoning about Data Abstractions.
Data Abstraction II SWE 619 Software Construction Last Modified, Spring 2009 Paul Ammann.
CSE 331 SOFTWARE DESIGN & IMPLEMENTATION ABSTRACT DATA TYPES II Autumn 2011.
CSE 331 SOFTWARE DESIGN & IMPLEMENTATION TESTING II Autumn 2011.
CSE 331 Software Design & Implementation Dan Grossman Fall 2014 Data Abstraction: Abstract Data Types (ADTs) (Based on slides by Mike Ernst, David Notkin,
A Complexity Measure THOMAS J. McCABE Presented by Sarochapol Rattanasopinswat.
Software Testing Sudipto Ghosh CS 406 Fall 99 November 9, 1999.
Cs2220: Engineering Software Class 8: Implementing Data Abstractions Fall 2010 University of Virginia David Evans.
Reading and Writing Mathematical Proofs
1 Abstraction  Identify important aspects and ignore the details  Permeates software development programming languages are abstractions built on hardware.
1 Program Correctness CIS 375 Bruce R. Maxim UM-Dearborn.
Cs205: engineering software university of virginia fall 2006 Data Abstraction David Evans
CSE 331 SOFTWARE DESIGN & IMPLEMENTATION MIDTERM REVIEW Autumn 2011.
1 ECE 453 – CS 447 – SE 465 Software Testing & Quality Assurance Lecture 23 Instructor Paulo Alencar.
Software Metrics (Part II). Product Metrics  Product metrics are generally concerned with the structure of the source code (example LOC).  Product metrics.
Software Testing Input Space Partition Testing. 2 Input Space Coverage Four Structures for Modeling Software Graphs Logic Input Space Syntax Use cases.
13-1 Sets, Bags, and Tables Exam 1 due Friday, March 16 Wellesley College CS230 Lecture 13 Thursday, March 15 Handout #23.
Reasoning about programs March CSE 403, Winter 2011, Brun.
(c) University of Washington15-1 CSC 143 Java List Implementation via Arrays Reading: 13.
1 CSCD 326 Data Structures I Software Design. 2 The Software Life Cycle 1. Specification 2. Design 3. Risk Analysis 4. Verification 5. Coding 6. Testing.
Type Abstraction Liskov, Chapter 7. 2 Liskov Substitution Principle In any client code, if the supertype object is substituted by a subtype object, the.
Data Abstractions EECE 310: Software Engineering.
CSE 331 Software Design & Implementation Hal Perkins Autumn 2012 Abstract Data Types – Examples / Summary (Based on slides by Mike Ernst and David Notkin)
Type Abstraction SWE Spring October 05Kaushik, Ammann Substitution Principle “In any client code, if supertype object is substituted.
Software Metrics.
ANU COMP2110 Software Design in 2003 Lecture 10Slide 1 COMP2110 Software Design in 2004 Lecture 12 Documenting Detailed Design How to write down detailed.
Understanding ADTs CSE 331 University of Washington.
Representation invariants and abstraction functions CSE 331 UW CSE.
CSE 331 Software Design & Implementation Dan Grossman Fall 2014 Abstraction Functions (Based on slides by Mike Ernst, David Notkin, Hal Perkins)
CSE 331 SOFTWARE DESIGN & IMPLEMENTATION ABSTRACT DATA TYPES Autumn 2011.
CS223: Software Engineering Lecture 21: Unit Testing Metric.
David Evans CS201J: Engineering Software University of Virginia Computer Science Lecture 7: A Tale of Two Graphs (and.
David Evans CS201J: Engineering Software University of Virginia Computer Science Lecture 5: Implementing Data Abstractions.
Zach Tatlock / Winter 2016 CSE 331 Software Design and Implementation Lecture 6 Representation Invariants.
Chapter 1: Preliminaries Lecture # 2. Chapter 1: Preliminaries Reasons for Studying Concepts of Programming Languages Programming Domains Language Evaluation.
Interface specifications At the core of each Larch interface language is a model of the state manipulated by the associated programming language. Each.
Reasoning and Design (and Assertions). How to Design Your Code The hard way: Just start coding. When something doesn’t work, code some more! The easier.
CSE 331 Software Design & Implementation Hal Perkins Autumn 2012 Data Abstraction: Abstract Data Types (ADTs) (Based on slides by Mike Ernst and David.
Software Test Metrics When you can measure what you are speaking about and express it in numbers, you know something about it; but when you cannot measure,
Static Software Metrics Tool
EECE 310: Software Engineering
CSE 331 Software Design & Implementation
CSE 331 Software Design & Implementation
CSE 331 Software Design and Implementation
Type Abstraction SWE Spring 2009.
CSE 331 Software Design and Implementation
CSE 331 Software Design and Implementation
CSE 331 Software Design & Implementation
slides created by Ethan Apter
CSE 331 Software Design and Implementation
CSE 331 Software Design and Implementation
Data Abstraction David Evans cs205: engineering software
SWE 619 Software Construction Last Modified, Fall 2015 Paul Ammann
slides created by Ethan Apter
Lecture 7: A Tale of Two Graphs CS201j: Engineering Software
Lecture 4: Data Abstraction CS201j: Engineering Software
CSE 331 Software Design & Implementation
CSE 331 Software Design and Implementation
Introduction to Data Structure
CSE 331 Software Design & Implementation
CSE 331 Software Design & Implementation
CSE 331 Software Design & Implementation
slides created by Ethan Apter and Marty Stepp
Type Abstraction SWE Spring 2013.
Software Specifications
Presentation transcript:

CSE503: SOFTWARE ENGINEERING COMPLEXITY, PROVING ADTS David Notkin Spring 2011

But first… from today’s Seattle Times sp © UW CSE D. Notkin 2 “Industry experts believed they knew where to look for crack-inducing metal fatigue on aging airplanes, but the in-flight rupture of a Southwest Airlines Boeing 737 on Friday has raised concerns about part of the fuselage they previously thought wasn't vulnerable. “A similar hole opened on a Southwest 737 only 21 months ago, and then on an American Airlines 757 last year, raising awareness that metal fatigue can cause the aluminum skin to separate at the so-called lap joints, where panels are spliced together.”

Software complexity sp © UW CSE D. Notkin 3  First, some common software complexity measures  Then, why they are weak measures and (perhaps) a way forward  From last lecture – complexity in the “why is it hard?” sense  Today: “how complex” is a piece of software?

Lines of code (LOC, KLOC, MLOC) sp © UW CSE D. Notkin 4  Count the lines, often omitting comments and/or omitting blank lines  Lines vs. statements  Delivered vs. total (including tests, etc.)  Productivity: LOC/person/time  I’ve seen published numbers ranging from ~2K-8K LOC/person/year  Sensible? "I have made this letter longer than usual, because I lack the time to make it short.” –Blaise Pascal

Halstead software science metrics sp © UW CSE D. Notkin 5  V = N  log 2 (n)  Volume is intended to capture the size of the implementation  Making N choices from the vocabulary – assuming that humans do logarithmic search – leads to the formula  “The volume of a function should be at least 20 and at most The volume of a parameterless one-line function that is not empty; is about 20. A volume greater than 1000 tells that the function probably does too many things.” [verifysoft.com]  D = ( n1 / 2 )  ( N2 / n2 )  Difficulty is proportional to the unique operators and the ratio of total operands to the number of operands  The intent of the second part is based on a belief that repeated use of operands is more error-prone  E = V  D  Effort to implement or understand a program ... n1 = #distinct operatorsn2 = #distinct operandsn = n1 + n2 “vocabulary” N1 = total # of operatorsN2 = total # of operandsN = N1 + N2 “length”

Cyclomatic complexity (McCabe) sp © UW CSE D. Notkin 6  Take the CFG and find the number of edges (E), number of nodes (N), and the number of connected components (P)  Connected components are subgraphs for which there is a path between any two vertices  The cyclomatic complexity is M = E − N + 2P and is intended to measure the number of linearly independent paths through a program’s source code  #tests (branch coverage)  M  #tests (path coverage)  Question: should the complexity include method dispatch in OOP?

Examples sp © UW CSE D. Notkin 7 E = 9 N = 8 P = 1 M = 3 M = 8

Software structure metrics Henry and Kafura sp © UW CSE D. Notkin 8  Measures complexity in terms of fan-in and fan-out of procedures  fan-in: the number of local flows into a procedure plus the number of data structures accessed.  fan-out: the number of local flows out a procedure plus the number of data structures that the procedure modifies.  Complexity is L 2  FI  FO  Where L is the length of a procedure

And many more sp © UW CSE D. Notkin 9  Variants of these  Some incremental improvements  Some extending to interprocedural complexity  Others that measure  Coupling and cohesion  Data complexity  Data flow complexity  …  Function points and feature points – intended to measure the function of a system as perceived by users, without reference to the implementation

So? sp © UW CSE D. Notkin 10  Although there is somewhat mixed data, it appears that most of these measures are proportional to LOC  “Les Hatton claimed recently (Keynote at TAIC-PART 2008, Windsor, UK, Sept 2008) that McCabe Cyclomatic Complexity has the same prediction ability as lines of code.” –Wikipedia [cyclomatic complexity]  Also, how “actionable” the information is has always confused me: if you are told your program is an “8” what are you supposed to do?

A hypothesis sp © UW CSE D. Notkin 11  Every complexity measure I’ve seen is based entirely on the static program (except feature/function points, which don’t consider a program directly)  If complexity measures are to have any real utility, it seems that they must also consider the relationship between the program and its behaviors  That is, the way the developer associates behaviors with a program is material to complexity, but is ignored by the literature  It is also imaginable that this measure would be “actionable” by identifying specific dependences that make this mapping complex – they could perhaps be addressed similarly to dependences that preclude parallelization

Project(s)? sp © UW CSE D. Notkin 12  Any attempt at trying to make this notion more precise would be terrific  Maybe a simple model and some empiric data  Showing that a reasonable model is proportional to LOC would weaken my hypothesis  Stop by and chat if you’re interested  Fits into NSF-funded work with Reid Holmes  ICSE 2011: “Identifying Program, Test, and Environmental Changes That Affect Behaviour”Identifying Program, Test, and Environmental Changes That Affect Behaviour  Potential quals project

What is this? sp © UW CSE D. Notkin ASCII D short 68 mask FTFFFTFF Excess-8 60 Java byte-code fstore_1 Gray code Gray scale Color scale

Types sp © UW CSE D. Notkin 14  Without getting precise, types are used to interpret and manipulate the bit patterns – that is, they give them (some level of) meaning  “Concrete” types manipulate the information in memory directly  Abstract types define a protocol for manipulating instances of those types, but they do not define an implementation

Abstract data type = objects + operations sp © UW CSE D. Notkin 15  The only operations on objects of the type are those provided by the abstraction  The implementation is hidden  We need to show that the abstraction and the implementation are each “correct” … and properly related

Big picture sp © UW CSE D. Notkin 16 Element abstract Element’ abstract Element concrete Element’ concrete For every abstract operation For every corresponding concrete operation Abstraction function (AF)  It commutes [What is purple and commutes?]  AF gives an abstract meaning to concrete representations – more soon

An Abelian grape (sorry) sp © UW CSE D. Notkin 17 math.asu.edu

Specifying ADTs sp © UW CSE D. Notkin 18  A common way is to define the abstract effect of each operation (including constructors) using formal/informal pre- and post-conditions  Might see this using an extended JavaDoc

Example sp © UW CSE D. Notkin 19 // Overview: An IntSet is a mutable, unbounded set of integers. class IntSet { // effects: makes a new IntSet = {} public IntSet() // returns: true if x  this // else returns false public boolean contains(int x) // effects: this post = this pre  {x} public void add(int x) // effects: this post = this pre - {x} public void remove(int x) …

Algebraic specifications From Stotts ( sp © UW CSE D. Notkin 20  Define a sort – give signatures of operations (you’ve seen this kind of thing before in typed OO and functional languages) sort IntSet imports Int, Bool signatures new : -> IntSet insert : IntSet × Int -> IntSet member : IntSet × Int -> Bool remove : IntSet × Int -> IntSet

Define axioms sp © UW CSE D. Notkin 21  “Just” like high school algebra variables i, j : Int; s : IntSet axioms member(new(), i) = false member(insert(s, j), i) = if i = j then true else member(s, i) remove(new(), i) = new() remove(insert(s, j), i) = if i = j then remove(s, i) else insert(remove(s, i), j)

Are these really sets? sp © UW CSE D. Notkin 22  Posit stuff like…  insert(insert(s, i), j) = insert(insert(s, j), i)  insert(insert(s, i), i) = insert(s, i)  Prove from axioms  Tons of issues about completeness, consistency, equality (initial vs. final algebras), etc.  But again, “just” like high school algebra

Proving specification properties sp © UW CSE D. Notkin 23  Regardless of the style of specification, proofs are usually done inductively  No information about the concrete representation and implementation – rather, showing the correctness of the protocol over the ADT’s operations

LetterSet case-insensitive character set [from Ernst] // effects: creates an empty LetterSet public LetterSet ( ); // effects: this post = // if (c 1  this pre | toLowerCase(c 1 ) = toLowerCase(c) // then this pre else this pre  {c} public void insert (char c); // effects: this post = this pre  {c} public void delete (char c); // returns: (c  this) public boolean member (char c); sp © UW CSE D. Notkin 24

Prove desirable property of LetterSet Large enough LetterSet contains two distinct characters Prove: |S|>1  (c 1,c 2 S | [toLowerCase(c 1 )  toLowerCase(c 2 )])  Base case: S = , vacuously true  Inductive case: S was produced by a call of the form T.insert(c) Assume: |T|>1  (c 3,c 4 T [toLowerCase(c 3 )  toLowerCase(c 4 )]) Show: |S|>1  (c 1,c 2 S [toLowerCase(c 1 )  toLowerCase(c 2 )]) where S = T.insert(c) Remember insert ’s post-condition: this post = if (c 1  this pre | toLowerCase(c 1 ) = toLowerCase(c) then this pre else this pre  {c}  For inductive case, consider the two possibilities for S  If S = T, the theorem holds by induction  If S = T  {c}, there are three cases  |T|=0 : Vacuously true  |T|≥1 : T did not contain a char of toLowerCase(c), so the theorem holds by the meaning of union  |T|>1: By inductive assumption, T contains different letters, so by the meaning of union, T  {c} also contains different letters 25

Now: Assume abstraction is correct sp © UW CSE D. Notkin 26  Abstraction function (AF): E c → E a  Maps a concrete object to an abstract value  Defines how the data structure is to be interpreted  Oh, that’s a “ D ”, that’s an fstore_1, that’s a 68, etc.  Representation invariant (RI): a boolean predicate characterizing legal concrete representations  States data structure well-formedness In essence, defines the domain of AF  Captures information that must be shared across implementations of multiple operations

CharSet Abstraction A finite mutable set of Characters [From Ernst] 27 // Overview: A CharSet is a finite mutable set of Characters // effects: creates a fresh, empty CharSet public CharSet ( ) // effects: this post = this pre  {c} public void insert (Character c); // effects: this post = this pre - {c} public void delete (Character c); // returns: (c  this) public boolean member (Character c); // returns: cardinality of this public int size ( ); sp © UW CSE D. Notkin

A CharSet implementation 28 class CharSet { private List elts = new ArrayList (); public void insert(Character c) { elts.add(c); } public void delete(Character c) { elts.remove(c); } public boolean member(Character c) { return elts.contains(c); } public int size() { return elts.size(); } CharSet s = new CharSet(); Character a = new Character(‘a’); s.insert(a); s.delete(a); if (s.member(a)) // print “wrong”; else // print “right”; sp © UW CSE D. Notkin

The RI can help identify an error  Perhaps delete is wrong  It should remove all occurrences  Perhaps insert is wrong  It should not insert a character that is already there class CharSet { // Rep invariant: elts has no nulls and no duplicates private List elts; …  Or…   indices i of elts. elts.elementAt(i) ≠ null   indices i, j of elts. i ≠ j  elts.elementAt(i).equals(elts.elementAt(j)) sp © UW CSE D. Notkin

Where’s the error? 30 // Rep invariant: elts has no nulls and no duplicates public void insert(Character c) { elts.add(c); } public void delete(Character c) { elts.remove(c); } sp © UW CSE D. Notkin

The RI constrains structure, not meaning  Another implementation of insert that preserves the RI public void insert(Character c) { Character cc = new Character(encrypt(c)); if (!elts.contains(cc)) elts.addElement(cc); } public boolean member(Character c) { return elts.contains(c); }  The program is wrong … call on the AF! sp © UW CSE D. Notkin

Abstraction function concrete to abstract value mapping  AF(CharSet this) = { c | c is contained in this.elts }  set of Characters represented by elements contained in this.elts  Typically not executable, but useful to reason about client behavior  Helps reason about the semantics of insert // effects: this post = this pre  {c} public void insert (Character c);  Helps identify a problem  Applying the AF to the result of the call to insert yields AF(elts)  {encrypt(‘a’)}  Consider the following reasonable AF AF(this) = { c | encrypt(c) is contained in this.elts } AF(this) = { decrypt(c) | c is contained in this.elts } sp © UW CSE D. Notkin

“Placing blame” using AF sp © UW CSE D. Notkin 33  AF(CharSet this) = { c | c is contained in this.elts }  Consider a call to insert:  On entry, the meaning is AF(this pre ) ≈ elts pre  On exit, the meaning is AF(this post ) = AF(this pre )  {encrypt(‘a’)}  Does this AF fix things? AF(this) = { c | encrypt(c) is contained in this.elts } = { decrypt(c) | c is contained in this.elts }

Some final odds and ends sp © UW CSE D. Notkin 34  Looking at these examples using the commutative diagram may help clarify any confusions  Or ask!  AF’s can be maintained across fairly complicated implementations that (for example) reorganize dynamically for performance  Multiple concrete values still map to the same abstract value  Why map concrete to abstract?  It’s not a function in the other direction Ex: lists [a,b] and [b,a] each represent the set {a, b}  It’s not as useful in the other direction