Programming Techniques Lec05 Data Abstraction (Chapter 5) Software Engineering Fall 2005
Programming Techniques Data Abstraction data abstraction = objects + operations if only objects were provided -> user would implement program in terms of the data representation -> when representation changes -> user programs have to change therefore, the user has to call the operations to access the data type -> when representation changes -> operation implementations change -> user programs stay the same
Programming Techniques 5.1 Specification of Data Abstractions Data types are defined by interfaces & classes class dname { // OVERVIEW: brief description of the data type’s // behavior // constructors // specs for constructors // methods // specs for methods }
Programming Techniques Components of Specification OVERVIEW gives a description of the abstraction in terms of “well understood” concepts (for instance mathematical sets “{}”, union “+”…) specifies if the type is mutable or immutable constructors specify how new objects are created methods specify how objects are accessed once they have been created constructors and methods belong to objects, not classes -> no static in header
Programming Techniques Specification of mutable IntSet public class IntSet { // OVERVIEW: IntSets are mutable, unbounded sets of integers // A typical IntSet is {x1,...,xn} // constructors public IntSet () // EFFECTS: Initializes this to be empty // methods public void insert (int x) // MODIFIES: this // EFFECTS: Adds x to the elements of this, i.e. this_post = this + {x} public void remove (int x) // MODIFIES: this // EFFECTS: Removes x from this, i.e. this_post = this - {x} // observers public boolean isIn (int x) // EFFECTS: if x is in this returns true else returns false public int size ( ) // EFFECTS: Returns the cardinality of this public int choose ( ) throws EmptyException // EFFECTS: if this is empty, throws EmptyException else // returns an arbitrary element of this }
Programming Techniques IntSet Only one parameterless constructor is enough: type is mutable Mutators insert and remove have MODIFIES clause this_post = the value of this after the method returns Observers isIn, size and choose do not change the state of the object (observers are allowed to modify anything other than this, but usually don’t) choose returns an arbitrary element of the IntSet: it is underdetermined Specification is preliminary version of the class
Programming Techniques Specification of immutable Poly (1) public class Poly { // OVERVIEW: Polys are immutable polynomials with integer coefficients // A typical Poly is c0 + c1x + c2x^2 + ... + cnx^n // constructors public Poly() // EFFECTS: Initializes this to be the zero polynomial public Poly(int c, int n) throws NegativeExponentException // EFFECTS: If n < 0 throws NegativeExponentException else // initializes this to be the Poly cx^n // methods public int degree () // EFFECTS: Returns the degree of this, i.e. the largest exponent // with a non-zero coefficient. Returns 0 if this is the zero Poly public int coeff (int d) // EFFECTS: Returns coefficient of the term of this whose exponent is d public Poly add (Poly q) throws NullPointerException // EFFECTS: If q is null throws NullPointerException else returns // the Poly this + q public Poly mul (Poly q) throws NullPointerException // EFFECTS: If q is null throws NullPointerException else returns // the Poly this * q
Programming Techniques Specification of immutable Poly (2) public Poly sub (Poly q) throws NullPointerException // EFFECTS: If q is null throws NullPointerException else returns // the Poly this - q public Poly minus () { // EFFECTS: Returns the Poly - this }
Programming Techniques Poly Two constructors: zero and arbitrary monomial (overloaded) Arbitrary polynomials are created by adding and multiplying polynomials, each time creating a new Poly (type is immutable) -> no Mutators NegativeExponentException is unchecked (it is easy to avoid calls with a negative exponent)
Programming Techniques Using IntSet Abstraction public static IntSet getElements (int[] a) throws NullPointerException { // EFFECTS: If a is null throws NullPointerException // else returns a set containing an entry for each // distinct element of a IntSet s = new IntSet(); for (int i = 0; i < a.length; i++) { s.insert(a[i]); } return s; }
Programming Techniques 5.2 Using Data (Poly) Abstractions public static Poly diff (Poly p) throws NullPointerException { // EFFECTS: If p is null throws NullPointerException // else returns the Poly obtained by differentiating p Poly q = new Poly (); for (int i = 1; i <= p.degree(); i++) { q = q.add(new Poly(p.coeff(i) * i, i - 1)); } return q; }
Programming Techniques diff & getElements These functions are not declared in IntSet or Poly, but in another class that uses IntSet and Poly. => if the implementation of the data abstraction changes, methods diff & getElements will continue to work correctly => diff & getElements, if implemented incorrectly, will not affect the correctness of the abstraction nor can they break other code that uses the abstraction but: diff & getElements may be slightly slower than if they were implemented behind the abstraction barrier
Programming Techniques 5.3 Implementing Data Abstractions One data abstraction can have many different possible representations or reps An implementation makes sure that the representation is initialized (constructors), used and modified (methods) correctly according to the data abstraction A good representation allows all operations to be implemented in a reasonably simple and efficient manner (+ frequent operations must run quickly) IntSet rep as Vector: allow duplicate elements? -> insert will be faster -> remove will be slower -> isIn will be slower for false, faster for true
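To make the duplicates trade-off concrete, here is a minimal sketch of the alternative in which the rep is allowed to hold duplicates. The class name DupIntSet and the use of ArrayList in place of Vector are assumptions made for illustration; they do not come from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of IntSet whose rep may contain duplicates.
class DupIntSet {
    private final List<Integer> els = new ArrayList<>(); // duplicates allowed

    // insert needs no membership scan, so it is a plain append
    void insert(int x) { els.add(x); }

    // remove must delete every copy of x, so it always scans the whole rep
    void remove(int x) { els.removeIf(e -> e == x); }

    // isIn stops at the first copy it finds; a miss scans a possibly longer rep
    boolean isIn(int x) { return els.contains(x); }

    // size can no longer return els.size(): distinct values must be counted
    int size() { return (int) els.stream().distinct().count(); }
}
```

Note that size becomes more expensive as well, because the length of the rep no longer equals the cardinality of the set.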
Programming Techniques Instance variables A representation typically has a number of components Each component is stored in an instance variable Instance variables should be declared private to prevent a user from breaking the abstraction to allow re-implementation without breaking the user’s code Instance variables should not be declared static (i.e. there is one of each per object) Static variables occur once per class (equivalent to global variables in other languages)
Programming Techniques Implementation of IntSet (1) public class IntSet { // OVERVIEW: IntSets are mutable, unbounded sets of integers private Vector els; // the rep // constructors public IntSet() { // EFFECTS: Initializes this to be empty els = new Vector(); } // methods public void insert (int x) { // MODIFIES: this // EFFECTS: Adds x to the elements of this, i.e. this_post = this + {x} Integer y = new Integer(x); if (getIndex(y) < 0) els.add(y); } public void remove (int x) { // MODIFIES: this // EFFECTS: Removes x from this, i.e. this_post = this - {x} int i = getIndex(new Integer(x)); if (i < 0) return; els.set(i, els.lastElement()); els.remove(els.size() -1);}
Programming Techniques Implementation of IntSet (2) private int getIndex (Integer x) { // EFFECTS: if x is in this returns index where x appears else -1 for (int i = 0; i < els.size(); i++) if (x.equals(els.get(i))) return i; return -1; } public boolean isIn (int x) { // EFFECTS: if x is in this returns true else returns false return getIndex(new Integer(x)) >= 0; } public int size ( ) { // EFFECTS: Returns the cardinality of this return els.size(); } public int choose ( ) throws EmptyException { // EFFECTS: if this is empty, throws EmptyException else // returns an arbitrary element of this if (els.size() == 0) throw new EmptyException("IntSet.choose"); return ((Integer) els.lastElement()).intValue(); }
Programming Techniques IntSet Implementation rep is a single private instance variable els constructors and methods belong to a particular object, available as the implicit argument this (the prefix this. can be omitted when accessing its instance variables) a private helper function getIndex is used to make sure there are no duplicates in the vector els getIndex allows insert to preserve the no-duplicates condition this condition is relied upon in size and remove
Programming Techniques Implementation of Poly (1) public class Poly { // OVERVIEW: … private int [ ] trms; private int deg; // constructors public Poly() { // EFFECTS: Initializes this to be the zero polynomial trms = new int[1]; deg = 0; } public Poly(int c, int n) throws NegativeExponentException { // EFFECTS: If n < 0 throws NegativeExponentException else // initializes this to be the Poly cx^n if (n < 0) throw new NegativeExponentException("Poly(int,int) constr"); if (c == 0) {trms = new int[1]; deg = 0; return;} trms = new int[n+1]; for (int i = 0; i < n; i++) trms[i] = 0; trms[n] = c; deg = n; } private Poly (int n) {trms = new int[n+1]; deg = n;}
Programming Techniques Implementation of Poly (2) // methods public int degree () { // EFFECTS: Returns the degree of this, i.e. the largest exponent // with a non-zero coefficient. Returns 0 if this is the zero Poly return deg; } public int coeff (int d) { // EFFECTS: Returns the coefficient of term of this with exponent d if (d < 0 || d > deg) return 0; else return trms[d]; } public Poly sub (Poly q) throws NullPointerException { // EFFECTS: If q is null throws NullPointerException else returns // the Poly this - q; return add (q.minus()); } public Poly minus () { // EFFECTS: Returns the Poly - this; Poly r = new Poly(deg); for (int i = 0; i <= deg; i++) r.trms[i] = - trms[i]; return r; }
Programming Techniques Implementation of Poly (3) public Poly add (Poly q) throws NullPointerException { // EFFECTS: If q is null throws NullPointerException else returns // the Poly this + q Poly la, sm; if (deg > q.deg) {la = this; sm = q;} else {la = q; sm = this;} int newdeg = la.deg; // new degree is the larger degree if (deg == q.deg) // unless there are trailing zeros for (int k = deg; k > 0; k--) if (trms[k] + q.trms[k] != 0) break; else newdeg--; Poly r = new Poly(newdeg); // get a new Poly int i; for (i = 0; i <= sm.deg && i <= newdeg; i++) r.trms[i] = sm.trms[i] + la.trms[i]; for (int j = i; j <= newdeg; j++) r.trms[j] = la.trms[j]; return r; }
Programming Techniques Implementation of Poly (4) public Poly mul (Poly q) throws NullPointerException { // EFFECTS: If q is null throws NullPointerException else returns // the Poly this * q if ((q.deg == 0 && q.trms[0] == 0) || (deg == 0 && trms[0] == 0)) return new Poly(); Poly r = new Poly(deg + q.deg); for (int i = 0; i <= deg; i++) for (int j = 0; j <= q.deg; j++) r.trms[i+j] = r.trms[i+j] + trms[i] * q.trms[j]; return r; }
Programming Techniques Poly Implementation rep is an array storing coefficients (immutable) plus an int storing the degree (for convenience) Note that many methods access private instance variables from other objects as well as this (methods have access to private instance variables of objects of the same class) sub is implemented in terms of other methods add, mul and minus use private constructor Poly(int) and initialize the new Poly themselves
Programming Techniques Alternative Poly Implementation What if most of the terms have zero coefficients? The previous implementation then stores mostly zeros. => store only the terms with non-zero coefficients use 2 vectors: private Vector coeffs; // the non-zero coefficients private Vector exps; // the associated exponents but this is awkward: the two vectors have to be kept precisely lined up. instead: use one vector storing both coefficients and exponents
Programming Techniques Records // inner class class Pair { // OVERVIEW: a record type int coeff; int exp; Pair (int c, int n) {coeff = c; exp = n;} } A record is simply a collection of instance variables and a constructor to initialize them. No methods. You can declare this inside Poly as an inner class. Do not abuse records. They are only to be used as passive storage within a full-blown data abstraction.
Programming Techniques Implementation of sparse Poly // using two vectors private Vector coeffs; // the non-zero coefficients private Vector exps; // the associated exponents public int coeff (int x) { for (int i = 0; i < exps.size(); i++) if (((Integer) exps.get(i)).intValue() == x) return ((Integer) coeffs.get(i)).intValue(); return 0; } // using Pair records private Vector trms; // the terms with non-zero coefficients public int coeff (int x) { for (int i = 0; i < trms.size(); i++) { Pair p = (Pair) trms.get(i); if (p.exp == x) return p.coeff; } return 0; }
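The Pair-based representation can be sketched as a small self-contained class. This is only an illustration under stated assumptions: the class name SparsePoly, the factory method of, and the use of a generic List in place of the raw Vector are not from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sparse polynomial that stores only non-zero terms.
final class SparsePoly {
    // Record type: passive storage only, with no methods of its own.
    private static final class Pair {
        final int coeff;
        final int exp;
        Pair(int c, int n) { coeff = c; exp = n; }
    }

    private final List<Pair> trms = new ArrayList<>(); // terms with non-zero coefficients

    // Hypothetical factory; each argument is a {coefficient, exponent} pair.
    static SparsePoly of(int[]... terms) {
        SparsePoly p = new SparsePoly();
        for (int[] t : terms)
            if (t[0] != 0) p.trms.add(new Pair(t[0], t[1])); // never store zero terms
        return p;
    }

    // Returns the coefficient of the term with exponent x, or 0 if it is absent.
    int coeff(int x) {
        for (Pair p : trms)
            if (p.exp == x) return p.coeff;
        return 0;
    }
}
```

With this rep, a polynomial such as 3x^100 - 2 costs two stored pairs instead of an array of 101 coefficients.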
Programming Techniques 5.4 Additional Methods In our discussion of objects, we have ignored some methods that all objects have. Unless classes define these methods, they will inherit them from the Object class, which may or may not be desirable. We'll talk about the equals, hashCode, clone, and toString methods.
Programming Techniques Other methods: equality Two objects are equal if they are behaviorally equivalent => it is not possible to distinguish between them using any sequence of calls to the objects Mutable objects are equal only if they are the same object (otherwise you can change one of them and prove they are not the same): equals inherited from Object is the same as == Immutable objects are equal if they have the same state: they must implement equals themselves
Programming Techniques equals for Poly public boolean equals (Poly q) { if (q == null || deg != q.deg) return false; for (int i = 0; i <= deg; i++) if (trms[i] != q.trms[i]) return false; return true; } public boolean equals (Object z) { if (! (z instanceof Poly)) return false; return equals ((Poly) z); }
Programming Techniques Other methods: hashCode int hashCode () is defined by Object and is used in hash tables to map each object to a number (ideally a different number for each distinct object) Objects that are equal must have the same hashCode => mutable objects do not have to define hashCode: the inherited version agrees with the inherited equals => immutable objects that define equals have to define hashCode (otherwise two equal objects will have the same hashCode only if they are ==)
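As a hedged illustration of keeping hashCode consistent with equals for an immutable type, the sketch below uses a hypothetical class CoeffList (not part of the slides) whose state is an int array. Arrays.equals and Arrays.hashCode are standard java.util.Arrays methods.

```java
import java.util.Arrays;

// Hypothetical immutable type whose state is an array of coefficients.
final class CoeffList {
    private final int[] trms;

    CoeffList(int... coeffs) { trms = coeffs.clone(); } // defensive copy keeps it immutable

    @Override public boolean equals(Object o) {
        if (!(o instanceof CoeffList)) return false;
        return Arrays.equals(trms, ((CoeffList) o).trms); // same state => equal
    }

    // Computed from exactly the state that equals compares,
    // so equal objects always get the same hash code.
    @Override public int hashCode() { return Arrays.hashCode(trms); }
}
```

Without the hashCode override, two equal CoeffList objects would usually land in different buckets of a HashSet or HashMap, and lookups by an equal key would fail.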
Programming Techniques Other methods: similarity Two objects are similar if they have the same state at the moment of comparison Weaker notion of equality: similar immutable objects are always equal similar mutable objects may not be equal == is stronger than equal is stronger than similar
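The difference between similarity and equality for mutable objects can be observed with the standard StringBuilder class, which inherits equals from Object. The helper name similar and the class SimilarityDemo are made up for this sketch.

```java
class SimilarityDemo {
    // "similar": the two objects have the same state at the moment of comparison
    static boolean similar(StringBuilder a, StringBuilder b) {
        return a.toString().equals(b.toString());
    }

    public static void main(String[] args) {
        StringBuilder a = new StringBuilder("abc");
        StringBuilder b = new StringBuilder("abc");
        System.out.println(similar(a, b)); // same state right now
        System.out.println(a.equals(b));   // inherited equals compares identity
        b.append("d");                     // a mutation tells the two objects apart
        System.out.println(similar(a, b)); // no longer similar
    }
}
```

The append call is exactly the kind of operation that distinguishes two similar mutable objects, which is why they are not treated as equal.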
Programming Techniques Other methods: clone Object clone () makes a copy of its object the copy should be similar to the original default implementation from Object simply makes a new Object and copies all instance variables (shallow copy) this is sufficient for immutable objects clone() is made accessible by declaring: public class MyClass implements Cloneable {… mutable objects should implement their own cloning operation (using a deep copy)
Programming Techniques clone for IntSet private IntSet (Vector v) { els = v; } public Object clone () { return new IntSet((Vector)els.clone()); }
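The following sketch shows why a mutable type needs more than the default shallow copy. The class Bag is hypothetical and uses ArrayList rather than Vector; it follows the usual pattern of calling super.clone() and then replacing the shared mutable component.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable collection used only to illustrate clone.
class Bag implements Cloneable {
    private List<Integer> els = new ArrayList<>();

    void insert(int x) { els.add(x); }
    int size() { return els.size(); }

    @Override public Bag clone() {
        try {
            Bag copy = (Bag) super.clone();   // shallow: copy.els still aliases this.els
            copy.els = new ArrayList<>(els);  // give the copy its own list
            return copy;                      // the Integer elements are immutable, so sharing them is safe
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);      // cannot happen: Bag implements Cloneable
        }
    }
}
```

If the second assignment were omitted, inserting into the copy would also change the original, because both objects would share one list.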
Programming Techniques Other methods: toString String toString() should return a String showing the type and current state of the object Default implementation from Object shows type and hashCode => not very informative => objects should implement toString themselves
Programming Techniques toString for IntSet public String toString () { if (els.size () == 0) return "IntSet: { }"; String s = "IntSet: {" + els.elementAt(0).toString( ); for (int i = 1; i < els.size( ); i++) s = s + ", " + els.elementAt(i).toString( ); return s + "}"; }
Programming Techniques 5.5 Aids to Understanding Implementations Abstraction function: shows how the representation maps to the data abstraction justifies the choice of representation Representation invariant: captures the common assumptions on which the implementations are based allows the implementation of one method to be read in isolation of all others for instance: why can IntSet size() return the size of els ? Because there are no duplicates in els
Programming Techniques Abstraction functions AF: C -> A the abstraction function AF maps a concrete state C to an abstract state A abstraction functions are usually many-to-one [Figure: the els vectors [1,2] and [2,1] both map to the IntSet {1,2}; the els vector [7] maps to the IntSet {7}]
Programming Techniques Abstraction function for IntSet Abstraction functions are defined informally, so the range is hard to define. Instead we show a typical example: // A typical IntSet is {x1,…xn} // The abstraction function is // AF(c) = {c.els[i].intValue | 0 <= i < c.els.size} {x| p(x)} describes the set of all x such that the predicate p(x) is true
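An abstraction function can also be written as executable code, which makes the many-to-one property easy to observe. In this sketch the helper af and the class AfDemo are assumptions; the abstract state is modeled with java.util.Set.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class AfDemo {
    // Hypothetical executable form of AF for a list-based IntSet rep:
    // AF(els) = { els[i] | 0 <= i < els.size() }
    static Set<Integer> af(List<Integer> els) {
        return new HashSet<>(els); // element order in the rep is forgotten
    }
}
```

Under this sketch the reps [1, 2] and [2, 1] are mapped to the same abstract set, so a user of the abstraction cannot tell them apart.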
Programming Techniques Abstraction function for Poly // A typical Poly is c0 + c1x + ... + cnx^n // The abstraction function is // AF(c) = c0 + c1x + ... + cnx^n // where // ci = c.trms[i] if 0 <= i < c.trms.length // = 0 otherwise
Programming Techniques The representation invariant Type checking ensures that when an object this is constructed or called with a method, it belongs to the class However, not all objects of a class are legitimate representations of abstract objects The statement of a property that all legitimate objects satisfy is called a representation invariant or rep invariant I: C -> boolean is a predicate that holds true for legitimate objects
Programming Techniques Representation invariant of IntSet // The rep invariant is // c.els != null && // for all integers i . c.els[i] is an Integer && // for all integers i, j . (0 <= i < j < c.els.size => // c.els[i].intValue != c.els[j].intValue) or more informally: // The rep invariant is // c.els != null && // all elements of c.els are Integers && // there are no duplicates in c.els
Programming Techniques Alternative Representation for IntSet private boolean[] els = new boolean[100]; private Vector otherEls; private int sz; for integers i in the range 0 <= i < 100 we record membership by storing true in els[i] integers outside this range are stored in otherEls as before the number of elements is stored in sz
Programming Techniques Abstraction function // The abstraction function is // AF(c) = {c.otherEls[i].intValue | //0 <= i < c.otherEls.size} // + // { j | 0 <= j < 100 && c.els[j] } the set is the union of elements in otherEls and the indexes of true elements in els
Programming Techniques Representation invariant // The rep invariant is // c.els != null && // c.otherEls != null && // c.els.size = 100 && // all elements in c.otherEls are Integers && // all elements in c.otherEls are outside the range 0..99 && // there are no duplicates in c.otherEls && // c.sz = c.otherEls.size + (count of true entries in c.els)
Programming Techniques Helper function in rep invariant // c.sz = c.otherEls.size + (count of true entries in c.els) can be rewritten as: // c.sz = c.otherEls.size + cnt(c.els, 0) // where cnt(a,i) = if i >= a.size then 0 // else if a[i] then 1 + cnt(a, i+1) // else cnt(a, i+1)
Programming Techniques Representation invariant of Poly // The rep invariant is // c.trms != null && // c.trms.length >= 1 && // c.deg = c.trms.length - 1 && // c.deg > 0 => c.trms[c.deg] != 0
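The helper cnt translates directly into a recursive Java method. The class name RepCount is an assumption made for this sketch; a.length plays the role of a.size in the comment notation.

```java
class RepCount {
    // cnt(a, i): number of true entries in a[i], a[i+1], ..., a[a.length - 1]
    static int cnt(boolean[] a, int i) {
        if (i >= a.length) return 0;           // past the end: nothing left to count
        return (a[i] ? 1 : 0) + cnt(a, i + 1); // count a[i], then recurse on the rest
    }
}
```

Both branches recurse on i + 1, which is what guarantees that the recursion terminates.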
Programming Techniques Implementing abstraction functions Instead of giving the abstraction function as a comment, it may also be implemented as a method The abstraction function explains the interpretation of the rep. It maps the state of each legal representation to the abstract object it is intended to represent. It is implemented by the toString method => if two different representations map to the same abstract object they should have the same toString result
Programming Techniques Implementing representation invariants The representation invariant defines all the common assumptions that underlie the implementations of a type’s operations. It defines which representations are legal by mapping each representation object to either true (if its rep is legal) or false (if its rep is not legal). The method that checks the representation invariant is called repOk: public boolean repOk ( ) // EFFECTS: Returns true if the rep invariant // holds for this; // otherwise returns false
Programming Techniques repOk for Poly public boolean repOk ( ) { if (trms == null || deg != trms.length - 1 || trms.length == 0) return false; if (deg == 0) return true; return trms[deg] != 0; }
Programming Techniques repOk for IntSet public boolean repOk ( ) { if (els == null) return false; for (int i = 0; i < els.size ( ); i++) { Object x = els.get(i); if (! (x instanceof Integer)) return false; for (int j = i + 1; j < els.size ( ); j++) if (x.equals(els.get(j))) return false; } return true; }
Programming Techniques uses of repOk repOk can be used during testing to check whether an implementation is preserving the rep invariant repOk can be used in methods and constructors to practice defensive programming (throw a FailureException if the rep invariant does not hold) Only constructors and methods that modify the representation need to check repOk: add, mul, minus, but not sub or coeff in Poly insert and remove in IntSet
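A minimal sketch of the defensive use of repOk follows. The class CheckedIntSet and the helper checkRep are hypothetical, and IllegalStateException stands in for the FailureException mentioned on the slide, whose definition is not shown here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical IntSet variant that verifies its rep invariant after each mutation.
class CheckedIntSet {
    private final List<Integer> els = new ArrayList<>();

    // Rep invariant: no duplicates in els (els itself is never null here).
    public boolean repOk() {
        for (int i = 0; i < els.size(); i++)
            for (int j = i + 1; j < els.size(); j++)
                if (els.get(i).equals(els.get(j))) return false;
        return true;
    }

    public void insert(int x) {
        if (!els.contains(x)) els.add(x);
        checkRep(); // only operations that modify the rep need the check
    }

    public int size() { return els.size(); } // observer: no check needed

    private void checkRep() {
        if (!repOk()) throw new IllegalStateException("rep invariant violated");
    }
}
```

Because repOk here is quadratic in the size of the set, a design like this is usually kept for testing or made optional in production code.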
Programming Techniques Discussion Representation invariants hold whenever an object is used outside its implementation Within an implementation, the invariant sometimes does not hold temporarily, for example in Poly where a trms is created with zero in the highest element by mul, but is overwritten before the method returns Invariants hold on entry and exit of methods Abstraction functions only hold for legal representations and are undefined if the repOk returns false A rep invariant specifies all properties on which a method can rely (i.e. they can be implemented by different people)
Programming Techniques 5.6 Properties of data abstractions mutable objects must have mutable representations immutable objects may have mutable representations, as long as they cannot be modified outside the implementation implementations can perform benevolent side-effects if they modify the rep without violating the rep invariant and without affecting the abstract state of the object (possible only when the abstraction function is many-to-one)
Programming Techniques Benevolent side effect example (1) Rational abstraction function: // A typical rational is n/d // The abstraction function is // AF(c) = c.num/c.denom Suppose we rule out zero denominators, represent negative rationals by means of negative numerators and choose NOT to keep the rep in reduced form (to speed up multiplication) // The rep invariant is // c.denom > 0
Programming Techniques Benevolent side effect example (2) // c.reduce() reduces c to minimal form // so that gcd(abs(c.num),abs(c.denom)) == 1 public boolean equals(Rational r) { if (r == null) return false; if (num == 0) return r.num == 0; if (r.num == 0) return false; reduce(); r.reduce(); return (num == r.num && denom == r.denom); } // reduction is a benevolent side effect
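For completeness, a self-contained sketch of such a Rational class is given below. The constructor, gcd helper, hashCode and toString are assumptions added so the example compiles on its own; only the idea of reducing inside equals comes from the slide. Overflow for extreme int values is ignored.

```java
// Sketch of an immutable Rational whose rep is reduced lazily.
final class Rational {
    private int num;   // not final: reduce() may rewrite the rep
    private int denom; // rep invariant: denom > 0

    Rational(int n, int d) {
        if (d == 0) throw new IllegalArgumentException("zero denominator");
        if (d < 0) { n = -n; d = -d; } // the sign lives in the numerator
        num = n;
        denom = d;
    }

    private static int gcd(int a, int b) { return b == 0 ? a : gcd(b, a % b); }

    // Benevolent side effect: changes the rep, but num/denom denotes the same rational.
    private void reduce() {
        int g = gcd(Math.abs(num), denom);
        if (g > 1) { num /= g; denom /= g; }
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Rational)) return false;
        Rational r = (Rational) o;
        reduce();
        r.reduce();
        return num == r.num && denom == r.denom;
    }

    @Override public int hashCode() { reduce(); return 31 * num + denom; }

    @Override public String toString() { reduce(); return num + "/" + denom; }
}
```

Users still see an immutable type: no sequence of calls can reveal whether a given object currently holds 2/4 or 1/2 in its rep.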
Programming Techniques Exposing the representation An implementation exposes the rep if it provides users of its objects with a way of accessing some mutable component of the rep (This is bad) for example: non-private instance variables for example in IntSet: public Vector allEls (){ // EFFECTS: Returns a vector containing the // elements of this, each exactly once, in // arbitrary order return els; }
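One common way to avoid this kind of exposure is to hand out a copy of the mutable component instead of the component itself. The sketch below is an assumption-laden illustration: the class SafeIntSet is hypothetical and uses ArrayList in place of Vector.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical IntSet variant whose allEls does not expose the rep.
class SafeIntSet {
    private final List<Integer> els = new ArrayList<>(); // rep invariant: no duplicates

    public void insert(int x) { if (!els.contains(x)) els.add(x); }

    public int size() { return els.size(); }

    // Returns a fresh list, so callers cannot reach the internal one.
    public List<Integer> allEls() { return new ArrayList<>(els); }
}
```

A caller may now add duplicates to the returned list without any effect on the set, so the no-duplicates invariant cannot be broken from outside.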
Programming Techniques 5.7 Reasoning about Data Abstractions Data type induction: for each operation, assume that the rep invariant holds at the beginning and then show that it holds at the end of the operation. To prove correctness of an operation, the abstraction function is used to relate abstract objects to the concrete objects that represent them. Data type induction is also used to reason about invariants of the abstraction itself. However, in this case, the reasoning is based on the specs, and observers can be ignored
Programming Techniques Preserving the Rep invariant Data type induction can be used to show that the rep invariant holds for all objects of a class First show that it holds for the objects created by constructors Then show that if it holds for any inputs to a method of the type, then it will hold for any inputs and new objects of the type when the method returns.
Programming Techniques Preserving rep invariant: IntSet.insert // c.els != null && // for all integers i . c.els[i] is an Integer && // for all integers i, j . (0 <= i < j < c.els.size => // c.els[i].intValue != c.els[j].intValue) is preserved by insert because: public void insert (int x) { Integer y = new Integer(x); if (getIndex(y) < 0) els.add(y); } the invariant holds for this at call time getIndex preserves the rep x is added to els only if x is not already in this (i.e. getIndex(y) returns -1)
Programming Techniques Preserving rep invariant: Poly.mul // c.trms != null && c.trms.length >= 1 && // c.deg = c.trms.length - 1 && // c.deg > 0 => c.trms[c.deg] != 0 is preserved by mul because: public Poly mul (Poly q) throws NullPointerException { if ((q.deg == 0 && q.trms[0] == 0) || (deg == 0 && trms[0] == 0)) return new Poly(); Poly r = new Poly(deg + q.deg); for (int i = 0; i <= deg; i++) for (int j = 0; j <= q.deg; j++) r.trms[i+j] = r.trms[i+j] + trms[i] * q.trms[j]; return r; } the invariant holds for this and q at call time if either q or this is the zero Poly, this is recognized otherwise both q and this have a nonzero coefficient in the high term => the high term of the result = product of high terms of q and this cannot be zero
Programming Techniques Reasoning about operations Proving that the rep invariant is preserved is only part of showing that an implementation is correct You also have to show that the specs (written in terms of abstract objects) correspond to the implementation (written in terms of representation) using the abstraction function.
Programming Techniques IntSet implementation The constructor: IntSet constructor returns an object whose els component is an empty vector. This is correct since the abstraction function maps the empty vector to the empty set. The size method: the size of the els vector is the cardinality of the set because the abstraction function maps the elements of the vector to the elements of the set and the rep invariant ensures there are no duplicates The remove method: checks if the element is in the vector and returns if it is not. Correct since if it isn’t in the vector, it isn’t in the set (this_post maps to this - {x}). Otherwise, the element is removed from the vector. Correct since the rep invariant guarantees there are no duplicates in the vector.
Programming Techniques 5.8 Design Issues Mutability: A type should be immutable if its objects would naturally have unchanging values. Mathematical objects such as integers, polynomials, and complex numbers are candidates. A type should be mutable if it is modeling something from the real world, where the state of the type changes over time. Mutability raises the issue of efficiency versus safety. Immutable abstractions are safer in the sense that no problems arise if their objects are shared, but they require frequent creation and discarding of objects. Mutability is a property of the abstraction rather than its implementation.
Programming Techniques Operation Categories Operation types 1. Creators: create objects without taking as input any object of their type. 2. Producers: create objects by taking inputs of their type. 3. Mutators: modify objects of their type. 4. Observers: take objects of their own type as input and return results of other types.
Programming Techniques Adequacy A data type is adequate if it provides enough operations so that everything users need to do with its objects can be done both conveniently and with reasonable efficiency. The notion of adequacy must take context of use into account. If the use of a type is limited then only a few operations need be provided; for general use, more operations are needed.
Programming Techniques Properties of data abstractions A data abstraction is mutable if it has any mutator methods Four kinds of operations are provided by data abstractions: creators: produce new objects from scratch producers: produce new objects from existing ones mutators: modify the state of objects observers: provide information about the state a data type is adequate if it provides enough operations so that whatever a user needs to do can be done conveniently and with reasonable efficiency
Programming Techniques 5.9 Locality and modifiability A data abstraction provides locality if users cannot modify components of the rep (i.e. it does not expose the rep) A data abstraction provides modifiability if in addition there is no way for using code to access (read) the rep
Programming Techniques 5.10 Summary This chapter defines data abstractions: what they are, how to specify their behavior, and how to implement them. We discussed both mutable abstractions, such as IntSet, and immutable abstractions, such as Poly