Data Abstraction CS 201j: Engineering Software University of Virginia Computer Science Nathanael Paul
Overview Data abstraction Specification/Design of Abstract Data Types (ADTs) Implementation of ADTs
The Problem Programs are complex. –Windows XP: ~45 million lines of code –Mathematica: over 1.5 million lines of code Abstraction helps –Many-to-one: “forget the details” –Must separate “what” from “how”
Information Hiding Modularity - Procedural abstraction –By specification Locality Modifiability –By parameterization Data Abstraction –What you can do with the data is separated from how it is represented
Software development cycle Specifications – What do you want to do? Design – How will you do what you want? Implement – Code it. Test – Check if it works. Maintain – School projects don’t usually make it this far. Bugs are cheaper earlier in the cycle!
Database Implementation The database on the library web server stores information on users: userID, name, email address, etc. You are responsible for implementing the interface between the web server and the database –What happens when we ask for the address of a specific user?
Client asks for address “What is the address of nate?” [Figure: Client, Server, Database]
Client/Server/Database Interaction “I need Nate’s address.” The interaction between the server and the database is your part. [Figure: Client, Server, Database]
Example: Database System Need a new data type Abstract Data Types (ADTs) –Help separate what from how –Client will use the specifications for interaction with data –Client of the web database should not know the “guts” of the implementation
Data abstraction in Java An ADT is defined by a class –The ADT in the web/database application will be a User –A private instance variable hides the class internals –public String get (); What is private in the implementation? OVERVIEW, EFFECTS, MODIFIES –A class does not provide data abstraction by itself
Accessibility Class User { // OVERVIEW: // mutable object // where the User // is a library // member. public String ; … } String nate = myUser. ; send (nate ); /* The client’s code can only see what is made public in the User class. The user’s data is public in the User class. This is BAD. */ /* Client code using a User object, myUser */
Program Maintenance Suppose storage space is at a premium –Everyone in the database is at virginia.edu, so we can drop the virginia.edu –What kind of problems will occur with the code just seen?
Program Maintenance Suppose storage space is at a premium –Everyone in the database is at virginia.edu, so we can drop the virginia.edu –What kind of problems could occur had the client code been able to access the address directly? String nate = myUser. ; send (nate ); The field was public in the User class. ***ERROR!!!***
Accessibility (fixed) Class User { // OVERVIEW: A // mutable object where // User is a library // member. private String ; … public String get () { // EFFECTS: returns user’s // primary return ; } String nate = myUser.get (); send (nate ); /* This code properly uses data abstraction when returning the full address. */ // Client code using a User object, myUser
Accessibility (fixed) Class User { // OVERVIEW: A // mutable object where // User is a library // member. private String ; … public String get () { // EFFECTS: returns user’s // primary return ”; } String nate = myUser.get (); send (nate ); /* The database dropped and only one line of code needed changing. */ // Client code using a User object, myUser
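The fixed User class can be sketched as a compilable example. The identifiers on the slides did not survive, so the names `localPart`, `getEmail`, and `DOMAIN` below are assumptions chosen for illustration, as is the two-argument constructor. The sketch stores only the part of the address before the `@`, which is the storage-saving change described on the Program Maintenance slide, while clients still receive the full address.

```java
// Sketch only: localPart, getEmail(), DOMAIN, and the constructor are
// assumed names, not identifiers taken from the slides.
class User {
    // OVERVIEW: A mutable object where User is a library member.
    private static final String DOMAIN = "@virginia.edu";

    private final String userID;
    private String localPart;   // representation: the address without DOMAIN

    User(String userID, String email) {
        if (!email.endsWith(DOMAIN)) {
            throw new IllegalArgumentException("expected an address ending in " + DOMAIN);
        }
        this.userID = userID;
        this.localPart = email.substring(0, email.length() - DOMAIN.length());
    }

    public String getUserID() {
        // EFFECTS: returns the user's ID
        return userID;
    }

    public String getEmail() {
        // EFFECTS: returns the user's full primary address
        return localPart + DOMAIN;
    }
}

class UserDemo {
    public static void main(String[] args) {
        // Client code sees only the public methods, never localPart.
        User myUser = new User("nate", "nate@virginia.edu");
        System.out.println(myUser.getEmail());
    }
}
```

Because the client only calls the getter, switching the stored form from the full address to the shortened one changes nothing outside the User class.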
Advantages/Disadvantages of Data Abstraction? - More code to write and maintain initially -Overhead of calling a method -Greater initial time investment + Client doesn’t need to know about representation + Maintenance is easier. + Increases locality and modifiability
Specifying ADTs
Bad Users at the Library The library now wants to crack down on bad Users with overdue books, so the code will need to work with a group of Users. What should be used to represent the group? What data structures do we know about? How should we integrate this code with what we have? What operations should be supported? –deleteUser(String userID); –isInGroup(String userID);
Library keeping track of “bad” people You need to write some code that will manipulate a group of Users that are on the “bad” list. The implementation at right uses an array.
class GroupUsers {
    // OVERVIEW: Operations provided to manage a mutable group of users
    private User[] latePeople;
    …
    public String toString() {
        // EFFECTS: Returns the names of the users in this group
        …
    }
}
Array implementation initialization for GroupUsers
class GroupUsers {
    // OVERVIEW: Unbounded, mutable group of Users
    private User[] latePeople;
    …
    // A constructor has no return type; with "void" this would be an ordinary method.
    public GroupUsers(String[] userIDs) {
        // EFFECTS: Initializes this to the group of users named by userIDs
        latePeople = new User[userIDs.length + 10];   // 10 spare slots for later additions
        for (int i = 0; i < userIDs.length; i++) {
            latePeople[i] = new User(userIDs[i]);
        }
    }
}
ADT design Mutable/Immutable ADTs –Mutable – object’s fields or values change –Immutable – object’s fields permanently set at creation –Is this being modified? Tradeoffs Immutability simpler and safer Immutability is slower (creation/deletion of objects)
Classification of ADT operations Creator (constructor) –GroupUsers(String userIDs[ ]) Producer –addUser(String userID) Mutator –setUser (String ) Observer –isMember (String userID)
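The four kinds of operation can be seen side by side in a small sketch. `IdGroup` and all of its method names are hypothetical and were chosen so they do not clash with the operations named on the slide. The group stores plain userID strings to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical ADT for illustration; IdGroup and its method names are
// assumptions and do not appear in the slides.
class IdGroup {
    // OVERVIEW: A mutable group of distinct user IDs.
    private final List<String> ids = new ArrayList<>();

    // Creator: builds a new object from scratch.
    IdGroup(String[] userIDs) {
        for (String id : userIDs) {
            insert(id);
        }
    }

    // Producer: builds a new object from existing objects of the same type.
    IdGroup union(IdGroup other) {
        // EFFECTS: returns a new group; this and other are unchanged
        IdGroup result = new IdGroup(ids.toArray(new String[0]));
        for (String id : other.ids) {
            result.insert(id);
        }
        return result;
    }

    // Mutator: changes the state of this.
    void insert(String userID) {
        // MODIFIES: this
        // EFFECTS: adds userID to this unless it is already a member
        if (!ids.contains(userID)) {
            ids.add(userID);
        }
    }

    // Observers: report on the state without changing it.
    boolean contains(String userID) {
        return ids.contains(userID);
    }

    int size() {
        return ids.size();
    }
}
```

An immutable version of the same ADT would drop the mutator and offer only creators, producers, and observers.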
Implementing ADTs
A bad implementation Most common characteristics –Modifying implementation forces other code to be changed (violates modifiability) –Must understand more code than necessary to reason about code (violates locality) –Maintenance is difficult
A good implementation The User class needed a way to store the state of a user, so its operations are built around the stored state. Methods should be (procedural abstraction): –As easy to code as possible –Efficient –Local in their effects (exhibit locality) –Easy to test and maintain
Changing the group implementation The “guts” of the implementation are subject to change. What happens in GroupUsers’ deleteUser(String userID)?
deleteUser(String userID) With the array implementation, deleting an element means shifting every later element down one slot: an average of n/2 items per deletion.
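The cost of deleting from an array can be made concrete with a short sketch. It is an assumption-laden simplification: `ArrayGroup` is a hypothetical stand-in for GroupUsers, it stores userID strings instead of User objects, and the `numUsers` counter and `indexOf` helper are added for illustration.

```java
// Sketch: stores user IDs as Strings to stay self-contained; the slides
// store User objects. ArrayGroup, numUsers, and indexOf are assumed names.
class ArrayGroup {
    private String[] latePeople;
    private int numUsers;        // number of slots currently in use

    ArrayGroup(String[] userIDs) {
        latePeople = new String[userIDs.length + 10];
        System.arraycopy(userIDs, 0, latePeople, 0, userIDs.length);
        numUsers = userIDs.length;
    }

    public boolean isInGroup(String userID) {
        return indexOf(userID) >= 0;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes userID from this if present
        int i = indexOf(userID);
        if (i < 0) {
            return;
        }
        // Shift every later element down one slot to close the gap.
        System.arraycopy(latePeople, i + 1, latePeople, i, numUsers - i - 1);
        latePeople[--numUsers] = null;   // clear the now-unused last slot
    }

    public int size() {
        return numUsers;
    }

    private int indexOf(String userID) {
        for (int i = 0; i < numUsers; i++) {
            if (latePeople[i].equals(userID)) {
                return i;
            }
        }
        return -1;
    }
}
```

Deleting the first element moves n-1 items and deleting the last moves none, which is where the n/2 average comes from when every position is equally likely.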
Linked Lists A new data structure. Each User has its own representation, but we store the collection in a list. In the following implementation, each User object is contained in a Node object. [Figure: a list running from Head through User 1, User 2, and User 3 to the end marker X]
List-node implementation
class Node {
    // OVERVIEW: Mutable node used in a linked list of users
    private User theUser;
    private Node next;
    …
}
next points to the next “bad” user. [Figure: latePeople, User 1, User 2, …]
class GroupUsers { // OVERVIEW: // Mutable, unbounded group of users private Node latePeople; /* head of list */ private int numUsers; … } /* Nodes are users with an additional member field called next. The Node class was added, so the User class would not need modification. */ List implementation
Adding a user into GroupUsers
/* in GroupUsers.java */
public void addUser(User newUser) {
    // MODIFIES: this
    // EFFECTS: this_post = this_pre U { newUser }
    latePeople.add(new Node(newUser));
    numUsers++;
}
Adding a node into a group of nodes (Node.java)
public void add(Node n) {
    // MODIFIES: this, n
    // EFFECTS: n is inserted just after this in the list
    n.next = this.next;   // null when this is the last node in the list
    this.next = n;
}
deleteUser(String userID) cont. [Figure: before the deletion the list holds Head, User 1, User 2, User 3; after User 2 is deleted it holds Head, User 1, User 3]
deleteUser(String userID) Node.java
public void delete(String userID) {
    // MODIFIES: this
    // EFFECTS: this_post = this_pre – { node } where node.userID = userID
    Node currNode;
    Node prevNode;
    if (this.next == null) return;   // no node after this one, so nothing to delete
    prevNode = this;
    currNode = this.next;
    // continued on next slide
deleteUser(String userID) cont.
    while (currNode.next != null) {
        if (userID.equals(currNode.getUserID())) {
            prevNode.next = currNode.next;   // unlink currNode
            break;
        }
        currNode = currNode.next;
        prevNode = prevNode.next;
    }
    // user at end of list?
    if (currNode.next == null && userID.equals(currNode.getUserID())) {
        prevNode.next = null;
    }
}
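For comparison, the same deletion can be written as one loop that treats the first, middle, and last nodes uniformly. This is a sketch under stated assumptions rather than the course's implementation: `IdNode` and `ListGroup` are hypothetical names, nodes hold a userID string instead of a User, and the list has no separate head node, so `head` is null when the group is empty.

```java
// Sketch: nodes hold a userID String rather than a User object so the
// example is self-contained; IdNode and ListGroup are assumed names.
class IdNode {
    final String userID;
    IdNode next;

    IdNode(String userID) {
        this.userID = userID;
    }
}

class ListGroup {
    private IdNode head;     // first node, or null when the group is empty
    private int numUsers;

    public void addUser(String userID) {
        // MODIFIES: this
        // EFFECTS: inserts userID at the front of the list
        IdNode n = new IdNode(userID);
        n.next = head;
        head = n;
        numUsers++;
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes the first node holding userID, if any
        IdNode prev = null;
        for (IdNode curr = head; curr != null; prev = curr, curr = curr.next) {
            if (curr.userID.equals(userID)) {
                if (prev == null) {
                    head = curr.next;        // deleting the first node
                } else {
                    prev.next = curr.next;   // unlink a middle or last node
                }
                numUsers--;
                return;
            }
        }
    }

    public boolean isInGroup(String userID) {
        for (IdNode curr = head; curr != null; curr = curr.next) {
            if (curr.userID.equals(userID)) {
                return true;
            }
        }
        return false;
    }

    public int size() {
        return numUsers;
    }
}
```

Unlinking touches only one or two references, so no elements are shifted, although finding the node still takes a walk down the list.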
Linked List vs. Array Array is better for: –Accessing a randomly desired element Linked list is better at: –Inserting –Deleting –Dynamic resizing Users of your implementation may need to use a list or an array for efficiency, so you need an implementation that can be changed easily.
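One way to keep the representation easy to change is to hide it behind the group's public operations. The sketch below is illustrative only: `SwappableGroup` and its `preferRandomAccess` flag are hypothetical, the group stores userID strings, and it relies on the standard `java.util.ArrayList` and `java.util.LinkedList` classes in place of the hand-written array and list.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Sketch: SwappableGroup and preferRandomAccess are assumed names.
class SwappableGroup {
    // OVERVIEW: Mutable, unbounded group of distinct user IDs.
    // The representation is private, so clients cannot tell which
    // List implementation is in use.
    private final List<String> latePeople;

    SwappableGroup(boolean preferRandomAccess) {
        if (preferRandomAccess) {
            latePeople = new ArrayList<String>();    // array-backed
        } else {
            latePeople = new LinkedList<String>();   // node-backed
        }
    }

    public void addUser(String userID) {
        // MODIFIES: this
        // EFFECTS: adds userID to this unless it is already a member
        if (!latePeople.contains(userID)) {
            latePeople.add(userID);
        }
    }

    public void deleteUser(String userID) {
        // MODIFIES: this
        // EFFECTS: removes userID from this if present
        latePeople.remove(userID);
    }

    public boolean isInGroup(String userID) {
        return latePeople.contains(userID);
    }

    public int size() {
        return latePeople.size();
    }
}
```

Client code written against addUser, deleteUser, and isInGroup behaves the same with either representation, so the choice can be revisited later without touching the callers.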
Questions?