Lists and the Collection Interface Chapter 2 CS 225 1
Chapter Objectives
- Become familiar with the List interface
- Study array-based implementation of List
- Understand single-linked, double-linked, and circular lists
- Understand the meaning of big-O notation for algorithm analysis
- Study single-linked list implementation of List
- Understand the Iterator interface
- Implement Iterator for a linked list
- Understand testing strategies
- Become familiar with the Java Collections Framework
List
- An expandable collection of elements
- Each element has a position (index)
- We'll see two kinds of lists in this chapter
  - ArrayList
  - LinkedList
Arrays
- An array is an indexed structure
- Random access: can select its elements in arbitrary order using a subscript value
- Elements may be accessed in sequence using a loop that increments the subscript
- You cannot
  - Increase or decrease the length
  - Add an element at a specified position without shifting the other elements to make room
  - Remove an element at a specified position without shifting other elements to fill in the resulting gap
The List Interface
- List interface operations:
  - Finding a specified target
  - Adding an element to either end
  - Removing an item from either end
  - Traversing the list structure without a subscript
- Not all classes perform the allowed operations with the same degree of efficiency
- An array can store primitive-type data, whereas the List classes all store references to Objects. Autoboxing bridges the gap by converting primitives to their wrapper types automatically.
Java List Classes
The ArrayList Class
- Simplest class that implements the List interface
- Improvement over an array object
  - How?
- Used when a programmer wants to add new elements to the end of a list but still needs the capability to access the elements stored in the list in arbitrary order
Using ArrayList
Using ArrayList (figures: the list after removal of "Awful"; the list after replacing "Jumpy" with "Sneezy")
Generic Collections
- Java 5.0 introduced a language feature called generic collections (or generics)
- Generics allow you to define a collection that contains references to objects of a specific type
      List<String> myList = new ArrayList<String>();
  specifies that myList is a List of String, where String is a type parameter, which is analogous to a method parameter.
- Only references to objects of type String can be stored in myList, and all items retrieved are of type String.
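A minimal sketch of the declaration above in use. The class name GenericListDemo, the helper buildList, and the sample strings are assumptions made for illustration; they are not part of the slides.

```java
import java.util.ArrayList;
import java.util.List;

class GenericListDemo {
    // Builds a List<String>; the element values are illustrative only.
    static List<String> buildList() {
        List<String> myList = new ArrayList<String>();
        myList.add("Tom");
        myList.add("Dick");
        // myList.add(42);  // would not compile: 42 is not a String
        return myList;
    }

    public static void main(String[] args) {
        List<String> myList = buildList();
        String first = myList.get(0);  // no cast needed; get returns String
        System.out.println(first + " has length " + first.length());
    }
}
```

Because the type parameter is checked at compile time, the commented-out add of an int is rejected before the program ever runs.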
ArrayList Methods
Advantages of ArrayList
- The ArrayList gives you additional capability beyond what an array provides
- By combining autoboxing with generic collections, you can store and retrieve primitive data types when working with an ArrayList
Creating and Populating an ArrayList
    ArrayList<Integer> someInts = new ArrayList<Integer>();
    int[] nums = {5, 7, 2, 15};
    for (int i = 0; i < nums.length; i++)
        someInts.add(nums[i]);   // autoboxes int to Integer

    ArrayList<Entry> theDirectory = new ArrayList<Entry>();
    theDirectory.add(new Entry("Jane Smith", "555-549-1234"));
Traversing an ArrayList
    int sum = 0;
    for (int i = 0; i < someInts.size(); i++)
        sum += someInts.get(i);   // auto-unboxes Integer to int
    System.out.println("sum is " + sum);
ArrayList Implementation
- KWArrayList: a simple implementation of an ArrayList class
- Physical size of the array is indicated by the data field capacity
- Number of data items is indicated by the data field size
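KWArrayList class
    public class KWArrayList<E> {
        private E[] theData;
        private int size, capacity;

        @SuppressWarnings("unchecked")
        public KWArrayList() {
            capacity = 10;
            // A generic array cannot be created directly, so allocate Object[] and cast
            theData = (E[]) new Object[capacity];
        }
        // remaining methods not shown
    }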
ArrayList Operations
- add(E)
- add(int, E)
- remove(int)
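One way the three operations might be written over a backing array is sketched below. The class name SketchArrayList, the reallocate helper, and the doubling growth policy are assumptions for illustration and are not taken from the KWArrayList source.

```java
import java.util.Arrays;

class SketchArrayList<E> {
    private E[] theData;
    private int size = 0;
    private int capacity = 10;

    @SuppressWarnings("unchecked")
    SketchArrayList() {
        theData = (E[]) new Object[capacity];
    }

    // add(E): append at the end, growing the array first if it is full.
    boolean add(E item) {
        if (size == capacity) reallocate();
        theData[size++] = item;
        return true;
    }

    // add(int, E): shift elements at index and beyond one slot right, then insert.
    void add(int index, E item) {
        if (index < 0 || index > size) throw new ArrayIndexOutOfBoundsException(index);
        if (size == capacity) reallocate();
        for (int i = size; i > index; i--) theData[i] = theData[i - 1];
        theData[index] = item;
        size++;
    }

    // remove(int): shift elements after index one slot left to close the gap.
    E remove(int index) {
        if (index < 0 || index >= size) throw new ArrayIndexOutOfBoundsException(index);
        E removed = theData[index];
        for (int i = index + 1; i < size; i++) theData[i - 1] = theData[i];
        size--;
        theData[size] = null;  // drop the stale reference
        return removed;
    }

    E get(int index) {
        if (index < 0 || index >= size) throw new ArrayIndexOutOfBoundsException(index);
        return theData[index];
    }

    int size() { return size; }

    // Assumed growth policy: double the capacity and copy the old contents.
    private void reallocate() {
        capacity *= 2;
        theData = Arrays.copyOf(theData, capacity);
    }
}
```

The shifting loops in add(int, E) and remove(int) are what make those two operations take time proportional to the number of elements after the index.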
Non-generic KWArrayList
    public class KWArrayList {
        private Object[] theData;
        private int size, capacity;

        public KWArrayList() {
            capacity = 10;
            theData = new Object[capacity];
        }
        // remaining methods not shown
    }
Efficiency of Algorithms
- For programs that manage large collections of data, we need to be concerned with how efficient the program is.
- Measuring the time it takes for a particular part of the program to run is not easy to do accurately.
- We can characterize a program by how the execution time or memory requirements increase as a function of increasing input size
  - Big-O notation
- A simple way to estimate the big-O of an algorithm or program is to look at the loops and see whether they are nested
Example 1
How many times does the body of this loop execute?
    public static int search(int[] x, int target) {
        for (int i = 0; i < x.length; i++)
            if (x[i] == target)
                return i;
        return -1;
    }
On average, about x.length / 2 times when the target is present; x.length times when it is not
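Example 2
How many times does the body of this loop execute?
    public static boolean areDifferent(int[] x, int[] y) {
        for (int i = 0; i < x.length; i++)
            if (search(y, x[i]) != -1)
                return false;
        return true;
    }
In the worst case (the arrays have no element in common), the loop body executes x.length times and each call to search examines y.length elements: x.length * y.length comparisons in total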
Example 3
How many times does the body of this loop execute?
    public static boolean areUnique(int[] x) {
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < x.length; j++)   // increment j, not i
                if (i != j && x[i] == x[j])
                    return false;
        return true;
    }
In the worst case (all elements are unique), x.length * x.length times
Big-O Notation
- We generally specify the efficiency of an algorithm by giving an "order-of-magnitude" estimate of how the time taken to run it depends on the size of the input (n)
  - Example 1: O(x.length)
  - Example 2: O(x.length * y.length)
  - Example 3: O(x.length^2)
- We call this Big-O notation
Big-O
- Assume T(n) is a function that counts the number of operations in an algorithm as a function of n
- The algorithm is O(f(n)) if there exist two positive (> 0) constants n0 and c such that, for all n > n0, c * f(n) >= T(n)
- f(n), scaled by c, provides an upper bound on the time the algorithm takes to run
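Example 4
Consider a pair of nested loops (code not shown) in which each pass of the inner loop costs 3 operations:
- First time through the outer loop, the inner loop is executed n - 1 times; the next time n - 2 times, and the last time once.
- So we have
    T(n) = 3(n - 1) + 3(n - 2) + ... + 3
  or
    T(n) = 3(n - 1 + n - 2 + ... + 1)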
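The loop that this slide analyzes did not survive in the text. The sketch below is an assumed reconstruction of a loop with the described behavior (inner loop runs n - 1, n - 2, ..., 1 times), with a counter standing in for the three simple statements. The class name Example4Count and the method countOperations are hypothetical.

```java
class Example4Count {
    // Counts the operations performed by a nested loop whose inner loop
    // runs n-1 times on the first outer pass, n-2 on the next, ..., then once.
    static long countOperations(int n) {
        long count = 0;
        for (int i = 0; i < n - 1; i++) {
            for (int j = i + 1; j < n; j++) {
                count += 3;  // stands in for 3 simple statements
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // For n = 10 the inner body runs 9 + 8 + ... + 1 = 45 times.
        System.out.println(countOperations(10));  // prints 135
    }
}
```

For n = 10 the count is 3 * 45 = 135, which matches 1.5 * 10^2 - 1.5 * 10.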
Example 4 (cont.)
- We can reduce the expression in parentheses to n(n - 1) / 2
- So, T(n) = 1.5n^2 - 1.5n
- This polynomial is zero when n is 1. For values greater than 1, 1.5n^2 is always greater than 1.5n^2 - 1.5n
- Therefore, we can use 1 for n0 and 1.5 for c to conclude that T(n) is O(n^2)
Comparing Performance
Sample Numbers
O(f(n))      f(50)          f(100)          f(100)/f(50)
O(1)         1              1               1
O(log n)     5.64           6.64            1.18
O(n)         50             100             2
O(n log n)   282            664             2.35
O(n^2)       2500           10000           4
O(n^3)       125000         1000000         8
O(2^n)       1.13 x 10^15   1.27 x 10^30    1.13 x 10^15
O(n!)        3 x 10^64      9.3 x 10^157    3.1 x 10^93
Performance of KWArrayList
Method    Efficiency
add       O(1)
get       O(1)
insert    O(n)
remove    O(n)
Improving List Performance
- The ArrayList: inserting or removing at an arbitrary position takes linear time because a loop must shift elements in the underlying array
- A linked list overcomes this: given a reference to the adjacent node, an item can be added or removed anywhere in the list in constant time
- Each element (node) in a linked list stores information and a link to the next, and optionally previous, node
A List Node
- A node contains a data item and one or more links
- A link is a reference to another node
- A node is generally defined inside of another class, making it an inner class
- The details of a node should be kept private
- See KWLinkedList
Build A Single-Linked List
    Node<String> tom = new Node<String>("Tom");
    Node<String> dick = new Node<String>("Dick");
    tom.next = dick;
    Node<String> harry = new Node<String>("Harry");
    dick.next = harry;
Add to Single-Linked List
    Node<String> bob = new Node<String>("Bob");
    bob.next = harry.next;
    harry.next = bob;
Remove from Single-Linked List
    tom.next = dick.next;   // unlinks dick, the node after tom
Traversing a Single-Linked List
    set nodeRef to the first node
    while nodeRef is not null
        process the data in the node referenced by nodeRef
        set nodeRef to nodeRef.next
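The traversal pseudocode might translate to Java roughly as follows. The class TraverseDemo, its simplified Node class, the buildSample helper, and the " ==> " separator are assumptions for illustration.

```java
class TraverseDemo {
    // Simplified node: one data item and a link to the next node.
    static class Node<E> {
        E data;
        Node<E> next;
        Node(E data) { this.data = data; }
    }

    // Builds the three-node list Tom -> Dick -> Harry and returns its first node.
    static Node<String> buildSample() {
        Node<String> tom = new Node<String>("Tom");
        Node<String> dick = new Node<String>("Dick");
        Node<String> harry = new Node<String>("Harry");
        tom.next = dick;
        dick.next = harry;
        return tom;
    }

    // Visits every node once, following next links until null is reached.
    static <E> String traverse(Node<E> head) {
        StringBuilder sb = new StringBuilder();
        Node<E> nodeRef = head;
        while (nodeRef != null) {
            sb.append(nodeRef.data);              // process the data
            if (nodeRef.next != null) sb.append(" ==> ");
            nodeRef = nodeRef.next;               // advance
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(traverse(buildSample()));  // prints Tom ==> Dick ==> Harry
    }
}
```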
Other Methods
- To implement the List interface, we need to add methods to
  - get data at a particular index
  - set data at a particular index
  - add at a specified index
- Provide a helper method getNode to find the node at a particular index
  - What is the efficiency of this method?
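A hedged sketch of how getNode could support get, set, and add at an index. The class name SketchSingleLinkedList and all method bodies are assumptions written for this example; they are not copied from SingleLinkedList.java.

```java
class SketchSingleLinkedList<E> {
    private static class Node<E> {
        E data;
        Node<E> next;
        Node(E data, Node<E> next) { this.data = data; this.next = next; }
    }

    private Node<E> head = null;
    private int size = 0;

    // Walks index links from the head; the cost grows with index.
    private Node<E> getNode(int index) {
        Node<E> node = head;
        for (int i = 0; i < index && node != null; i++) {
            node = node.next;
        }
        return node;
    }

    E get(int index) {
        checkIndex(index);
        return getNode(index).data;
    }

    // Replaces the data at index and returns the old value.
    E set(int index, E newValue) {
        checkIndex(index);
        Node<E> node = getNode(index);
        E old = node.data;
        node.data = newValue;
        return old;
    }

    // Inserts at index: at the head, or after the node at index - 1.
    void add(int index, E item) {
        if (index < 0 || index > size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
        if (index == 0) {
            head = new Node<E>(item, head);
        } else {
            Node<E> prev = getNode(index - 1);
            prev.next = new Node<E>(item, prev.next);
        }
        size++;
    }

    int size() { return size; }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
    }
}
```

Since getNode follows up to index links, it runs in O(n) time, and so does every method that calls it.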
See SingleLinkedList.java
Double-Linked Lists
- Limitations of a single-linked list include:
  - Insertion at the front of the list is O(1), but insertion at other positions is O(n), where n is the size of the list
  - Can insert a node only after a referenced node
  - Can remove a node only if we have a reference to its predecessor node
  - Can traverse the list only in the forward direction
- These limitations are removed by adding a reference in each node to the previous node (double-linked list)
Double-Linked Lists
Inserting into a Double-Linked List
Inserting into a Double-Linked List
Removing from a Double-Linked List
Double-Linked List Class
- Similar to the single-linked list class, with an extra data member that references the end of the list
Circular Lists
- Circular-linked list: link the last node of a double-linked list to the first node and the first to the last
- Advantage: can traverse in forward or reverse direction even after you have passed the last or first node
  - Can visit all the list elements from any starting point
  - Can never fall off the end of a list
- Disadvantage: How do you know when to quit? (infinite loop!)
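One common answer to "how do you know when to quit?" is to remember the starting node and stop when the traversal returns to it. The sketch below assumes a simplified Node class and hypothetical helpers buildCircular and countFrom.

```java
class CircularDemo {
    static class Node<E> {
        E data;
        Node<E> next;
        Node<E> prev;
        Node(E data) { this.data = data; }
    }

    // Links the values into a circular double-linked list; returns the first node.
    static Node<String> buildCircular(String[] values) {
        Node<String> first = null;
        Node<String> last = null;
        for (String v : values) {
            Node<String> n = new Node<String>(v);
            if (first == null) {
                first = n;
            } else {
                last.next = n;
                n.prev = last;
            }
            last = n;
        }
        if (first != null) {   // close the circle in both directions
            last.next = first;
            first.prev = last;
        }
        return first;
    }

    // Visits each node exactly once, stopping when the walk is back at start.
    static <E> int countFrom(Node<E> start) {
        if (start == null) return 0;
        int count = 0;
        Node<E> nodeRef = start;
        do {
            count++;
            nodeRef = nodeRef.next;
        } while (nodeRef != start);
        return count;
    }

    public static void main(String[] args) {
        Node<String> first = buildCircular(new String[] {"Tom", "Dick", "Harry"});
        System.out.println(countFrom(first.next));  // prints 3, starting from Dick
    }
}
```

The do-while form matters here: a plain while loop testing nodeRef != start would exit before visiting anything.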
Circular Lists
The LinkedList<E> Class
- Part of the Java API
- Implements the List<E> interface using a double-linked list
- Look at the API
List Traversal using get
- What is the efficiency of
      for (int index = 0; index < aList.size(); index++) {
          E element = aList.get(index);
          // process element
      }
- get operates in O(n) time for a linked list
- Calling get n times results in O(n^2) behavior
- We ought to be able to traverse a list in O(n) time
The Iterator<E> Interface
- The interface Iterator is defined as part of the API package java.util
- The List interface declares the method iterator, which returns an Iterator object that will iterate over the elements of that list
- An Iterator does not refer to or point to a particular node at any given time but points between nodes
- Scanner and StringTokenizer use something like an iterator
The Iterator<E> Interface
- An Iterator allows us to keep track of where we are in a list
- The List interface has a method called iterator(), which returns an Iterator object
- next() returns the next element and advances the iterator
The Iterator<E> Interface
- Get O(n) efficiency with
      while (iter.hasNext()) {
          E element = iter.next();
          // process element
      }
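A self-contained version of the loop above, using java.util.LinkedList. The class IteratorDemo and its methods sum and removeDivisibleBy are illustrative names chosen for this sketch.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

class IteratorDemo {
    // Single pass over the list: each next() call is O(1) for a LinkedList.
    static int sum(List<Integer> aList) {
        int total = 0;
        Iterator<Integer> iter = aList.iterator();
        while (iter.hasNext()) {
            int element = iter.next();
            total += element;
        }
        return total;
    }

    // Removes, through the iterator, every element divisible by div.
    static void removeDivisibleBy(List<Integer> aList, int div) {
        Iterator<Integer> iter = aList.iterator();
        while (iter.hasNext()) {
            if (iter.next() % div == 0) {
                iter.remove();  // removes the element just returned by next()
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> aList = new LinkedList<Integer>();
        for (int i = 1; i <= 6; i++) aList.add(i);
        System.out.println(sum(aList));  // prints 21
        removeDivisibleBy(aList, 2);
        System.out.println(aList);       // prints [1, 3, 5]
    }
}
```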
Example of Iterator
Improving on Iterator
- Iterator limitations
  - Can only traverse the List in the forward direction
  - Provides only a remove method
  - Must advance an iterator using your own loop if the starting position is not at the beginning of the list
ListIterator
- ListIterator<E> is an extension of the Iterator<E> interface for overcoming the above limitations
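A short sketch of two things a ListIterator can do that a plain Iterator cannot: traverse backward and replace an element in place. The class ListIteratorDemo and the methods backward and replace are names invented for this example.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class ListIteratorDemo {
    // Walks to the end of the list, then collects the elements in reverse order.
    static List<String> backward(List<String> list) {
        List<String> result = new ArrayList<String>();
        ListIterator<String> iter = list.listIterator();
        while (iter.hasNext()) iter.next();        // advance to the end
        while (iter.hasPrevious()) result.add(iter.previous());
        return result;
    }

    // Replaces every occurrence of target using ListIterator.set.
    static void replace(List<String> list, String target, String replacement) {
        ListIterator<String> iter = list.listIterator();
        while (iter.hasNext()) {
            if (iter.next().equals(target)) {
                iter.set(replacement);  // replaces the element just returned
            }
        }
    }

    public static void main(String[] args) {
        List<String> names = new LinkedList<String>();
        names.add("Tom");
        names.add("Dick");
        names.add("Harry");
        System.out.println(backward(names));  // prints [Harry, Dick, Tom]
        replace(names, "Dick", "Bob");
        System.out.println(names);            // prints [Tom, Bob, Harry]
    }
}
```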
The ListIterator<E> Interface
The ListIterator<E> Interface (continued)
Iterator vs. ListIterator
- ListIterator is a subinterface of Iterator; classes that implement ListIterator provide all the capabilities of both
- The Iterator interface requires fewer methods and can be used to iterate over more general data structures, but only in one direction
- Iterator is required by the Collection interface, whereas ListIterator is required only by the List interface
Combining ListIterator and Indexes
- ListIterator has the methods nextIndex and previousIndex, which return the index values associated with the items that would be returned by a call to the next or previous methods
- The LinkedList class has the method listIterator(int index)
  - Returns a ListIterator whose next call to next will return the item at position index
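The relationship between listIterator(int index), previousIndex, nextIndex, and next can be checked with a small sketch. The class IndexedIteratorDemo and the method describe are hypothetical names used only here.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.ListIterator;

class IndexedIteratorDemo {
    // Positions an iterator just before position index and reports what it sees.
    static String describe(List<String> list, int index) {
        ListIterator<String> iter = list.listIterator(index);
        int prev = iter.previousIndex();  // index - 1
        int next = iter.nextIndex();      // index
        String item = iter.next();        // the item at position index
        return prev + " " + next + " " + item;
    }

    public static void main(String[] args) {
        List<String> letters = new LinkedList<String>();
        letters.add("a");
        letters.add("b");
        letters.add("c");
        System.out.println(describe(letters, 1));  // prints 0 1 b
    }
}
```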
The Enhanced for Statement
- Java has a special for statement that can be used with collections
      for (E element : list)
          // process element
- This type of loop uses the Iterator available in the list to traverse the elements of the list.
The Iterable Interface
- This interface requires only that a class that implements it provide an iterator method
- The Collection interface extends the Iterable interface, so all classes that implement the List interface (a subinterface of Collection) must provide an iterator method
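Any class that implements Iterable can be used in the enhanced for statement, not just the library collections. The Countdown class below is an invented example to show this; it is not part of the chapter's code.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical class: iterates over start, start - 1, ..., 1.
class Countdown implements Iterable<Integer> {
    private final int start;

    Countdown(int start) { this.start = start; }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int next = start;

            @Override
            public boolean hasNext() { return next > 0; }

            @Override
            public Integer next() {
                if (!hasNext()) throw new NoSuchElementException();
                return next--;
            }

            @Override
            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    // The enhanced for loop calls iterator(), hasNext(), and next() behind the scenes.
    static int sum(Iterable<Integer> items) {
        int total = 0;
        for (int value : items) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new Countdown(3)));  // prints 6
    }
}
```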
Implementation of a Double-Linked List
Double-Linked List with Iterator
Advancing the Iterator
KWLinkedList
- This is a double-linked list
- It implements ListIterator
- Most of the methods use a ListIterator to do their task
Adding to an Empty Double-Linked List
Adding to Front of a Double-Linked List
Adding to End of a Double-Linked List
Adding to Middle of a Double-Linked List
The Collection Hierarchy
- Both the ArrayList and LinkedList represent a collection of objects that can be referenced by means of an index
- The Collection interface specifies a subset of the methods specified in the List interface
The Collection Hierarchy
Common Features of Collections
- The Collection interface specifies a set of common methods
- Fundamental features include:
  - Collections grow as needed
  - Collections hold references to objects
  - Collections have at least two constructors
Common Features of Collections
LinkedList Application
- Case study that uses the Java LinkedList class to solve a common problem: maintaining an ordered list
- The list has-a LinkedList inside it
  - An example of aggregation
- The list operations are delegated to the LinkedList
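A minimal sketch of the aggregation and delegation idea, assuming a class named SketchOrderedList whose insertion uses a ListIterator to find the right position. The names and method bodies are illustrative and are not the case study's actual code.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.ListIterator;

class SketchOrderedList<E extends Comparable<E>> implements Iterable<E> {
    // Aggregation: the ordered list has-a LinkedList.
    private final LinkedList<E> theList = new LinkedList<E>();

    // Inserts obj in front of the first element that is larger than it.
    void add(E obj) {
        ListIterator<E> iter = theList.listIterator();
        while (iter.hasNext()) {
            if (obj.compareTo(iter.next()) < 0) {
                iter.previous();  // step back so obj lands before the larger item
                iter.add(obj);
                return;
            }
        }
        iter.add(obj);  // obj is the largest so far; append at the end
    }

    // Delegation: these operations are passed straight to the LinkedList.
    E get(int index) { return theList.get(index); }

    int size() { return theList.size(); }

    @Override
    public Iterator<E> iterator() { return theList.iterator(); }

    @Override
    public String toString() { return theList.toString(); }

    public static void main(String[] args) {
        SketchOrderedList<Integer> ordered = new SketchOrderedList<Integer>();
        ordered.add(5);
        ordered.add(2);
        ordered.add(8);
        ordered.add(3);
        System.out.println(ordered);  // prints [2, 3, 5, 8]
    }
}
```

Only an ordered add is exposed, so callers cannot insert at an arbitrary index and break the ordering.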
OrderedList Application
Ordered List
Ordered List Insertion
Iterator Integrity
Potential Iterator Pitfalls
- Null references
  - A well-designed and implemented iterator should never return a null
- References to removed cells
  - Using the regular remove method while there is an active iterator
  - Using the iterator remove method when there are multiple active iterators
Approaches
- Do nothing and hope for the best
- Lock the collection so it can't change while an iterator is active
  - This limits what you can do
  - What if you need multiple iterators?
- Design the iterator to "fail fast"
  - This is the approach used in the Java Collections
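The fail-fast behavior can be observed with java.util.ArrayList: changing the list directly while an iterator is active makes the iterator's next call throw ConcurrentModificationException. The class FailFastDemo and its method are names made up for this sketch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class FailFastDemo {
    // Returns true if the iterator detects the modification made behind its back.
    static boolean modifyDuringIteration() {
        List<String> list = new ArrayList<String>(Arrays.asList("a", "b", "c"));
        Iterator<String> iter = list.iterator();
        iter.next();
        list.add("d");  // structural change that bypasses the iterator
        try {
            iter.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;  // the iterator failed fast
        }
    }

    public static void main(String[] args) {
        System.out.println(modifyDuringIteration());  // prints true
    }
}
```

Removing through the iterator's own remove method does not trigger the exception, because the iterator updates its bookkeeping along with the list.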
Java Example
- ArrayList extends AbstractList
- Code is available in /usr/local/java/src/java/util