Object-Oriented Programming 95-712 MISM/MSIT Carnegie Mellon University Lecture 8: Iterators, Collections and Maps.



Today’s Topics
  Iterators
  Collections: Lists and Maps
  Hash functions
  Maps
  Some speed comparisons

Iterators: A Killer O-O Idea
  We saw an example of this several weeks ago: the Selector class.
  Recall that Selector was an interface, implemented as a private inner class.
  The Selector interface had methods to “move around” in an array and return elements. A “tour guide” through an array!
  Iterators are a Java extension of this idea.

A Primitive Iterator
  This provides a way to access elements in “container classes.”
  If everyone uses the same interface, new container class types are interchangeable.
    public interface Selector {
        boolean end();
        Object current();
        void next();
    }

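A Primitive Container Class
    public class Sequence {
        private Object[] objects;
        private int next = 0;   // index of the next free slot
        public Sequence(int size) {
            objects = new Object[size];
        }
        // Silently ignores additions once the array is full.
        public void add(Object x) {
            if (next < objects.length) {
                objects[next] = x;
                next++;
            }
        }
        // class continues on the next slide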

Sequence (cont.)
        private class SSelector implements Selector {
            int i = 0;
            public boolean end() {
                return i == objects.length;
            }
            public Object current() {
                return objects[i];
            }
            public void next() {
                if (i < objects.length) i++;
            }
        }
        public Selector getSelector() {
            return new SSelector();
        }
    }

Testing The Sequence Class
    public class TestSequence {
        public static void main(String[] args) {
            Sequence s = new Sequence(10);
            for (int i = 0; i < 10; i++)
                s.add(Integer.toString(i));
            Selector sl = s.getSelector();
            while (!sl.end()) {
                System.out.println(sl.current());
                sl.next();
            }
        }
    }

Iterators For Collections
  The Iterator interface specifies
    – boolean hasNext()
    – Object next()
    – void remove()
  You just have to be careful
    – to check hasNext() before using next()
    – to not modify the Collection while iterating, except by using remove()
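The two cautions above can be sketched as follows. This is a minimal illustration, not code from the lecture: the class and method names are hypothetical, and it assumes a Java version with generics (the slides use the older raw types).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, not part of the lecture code.
class IteratorRemoveSketch {
    // Removes every even number in place, using only the iterator's own remove().
    static List<Integer> dropEvens(List<Integer> numbers) {
        Iterator<Integer> it = numbers.iterator();
        while (it.hasNext()) {      // check hasNext() before each next()
            int n = it.next();
            if (n % 2 == 0) {
                it.remove();        // removes the element just returned by next()
            }
        }
        return numbers;
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>();
        for (int i = 0; i < 7; i++) numbers.add(i);
        System.out.println(dropEvens(numbers)); // prints [1, 3, 5]
    }
}
```

Calling numbers.remove(...) directly inside the loop instead of it.remove() would typically trigger a ConcurrentModificationException on the following next().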

Simple Iterator Example
  We get the iterator by asking the ArrayList for one.
  On creation, it is positioned “just before the beginning” of the ArrayList.
    ArrayList cats = new ArrayList();
    for (int i = 0; i < 7; i++)
        cats.add(new Cat(i));
    Iterator e = cats.iterator();
    while (e.hasNext())
        ((Cat) e.next()).print();

Let’s Be Clear On This!
    Iterator e = cats.iterator();
    while (e.hasNext())
        ((Cat) e.next()).print();
  When e is past the last element of the ArrayList cats (“here” in the slide’s diagram), hasNext() returns false.

Another Example
  There is no knowledge about the type of thing being iterated over.
  This also shows the power of the “toString() idea”.
    class Printer {
        static void printAll(Iterator e) {
            while (e.hasNext())
                System.out.println(e.next());   // println calls toString()
        }
    }
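To illustrate the point that the iterating code knows nothing about the container, here is a small sketch that feeds iterators from two different collection types to the same method. The class name and the joinAll method are hypothetical; it returns a String instead of printing so the result is easy to check.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical variant of the Printer idea, not part of the lecture code.
class PrintAllSketch {
    // Works for any Iterator; relies only on each element's toString().
    static String joinAll(Iterator<?> e) {
        StringBuilder sb = new StringBuilder();
        while (e.hasNext()) {
            sb.append(e.next());
            if (e.hasNext()) sb.append(", ");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("x", "y"));
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(3, 1, 2));
        System.out.println(joinAll(list.iterator())); // prints x, y
        System.out.println(joinAll(set.iterator()));  // prints 1, 2, 3
    }
}
```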

Collection Interface Methods
  boolean add(Object) (“optional”)
  boolean addAll(Collection) (“optional”)
  void clear() (“optional”)
  boolean contains(Object)
  boolean containsAll(Collection)
  boolean isEmpty()
  Iterator iterator()

Collection Interface Methods (cont.)
  boolean remove(Object) (“optional”)
  boolean removeAll(Collection) (“optional”)
  boolean retainAll(Collection) (“optional”)
  int size()
  Object[] toArray()
  Object[] toArray(Object[] a)

What’s Missing?
  All the methods that use indexes:
    – void add(int, Object)
    – boolean addAll(int, Collection)
    – Object get(int)
    – int indexOf(Object)
    – Object set(int, Object)
  Why? Sets (HashSet, TreeSet) have their own way of ordering their contents. But ArrayList and LinkedList have these methods, since they are…lists.
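A short sketch of the index-based methods listed above, as they behave on a List. The class and method names are hypothetical, and generics are used for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the lecture code.
class ListIndexSketch {
    static List<String> build() {
        List<String> letters = new ArrayList<>();
        letters.add("a");
        letters.add("c");
        letters.add(1, "b");    // insert at index 1, shifting "c" right
        letters.set(2, "C");    // replace the element at index 2
        return letters;
    }

    public static void main(String[] args) {
        List<String> letters = build();
        System.out.println(letters);              // prints [a, b, C]
        System.out.println(letters.get(0));       // prints a
        System.out.println(letters.indexOf("b")); // prints 1
    }
}
```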

Collections Example
    public class AABattery {
        public String toString() { return "AABattery"; }
    }
    public class NineVoltBattery {
        public String toString() { return "NineVoltBattery"; }
    }
    public class RollOfRibbon {
        public String toString() { return "RollOfRibbon"; }
    }
    public class PaperClip {
        int i;
        PaperClip(int i) { this.i = i; }
        public String toString() { return "PaperClip(" + i + ")"; }
    }

Collections Example (cont.)
    import java.util.ArrayList;

    public class BandAid {
        public String toString() { return "BandAid"; }
    }
    public class Box {
        ArrayList moreStuff = new ArrayList();
        // The list's own toString() supplies the bracketed contents.
        public String toString() {
            return "Box" + moreStuff;
        }
    }

Collections Example (cont.)
    public class BoxOfPaperClips {
        ArrayList clips = new ArrayList();
        public String toString() {
            return "BoxOfPaperClips" + clips;
        }
    }

    public class JunkDrawer {
        ArrayList contents = new ArrayList();
        public void fillDrawer() {
            contents.add(new RollOfRibbon());
            contents.add(new AABattery());
            contents.add(new NineVoltBattery());
            BoxOfPaperClips boxOfClips = new BoxOfPaperClips();
            for (int i = 0; i < 3; i++)
                boxOfClips.clips.add(new PaperClip(i));
            contents.add(boxOfClips);
            Box box = new Box();
            box.moreStuff.add(new AABattery());
            box.moreStuff.add(new BandAid());
            contents.add(box);
            contents.add(new AABattery());
        }
        // class continues on the next slide

Collections Example (cont.)
        public static void main(String[] args) {
            JunkDrawer kitchenDrawer = new JunkDrawer();
            kitchenDrawer.fillDrawer();
            System.out.println(kitchenDrawer.contents);
        }
    }
  This prints
    [RollOfRibbon, AABattery, NineVoltBattery, BoxOfPaperClips[PaperClip(0), PaperClip(1), PaperClip(2)], Box[AABattery, BandAid], AABattery]

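Removing Stuff
  This doesn’t work at all! AABattery does not override equals(), so remove() only matches the very same object, and a freshly created battery is never that object.
  You need to have a reference to a battery actually in the drawer. How do you figure out if something is an AABattery?
    void takeAnAABattery() {
        boolean b = contents.remove(new AABattery());   // always false here
        if (b)
            System.out.println("One AABattery removed");
    }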
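The failure comes down to how remove(Object) decides that two objects match: it uses equals(), which defaults to identity. The sketch below contrasts a class without an equals() override with a hypothetical variant that treats all instances as interchangeable. Neither class is from the lecture; the names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the lecture code.
class RemoveByEqualsSketch {
    // No equals() override: an instance is equal only to itself.
    static class PlainBattery { }

    // Assumed variant: every instance is considered equal to every other.
    static class EqualBattery {
        @Override public boolean equals(Object o) { return o instanceof EqualBattery; }
        @Override public int hashCode() { return 1; } // equal objects share a hash code
    }

    static boolean removeFreshPlain() {
        List<Object> drawer = new ArrayList<>();
        drawer.add(new PlainBattery());
        return drawer.remove(new PlainBattery()); // different object, no match
    }

    static boolean removeSameReference() {
        List<Object> drawer = new ArrayList<>();
        PlainBattery inDrawer = new PlainBattery();
        drawer.add(inDrawer);
        return drawer.remove(inDrawer);           // same object, matches
    }

    static boolean removeFreshEqual() {
        List<Object> drawer = new ArrayList<>();
        drawer.add(new EqualBattery());
        return drawer.remove(new EqualBattery()); // equals() says they match
    }

    public static void main(String[] args) {
        System.out.println(removeFreshPlain());    // prints false
        System.out.println(removeSameReference()); // prints true
        System.out.println(removeFreshEqual());    // prints true
    }
}
```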

Using RTTI
    boolean takeAnAABattery() {
        Iterator i = contents.iterator();
        Object aa = null;   // initialize, or compiler complains
        while (i.hasNext()) {
            if ((aa = i.next()) instanceof AABattery) {
                contents.remove(aa);   // the iterator is not used again after this
                return true;
            }
        }
        return false;
    }
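A variant of the same idea that removes through the iterator itself, which stays safe even if the loop were to continue after the removal. This is a sketch under assumptions: the class name, method name and the Class parameter are hypothetical and not part of the lecture's JunkDrawer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, not part of the lecture code.
class RemoveByTypeSketch {
    // Removes the first element that is an instance of the given type.
    static boolean removeFirstOfType(Collection<?> contents, Class<?> type) {
        Iterator<?> it = contents.iterator();
        while (it.hasNext()) {
            if (type.isInstance(it.next())) { // runtime type check, like instanceof
                it.remove();                  // removal via the iterator
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Object> drawer = new ArrayList<Object>(Arrays.asList("ribbon", 1, "clip", 2));
        System.out.println(removeFirstOfType(drawer, Integer.class)); // prints true
        System.out.println(drawer);                                   // prints [ribbon, clip, 2]
    }
}
```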

Containers Are Good, But…
  Everything in a container is “just an Object.”
  If you aren’t sure what’s in there, and its location, then finding what you want can be tedious.
  Can an über-hausfrau do better?

A “More Organized” Drawer
    public class MarthaStewartDrawer {
        ArrayList contents = new ArrayList();
        ArrayList aaBatteries = new ArrayList();   // a second view: just the AA batteries
        public void fillDrawer() {
            contents.add(new RollOfRibbon());
            AABattery a1 = new AABattery();
            AABattery a2 = new AABattery();
            contents.add(a1);
            aaBatteries.add(a1);
            //add all the rest…
            contents.add(a2);
            aaBatteries.add(a2);
        }
        // class continues on the next slide

Remove An Entire Collection
        boolean takeAllAABatteries() {
            return contents.removeAll(aaBatteries);
        }
        public static void main(String[] args) {
            MarthaStewartDrawer kitchenDrawer = new MarthaStewartDrawer();
            kitchenDrawer.fillDrawer();
            System.out.println(kitchenDrawer.contents);
            if (kitchenDrawer.takeAllAABatteries())
                System.out.println("All AABatteries removed");
            System.out.println(kitchenDrawer.contents);
        }

Or, Remove Everything Except...
  This is actually the “set intersection” of contents with aaBatteries.
  Note, however, that this removes the AABatterys in the Box…
    boolean leaveOnlyAABatteries() {
        return contents.retainAll(aaBatteries);
    }
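The two bulk operations can be sketched side by side: removeAll acts like a set difference and retainAll like a set intersection. The class and method names are hypothetical, and the sketch works on copies so the inputs are left untouched (the lecture's methods modify contents in place).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical demo class, not part of the lecture code.
class BulkOpsSketch {
    // Everything in a that is not in b.
    static List<String> difference(List<String> a, Collection<String> b) {
        List<String> copy = new ArrayList<>(a);
        copy.removeAll(b);
        return copy;
    }

    // Everything in a that is also in b (duplicates in a are kept).
    static List<String> intersection(List<String> a, Collection<String> b) {
        List<String> copy = new ArrayList<>(a);
        copy.retainAll(b);
        return copy;
    }

    public static void main(String[] args) {
        List<String> drawer = Arrays.asList("ribbon", "aa", "9v", "aa");
        List<String> batteries = Arrays.asList("aa");
        System.out.println(difference(drawer, batteries));   // prints [ribbon, 9v]
        System.out.println(intersection(drawer, batteries)); // prints [aa, aa]
    }
}
```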

Specialized Collections
  The List interface:
    – Gives you the Collection interface, plus more
    – Insertion order is preserved
    – You can “index into” a List
    – The concrete types are ArrayList and LinkedList.
  The Set interface:
    – Just the Collection interface, but with specialized behavior
    – Insertion order isn’t preserved.
    – The concrete types are HashSet and TreeSet.

Lists Produce ListIterators

Operational Efficiencies
  ArrayList
    – Holds data internally as an array (duh!)
    – Random access is fast, just “index into” the array
    – Insertion (except at the end) is very slow
  LinkedList
    – Random access is slow (but provided for)
    – Insertion anywhere is fast (once you are there!)

ListIterator
  void add(Object)
  boolean hasNext()
  boolean hasPrevious()
  Object next()
  int nextIndex()
  Object previous()
  int previousIndex()
  void remove()
  void set(Object) (this does replacement)
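Two of the ListIterator extras, replacement with set() and backward traversal with previous(), can be sketched as follows. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class, not part of the lecture code.
class ListIteratorSketch {
    // Replaces each word in place with its upper-case form.
    static List<String> shout(List<String> words) {
        ListIterator<String> it = words.listIterator();
        while (it.hasNext()) {
            String w = it.next();
            it.set(w.toUpperCase()); // replaces the element just returned
        }
        return words;
    }

    // Walks the list from the end to the start.
    static List<String> reversedCopy(List<String> words) {
        List<String> out = new ArrayList<>();
        ListIterator<String> it = words.listIterator(words.size()); // start past the end
        while (it.hasPrevious()) {
            out.add(it.previous());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "b", "c"));
        System.out.println(shout(words));        // prints [A, B, C]
        System.out.println(reversedCopy(words)); // prints [C, B, A]
    }
}
```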

LinkedList
  void addFirst(Object)
  void addLast(Object)
  Object getFirst()
  Object getLast()
  Object removeFirst()
  Object removeLast()
  It seems clear that a LinkedList is really a doubly-linked list. You can easily make a queue, deque, or stack class by simply deriving a subclass of LinkedList and limiting the subclass behavior. This is called “adapting.”
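As an illustration of building a stack on top of LinkedList, here is a minimal sketch. Note that it wraps a LinkedList in a private field rather than subclassing it as the slide suggests; wrapping is swapped in here because it keeps the non-stack methods hidden without extra work. All names are hypothetical.

```java
import java.util.LinkedList;

// Hypothetical demo class, not part of the lecture code.
class StackSketch {
    // A stack that exposes only push/pop/peek, backed by a LinkedList.
    static class SimpleStack<T> {
        private final LinkedList<T> items = new LinkedList<>();
        void push(T x) { items.addFirst(x); }
        T pop() { return items.removeFirst(); }  // throws NoSuchElementException if empty
        T peek() { return items.getFirst(); }
        boolean isEmpty() { return items.isEmpty(); }
    }

    // Pushes 1, 2, 3 and pops everything, recording the order.
    static String popOrder() {
        SimpleStack<Integer> stack = new SimpleStack<>();
        for (int i = 1; i <= 3; i++) stack.push(i);
        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append(stack.pop());
            if (!stack.isEmpty()) sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(popOrder()); // prints 3 2 1
    }
}
```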

The Set Interface
  Elements in Set implementations are unique: no duplicates allowed.
  Objects added to Sets must have equals() defined.
  Generally, there is no guarantee that elements will be in any particular order, but concrete instances (HashSet, TreeSet) don’t just randomly place elements!

TreeSet
  Guaranteed to keep elements in ascending order, according to their natural ordering, or through a Comparator.
  Each element must be comparable to every other element. You probably won’t put PaperClips and AABatterys into the same TreeSet.
  Iterators are fail-fast, meaning that you get an exception right away if you use the iterator on a TreeSet that’s been modified other than through the iterator’s remove().
  log(n) performance for the basic operations.
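The two ordering options, natural ordering and a Comparator, can be sketched like this. The class, method names and the length-based comparator are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class, not part of the lecture code.
class TreeSetSketch {
    // Natural ordering: Strings sort alphabetically, duplicates collapse.
    static List<String> sortedNatural(Collection<String> words) {
        return new ArrayList<>(new TreeSet<>(words));
    }

    // Comparator ordering: shorter words first, ties broken alphabetically.
    static List<String> sortedByLength(Collection<String> words) {
        Comparator<String> byLength = (a, b) -> {
            int c = Integer.compare(a.length(), b.length());
            return c != 0 ? c : a.compareTo(b);
        };
        TreeSet<String> set = new TreeSet<>(byLength);
        set.addAll(words);
        return new ArrayList<>(set);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "apple", "fig");
        System.out.println(sortedNatural(words));  // prints [apple, fig, pear]
        System.out.println(sortedByLength(words)); // prints [fig, pear, apple]
    }
}
```

The tie-break in the comparator matters: a TreeSet treats elements that compare as 0 as duplicates, so comparing by length alone would silently drop all but one word of each length.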

HashSet
  Constant time performance for the basic operations (at least on average).
  Iterators are fail-fast.
  Requires that objects implement a hashCode() method (one is provided by Object, but may not be optimal).
  In general, objects are not stored in order.

Hash Tables And Hash Functions
  A sneaky method for storage where fast look-up is desired.
  Two major components:
    – A bucket array, maintained by the hash table, and
    – A hash function, typically belonging to the class of objects to be stored.
  These two work in concert.

Storing Integers In A Hash Table (From Goodrich & Tamassia)
  Hash function h(k) = k % 13
  Bucket array size = “initial capacity” = 13
  Load factor = #objects/array size = 10/13
  Keys that hash to the same bucket are “collisions.”

Adding The Integer 33
  Hash function h(k) = k % 13 = 33 % 13 = 7

Adding The Integer 23
  Hash function h(k) = k % 13 = 23 % 13 = 10

Finding The Integer 28
  Hash function h(k) = k % 13 = 28 % 13 = 2
  Start here: the search begins at bucket 2.
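The bucket-array walk-through can be sketched in code. The slides' diagram is not available, so this sketch assumes one common collision strategy, linear probing (try the next bucket until a free one is found); the original figure may have used a different scheme such as chaining. The class name and the extra keys 20 and 46 are hypothetical, and keys are assumed to be non-negative.

```java
// Hypothetical sketch of a hash table for non-negative ints, using linear probing.
class ProbingTableSketch {
    private final Integer[] buckets;

    ProbingTableSketch(int capacity) {
        buckets = new Integer[capacity];
    }

    // h(k) = k % capacity, as on the slides (capacity 13 gives k % 13).
    int hash(int k) {
        return k % buckets.length;
    }

    // Stores k and returns the bucket index used, or -1 if the table is full.
    int add(int k) {
        int start = hash(k);
        for (int step = 0; step < buckets.length; step++) {
            int i = (start + step) % buckets.length;
            if (buckets[i] == null || buckets[i] == k) {
                buckets[i] = k;
                return i;
            }
        }
        return -1;
    }

    // Returns the bucket index holding k, or -1 if k is absent.
    int find(int k) {
        int start = hash(k);                  // "start here"
        for (int step = 0; step < buckets.length; step++) {
            int i = (start + step) % buckets.length;
            if (buckets[i] == null) return -1; // empty bucket ends the search
            if (buckets[i] == k) return i;
        }
        return -1;
    }

    public static void main(String[] args) {
        ProbingTableSketch table = new ProbingTableSketch(13);
        System.out.println(table.add(33));  // prints 7
        System.out.println(table.add(23));  // prints 10
        System.out.println(table.add(28));  // prints 2
        System.out.println(table.add(20));  // prints 8 (collides with 33 at bucket 7)
        System.out.println(table.find(20)); // prints 8
        System.out.println(table.find(46)); // prints -1
    }
}
```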

The Hash Function
  Should provide a “relatively unique” integer for each object stored.
  Every time it is invoked for the same object, it must return the same integer!
  If two objects are equal (according to equals(Object)), then they must return the same integer.
  If two objects are unequal, they need not return different integers (although they probably should).
  The default Object hashCode() method typically turns the address of an object into an int.
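A minimal sketch of a class that keeps equals() and hashCode() consistent, so that equal objects always return the same integer. The Point class is hypothetical and not part of the lecture; Objects.hash is a standard library helper.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class, not part of the lecture code.
class HashContractSketch {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Built from the same fields as equals(), so equal points hash alike.
        @Override public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // A HashSet can find an equal (but distinct) object only if the hash codes agree.
    static boolean findsEqualPoint() {
        Set<Point> points = new HashSet<>();
        points.add(new Point(1, 2));
        return points.contains(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(findsEqualPoint()); // prints true
    }
}
```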

Hash Function Example
    public class Student implements Comparable {
        public Student(String name, float gpa) {
            this.name = name;
            this.gpa = gpa;
        }
        public Student() {}
        // Orders students by ascending GPA.
        public int compareTo(Object o) {
            if ( ((Student)o).gpa < gpa )
                return 1;
            else if ( ((Student)o).gpa > gpa )
                return -1;
            else
                return 0;
        }
        // class continues on the next slide

Hash Function Example (cont.)
        // Two students are "equal" when their GPAs match.
        public boolean equals(Object o) {
            return gpa == ((Student) o).gpa;
        }
        public int hashCode() {
            return (int) (gpa*10.0);
        }
        public String getName() { return name; }
        public float getGpa() { return gpa; }
        private String name;
        private float gpa = 0.0F;   //make sure hashing works!
    }

Hash Function Example (cont.)
    public static void main(String[] args) {
        Student s1 = new Student("Fred", 3.0F);
        Student s2 = new Student("Sam", 3.5F);
        Student s3 = new Student("Steve", 2.1F);
        //Set studentSet = new TreeSet();
        Set studentSet = new HashSet();
        studentSet.add(s1);
        studentSet.add(s2);
        studentSet.add(s3);
        Iterator i = studentSet.iterator();
        while (i.hasNext())
            System.out.println( ((Student)i.next()).getName());
    }

  For this example, both TreeSet and HashSet return
    – Steve
    – Fred
    – Sam
  But if my GPA goes up to 2.2, HashSet gives
    – Fred
    – Sam
    – Steve

“Re-Hashing”
  If the load factor gets too large, searching performance goes way down.
  Typically, a hash table “adjusts itself” when the load factor exceeds some value (typically 0.75).
  The bucket array size is increased, and the elements are “re-hashed”, resulting in a new storage layout.
  Our original example had load factor 0.77, so let’s re-hash it.

The Original Hash Table
  Hash function h(k) = k % 13
  Bucket array size = 13
  Load factor = 0.77
  Increase bucket array size to 17

The Table Re-Hashed
  Hash function h(k) = k % 17
  Bucket array size = 17
  Load factor = 10/17 ≈ 0.59
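The arithmetic behind the re-hash can be sketched as follows: compute the load factor, compare it with the 0.75 threshold mentioned earlier, and recompute each key's bucket for the larger array. The class and method names are hypothetical; only the keys 33, 23 and 28 come from the slides.

```java
// Hypothetical helper class, not part of the lecture code.
class RehashSketch {
    // Load factor = number of stored objects / bucket array size.
    static double loadFactor(int count, int capacity) {
        return (double) count / capacity;
    }

    static boolean needsRehash(int count, int capacity, double threshold) {
        return loadFactor(count, capacity) > threshold;
    }

    // h(k) = k % capacity, for non-negative keys.
    static int bucketFor(int k, int capacity) {
        return k % capacity;
    }

    public static void main(String[] args) {
        System.out.println(needsRehash(10, 13, 0.75)); // prints true  (10/13 is about 0.77)
        System.out.println(needsRehash(10, 17, 0.75)); // prints false (10/17 is about 0.59)
        // The same keys land in different buckets after the array grows.
        for (int k : new int[] {33, 23, 28}) {
            System.out.println(k + ": " + bucketFor(k, 13) + " -> " + bucketFor(k, 17));
        }
    }
}
```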

The Map Interface
  A Map is for storing (key, value) pairs of objects.
  Also known as a dictionary or associative container.
  There are TreeMap, HashMap, and WeakHashMap.
  TreeMap is sorted (like TreeSet).

The Java Map Classes

Map Example: Counting Words
  Much ado lately about a work newly attributed to Shakespeare, as a result of computer analysis.
  Let’s write a program to tally the word frequencies in Shakespeare’s plays.
  This follows Eckel’s “Statistics” example (sort of…)

A Class To Hold The Counts
    public class WordCount {
        int i = 1;
        public String toString() { return Integer.toString(i); }
    }
  This will be the value part of the (key, value) pair. It just holds an int that will be incremented whenever its associated key (a word) is encountered again. Remember, both key and value must be objects.

A Class To Hold The Map
    import java.util.HashMap;

    public class WordFrequencies {
        public HashMap hm = new HashMap();
        public void put(String c) {
            if (hm.containsKey(c))
                ((WordCount)hm.get(c)).i++;     // word seen before: bump its count
            else
                hm.put(c, new WordCount());     // first sighting: count starts at 1
        }
        public String toString() { return hm.toString(); }
    }
  The put() method looks to see if the word is already in the map. If so, it updates the WordCount object. If not, it makes a new one.

A Class To Do The Work
    public class FindWordFrequencies {
        SimpleInput file = null;
        WordFrequencies wf = new WordFrequencies();
        FindWordFrequencies() {}
        FindWordFrequencies(String fileName) {
            file = new SimpleInput(fileName);
            file.setDelimiters(" \t,:;.?-[]{}!");
        }
        // class continues on the next slide
  The constructor opens a file, and sets SimpleInput’s delimiters. Punctuation and whitespace are ignored.

A Class To Do The Work (cont.)
        void buildWordFrequencyMap() {
            String nextWord = file.nextWord();
            while (nextWord != null) {   // haven’t reached EOF
                nextWord = nextWord.toLowerCase();
                wf.put(nextWord);
                nextWord = file.nextWord();
            }
        }
        public String toString() { return wf.toString(); }
    }
  String’s toLowerCase() method is used so that, e.g., “King” and “king” are considered the same word.

Finally, A Class To Test Everything
    public class Shakespeare {
        public static void main(String[] args) {
            FindWordFrequencies findFrequencies =
                new FindWordFrequencies("midsummer.txt");
            findFrequencies.buildWordFrequencyMap();
            System.out.println(findFrequencies.toString());
        }
    }
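For comparison, the same tally can be sketched with generics and Map.merge (Java 8 and later), which removes the need for a separate WordCount class. This is an alternative sketch, not the lecture's implementation: it reads from a String rather than through the course's SimpleInput class, the class and method names are hypothetical, and the regular expression is an approximation of the delimiter list above.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class, not part of the lecture code.
class WordCountSketch {
    // Splits on whitespace and the punctuation used as delimiters on the slides.
    private static final String DELIMITERS = "[\\s,:;.?\\-\\[\\]{}!]+";

    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>(); // TreeMap keeps keys sorted
        for (String word : text.toLowerCase().split(DELIMITERS)) {
            if (!word.isEmpty()) {
                counts.merge(word, 1, Integer::sum);   // insert 1, or add 1 to the old count
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("The king, the King!")); // prints {king=2, the=2}
    }
}
```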

Results
  On “A Midsummer Night’s Dream”, we get
    – a=280, abate=1, abide=2, abjure=1, able=2…
    – …yourself=3, yourselves=3, youth=7
  There are 17,214 total words, 3036 different words. “Love” is the most popular word longer than three letters (168 in various forms!); hate = 18, midsummer=10, methinks=9
  HashMap takes sec., TreeMap takes sec.

List Speed Comparisons
  Type        Get     Iteration  Insert  Remove
  array       1,430   3,850      na      na
  ArrayList   3,070   12,…       …       …,850
  LinkedList  16,320  9,…        …       …
  Vector      4,890   16,…       …       …,850