9-1 9 Set ADTs Set concepts. Set applications. A set ADT: requirements, contract. Implementations of sets: using arrays, linked lists, boolean arrays.


9-1 9 Set ADTs
• Set concepts.
• Set applications.
• A set ADT: requirements, contract.
• Implementations of sets: using arrays, linked lists, boolean arrays.
• Sets in the Java class library.
© 2001, D.A. Watt and D.F. Brown

9-2 Set concepts (1)
A set is a collection of distinct members (values or objects), whose order is insignificant.
Notation for sets: {a, b, …, z}. The empty set is { }.
• Set notation is used here, but is not supported by Java.

9-3 Set concepts (2)
Examples of sets:
Set of integers:
  evens = {0, 2, 4, 6, 8}
Set of characters:
  punct = {‘.’, ‘!’, ‘?’, ‘:’, ‘;’, ‘,’}
Sets of countries:
  EU = {AT, BE, DE, DK, ES, FI, FR, GR, IE, IT, LU, NL, PT, SE, UK}
  NAFTA = {CA, MX, US}
  NATO = {BE, CA, CZ, DE, DK, ES, FR, GR, HU, IS, IT, LU, NL, NO, PL, PT, TR, UK, US}

9-4 Set concepts (3)
The cardinality of a set s is the number of members of s. This is written #s. E.g.:
  #EU = 15
  #{red, white, red} = 2 (duplicate members aren’t counted)
An empty set has cardinality zero.
We can test whether x is a member of set s (i.e., s contains x). This is the membership test, written x ∈ s. E.g.:
  UK ∈ EU
  SE ∈ EU
  SE ∉ NATO (SE is not a member of NATO)

9-5 Set concepts (4)
Two sets are equal if they contain exactly the same members. E.g.:
  NAFTA = {US, CA, MX} (order of members doesn’t matter)
  NAFTA ≠ {CA, US} (these two sets are unequal)
Set s₁ subsumes (is a superset of) set s₂ if every member of s₂ is also a member of s₁. This is written s₁ ⊇ s₂. E.g.:
  NATO ⊇ {CA, US}
  NATO ⊉ EU (NATO does not subsume EU)
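The concepts so far (cardinality, membership, equality, subsumption) can be tried out directly with the java.util.Set interface. The sketch below is only an illustration: the class name SetConceptsDemo and the helper setOf are hypothetical, and HashSet is an arbitrary choice of implementation.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SetConceptsDemo {
    // Hypothetical helper: build a set from the listed members.
    // Duplicates collapse automatically, because a set holds distinct members.
    static Set<String> setOf(String... members) {
        return new HashSet<>(Arrays.asList(members));
    }

    static final Set<String> EU = setOf("AT", "BE", "DE", "DK", "ES", "FI", "FR",
            "GR", "IE", "IT", "LU", "NL", "PT", "SE", "UK");
    static final Set<String> NAFTA = setOf("CA", "MX", "US");
    static final Set<String> NATO = setOf("BE", "CA", "CZ", "DE", "DK", "ES", "FR",
            "GR", "HU", "IS", "IT", "LU", "NL", "NO", "PL", "PT", "TR", "UK", "US");

    public static void main(String[] args) {
        System.out.println(EU.size());                              // #EU -> 15
        System.out.println(setOf("red", "white", "red").size());    // -> 2
        System.out.println(EU.contains("SE"));                      // SE ∈ EU -> true
        System.out.println(NATO.contains("SE"));                    // SE ∈ NATO -> false
        System.out.println(NAFTA.equals(setOf("US", "CA", "MX")));  // -> true
        System.out.println(NAFTA.equals(setOf("CA", "US")));        // -> false
        System.out.println(NATO.containsAll(setOf("CA", "US")));    // NATO ⊇ {CA, US} -> true
        System.out.println(NATO.containsAll(EU));                   // NATO ⊇ EU -> false
    }
}
```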

9-6 Set concepts (5)
The union of sets s₁ and s₂ is a set containing just those values that are members of s₁ or s₂ or both. This is written s₁ ∪ s₂. E.g.:
  {DK, NO, SE} ∪ {FI, IS} = {DK, FI, IS, NO, SE}
  {DK, NO, SE} ∪ {IS, NO} = {DK, IS, NO, SE}

9-7 Set concepts (6)
The intersection of sets s₁ and s₂ is a set containing just those values that are members of both s₁ and s₂. This is written s₁ ∩ s₂. E.g.:
  NAFTA ∩ NATO = {CA, US}
  NAFTA ∩ EU = { }
Two sets are disjoint if they have no common member, i.e., if their intersection is empty. E.g.:
  NAFTA and EU are disjoint.
  NATO and EU are not disjoint.

9-8 Set concepts (7)
The difference of sets s₁ and s₂ is a set containing just those values that are members of s₁ but not of s₂. This is written s₁ – s₂. E.g.:
  NATO – EU = {CA, CZ, HU, IS, NO, PL, TR, US}
  EU – NATO = {AT, FI, IE, SE}
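Union, intersection, and difference can be sketched as non-destructive helpers built on the java.util.Set methods addAll, retainAll, and removeAll. The class and method names below are hypothetical, and TreeSet is used only so that the results print in a predictable (sorted) order.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

class SetAlgebraDemo {
    // Hypothetical helper: build a sorted set from the listed members.
    static Set<String> setOf(String... members) {
        return new TreeSet<>(Arrays.asList(members));
    }

    // s1 ∪ s2: copy s1, then add every member of s2.
    static <T> Set<T> union(Set<T> s1, Set<T> s2) {
        Set<T> result = new TreeSet<>(s1);
        result.addAll(s2);
        return result;
    }

    // s1 ∩ s2: copy s1, then keep only the members that are also in s2.
    static <T> Set<T> intersection(Set<T> s1, Set<T> s2) {
        Set<T> result = new TreeSet<>(s1);
        result.retainAll(s2);
        return result;
    }

    // s1 – s2: copy s1, then remove every member of s2.
    static <T> Set<T> difference(Set<T> s1, Set<T> s2) {
        Set<T> result = new TreeSet<>(s1);
        result.removeAll(s2);
        return result;
    }

    public static void main(String[] args) {
        Set<String> nafta = setOf("CA", "MX", "US");
        Set<String> nordic = setOf("DK", "NO", "SE");
        System.out.println(union(nordic, setOf("IS", "NO")));         // [DK, IS, NO, SE]
        System.out.println(intersection(nafta, setOf("CA", "US")));   // [CA, US]
        System.out.println(difference(nafta, setOf("CA")));           // [MX, US]
        // Two sets are disjoint if their intersection is empty.
        System.out.println(intersection(nafta, nordic).isEmpty());    // true
    }
}
```

Each helper copies its first argument, so the operands are left unchanged; the transformers in the contract that follows (addAll, retainAll, removeAll) instead update the set in place.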

9-9 Set applications
Spelling checker:
• A spelling checker’s dictionary is a set of words.
• The spelling checker highlights any words in the document that are not in the dictionary.
• The spelling checker might allow the user to add words to the dictionary.
Relational database system:
• A relation is essentially a set of tuples.
• Each tuple is distinct.
• The tuples are in no particular order.
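The spelling-checker application might look roughly like this. It is a minimal sketch under stated assumptions: words are runs of the ASCII letters a to z, case is ignored, and the names SpellCheckDemo and unknownWords are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

class SpellCheckDemo {
    // Return the words of text that are not members of dictionary,
    // in order of first occurrence.
    static Set<String> unknownWords(String text, Set<String> dictionary) {
        Set<String> unknown = new LinkedHashSet<>();
        // Assumption: anything other than a-z separates words.
        for (String word : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (!word.isEmpty() && !dictionary.contains(word)) {
                unknown.add(word);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        Set<String> dictionary =
                new HashSet<>(Arrays.asList("the", "cat", "sat", "on", "mat"));
        String document = "The cat szat on the matt.";
        System.out.println(unknownWords(document, dictionary));  // [szat, matt]
        dictionary.add("matt");  // the user adds a word to the dictionary
        System.out.println(unknownWords(document, dictionary));  // [szat]
    }
}
```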

9-10 Example 1: prime numbers
A prime number is an integer greater than 1 that is divisible only by itself and 1. E.g.: 2, 7, 11, 13 are prime numbers.
Eratosthenes’ sieve algorithm:
To compute the set of prime numbers less than m (where m > 0):
1. Set sieve = {2, 3, …, m–1}.
2. For i = 2, 3, …, while i² < m, repeat:
   2.1. If i is a member of sieve:
        Remove all multiples of i from sieve.
3. Terminate with answer sieve.
Step 1 in more detail:
1.1. Set sieve = { }.
1.2. For i = 2, …, m–1, repeat:
     Add i to sieve.
“Remove all multiples of i from sieve” in more detail:
For mult = 2i, 3i, …, while mult < m, repeat:
  Remove mult from sieve.
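The sieve algorithm translates almost step for step into Java. The sketch below uses java.util.TreeSet so that the answer prints in ascending order; that choice, and the names SieveDemo and primesBelow, are assumptions made for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class SieveDemo {
    // Return the set of prime numbers less than m.
    static Set<Integer> primesBelow(int m) {
        // Step 1: sieve = {2, 3, ..., m-1}.
        Set<Integer> sieve = new TreeSet<>();
        for (int i = 2; i < m; i++) {
            sieve.add(i);
        }
        // Step 2: for i = 2, 3, ..., while i*i < m.
        for (int i = 2; i * i < m; i++) {
            if (sieve.contains(i)) {
                // Remove the multiples 2i, 3i, ... that are less than m.
                for (int mult = 2 * i; mult < m; mult += i) {
                    sieve.remove(mult);
                }
            }
        }
        // Step 3: the remaining members are the primes.
        return sieve;
    }

    public static void main(String[] args) {
        System.out.println(primesBelow(20));  // [2, 3, 5, 7, 11, 13, 17, 19]
    }
}
```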

9-11 Set ADT: requirements
Requirements:
1) It must be possible to make a set empty.
2) It must be possible to test whether a set is empty.
3) It must be possible to obtain the cardinality of a set.
4) It must be possible to perform a membership test.
5) It must be possible to add or remove a member of a set.
6) It must be possible to test whether two sets are equal.
7) It must be possible to test whether one set subsumes another.
8) It must be possible to compute the union, intersection, or difference of two sets.
9) It must be possible to traverse a set.

9-12 Set ADT: contract (1)
Possible contract:

public interface Set {

    // Each Set object is a set whose members are objects.

    //////////// Accessors ////////////

    public boolean isEmpty ();
    // Return true if and only if this set is empty.

    public int size ();
    // Return the cardinality of this set.

    public boolean contains (Object obj);
    // Return true if and only if obj is a member of this set.

9-13 Set ADT: contract (2)
Possible contract (continued):

    public boolean equals (Set that);
    // Return true if and only if this set is equal to that.

    public boolean containsAll (Set that);
    // Return true if and only if this set subsumes that.

9-14 Set ADT: contract (3)
Possible contract (continued):

    //////////// Transformers ////////////

    public void clear ();
    // Make this set empty.

    public void add (Object obj);
    // Add obj as a member of this set.

    public void remove (Object obj);
    // Remove obj from this set.

    public void addAll (Set that);
    // Make this set the union of itself and that.

9-15 Set ADT: contract (4)
Possible contract (continued):

    public void removeAll (Set that);
    // Make this set the difference of itself and that.

    public void retainAll (Set that);
    // Make this set the intersection of itself and that.

    //////////// Iterator ////////////

    public Iterator iterator();
    // Return an iterator that will visit all members of this set, in no
    // particular order.

}

9-16 Implementation of sets using arrays (1)
Represent a bounded set (cardinality ≤ maxcard) by:
• a variable card, containing the current cardinality
• an array members of length maxcard, containing the set members in members[0…card–1].
Keep the array sorted, and avoid storing duplicates.
Invariant: members[0] is the least member, members[card–1] is the greatest member, and members[card…maxcard–1] are unoccupied.
Empty set: card = 0, and members[0…maxcard–1] are all unoccupied.
Illustration (maxcard = 6): card = 3, members[0…2] = CA, MX, US, and members[3…5] are unoccupied.

9-17 Implementation using arrays (2)
Summary of algorithms and time complexities:

Operation     Algorithm                        Time complexity
contains      binary search                    O(log n)
add           binary search + insertion        O(n)
remove        binary search + deletion         O(n)
equals        pairwise comparison              O(n₂)
containsAll   variant of pairwise comparison   O(n₂)
addAll        array merge                      O(n₁+n₂)
removeAll     variant of array merge           O(n₁+n₂)
retainAll     variant of array merge           O(n₁+n₂)

Here n is the cardinality of the set, and n₁ and n₂ are the cardinalities of the two sets involved (this set and that).
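The single-member operations in this table can be sketched as follows. This is one possible realization of the sorted-array representation, not the authors' code: the class name SortedArraySet, the private helper search, and the use of generics are assumptions made for illustration.

```java
class SortedArraySet<T extends Comparable<T>> {
    private final T[] members;  // members[0..card-1] are sorted, with no duplicates
    private int card;           // current cardinality

    @SuppressWarnings("unchecked")
    SortedArraySet(int maxcard) {
        members = (T[]) new Comparable[maxcard];
    }

    int size() {
        return card;
    }

    // Binary search, O(log n): return the index of obj if it is a member,
    // otherwise -(insertion point) - 1.
    private int search(T obj) {
        int lo = 0, hi = card - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = obj.compareTo(members[mid]);
            if (cmp == 0) return mid;
            if (cmp < 0) hi = mid - 1; else lo = mid + 1;
        }
        return -lo - 1;
    }

    boolean contains(T obj) {
        return search(obj) >= 0;
    }

    // Binary search + insertion, O(n) because later members shift right.
    void add(T obj) {
        int pos = search(obj);
        if (pos >= 0) return;  // already a member: do not store a duplicate
        if (card == members.length) throw new IllegalStateException("set is full");
        int ins = -pos - 1;
        System.arraycopy(members, ins, members, ins + 1, card - ins);
        members[ins] = obj;
        card++;
    }

    // Binary search + deletion, O(n) because later members shift left.
    void remove(T obj) {
        int pos = search(obj);
        if (pos < 0) return;  // not a member: nothing to do
        System.arraycopy(members, pos + 1, members, pos, card - pos - 1);
        members[--card] = null;  // this component is now unoccupied
    }
}
```

For example, after adding US, CA, MX, and CA again to a new SortedArraySet<String>(6), size() returns 3 and the array holds CA, MX, US in that order.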

9-18 Implementation of sets using SLLs (1)
Represent an (unbounded) set by:
• a variable card, containing the current cardinality
• an SLL, containing one member per node.
Keep the SLL sorted, and avoid storing duplicates.
Invariant: the nodes hold the members in order, from the least member (first node) to the greatest member (last node).
Empty set: the SLL has no nodes.
Illustration: the SLL CA → MX → US represents the set {CA, US, MX}.

9-19 Implementation using SLLs (2)
Summary of algorithms and time complexities:

Operation     Algorithm                        Time complexity
contains      SLL linear search                O(n)
add           SLL linear search + insertion    O(n)
remove        SLL linear search + deletion     O(n)
equals        pairwise comparison              O(n₂)
containsAll   variant of pairwise comparison   O(n₂)
addAll        SLL merge                        O(n₁+n₂)
removeAll     variant of SLL merge             O(n₁+n₂)
retainAll     variant of SLL merge             O(n₁+n₂)
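The SLL merge behind addAll is the least obvious entry in this table, so here is a minimal sketch of it. It assumes String members for brevity; the class name SortedListSet and its internals are hypothetical, not taken from the slides. Because both lists are sorted, a single pass over each suffices.

```java
class SortedListSet {
    private static final class Node {
        final String member;
        Node next;
        Node(String member, Node next) { this.member = member; this.next = next; }
    }

    private Node first;  // nodes are sorted by member, with no duplicates
    private int card;    // current cardinality

    int size() { return card; }

    // SLL linear search + insertion, O(n).
    void add(String obj) {
        Node prev = null, cur = first;
        while (cur != null && cur.member.compareTo(obj) < 0) {
            prev = cur;
            cur = cur.next;
        }
        if (cur != null && cur.member.equals(obj)) return;  // already a member
        Node node = new Node(obj, cur);
        if (prev == null) first = node; else prev.next = node;
        card++;
    }

    // SLL merge: make this set the union of itself and that.
    void addAll(SortedListSet that) {
        Node dummy = new Node(null, null);  // placeholder in front of the merged list
        Node last = dummy;
        Node a = this.first, b = that.first;
        int count = 0;
        while (a != null || b != null) {
            String next;
            if (b == null || (a != null && a.member.compareTo(b.member) < 0)) {
                next = a.member; a = a.next;            // member of this set only
            } else if (a == null || a.member.compareTo(b.member) > 0) {
                next = b.member; b = b.next;            // member of that set only
            } else {
                next = a.member; a = a.next; b = b.next;  // common member, keep one copy
            }
            last.next = new Node(next, null);
            last = last.next;
            count++;
        }
        first = dummy.next;
        card = count;
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder("{");
        for (Node cur = first; cur != null; cur = cur.next) {
            sb.append(cur.member);
            if (cur.next != null) sb.append(", ");
        }
        return sb.append("}").toString();
    }
}
```

Merging {DK, NO, SE} with {IS, NO} this way yields {DK, IS, NO, SE}. The removeAll and retainAll variants would follow the same two-pointer pattern, keeping a member only when it appears in just the first list or in both lists, respectively.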

9-20 Implementation of small-integer sets using boolean arrays (1)
If the members are known to be small integers, in the range 0…m–1, represent the set by:
• a boolean array b of length m, such that b[i] is true if and only if i is a member of the set.
Invariant: each of b[0…m–1] holds a boolean.
Empty set: b[0…m–1] are all false.
Illustration (m = 10): b[2], b[3], b[5], and b[7] are true and all other components are false; this represents the set {2, 3, 5, 7}.

9-21 Implementation using boolean arrays (2)
Summary of algorithms and time complexities:

Operation     Algorithm                          Time complexity
contains      test array component               O(1)
add           set array component to true        O(1)
remove        set array component to false       O(1)
equals        pairwise equality test             O(m)
containsAll   pairwise implication test          O(m)
addAll        pairwise disjunction               O(m)
removeAll     pairwise negation + conjunction    O(m)
retainAll     pairwise conjunction               O(m)
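Each row of this table is a one-line loop body over the boolean array. The sketch below assumes both sets share the same range 0…m–1; the class name SmallIntSet, the factory method of, and the helper checkSameRange are hypothetical names chosen for illustration.

```java
class SmallIntSet {
    private final boolean[] b;  // b[i] is true if and only if i is a member

    SmallIntSet(int m) {
        b = new boolean[m];     // all components start false: the empty set
    }

    // Hypothetical convenience factory: a set over 0..m-1 with the given members.
    static SmallIntSet of(int m, int... members) {
        SmallIntSet s = new SmallIntSet(m);
        for (int i : members) s.add(i);
        return s;
    }

    boolean contains(int i) { return b[i]; }   // test array component
    void add(int i)         { b[i] = true; }   // set array component to true
    void remove(int i)      { b[i] = false; }  // set array component to false

    int size() {
        int card = 0;
        for (boolean present : b) if (present) card++;
        return card;
    }

    // Pairwise implication test: that.b[i] implies this.b[i] for every i.
    boolean containsAll(SmallIntSet that) {
        checkSameRange(that);
        for (int i = 0; i < b.length; i++) if (that.b[i] && !b[i]) return false;
        return true;
    }

    // Pairwise disjunction.
    void addAll(SmallIntSet that) {
        checkSameRange(that);
        for (int i = 0; i < b.length; i++) b[i] = b[i] || that.b[i];
    }

    // Pairwise negation + conjunction.
    void removeAll(SmallIntSet that) {
        checkSameRange(that);
        for (int i = 0; i < b.length; i++) b[i] = b[i] && !that.b[i];
    }

    // Pairwise conjunction.
    void retainAll(SmallIntSet that) {
        checkSameRange(that);
        for (int i = 0; i < b.length; i++) b[i] = b[i] && that.b[i];
    }

    // Assumption of this sketch: both sets use the same m.
    private void checkSameRange(SmallIntSet that) {
        if (that.b.length != b.length)
            throw new IllegalArgumentException("sets must share the same range");
    }
}
```

With m = 10, primes = {2, 3, 5, 7} and evens = {0, 2, 4, 6, 8}, calling primes.retainAll(evens) leaves primes = {2}.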

9-22 Summary of set implementations (1)

Operation     Array representation   SLL representation   Boolean array representation
contains      O(log n)               O(n)                 O(1)
add           O(n)                   O(n)                 O(1)
remove        O(n)                   O(n)                 O(1)
equals        O(n₂)                  O(n₂)                O(m)
containsAll   O(n₂)                  O(n₂)                O(m)
addAll        O(n₁+n₂)               O(n₁+n₂)             O(m)
removeAll     O(n₁+n₂)               O(n₁+n₂)             O(m)
retainAll     O(n₁+n₂)               O(n₁+n₂)             O(m)

Here n is the cardinality of the set, n₁ and n₂ are the cardinalities of the two sets involved, and m is the size of the range of potential members.

9-23 Summary of set implementations (2)
The array representation is suitable only for small or static sets.
• A static set is one in which members are never or only infrequently added or removed.
The SLL representation is suitable only for small sets.
The boolean-array representation is suitable only for dense sets of small integers.
• A dense set is one where most potential members are actually present.
For general applications, we need a more efficient set representation: search tree (see 10) or hash table (see 12).

9-24 Sets in the Java class library The java.util.Set interface is similar to the Set interface above. The java.util.TreeSet class implements the java.util.Set interface, representing each set by a search tree (see 10). The java.util.HashSet class implements the java.util.Set interface, representing each set by an open-bucket hash table (see 12).
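Because both classes implement java.util.Set, client code can be written against the interface and the representation swapped freely. A minimal sketch (the class name LibrarySetsDemo and the helper fill are hypothetical):

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class LibrarySetsDemo {
    // Add the same members (including one duplicate) to whichever set is supplied.
    static Set<String> fill(Set<String> s) {
        s.add("US");
        s.add("CA");
        s.add("MX");
        s.add("CA");  // duplicate: the set is unchanged by this call
        return s;
    }

    public static void main(String[] args) {
        Set<String> tree = fill(new TreeSet<String>());
        Set<String> hash = fill(new HashSet<String>());
        System.out.println(tree);               // [CA, MX, US] (TreeSet iterates in sorted order)
        System.out.println(hash.size());        // 3
        System.out.println(tree.equals(hash));  // true: same members, different representation
    }
}
```

A HashSet makes no promise about iteration order, which is consistent with the contract's "in no particular order".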

9-25 Example 2: information retrieval (1) Consider a very simple information retrieval system. A query is a set of key words. Each document in the document base is viewed as a set of words. The order of words in a document is of no significance. In response to a query, the system identifies each document that contains all or some of the key words.

9-26 Example 2 (2)
Outline of implementation:

// Needs java.util.Set and java.io.IOException.

public static final int NONE=0, SOME=1, ALL=2;

public static int score (String name, Set keywords) throws IOException {
    // Return a score reflecting whether the document named name
    // contains all, some, or none of the words in keywords.
    Set docwords = readAllWords(name);
    if (docwords.containsAll(keywords))
        return ALL;
    else if (disjoint(docwords, keywords))
        return NONE;
    else
        return SOME;
}

9-27 Example 2 (3)
Outline of implementation (continued):

private static boolean disjoint (Set docwords, Set keywords) {
    // Return true if and only if the sets docwords and keywords
    // have no common words.
    Iterator iter = keywords.iterator();
    while (iter.hasNext()) {
        String keyword = (String) iter.next();
        if (docwords.contains(keyword))
            return false;
    }
    return true;
}

9-28 Example 2 (4)
Outline of implementation (continued):

// Needs the java.io classes used below, plus java.util.Set and java.util.TreeSet.

private static Set readAllWords (String name) throws IOException {
    // Return the set of all words occurring in the document name.
    BufferedReader doc = new BufferedReader(
            new InputStreamReader(
                    new FileInputStream(name)));
    Set words = new TreeSet();  // or: new HashSet()
    for (;;) {
        String word = readWord(doc);
        if (word == null) break;  // end of document
        words.add(word.toLowerCase());
    }
    doc.close();
    return words;
}

9-29 Example 2 (5)
Outline of implementation (continued):

private static String readWord (BufferedReader doc) throws IOException {
    // Read and return the next word from doc, skipping any preceding
    // white space or punctuation. Return null if no word remains to be
    // read.
    …
}