"He's off the map!" - Eternal Sunshine of the Spotless Mind


Map ADT

Data Structures
Consider a program that counts the frequency of every word in a file. Assume a word is anything set off by whitespace.
CS 321 - Maps
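The slides do not show the word-counting program itself. A minimal sketch of how a list-based version might look is given here; the class and method names (ListWordCount, WordCount, countWords) are hypothetical, and the linear search inside the loop is an assumption about how the ArrayList version was written.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical sketch: counts word frequencies with an ArrayList of pairs.
class ListWordCount {
    // One (word, count) pair stored in the list.
    static final class WordCount {
        final String word;
        int count;

        WordCount(String word) {
            this.word = word;
            this.count = 1;
        }
    }

    // Counts whitespace-delimited words; every word triggers a linear search.
    static List<WordCount> countWords(String text) {
        List<WordCount> counts = new ArrayList<>();
        Scanner in = new Scanner(text); // Scanner splits on whitespace by default
        while (in.hasNext()) {
            String word = in.next();
            WordCount found = null;
            for (WordCount wc : counts) {
                if (wc.word.equals(word)) {
                    found = wc;
                    break;
                }
            }
            if (found == null) {
                counts.add(new WordCount(word)); // first time this word is seen
            } else {
                found.count++;
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        for (WordCount wc : countWords("the cat saw the other cat")) {
            System.out.println(wc.word + " " + wc.count);
        }
    }
}
```

For the sample sentence this lists the, cat, saw, and other in first-seen order with counts 2, 2, 1, and 1.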

Performance using ArrayList

Title                          Size (kb)  Total Words  Distinct Words  Time (sec)
small sample                   0.6        89           25              0.001
2BR02B                         34         5,638        1,975           0.051
Alice in Wonderland            120        29,460       6,017           0.741
Adventures of Sherlock Holmes  581        107,533      15,213          4.144
2008 CIA Factbook              10,030     1,330,100    74,042          173.000

Order?
Express the change in each column as a factor of the previous file.

Title                          Size   Total Words  Distinct Words  Time
small sample                   0.6    89           25              0.001
2BR02B                         57x    63x          79x             51x
Alice in Wonderland            3.5x   5.2x         3x              14.5x
Adventures of Sherlock Holmes  4.8x   3.7x         2.5x            5.6x
2008 CIA Factbook              17.3x  12.4x        4.9x            41.7x

O(Total Words * Distinct Words)?

Question
Given that it takes 2.88 minutes for the 2008 CIA Factbook with 1,330,100 total words and 74,042 distinct words, how long will it take for another text with 1,000x the total words and 100x the distinct words?
Choices: an hour, a day, a week, a month, half a year
Assuming O(Total Words * Distinct Words) growth:
2.88 / (1,330,100 * 74,042) = ? / (1,330,100,000 * 7,404,200)
? = 2.88 * 1,000 * 100 = 288,000 minutes = 200 days, so roughly half a year.
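The scaling estimate can be checked with a few lines of code. This is only a sketch: the class and method names are hypothetical, and it assumes that running time is exactly proportional to Total Words * Distinct Words, which the measurements suggest but do not prove.

```java
// Hypothetical sketch: extrapolates a measured time under an assumed growth model.
class ScalingEstimate {
    // Assumes time is proportional to totalWords * distinctWords.
    static double estimateMinutes(double baseMinutes, double totalFactor, double distinctFactor) {
        return baseMinutes * totalFactor * distinctFactor;
    }

    public static void main(String[] args) {
        double minutes = estimateMinutes(2.88, 1_000, 100);
        double days = minutes / (60 * 24); // 1,440 minutes per day
        System.out.printf("%.0f minutes = %.0f days%n", minutes, days);
    }
}
```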



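Why So Slow?
Write a contains method for an array-based list.

public boolean contains(E target) {
    boolean result = false;
    int i = 0;
    // Linear search: in the worst case every element is examined.
    while (!result && i < size()) {
        // Compare with equals, not ==, so equal but distinct objects match.
        if (list[i].equals(target)) {
            result = true;
        }
        i++;
    }
    return result;
}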

A Faster Way – Maps
A data structure optimized for a very specific kind of search / access.
Also known as: dictionary, search table, or associative array.
Example entry: A -> 65

Maps
A map models a searchable collection of key-value entries.
The main operations of a map are searching, inserting, and deleting items.
Multiple entries with the same key are not allowed.
In a map we access by asking "give me the value associated with this key."
Applications: address book, student-record database

Keys and Values
Dictionary analogy:
The key in a dictionary is a word: foo
The value in a dictionary is the definition: First on the standard list of metasyntactic variables used in syntax examples.
A key and its associated value form a pair that is stored in a map.
To retrieve a value, you need the key for that value.
An indexed List can be viewed as a Map with integer keys.

More on Keys and Values
Keys must be unique: a given key can map to only one value.
But one value may be associated with multiple keys, like synonyms in the dictionary.
Example: factor: n. See coefficient of X.
Here factor is a key associated with the same value (definition) as the key coefficient of X.
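The key and value rules can be illustrated with java.util.HashMap. This is a small sketch, not part of the original slides; the class name and the placeholder definition string are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: unique keys, shared values.
class KeysAndValues {
    static Map<String, String> buildDictionary() {
        Map<String, String> dict = new HashMap<>();
        String definition = "(definition text)"; // placeholder, not a real dictionary entry
        dict.put("coefficient of X", definition);
        dict.put("factor", definition); // a second key for the same value
        dict.put("factor", definition); // repeating a key replaces its entry, it never duplicates
        return dict;
    }

    public static void main(String[] args) {
        Map<String, String> dict = buildDictionary();
        System.out.println(dict.size()); // two keys, even after three puts
        System.out.println(dict.get("factor").equals(dict.get("coefficient of X")));
    }
}
```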

The Map ADT
get(k): if the map M has an entry with key k, return its associated value; else, return null.
put(k, v): insert entry (k, v) into the map M; if key k is not already in M, then return null; else, replace the value and return the old value associated with k.
remove(k): if the map M has an entry with key k, remove it from M and return its associated value; else, return null.
size(), isEmpty()
entrySet(): return an iterable collection of the entries in M.
keySet(): return an iterable collection of the keys in M.
values(): return an iterable collection of the values in M.

Example

Operation   Output  Map
isEmpty()   true    Ø
put(5,A)    null    (5,A)
put(7,B)    null    (5,A),(7,B)
put(2,C)    null    (5,A),(7,B),(2,C)
put(8,D)    null    (5,A),(7,B),(2,C),(8,D)
put(2,E)    C       (5,A),(7,B),(2,E),(8,D)
get(7)      B       (5,A),(7,B),(2,E),(8,D)
get(4)      null    (5,A),(7,B),(2,E),(8,D)
get(2)      E       (5,A),(7,B),(2,E),(8,D)
size()      4       (5,A),(7,B),(2,E),(8,D)
remove(5)   A       (7,B),(2,E),(8,D)
remove(2)   E       (7,B),(8,D)
get(2)      null    (7,B),(8,D)
isEmpty()   false   (7,B),(8,D)
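The same sequence of operations can be replayed against java.util.HashMap, whose put, get, and remove return values follow the same convention as the ADT. This is a sketch; the class name MapTrace is hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: replays the example operations and collects their outputs.
class MapTrace {
    static List<Object> trace() {
        Map<Integer, String> m = new HashMap<>();
        // Java evaluates arguments left to right, so the operations run in order.
        return Arrays.asList(
            m.isEmpty(),   // true
            m.put(5, "A"), // null
            m.put(7, "B"), // null
            m.put(2, "C"), // null
            m.put(8, "D"), // null
            m.put(2, "E"), // C (old value replaced)
            m.get(7),      // B
            m.get(4),      // null (no such key)
            m.get(2),      // E
            m.size(),      // 4
            m.remove(5),   // A
            m.remove(2),   // E
            m.get(2),      // null (key was removed)
            m.isEmpty());  // false
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```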

Implementing a Map
One easy implementation uses an Unsorted List ADT backed by a doubly linked list.
Other common implementations of maps use a binary search tree or a hash table as the internal storage container.
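In Java, the standard library offers both kinds of container behind the same Map interface: java.util.HashMap is hash-table based and java.util.TreeMap is based on a red-black tree, a balanced binary search tree. The sketch below (class and method names are hypothetical) fills one of each with the same entries.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: two internal containers, one interface.
class MapImplementations {
    // Works with any Map implementation.
    static void fill(Map<Integer, String> m) {
        m.put(5, "A");
        m.put(7, "B");
        m.put(2, "C");
        m.put(8, "D");
    }

    public static void main(String[] args) {
        Map<Integer, String> hash = new HashMap<>();
        Map<Integer, String> tree = new TreeMap<>();
        fill(hash);
        fill(tree);
        System.out.println(tree.keySet());     // TreeMap iterates keys in sorted order
        System.out.println(hash.equals(tree)); // same entries, so the maps are equal
    }
}
```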

A Simple List-Based Map
We can implement a Map ADT using an Unsorted List.
Store the items of the map in a list S in arbitrary order.
[Figure: linked list with head and tail; each node/position holds one entry]

The get(k) Algorithm

Algorithm get(k):
    itr <- S.iterator
    while itr.hasNext do
        entry <- itr.next              // next entry in list
        if entry.getKey = k then
            return entry.getValue
    return null                        // no entry with key k

The put(k,v) Algorithm

Algorithm put(k,v):
    itr <- S.iterator
    while itr.hasNext do
        entry <- itr.next              // next entry in list
        if entry.getKey = k then
            value <- entry.getValue
            entry.setValue(v)
            entry.setKey(k)
            return value               // return the old value
    S.addLast((k,v))
    count <- count + 1                 // increment number of entries
    return null                        // there was no entry with key k

The remove(k) Algorithm

Algorithm remove(k):
    itr <- S.iterator
    while itr.hasNext do
        entry <- itr.next
        if entry.getKey = k then
            value <- entry.getValue
            S.remove(entry)
            count <- count - 1         // decrement number of entries
            return value               // return the removed value
    return null                        // no entry with key k
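The three algorithms translate fairly directly into Java. The sketch below is one possible rendering, not the course's reference implementation: the class name ListMap and its nested Entry class are hypothetical, java.util.LinkedList (a doubly linked list) stands in for the Unsorted List ADT, keys are assumed to be non-null, and size is delegated to the list instead of a separate count field.

```java
import java.util.Iterator;
import java.util.LinkedList;

// Hypothetical sketch of an unsorted list-based map; assumes non-null keys.
class ListMap<K, V> {
    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    // Entries are kept in arbitrary (insertion) order.
    private final LinkedList<Entry<K, V>> entries = new LinkedList<>();

    public V get(K key) {
        for (Entry<K, V> e : entries) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null; // no entry with this key
    }

    public V put(K key, V value) {
        for (Entry<K, V> e : entries) {
            if (e.key.equals(key)) {
                V old = e.value;
                e.value = value;
                return old; // return the old value
            }
        }
        entries.addLast(new Entry<>(key, value));
        return null; // there was no entry with this key
    }

    public V remove(K key) {
        Iterator<Entry<K, V>> itr = entries.iterator();
        while (itr.hasNext()) {
            Entry<K, V> e = itr.next();
            if (e.key.equals(key)) {
                itr.remove(); // unlink the entry just returned by next()
                return e.value;
            }
        }
        return null; // no entry with this key
    }

    public int size() {
        return entries.size();
    }

    public boolean isEmpty() {
        return entries.isEmpty();
    }
}
```

Every method walks the list from the front, which is where the linear cost per operation comes from.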

Performance of a List-Based Map
Every operation takes O(n) time in the worst case, and the same in the average case. Why? Each operation may have to scan the list to find the key.
The Unsorted List implementation is effective only for maps of small size.
More often, maps use binary search trees or hash tables.

Results with HashMap

Title                          Size (kb)  Total Words  Distinct Words  Time List (sec)  Time Map (sec)
small sample                   0.6        89           25              0.001            0.0008
2BR02B                         34         5,638        1,975           0.051            0.0140
Alice in Wonderland            120        29,460       6,017           0.741            0.0720
Adventures of Sherlock Holmes  581        107,533      15,213          4.144            0.2500
2008 CIA Factbook              10,030     1,330,100    74,042          173.000          4.0000
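A map-based word counter needs only get and put. The sketch below shows one way it might be written with java.util.HashMap; the class and method names are hypothetical, and the slides do not show the program that produced the timings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical sketch: counts word frequencies with a HashMap.
class MapWordCount {
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        Scanner in = new Scanner(text); // Scanner splits on whitespace by default
        while (in.hasNext()) {
            String word = in.next();
            Integer old = counts.get(word); // null if the word is new
            counts.put(word, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("the cat saw the other cat");
        System.out.println(counts.get("the"));
        System.out.println(counts.size());
    }
}
```

Each word costs one lookup and one update instead of a scan over all distinct words seen so far.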

Order?

Title                          Size   Total Words  Distinct Words  Time List  Time Map
small sample                   0.6    89           25              0.001      0.0008
2BR02B                         57x    63x          79x             51x        18x
Alice in Wonderland            3.5x   5.2x         3.0x            14.5x      5x
Adventures of Sherlock Holmes  4.8x   3.7x         2.5x            5.6x       3.5x
2008 CIA Factbook              17x    12.3x        4.9x            42x        16x

O(Total Words)?

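Question
Given that it takes 4 seconds for the 2008 CIA Factbook with 1,330,100 total words and 74,042 distinct words, how long will it take for another text with 1,000x the total words and 100x the distinct words?
Choices: an hour, a day, ½ a week, a week, a month
Assuming the same O(Total Words * Distinct Words) growth as for the list:
4 / (1,330,100 * 74,042) = ? / (1,330,100,000 * 7,404,200)
? = 4 * 1,000 * 100 = 400,000 seconds = 4.6 days
If the map time instead grows as O(Total Words), the estimate is 4 * 1,000 = 4,000 seconds, about 1.1 hours.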


The Map<K, V> Interface in Java
void clear(): Removes all mappings from this map (optional operation).
boolean containsKey(Object key): Returns true if this map contains a mapping for the specified key.
boolean containsValue(Object value): Returns true if this map maps one or more keys to the specified value.
Set<K> keySet(): Returns a Set view of the keys contained in this map.

The Map Interface Continued
V get(Object key): Returns the value to which this map maps the specified key.
boolean isEmpty(): Returns true if this map contains no key-value mappings.
V put(K key, V value): Associates the specified value with the specified key in this map.

The Map Interface Continued
V remove(Object key): Removes the mapping for this key from this map if it is present.
int size(): Returns the number of key-value mappings in this map.
Collection<V> values(): Returns a collection view of the values contained in this map.
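The methods listed above can be exercised in a few lines. This is a sketch using java.util.HashMap as the implementation; the class name and the sample entries (A -> 65, B -> 66) are chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one call to each Map method listed above.
class MapInterfaceDemo {
    static Map<String, Integer> sampleCodes() {
        Map<String, Integer> codes = new HashMap<>();
        codes.put("A", 65);
        codes.put("B", 66);
        return codes;
    }

    public static void main(String[] args) {
        Map<String, Integer> codes = sampleCodes();
        System.out.println(codes.containsKey("A"));      // true
        System.out.println(codes.containsValue(67));     // false
        System.out.println(codes.get("B"));              // 66
        System.out.println(codes.keySet().size());       // 2
        System.out.println(codes.values().contains(65)); // true
        System.out.println(codes.remove("A"));           // 65
        System.out.println(codes.size());                // 1
        codes.clear();
        System.out.println(codes.isEmpty());             // true
    }
}
```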