Published by Tyler Poole. Modified over 8 years ago.
1
CIS 068 Welcome to CIS 068! Lesson 10: Data Structures
2
CIS 068 Overview Description, usage and Java implementation of: collections, lists, sets, hashing
3
CIS 068 Definition Data Structures Definition (www.nist.gov): “An organization of information, usually in memory, for better algorithm efficiency, such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person.”
4
CIS 068 Efficiency “An organization of information …for better algorithm efficiency...”: Isn’t the efficiency of an algorithm defined by the order of magnitude O( )?
5
CIS 068 Efficiency Yes, but the efficiency achieved in practice depends on the implementation.
6
CIS 068 Introduction Data structures define the structure of a collection of data, i.e. primitive values or objects. The structure provides different ways to access the data. Different tasks need different ways to access the data, and therefore different data structures.
7
CIS 068 Introduction Typical properties of different structures: fixed length / variable length; access by index / access by iteration; duplicate elements allowed / not allowed
8
CIS 068 Examples Tasks: read 300 integers; read an unknown number of integers; read the 5th element of a sorted collection; read the next element of a sorted collection; merge an element at the 5th position into a collection; check if an object is in a collection
9
CIS 068 Examples Although you can invent any data structure you want, there are 'classic structures', providing: coverage of most (classic) problems; analysis of efficiency; basic implementations in modern languages, like Java
10
CIS 068 Data Structures in Java Let's see what Java has to offer:
11
CIS 068 The Collection Hierarchy Collection: top interface, specifying requirements for all collections
12
CIS 068 Collection Interface
13
CIS 068 Collection Interface !
14
CIS 068 Iterator Interface Purpose: sequential access to collection elements. Note: the technique used so far, accessing elements sequentially by index, is not reasonable in general (why?). Methods: hasNext(), next(), remove()
15
CIS 068 Iterator Interface The iterator points 'between' the elements of the collection. Elements: 1 2 3 4 5. First position: hasNext() = true, remove() throws an error. Current position (after 2 calls to next()): remove() deletes element 2. Figure labels: position after next(); returned element. Last position: hasNext() = false.
16
CIS 068 Iterator Interface Usage Typical usage of iterator:
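The usage pattern can be sketched as follows. This is a minimal illustration rather than the code from the original slide; the class name IteratorUsage and its two helper methods are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not part of the lecture material.
class IteratorUsage {
    // Sums all elements using only the iterator, never an index.
    static int sum(Collection<Integer> c) {
        int total = 0;
        Iterator<Integer> it = c.iterator();
        while (it.hasNext()) {      // is there an element behind the current position?
            total += it.next();     // step over it and return it
        }
        return total;
    }

    // Removes every even element through the iterator.
    static void removeEven(Collection<Integer> c) {
        Iterator<Integer> it = c.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) {
                it.remove();        // deletes the element just returned by next()
            }
        }
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(sum(numbers));
        removeEven(numbers);
        System.out.println(numbers);
    }
}
```

Both helpers accept any Collection, so the same loop works for lists and sets alike, which is the reason for preferring the iterator over indexing.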
17
CIS 068 Back to Collections AbstractCollection
18
CIS 068 AbstractCollection Facilitates the implementation of the Collection interface by providing a skeletal implementation. Implementation of a concrete class: provide the data structure (e.g. an array); provide access to the data structure
19
CIS 068 AbstractCollection The concrete class must provide an implementation of Iterator. To maintain the 'abstract character' of the data, the implemented (non-abstract) methods of the abstract class use Iterator methods to access the data. Diagram, AbstractCollection: add() { Iterator i = iterator(); … } clear() { Iterator i = iterator(); … } Diagram, myCollection (implements Iterator): int[] data; Iterator iterator() { return this; } hasNext() { … } …
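A minimal sketch of such a concrete class, assuming a read-only collection backed by an int array. The class name IntArrayCollection is hypothetical. Unlike the diagram, the iterator here is a separate anonymous object instead of the collection itself, so that several traversals can run independently. In java.util.AbstractCollection, inherited methods such as contains() and toString() are written in terms of iterator() and size(), while add() throws UnsupportedOperationException unless it is overridden.

```java
import java.util.AbstractCollection;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical example: a fixed, read-only collection backed by an int array.
class IntArrayCollection extends AbstractCollection<Integer> {
    private final int[] data;   // the underlying data structure

    IntArrayCollection(int... values) {
        this.data = values.clone();
    }

    @Override
    public int size() {
        return data.length;
    }

    // The only access path to the data that the inherited methods need.
    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int pos = 0;   // index of the element returned by the next call

            @Override
            public boolean hasNext() {
                return pos < data.length;
            }

            @Override
            public Integer next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return data[pos++];
            }
        };
    }
}
```

With only these two methods implemented, calls such as contains(8) or toString() already work, because the skeletal implementation walks the iterator.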
20
CIS 068 Back to Collections List Interface
21
CIS 068 List Interface Extends the Collection interface. Adds methods to insert and retrieve objects by their position (index). Note: the Collection interface could NOT specify the position. A new Iterator, the ListIterator, is introduced. ListIterator extends Iterator, allowing for bidirectional traversal (previous(), previousIndex(), ...)
22
CIS 068 List Interface Incorporates index ! A new Iterator Type (can move forward and backward)
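As an illustration of the backward direction, the sketch below walks a ListIterator from the end of a list to its start. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ListIterator;

// Hypothetical demo class.
class ListIteratorDemo {
    // Returns the elements in reverse order by moving a ListIterator backward.
    static <T> List<T> reversed(List<T> list) {
        List<T> result = new ArrayList<>();
        // listIterator(index) positions the iterator in front of that index;
        // list.size() therefore means "behind the last element".
        ListIterator<T> it = list.listIterator(list.size());
        while (it.hasPrevious()) {
            result.add(it.previous());
        }
        return result;
    }
}
```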
23
CIS 068 Example: Selection-Sorting a List Part 1: call to selection sort. The actual implementation of List does not matter! Figure labels: call to SelectionSort; use only the Iterator properties of ListIterator (upcasting).
24
CIS 068 Example: Selection-Sorting a List Part 2: selection sort. Figure labels: access at index 'fill'; inner loop; swap.
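The slide's code is not preserved in this transcript. A possible reconstruction, assuming the list is accessed through get() and set() with an outer index called fill, could look like this; the class name and the generic signature are assumptions.

```java
import java.util.List;

// Hypothetical reconstruction, not the original slide code.
class SelectionSortDemo {
    // Sorts any List in place; the concrete List implementation does not matter.
    static <T extends Comparable<? super T>> void selectionSort(List<T> list) {
        for (int fill = 0; fill < list.size() - 1; fill++) {
            // Inner loop: find the smallest element in list[fill..end].
            int minIndex = fill;
            for (int i = fill + 1; i < list.size(); i++) {
                if (list.get(i).compareTo(list.get(minIndex)) < 0) {
                    minIndex = i;
                }
            }
            // Swap it into position 'fill'.
            T tmp = list.get(fill);
            list.set(fill, list.get(minIndex));
            list.set(minIndex, tmp);
        }
    }
}
```

The method runs unchanged on an ArrayList or a LinkedList, although get(i) on a linked list costs O(n) per call, so the choice of implementation still affects the running time.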
25
CIS 068 Back to Collections AbstractList: ... again the implementation of some methods ... Note: still ABSTRACT!
26
CIS 068 Concrete Lists ArrayList and Vector: at last concrete implementations !
27
CIS 068 ArrayList and Vector Vector: for compatibility reasons (only); use ArrayList instead. ArrayList: the underlying data structure is an array. The List properties add advantages over a plain array: the size can grow and shrink; elements can be inserted and removed in the middle.
28
CIS 068 An Alternative Implementation (1)
29
CIS 068 An Alternative Implementation (2)
30
CIS 068 An Alternative Implementation (3)
31
CIS 068 Collections The underlying array data structure has advantages for index-based access, and disadvantages for insertion / removal of middle elements: the following elements must be copied, so insertion / removal is O(n). Alternative: linked lists
32
CIS 068 Linked List Flexible structure, providing: insertion and removal at any place in O(1) once the node at that place has been reached, compared to O(n) for an array-based list; sequential access; random access in O(n), compared to O(1) for an array-based list
33
CIS 068 Linked List A list of dynamically allocated nodes, arranged into a linked structure. The data structure 'node' must provide: the data itself (example: the bead body); a possible link to another node (example: the link). Children’s pop-beads serve as an example for a linked list.
34
CIS 068 Linked List Figure labels: old node; new node; next; (null)
35
CIS 068 Connecting Nodes Steps: creating the nodes; connecting them
36
CIS 068 Inserting Nodes Insert node r between p and q: p.link = r; r.link = q; q can then be accessed by p.link.link
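The two assignments can be wrapped into a small helper. This is a sketch: the Node class with an int payload and the method name insertAfter are assumptions, while the field name link follows the slide.

```java
// Hypothetical node type; only the field name 'link' is taken from the slide.
class Node {
    int data;
    Node link;   // next node, or null at the end of the list

    Node(int data) {
        this.data = data;
    }
}

class InsertDemo {
    // Inserts r directly behind p, in O(1).
    static void insertAfter(Node p, Node r) {
        Node q = p.link;   // remember the old successor of p
        p.link = r;
        r.link = q;        // q is now reachable as p.link.link
    }
}
```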
37
CIS 068 Removing Nodes Figure labels: p; q
38
CIS 068 Traversing a List (null)
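Since the code for removal and traversal is not preserved in this transcript, the following is only a sketch under assumptions: a node type ListNode with a String payload, removal of the node behind a given node p, and traversal that follows the links until (null) is reached.

```java
// Hypothetical node type for this sketch.
class ListNode {
    String data;
    ListNode link;   // next node, or null at the end

    ListNode(String data, ListNode link) {
        this.data = data;
        this.link = link;
    }
}

class TraverseDemo {
    // Counts the nodes by following the links until null is reached.
    static int length(ListNode head) {
        int count = 0;
        for (ListNode n = head; n != null; n = n.link) {
            count++;
        }
        return count;
    }

    // Joins the data of all nodes, separated by single spaces.
    static String join(ListNode head) {
        StringBuilder sb = new StringBuilder();
        for (ListNode n = head; n != null; n = n.link) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(n.data);
        }
        return sb.toString();
    }

    // Removes the node q that follows p (if any) by bypassing it: p.link = q.link.
    static void removeAfter(ListNode p) {
        ListNode q = p.link;
        if (q != null) {
            p.link = q.link;
        }
    }
}
```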
39
CIS 068 Double Linked Lists Single linked list versus double linked list. In the double linked list each node holds: data, successor, predecessor. The outermost links at both ends are (null).
40
CIS 068 Back to Collections AbstractSequentialList and LinkedList
41
CIS 068 LinkedList An implementation example: See textbook
42
CIS 068 Sets Example task: check whether a collection contains object o. Solution using a List: an O(n) operation!
43
CIS 068 Sets Comparison to List: a Set is designed to overcome the limitation of O(n). It contains unique elements. contains() / remove() operate in O(1) (hash-based sets) or O(log n) (tree-based sets). No get() method, no index access ... but an iterator can (still) be used to traverse the set.
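A short sketch of these properties using the two standard implementations, java.util.HashSet (expected constant-time contains()) and java.util.TreeSet (logarithmic contains(), sorted iteration). The class and method names are made up for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.TreeSet;

// Hypothetical demo class.
class SetDemo {
    // Duplicates are dropped when the elements are copied into a set.
    static <T> int countDistinct(Collection<T> items) {
        return new HashSet<>(items).size();
    }

    // A set has no get(index); its elements are reached through iteration.
    // A TreeSet iterates in sorted order.
    static List<String> sortedDistinct(Collection<String> items) {
        List<String> result = new ArrayList<>();
        for (String s : new TreeSet<>(items)) {
            result.add(s);
        }
        return result;
    }
}
```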
44
CIS 068 Back to Collections Interface Set
45
CIS 068 Hashing How can the method 'contains()' be implemented as an O(1) operation? http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/hash_tables.html
46
CIS 068 Hashing How can the method 'contains()' be implemented as an O(1) operation? Idea: retrieving an object from an array can be done in O(1) if the index is known. Determine the index to store and retrieve an object from the object itself!
47
CIS 068 Hashing Determine the index... by the object itself. Example: store the Strings "Apu", "Bob", "Daria" as a Set. Define a function H: String -> integer: take the first character, A=1, B=2, ... Store the names in a String array at position H(name).
48
CIS 068 Hashing Apu: first character A, H(A) = 1. Bob: first character B, H(B) = 2. Daria: first character D, H(D) = 4. ... Array: [1] Apu, [2] Bob, [3] (unused), [4] Daria, [5] (unused), …
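The example function H can be sketched as below. The sketch assumes every name starts with an ASCII letter; the class name, the method names and the table size of 27 (index 0 stays unused so that A maps to 1) are choices made for the illustration.

```java
// Hypothetical demo class for the first-character hash.
class FirstCharHash {
    // H: String -> int with A=1, B=2, ... (assumes an ASCII letter in front).
    static int h(String name) {
        return Character.toUpperCase(name.charAt(0)) - 'A' + 1;
    }

    // Stores each name at position H(name); a later name with the same
    // first letter would overwrite an earlier one (collisions are ignored here).
    static String[] store(String... names) {
        String[] table = new String[27];   // indices 1..26 are used
        for (String name : names) {
            table[h(name)] = name;
        }
        return table;
    }
}
```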
49
CIS 068 Hashing The function H(o) is called the hashcode of the object o. Properties of a hashcode function: if a.equals(b) then H(a) = H(b), BUT NOT NECESSARILY VICE VERSA: H(a) = H(b) does NOT guarantee a.equals(b)! If H() has 'sufficient variation', then it is most likely that different objects have different hashcodes.
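Both directions can be checked with Java's built-in String.hashCode(). The strings "Aa" and "BB" are a well-known pair that are not equal but share a hashcode; the helper class is hypothetical.

```java
// Hypothetical demo class.
class HashContractDemo {
    static boolean sameHash(Object a, Object b) {
        return a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        String a = new String("Apu");
        String b = new String("Apu");
        // Equal objects must have equal hashcodes.
        System.out.println(a.equals(b) && sameHash(a, b));
        // Equal hashcodes do not imply equal objects.
        System.out.println(sameHash("Aa", "BB") && !"Aa".equals("BB"));
    }
}
```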
50
CIS 068 Hashing Additionally, an array is needed that has sufficient space to contain at least all elements. The hashcode must not address an index outside the array; this can easily be achieved by: H1(o) = H(o) % n, where % = modulo function and n = array length. The larger the array, the more H1() varies! Array: Apu, Bob, (unused), Daria, (unused), …
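One practical detail when writing H1 in Java: hashCode() may return a negative number, and the % operator keeps the sign of its left operand, so H(o) % n can be negative. The sketch below therefore uses Math.floorMod (available since Java 8); the class and method names are assumptions.

```java
// Hypothetical helper for mapping a hashcode to an array index.
class IndexDemo {
    // H1(o) = H(o) mod n, always in the range 0..n-1.
    static int index(Object o, int n) {
        return Math.floorMod(o.hashCode(), n);
    }
}
```

For example, the Integer -7 has hashcode -7; with n = 5, -7 % 5 evaluates to -2, whereas Math.floorMod(-7, 5) gives the valid index 3.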
51
CIS 068 Hashing Back to the example: insert 'Abe'. First character: A, H(A) = 1. H(Apu) = H(Abe); this is called a collision. Array: Apu, Bob, (unused), Daria, (unused), …
52
CIS 068 Solving Collisions Method 1: don't use an array of objects, but an array of linked lists! The array contains (the start of) linked lists. Figure: array with Apu, Bob, (unused), Daria, (unused); Abe is linked behind Apu.
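A minimal sketch of Method 1, assuming the first-character hash from the example and java.util.LinkedList for the chains. The class ChainedSet and its methods are hypothetical, and names are assumed to start with an ASCII letter.

```java
import java.util.LinkedList;

// Hypothetical set of Strings using an array of linked lists (chaining).
class ChainedSet {
    private final LinkedList<String>[] buckets;

    @SuppressWarnings("unchecked")
    ChainedSet(int n) {
        buckets = new LinkedList[n];   // generic array creation needs the raw type
        for (int i = 0; i < n; i++) {
            buckets[i] = new LinkedList<>();
        }
    }

    // First-character hash (A=1, B=2, ...) reduced to the array length.
    private int index(String s) {
        return (Character.toUpperCase(s.charAt(0)) - 'A' + 1) % buckets.length;
    }

    // Returns false if s was already present.
    boolean add(String s) {
        LinkedList<String> bucket = buckets[index(s)];
        if (bucket.contains(s)) {
            return false;
        }
        bucket.add(s);   // colliding elements simply share the list
        return true;
    }

    boolean contains(String s) {
        return buckets[index(s)].contains(s);
    }

    // Number of elements stored in the same bucket as s.
    int bucketSize(String s) {
        return buckets[index(s)].size();
    }
}
```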
53
CIS 068 Solving Collisions Drawback: objects must be 'wrapped' in a node structure to provide the links, which introduces considerable overhead. Figure: 'Apu' is wrapped in a node holding 'Apu' and a link.
54
CIS 068 Solving Collisions Method 2: iteratively apply different hashcodes H0, H1, H2, ... to object o until the collision is resolved. As long as the different hashcodes are used in the same order, the search is guaranteed to be consistent. Figure: array with Apu, Bob, (unused), Daria, (unused); probe positions H0, H1, H2 for 'Apu'.
55
CIS 068 Solving Collisions The easiest hashcode series H_inc: H_0 = H, H_i = H_{i-1} + 1 (that is, H_i = H + i). http://ciips.ee.uwa.edu.au/~morris/Year2/PLDS210/hash_tables.html Figure: array with Apu, Bob, (unused), Daria, (unused); probe positions H0, H1, H2 for 'Apu'.
56
CIS 068 add Example implementation of 'add(Object o)' using H_inc (assume array A has length n, H as given above):
    index = H(o) % n;
    while (A[index] != null) {
        if (o.equals(A[index])) break;   // o is already in the set
        index = (index + 1) % n;
    }
    A[index] = o;   // add element at position A[index]
57
CIS 068 contains Example implementation of 'contains(Object o)' using H_inc (assume array A has length n, H as given above):
    index = H(o) % n;
    found = false;
    while (A[index] != null) {
        if (o.equals(A[index])) { found = true; break; }
        index = (index + 1) % n;
    }
    // 'found' is true if the set contains object o
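The two pieces of pseudocode for add and contains can be combined into one runnable class. This is a sketch under assumptions: the class name ProbingSet is invented, Object.hashCode() plays the role of H, Math.floorMod keeps the index non-negative, and one slot is always left free so that the probing loop is guaranteed to stop.

```java
// Hypothetical set using linear probing (H_inc) over a fixed-size array.
class ProbingSet {
    private final Object[] table;
    private int size = 0;

    ProbingSet(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("n must be at least 2");
        }
        table = new Object[n];
    }

    // Returns the index holding o, or the first free index on o's probe path.
    private int find(Object o) {
        int n = table.length;
        int index = Math.floorMod(o.hashCode(), n);
        while (table[index] != null && !o.equals(table[index])) {
            index = (index + 1) % n;   // try the next position
        }
        return index;
    }

    // Returns false if o was already present.
    boolean add(Object o) {
        int index = find(o);
        if (table[index] != null) {
            return false;
        }
        if (size == table.length - 1) {
            // Keep one slot free; a real table would be rehashed at this point.
            throw new IllegalStateException("table full");
        }
        table[index] = o;
        size++;
        return true;
    }

    boolean contains(Object o) {
        return table[find(o)] != null;
    }
}
```

With n = 5, the Integers 1, 6 and 11 all start probing at index 1 and end up in positions 1, 2 and 3.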
58
CIS 068 Analysis If there is no collision, contains() operates in O(1). If the set contains elements having the same hashcode, there is a collision. Let dup_max be the maximum number of elements having the same hashcode; then contains() operates in O(dup_max). If dup_max is near n, there is no increase in speed, since contains() operates in O(n).
59
CIS 068 A Real Hashcode Java provides a hashcode for every object: method hashCode() in java.lang.Object. For String, for example, hashCode() is computed as: s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], where n = length of the string and s[i] = character at position i.
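The formula can be evaluated with Horner's rule and compared against the built-in method. The class name is hypothetical; the arithmetic is done in int, so overflow wraps around exactly as in String.hashCode().

```java
// Hypothetical re-implementation of the String hash formula.
class StringHashDemo {
    // Computes s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] in int arithmetic.
    static int hash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);   // Horner's rule, one character per step
        }
        return h;
    }
}
```

For the two-character string "Aa" this gives 65*31 + 97 = 2112.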
60
CIS 068 Rehashing a table What happens if the array is full? Create a new array, e.g. of double size, and insert all elements of the old table into the new table. Note: the elements won't keep their index, since the modulus applied to the hashcode has changed!
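A sketch of this rehashing step, assuming linear probing for collisions and Object.hashCode() as H. The class and method names are made up, and insert() assumes the table still has a free slot.

```java
// Hypothetical helper methods for growing a linear-probing table.
class RehashDemo {
    static int index(Object o, int n) {
        return Math.floorMod(o.hashCode(), n);
    }

    // Inserts o with linear probing; assumes at least one free slot.
    static void insert(Object[] table, Object o) {
        int i = index(o, table.length);
        while (table[i] != null) {
            i = (i + 1) % table.length;
        }
        table[i] = o;
    }

    // Builds a table of double size and re-inserts every element.
    static Object[] rehash(Object[] old) {
        Object[] bigger = new Object[old.length * 2];
        for (Object o : old) {
            if (o != null) {
                insert(bigger, o);   // the index is recomputed with the new length
            }
        }
        return bigger;
    }
}
```

In a table of length 4 the Integers 1, 5 and 6 occupy positions 1, 2 and 3 (5 collides with 1, and 6 then finds its slot taken); after rehashing to length 8 they sit at positions 1, 5 and 6 without any collision.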
61
CIS 068 Hashcode Summary A hashtable provides the Set operations add() and contains() in O(1) if the hashcode is chosen properly and the array allows for sufficient variation. Speed is gained by using more memory. If many collisions occur, a hashtable might be slower than a list due to overhead (computation of H, ...).