Data Structures and Algorithms I

CMP 338 Data Structures and Algorithms I
Day 19, 11/8/11
Review Chapters 3 and 4

Symbol Table
A symbol table is a mapping from Key's to Value's
Conventions:
  (At most) one Value per Key
  null is not a legal Key
  null is not a legal Value
  Key's should be immutable
Shorthand implementations:
  delete(Key k) { put(k, null); }
  contains(Key k) { return get(k) != null; }
  isEmpty() { return size() == 0; }

Symbol Table API
public class SymbolTable<Key, Value>
  void put(Key k, Value v)
  Value get(Key k)
  void delete(Key k)
  boolean contains(Key k)
  boolean isEmpty()
  int size()
  Iterable<Key> keys()

Sequential Search Implementation
public Value get(Key k)
  for (Node n = first; n != null; n = n.next)
    if (k.equals(n.key)) return n.val;
  return null;
public void put(Key k, Value v)
  for (Node n = first; n != null; n = n.next)
    if (k.equals(n.key)) { n.val = v; return; }
  first = new Node(k, v, first);
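The get() and put() outlines above can be fleshed out into a small runnable class. This is a minimal sketch rather than the course's own code: the Node fields, the size counter n, and the eager delete() that unlinks the node (instead of the put(k, null) shorthand) are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Unordered linked-list symbol table, following the slide conventions.
class SequentialSearchST<Key, Value> {
    private Node first;  // head of the list
    private int n;       // number of key-value pairs (assumed bookkeeping field)

    private class Node {
        final Key key;
        Value val;
        Node next;

        Node(Key key, Value val, Node next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    public Value get(Key k) {
        for (Node x = first; x != null; x = x.next)
            if (k.equals(x.key)) return x.val;
        return null;  // search miss
    }

    public void put(Key k, Value v) {
        if (v == null) { delete(k); return; }  // null is not a legal Value
        for (Node x = first; x != null; x = x.next)
            if (k.equals(x.key)) { x.val = v; return; }  // search hit: update in place
        first = new Node(k, v, first);  // search miss: add at the front
        n++;
    }

    // Assumption: unlink the node eagerly so that size() stays accurate.
    public void delete(Key k) {
        Node prev = null;
        for (Node x = first; x != null; prev = x, x = x.next) {
            if (k.equals(x.key)) {
                if (prev == null) first = x.next;
                else prev.next = x.next;
                n--;
                return;
            }
        }
    }

    public boolean contains(Key k) { return get(k) != null; }
    public boolean isEmpty() { return size() == 0; }
    public int size() { return n; }

    public Iterable<Key> keys() {
        List<Key> result = new ArrayList<>();
        for (Node x = first; x != null; x = x.next) result.add(x.key);
        return result;
    }
}
```

Both get() and put() scan the whole list in the worst case, so each costs time linear in the number of keys.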

Hash Table Implementations
Hash function: hash()
  Maps Key's to small int's
  Java hashCode() maps Object's to 32-bit ints
Collision resolution:
  Strategy for handling two Key's mapped to the same int
  Closed-addressing (e.g., separate chaining)
    Array entries point to secondary symbol table
  Open-addressing (e.g., linear probing)
    All Key-Value pairs stored in the same array

Hash Functions
Uniform hashing assumption:
  hash: Key → 0..M-1 uniform and independent
Implementing hashCode() for user-defined types
  Combine hashCodes of each field (array entry)
    Start with a small prime (e.g., 17)
    Multiply accumulating hash by small prime (e.g., 31)
    Add hashCode() of next field (or array entry)
    Box primitive values (e.g., ((Integer) 14).hashCode())
  Requirement: x.equals(y) => x.hashCode() == y.hashCode()
Hash function:
  hash(Key k)
    return (k.hashCode() & 0x7FFFFFFF) % M;
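The hashCode() recipe above might look like this for a user-defined type. The Transaction class and its fields are hypothetical, chosen only to show a String, two primitives, and an array being combined; the static hash() helper mirrors the slide's hash function with an explicit table size m.

```java
import java.util.Arrays;

// Hypothetical immutable value type, used only for illustration.
final class Transaction {
    private final String who;
    private final int day;
    private final double amount;
    private final int[] tags;

    Transaction(String who, int day, double amount, int[] tags) {
        this.who = who;
        this.day = day;
        this.amount = amount;
        this.tags = tags.clone();  // defensive copy keeps the key immutable
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Transaction)) return false;
        Transaction t = (Transaction) o;
        return day == t.day
            && Double.compare(amount, t.amount) == 0
            && who.equals(t.who)
            && Arrays.equals(tags, t.tags);
    }

    // Start at 17, multiply by 31, add the hashCode() of each field in turn.
    @Override
    public int hashCode() {
        int h = 17;
        h = 31 * h + who.hashCode();
        h = 31 * h + Integer.valueOf(day).hashCode();    // box the primitive
        h = 31 * h + Double.valueOf(amount).hashCode();  // box the primitive
        for (int t : tags)
            h = 31 * h + Integer.valueOf(t).hashCode();  // one step per array entry
        return h;
    }

    // Mask off the sign bit before taking the remainder, so the index is in 0..m-1.
    static int hash(Object key, int m) {
        return (key.hashCode() & 0x7FFFFFFF) % m;
    }
}
```

The parentheses in hash() matter: in Java, % binds more tightly than &, so without them the mask would be applied to 0x7FFFFFFF % m rather than to the hash code.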

Separate Chaining Hash Table
SeparateChainingHashTable(int size)
  M = size;
  for (int i = 0; i < M; i++)
    st[i] = new SequentialSearchST();
public Value get(Key k)
  return (Value) st[hash(k)].get(k);
public void put(Key k, Value v)
  st[hash(k)].put(k, v);
private int hash(Key k)
  return (k.hashCode() & 0x7FFFFFFF) % M;
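A self-contained sketch of the separate chaining table follows. To keep the block standalone, each array entry heads its own small linked chain instead of delegating to a SequentialSearchST object; that substitution, the class name, and the size counter are assumptions for illustration.

```java
// Separate chaining: st[i] is the head of the chain of keys that hash to i.
class SeparateChainingHashST<Key, Value> {
    private static final class Node<K, V> {
        final K key;
        V val;
        final Node<K, V> next;

        Node(K key, V val, Node<K, V> next) {
            this.key = key;
            this.val = val;
            this.next = next;
        }
    }

    private final int m;                  // number of chains
    private final Node<Key, Value>[] st;  // chain heads
    private int n;                        // number of key-value pairs

    @SuppressWarnings("unchecked")
    SeparateChainingHashST(int size) {
        m = size;
        st = (Node<Key, Value>[]) new Node[m];  // generic array creation needs a cast
    }

    private int hash(Key k) {
        return (k.hashCode() & 0x7FFFFFFF) % m;
    }

    public Value get(Key k) {
        for (Node<Key, Value> x = st[hash(k)]; x != null; x = x.next)
            if (k.equals(x.key)) return x.val;
        return null;
    }

    public void put(Key k, Value v) {
        int i = hash(k);
        for (Node<Key, Value> x = st[i]; x != null; x = x.next)
            if (k.equals(x.key)) { x.val = v; return; }  // key present: update
        st[i] = new Node<>(k, v, st[i]);                 // key absent: prepend to chain
        n++;
    }

    public int size() { return n; }
}
```

With N keys and M chains, the average chain length is N/M under the uniform hashing assumption, which is why the table still works (only more slowly) when it holds many more keys than chains.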

Linear Probing Hash Table
public Value get(Key k)
  for (int i = hash(k); keys[i] != null; i = (i + 1) % M)
    if (keys[i].equals(k)) return vals[i];
  return null;
public void put(Key k, Value v)
  int i;
  for (i = hash(k); keys[i] != null; i = (i + 1) % M)
    if (keys[i].equals(k)) { vals[i] = v; return; }
  keys[i] = k; vals[i] = v;
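The probing loops above only terminate if the array always has an empty slot, so a runnable version needs some resizing policy. The sketch below assumes one common choice (double the array once it is half full); the class name, the minimum capacity of 2, and the resize threshold are assumptions, not part of the slides.

```java
// Linear probing: keys and values live in parallel arrays; collisions probe forward.
class LinearProbingHashST<Key, Value> {
    private int m;         // array size
    private int n;         // number of key-value pairs
    private Key[] keys;
    private Value[] vals;

    @SuppressWarnings("unchecked")
    LinearProbingHashST(int capacity) {
        m = Math.max(capacity, 2);
        keys = (Key[]) new Object[m];
        vals = (Value[]) new Object[m];
    }

    private int hash(Key k) {
        return (k.hashCode() & 0x7FFFFFFF) % m;
    }

    public Value get(Key k) {
        for (int i = hash(k); keys[i] != null; i = (i + 1) % m)
            if (keys[i].equals(k)) return vals[i];
        return null;  // reached an empty slot: search miss
    }

    public void put(Key k, Value v) {
        if (n >= m / 2) resize(2 * m);  // assumed policy: keep the table at most half full
        int i;
        for (i = hash(k); keys[i] != null; i = (i + 1) % m)
            if (keys[i].equals(k)) { vals[i] = v; return; }
        keys[i] = k;
        vals[i] = v;
        n++;
    }

    // Rehash every key into a larger table, then adopt its arrays.
    private void resize(int capacity) {
        LinearProbingHashST<Key, Value> t = new LinearProbingHashST<>(capacity);
        for (int i = 0; i < m; i++)
            if (keys[i] != null) t.put(keys[i], vals[i]);
        keys = t.keys;
        vals = t.vals;
        m = t.m;
    }

    public int size() { return n; }
}
```

Note that keys are compared with equals() rather than ==, since two distinct objects can represent the same key.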

Separate Chaining vs. Linear Probing
Separate chaining
  Easier to implement
  Performance degrades gracefully
  Clustering less sensitive to poor hash()
Linear probing
  Wastes less space
    However, need to implement array resizing
  Better cache performance

Hashing vs. Balanced Search Trees
Hashing
  Simpler to code
  No effective alternative for unordered keys
  Faster (assuming efficient hash function)
  Better system support for Java Strings
Balanced search trees
  Stronger performance guarantees
  Support for ordered operations
  Easier to implement compareTo() correctly than equals() and hashCode()

Java Symbol Tables
Map<K, V> Interface
  TreeMap<K, V> implements SortedMap<K, V>
    O(lg N) ordered operations (worst-case)
  HashMap<K, V> implements Map<K, V>
    O(1) put() and get() operations (average-case)
Set<K> Interface
  TreeSet<K> implements SortedSet<K>
  HashSet<K> implements Set<K>
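As a small illustration of when each library class fits, the sketch below counts words with a HashMap (no ordering needed) and then copies the result into a TreeMap to get ordered operations. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

class WordCountSketch {
    // HashMap: average-case constant-time put/get, keys in no particular order.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words)
            counts.merge(w, 1, Integer::sum);  // insert 1, or add 1 to the existing count
        return counts;
    }

    // TreeMap: keys kept sorted, so firstKey(), headMap(), etc. become available.
    static SortedMap<String, Integer> inKeyOrder(Map<String, Integer> counts) {
        return new TreeMap<>(counts);
    }
}
```

For example, counting the words b, a, b, c gives a count of 2 for "b", and the sorted view reports "a" as its first key.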

Graphs (Mathematics)
(Directed) Graph <V, E>
  V is a set of vertices
  E is a set of edges
    E ⊆ V × V
DAG
  Directed Acyclic Graph
Undirected Graph
  E is symmetric
Edge-Weighted Graph
  weight: E → R

Graph Vocabulary
A path is a sequence of edges connecting vertices
  simple path: no vertex appears twice
A cycle is a path from a vertex to itself
  simple cycle: removing final edge leaves a simple path
A connected component (undirected graph):
  A maximal set of connected vertices
A strongly connected component (directed graph):
  A maximal set of vertices such that there is a directed path from any vertex to any other vertex

Depth-First Search
Each node visited exactly once.
Visit each neighbor during visit to a node
void visit (Node n)
  if (visited(n)) return;
  mark n visited
  do stuff
  for each neighbor m of n
    visit(m)
  maybe do more stuff
Trace: (1(2(3)(4(5)(6))(7))(8(9)(A(B)(C))))(D(E)(F))
Example: ConnectedComponent.java
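The visit() outline above can be turned into a connected-components finder, where "do stuff" records the component number of each vertex. This is a sketch under assumptions, not the course's ConnectedComponent.java: vertices are numbered 0..V-1, edges arrive as pairs, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ConnectedComponentsSketch {
    private final List<List<Integer>> adj = new ArrayList<>();  // adjacency lists
    private final boolean[] visited;
    private final int[] id;  // id[v] = component number of v
    private int count;       // number of components found so far

    ConnectedComponentsSketch(int vertices, int[][] edges) {
        for (int v = 0; v < vertices; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {  // undirected: store each edge in both directions
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        visited = new boolean[vertices];
        id = new int[vertices];
        for (int v = 0; v < vertices; v++) {
            if (!visited[v]) {  // each unvisited vertex starts a new component
                visit(v);
                count++;
            }
        }
    }

    private void visit(int n) {
        if (visited[n]) return;
        visited[n] = true;                      // mark n visited
        id[n] = count;                          // "do stuff": label the component
        for (int m : adj.get(n)) visit(m);      // visit each neighbor
    }

    boolean connected(int v, int w) { return id[v] == id[w]; }
    int count() { return count; }
}
```

The recursion depth can reach the number of vertices, so very large graphs may need an explicit stack instead.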

Breadth-First Search
Each node visited exactly once.
Schedule visit to each neighbor during visit to a node
void visit (Node n)
  if (visited(n)) return;
  mark n visited
  do stuff
  for each neighbor m of n
    put m on queue of nodes to visit
  maybe do more stuff
Trace: (1)(2)(3)(6)(7)(8)(4)(5)(9)(A)(C)(B)(D)(E)(F)
Example: ShortestPath.java
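Breadth-first search visits nodes in order of their distance (in edges) from the source, which is what makes it usable for unweighted shortest paths. The sketch below is not the course's ShortestPath.java; it assumes integer vertices and, as a small variation on the outline above, marks a node when it is enqueued rather than when it is dequeued, so no node enters the queue twice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.LinkedList;
import java.util.List;

class BfsPathsSketch {
    private final int[] distTo;  // distTo[v] = edges on a shortest path, or -1 if unreachable
    private final int[] parent;  // parent[v] = previous vertex on that path, or -1

    BfsPathsSketch(int vertices, int[][] edges, int source) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < vertices; v++) adj.add(new ArrayList<>());
        for (int[] e : edges) {  // undirected graph
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }
        distTo = new int[vertices];
        parent = new int[vertices];
        Arrays.fill(distTo, -1);
        Arrays.fill(parent, -1);

        Deque<Integer> queue = new ArrayDeque<>();
        distTo[source] = 0;
        queue.add(source);
        while (!queue.isEmpty()) {
            int n = queue.remove();
            for (int m : adj.get(n)) {
                if (distTo[m] < 0) {  // first time m is seen
                    distTo[m] = distTo[n] + 1;
                    parent[m] = n;
                    queue.add(m);     // schedule the visit to m
                }
            }
        }
    }

    int distTo(int v) { return distTo[v]; }

    // Follow parent links back to the source; empty list if v is unreachable.
    List<Integer> pathTo(int v) {
        LinkedList<Integer> path = new LinkedList<>();
        if (distTo[v] < 0) return path;
        for (int x = v; x != -1; x = parent[x]) path.addFirst(x);
        return path;
    }
}
```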

Spanning Tree (Undirected Graph)
A tree in an undirected graph:
  A set of connected edges not containing a cycle
A spanning tree of an undirected graph:
  A tree that connects each vertex of the graph
A spanning forest of an undirected graph:
  Set of spanning trees of the connected components
A minimum spanning tree (MST) of a weighted graph:
  A spanning tree with minimum total weight

Prim's MST Algorithm
mark any node
while exists an edge from marked to unmarked
  pick the shortest such edge
  add the edge to the MST
  mark the unmarked vertex
Use priority queue to keep track of the edges
  Optimization: only 1 edge per unmarked node
    Need to be able to reduce a key in a priority queue
Running-time: ||E|| + ||V|| lg ||V|| (with a Fibonacci-heap priority queue; ||E|| lg ||V|| with a binary heap)
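The loop above can be sketched with java.util.PriorityQueue. Because that class has no reduce-key operation, this is the simpler "lazy" variant: every crossing edge goes on the queue and stale edges (both endpoints already marked) are skipped when removed, giving ||E|| lg ||E|| time rather than the optimized bound. The class and method names are hypothetical, and a connected graph is assumed.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class LazyPrimSketch {
    static final class Edge {
        final int u, v;
        final double w;
        Edge(int u, int v, double w) { this.u = u; this.v = v; this.w = w; }
    }

    private final List<List<Edge>> adj = new ArrayList<>();

    LazyPrimSketch(int vertices) {
        for (int i = 0; i < vertices; i++) adj.add(new ArrayList<>());
    }

    void addEdge(int u, int v, double w) {
        Edge e = new Edge(u, v, w);
        adj.get(u).add(e);
        adj.get(v).add(e);
    }

    List<Edge> mst() {
        List<Edge> tree = new ArrayList<>();
        if (adj.isEmpty()) return tree;
        boolean[] marked = new boolean[adj.size()];
        PriorityQueue<Edge> pq =
            new PriorityQueue<>(Comparator.comparingDouble((Edge e) -> e.w));
        marked[0] = true;       // mark any node
        pq.addAll(adj.get(0));
        while (!pq.isEmpty()) {
            Edge e = pq.remove();                      // shortest candidate edge
            if (marked[e.u] && marked[e.v]) continue;  // stale: no longer crosses the cut
            tree.add(e);                               // add the edge to the MST
            int x = marked[e.u] ? e.v : e.u;           // the unmarked endpoint
            marked[x] = true;
            for (Edge f : adj.get(x))
                if (!marked[f.u] || !marked[f.v]) pq.add(f);
        }
        return tree;
    }

    static double totalWeight(List<Edge> edges) {
        double sum = 0.0;
        for (Edge e : edges) sum += e.w;
        return sum;
    }
}
```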

Kruskal's MST Algorithm
while exists an unconsidered edge
  consider the shortest unconsidered edge
  if it would not create a cycle
    add the edge to MST
Use priority queue to keep track of the edges
Cycle detection: Disjoint Union / Find algorithm
  Edge will create a cycle iff both end-points in the same set
  Adding an edge to MST requires union of two sets
Running-time: ||E|| lg ||E||
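A compact sketch of the steps above: edges come off a priority queue in weight order, and a union/find array decides whether an edge would close a cycle. The union/find here uses path halving without union by rank, which is an assumption made to keep the example short; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class KruskalSketch {
    static final class Edge {
        final int u, v;
        final double w;
        Edge(int u, int v, double w) { this.u = u; this.v = v; this.w = w; }
    }

    private final int vertices;
    private final List<Edge> edges = new ArrayList<>();
    private int[] parent;  // union/find forest: parent[x] == x means x is a set root

    KruskalSketch(int vertices) { this.vertices = vertices; }

    void addEdge(int u, int v, double w) { edges.add(new Edge(u, v, w)); }

    // Find the root of x's set, halving the path along the way.
    private int find(int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    List<Edge> mst() {
        parent = new int[vertices];
        for (int i = 0; i < vertices; i++) parent[i] = i;  // every vertex in its own set
        PriorityQueue<Edge> pq =
            new PriorityQueue<>(Comparator.comparingDouble((Edge e) -> e.w));
        pq.addAll(edges);
        List<Edge> tree = new ArrayList<>();
        while (!pq.isEmpty() && tree.size() < vertices - 1) {
            Edge e = pq.remove();          // shortest unconsidered edge
            int ru = find(e.u);
            int rv = find(e.v);
            if (ru == rv) continue;        // same set: the edge would create a cycle
            parent[ru] = rv;               // union of the two sets
            tree.add(e);
        }
        return tree;
    }

    static double totalWeight(List<Edge> edges) {
        double sum = 0.0;
        for (Edge e : edges) sum += e.w;
        return sum;
    }
}
```

On a disconnected graph the same loop returns a minimum spanning forest, since it simply runs out of edges before collecting ||V|| - 1 of them.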

Shortest Path Algorithms
Shortest paths in edge-weighted directed graphs
  Problem ill-formed if any negative cycle is reachable
If graph is a DAG
  Relax nodes in topological order
  O(||E|| + ||V||)
If all edges are non-negative (Dijkstra)
  Mark and relax nearest unmarked node
  O(||E|| + ||V|| lg ||V||)
General edge-weighted directed graphs (Bellman-Ford)
  Repeat up to ||V|| times:
    Relax nodes changed in previous iteration
  O(||E|| ||V||)

Relaxation
void relax (Node n)
  for Node m in edgeFrom(n)
    relax(n, m)
void relax (Node n, Node m)
  if dist(s, n) + w(n, m) < dist(s, m)
    dist(s, m) = dist(s, n) + w(n, m)
    parent(m) = n
dist is a Map<Node, Double>
parent is a Map<Node, Node>
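The two relax() routines can be written directly against Map-based dist and parent tables. To show them in use, the sketch below drives them with a Bellman-Ford-style loop that repeats up to ||V|| passes and relaxes only the nodes changed in the previous pass. Using String as the node type, treating a missing dist entry as infinity, and returning false when distances are still changing after the last pass (a reachable negative cycle) are all assumptions for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class RelaxationSketch {
    private final Map<String, Map<String, Double>> edgesFrom = new HashMap<>();  // n -> (m -> w(n, m))
    private final Map<String, Double> dist = new HashMap<>();    // dist(s, n)
    private final Map<String, String> parent = new HashMap<>();  // parent(m)

    void addEdge(String n, String m, double w) {
        edgesFrom.computeIfAbsent(n, k -> new HashMap<>()).put(m, w);
    }

    double distTo(String n) { return dist.getOrDefault(n, Double.POSITIVE_INFINITY); }
    String parentOf(String n) { return parent.get(n); }

    // Relax edge n -> m; returns true if dist(s, m) improved.
    private boolean relax(String n, String m) {
        double candidate = distTo(n) + edgesFrom.get(n).get(m);
        if (candidate < distTo(m)) {
            dist.put(m, candidate);
            parent.put(m, n);
            return true;
        }
        return false;
    }

    // Relax every edge leaving n; returns the nodes whose distance changed.
    private Set<String> relax(String n) {
        Set<String> changed = new HashSet<>();
        Map<String, Double> out =
            edgesFrom.getOrDefault(n, Collections.<String, Double>emptyMap());
        for (String m : out.keySet())
            if (relax(n, m)) changed.add(m);
        return changed;
    }

    // Returns true if distances converged, false if they were still changing
    // after vertexCount passes (which indicates a reachable negative cycle).
    boolean bellmanFord(String s, int vertexCount) {
        dist.clear();
        parent.clear();
        dist.put(s, 0.0);
        Set<String> changed = new HashSet<>(Collections.singleton(s));
        for (int pass = 0; pass < vertexCount && !changed.isEmpty(); pass++) {
            Set<String> next = new HashSet<>();
            for (String n : changed) next.addAll(relax(n));
            changed = next;
        }
        return changed.isEmpty();
    }
}
```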

Dijkstra's Shortest Path Algorithm
mark the source node
while exists an edge from marked to unmarked
  pick closest unmarked node n to source
  pick the edge from parent(n) to n (the last edge on the shortest path to n)
  add edge to Shortest-Path tree
  mark and relax n
Use priority queue to order unmarked nodes
Running-time: ||E|| + ||V|| lg ||V|| (with a Fibonacci-heap priority queue; ||E|| lg ||V|| with a binary heap)
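A sketch of the loop above using java.util.PriorityQueue. Since that class cannot reduce a key, this version inserts a fresh queue entry each time a distance improves and skips entries for nodes that are already marked; that "lazy" substitution costs ||E|| lg ||E|| rather than the bound quoted above. Integer vertices, the Entry helper, and all names are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class DijkstraSketch {
    private static final class Edge {
        final int to;
        final double w;
        Edge(int to, double w) { this.to = to; this.w = w; }
    }

    // A (node, distance) snapshot placed on the priority queue.
    private static final class Entry {
        final int node;
        final double dist;
        Entry(int node, double dist) { this.node = node; this.dist = dist; }
    }

    private final List<List<Edge>> adj = new ArrayList<>();
    private int[] parent;  // parent[v] = previous node on the shortest path to v, or -1

    DijkstraSketch(int vertices) {
        for (int i = 0; i < vertices; i++) adj.add(new ArrayList<>());
    }

    void addEdge(int from, int to, double w) {
        if (w < 0) throw new IllegalArgumentException("Dijkstra needs non-negative weights");
        adj.get(from).add(new Edge(to, w));
    }

    double[] shortestDistances(int s) {
        int n = adj.size();
        double[] dist = new double[n];
        Arrays.fill(dist, Double.POSITIVE_INFINITY);
        parent = new int[n];
        Arrays.fill(parent, -1);
        boolean[] marked = new boolean[n];
        PriorityQueue<Entry> pq =
            new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.dist));
        dist[s] = 0.0;
        pq.add(new Entry(s, 0.0));
        while (!pq.isEmpty()) {
            int v = pq.remove().node;  // closest unmarked node, or a stale entry
            if (marked[v]) continue;   // stale entry: v was already finalized
            marked[v] = true;
            for (Edge e : adj.get(v)) {  // relax v
                if (dist[v] + e.w < dist[e.to]) {
                    dist[e.to] = dist[v] + e.w;
                    parent[e.to] = v;
                    pq.add(new Entry(e.to, dist[e.to]));
                }
            }
        }
        return dist;
    }

    int parentOf(int v) { return parent[v]; }
}
```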
