Data Structure – Final Review

Data Structure – Final Review 27-Apr-2009 SUNY Buffalo

About this review
- I've been asked to review several data structures covered in class. This review cannot be complete: it is unrealistic to cover all the material in about 40 minutes.
- The exam may ask questions that weren't covered in this review but were covered in class. If you have questions, ask your instructor as soon as possible.
- I used a different book than the one assigned in this class; the material here is mostly drawn from "Data structure with C".
- I have years of hands-on experience with data structures and algorithms. If you wonder how data structures are used in the "real world", ask.

Review Topics
- Tree ADT: Heap, AVL Tree, Red-Black Tree, and 2-3 Tree (B-tree).
- Dictionary (map) ADT: hash tables and hash functions.
- Graph ADT: Breadth-First Search (BFS) and Depth-First Search (DFS).
For each topic, you should be prepared to answer:
- What is it? How is it represented?
- What operations does it support? How does each operation work? Practice your drawing; work through as many examples as you can.
- How long does each operation take? Best case, average case, and worst case.

Review: Trees
- Terminology: size, height, depth (level), link (edge), path; root, parent, children, sibling, leaves, ancestor, descendant, etc.
- Representation: node structure; storage as an array or a linked list.
- Types: binary tree (binary heap), binary search tree (AVL, red-black), B-tree (2-3 tree).
- Operations: insert(), delete(), search(), sort(), etc.
- Binary tree walks: pre-order (Root, L, R), in-order (L, Root, R), post-order (L, R, Root), level-order.
- Time complexity (for balanced trees): insertion O(log n), searching O(log n), deletion O(log n), sorting O(n log n).
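
As a quick refresher, here is a minimal binary tree node with an in-order walk in Java. The names (Node, inOrder) are my own illustrative choices, not the course code.

```java
// Minimal binary tree node with a recursive in-order walk.
// Names (Node, key, left, right, inOrder) are illustrative.
class Node {
    int key;
    Node left, right;

    Node(int key) { this.key = key; }
}

class TreeWalks {
    // In-order: visit the left subtree, then the root, then the right subtree.
    static void inOrder(Node root) {
        if (root == null) return;
        inOrder(root.left);
        System.out.print(root.key + " ");
        inOrder(root.right);
    }

    public static void main(String[] args) {
        Node root = new Node(2);
        root.left = new Node(1);
        root.right = new Node(3);
        inOrder(root);  // prints: 1 2 3
    }
}
```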

Binary Tree: Importance of Balance
- A binary search tree supports many operations efficiently: search(), successor(), predecessor(), minimum(), maximum(), insert(), and delete() can all be achieved in O(h) time, where h is the height of the tree.
- On a balanced tree, h = O(lg n), so these operations run in O(lg n) time.
- However, insert() and delete() alter the shape of the tree and can leave it unbalanced. In the worst case h = O(n), which is no better than a linked list.
- So we want to correct the imbalance in at most O(lg n) time, adding no asymptotic overhead.
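
To make the O(h) bound concrete, here is a small BST search sketch reusing the illustrative Node class from above (not the course code).

```java
// BST search: each step descends one level, so the running time is O(h),
// where h is the height of the tree. Reuses the illustrative Node class above.
class BstSearch {
    static Node search(Node root, int key) {
        if (root == null || root.key == key) return root;  // found, or not present
        return (key < root.key) ? search(root.left, key)   // key is smaller: go left
                                : search(root.right, key); // key is larger: go right
    }
}
```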

Review: Balanced Trees
- To keep a binary tree balanced, add a requirement, called the heap property, to the binary tree. A binary heap is commonly used to implement the Priority Queue ADT. (Aside: "heap" can also mean the memory area used for dynamic allocation.)
- To keep a BST balanced, add a constraint on the height of its subtrees. The most popular such data structures are AVL and Red-Black trees.

Review: Binary Heap
A binary heap extends the binary tree data structure and has the following properties:
- Each node has a key greater (max-heap) or less (min-heap) than or equal to the keys of its children.
- The tree is a complete binary tree: every level, except possibly the last, is completely filled, and all nodes are as far left as possible.
- The longest path is ceiling(lg n) for n nodes.
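
Because the tree is complete, a binary heap is usually stored in an array. A sketch of the usual 0-based index arithmetic (names are illustrative):

```java
// Array-backed binary heap: index arithmetic for a 0-based array.
// Completeness means parent/child positions are computable from the index.
final class HeapIndex {
    static int parent(int i) { return (i - 1) / 2; }
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
}
```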

Heap: Maintaining the Heap Property
- heapifyUp() and heapifyDown() are the key operations for maintaining the heap property in O(lg n) time.
- How does heapifyDown() work? Given a node i in a max-heap: if A[i] < A[left(i)] or A[i] < A[right(i)], swap A[i] with the larger of A[left(i)] and A[right(i)], then recurse on that subtree.
- How does heapifyUp() work? In a max-heap: if A[i] > A[parent(i)], swap A[i] with A[parent(i)], then recurse on parent(i).
- What about the other operations and their running times? delete(), insert(), buildHeap(), heapSort().
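
Here is a sketch of heapifyDown() for a max-heap stored in an int array; this is my own illustrative version, not the course code. heapifyUp() is analogous, comparing with the parent instead of the children.

```java
class MaxHeapify {
    // heapifyDown (max-heapify) on a max-heap stored in arr[0..size-1].
    // If arr[i] is smaller than a child, swap with the larger child and continue
    // down that subtree; each step descends one level, so the cost is O(lg n).
    static void heapifyDown(int[] arr, int size, int i) {
        while (true) {
            int left = 2 * i + 1, right = 2 * i + 2, largest = i;
            if (left < size && arr[left] > arr[largest]) largest = left;
            if (right < size && arr[right] > arr[largest]) largest = right;
            if (largest == i) return;              // heap property already holds
            int tmp = arr[i]; arr[i] = arr[largest]; arr[largest] = tmp;
            i = largest;                           // continue in the affected subtree
        }
    }
}
```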

Review: AVL Tree (AVL: Adelson-Velsky and Landis, 1962)
- An AVL tree extends the BST data structure with the following property: for any node, the height difference between its left and right subtrees is at most one.
- Observe that the smallest AVL tree of depth 1 has 1 node, and the smallest AVL tree of depth 2 has 2 nodes. In general, the minimum size T(h) of an AVL tree of height h satisfies T(h) = T(h-1) + T(h-2) + 1, which grows exponentially, so h = O(lg n).
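
A sketch of how the height and balance factor could be checked, reusing the illustrative Node class from above; a real AVL implementation caches the height in each node instead of recomputing it like this.

```java
class AvlCheck {
    // Height of a node (an empty tree has height -1).
    static int height(Node n) {
        return (n == null) ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    // AVL property: |balanceFactor(n)| <= 1 for every node n.
    static int balanceFactor(Node n) {
        return height(n.left) - height(n.right);
    }
}
```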

AVL: Maintaining the AVL Property
- Tree rotation is the key operation for maintaining the AVL property in O(lg n) time.
- If a node is not balanced, the height difference between its two children is 2.
- There are 4 possible cases with a height difference of 2, handled case by case on the next slide.
[Figure: the four imbalance cases (1)-(4).]

AVL: Maintaining the AVL Property (2)
- Case 1 (left-left), fixed by rightRotate(y):
    x = y.getLeftChild();
    y.setLeftChild(x.getRightChild());
    x.setRightChild(y);   // x becomes the new root of this subtree
- Case 2 (right-right), fixed by leftRotate(x):
    y = x.getRightChild();
    x.setRightChild(y.getLeftChild());
    y.setLeftChild(x);    // y becomes the new root of this subtree
- Case 3 (left-right): leftRotate(y), then rightRotate(x).
- Case 4 (right-left): rightRotate(x), then leftRotate(y).
[Figure: rightRotate(y) and leftRotate(x) on a subtree with children A, B, C.]
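
The two single rotations, and the double rotations built from them, can be written compactly in a parent-free style that returns the new subtree root. This is an illustrative Java sketch; AvlNode and the method names are my assumptions, not the course code, and a full AVL tree would also update cached heights inside these methods.

```java
class AvlNode {
    int key;
    AvlNode left, right;
    AvlNode(int key) { this.key = key; }
}

class Rotations {
    // Case 1 (left-left): rotate right around y; y's left child x becomes the root.
    static AvlNode rightRotate(AvlNode y) {
        AvlNode x = y.left;
        y.left = x.right;   // x's right subtree (B) moves under y
        x.right = y;
        return x;           // caller re-links x in place of y
    }

    // Case 2 (right-right): rotate left around x; x's right child y becomes the root.
    static AvlNode leftRotate(AvlNode x) {
        AvlNode y = x.right;
        x.right = y.left;   // y's left subtree (B) moves under x
        y.left = x;
        return y;
    }

    // Cases 3 and 4 (double rotations) compose the two single rotations.
    static AvlNode leftRight(AvlNode z) {    // left-right imbalance
        z.left = leftRotate(z.left);
        return rightRotate(z);
    }

    static AvlNode rightLeft(AvlNode z) {    // right-left imbalance
        z.right = rightRotate(z.right);
        return leftRotate(z);
    }
}
```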

AVL: Insert/Delete
Insertion is similar to a regular BST insert:
- Search for the position: keep going left (or right) in the tree until a null child is reached, and insert the new node there. An inserted node is always a leaf.
- Rebalance the tree: search from the inserted node up to the root for any node that violates the AVL property, and use a rotation to fix it. Only the first unbalanced node needs to be fixed.
Deletion is similar to a regular BST delete:
- Search for the node and remove it: 0 children: replace it with null; 1 child: replace it with its only child; 2 children: replace it with the right-most node in its left subtree.
- Rebalance the tree: search from the deleted node's position up to the root for all nodes that violate the AVL property, and use rotations to fix them. This may require working all the way back up to the root.
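
Putting the pieces together, a sketch of recursive AVL insertion with rebalancing, built on the AvlNode and Rotations sketches above. Heights are recomputed here for brevity; a real implementation would cache them in the nodes.

```java
class AvlInsert {
    static int height(AvlNode n) {
        return (n == null) ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }

    static AvlNode insert(AvlNode node, int key) {
        if (node == null) return new AvlNode(key);          // inserted node is a leaf
        if (key < node.key)      node.left  = insert(node.left, key);
        else if (key > node.key) node.right = insert(node.right, key);
        else return node;                                    // ignore duplicate keys

        int bf = height(node.left) - height(node.right);     // balance factor
        if (bf > 1)   // left-heavy
            return (key < node.left.key)  ? Rotations.rightRotate(node)  // left-left
                                          : Rotations.leftRight(node);   // left-right
        if (bf < -1)  // right-heavy
            return (key > node.right.key) ? Rotations.leftRotate(node)   // right-right
                                          : Rotations.rightLeft(node);   // right-left
        return node;                                         // already balanced
    }
}
```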

Review: Red-Black Trees
A red-black tree extends the BST data structure with the following properties:
- Every node is either red or black.
- The root is always black.
- Every leaf (NULL pointer) is black (so every "real" node has 2 children).
- Both children of every red node are black (there can't be 2 consecutive red nodes on a path).
- Every simple path from a node to a descendant leaf contains the same number of black nodes.
An RB tree has height h ≤ 2 lg(n+1), so every operation that runs in time proportional to the height is guaranteed to take h = O(lg n).

RB Trees: Maintaining the RB Tree Property
- Tree rotation is the key operation for maintaining the RB tree properties in O(lg n) time:
- Rotation preserves the in-order key ordering.
- Rotation takes O(1) time (it just swaps pointers).
[Figure: rightRotate(y) and leftRotate(x) on a subtree with children A, B, C.]

RB Trees: Insert/Delete
Insertion is similar to the BST insert:
- Do a BST insert and colour the new node red.
- Rebalance the tree: if the parent is black, done. Otherwise, handle one of three cases: the parent's sibling is red; the parent's sibling is black and the new node is a right child; the parent's sibling is black and the new node is a left child. Repeat, moving up the tree, until there is no violation.
Deletion is similar to the BST delete:
- Do a BST delete.
- Rebalance the tree: if the node is red, colour it black and we are done. Otherwise, handle one of four cases: the sibling is red; the sibling is black and both its children are black; the sibling is black, its left child is red, and its right child is black; the sibling is black and its right child is red. Repeat, moving up the tree, until there is no violation.
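
For reference, here is a compact sketch of the insert fixup in the style of CLRS, with null checks standing in for the NIL sentinel. All class and method names are illustrative, not the course code; the deletion fixup is analogous but has more cases and is omitted here.

```java
class RbNode {
    int key;
    boolean red = true;                 // new nodes start red
    RbNode left, right, parent;
    RbNode(int key) { this.key = key; }
}

class RbTree {
    RbNode root;

    // Rotations here must also maintain parent pointers (unlike the AVL sketch above).
    void leftRotate(RbNode x) {
        RbNode y = x.right;
        x.right = y.left;
        if (y.left != null) y.left.parent = x;
        y.parent = x.parent;
        if (x.parent == null) root = y;
        else if (x == x.parent.left) x.parent.left = y;
        else x.parent.right = y;
        y.left = x;
        x.parent = y;
    }

    void rightRotate(RbNode y) {        // mirror image of leftRotate
        RbNode x = y.left;
        y.left = x.right;
        if (x.right != null) x.right.parent = y;
        x.parent = y.parent;
        if (y.parent == null) root = x;
        else if (y == y.parent.left) y.parent.left = x;
        else y.parent.right = x;
        x.right = y;
        y.parent = x;
    }

    // Restore the red-black properties after z has been BST-inserted and coloured red.
    void insertFixup(RbNode z) {
        while (z.parent != null && z.parent.red) {
            RbNode g = z.parent.parent;             // exists: a red parent is never the root
            if (z.parent == g.left) {
                RbNode uncle = g.right;             // null counts as black
                if (uncle != null && uncle.red) {   // parent's sibling is red: recolour, move up
                    z.parent.red = false; uncle.red = false; g.red = true;
                    z = g;
                } else {
                    if (z == z.parent.right) {      // uncle black, z is a right child: rotate first
                        z = z.parent;
                        leftRotate(z);
                    }
                    z.parent.red = false; g.red = true;   // uncle black, z is a left child
                    rightRotate(g);
                }
            } else {                                // mirror image: parent is a right child
                RbNode uncle = g.left;
                if (uncle != null && uncle.red) {
                    z.parent.red = false; uncle.red = false; g.red = true;
                    z = g;
                } else {
                    if (z == z.parent.left) {
                        z = z.parent;
                        rightRotate(z);
                    }
                    z.parent.red = false; g.red = true;
                    leftRotate(g);
                }
            }
        }
        root.red = false;                           // the root is always black
    }
}
```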

Review: 2-3 B-Trees
A B-tree of order m extends the tree data structure and has the following properties:
- The root is either a leaf or has between 2 and m children.
- Each internal node has between ceiling(m/2) and m children, and therefore between ceiling(m/2)-1 and m-1 keys.
- A leaf node has between 1 and m-1 keys.
- The tree is perfectly balanced.
So a 2-3 tree is a B-tree of order 3: a node can have 2 or 3 children, which means a node holds 1 or 2 keys. A red-black tree corresponds to a B-tree of minimum degree 2 (a 2-3-4 tree).
[Figure: a 3-node <x, y> with one child for keys ≤ x, one for keys > x and ≤ y, and one for keys > y.]

2-3: Insert/Delete
Insertion is similar to insert in a BST:
- Search for the item. If found, done. Otherwise:
- Stopped at a 2-node? Upgrade the 2-node to a 3-node.
- Stopped at a 3-node? Replace the 3-node with two 2-nodes and push the middle value up to the parent node. Repeat recursively until you upgrade a 2-node or create a new root. (When is a new root created?)
Deletion is similar to delete in a BST:
- Start the deletion at a leaf: swap the value to be deleted with its immediate successor in the tree, then delete the value from that node.
- If the node still has a value, done (we have changed a 3-node into a 2-node). Otherwise, borrow a value from a sibling or parent.

Review: Hash Tables
- Given n elements, each with a key and satellite data, we need to support insert(T, x), delete(T, x), and search(T, x), but we don't care about sorting the elements.
- Suppose no two elements have the same key and the range of keys is 0…m-1, where m is not too large. Set up an array T[0…m-1] in which T[i] = x if x ∈ T and i = h(key(x)), and T[i] = NULL otherwise.
- h() is called the hash function; when keys are used directly as indices, T is called a direct-address table.
- Hash tables support insert, delete, and search in O(1) expected time.

Hash: Resolving Collisions
A collision happens when two keys hash to the same slot. There are two common ways to resolve collisions:
- Open addressing: to insert, if the slot is full, try another slot, and another, until an open slot is found (probing). To search, follow the same sequence of probes as would be used when inserting the element.
- Chaining: keep a linked list of elements in each slot. To insert, upon collision just add the new element to the list. To search, search the linked list.
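
A minimal chaining hash table sketch in Java; the class and method names are illustrative, not a standard library API.

```java
import java.util.LinkedList;

// Chaining hash table for int keys and String values (illustrative sketch).
class ChainedHashTable {
    private static class Entry {
        final int key; String value;
        Entry(int key, String value) { this.key = key; this.value = value; }
    }

    private final LinkedList<Entry>[] slots;

    @SuppressWarnings("unchecked")
    ChainedHashTable(int m) {
        slots = new LinkedList[m];
        for (int i = 0; i < m; i++) slots[i] = new LinkedList<>();
    }

    private int hash(int key) {                   // division method: h(k) = k mod m
        return Math.floorMod(key, slots.length);
    }

    void insert(int key, String value) {          // add to the chain, overwrite if present
        for (Entry e : slots[hash(key)]) {
            if (e.key == key) { e.value = value; return; }
        }
        slots[hash(key)].add(new Entry(key, value));
    }

    String search(int key) {                      // walk the chain in the hashed slot
        for (Entry e : slots[hash(key)]) {
            if (e.key == key) return e.value;
        }
        return null;                              // not found
    }
}
```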

Hash: Choosing A Hash Function
Choosing a good hash function is crucial: a bad hash function puts all elements in the same slot. A good hash function:
- Should distribute keys uniformly into the slots.
- Should not depend on patterns in the data.
There are three common hash functions:
- Division method: h(k) = k mod m, with m a prime number.
- Multiplication method: h(k) = floor(m * (k*A mod 1)), with 0 < A < 1.
- Universal hashing: h(k) = ((a*k + b) mod p) mod m, with p a prime larger than any key and a, b chosen at random.
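
Sketches of the division and multiplication methods; the constant A ≈ 0.618 is Knuth's commonly suggested choice, and keys are assumed to be non-negative.

```java
// Illustrative hash functions for non-negative integer keys.
final class HashFunctions {
    // Division method: h(k) = k mod m, with m a prime number.
    static int divisionHash(int k, int m) {
        return Math.floorMod(k, m);
    }

    // Multiplication method: h(k) = floor(m * (k*A mod 1)),
    // with A = (sqrt(5) - 1) / 2 ≈ 0.618 (Knuth's suggestion).
    static int multiplicationHash(int k, int m) {
        double a = (Math.sqrt(5) - 1) / 2;
        double frac = (k * a) % 1.0;        // fractional part of k*A
        return (int) Math.floor(m * frac);
    }
}
```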

Review: Graphs
- A graph G = (V, E), where V = set of vertices and E = set of edges.
- Dense graph: |E| ≈ |V|²; sparse graph: |E| ≈ |V|.
- Undirected graph: edge (u,v) = edge (v,u); no self-loops.
- Directed graph: edge (u,v) goes from vertex u to vertex v, written u→v.
- A weighted graph associates weights with either the edges or the vertices.

Graphs: Adjacency Matrix
Assume V = {1, 2, …, n}. An adjacency matrix represents the graph as an n x n matrix A:
- A[i, j] = 1 if edge (i, j) ∈ E (or the weight of the edge),
- A[i, j] = 0 if edge (i, j) ∉ E.
[Figure: a 4-vertex example graph and its adjacency matrix.]
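
A small illustrative example that builds the adjacency matrix for the 4-vertex graph used on these slides (edges 1→2, 1→3, 2→3, 4→3).

```java
public class AdjMatrixDemo {
    public static void main(String[] args) {
        int n = 4;
        int[][] a = new int[n + 1][n + 1];        // 1-based vertices; row/column 0 unused
        int[][] edges = {{1, 2}, {1, 3}, {2, 3}, {4, 3}};
        for (int[] e : edges) a[e[0]][e[1]] = 1;  // directed; also set a[v][u] if undirected
        System.out.println(a[1][2] + " " + a[3][1]);   // prints: 1 0
    }
}
```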

Graphs: Adjacency List
An adjacency list represents the graph as an array of linked lists. For each vertex v ∈ V, store a list of the vertices adjacent to v.
Example (for the 4-vertex graph above): Adj[1] = {2,3}, Adj[2] = {3}, Adj[3] = {}, Adj[4] = {3}.
Variation: can also keep a list of the edges coming into each vertex.
[Figure: the 4-vertex example graph.]
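
The same example graph as an adjacency list, again as illustrative code.

```java
import java.util.ArrayList;
import java.util.List;

public class AdjListDemo {
    public static void main(String[] args) {
        int n = 4;
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i <= n; i++) adj.add(new ArrayList<>());   // 1-based vertices
        adj.get(1).add(2); adj.get(1).add(3);   // Adj[1] = {2, 3}
        adj.get(2).add(3);                      // Adj[2] = {3}
        adj.get(4).add(3);                      // Adj[4] = {3}; Adj[3] stays empty
        System.out.println(adj.get(1));         // prints: [2, 3]
    }
}
```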

Graphs: Storage
- An adjacency matrix takes O(V²) storage: usually too much for large graphs, but it can be very efficient for small graphs.
- An adjacency list takes O(V+E) storage. The degree of a vertex v = # of incident edges.
- For directed graphs, the number of items in the adjacency lists is Σ out-degree(v) = |E|, which takes Θ(V + E) storage.
- For undirected graphs, the number of items is Σ degree(v) = 2|E| (handshaking lemma), which also takes Θ(V + E) storage.
- Most large interesting graphs are sparse; e.g., planar graphs, in which no edges cross, have |E| = O(|V|) by Euler's formula. So the adjacency list is often the more appropriate representation.

Review: Graph Searching
- Given: a graph G = (V, E), directed or undirected.
- Goal: systematically explore every vertex and every edge.
- General idea: build a tree on the graph. Pick a vertex as the root and choose certain edges to produce a tree.
- Note: this might build a forest if the graph is not connected.

Breadth-First Search
General idea: expand the frontier of explored vertices across the breadth of the frontier. Pick a source vertex to be the root; find ("discover") its children, then their children, etc.
Associate vertex "colours":
- White vertices have not been discovered. All vertices start out white.
- Grey vertices are discovered but not fully explored. They may be adjacent to white vertices.
- Black vertices are discovered and fully explored. They are adjacent only to black and grey vertices.
Explore vertices by scanning the FIFO queue of grey vertices.
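
A BFS sketch over the adjacency-list representation above. The colours are implicit: an undiscovered (white) vertex has dist = -1, and the FIFO queue holds the grey frontier. Because dist[v] counts edges on the discovered path, this is also the shortest-path computation discussed on the next slide.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class Bfs {
    // Returns dist[v] = minimum number of edges from s to v, or -1 if v is unreachable.
    static int[] bfs(List<List<Integer>> adj, int s) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);                  // -1 = white (undiscovered)
        Queue<Integer> queue = new ArrayDeque<>();
        dist[s] = 0;                            // discover the source
        queue.add(s);
        while (!queue.isEmpty()) {
            int u = queue.remove();             // u turns black once fully scanned
            for (int v : adj.get(u)) {
                if (dist[v] == -1) {            // white neighbour: discover it
                    dist[v] = dist[u] + 1;
                    queue.add(v);               // v joins the grey frontier
                }
            }
        }
        return dist;
    }
}
```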

BFS and Shortest-path
- BFS can be thought of as being like Dijkstra's shortest-path algorithm, except that every edge has the same weight.
- BFS calculates the shortest-path distance from the source node: δ(s,v) = minimum number of edges from s to v, or ∞ if v is not reachable from s. (The proof is in the book.)
- BFS builds a breadth-first tree, in which paths to the root represent shortest paths in G.
- Thus BFS can be used to calculate the shortest path from one vertex to another in O(V+E) time.

Depth-First Search
General idea: explore "deeper" in the graph whenever possible. Edges are explored out of the most recently discovered vertex v that still has unexplored edges; when all of v's edges have been explored, backtrack to the vertex from which v was discovered.
Like BFS, associate vertex "colours":
- Vertices are initially white.
- They are coloured grey when discovered.
- They are coloured black when finished.
Explore vertices by scanning the stack of grey vertices.
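
A recursive DFS sketch over the same adjacency-list representation; here the call stack plays the role of the stack of grey vertices. Names are illustrative, not the course code.

```java
import java.util.List;

class Dfs {
    static final int WHITE = 0, GREY = 1, BLACK = 2;

    static void dfs(List<List<Integer>> adj) {
        int[] colour = new int[adj.size()];          // all vertices start white
        for (int u = 1; u < adj.size(); u++) {       // 1-based vertices
            if (colour[u] == WHITE) dfsVisit(adj, u, colour);
        }
    }

    static void dfsVisit(List<List<Integer>> adj, int u, int[] colour) {
        colour[u] = GREY;                            // u is discovered
        for (int v : adj.get(u)) {
            if (colour[v] == WHITE) dfsVisit(adj, v, colour);
            // A grey neighbour indicates a back edge, i.e. a cycle
            // (for undirected graphs, ignore the edge back to u's parent).
        }
        colour[u] = BLACK;                           // u is fully explored
    }
}
```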

DFS And Cycles
- An undirected graph is acyclic iff a DFS yields no back edges:
- If acyclic, there are no back edges (because a back edge implies a cycle).
- If there are no back edges, the graph is acyclic: no back edges implies only tree edges (why?), and only tree edges implies we have a tree or a forest, which by definition is acyclic.
- Thus we can run DFS to find whether a graph has a cycle.
- We can actually determine whether a cycle exists in O(V) time: in an undirected acyclic forest, |E| ≤ |V| - 1. So count the edges: if we ever see |V| distinct edges, we must have seen a back edge along the way.

Remarks (1)
- Clearly data structures and algorithms are closely related: selecting the most efficient data structure and algorithm will almost always be the best way to proceed.
- However, producing a good implementation requires considering many factors: the obvious solution isn't always the best, and sometimes it makes sense to use multiple data structures, each with different properties, to represent a single object.
- Factors to consider: the memory footprint implied by a given representation; the cost of operations in that representation; the cost of converting to another representation; and the amount of computation expected with a given representation.

Remarks (2)
- When it comes to implementing an algorithm, the main point is that constant factors matter. Mapping algorithms and data structures onto the characteristics of the architecture is VERY important!
- You often need to restructure a program, not functionally but behaviourally, to get better performance; restructuring code can be more involved than just applying optimisations.
- The bottom line: think about the trade-offs that change the quality of an implementation. Direct, obvious algorithm translations don't always give good performance; the best performance comes from considering the many aspects of execution, e.g., memory access, processor characteristics, and language overheads.

Good Luck!