Prelim 2 Review CS 2110 Fall 2009.

Slides:



Advertisements
Similar presentations
Lecture 15. Graph Algorithms
Advertisements

Algorithms (and Datastructures) Lecture 3 MAS 714 part 2 Hartmut Klauck.
Introduction to Algorithms Quicksort
CSE 390B: Graph Algorithms Based on CSE 373 slides by Jessica Miller, Ruth Anderson 1.
SEARCHING AND SORTING HINT AT ASYMPTOTIC COMPLEXITY Lecture 9 CS2110 – Spring 2015 We may not cover all this material.
Graphs Chapter 12. Chapter Objectives  To become familiar with graph terminology and the different types of graphs  To study a Graph ADT and different.
Chapter 5 Decrease and Conquer. Homework 7 hw7 (due 3/17) hw7 (due 3/17) –page 127 question 5 –page 132 questions 5 and 6 –page 137 questions 5 and 6.
Priority Queues. 2 Priority queue A stack is first in, last out A queue is first in, first out A priority queue is least-first-out The “smallest” element.
CS 307 Fundamentals of Computer Science 1 Abstract Data Types many slides taken from Mike Scott, UT Austin.
Lists A list is a finite, ordered sequence of data items. Two Implementations –Arrays –Linked Lists.
Graphs Chapter 12. Chapter 12: Graphs2 Chapter Objectives To become familiar with graph terminology and the different types of graphs To study a Graph.
Spring 2010CS 2251 Graphs Chapter 10. Spring 2010CS 2252 Chapter Objectives To become familiar with graph terminology and the different types of graphs.
Course Review COMP171 Spring Hashing / Slide 2 Elementary Data Structures * Linked lists n Types: singular, doubly, circular n Operations: insert,
Dynamic Sets and Data Structures Over the course of an algorithm’s execution, an algorithm may maintain a dynamic set of objects The algorithm will perform.
CS 206 Introduction to Computer Science II 11 / 05 / 2008 Instructor: Michael Eckmann.
Blind Search-Part 2 Ref: Chapter 2. Search Trees The search for a solution can be described by a tree - each node represents one state. The path from.
CS 206 Introduction to Computer Science II 03 / 25 / 2009 Instructor: Michael Eckmann.
Fall 2007CS 2251 Graphs Chapter 12. Fall 2007CS 2252 Chapter Objectives To become familiar with graph terminology and the different types of graphs To.
1 CSE 417: Algorithms and Computational Complexity Winter 2001 Lecture 11 Instructor: Paul Beame.
More Graph Algorithms Weiss ch Exercise: MST idea from yesterday Alternative minimum spanning tree algorithm idea Idea: Look at smallest edge not.
Data Structures, Spring 2004 © L. Joskowicz 1 DAST – Final Lecture Summary and overview What we have learned. Why it is important. What next.
Review of Graphs A graph is composed of edges E and vertices V that link the nodes together. A graph G is often denoted G=(V,E) where V is the set of vertices.
C o n f i d e n t i a l HOME NEXT Subject Name: Data Structure Using C Unit Title: Graphs.
Abstract Data Types (ADTs) Data Structures The Java Collections API
1 HEAPS & PRIORITY QUEUES Array and Tree implementations.
Minimal Spanning Trees What is a minimal spanning tree (MST) and how to find one.
CS112A1 Spring 2008 Practice Final. ASYMPTOTIC NOTATION: a)Show that log(n) and ln(n) are the same in terms of Big-Theta notation b)Show that log(n+1)
Graphs CS 400/600 – Data Structures. Graphs2 Graphs  Used to represent all kinds of problems Networks and routing State diagrams Flow and capacity.
Course notes CS2606: Data Structures and Object-Oriented Development Graphs Department of Computer Science Virginia Tech Spring 2008 (The following notes.
1 GRAPHS - ADVANCED APPLICATIONS Minimim Spanning Trees Shortest Path Transitive Closure.
Comp 249 Programming Methodology Chapter 15 Linked Data Structure - Part B Dr. Aiman Hanna Department of Computer Science & Software Engineering Concordia.
Chapter 9 – Graphs A graph G=(V,E) – vertices and edges
Chapter 3 List Stacks and Queues. Data Structures Data structure is a representation of data and the operations allowed on that data. Data structure is.
Data Structures and Abstract Data Types "Get your data structures correct first, and the rest of the program will write itself." - David Jones.
MA/CSSE 473 Day 12 Insertion Sort quick review DFS, BFS Topological Sort.
2IL05 Data Structures Fall 2007 Lecture 13: Minimum Spanning Trees.
Minimum Spanning Trees CSE 2320 – Algorithms and Data Structures Vassilis Athitsos University of Texas at Arlington 1.
Sorting. Pseudocode of Insertion Sort Insertion Sort To sort array A[0..n-1], sort A[0..n-2] recursively and then insert A[n-1] in its proper place among.
Lecture 13 Jianjun Hu Department of Computer Science and Engineering University of South Carolina CSCE350 Algorithms and Data Structure.
Graphs. Definitions A graph is two sets. A graph is two sets. –A set of nodes or vertices V –A set of edges E Edges connect nodes. Edges connect nodes.
CS 206 Introduction to Computer Science II 11 / 16 / 2009 Instructor: Michael Eckmann.
Priority Queues and Heaps. October 2004John Edgar2  A queue should implement at least the first two of these operations:  insert – insert item at the.
Java Methods Big-O Analysis of Algorithms Object-Oriented Programming
PRIORITY QUEUES AND HEAPS Slides of Ken Birman, Cornell University.
 Today, we will cover ◦ Typing ◦ Induction and Recursion ◦ Asymptotic Complexity ◦ Data Structures ◦ Abstract Data Types and Implementing ADTs ◦ Searching.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley. Ver Chapter 13: Graphs Data Abstraction & Problem Solving with C++
COSC 2007 Data Structures II
Graphs Chapter 12. Chapter 12: Graphs2 Chapter Objectives To become familiar with graph terminology and the different types of graphs To study a Graph.
© 2006 Pearson Addison-Wesley. All rights reserved 14 A-1 Chapter 14 Graphs.
FALL 2005CENG 213 Data Structures1 Priority Queues (Heaps) Reference: Chapter 7.
Graphs Upon completion you will be able to:
Week 15 – Wednesday.  What did we talk about last time?  Review up to Exam 1.
Chapter 20: Graphs. Objectives In this chapter, you will: – Learn about graphs – Become familiar with the basic terminology of graph theory – Discover.
Course: Programming II - Abstract Data Types HeapsSlide Number 1 The ADT Heap So far we have seen the following sorting types : 1) Linked List sort by.
1 Priority Queues (Heaps). 2 Priority Queues Many applications require that we process records with keys in order, but not necessarily in full sorted.
Recitation 9 Prelim Review.
Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2008
May 12th – Minimum Spanning Trees
Heaps And Priority Queues
Cse 373 April 26th – Exam Review.
I206: Lecture 15: Graphs Marti Hearst Spring 2012.
CSE 373 Data Structures and Algorithms
Graphs.
Chapter 11 Graphs.
CSE 373: Data Structures and Algorithms
Lecture 22: Implementing Dijkstra’s
INTRODUCTION A graph G=(V,E) consists of a finite non empty set of vertices V , and a finite set of edges E which connect pairs of vertices .
Presentation transcript:

Prelim 2 Review CS 2110 Fall 2009

Overview Complexity and Big-O notation ADTs and data structures Linked lists, arrays, queues, stacks, priority queues, hash maps Searching and sorting Graphs and algorithms Searching Topological sort Minimum spanning trees Shortest paths Miscellaneous Java topics Dynamic vs static types, casting, garbage collectors, Java docs

Big-O notation Big-O is an asymptotic upper bound on a function “f(x) is O(g(x))” Meaning: There exists some constant k such that f(x) ≤ k g(x) …as x goes to infinity Often used to describe upper bounds for both worst-case and average-case algorithm runtimes Runtime is a function: The number of operations performed, usually as a function of input size

Big-O notation For the prelim, you should know… Worst case Big-O complexity for the algorithms we’ve covered and for common implementations of ADT operations Examples Mergesort is worst-case O(n log n) PriorityQueue insert using a heap is O(log n) Average case time complexity for some algorithms and ADT operations, if it has been noted in class Quicksort is average case O(n log n) HashMap insert is average case O(1)

Big-O notation For the prelim, you should know… How to estimate the Big-O worst case runtimes of basic algorithms (written in Java or pseudocode) Count the operations Loops tend to multiply the loop body operations by the loop counter Trees and divide-and-conquer algorithms tend to introduce log(n) as a factor in the complexity Basic recursive algorithms, i.e., binary search or mergesort

Abstract Data Types What do we mean by “abstract”? Defined in terms of operations that can be performed, not as a concrete structure Example: Priority Queue is an ADT, Heap is a concrete data structure For ADTs, we should know: Operations offered, and when to use them Big-O complexity of these operations for standard implementations

ADTs:The Bag Interface interface Bag<E> { void insert(E obj); E extract(); //extract some element boolean isEmpty(); E peek(); // optional: return next element without removing } Examples: Queue, Stack, PriorityQueue

Queues First-In-First-Out (FIFO) Linked List implementation Objects come out of a queue in the same order they were inserted Linked List implementation insert(obj): O(1) Add object to tail of list Also called enqueue, add (Java) extract(): O(1) Remove object from head of list Also called dequeue, poll (Java)

Stacks Last-In-First-Out (LIFO) Linked List implementation Objects come out of a queue in the opposite order they were inserted Linked List implementation insert(obj): O(1) Add object to tail of list Also called push (Java) extract(): O(1) Remove object from head of list Also called pop (Java)

Priority Queues Objects come out of a Priority Queue according to their priority Generalized By using different priorities, can implement Stacks or Queues Heap implementation (as seen in lecture) insert(obj, priority): O(log n) insert object into heap with given priority Also called add (Java) extract(): O(log n) Remove and return top of heap (minimum priority element) Also called poll (Java)

Heaps Concrete Data Structure Balanced binary tree Obeys heap order invariant: Priority(child) ≥ Priority(parent) Operations insert(value, priority) extract()

Heap insert() Put the new element at the end of the array If this violates heap order because it is smaller than its parent, swap it with its parent Continue swapping it up until it finds its rightful place The heap invariant is maintained!

Heap insert() 4 6 14 21 8 19 35 22 38 55 10 20

Heap insert() 4 6 14 21 8 19 35 22 38 55 10 20 5

Heap insert() 4 6 14 21 8 5 35 22 38 55 10 20 19

Heap insert() 4 6 5 21 8 14 35 22 38 55 10 20 19

Heap insert() 4 6 5 21 8 14 35 22 38 55 10 20 19

insert() Time is O(log n), since the tree is balanced size of tree is exponential as a function of depth depth of tree is logarithmic as a function of size

extract() Remove the least element – it is at the root This leaves a hole at the root – fill it in with the last element of the array If this violates heap order because the root element is too big, swap it down with the smaller of its children Continue swapping it down until it finds its rightful place The heap invariant is maintained!

extract() 4 6 5 21 8 14 35 22 38 55 10 20 19

extract() 4 6 5 21 8 14 35 22 38 55 10 20 19

extract() 4 6 5 21 8 14 35 22 38 55 10 20 19

extract() 4 19 6 5 21 8 14 35 22 38 55 10 20

extract() 4 5 6 19 21 8 14 35 22 38 55 10 20

extract() 4 5 6 14 21 8 19 35 22 38 55 10 20

extract() 4 5 6 14 21 8 19 35 22 38 55 10 20

extract() 4 5 6 14 21 8 19 35 22 38 55 10 20

extract() 4 5 6 14 21 8 19 35 22 38 55 10 20

extract() 4 5 20 6 14 21 8 19 35 22 38 55 10

extract() 4 5 6 20 14 21 8 19 35 22 38 55 10

extract() 4 5 6 8 14 21 20 19 35 22 38 55 10

extract() 4 5 6 8 14 21 10 19 35 22 38 55 20

extract() 4 5 6 8 14 21 10 19 35 22 38 55 20

extract() Time is O(log n), since the tree is balanced

Store in an ArrayList or Vector Elements of the heap are stored in the array in order, going across each level from left to right, top to bottom The children of the node at array index n are found at 2n + 1 and 2n + 2 The parent of node n is found at (n – 1)/2

Sets ADT Set No duplicates allowed Operations: void insert(Object element); boolean contains(Object element); void remove(Object element); int size(); iteration No duplicates allowed Hash table implementation: O(1) insert and contains SortedSet tree implementation: O(log n) insert and contains A set makes no promises about ordering, but you can still iterate over it.

A HashMap is a particular implementation of the Map interface Dictionaries ADT Dictionary (aka Map) Operations: void insert(Object key, Object value); void update(Object key, Object value); Object find(Object key); void remove(Object key); boolean isEmpty(); void clear(); Think of: key = word; value = definition Where used: Symbol tables Wide use within other algorithms A HashMap is a particular implementation of the Map interface

A HashMap is a particular implementation of the Map interface Dictionaries Hash table implementation: Use a hash function to compute hashes of keys Store values in an array, indexed by key hash A collision occurs when two keys have the same hash How to handle collisions? Store another data structure, such as a linked list, in the array location for each key Put (key, value) pairs into that data structure insert and find are O(1) when there are no collisions Expected complexity Worst case, every hash is a collision Complexity for insert and find comes from the tertiary data structure’s complexity, e.g., O(n) for a linked list A HashMap is a particular implementation of the Map interface

For the prelim… Don’t spend your time memorizing Java APIs! If you want to use an ADT, it’s acceptable to write code that looks reasonable, even if it’s not the exact Java API. For example, Queue<Integer> myQueue = new Queue<Integer>(); myQueue.enqueue(5); … int x = myQueue.dequeue(); This is not correct Java (Queue is an interface! And Java calls enqueue and dequeue “add” and “poll”) But it’s fine for the exam.

Searching Find a specific element in a data structure Linear search, O(n) Linked lists Unsorted arrays Binary search, O(log n) Sorted arrays Binary Search Tree search O(log n) if the tree is balanced O(n) worst case

Sorting Comparison sorting algorithms Sort based on a ≤ relation on objects Examples QuickSort HeapSort MergeSort InsertionSort BubbleSort

QuickSort Given an array A to sort, choose a pivot value p Partition A into two sub-arrays, AX and AY AX contains only elements ≤ p AY contains only elements ≥ p Recursively sort sub-arrays separately Return AX + p + AY O(n log n) average case runtime O(n2) worst case runtime But in the average case, very fast! So people often prefer QuickSort over other sorts.

partition pivot QuickSort concatenate 20 31 24 19 45 56 4 65 5 72 14 99 pivot partition 5 19 14 4 31 72 56 65 45 24 99 20 4 5 14 19 24 31 45 56 65 72 99 QuickSort 4 5 14 19 20 24 31 45 56 65 72 99 concatenate

QuickSort Questions Key problems Partitioning in place How should we choose a pivot? How do we partition an array in place? Partitioning in place Can be done in O(n) Choosing a pivot Ideal pivot is the median, since this splits array in half Computing the median of an unsorted array is O(n), but algorithm is quite complicated Popular heuristics: Use first value in array (usually not a good choice) Use middle value in array Use median of first, last, and middle values in array Choose a random element

Heap Sort Insert all elements into a heap Extract all elements from the heap Each extraction returns the next largest element; they come out in sorted order! Heap insert and extract are O(log n) Repeated for n elements, we have O(n log n) worst-case runtime

Graphs Set of vertices (or nodes) V, set of edges E Number of vertices n = |V| Number of edges m = |E| Upper bound O(n2) on number of edges A complete graph has m = n(n-1)/2 Directed or undirected Directed edges have distinct head and tail Weighted edges Cycles and paths Connected components DAGs Degree of a node (in- and out- degree for directed graphs)

Graph Representation You should be able to write a Vertex class in Java and implement standard graph algorithms using this class However, understanding the algorithms is much more important than memorizing their code

Topological Sort A topological sort is a partial order on the vertices of a DAG Definition: If vertex A comes before B in topological sort order, then there is no path from B to A A DAG can have many topological sort orderings; they aren’t necessarily unique (except, say, for a singly linked list) No topological sort possible for a graph that contains cycles One algorithm for topological sorting: Iteratively remove nodes with no incoming edges until all nodes removed (see next slide) Can be used to find cycles in a directed graph If no remaining node has no incoming edges, there must be a cycle

Topological Sorting B A B C E D A E C D

Graph Searching Works on directed and undirected graphs You have a start vertex which you visit first You want to visit all vertices reachable from the start vertex For directed graphs, depending on your start vertex, some vertices may not be reachable You can traverse an edge from an already visited vertex to another vertex

Graph Searching Why is choosing any path on a graph risky? Cycles! Could traverse a cycle forever Need to keep track of vertices already visited No cycles if you do not visit a vertex twice Might also help to keep track of all unvisited vertices you can visit from a visited vertex

Graph Searching Add the start vertex to the collection of vertices to visit Pick a vertex from the collection to visit If you have already visited it, do nothing If you have not visited it: Visit that vertex Follow its edges to neighboring vertices Add unvisited neighboring vertices to the set to visit (You may add the same unvisited vertex twice) Repeat until there are no more vertices to visit

Graph Searching Runtime analysis Visit each vertex only once When you visit a vertex, you traverse its edges You traverse all edges once on a directed graph Twice on an undirected graph At worst, you add a new vertex to the collection to visit for each edge (collection has size of O(m)) Lower bound is Ω(n + m) Actual results depends on cost to add/delete vertices to/from the collection of vertices to visit

Graph Searching Depth-first search and breadth-first search are two graph searching algorithms DFS pushes vertices to visit onto a stack Examines a vertex by popping it off the stack BFS uses a queue instead Both have O(n + m) running time Push/enqueue and pop/dequeue have O(1) time

Graph Searching: Pseudocode Node search(Node startNode, List<Node> nodes) { BagType<Node> bag = new BagType<Node>(); Set<Node> visited = new Set<Node>(); bag.insert(startNode); while(!bag.isEmpty()) { node = bag.extract(); if(visited.contains(node)) continue; // Already visited if(found(node)) return node; // Search has found its goal visited.add(node); // Mark node as visited for(Node neighbor : node.getNeighbors()) bag.insert(neighbor); } If generic type BagType is a Queue, this is BFS If it’s a Stack, this is DFS

Graph Searching: DFS B A B E D C A E C B-E E-D B-C A-B ∅-A D Stack

Graph Searching: BFS B A B C E D E-D A E C C-D B-E A-B B-C ∅-A D Queue

Spanning Trees A spanning tree is a subgraph of an undirected graph that: Is a tree Contains every vertex in the graph Number of edges in a tree m = n-1

Minimum Spanning Trees (MST) Spanning tree with minimum sum edge weights Prim’s algorithm Kruskal’s algorithm Not necessarily unique

Prim’s algorithm Graph search algorithm, builds up a spanning tree from one root vertex Like BFS, but it uses a priority queue Priority is the weight of the edge to the vertex Also need to keep track of which edge we used Always picks smallest edge to an unvisited vertex Runtime is O(m log m) O(m) Priority Queue operations at log(m) each

Prim’s Algorithm B E A D G C F C-D 3 A-C 2 ∅-A C-B 4 A-B 5 B-E 7 D-E 8 9 8 A D G 12 4 3 2 C F 1 10 C-D 3 A-C 2 ∅-A C-B 4 A-B 5 B-E 7 D-E 8 E-G 9 G-F 1 D-F 10 D-G 12 Priority Queue

Kruskal’s Algorithm Idea: Find MST by connecting forest components using shortest edges Process edges from least to greatest Initially, every node is its own component Either an edge connects two different components or it connects a component to itself Add an edge only in the former case Picks smallest edge between two components O(m log m) time to sort the edges Also need the union-find structure to keep track of components, but it does not change the running time

Kruskal’s Algorithm B E A D G C F 7 5 9 8 A D G 12 4 3 2 C F 1 10 Edges are darkened when added to the tree

Dijkstra’s Algorithm Compute length of shortest path from source vertex to every other vertex Works on directed and undirected graphs Works only on graphs with non-negative edge weights O(m log m) runtime when implemented with Priority Queue, same as Prim’s

Dijkstra’s Algorithm Similar to Prim’s algorithm Difference lies in the priority Priority is the length of shortest path to a visited vertex + cost of edge to unvisited vertex We know the shortest path to every visited vertex On unweighted graphs, BFS gives us the same result as Dijkstra’s algorithm

Dijkstra’s Algorithm B E A D G C F A-C 2 ∅-A A-B 5 C-D 5 C-B 6 B-E 12 7 5 9 8 A D G 15 4 3 2 C 5 F 1 16 10 2 15 A-C 2 ∅-A A-B 5 C-D 5 C-B 6 B-E 12 D-E 13 F-G 16 D-F 15 D-G 20 E-G 21 Priority Queue

Dijkstra’s Algorithm Computes shortest path lengths Optionally, can compute shortest path as well Normally, we store the shortest distance to the source for each node But if we also store a back-pointer to the neighbor nearest the source, we can reconstruct the shortest path from any node to the source by following pointers Example (from previous slide): When edge [B-E 12] is removed from the priority queue, we mark vertex E with distance 12. We can also store a pointer to B, since it is the neighbor of E closest to the source A.