Compsci 201 Trees, Tries, Tradeoffs


Compsci 201 Trees, Tries, Tradeoffs
Owen Astrachan, Jeff Forbes
November 1, 2017
11/1/17 Compsci 201, Fall 2017, Trees, Tries, Tradeoffs

R is for …
R: Programming language of choice in Stats
Random: From Monte-Carlo to [0,1)
Recursion: Base case of wonderful stuff
Refactoring: Better not different

Plan for the Day
Tree Review
  From Theory to Practice
  From Recurrences to Code
What are the main ideas in DNA LinkStrand
  Why Splicing matters with low level links
Trees, Tries, Tradeoffs

From Links to …
What is the DNA/LinkStrand assignment about?
Why do we study linked lists?
How do you work in a group?

Basics of cutAndSplice
Find enzyme like ‘gat’
Replace with splicee like ‘gggtttaaa’
Strings and StringBuilder for creating new strings
Complexity of “hello” + “world”, or A+B
  String: proportional to A + B (both are copied into a new string)
  StringBuilder: proportional to B (only the appended characters are copied, amortized)

What do linked lists get us?
Faster run-time, much better use of memory
We splice in constant time?
Re-use strings

String Concatenation Examined
https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/StringPlay.java
Runtime of stringConcat(“hello”,N)
  Depends on size of ret: 5, 10, 15, …, 5*N
  Total is 5(1 + 2 + … + N), which is O(N^2)

    public String stringConcat(String s, int reps) {
        String ret = "";
        for (int k = 0; k < reps; k++) {
            ret += s;   // builds a new String from all of ret plus s
        }
        return ret;
    }

StringBuilder Examined
https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/StringPlay.java
Runtime of builderConcat(“hello”,N)
  5 + 5 + 5 + … + 5, a total of N times, which is O(N)

    public String builderConcat(String s, int reps) {
        StringBuilder ret = new StringBuilder();
        for (int k = 0; k < reps; k++) {
            ret.append(s);   // copies only the characters of s
        }
        return ret.toString();
    }
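
The two counts above (5, 10, 15, …, 5*N versus 5 + 5 + … + 5) can be compared with a small cost model. This is only a sketch: the class name ConcatCostSketch and the two charsCopied methods are hypothetical and not part of StringPlay.java, and the model assumes every += copies the whole result so far while ignoring StringBuilder buffer resizing.

```java
// Hypothetical cost model for the two concatenation strategies on the slides.
class ConcatCostSketch {
    // Same loop as the slide: every += builds a brand-new String.
    static String stringConcat(String s, int reps) {
        String ret = "";
        for (int k = 0; k < reps; k++) {
            ret += s;
        }
        return ret;
    }

    // Same loop as the slide: append copies only the new characters.
    static String builderConcat(String s, int reps) {
        StringBuilder ret = new StringBuilder();
        for (int k = 0; k < reps; k++) {
            ret.append(s);
        }
        return ret.toString();
    }

    // Modeled characters written by +=: len + 2*len + ... + reps*len.
    static long charsCopiedByString(int len, int reps) {
        long total = 0;
        for (int k = 1; k <= reps; k++) {
            total += (long) k * len;
        }
        return total;
    }

    // Modeled characters written by append: len, reps times (resizing ignored).
    static long charsCopiedByBuilder(int len, int reps) {
        return (long) len * reps;
    }

    public static void main(String[] args) {
        for (int reps = 1000; reps <= 8000; reps *= 2) {
            System.out.println(reps + " reps: String model " + charsCopiedByString(5, reps)
                    + ", StringBuilder model " + charsCopiedByBuilder(5, reps));
        }
    }
}
```

Doubling reps roughly quadruples the String count and doubles the StringBuilder count, which is the quadratic versus linear behavior the slides describe.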

Theory and Practice
The JVM can sometimes optimize your code
  Don’t optimize what you don’t have to …
http://www.pellegrino.link/2015/08/22/string-concatenation-with-java-8.html
WOTO http://bit.ly/201fall17-nov1-strings

danah boyd
Dr. danah boyd is a Senior Researcher at Microsoft Research, … a Visiting Researcher at Harvard Law School, … Her work examines everyday practices involving social media, with specific attention to youth engagement, privacy, and risky behaviors. She recently co-authored Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media.
"From day one, Mark Zuckerberg wanted Facebook to become a social utility. He succeeded. Facebook is now a utility for many. The problem with utilities is that they get regulated." http://bit.ly/ySwjyl

Trees: no data structure lovelier?

Trees from Bottom to Top
Trees!! Trying to get the best of a few worlds:
  efficient lookup: like sorted array
  efficient add: like linked list
  range-queries: see java.util.NavigableSet
Reminder: hashing is really, really fast
  O(1) add, search, delete independent of N
  BUT! No order info, worst case can be bad

Binary Trees
Binary trees: not just for searching, used in many contexts
  Game trees, collisions, …
  Cladistics, genomics, quad trees, …
Search is O(log n) like sorted array
  Average case. Note: worst case can also be O(log n), e.g., use a balanced tree
Insertion/deletion O(1), once location found

How do search trees work?
Change doubly-linked lists, no longer linear
Similar to binary search: everything less goes left, everything greater goes right
How do we search?
How do we insert?
  Insert: “koala”
[Figure: search tree with nodes “llama”, “tiger”, “monkey”, “jaguar”, “elephant”, “giraffe”, “pig”, “hippo”, “leopard”; animation of inserting “koala”]

Review: tree terminology
[Figure: binary tree with nodes A, B, C, D, E, F, G]
Binary tree is a structure:
  empty
  root node with left and right subtrees
Tree Terminology
  parent and child: A is parent of B, E is child of B
  leaf node has no children, internal node has 1 or 2 children
  path is sequence of nodes (edges), N1, N2, …, Nk
    Ni is parent of Ni+1
  depth (level) of node: length of root-to-node path
    level of root is 1 (measured in nodes)
  height of node: length of longest node-to-leaf path
    height of tree is height of root

A TreeNode by any other name…
What does this look like? Doubly linked list?

    public class TreeNode {
        TreeNode left;
        TreeNode right;
        String info;
        TreeNode(String s, TreeNode llink, TreeNode rlink) {
            info = s;
            left = llink;
            right = rlink;
        }
    }

[Figure: nodes “llama”, “tiger”, “giraffe”]

Tree function: Tree height
Compute tree height (longest root-to-leaf path)

    int height(TreeNode root) {
        if (root == null) return 0;
        else {
            return 1 + Math.max(height(root.left),
                                height(root.right));
        }
    }

Find height of left subtree, height of right subtree
  Use results to determine height of tree

Tree function: Leaf Count
Calculate Number of Leaf Nodes

    int leafCount(TreeNode root) {
        if (root == null) return 0;
        if (root.left == null && root.right == null) return 1;
        return leafCount(root.left) + leafCount(root.right);
    }

Similar to height, but has two base cases
  Use results of recursive calls to determine # leaves
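
Both functions can be tried on the six-node animal tree used on the traversal slides. The sketch below is an illustration, not course code: the wrapper class TreeFunctionsSketch and the helper sampleTree are hypothetical names, and the tree shape is the one implied by the inorder and preorder listings in this deck.

```java
// Hypothetical wrapper class; TreeNode, height and leafCount follow the slides.
class TreeFunctionsSketch {
    static class TreeNode {
        TreeNode left;
        TreeNode right;
        String info;
        TreeNode(String s, TreeNode llink, TreeNode rlink) {
            info = s;
            left = llink;
            right = rlink;
        }
    }

    // Height measured in nodes: an empty tree has height 0.
    static int height(TreeNode root) {
        if (root == null) return 0;
        return 1 + Math.max(height(root.left), height(root.right));
    }

    // Two base cases: empty tree (0 leaves) and a single leaf (1 leaf).
    static int leafCount(TreeNode root) {
        if (root == null) return 0;
        if (root.left == null && root.right == null) return 1;
        return leafCount(root.left) + leafCount(root.right);
    }

    // llama at the root; giraffe (elephant, jaguar) on the left; tiger (monkey) on the right.
    static TreeNode sampleTree() {
        TreeNode giraffe = new TreeNode("giraffe",
                new TreeNode("elephant", null, null),
                new TreeNode("jaguar", null, null));
        TreeNode tiger = new TreeNode("tiger",
                new TreeNode("monkey", null, null), null);
        return new TreeNode("llama", giraffe, tiger);
    }

    public static void main(String[] args) {
        TreeNode root = sampleTree();
        System.out.println("height = " + height(root));
        System.out.println("leaves = " + leafCount(root));
    }
}
```

On this tree the height is 3 (llama, giraffe, elephant) and there are 3 leaves (elephant, jaguar, monkey).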

Tree functions Analyzed

    int height(TreeNode root) {
        if (root == null) return 0;
        else {
            return 1 + Math.max(height(root.left),
                                height(root.right));
        }
    }

Let T(n) be time for height to run on n-node tree
  T(n) = 2T(n/2) + O(1) - roughly balanced
  T(n) = T(n-1) + T(1) + O(1) = T(n-1) + O(1) - unbalanced

Good Search Trees and Bad Trees http://www.9wy.net/onlinebook/CPrimerPlus5/ch17lev1sec7.html

Bad Trees and Good Trees

Don’t do this at home?
Let T(n) be time for height to execute (n-node tree)
T(n) = T(n/2) + T(n/2) + O(1)
T(n) = 2T(n/2) + 1
T(n) = 2[2T(n/4) + 1] + 1
T(n) = 4T(n/4) + 2 + 1
T(n) = 8T(n/8) + 4 + 2 + 1, eureka!
T(n) = 2^k T(n/2^k) + 2^k - 1, why true?
T(n) = nT(1) + O(n) is O(n) if we let n = 2^k
Different recurrence, same solution if unbalanced

Recurrence Table
Recurrence               Algorithm            Solution
T(n) = T(n/2) + O(1)     Binary search        O(log n)
T(n) = T(n-1) + O(1)     Sequential search    O(n)
T(n) = T(n-1) + O(n)     Selection sort       O(n^2)
T(n) = T(n/2) + O(n)     Find median or kth   O(n)
T(n) = 2T(n/2) + O(1)    Tree traversal       O(n)
T(n) = 2T(n/2) + O(n)    Quick or merge sort  O(n log n)
T(n) = 2T(n-1) + O(1)    Towers of Hanoi      O(2^n)
Develop recurrence, look up solution
Remember: goal is big-Oh, recurrence helps

Balanced Trees and Complexity
A tree is height-balanced if
  Left and right subtrees are height-balanced
  Left and right heights differ by at most one

    boolean isBalanced(TreeNode root) {
        if (root == null) return true;
        return isBalanced(root.left) &&
               isBalanced(root.right) &&
               Math.abs(height(root.left) - height(root.right)) <= 1;
    }

What is complexity?
Assume trees “balanced” in analyzing complexity
  Roughly half the nodes in each subtree
  Leads to easier analysis
How to develop recurrence relation?
  What is T(n)? Time func executes on n-node tree
  What other work? Express recurrence, solve it
How to solve recurrence relation
  Plug, expand, plug, expand, find pattern
  Proof requires induction to verify correctness

Recurrence relation
T(n): time for isBalanced to execute (n-node tree)
T(n) = T(n/2) + T(n/2) + O(n)
T(n) = 2T(n/2) + n
T(n) = 2[2T(n/4) + n/2] + n
T(n) = 4T(n/4) + n + n = 4T(n/4) + 2n
T(n) = 8T(n/8) + 3n, eureka!
T(n) = 2^k T(n/2^k) + kn, why true?
T(n) = nT(1) + n log(n), let n = 2^k, so k = log n
So, solution for T(n) = 2T(n/2) + O(n) is O(n log n) -- base 2, but base doesn't matter
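
The two closed forms can be checked numerically by evaluating the recurrences directly. This is a sketch under stated assumptions: T(1) = 1, the O(1) and O(n) terms are taken as exactly 1 and n, and n is a power of two. The class and method names are hypothetical.

```java
// Hypothetical helper that evaluates the two recurrences from the slides.
class RecurrenceSketch {
    // T(n) = 2T(n/2) + 1 with T(1) = 1 (the height recurrence).
    static long constantWork(long n) {
        return n <= 1 ? 1 : 2 * constantWork(n / 2) + 1;
    }

    // T(n) = 2T(n/2) + n with T(1) = 1 (the isBalanced recurrence).
    static long linearWork(long n) {
        return n <= 1 ? 1 : 2 * linearWork(n / 2) + n;
    }

    public static void main(String[] args) {
        for (long n = 1; n <= 1024; n *= 2) {
            long k = Long.numberOfTrailingZeros(n);   // k = log2(n) for powers of two
            System.out.println("n=" + n
                    + "  2T(n/2)+1: " + constantWork(n) + " vs 2n-1 = " + (2 * n - 1)
                    + "  2T(n/2)+n: " + linearWork(n) + " vs n*k+n = " + (n * k + n));
        }
    }
}
```

With these assumptions the first recurrence matches 2^k T(1) + 2^k - 1 = 2n - 1 and the second matches nT(1) + n log n, so the printed pairs agree on every line.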

Printing a search tree in order
When is root printed?
  After left subtree, before right subtree.

    void visit(TreeNode t) {
        if (t != null) {
            visit(t.left);
            System.out.println(t.info);
            visit(t.right);
        }
    }

Inorder traversal
[Figure: search tree with nodes “llama”, “tiger”, “monkey”, “jaguar”, “elephant”, “giraffe”, “pig”, “hippo”, “leopard”]

Tree traversals
Inorder visits search tree in order
  Visit left-subtree, process root, visit right-subtree
  elephant, giraffe, jaguar, llama, monkey, tiger
Navigate following nodes/links?
  Visit on passing under
  Second time by node
[Figure: search tree with nodes “llama”, “tiger”, “monkey”, “jaguar”, “elephant”, “giraffe”]

Tree traversals
Preorder good for reading/writing trees
  Process root, then visit left-subtree, visit right-subtree
  llama, giraffe, elephant, jaguar, tiger, monkey
Navigate following nodes/links?
  Visit on passing-by to left
  First time by node
[Figure: search tree with nodes “llama”, “tiger”, “monkey”, “jaguar”, “elephant”, “giraffe”]

Tree traversals
Postorder good for deleting tree
  Visit left-subtree, visit right-subtree, process root
  elephant, jaguar, giraffe, monkey, tiger, llama
Navigate following nodes/links?
  Visit on passing up
  Third time by node
[Figure: search tree with nodes “llama”, “tiger”, “monkey”, “jaguar”, “elephant”, “giraffe”]
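
The three traversal orders differ only in where the root is processed relative to the two recursive calls. A minimal sketch, assuming the six-node tree implied by the listings above; the class name TraversalSketch and the helper sampleTree are hypothetical, and results are collected in a list instead of being printed directly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class showing inorder, preorder and postorder side by side.
class TraversalSketch {
    static class TreeNode {
        TreeNode left;
        TreeNode right;
        String info;
        TreeNode(String s, TreeNode llink, TreeNode rlink) {
            info = s;
            left = llink;
            right = rlink;
        }
    }

    // Left subtree, then root, then right subtree.
    static void inorder(TreeNode t, List<String> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.info);
        inorder(t.right, out);
    }

    // Root first, then left subtree, then right subtree.
    static void preorder(TreeNode t, List<String> out) {
        if (t == null) return;
        out.add(t.info);
        preorder(t.left, out);
        preorder(t.right, out);
    }

    // Left subtree, then right subtree, root last.
    static void postorder(TreeNode t, List<String> out) {
        if (t == null) return;
        postorder(t.left, out);
        postorder(t.right, out);
        out.add(t.info);
    }

    static TreeNode sampleTree() {
        TreeNode giraffe = new TreeNode("giraffe",
                new TreeNode("elephant", null, null),
                new TreeNode("jaguar", null, null));
        TreeNode tiger = new TreeNode("tiger",
                new TreeNode("monkey", null, null), null);
        return new TreeNode("llama", giraffe, tiger);
    }

    public static void main(String[] args) {
        TreeNode root = sampleTree();
        List<String> in = new ArrayList<>();
        List<String> pre = new ArrayList<>();
        List<String> post = new ArrayList<>();
        inorder(root, in);
        preorder(root, pre);
        postorder(root, post);
        System.out.println("inorder:   " + in);
        System.out.println("preorder:  " + pre);
        System.out.println("postorder: " + post);
    }
}
```

Running main reproduces the three orderings listed on the traversal slides.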

Tree WOTO
http://bit.ly/201f17-nov1-trees
Pride in social group? Urban dictionary?

What does insertion look like?
Simple recursive insertion into tree (accessed by root)

    root = insert(root, "foo");

    TreeNode insert(TreeNode t, String s) {
        if (t == null) t = new TreeNode(s, null, null);
        else if (s.compareTo(t.info) <= 0)
            t.left = insert(t.left, s);
        else
            t.right = insert(t.right, s);
        return t;
    }

Notes on tree insert and search
In each recursive insert call
  Tree parameter in the call is either the left or right field of some node in the original tree
  Will be assignment to a .left or .right field!
  Idiom t = treeMethod(t,…) used
Good trees go bad, what happens and why?
  Insert alpha, beta, gamma, delta, epsilon, …
https://coursework.cs.duke.edu/201fall17/d9-linked-trees/blob/master/src/TreePlay.java
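
To see a good tree go bad, the same five words can be inserted in two different orders and the resulting heights compared. The sketch below is not TreePlay.java; the class name InsertOrderSketch and the helper build are hypothetical. With String.compareTo ordering, the order alpha, beta, gamma, delta, epsilon gives every node at most one child.

```java
// Hypothetical demo: insertion order determines the shape of the search tree.
class InsertOrderSketch {
    static class TreeNode {
        TreeNode left;
        TreeNode right;
        String info;
        TreeNode(String s, TreeNode llink, TreeNode rlink) {
            info = s;
            left = llink;
            right = rlink;
        }
    }

    // Recursive insert using the t = treeMethod(t, ...) idiom.
    static TreeNode insert(TreeNode t, String s) {
        if (t == null) return new TreeNode(s, null, null);
        if (s.compareTo(t.info) <= 0) t.left = insert(t.left, s);
        else t.right = insert(t.right, s);
        return t;
    }

    static int height(TreeNode root) {
        if (root == null) return 0;
        return 1 + Math.max(height(root.left), height(root.right));
    }

    // Builds a tree by inserting the words in the order given.
    static TreeNode build(String... words) {
        TreeNode root = null;
        for (String w : words) {
            root = insert(root, w);
        }
        return root;
    }

    public static void main(String[] args) {
        TreeNode chain = build("alpha", "beta", "gamma", "delta", "epsilon");
        TreeNode bushy = build("delta", "beta", "epsilon", "alpha", "gamma");
        System.out.println("slide order height:  " + height(chain));
        System.out.println("other order height:  " + height(bushy));
    }
}
```

The slide's order produces a chain of height 5, so search degrades toward a linked list, while the second order holds the same five words in a tree of height 3.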

Insert and Removal
For insertion we can use iteration (see BSTSet)
  Traverse left or right and “look ahead” to add
Removal is tricky, depends on number of children
  Straightforward when zero or one child
  Complicated when two children, find successor
    See set code for complete cases
    If right child, straightforward
    Otherwise find node that’s left child of its parent (why?)
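
Iterative insertion with "look ahead" can be sketched as follows. This is an assumption about what such a loop might look like, not the course's BSTSet code; the class name IterativeInsertSketch and the inorder helper are hypothetical.

```java
// Hypothetical sketch of iterative insertion; not taken from BSTSet.
class IterativeInsertSketch {
    static class TreeNode {
        TreeNode left;
        TreeNode right;
        String info;
        TreeNode(String s, TreeNode llink, TreeNode rlink) {
            info = s;
            left = llink;
            right = rlink;
        }
    }

    // Walk down from the root, checking the child link before stepping into it.
    static TreeNode insert(TreeNode root, String s) {
        TreeNode fresh = new TreeNode(s, null, null);
        if (root == null) return fresh;
        TreeNode current = root;
        while (true) {
            if (s.compareTo(current.info) <= 0) {
                if (current.left == null) {      // look ahead: empty spot on the left
                    current.left = fresh;
                    return root;
                }
                current = current.left;
            } else {
                if (current.right == null) {     // look ahead: empty spot on the right
                    current.right = fresh;
                    return root;
                }
                current = current.right;
            }
        }
    }

    // Space-separated inorder listing, used only to check the result.
    static String inorder(TreeNode t) {
        if (t == null) return "";
        return inorder(t.left) + t.info + " " + inorder(t.right);
    }

    public static void main(String[] args) {
        TreeNode root = null;
        for (String w : new String[] {"llama", "giraffe", "tiger", "elephant", "jaguar", "monkey"}) {
            root = insert(root, w);
        }
        System.out.println(inorder(root));
    }
}
```

The loop never steps onto a null reference: it inspects the child link first and attaches the new node there, which is why no parent pointer or recursion is needed.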

Wordladder Story
Ladder from ‘white’ to ‘house’
  white, while, whale, shale, …
I can do that… optimally
  My brother was an English major
  My ladder is 16, his is 15, how?
There's a ladder that's 14 words!
  The key is ‘sough’
Guarantee optimality!
  QUEUE
I heard the puzzle on the way into Duke in the mid 90's. I called my brother and said I'd solve the problem. He said he would too. He used his brain, I wrote a program. Called him an hour later and said "got it". He said "got it". I said my code was provably optimal and had a 16-word ladder. He said he had 15. WHAT? I added "sough" to dictionary and got a 14-word ladder.
10/20/17 Compsci 201, Fall 2017, Linked Lists & More

Queue for shortest path

    public boolean findLadder(String[] words, String first, String last) {
        Queue<String> qu = new LinkedList<>();
        Set<String> set = new HashSet<>();
        qu.add(first);
        while (qu.size() > 0) {
            String current = qu.remove();
            if (oneAway(current, last)) return true;
            for (String s : words) {
                if (!set.contains(s) && oneAway(current, s)) {
                    qu.add(s);
                    set.add(s);   // never enqueue a word twice
                }
            }
        }
        return false;
    }

Shortest Path reprised
How does Queue ensure we find shortest path?
  Where are words one away from first?
  Where are words two away from first?
Why do we need to avoid revisiting a word, when?
  Why do we use a set for this? Why a HashSet?
  Alternatives?
If we want the ladder, not just whether it exists
  What's path from white to house? We know there is one.
All words one-away from first are on Q and are added when first is dequeued/removed first time through loop. Each of these 1-away-from-start words comes off the queue and words that are one-away from these, or 2-away from start, are put on the Q. BUT it's a queue so all 2-away words go on after the 1-away. In general, we remove N-away words before (N+1)-away words because of FIFO structure.

Shortest path proof
All words one away from start on queue first iteration
  What is first value of current when loop entered?
All one-away words dequeued before two-away
  See previous assertion, property of queues
  Two-away before 3-away, …
Each 2-away word is one away from a 1-away word
  So all enqueued after one-away, before three-away
Any w seen/dequeued that's n-away is:
  Seen before every (n+k)-away word for k >= 1!
Don't dwell on this, but walk through it as a semi-formal argument, akin to a hand-wavy proof.

Keeping track of ladder
Find w, a one-away word from current
Enqueue w if not seen
  Call map.put(w,current)
  Remember keys are unique!
Put word on queue once!
  map.put("lot", "hot")
  map.put("dot", "hot")
  map.put("hat", "hot")
Remind students that we need to keep track of how words are connected to reconstruct the ladder. Keys in a map are unique, but we only put a word on the queue once! Why? Using a set. That one time is also when we make the word a key in the map. The value is what caused the word to be put onto the queue, see examples.

Reconstructing Word Ladder
Run WordLaddersFull
  https://coursework.cs.duke.edu/201fall17/wordladders/blob/master/src/WordLaddersFull.java
See map and call to map.put(word,current)
  What about when returning the ladder, why is the returned ladder in reverse order?
  What do we know about code when statement adding (key,value) to map runs?
Run program with "white" "house" and with "voter" "crazy" and with "voter" "smart"
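
The queue plus map idea can be sketched end to end on a toy dictionary. This is not WordLaddersFull.java: the class name LadderSketch, the oneAway implementation, and the small word list are assumptions made for illustration. Following the map from the last word back toward the first naturally yields the ladder in reverse, so this sketch adds each word at the front of the result.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch of breadth-first search with a predecessor map.
class LadderSketch {
    // Assumed helper: same length and exactly one differing character.
    static boolean oneAway(String a, String b) {
        if (a.length() != b.length()) return false;
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) diff++;
        }
        return diff == 1;
    }

    // Returns a shortest ladder from first to last, or an empty list if none exists.
    static List<String> findLadder(String[] words, String first, String last) {
        Queue<String> qu = new LinkedList<>();
        Map<String, String> map = new HashMap<>();   // word -> word that put it on the queue
        qu.add(first);
        map.put(first, null);                        // first has no predecessor
        while (!qu.isEmpty()) {
            String current = qu.remove();
            if (oneAway(current, last)) {
                LinkedList<String> ladder = new LinkedList<>();
                ladder.addFirst(last);
                // Walk predecessors back to first; addFirst undoes the reverse order.
                for (String w = current; w != null; w = map.get(w)) {
                    ladder.addFirst(w);
                }
                return ladder;
            }
            for (String s : words) {
                if (!map.containsKey(s) && oneAway(current, s)) {
                    map.put(s, current);             // keys are unique: each word enqueued once
                    qu.add(s);
                }
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        String[] words = {"hot", "dot", "dog", "lot", "log", "cog"};
        System.out.println(findLadder(words, "hit", "cog"));
    }
}
```

Here the map doubles as the "seen" set: a word becomes a key at the moment it is enqueued, so containsKey answers whether it has been visited.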