An Introduction to Programming through C++

Presentation transcript:

An Introduction to Programming through C++, Abhiram G. Ranade. Ch. 24: Structural Recursion (Part 1)

The Dictionary problem
Store a set S on a computer such that insertions and lookups can be performed quickly.
  S.insert(e): insert e into the set S (modifying S).
  S.find(e): return true iff e is a member of S.
We have seen 3 ways of storing sets:
  1. Elements stored in sorted order in an array/vector.
  2. The same, but unsorted.
  3. Elements stored in a heap/priority queue.
For all 3, either insert or find takes O(n) time, i.e. time proportional to the size n of the set.
Goal: both operations should be fast, i.e. take O(log n) time.
Solution: the binary search tree, described next. The same idea is used in the Standard Library classes set and map.

A technical assumption
Elements to be stored in the set are "ordered", i.e. they can be compared using the operator <.
  Numbers are ordered.
  string objects are also ordered (lexicographical order).
You will soon see how this assumption helps.

Overview
The set is represented as a binary tree, called a "binary search tree".
  A binary search tree is a type of graph.
  A binary search tree is represented on a computer in a natural way.
Inserting into the set, finding an element:
  First understand these as mathematical operations on the graph.
  Then we see procedures which operate on the computer representation.

Binary search tree
A rooted tree in which:
  Each vertex has at most 2 children (left and right).
  An element is stored at each vertex (a "vertex attribute").
Search tree property: for all vertices v,
  elements stored in the left subtree of v < element stored at v < elements stored in the right subtree of v.

Binary search trees can be used to represent sets
S = {18, 34, 40, 56, 70} can be represented using many search trees.
[Figure: two different binary search trees, both storing 18, 34, 40, 56, 70]

More on binary search trees
A vertex, its children, their children, and so on are said to constitute the subtree of the vertex.
The left subtree of a vertex is the subtree of its left child; similarly for the right.
We can have an empty tree, i.e. a tree without even one vertex. This will be useful to represent empty sets.
Subtrees of a search tree are also search trees.
  Recursive structure. Hence recursive algorithms!

Example
Right subtree of 56 = subtree of 70: a BST.
Left subtree of 56 = subtree of 18: a BST.
[Figure: a binary search tree with root 56 storing 18, 34, 40, 56, 60, 70]

Computer implementation: rooted tree
A tree node can be stored as a struct. It will hold the value associated with it, and pointers to its left and right children.
    struct Node{
      Node *left, *right;
      int value;
    };
If the left or the right child is absent, the corresponding pointer is NULL.
We could instead have used: vector<Node*> children;

Computer implementation: set
Set = tree. How do I refer to a tree?
Refer to it using the root node. Once you are given the root node, you can follow the left and right pointers to get to the other nodes.
However, with this, the empty set cannot be represented.
Better: identify a set using a pointer to the root of the tree holding the elements. If the set is empty, the pointer will be NULL.

A sample main program
    int main(){
      Node* myset = NULL;
      insert(myset, 40);
      insert(myset, 20);
      cout << "Finding 30: " << find(myset, 30) << endl; // Should print 0.
      cout << "Finding 40: " << find(myset, 40) << endl; // Should print 1.
    }

Implementing find(myset, x)
If myset == NULL, the set is empty, so we can return false immediately.
Else we know that the values stored in the set represented by myset consist of:
  1. the values in the subtree of myset->left,
  2. myset->value,
  3. the values in the subtree of myset->right.
Because of the search tree property, the values in 1 are smaller than the value 2, which in turn is smaller than the values in 3.
If x == myset->value, we can return true.
If x < myset->value, then x, if present, must be in the subtree of myset->left. So we search that recursively.
If x > myset->value, then x, if present, must be in the subtree of myset->right. So we search that recursively.

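Code for find
    bool find(Node* myset, int x){
      if(myset == NULL) return false;       // empty set: x is not present
      if(x == myset->value) return true;    // found at this node
      if(x < myset->value)
        return find(myset->left, x);        // x can only be in the left subtree
      else
        return find(myset->right, x);       // x can only be in the right subtree
    }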

Specification of insert(myset, x)
Preconditions ("what holds before"):
  myset must point to the root of a BST, say T0.
  x must be an integer.
Postconditions ("what must hold after"):
  myset must point to the root of a BST, say T1.
  Nodes in T1 should contain the values held in the nodes of T0, and also x.

Design choice
The postconditions do not determine the shape of T1.
Our choice: modify T0 as little as possible.
  Nodes and edges of T0 don't change.
  A new leaf is created, at a suitable position, to hold x.
Other, better choices exist: discussed later.

Inserting x into an empty set
Initially myset == NULL.
Create a new node and place x there.
Let myset point to the new node.
myset gets modified within insert(), so we must pass myset by reference:
    void insert(Node* &myset, int x);

Inserting x into a non-empty set
Initially myset != NULL.
If x == myset->value, we can return immediately; we don't want to insert anything twice.
If x < myset->value, we must insert x into the subtree of myset->left.
  This can be done recursively, because the subtree of myset->left is also a search tree.
  Correctness: assume insert works on smaller trees, e.g. the left subtree or the right subtree. This ensures that the call insert(myset->left, x) will correctly modify the left subtree. But this is what we need for the overall tree as well.
  Base case: myset is empty, so a node must be created.
If x > myset->value, we must insert x into the subtree of myset->right.
  Correctness: similar.

The implementation of insert
    void insert(Node* &myset, int x){
      if(myset == NULL){                     // empty set: create the new leaf
        myset = new Node;
        myset->left = myset->right = NULL;
        myset->value = x;
      }
      else{
        if(x == myset->value) return;        // already present
        else if(x < myset->value) insert(myset->left, x);
        else insert(myset->right, x);
      }
    }

Our sample main program will now run
    int main(){
      Node* myset = NULL;
      insert(myset, 40);
      insert(myset, 20);
      cout << "Finding 30: " << find(myset, 30) << endl; // Should print 0.
      cout << "Finding 40: " << find(myset, 40) << endl; // Should print 1.
    }

Time analysis for find
Count the number of times each statement is executed.
How to count without confusion:
  When a statement is executed in the recursive call find(v, x), place a rupee on vertex v of the tree.
  Count the total amount of money we have placed on the tree.

Time analysis for find (contd.)
When find executes, the vertices on some root-to-leaf path are visited, i.e. appear as the first argument of find.
On each vertex, we place some fixed number of rupees, e.g. 1 for the first statement, 1 for the second, ...
When we make the recursive call:
  We must count the time spent in copying the arguments and creating the activation frame of the call, and place some rupees for it on the current vertex.
  For the time spent inside the recursive call itself, the rupees will be placed on the child vertex, and so on.
We place a fixed number of rupees (independent of the tree height and the number of nodes) on each visited vertex.
The total number of rupees is O(path length).
Time = O(length of the longest path), for the worst x.

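Time taken by find
The number of calls to find = number of nodes visited.
Example: find 9 in the set {1, 2, 4, 5, 6, 7, 9}. find will be called on every node.
[Figure: a search tree that is a single chain 1, 2, 4, 5, 6, 7, 9, with 1 at the root and each node having only a right child]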

Time taken by find
The number of calls to find = number of nodes visited.
Example: find 9 in the set {1, 2, 4, 5, 6, 7, 9}. find will be called on nodes 5, 7, 9.
[Figure: a balanced search tree with root 5, children 2 and 7, and leaves 1, 4, 6, 9]

Time taken by find
Worst case (the tree is a single long path): time taken = O(n), where n is the number of nodes.
Best case (the tree is balanced): time taken = O(log n).
Analysis for insert: similar.
Since we care about the worst case, this looks bad.
Remark 1: With a cleverer insertion algorithm the height remains small, giving time O(log n) for both insertion and find. Discussed later.
Remark 2: Theorem: if nodes are inserted in random order, the height h will be smaller than 2 log n on average, even for the simple insertion algorithm.
  The "average" is taken over all possible orders of inserting the elements.
  Probability[time to insert n elements is > 4n log n] is very small.
  This is "average case analysis", as opposed to the usual worst case analysis.

Exercises
In what order should insertions be made so that the binary search tree in the figure results?
What search tree would you get if you inserted the keys in the order 2, 7, 1, 5, 6, 9, 4?
[Figure: a search tree with root 5, children 2 and 7, and leaves 1, 4, 6, 9]

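Deleting a value - 1
Use the find code to locate the node v where the value is present.
If v is a leaf, then v can simply be deleted. Example: delete 4.
If v has 1 child, shortcut: make the parent of v point to that child instead. Example: delete 2 (after 4 has been deleted).
What if v has 2 children?
[Figure: a search tree with root 5, children 2 and 7, and leaves 1, 4, 6, 9]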

Deleting a value - 2
Suppose we have to delete a value x stored in a node v having 2 children.
Find the node v' holding the next larger value x' in the tree.
  This is the leftmost node in the right subtree under v.
Replace x by x' in v, and recursively delete x' from the right subtree of v.
Infinite recursion possible? No: x' is easy to delete, because v' has no left child and hence has 0 or 1 children.
Exercise: write the code.

Printing the values stored in a search tree in sorted order
Exercise:
    void print(Node* myset){
      if(myset == NULL) return;
      print(myset->left);              // all smaller values first
      cout << myset->value << endl;
      print(myset->right);             // then all larger values
    }

Time taken to print
Each statement is executed at most once for each tree node.
Time is proportional to the number of nodes in the tree = O(n).

A new sorting algorithm
Assumption: the keys to be sorted are distinct.
Insert the keys into a binary search tree, then print out the values.
Assuming the input order is random, or additional balancing operations are done:
  Time = O(n log n) to insert + O(n) to print, so O(n log n) in total.
If the keys are not distinct:
  We need to enable a multiset to be stored.
  Keys in left subtree ≤ key at root ≤ keys in right subtree.
  The insert procedure will have to be slightly modified.

A packaging question When writing a program, it is nicer to declare sets by writing “Set myset;” rather than “Node* myset;” We can do this by using two classes, Node as before and another Set class. They refer to each other, and so a forward declaration is needed.

The class Set
    struct Node; // forward declaration

    class Set{
      Node *pRoot; // pointer to root
    public:
      Set(){pRoot = NULL;} // empty set
      void insert(int x);
      bool find(int x);
    };

    struct Node{
      Set left, right;
      int value;
      Node(int x){value = x;}
    }; // left, right: initialized automatically (to empty sets) by the Set constructor!

Find
    bool Set::find(int x){
      if(pRoot == NULL) return false;
      if(x == pRoot->value) return true;
      else if(x < pRoot->value) return pRoot->left.find(x);
      else return pRoot->right.find(x);
    }

Insert
    void Set::insert(int x){
      if(pRoot == NULL) pRoot = new Node(x);
      else if(x == pRoot->value) return;     // already present: do not insert twice
      else if(x < pRoot->value) pRoot->left.insert(x);
      else pRoot->right.insert(x);
    }

The new main program
    int main(){
      Set myset; // constructor makes it empty
      myset.insert(40);
      myset.insert(20);
      cout << "Finding 30: " << myset.find(30) << endl; // Should print 0.
      cout << "Finding 40: " << myset.find(40) << endl; // Should print 1.
    }

Remarks
The new main program is nicer to read.
  The user does not know that there is a tree implementing the set.
The new function for insert is nicer.
  No references to pointers to Node.
The Standard Library class set uses the ideas we have discussed.
  It also keeps the tree balanced.
  Time for insert, delete, find = O(longest path length * time to compare).
The map class also uses these ideas, plus balancing.
  Each node holds a (key, value) pair, such that keys in left subtree < key at node < keys in right subtree.
  Time for insert, find, delete operations = O(longest path length * time for comparing keys).

Exercises
1. Write a function that returns the largest value in the tree that is no larger than a given x. If there is no such value, return -HUGE_VAL.
2. Suppose each node also contains a member pointing to its parent. Assume the insert function updates this correctly. Write a function which returns the value y which is smallest among all values larger than a given x. If there is no such value, return HUGE_VAL.
3. Instead of a binary search tree, it is sometimes useful to use a tree with a larger degree d. In such a tree, each node stores d-1 values, where the ith value is larger than all values in the ith subtree and smaller than all values in the (i+1)th subtree. Write the code for find, insert, delete for such trees.
4. Suppose we implement the graph of friends using a map from strings to vectors of strings. How long would it take to print the friends of a given person? Assume string comparison takes time proportional to the length of the shorter string.
5. Write the code for delete.