Tree data structure
How should I decide which data structure to use?
  What needs to be stored?
  Cost of operations
  Memory usage
  Ease of implementation
What do we use a tree for? For example, to represent an organizational hierarchy.
Height of x = number of edges in the longest path from x to a leaf.
For example (in the figure): height of node 3 = 2; height of the root = 3 = height of the tree.
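The height definition above can be sketched as a short recursive function. This is only an illustration: the `struct node` layout and the function name `height` are assumptions, and the empty tree is given height -1 so that a single leaf comes out with height 0.

```c
#include <stddef.h>

/* Assumed node layout: one value plus two child pointers. */
struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Height of x = number of edges on the longest path from x down to a leaf.
 * An empty tree (NULL) is given height -1, so a leaf has height 0. */
int height(const struct node *x)
{
    if (x == NULL)
        return -1;
    int lh = height(x->left);
    int rh = height(x->right);
    return 1 + (lh > rh ? lh : rh);
}
```

Calling `height` on the root gives the height of the whole tree; calling it on any other node gives the height of that node's subtree.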
Binary Tree
Definition: a tree in which each node can have at most 2 children.
Node representation: each node stores its data plus two pointers, the address of the left child and the address of the right child. A pointer is NULL when that child is absent.
struct node { int data; struct node *left; struct node *right; };
Applications:
  Storing naturally hierarchical data, e.g. a file system.
  Organizing data for quick search, insertion, and deletion.
  Network routing algorithms.
  And more.
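A minimal sketch of how such nodes might be created and released on the heap. The helper names `new_node` and `free_tree` are hypothetical, chosen here for illustration; they are not part of the slides.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;   /* address of left child, NULL if none */
    struct node *right;  /* address of right child, NULL if none */
};

/* Hypothetical helper: allocate a leaf node holding `data`.
 * Returns NULL if malloc fails. */
struct node *new_node(int data)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->left = NULL;
        n->right = NULL;
    }
    return n;
}

/* Hypothetical helper: free a whole tree in post-order,
 * so that children are released before their parent. */
void free_tree(struct node *root)
{
    if (root == NULL)
        return;
    free_tree(root->left);
    free_tree(root->right);
    free(root);
}
```

A tree is then built by linking nodes through the pointers, for example `root->left = new_node(4);`.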
Binary Tree: each node can have at most 2 children.
Strict / Proper Binary Tree: each node has either 2 or 0 children.
Complete Binary Tree: all levels except possibly the last are completely filled, and all nodes in the last level are as far left as possible.
Maximum number of nodes at level i is 2^i (the root is at level 0).
Perfect Binary Tree
Maximum number of nodes in a binary tree with height h
  = 2^0 + 2^1 + ... + 2^h
  = 2^(h+1) - 1
  = 2^(no. of levels) - 1
For example, with 4 levels (h = 3): maximum number of nodes = 2^4 - 1 = 15.
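As a quick check of the node-count formula, the per-level maximum can be summed and compared with the closed form. The function names are assumptions made for this sketch, and it is only valid while the results fit in an `unsigned long`.

```c
/* Maximum number of nodes on level i (root is level 0): 2^i.
 * Valid only while i is smaller than the bit width of unsigned long. */
unsigned long max_nodes_at_level(unsigned i)
{
    return 1UL << i;
}

/* Maximum number of nodes in a binary tree of height h, obtained by
 * summing the levels 0..h. The closed form is 2^(h+1) - 1. */
unsigned long max_nodes(unsigned h)
{
    unsigned long total = 0;
    for (unsigned i = 0; i <= h; i++)
        total += max_nodes_at_level(i);
    return total;
}
```

For a tree of height 3 (4 levels), `max_nodes(3)` adds 1 + 2 + 4 + 8 and yields 15, matching 2^4 - 1.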
Perfect Binary Tree: a complete binary tree in which all levels are full. It has the maximum possible number of nodes for its height.
Height of a perfect binary tree with n nodes: h = log_2(n + 1) - 1
Now, if n = 15, then h = log_2(15 + 1) - 1 = log_2 16 - 1 = 4 - 1 = 3
Height: number of edges in the longest path from the root to a leaf
Height of an empty tree = -1
Height of a tree with 1 node = 0
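The height formula follows from inverting the node count of a perfect binary tree, n = 2^(h+1) - 1. A short derivation, written out as a sketch:

```latex
\begin{align*}
n &= 2^{h+1} - 1 \\
n + 1 &= 2^{h+1} \\
\log_2 (n + 1) &= h + 1 \\
h &= \log_2 (n + 1) - 1
\end{align*}
```

With n = 1 this gives h = log_2 2 - 1 = 0, which agrees with the height of a single-node tree.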
We can implement a binary tree using:
a) Dynamically created nodes: struct node { int data; struct node *left; struct node *right; };
b) Arrays: store the nodes level by level, with the root at index 0. But how do we store information about the links? For a node at index i in a complete binary tree:
  Left-child index = 2i + 1
  Right-child index = 2i + 2
Example: the complete binary tree stored as 2 4 1 5 8 7 9 occupies indices 0 to 6, with the root (2) at index 0.
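The index arithmetic for the array representation can be sketched as below. The function names are assumptions, and `parent` is an addition not stated on the slide; it is simply the inverse of the two child formulas.

```c
#include <stddef.h>

/* Index of the left child of the node stored at index i. */
size_t left_child(size_t i)
{
    return 2 * i + 1;
}

/* Index of the right child of the node stored at index i. */
size_t right_child(size_t i)
{
    return 2 * i + 2;
}

/* Index of the parent of the node at index i.
 * Only meaningful for i > 0, because the root has no parent. */
size_t parent(size_t i)
{
    return (i - 1) / 2;
}
```

With the array 2 4 1 5 8 7 9, the root 2 sits at index 0, its children 4 and 1 at indices 1 and 2, and the children of 4 (5 and 8) at indices 3 and 4. A child index is valid only if it is smaller than the number of stored nodes.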
Binary Search Tree
What data structure will you use to store a modifiable collection of data? We want to be able to perform the following operations:
  search(x)  // search for an item x
  insert(x)  // insert an item x
  remove(x)  // delete an item x
Choices: array, linked list, ...
Array (unsorted):
  search(x): O(n)
  insert(x): O(1)  // at the end
  remove(x): O(n)
Linked list:
  search(x): O(n)
  insert(x): O(1)  // at the head
  remove(x): O(n)
Example (array): insert(7) appends 7 at the end, giving 1 3 5 7; remove(3) then leaves 1 5 7.
We can perform binary search in a sorted array in O(log n).
Array (sorted):
  search(x): O(log n)
  insert(x): O(n)  // e.g. insert(6) turns 3 5 7 9 into 3 5 6 7 9
  remove(x): O(n)
BST (balanced):
  search(x): O(log n)
  insert(x): O(log n)
  remove(x): O(log n)
For n records, a search takes about log_2 n comparisons. If 1 comparison takes 10^-6 sec and n = 2^30 (about 1 billion), a search takes 30 x 10^-6 sec.
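A minimal sketch of binary search on a sorted array, to make the O(log n) search bound concrete. The function name and the choice to return -1 for a missing item are assumptions made for this example.

```c
#include <stddef.h>

/* Search for x in the sorted array a[0..n-1].
 * Returns the index of x, or -1 if x is not present. */
long binary_search(const int *a, size_t n, int x)
{
    size_t lo = 0, hi = n;               /* search window is a[lo..hi-1] */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2; /* midpoint without overflow */
        if (a[mid] == x)
            return (long)mid;
        if (a[mid] < x)
            lo = mid + 1;                /* discard the left half */
        else
            hi = mid;                    /* discard the right half */
    }
    return -1;
}
```

Each loop iteration halves the window, which is where the log_2 n comparison count comes from. Insertion stays O(n) in a sorted array because the elements after the insertion point have to be shifted.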
Binary Search Tree (BST)
A binary tree in which, for each node, the values of all nodes in the left subtree are less than or equal to the node's value, and the values of all nodes in the right subtree are greater.
Example (is this a BST?): root 15; children of 15 are 9 and 19; children of 9 are 7 and 11; children of 19 are 16 and 24.
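One way to test the BST property is to pass down the range of values each subtree is allowed to hold. The sketch below assumes `int` keys and follows the rule stated above (left subtree less than or equal, right subtree strictly greater); the function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Returns 1 if every value in the subtree lies in [lo, hi] and the
 * BST rule holds at every node, 0 otherwise. long long bounds avoid
 * overflow when computing data + 1. */
static int is_bst_range(const struct node *n, long long lo, long long hi)
{
    if (n == NULL)
        return 1;                       /* an empty tree is a BST */
    if (n->data < lo || n->data > hi)
        return 0;
    /* left subtree: values <= data; right subtree: values > data */
    return is_bst_range(n->left, lo, n->data)
        && is_bst_range(n->right, (long long)n->data + 1, hi);
}

int is_bst(const struct node *root)
{
    return is_bst_range(root, INT_MIN, INT_MAX);
}
```

Comparing each node only with its direct children is not enough: a value of 17 placed as the right child of 9 satisfies the local comparison but violates the rule at the root 15, and the range check catches this.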
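So how do we achieve O(log n) in a BST for search, insert, and remove?
Search space reduction, as in binary search on the sorted array 7 9 11 15 16 19 24 (e.g. Search(9), Search(11)): each comparison discards half of the remaining search space.
  n -> n/2 -> n/4 -> ... -> 1
After k steps the search space is n/2^k. Setting n/2^k = 1 gives 2^k = n, so k = log_2 n steps.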
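The same halving idea carries over to the tree: at each node, one comparison decides which subtree can still contain the item. A minimal iterative sketch, with the function name `bst_search` chosen for illustration:

```c
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Returns a pointer to a node holding x, or NULL if x is not in the tree.
 * Each step moves one level down, so the number of comparisons is at
 * most height + 1. */
const struct node *bst_search(const struct node *root, int x)
{
    while (root != NULL && root->data != x)
        root = (x < root->data) ? root->left : root->right;
    return root;
}
```

Assuming a tree with root 15, children 9 and 19, and leaves 7, 11, 16, 24, Search(9) visits 15 and then 9, while Search(11) visits 15, 9, and 11. The step count is bounded by the height, which is about log_2 n only when the tree is balanced.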
BST (unbalanced)
Search space reduction is O(log n) in the average case but O(n) in the worst case, when the tree degenerates into a chain.
Figures: unbalanced BSTs over the keys 7, 9, 11, 15, 16, 19, 24.
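Insert(18): O(log n), because we first search for the position of the new node and then link it in. Similarly, delete() is going to take O(log n). Both bounds assume the tree is balanced.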
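Insertion can be sketched as a search that, on reaching an empty spot, attaches a new leaf there. The name `bst_insert` is an assumption; duplicates are sent to the left to match the "less than or equal" rule, and no rebalancing is done, so the O(log n) cost holds only while the tree stays balanced.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Insert x into the BST rooted at `root` and return the (possibly new) root.
 * If malloc fails, the tree is returned unchanged and x is not inserted. */
struct node *bst_insert(struct node *root, int x)
{
    if (root == NULL) {
        struct node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->data = x;
            n->left = NULL;
            n->right = NULL;
        }
        return n;
    }
    if (x <= root->data)
        root->left = bst_insert(root->left, x);   /* lesser or equal: go left */
    else
        root->right = bst_insert(root->right, x); /* greater: go right */
    return root;
}
```

Starting from a tree with root 15, children 9 and 19, and leaves 7, 11, 16, 24, Insert(18) compares with 15 (go right), 19 (go left), and 16 (go right), then attaches 18 as the right child of 16. Callers use it as `root = bst_insert(root, 18);`.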