Design & Analysis of Algorithm n-ary Tree & Binary Tree Informatics Department Parahyangan Catholic University
Tree Representation
How do we store a tree in computer software?
  Store it as a graph? Then it is hard to tell the parent-child relationship between its vertices.
  Store it as an array of parents (just like a DFS/BFS traversal tree)? Then we can only tell which vertex is the parent of a given vertex, but we often need to know the child or children of a given vertex.
Tree Representation
[Recursive definition] A tree is either:
  an empty tree (has no vertex), or
  a root with zero or more trees as its children.
[Recursive tree representation] A node (vertex) of a tree can have:
  a parent
  zero or more nodes as its children
A Tree has either:
  a null root, which means it is an empty tree (0 vertices), or
  one root, which means it is not an empty tree (>0 vertices).
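Example in Java
import java.util.LinkedList;

class Node {
    Info info;                    // Info stands for whatever data type a node stores
    Node parent;                  // null for the root
    LinkedList<Node> children;    // zero or more children
}
class Tree {
    Node root;                    // null means the tree is empty
}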
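A minimal runnable sketch of the representation above. It assumes a type parameter `T` in place of `Info`, and the names `NaryNode`, `NaryTree`, `addChild`, and `isEmpty` are illustrative choices that do not appear in the slides.

```java
import java.util.LinkedList;

// Hypothetical generic variant of the slide's Node: "Info" becomes the type parameter T.
class NaryNode<T> {
    T info;
    NaryNode<T> parent;                                   // null for the root
    LinkedList<NaryNode<T>> children = new LinkedList<>();

    NaryNode(T info) { this.info = info; }

    // Illustrative helper: attaches a new child and keeps the parent link consistent.
    NaryNode<T> addChild(T childInfo) {
        NaryNode<T> child = new NaryNode<>(childInfo);
        child.parent = this;
        children.add(child);
        return child;
    }
}

class NaryTree<T> {
    NaryNode<T> root;                                     // null means the tree is empty

    boolean isEmpty() { return root == null; }
}
```

Keeping both the parent link and the child list lets us walk the tree in either direction, at the cost of having to update both whenever a node is attached.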
N-Ary & Binary Tree
A tree is called an n-ary tree if every node has no more than n children. A 2-ary tree is called a binary tree.
Why is n important? Limiting the number of children makes the tree data structure easier to implement:
  Instead of a linked list of children, we can use a static array.
  Instead of traversing a linked list, we can access the k-th child directly by using the array’s index.
  etc.
Why are binary trees special?
  Binary representation of computer data.
  Every other tree can be represented as a binary tree, which is more efficient if the average number of children is << n.
Example in Java
Binary tree:
class Node {
    Info info;
    Node parent;
    Node left, right;
}
class Tree {
    Node root;
}
N-ary tree:
class Node {
    Info info;
    Node parent;
    Node[] children;
}
class Tree {
    Node root;
}
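N-ary to Binary Tree
[Figure: an n-ary tree with nodes 1 to 9, and the same tree redrawn with first-child / next-sibling links. The root has a null sibling.]
N-ary representation:
class Node {
    Info info;
    Node parent;
    Node[] children;
}
First-child / next-sibling (binary) representation:
class Node {
    Info info;
    Node parent;
    Node firstChild;
    Node nextSibling;
}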
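The conversion between the two representations can be sketched as follows. This is one possible approach under stated assumptions: `int` stands in for `Info`, a `List` replaces the child array, and the names `MultiNode`, `LcrsNode`, and `convert` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// N-ary node (a List is used instead of the slide's array, for brevity).
class MultiNode {
    int info;
    List<MultiNode> children = new ArrayList<>();
    MultiNode(int info) { this.info = info; }
}

// First-child / next-sibling node.
class LcrsNode {
    int info;
    LcrsNode parent, firstChild, nextSibling;
    LcrsNode(int info) { this.info = info; }
}

class NaryToBinary {
    // Converts the subtree rooted at n; 'parent' is the already converted parent (null for the root).
    static LcrsNode convert(MultiNode n, LcrsNode parent) {
        if (n == null) return null;
        LcrsNode out = new LcrsNode(n.info);
        out.parent = parent;
        LcrsNode prev = null;
        for (MultiNode c : n.children) {
            LcrsNode conv = convert(c, out);
            if (prev == null) out.firstChild = conv;   // the first child hangs below the node
            else prev.nextSibling = conv;              // later children are chained as siblings
            prev = conv;
        }
        return out;                                    // the root's nextSibling stays null
    }
}
```

Each converted node stores exactly two links regardless of how many children it has, which is why this form saves space when most nodes have far fewer than n children.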
Tree Traversal :: DFS
Visit the first child and all its descendants first, then visit its next sibling, etc.
[Figure: the example tree with nodes 1 to 9, in both representations.]
On the n-ary representation:
DFS(x)
    visit(x)
    for each child v of x
        DFS(v)
On the first-child / next-sibling representation (same as DFS on a binary tree):
DFS(x)
    if (x not NULL)
        visit(x)
        DFS(x.firstChild)
        DFS(x.nextSibling)
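A small sketch of DFS on the first-child / next-sibling form. The example tree (1 has children 2 and 3; 2 has children 4 and 5) and the names `SiblingNode`, `SiblingDfs`, and `example` are assumptions made for illustration, not the tree from the slide's figure.

```java
import java.util.ArrayList;
import java.util.List;

class SiblingNode {
    int info;
    SiblingNode firstChild, nextSibling;
    SiblingNode(int info) { this.info = info; }
}

class SiblingDfs {
    // Appends the infos reachable from x (its subtree, then its later siblings) in DFS order.
    static void dfs(SiblingNode x, List<Integer> out) {
        if (x == null) return;
        out.add(x.info);            // visit(x)
        dfs(x.firstChild, out);     // first child and all its descendants
        dfs(x.nextSibling, out);    // then the next sibling
    }

    // Hypothetical example: 1 has children 2 and 3; 2 has children 4 and 5.
    static SiblingNode example() {
        SiblingNode n1 = new SiblingNode(1), n2 = new SiblingNode(2), n3 = new SiblingNode(3);
        SiblingNode n4 = new SiblingNode(4), n5 = new SiblingNode(5);
        n1.firstChild = n2; n2.nextSibling = n3;
        n2.firstChild = n4; n4.nextSibling = n5;
        return n1;
    }
}
```

Calling `dfs` on the example root collects 1, 2, 4, 5, 3, the same order a DFS over the child lists of the n-ary form would produce.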
Tree Traversal :: DFS
There are basically 3 variants of DFS:
  Preorder: visit the root, then the left subtree, then the right subtree
  Inorder: visit the left subtree, then the root, then the right subtree
  Postorder: visit the left subtree, then the right subtree, then the root
Preorder, inorder, and postorder on an n-ary tree:
  Left subtree = firstChild subtree
  Right subtree = firstChild.nextSibling subtree
Preorder
DFS_PRE(x)
    if (x not NULL)
        visit(x)
        DFS_PRE(x.left)
        DFS_PRE(x.right)
Inorder
DFS_IN(x)
    if (x not NULL)
        DFS_IN(x.left)
        visit(x)
        DFS_IN(x.right)
Postorder
DFS_POST(x)
    if (x not NULL)
        DFS_POST(x.left)
        DFS_POST(x.right)
        visit(x)
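The three variants differ only in where the visit happens, as the following sketch on a binary tree illustrates. The class names and the example tree (root 1, left child 2 with children 4 and 5, right child 3) are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

class BinNode {
    int info;
    BinNode left, right;
    BinNode(int info, BinNode left, BinNode right) {
        this.info = info; this.left = left; this.right = right;
    }
}

class Traversals {
    static void preorder(BinNode x, List<Integer> out) {
        if (x == null) return;
        out.add(x.info);            // root first
        preorder(x.left, out);
        preorder(x.right, out);
    }

    static void inorder(BinNode x, List<Integer> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.info);            // root between the two subtrees
        inorder(x.right, out);
    }

    static void postorder(BinNode x, List<Integer> out) {
        if (x == null) return;
        postorder(x.left, out);
        postorder(x.right, out);
        out.add(x.info);            // root last
    }

    // Hypothetical example tree: 1 -> (2 -> (4, 5), 3)
    static BinNode example() {
        return new BinNode(1,
                new BinNode(2, new BinNode(4, null, null), new BinNode(5, null, null)),
                new BinNode(3, null, null));
    }
}
```

On the example tree, preorder gives 1 2 4 5 3, inorder gives 4 2 5 1 3, and postorder gives 4 5 2 3 1.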
Tree Traversal :: BFS
Similar to BFS traversal on a graph.
BFS()
    if (root is NULL) return
    Q.enqueue(root)
    while not Q.isEmpty()
        x = Q.dequeue()
        visit(x)
        if (x.left not NULL) Q.enqueue(x.left)
        if (x.right not NULL) Q.enqueue(x.right)
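A runnable sketch of the BFS above, using `java.util.ArrayDeque` as the queue. The names `LevelNode`, `LevelOrder`, and `bfs` are illustrative, and returning the visit order as a list is an assumption made so the result can be inspected.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class LevelNode {
    int info;
    LevelNode left, right;
    LevelNode(int info, LevelNode left, LevelNode right) {
        this.info = info; this.left = left; this.right = right;
    }
}

class LevelOrder {
    // Returns the infos level by level, left to right.
    static List<Integer> bfs(LevelNode root) {
        List<Integer> out = new ArrayList<>();
        if (root == null) return out;          // empty tree: nothing to visit
        Queue<LevelNode> q = new ArrayDeque<>();
        q.add(root);
        while (!q.isEmpty()) {
            LevelNode x = q.remove();          // dequeue
            out.add(x.info);                   // visit(x)
            if (x.left != null) q.add(x.left);
            if (x.right != null) q.add(x.right);
        }
        return out;
    }
}
```

`ArrayDeque` does not accept null elements, which is one more reason to test each child before enqueueing it.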
Some Basic Methods
Counting the number of nodes:
COUNT(x)
    if (x == NULL)
        return 0
    else
        return 1 + COUNT(x.left) + COUNT(x.right)
Computing depth:
DEPTH(x)
    if (x == NULL)
        return 0
    else
        return 1 + MAX(DEPTH(x.left), DEPTH(x.right))
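Both methods translate almost line by line into Java. The sketch below assumes `int` for the stored info; `CountNode` and `TreeStats` are hypothetical names.

```java
class CountNode {
    int info;
    CountNode left, right;
    CountNode(int info, CountNode left, CountNode right) {
        this.info = info; this.left = left; this.right = right;
    }
}

class TreeStats {
    // Number of nodes in the subtree rooted at x (0 for an empty subtree).
    static int count(CountNode x) {
        return x == null ? 0 : 1 + count(x.left) + count(x.right);
    }

    // Depth measured in levels: 0 for an empty subtree, 1 for a single node.
    static int depth(CountNode x) {
        return x == null ? 0 : 1 + Math.max(depth(x.left), depth(x.right));
    }
}
```

Note that with this definition a single node has depth 1, so the depth counts levels rather than edges.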
Some Basic Methods
Searching info in a tree rooted at x:
SEARCH(x, info)
    if (x == NULL)
        return NULL
    else if (x.info == info)
        return x
    else
        s = SEARCH(x.left, info)
        if (s not NULL)
            return s
        return SEARCH(x.right, info)
Binary Search Tree
A BST is a binary tree with the property that, for any node x in the tree:
  if y is a node in the left subtree of x, then y.info < x.info
  if y is a node in the right subtree of x, then y.info ≥ x.info
Basic Methods
  Querying: searching, finding minimum / maximum, finding successor / predecessor
  Insertion & Deletion
  Sorting
Searching
Similar to Binary Search on an array.
[Figure: a node x whose left subtree holds values < x and whose right subtree holds values ≥ x.]
SEARCH(x, info)
    if (x == NULL)
        return NULL
    else if (x.info == info)
        return x
    else if (info < x.info)
        return SEARCH(x.left, info)
    return SEARCH(x.right, info)
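Because each step descends into exactly one subtree, the recursion can also be written as a loop. The following is an iterative variant of SEARCH, sketched with assumed names (`BstNode`, `BstSearch`) and `int` keys.

```java
class BstNode {
    int info;
    BstNode left, right;
    BstNode(int info, BstNode left, BstNode right) {
        this.info = info; this.left = left; this.right = right;
    }
}

class BstSearch {
    // Returns a node holding 'info' in the BST rooted at x, or null if there is none.
    static BstNode search(BstNode x, int info) {
        while (x != null && x.info != info) {
            // Smaller keys live on the left, greater-or-equal keys on the right.
            x = (info < x.info) ? x.left : x.right;
        }
        return x;
    }
}
```

Unlike the general tree search, only one root-to-leaf path is ever inspected.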
Minimum / Maximum
The smallest element in a BST must be stored in the leftmost node and, similarly, the largest element is stored in the rightmost node.
MIN(x)
    if (x == NULL)
        return x
    else
        while (x.left not NULL)
            x = x.left
        return x
MAX(x)
    if (x == NULL)
        return x
    else
        while (x.right not NULL)
            x = x.right
        return x
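A short Java sketch of the two methods, assuming `int` keys; `MmNode` and `BstExtremes` are hypothetical names.

```java
class MmNode {
    int info;
    MmNode left, right;
    MmNode(int info, MmNode left, MmNode right) {
        this.info = info; this.left = left; this.right = right;
    }
}

class BstExtremes {
    // Leftmost node of the subtree rooted at x (null for an empty subtree).
    static MmNode min(MmNode x) {
        if (x == null) return null;
        while (x.left != null) x = x.left;
        return x;
    }

    // Rightmost node of the subtree rooted at x (null for an empty subtree).
    static MmNode max(MmNode x) {
        if (x == null) return null;
        while (x.right != null) x = x.right;
        return x;
    }
}
```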
Finding Successor
Successor of x = the smallest element among the elements that are greater than x.
Case 1: x has a right subtree
The successor of x is the minimum of x’s right subtree.
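Finding Successor
Case 2: x doesn’t have a right subtree
Climb from x through the ancestors yn, …, y1 that are reached from a right child (so y1 < … < yn < x), until reaching the first ancestor z that is entered from its left child. Then (y1 < … < yn < x) < z, and z is the successor of x.
[Figure: x at the bottom of a chain of right children y1, …, yn, with y1 the left child of z. Special cases: a single ancestor y with (y < x) < z, and x itself being the left child of z with x < z.]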
Finding Successor
Case 3: x doesn’t have a right subtree, and x is the rightmost element of the tree
x doesn’t have a successor.
SUCCESSOR(x)
    if (x.right not NULL)
        return MIN(x.right)
    else
        y = x.parent
        while (y not NULL AND x == y.right)
            x = y
            y = y.parent
        return y
Finding the predecessor is very similar.
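The three cases can be sketched in Java as follows, assuming nodes that store a parent link. The names `PNode`, `with`, and `BstSuccessor` are illustrative; `with` is a hypothetical helper that only exists to build small test trees.

```java
class PNode {
    int info;
    PNode parent, left, right;
    PNode(int info) { this.info = info; }

    // Hypothetical helper: attaches the given children and sets their parent links.
    PNode with(PNode l, PNode r) {
        left = l; right = r;
        if (l != null) l.parent = this;
        if (r != null) r.parent = this;
        return this;
    }
}

class BstSuccessor {
    static PNode min(PNode x) {
        while (x.left != null) x = x.left;
        return x;
    }

    static PNode successor(PNode x) {
        if (x.right != null) return min(x.right);    // case 1: minimum of the right subtree
        PNode y = x.parent;
        while (y != null && x == y.right) {          // climb while we arrive from a right child
            x = y;
            y = y.parent;
        }
        return y;   // case 2: first ancestor entered from its left; case 3: null (no successor)
    }
}
```

For the tree 8 -> (4 -> (2, 6), 12), the successor of 6 is 8 (case 2), the successor of 4 is 6 (case 1), and 12 has no successor (case 3).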
BST Insertion
TREE_INSERT(T, info)
    x = Node(info)
    if (T is an empty tree)
        T.root = x
    else
        INSERT(T.root, x)
INSERT(curr, x)
    if (x.info < curr.info)
        if (curr.left == NULL)
            x.parent = curr
            curr.left = x
        else
            INSERT(curr.left, x)
    else
        if (curr.right == NULL)
            x.parent = curr
            curr.right = x
        else
            INSERT(curr.right, x)
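An iterative variant of the insertion, sketched in Java. It follows the same rule as the BST property (smaller keys go left, greater-or-equal keys go right); the names `InsNode` and `InsTree` and the use of `int` keys are assumptions.

```java
class InsNode {
    int info;
    InsNode parent, left, right;
    InsNode(int info) { this.info = info; }
}

class InsTree {
    InsNode root;   // null means the tree is empty

    void insert(int info) {
        InsNode x = new InsNode(info);
        if (root == null) { root = x; return; }
        InsNode curr = root;
        while (true) {
            if (x.info < curr.info) {
                if (curr.left == null) {           // free slot on the left: attach here
                    curr.left = x; x.parent = curr; return;
                }
                curr = curr.left;
            } else {
                if (curr.right == null) {          // free slot on the right: attach here
                    curr.right = x; x.parent = curr; return;
                }
                curr = curr.right;
            }
        }
    }
}
```

A new node always ends up as a leaf, so the number of loop iterations is bounded by the height of the tree.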
BST Deletion
It has three basic cases:
  If the node to be deleted has no children, then just remove it.
  If the node to be deleted has one child, then replace the node with its only child.
  If the node to be deleted has two children, then replace it with its successor.
Pseudocode is left as an exercise.
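For readers who want to check their own solution, here is one possible sketch of the three cases in Java. It is not the course's reference answer: it assumes `int` keys, handles the two-children case by copying the successor's info into the node and then unlinking the successor, and all names (`DelNode`, `DelTree`, `replace`, `find`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class DelNode {
    int info;
    DelNode parent, left, right;
    DelNode(int info) { this.info = info; }
}

class DelTree {
    DelNode root;

    void insert(int info) {
        DelNode x = new DelNode(info), curr = root, prev = null;
        while (curr != null) { prev = curr; curr = (info < curr.info) ? curr.left : curr.right; }
        x.parent = prev;
        if (prev == null) root = x;
        else if (info < prev.info) prev.left = x;
        else prev.right = x;
    }

    DelNode find(int info) {
        DelNode x = root;
        while (x != null && x.info != info) x = (info < x.info) ? x.left : x.right;
        return x;
    }

    // Puts subtree v (possibly null) in the place currently occupied by node u.
    private void replace(DelNode u, DelNode v) {
        if (u.parent == null) root = v;
        else if (u == u.parent.left) u.parent.left = v;
        else u.parent.right = v;
        if (v != null) v.parent = u.parent;
    }

    void delete(DelNode z) {
        if (z.left == null) replace(z, z.right);        // no child, or only a right child
        else if (z.right == null) replace(z, z.left);   // only a left child
        else {
            DelNode s = z.right;                        // successor = minimum of the right subtree
            while (s.left != null) s = s.left;
            z.info = s.info;                            // move the successor's info into z
            replace(s, s.right);                        // s has no left child, so this is an easy case
        }
    }

    List<Integer> inorder() {
        List<Integer> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private void collect(DelNode x, List<Integer> out) {
        if (x == null) return;
        collect(x.left, out); out.add(x.info); collect(x.right, out);
    }
}
```

Copying the info keeps the code short, but any outside reference to the successor node becomes stale; relinking the successor node itself avoids that at the cost of more pointer updates.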
Sorting
Given a list of data L = {a1, a2, …, an}:
  Successively insert the data into a BST.
  To view the sorted list, just use DFS (inorder).
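The two steps can be sketched as a small tree sort in Java. The names `SortNode`, `TreeSort`, and `sort` are assumptions, and `int` keys are used for simplicity.

```java
import java.util.ArrayList;
import java.util.List;

class SortNode {
    int info;
    SortNode left, right;
    SortNode(int info) { this.info = info; }
}

class TreeSort {
    // Inserts info into the BST rooted at 'root' and returns the (possibly new) root.
    static SortNode insert(SortNode root, int info) {
        if (root == null) return new SortNode(info);
        if (info < root.info) root.left = insert(root.left, info);
        else root.right = insert(root.right, info);     // equal keys go to the right
        return root;
    }

    static void inorder(SortNode x, List<Integer> out) {
        if (x == null) return;
        inorder(x.left, out);
        out.add(x.info);
        inorder(x.right, out);
    }

    static List<Integer> sort(int[] data) {
        SortNode root = null;
        for (int a : data) root = insert(root, a);       // step 1: successive insertion
        List<Integer> out = new ArrayList<>();
        inorder(root, out);                              // step 2: inorder DFS
        return out;
    }
}
```

For example, `TreeSort.sort(new int[]{5, 3, 8, 1, 3})` yields the list 1, 3, 3, 5, 8.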
Time Complexity
For a tree with n nodes:
  Traversal visits every node, so it takes O(n) time.
  Insertion may have to place the new node at the bottommost location of the tree, so its cost is proportional to the tree’s depth/height. If the tree’s height is h, then insertion takes O(h) time.
  Searching, finding Maximum / Minimum, and finding Successor / Predecessor are also O(h).
  Deletion might call successor, so it is also O(h).
Sorting
What is the complexity of sorting by successive BST insertion followed by an inorder DFS?
  Inserting n elements is O(n·h).
  Traversal takes O(n).
  So sorting takes O(n·h + n) = O(n·h).
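To see how much the height matters, the bound can be evaluated at its two extremes. The derivation below is a sketch: it assumes that, for already sorted input, the i-th insertion walks past all i - 1 nodes inserted before it, and that a well balanced tree has height of about lg n.

```latex
\begin{align*}
T_{\text{sort}}(n) &= O(n \cdot h + n) = O(n \cdot h) \\
\text{sorted input } (h = n):\quad & \sum_{i=1}^{n} (i - 1) = \frac{n(n-1)}{2} = O(n^2) \\
\text{balanced tree } (h \approx \lg n):\quad & O(n \lg n)
\end{align*}
```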
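Tree’s Height / Depth
The tree’s height determines the efficiency of the BST’s operations.
Worst case: h = n
[Figure: a degenerate tree forming a single chain 1, 2, 3, …, 14, 15.]
Best case: h ≈ lg(n)
[Figure: a balanced tree with levels 8 / 4, 12 / 2, 6, 10, 14 / 1, 3, 5, 7, 9, 11, 13, 15.]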
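The two extremes can be reproduced by inserting the same 15 keys in different orders. This is a minimal sketch with assumed names (`HNode`, `HeightDemo`, `heightAfterInserting`); the height is counted in levels, as in the DEPTH method.

```java
class HNode {
    int info;
    HNode left, right;
    HNode(int info) { this.info = info; }
}

class HeightDemo {
    static HNode insert(HNode root, int info) {
        if (root == null) return new HNode(info);
        if (info < root.info) root.left = insert(root.left, info);
        else root.right = insert(root.right, info);
        return root;
    }

    // Height in levels: 0 for an empty tree, 1 for a single node.
    static int height(HNode x) {
        return x == null ? 0 : 1 + Math.max(height(x.left), height(x.right));
    }

    // Builds a BST by inserting the keys in the given order and returns its height.
    static int heightAfterInserting(int[] order) {
        HNode root = null;
        for (int k : order) root = insert(root, k);
        return height(root);
    }
}
```

Inserting 1, 2, …, 15 in sorted order gives height 15, while the order 8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15 gives height 4.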
Balanced Tree
It is clear that a more balanced tree gives better performance than an unbalanced one.
There are various attempts to build a balanced tree data structure, such as:
  Red-Black tree
  Self Balancing BST
  B-Tree
  Treap
  etc.