Searching (and into Trees)

Searching techniques
- Searching algorithms in unordered “lists”
  - “Lists” may be arrays or linked lists
  - The algorithms are adaptable to either
- Searching algorithms in ordered “lists”
  - Primarily arrays with values sorted into order
  - We can exploit the order to search more efficiently
- Let’s start with arrays
- The data held in the array might be
  - simple (e.g. numbers, strings)
  - or more complex objects: the search will probably be based on a chosen key field in the objects (e.g., search student records by registration number)
- We will only consider simple data: the searching techniques are the same

Sequential search in an unordered list
In these algorithms we will assume:
- The data is integers
- It is held in an array variable numbers
- It is in random order
- The number of data values is indicated by a variable size
- The data is in elements indexed 0 to (size-1)
- We are seeking the integer held in variable val

The basic technique to be used is “sequential search”:
- Compare val with the value in numbers[0], then with that in numbers[1], etc.
- We will look at two versions of an algorithm encoding this
- Other adaptations are possible

Algorithm 1: Standard sequential search
Here is a basic search algorithm. It leaves its result in a variable called position:

    int position = 0;
    while (position < size) {
        if (numbers[position] == val)
            break;          // Exit loop if found
        position++;
    }

If val is not present:
- The entire array will be scanned, taking size steps
- position will have a final value of size

But if val is present:
- break; -> the while loop terminates immediately
- The average number of scanning steps expected is size/2

Easy to adapt to return a boolean, or throw an exception
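The loop above can be wrapped in a method so that it is easy to try out. This is a minimal sketch: the class name SequentialSearch and the helper contains are names chosen here for illustration, not part of the slides.

```java
final class SequentialSearch {
    // Returns the index of the first occurrence of val in numbers[0..size-1],
    // or size if val is not present (the convention used on the slide).
    static int search(int[] numbers, int size, int val) {
        int position = 0;
        while (position < size) {
            if (numbers[position] == val) {
                break; // exit loop if found
            }
            position++;
        }
        return position;
    }

    // Hypothetical adaptation: report only whether val is present.
    static boolean contains(int[] numbers, int size, int val) {
        return search(numbers, size, val) < size;
    }

    public static void main(String[] args) {
        int[] numbers = {7, 3, 9, 1};
        System.out.println(search(numbers, numbers.length, 9));   // index of 9
        System.out.println(contains(numbers, numbers.length, 4)); // 4 is absent
    }
}
```

Returning size for "not found" keeps the slide's convention; the boolean adaptation simply compares the result with size.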

If we are careful, we can combine the loop test and the array element check:
    int position = 0;
    while (numbers[position] != val && position < size)
        position++;

Is this correct?

    int position = 0;
    while (numbers[position++] != val && position < size);

Corrected Version
If we are careful, we can combine the loop test and the array element check:

    int position = 0;
    while (position < size && numbers[position] != val)
        position++;

- The && test checks position < size first, and if it is false does not check numbers[position] != val
- Otherwise we would get an ArrayIndexOutOfBoundsException if val is not present!
- This is called "conditional" or "short-circuit" behaviour: it applies to && and ||
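To see the difference the operand order makes, the two loop conditions can be run side by side. This is a sketch under one assumption: the array has exactly size elements, so reading numbers[size] is out of bounds. The class and method names are illustrative only.

```java
final class ShortCircuitDemo {
    // Safe order: the bounds test runs first, so numbers[position] is only
    // read when position is a valid index.
    static int safeSearch(int[] numbers, int size, int val) {
        int position = 0;
        while (position < size && numbers[position] != val) {
            position++;
        }
        return position;
    }

    // Unsafe order: the element is read before the bounds test.
    static int unsafeSearch(int[] numbers, int size, int val) {
        int position = 0;
        while (numbers[position] != val && position < size) {
            position++;
        }
        return position;
    }

    public static void main(String[] args) {
        int[] numbers = {7, 3, 9};
        System.out.println(safeSearch(numbers, numbers.length, 5)); // val absent
        try {
            unsafeSearch(numbers, numbers.length, 5);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("unsafe order read past the end of the array");
        }
    }
}
```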

Algorithm 2: Sequential search with a “sentinel”
We can improve the basic search algorithm if the array numbers has one extra element, numbers[size], that is never used for actual data.
Instead we place a copy of the sought value there, so the search always succeeds.
This means that the loop does not need to carry out the “end of array” test: less work, so quicker.

    int[] numbers = new int[size+1];
    ...
    int position = 0;
    numbers[size] = val;    // Insert "sentinel": a copy of the sought value
    while (numbers[position] != val)
        position++;
    return position;

As before, position has the final value size if val is not present
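A runnable sketch of the sentinel idea follows. It assumes the caller has allocated the array with one spare slot at index size; the class name SentinelSearch is chosen here for illustration.

```java
final class SentinelSearch {
    // numbers must have length >= size + 1; slot numbers[size] is reserved
    // for the sentinel and holds no real data.
    static int search(int[] numbers, int size, int val) {
        numbers[size] = val; // sentinel guarantees the loop terminates
        int position = 0;
        while (numbers[position] != val) {
            position++;
        }
        return position; // equals size when val is not among the real data
    }

    public static void main(String[] args) {
        int size = 4;
        int[] numbers = new int[size + 1];
        numbers[0] = 7; numbers[1] = 3; numbers[2] = 9; numbers[3] = 1;
        System.out.println(search(numbers, size, 9)); // index of 9
        System.out.println(search(numbers, size, 5)); // 5 absent: returns size
    }
}
```

Note that the method overwrites numbers[size] on every call, which is why that slot must never hold real data.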

Complexity of sequential search
We may be interested in an algorithm’s best case, worst case or average execution time.
For the sequential search algorithm (with or without sentinel):
- Best case is 1 step: O(1)
- Worst case is N steps: O(N)
- The actual average number of steps depends on the ratio of successful to unsuccessful searches:
  - The average of successful searches is N/2 steps, and so is O(N)
  - All unsuccessful searches take N steps, which is O(N)
  - So overall the average complexity is O(N)
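As a worked check of the N/2 figure, assume that in a successful search each of the N positions is equally likely to hold val, and write p (a symbol introduced here, not in the slides) for the fraction of searches that succeed:

```latex
\[
\text{avg}_{\text{success}} = \frac{1}{N}\sum_{k=1}^{N} k = \frac{N+1}{2} \approx \frac{N}{2},
\qquad
\text{avg}_{\text{overall}} = p\,\frac{N+1}{2} + (1-p)\,N = O(N).
\]
```

Whatever the value of p, the overall average lies between about N/2 and N, so it is linear in N.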

Searching an Ordered List
Again we will assume:
- The data is integers, held in an array sequence
- So the data is in elements indexed 0 to (length-1)
- But this time we assume that the values are held in ascending numerical order
- We are seeking the integer held in val

We could use the sequential search algorithm, but this does not take advantage of the knowledge that the data is ordered. (The complexity remains O(N).)
Instead, we will take advantage of the ordering to improve search efficiency (i.e., to reduce the complexity)

Binary Search
If the data is already ordered, we can do much better than a linear time algorithm. Here is the scheme:
- Pick the middle element in the array
- If it is equal to val, stop the search
- If it is greater than val, search the lower half of the remaining array
- If it is less than val, search the remaining upper half

At each iteration:
- We are searching in a remaining partition of the array
- We cut the remaining partition in half, rather than just removing one element

Example: Searching for 11 in 1, 3, 5, 7, 9, 11, 13
- First compare with 7, so search in 9, 11, 13
- Now compare with 11: found it, in two steps

A successful search: search for 11
Concretely, let variable
- low indicate the lowest element of the partition (index 0 initially)
- high (h) indicate the highest element (size-1 initially)
- middle (m) indicate the next element being tested

The search for 11 proceeds like this:

    index:  0  1  2  3  4  5  6
    value:  1  3  5  7  9 11 13

- low=0 h=6 -> m=3: not found, and 11>7, so low=(m+1)=4 h=6
- low=4 h=6 -> m=5: found it, at index 5

An unsuccessful search: search for 10
    index:  0  1  2  3  4  5  6
    value:  1  3  5  7  9 11 13

- low=0 h=6 -> m=3: not found, and 10>7, so low=(m+1)=4 h=6
- low=4 h=6 -> m=5: not found, and 10<11, so low=4 h=(m-1)=4
- low=4 h=4 -> m=4: not found, and 10>9, so low=(m+1)=5 h=4
- Now low>h, and the partition has “vanished”: the search has failed

Algorithm binarySearch:
INPUT: val (value of interest), sequence (sorted data)
OUTPUT: index of val in sequence if it exists, -1 otherwise (for object data: the object of interest, or null)

    int low = 0, middle = 0, high = sequence.length - 1;
    while (high >= low) {
        middle = (high + low) / 2;
        if (sequence[middle] == val)
            return middle;          // Found it
        else if (sequence[middle] < val)
            low = middle + 1;       // Search upper half
        else
            high = middle - 1;      // Search lower half
    }
    return -1;                      // or null if an object-type

The outcomes:
- Ordinary loop exit when the indexes “cross” -> not found (i.e. high < low)
- Loop exit on return -> found (high >= low still holds at that point)
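A self-contained sketch of binarySearch follows. It makes two choices that go beyond the slide and should be read as assumptions: it returns the index of val (or -1), and it computes the midpoint as low + (high - low) / 2, which gives the same result as (high + low) / 2 but cannot overflow for very large arrays.

```java
final class BinarySearchSketch {
    // Returns the index of val in the ascending array sequence, or -1 if absent.
    static int binarySearch(int[] sequence, int val) {
        int low = 0;
        int high = sequence.length - 1;
        while (low <= high) {
            // Same midpoint as (low + high) / 2, but cannot overflow int.
            int middle = low + (high - low) / 2;
            if (sequence[middle] == val) {
                return middle;            // found it
            } else if (sequence[middle] < val) {
                low = middle + 1;         // search upper half
            } else {
                high = middle - 1;        // search lower half
            }
        }
        return -1;                        // indexes crossed: not found
    }

    public static void main(String[] args) {
        int[] sequence = {1, 3, 5, 7, 9, 11, 13};
        System.out.println(binarySearch(sequence, 11)); // prints 5
        System.out.println(binarySearch(sequence, 10)); // prints -1
    }
}
```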

The Complexity of Binary Search
Best case: val is exactly sequence[middle] at the first step
- The search stops after the first step, so complexity O(1)

Worst case: This will be when we continue dividing until the “partition” contains only one value: then it is either equal to val or not
- For 250 elements this turns out to be about 8 iterations
- For 500 it is about 9
- For 1000 it is about 10
- Double the amount of data -> add one step!

In general: the size is approximately 2^steps
- So the number of steps is approximately log2(size)
- Complexity is O(log N)
- For emphasis: double the amount of data -> add one step!

Average case: Don’t need to consider this: the worst case is very good!

    N      log2 N
    64     6
    128    7
    256    8
    512    9
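The iteration counts quoted above can be checked by simulating the index arithmetic of binary search without any data. The sketch below follows one worst-case path, assuming the sought value is larger than every element so the search always continues into the upper half; the class and method names are illustrative.

```java
final class BinarySearchSteps {
    // Counts loop iterations of binary search over indexes 0..n-1 when the
    // sought value is larger than every element, so the search fails after
    // repeatedly moving to the upper half.
    static int worstCaseSteps(int n) {
        int low = 0;
        int high = n - 1;
        int steps = 0;
        while (low <= high) {
            int middle = low + (high - low) / 2;
            low = middle + 1; // value is "greater": keep the upper half
            steps++;
        }
        return steps;
    }

    public static void main(String[] args) {
        for (int n : new int[] {250, 500, 1000, 2000}) {
            System.out.println(n + " elements: " + worstCaseSteps(n) + " steps");
        }
    }
}
```

Each doubling of n adds exactly one iteration on this path, which is the "double the data, add one step" behaviour.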

Trees
[Figure: a tree with root “Make Money Fast!” and children “Stock Fraud”, “Ponzi Scheme” and “Bank Robbery”]

What is a Tree
- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments

[Figure: organization chart for Computers”R”Us with nodes Sales, R&D, Manufacturing, Laptops, Desktops, US, International, Europe, Asia, Canada]

Tree Terminology
- Root: node without parent (A)
- Internal node: node with at least one child (A, B, C, F)
- External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
- Ancestors of a node: parent, grandparent, great-grandparent, etc.
- Depth of a node: number of ancestors
- Height of a tree: maximum depth of any node (3)
- Descendant of a node: child, grandchild, great-grandchild, etc.
- Subtree: tree consisting of a node and its descendants

[Figure: tree with nodes A to K, with one subtree highlighted]
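Depth and height follow directly from these definitions. The sketch below uses a hypothetical TermNode class, and the tree shape built in main is an assumption: it is one shape consistent with the lists of internal and external nodes and with the stated height of 3, since the figure itself did not survive.

```java
import java.util.ArrayList;
import java.util.List;

final class TermNode {
    final String name;
    TermNode parent;                       // null for the root
    final List<TermNode> children = new ArrayList<>();

    TermNode(String name) { this.name = name; }

    TermNode add(TermNode child) {         // attach child, return this for chaining
        child.parent = this;
        children.add(child);
        return this;
    }

    boolean isExternal() { return children.isEmpty(); }

    // Depth: number of ancestors, counted by walking up to the root.
    int depth() {
        int d = 0;
        for (TermNode n = parent; n != null; n = n.parent) d++;
        return d;
    }

    // Height of the subtree rooted here: maximum depth below this node.
    int height() {
        int h = 0;
        for (TermNode c : children) h = Math.max(h, 1 + c.height());
        return h;
    }

    public static void main(String[] args) {
        // Assumed shape: A -> B, C, D;  B -> E, F;  F -> I, J, K;  C -> G, H
        TermNode f = new TermNode("F").add(new TermNode("I")).add(new TermNode("J")).add(new TermNode("K"));
        TermNode b = new TermNode("B").add(new TermNode("E")).add(f);
        TermNode c = new TermNode("C").add(new TermNode("G")).add(new TermNode("H"));
        TermNode a = new TermNode("A").add(b).add(c).add(new TermNode("D"));
        System.out.println(a.height());   // height of the whole tree
        System.out.println(f.depth());    // number of ancestors of F
    }
}
```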

Tree ADT
- We use positions to abstract nodes
- Generic methods:
  - integer size()
  - boolean isEmpty()
  - Iterator iterator()
  - Iterable positions()
- Accessor methods:
  - position root()
  - position parent(p)
  - Iterable children(p)
- Query methods:
  - boolean isInternal(p)
  - boolean isExternal(p)
  - boolean isRoot(p)
- Update method:
  - element replace(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal
- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

    Algorithm preOrder(v)
        visit(v)
        for each child w of v
            preOrder(w)

Example (the leading number gives the order in which each node is visited):

    1 Make Money Fast!
        2 1. Motivations
            3 1.1 Greed
            4 1.2 Avidity
        5 2. Methods
            6 2.1 Stock Fraud
            7 2.2 Ponzi Scheme
            8 2.3 Bank Robbery
        9 References
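The preOrder algorithm translates almost line for line into Java. This is a minimal sketch: the Section class and the sampleDocument helper are hypothetical names, and the sample tree reproduces the structured-document example.

```java
import java.util.ArrayList;
import java.util.List;

final class PreorderDemo {
    static final class Section {
        final String title;
        final List<Section> children = new ArrayList<>();
        Section(String title, Section... kids) {
            this.title = title;
            for (Section k : kids) children.add(k);
        }
    }

    // Visit v first, then recurse into each child from left to right.
    static void preOrder(Section v, List<String> visited) {
        visited.add(v.title);
        for (Section w : v.children) preOrder(w, visited);
    }

    static Section sampleDocument() {
        return new Section("Make Money Fast!",
            new Section("1. Motivations",
                new Section("1.1 Greed"), new Section("1.2 Avidity")),
            new Section("2. Methods",
                new Section("2.1 Stock Fraud"), new Section("2.2 Ponzi Scheme"),
                new Section("2.3 Bank Robbery")),
            new Section("References"));
    }

    public static void main(String[] args) {
        List<String> visited = new ArrayList<>();
        preOrder(sampleDocument(), visited);
        visited.forEach(System.out::println); // titles in document order
    }
}
```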

Postorder Traversal
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

    Algorithm postOrder(v)
        for each child w of v
            postOrder(w)
        visit(v)

Example (the leading number gives the order in which each node is visited):

    9 cs16/
        3 homeworks/
            1 h1c.doc 3K
            2 h1nc.doc 2K
        7 programs/
            4 DDR.java 10K
            5 Stocks.java 25K
            6 Robot.java 20K
        8 todo.txt 1K
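The disk-usage application can be sketched as a postorder traversal that returns a total. The Entry class is hypothetical, and the sketch assumes directories themselves occupy 0K, since the example only gives sizes for files.

```java
import java.util.ArrayList;
import java.util.List;

final class DiskUsageDemo {
    static final class Entry {
        final String name;
        final int sizeK;                 // own size in K; 0 assumed for directories
        final List<Entry> children = new ArrayList<>();
        Entry(String name, int sizeK, Entry... kids) {
            this.name = name;
            this.sizeK = sizeK;
            for (Entry k : kids) children.add(k);
        }
    }

    // Postorder: total the children first, then "visit" v by adding its own size.
    static int diskUsage(Entry v) {
        int total = 0;
        for (Entry w : v.children) total += diskUsage(w);
        return total + v.sizeK;
    }

    static Entry sampleDirectory() {
        return new Entry("cs16/", 0,
            new Entry("homeworks/", 0,
                new Entry("h1c.doc", 3), new Entry("h1nc.doc", 2)),
            new Entry("programs/", 0,
                new Entry("DDR.java", 10), new Entry("Stocks.java", 25),
                new Entry("Robot.java", 20)),
            new Entry("todo.txt", 1));
    }

    public static void main(String[] args) {
        System.out.println(diskUsage(sampleDirectory()) + "K"); // prints 61K
    }
}
```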

Ordered Binary Trees
- A binary tree is a tree with the following properties:
  - Each internal node has at most two children (exactly two for proper binary trees)
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching

[Figure: binary tree with nodes A to I]

Arithmetic Expression Tree
Binary tree associated with an arithmetic expression
- internal nodes: operators
- external nodes: operands

Example: arithmetic expression tree for the expression (2 × (a - 1) + (3 × b))

    +
        ×
            2
            -
                a
                1
        ×
            3
            b

Decision Tree
Binary tree associated with a decision process
- internal nodes: questions with yes/no answer
- external nodes: decisions

Example: dining decision

    Want a fast meal?
        Yes: How about coffee?
            Yes: Starbucks
            No: Spike’s
        No: On expense account?
            Yes: Al Forno
            No: Café Paragon

Properties of Proper Binary Trees
Notation:
- n: number of nodes
- e: number of external nodes
- i: number of internal nodes
- h: height

Properties:
- e = i + 1
- n = 2e - 1
- h ≤ i
- h ≤ (n - 1)/2
- e ≤ 2^h
- h ≥ log2 e
- h ≥ log2 (n + 1) - 1

BinaryTree ADT
- The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT
- Additional methods:
  - position left(p)
  - position right(p)
  - boolean hasLeft(p)
  - boolean hasRight(p)
- Update methods may be defined by data structures implementing the BinaryTree ADT

Inorder Traversal
- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

    Algorithm inOrder(v)
        if hasLeft(v)
            inOrder(left(v))
        visit(v)
        if hasRight(v)
            inOrder(right(v))

[Figure: binary tree whose nodes are labelled 1 to 9 by inorder rank, with root 6]
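A small Java sketch of inOrder follows. The BNode class is hypothetical, and the sample tree is an assumed reading of the figure: root 6 with children 2 and 8, where 2 has children 1 and 4, 4 has children 3 and 5, and 8 has children 7 and 9.

```java
import java.util.ArrayList;
import java.util.List;

final class InorderDemo {
    static final class BNode {
        final int label;
        final BNode left, right;          // null when the child is absent
        BNode(int label, BNode left, BNode right) {
            this.label = label;
            this.left = left;
            this.right = right;
        }
    }

    static BNode leaf(int label) { return new BNode(label, null, null); }

    // Left subtree, then the node itself, then the right subtree.
    static void inOrder(BNode v, List<Integer> visited) {
        if (v.left != null) inOrder(v.left, visited);
        visited.add(v.label);
        if (v.right != null) inOrder(v.right, visited);
    }

    static BNode sampleTree() {
        return new BNode(6,
            new BNode(2, leaf(1), new BNode(4, leaf(3), leaf(5))),
            new BNode(8, leaf(7), leaf(9)));
    }

    public static void main(String[] args) {
        List<Integer> visited = new ArrayList<>();
        inOrder(sampleTree(), visited);
        System.out.println(visited); // prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
    }
}
```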

Print Arithmetic Expressions
Specialization of an inorder traversal
- print operand or operator when visiting node
- print “(” before traversing left subtree
- print “)” after traversing right subtree

    Algorithm printExpression(v)
        if hasLeft(v)
            print(“(”)
            printExpression(left(v))
        print(v.element())
        if hasRight(v)
            printExpression(right(v))
            print(“)”)

Example: for the expression tree with root +, the output is ((2 × (a - 1)) + (3 × b))
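The same algorithm in Java, as a sketch: XNode is a hypothetical node class, the output is collected in a StringBuilder rather than printed piece by piece, and * stands in for the multiplication sign.

```java
final class PrintExpressionDemo {
    static final class XNode {
        final String element;             // operator or operand text
        final XNode left, right;          // both null for an operand
        XNode(String element, XNode left, XNode right) {
            this.element = element;
            this.left = left;
            this.right = right;
        }
    }

    static XNode leaf(String element) { return new XNode(element, null, null); }

    // Inorder traversal that brackets every operator node.
    static void printExpression(XNode v, StringBuilder out) {
        if (v.left != null) {
            out.append('(');
            printExpression(v.left, out);
        }
        out.append(v.element);
        if (v.right != null) {
            printExpression(v.right, out);
            out.append(')');
        }
    }

    static XNode sampleTree() {
        XNode minus = new XNode("-", leaf("a"), leaf("1"));
        XNode leftTimes = new XNode("*", leaf("2"), minus);
        XNode rightTimes = new XNode("*", leaf("3"), leaf("b"));
        return new XNode("+", leftTimes, rightTimes);
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        printExpression(sampleTree(), out);
        System.out.println(out); // prints ((2*(a-1))+(3*b))
    }
}
```

The recursive calls must go to printExpression itself; calling a plain inorder traversal on the subtrees would lose the inner parentheses.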

Evaluate Arithmetic Expressions
Specialization of a postorder traversal
- recursive method returning the value of a subtree
- when visiting an internal node, combine the values of the subtrees

    Algorithm evalExpr(v)
        if isExternal(v)
            return v.element()
        else
            x ← evalExpr(leftChild(v))
            y ← evalExpr(rightChild(v))
            ◊ ← operator stored at v
            return x ◊ y

[Figure: expression tree for ((2 × (5 - 1)) + (3 × 2))]
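A Java sketch of evalExpr follows. ENode and its two factory methods are hypothetical, the supported operators are limited to + - * / for brevity, and the sample tree assumes the figure shows ((2 × (5 - 1)) + (3 × 2)).

```java
final class EvalDemo {
    static final class ENode {
        final char op;          // operator for internal nodes, unused for leaves
        final int value;        // operand for leaves, unused for internal nodes
        final ENode left, right;
        private ENode(char op, int value, ENode left, ENode right) {
            this.op = op;
            this.value = value;
            this.left = left;
            this.right = right;
        }
        static ENode operand(int value) { return new ENode(' ', value, null, null); }
        static ENode operator(char op, ENode left, ENode right) {
            return new ENode(op, 0, left, right);
        }
        boolean isExternal() { return left == null && right == null; }
    }

    // Postorder: evaluate both subtrees, then combine them at the node.
    static int evalExpr(ENode v) {
        if (v.isExternal()) return v.value;
        int x = evalExpr(v.left);
        int y = evalExpr(v.right);
        switch (v.op) {
            case '+': return x + y;
            case '-': return x - y;
            case '*': return x * y;
            case '/': return x / y;   // integer division
            default: throw new IllegalArgumentException("unknown operator " + v.op);
        }
    }

    static ENode sampleTree() {
        ENode minus = ENode.operator('-', ENode.operand(5), ENode.operand(1));
        ENode leftTimes = ENode.operator('*', ENode.operand(2), minus);
        ENode rightTimes = ENode.operator('*', ENode.operand(3), ENode.operand(2));
        return ENode.operator('+', leftTimes, rightTimes);
    }

    public static void main(String[] args) {
        System.out.println(evalExpr(sampleTree())); // prints 14
    }
}
```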

Euler Tour Traversal
- Generic traversal of a binary tree
- Includes as special cases the preorder, postorder and inorder traversals
- Walk around the tree and visit each internal node three times:
  - on the left (preorder)
  - from below (inorder)
  - on the right (postorder)

[Figure: Euler tour around the expression tree for ((2 × (5 - 1)) + (3 × 2)), with the left (L), below (B) and right (R) visits marked]

Linked Structure for Trees
A node is represented by an object storing
- Element
- Parent node
- Sequence of children nodes

[Figure: a tree with nodes A, B, C, D, E, F and the corresponding linked node objects]

Linked Structure for Binary Trees
A node is represented by an object storing
- Element
- Parent node
- Left child node
- Right child node

[Figure: a binary tree with nodes A, B, C, D, E and the corresponding linked node objects]

Array-Based Representation of Binary Trees
Nodes are stored in an array A
- Node v is stored at A[rank(v)]
  - rank(root) = 1
  - if node is the left child of parent(node), rank(node) = 2 × rank(parent(node))
  - if node is the right child of parent(node), rank(node) = 2 × rank(parent(node)) + 1

[Figure: binary tree with ranks A=1, B=2, D=3, E=4, F=5, C=6, J=7, G=10, H=11, and the array A holding A, B, D, …, G, H at indexes 1, 2, 3, …, 10, 11]
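The rank rules reduce to simple index arithmetic. In the sketch below the three helper methods are names chosen here; parentRank is not stated on the slide but follows from the two child rules by integer division. The array contents reproduce the ranks listed for the figure, with null marking ranks that hold no node.

```java
final class ArrayTreeDemo {
    static int leftRank(int rank)   { return 2 * rank; }
    static int rightRank(int rank)  { return 2 * rank + 1; }
    // Derived from the two rules above: both 2r and 2r+1 halve to r.
    static int parentRank(int rank) { return rank / 2; }

    static String[] sampleArray() {
        // Index 0 is unused because rank(root) = 1; null marks an absent node.
        String[] a = new String[12];
        a[1] = "A"; a[2] = "B"; a[3] = "D"; a[4] = "E"; a[5] = "F";
        a[6] = "C"; a[7] = "J"; a[10] = "G"; a[11] = "H";
        return a;
    }

    public static void main(String[] args) {
        String[] a = sampleArray();
        int f = 5; // rank of node F
        System.out.println("children of " + a[f] + ": "
            + a[leftRank(f)] + ", " + a[rightRank(f)]);
        System.out.println("parent of " + a[11] + ": " + a[parentRank(11)]);
    }
}
```

The price of this layout is unused slots (ranks 8 and 9 here) whenever the tree is not complete.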
