CSC 222: Computer Programming II, Spring 2004: Searching and efficiency; class design


1. CSC 222: Computer Programming II, Spring 2004
Searching and efficiency
  • sequential search
  • big-Oh, rate-of-growth
  • binary search
Class design
  • templated classes
  • inheritance, virtual, protected

2. Searching a list
suppose you have a list and want to find a particular item, e.g.,
  • look up a word in a dictionary
  • find a number in the phone book
  • locate a student's exam in a pile
searching is a common task in computing
  • searching a database
  • checking a login password
  • looking up the value assigned to a variable in memory
if the items in the list are unordered (e.g., added at random)
  • the desired item is equally likely to be at any point in the list
  • need to systematically search through the list, checking each entry until found
  • this approach is called sequential search

3. Sequential search
sequential search traverses the list from beginning to end
  • check each entry in the list
  • if it matches the desired entry, then FOUND
  • if the entire list is traversed with no match, then NOT FOUND

    bool IsStored(const vector<string> & words, string desired)
    // postcondition: returns true if desired in words, else false
    {
        for (int k = 0; k < words.size(); k++) {
            if (words[k] == desired) {
                return true;
            }
        }
        return false;
    }

recall the WordList class (from the word frequency program)
  • stored words and frequencies in parallel vectors
  • the Locate member function used sequential search to locate a word in the vector

4. How efficient is sequential search?
in the worst case:
  • the item you are looking for is in the last position of the list (or not found)
  • requires traversing and checking every item in the list
  • if 100 or 1,000 entries: NO BIG DEAL
  • if 10,000 or 100,000 entries: NOTICEABLE
for this algorithm, the dominant factor in execution time is checking an item
  • the number of checks will determine efficiency
in the average case? in the best case?
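One way to explore the best-case and average-case questions above is to count checks directly. The sketch below is illustrative only: CountChecks and AverageChecks are hypothetical helpers, not part of the course code, and the average is taken over successful searches on a list of distinct words.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (not from the slides): sequential search that
// returns how many entries were checked before the search stopped.
int CountChecks(const vector<string> & words, const string & desired)
{
    int checks = 0;
    for (size_t k = 0; k < words.size(); k++) {
        checks++;                      // one check per entry examined
        if (words[k] == desired) {
            break;                     // FOUND: stop early
        }
    }
    return checks;                     // equals words.size() if NOT FOUND
}

// Hypothetical helper: average number of checks when each stored word
// is searched for exactly once (assumes the words are distinct).
double AverageChecks(const vector<string> & words)
{
    long total = 0;
    for (size_t k = 0; k < words.size(); k++) {
        total += CountChecks(words, words[k]);
    }
    return words.empty() ? 0.0 : static_cast<double>(total) / words.size();
}
```

For a list of N distinct words this gives 1 check in the best case (item in the first position), N checks in the worst case, and (N+1)/2 checks on average for items that are present, which is still proportional to N.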

5. Big-Oh notation
to represent an algorithm's performance in relation to the size of the problem, computer scientists use Big-Oh notation
  • an algorithm is O(N) if the number of operations required to solve a problem is proportional to the size of the problem
sequential search on a list of N items requires roughly N checks (+ other constants), so it is O(N)
for an O(N) algorithm, doubling the size of the problem requires double the amount of work (in the worst case)
  • if it takes 1 second to search a list of 1,000 items, then
      it takes 2 seconds to search a list of 2,000 items
      it takes 4 seconds to search a list of 4,000 items
      it takes 8 seconds to search a list of 8,000 items
      ...

6. Searching an ordered list
when the list is unordered, can't do any better than sequential search
  • but, if the list is ordered, a better alternative exists
e.g., when looking up a word in the dictionary or a name in the phone book
  • can take ordering knowledge into account
  • pick a spot – if too far in the list, then go backward; if not far enough, go forward
binary search algorithm
  • check the midpoint of the list
  • if the desired item is found there, then DONE
  • if the item at the midpoint comes after the desired item in the ordering scheme, then repeat the process on the left half
  • if the item at the midpoint comes before the desired item in the ordering scheme, then repeat the process on the right half

7. Binary search
note: each check reduces the range in which the item can be found by half
  • for a demo, see www.creighton.edu/~davereed/csc107/search.html

    bool IsStored(const vector<string> & words, string desired)
    // precondition: words is sorted in ascending order
    // postcondition: returns true if desired in words, else false
    {
        int left = 0, right = words.size()-1;    // maintain bounds on where item must be
        while (left <= right) {
            int mid = (left+right)/2;
            if (desired == words[mid]) {         // if at midpoint, then DONE
                return true;
            }
            else if (desired < words[mid]) {     // if less than midpoint, focus on left half
                right = mid-1;
            }
            else {                               // otherwise, focus on right half
                left = mid + 1;
            }
        }
        return false;                            // if reduce to empty range, NOT FOUND
    }
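To see the halving in action, it can help to record which positions a search actually examines. The following is a sketch under stated assumptions: ProbedIndices is a hypothetical helper, not part of the course code, and it computes the midpoint as left + (right - left)/2, which yields the same index as (left+right)/2 here but cannot overflow on very large lists.

```cpp
#include <cassert>
#include <string>
#include <vector>
using namespace std;

// Hypothetical helper (not from the slides): runs binary search on a
// sorted vector and returns the sequence of indices that were checked.
vector<int> ProbedIndices(const vector<string> & words, const string & desired)
{
    vector<int> probes;
    int left = 0, right = static_cast<int>(words.size()) - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;   // midpoint of current range
        probes.push_back(mid);                 // record the check
        if (desired == words[mid]) {
            break;                             // FOUND at midpoint
        }
        else if (desired < words[mid]) {
            right = mid - 1;                   // continue in left half
        }
        else {
            left = mid + 1;                    // continue in right half
        }
    }
    return probes;
}
```

On the sorted list ant, bee, cat, dog, eel, fox, gnu (indices 0 to 6), searching for "fox" checks indices 3 then 5; searching for "ant" checks 3, 1, 0; and searching for the absent word "hen" checks 3, 5, 6 before the range becomes empty. No search on 7 items needs more than 3 checks.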

8. How efficient is binary search?
in the worst case:
  • the item you are looking for is in the first or last position of the list (or not found)
      start with N items in list
      after 1st check, reduced to N/2 items to search
      after 2nd check, reduced to N/4 items to search
      after 3rd check, reduced to N/8 items to search
      ...
      after ⌈log₂ N⌉ checks, reduced to 1 item to search
again, the dominant factor in execution time is checking an item
  • the number of checks will determine efficiency
in the average case? in the best case?

9. Big-Oh notation
an algorithm is O(log N) if the number of operations required to solve a problem is proportional to the logarithm of the size of the problem
binary search on a list of N items requires roughly log₂ N checks (+ other constants), so it is O(log N)
for an O(log N) algorithm, doubling the size of the problem adds only a constant amount of work
  • if it takes 1 second to search a list of 1,000 items, then
      searching a list of 2,000 items will take time for 1 check (the midpoint) + 1 second
      searching a list of 4,000 items will take time for 2 checks + 1 second
      searching a list of 8,000 items will take time for 3 checks + 1 second
      ...
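The "one extra check per doubling" claim can be checked empirically. The sketch below is illustrative: ChecksForMissingItem is a hypothetical helper that builds a sorted list of N integers and counts the checks binary search makes when looking for a value larger than every stored item, which is a worst case for the loop used in these slides.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helper (not from the slides): counts the checks made by
// binary search on the sorted list 0, 1, ..., n-1 when searching for a
// value that is larger than every stored item (a worst case).
int ChecksForMissingItem(int n)
{
    vector<int> items(n);
    for (int i = 0; i < n; i++) {
        items[i] = i;
    }
    int target = n;                            // not in the list
    int checks = 0;
    int left = 0, right = n - 1;
    while (left <= right) {
        int mid = left + (right - left) / 2;
        checks++;                              // one check of a midpoint
        if (target == items[mid]) {
            return checks;
        }
        else if (target < items[mid]) {
            right = mid - 1;
        }
        else {
            left = mid + 1;
        }
    }
    return checks;                             // range is empty: NOT FOUND
}
```

With this counting convention, a list of 1,000 items needs 10 checks, 2,000 items need 11, 4,000 need 12, and 8,000 need 13: each doubling of the list adds exactly one check.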

10. Comparison: searching a phone book
table columns: number of entries in phone book | number of checks performed by sequential search | number of checks performed by binary search
[table rows lost in extraction]
how many checks to search a phone book of the United States (~280 million) using binary search?
how many checks to search a phone book of the world (6 billion) using binary search?
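The comparison above can be recomputed rather than looked up. This is a minimal sketch, assuming the worst case for both algorithms and counting one check per item examined; SequentialWorstChecks and BinaryWorstChecks are hypothetical names, and long long is used because 6 billion does not fit in a 32-bit int.

```cpp
#include <cassert>
using namespace std;

// Hypothetical helper: worst-case checks for sequential search,
// which must examine every one of the n entries.
long long SequentialWorstChecks(long long n)
{
    return n;
}

// Hypothetical helper: worst-case checks for binary search, found by
// repeatedly halving the range until nothing is left to search.
// This equals floor(log2(n)) + 1 for n >= 1.
int BinaryWorstChecks(long long n)
{
    int checks = 0;
    while (n > 0) {
        n /= 2;        // each check discards (at least) half of the range
        checks++;
    }
    return checks;
}
```

Under these assumptions, 100 entries need at most 7 checks with binary search, 1,000,000 entries need at most 20, a phone book of about 280 million entries needs at most 29, and one of 6 billion entries needs at most 33, compared with up to 6 billion checks for sequential search.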

11. Searching example
for small N, the difference between O(N) and O(log N) may be negligible
consider the following large-scale application: a spell checker
  • want to read in and store a dictionary of words
  • then, process a text file one word at a time
      if a word is not found in the dictionary, report it as misspelled
  • since the dictionary file is large (~60,000 words), the difference is clear
we could define a Dictionary class to store & access words
  • constructor, Add, IsStored, NumStored, DisplayAll, …
better yet, define a generic List class to store & access any type of data
  • utilize template (as in vector) to allow for any type
      List<string> words;
      List<int> grades;

12. Templated classes

    #include <iostream>
    #include <vector>
    using namespace std;

    template <class ItemType> class List
    {
      public:
        List()
        // constructor, creates an empty list
        {
            // does nothing
        }

        void Add(const ItemType & item)
        // Results: adds item to the end of the list
        {
            items.push_back(item);
        }

        bool IsStored(const ItemType & item) const
        // Returns: true if item is stored in list, else false
        {
            for (int i = 0; i < items.size(); i++) {
                if (item == items[i]) {
                    return true;
                }
            }
            return false;
        }

        int NumItems() const
        // Returns: number of items in the list
        {
            return items.size();
        }

        void DisplayAll() const
        // Results: displays all items in the list, one per line
        {
            for (int i = 0; i < items.size(); i++) {
                cout << items[i] << endl;
            }
        }

      private:
        vector<ItemType> items;
    };

when defining a templated class
  • begin the class definition with template <class ItemType>, which gives a name to the arbitrary type involved
  • within the class definition, use ItemType for the type
  • when an actual instance is declared, e.g., List<string> words; the specified type will be substituted for ItemType
  • templated classes must be defined in one file: the member function definitions go in the header (here, List.h) rather than in a separate .cpp file
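The substitution of an actual type for ItemType can be seen by instantiating one template with two different types. The sketch below uses a deliberately cut-down class, MiniList, which is a hypothetical stand-in for the List class above (only Add, IsStored and NumItems), so that the example is self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
using namespace std;

// Hypothetical, cut-down stand-in for the List class in the slides.
template <class ItemType> class MiniList
{
  public:
    void Add(const ItemType & item)
    // Results: adds item to the end of the list
    {
        items.push_back(item);
    }

    bool IsStored(const ItemType & item) const
    // Returns: true if item is stored in list, else false
    {
        for (size_t i = 0; i < items.size(); i++) {
            if (item == items[i]) {
                return true;
            }
        }
        return false;
    }

    int NumItems() const
    // Returns: number of items in the list
    {
        return static_cast<int>(items.size());
    }

  private:
    vector<ItemType> items;   // ItemType is replaced at instantiation
};
```

Declaring MiniList<string> words; makes the compiler generate a version of the class in which items is a vector<string>, while MiniList<int> grades; generates a separate version that stores ints. The same source code serves both; the only requirement on ItemType here is that it supports ==.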

13. Spell checker (spell1.cpp)

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <cassert>
    #include <cctype>
    #include "List.h"
    using namespace std;

    const string DICTIONARY_FILE = "dict.txt";

    void OpenFile(ifstream & myin);
    void ReadDictionary(List<string> & dict);
    string Normalize(string word);

    int main()
    {
        List<string> dictionary;
        ReadDictionary(dictionary);

        ifstream textFile;
        OpenFile(textFile);

        cout << endl << "MISSPELLED WORDS" << endl << " " << endl;

        string word;
        while (textFile >> word) {
            word = Normalize(word);
            if (!dictionary.IsStored(word)) {
                cout << word << endl;
            }
        }
        return 0;
    }

    void OpenFile(ifstream & myin)
    // Results: myin is opened to the user's file
    {
        // AS BEFORE
    }

    void ReadDictionary(List<string> & dict)
    // Results: every word in the dictionary file is added to dict
    {
        ifstream myDict(DICTIONARY_FILE.c_str());
        assert(myDict);

        cout << "Please wait while file loads... ";
        string word;
        while (myDict >> word) {
            dict.Add(word);
        }
        myDict.close();
        cout << "DONE!" << endl << endl;
    }

    string Normalize(string word)
    // Returns: copy of word with punctuation removed and letters in lowercase
    {
        string copy;
        for (int i = 0; i < word.length(); i++) {
            if (!ispunct(word[i])) {
                copy += tolower(word[i]);
            }
        }
        return copy;
    }

14. In-class exercise
copy the following files: [file list not preserved]
create a project and spell check the Gettysburg address
  • is the delay noticeable?
now download the larger dictionary and spell check the Gettysburg address with that
  • doubledict.txt is twice as big (dummy value inserted between each word)
  • does it take roughly twice as long?

15. SortedList class

    template <class ItemType> class SortedList
    {
      public:
        SortedList() { /* DOES NOTHING */ }

        void Add(const ItemType & item)
        // Results: adds item to the list in order
        {
            items.push_back(item);           // make room at the end
            int i;
            for (i = items.size()-1; i > 0 && items[i-1] > item; i--) {
                items[i] = items[i-1];       // shift larger items one spot right
            }
            items[i] = item;                 // store item in the gap
        }

        bool IsStored(const ItemType & item) const
        // Returns: true if item is stored in list, else false
        {
            int left = 0, right = items.size()-1;
            while (left <= right) {
                int mid = (left+right)/2;
                if (item == items[mid]) {
                    return true;
                }
                else if (item < items[mid]) {
                    right = mid-1;
                }
                else {
                    left = mid + 1;
                }
            }
            return false;
        }

        int NumItems() const { /* AS BEFORE */ }
        void DisplayAll() const { /* AS BEFORE */ }

      private:
        vector<ItemType> items;              /* AS BEFORE */
    };

could define another class to represent sorted lists
  • Add must add items in sorted order
  • IsStored can use binary search
  • data field and other member functions are identical to List
code duplication is troubling…
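The shifting loop in Add is easier to follow on a small example. The sketch below pulls the same logic into a free function on a vector<int>; InsertInOrder is a hypothetical helper, and the returned shift count is an addition for illustration, not something the SortedList class reports.

```cpp
#include <cassert>
#include <vector>
using namespace std;

// Hypothetical helper (not from the slides): inserts item into an
// already-sorted vector using the same shifting loop as SortedList::Add,
// and returns how many existing items had to be shifted right.
int InsertInOrder(vector<int> & items, int item)
{
    items.push_back(item);                       // make room at the end
    int shifts = 0;
    int i;
    for (i = static_cast<int>(items.size()) - 1; i > 0 && items[i-1] > item; i--) {
        items[i] = items[i-1];                   // move larger item right
        shifts++;
    }
    items[i] = item;                             // drop item into the gap
    return shifts;
}
```

Adding 5, 2, 8, 1 in that order to an empty vector produces 1, 2, 5, 8, with 0, 1, 0 and 3 shifts respectively. In the worst case (a new smallest item) every stored item is shifted, so a single Add is O(N) even though IsStored is O(log N); the trade-off pays off when lookups greatly outnumber insertions, as in the spell checker.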

16. Inheritance
C++ provides a better mechanism: inheritance
  • if you have an existing class and want to define a new class that extends it, you can derive a new class from the existing (parent) class
  • a derived class inherits all data fields and member functions from its parent
  • can add new member functions (or reimplement existing ones) in the derived class
  • inheritance defines an "IS A" relationship: an instance of the derived class IS AN instance of the parent class, just more specific
example: List → SortedList
  • List & SortedList have the same data field (vector of ItemType)
  • constructor, NumItems, DisplayAll are identical
  • Add & IsStored differ (List: add at end, sequential search; SortedList: add in order, binary search)
  • a SortedList IS A List

17. Inheritance

    template <class ItemType> class SortedList : public List<ItemType>
    {
      public:
        SortedList()
        // constructor (note: List constructor implicitly called first)
        {
            // does nothing
        }

        void Add(const ItemType & item)
        // Results: adds item to the list in order
        {
            // this-> is needed to reach members inherited from a templated parent
            this->items.push_back(item);
            int i;
            for (i = this->items.size()-1; i > 0 && this->items[i-1] > item; i--) {
                this->items[i] = this->items[i-1];
            }
            this->items[i] = item;
        }

        bool IsStored(const ItemType & item) const
        // Returns: true if item is stored in list, else false
        {
            int left = 0, right = this->items.size()-1;
            while (left <= right) {
                int mid = (left+right)/2;
                if (item == this->items[mid]) {
                    return true;
                }
                else if (item < this->items[mid]) {
                    right = mid-1;
                }
                else {
                    left = mid + 1;
                }
            }
            return false;
        }
    };

to define a derived class
  • at the end of the first line of the class definition, add : public ParentClass, which specifies the parent from which to inherit
  • then, only define new data/functions, or functions being overridden
  • the parent class constructor is automatically called when constructing an object of the derived class (but the derived class should always have a constructor, even if it does nothing)
  • because the parent here is itself templated, standard-conforming compilers require inherited members to be written as this->items rather than plain items

18. Inheritance vs. reimplementation
using inheritance is preferable to reimplementing the class from scratch
  • derived class is simpler/cleaner
  • no duplication of code
  • if the implementation of code in the parent class changes, the derived class is automatically updated
  • an object of the derived class is still considered to be an object of the parent class
      e.g., could pass a SortedList to a function expecting a List
the inheritance approach does require some modifications to the List class
  • private data is hidden from the derived class as well as the client program
      instead, use a 3rd protection level: protected
      protected data is accessible to derived classes, but not to the client program
  • member functions to be overridden should be declared virtual in the parent class
      necessary if you want to have a function that works on both List and SortedList

19. Parent List class

    template <class ItemType> class List
    {
      public:
        List()
        // constructor, creates an empty list
        {
            // does nothing
        }

        virtual void Add(const ItemType & item)
        // Results: adds item to the end of the list
        {
            items.push_back(item);
        }

        virtual bool IsStored(const ItemType & item) const
        // Returns: true if item is stored in list, else false
        {
            for (int i = 0; i < items.size(); i++) {
                if (item == items[i]) {
                    return true;
                }
            }
            return false;
        }

        int NumItems() const
        // Returns: number of items in the list
        {
            return items.size();
        }

        void DisplayAll() const
        // Results: displays all items in the list, one per line
        {
            for (int i = 0; i < items.size(); i++) {
                cout << items[i] << endl;
            }
        }

      protected:
        vector<ItemType> items;
    };

protected specifies that the vector will be accessible to the SortedList class
virtual specifies that Add and IsStored can be overridden and handled correctly
  • if a SortedList is passed to a function expecting a List, it should still be recognized as a SortedList
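The effect of virtual can be demonstrated with a function that only knows about the parent type. The following is a trimmed sketch, not the full classes from the slides: First() is a hypothetical accessor added so the result can be inspected, the virtual destructor is an addition that the slides do not show, and AddAll is a hypothetical client function.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
using namespace std;

// Trimmed-down parent class (sketch only).
template <class ItemType> class List
{
  public:
    virtual ~List() { }                            // assumption: not in the slides

    virtual void Add(const ItemType & item)        // adds at the end
    {
        items.push_back(item);
    }

    ItemType First() const                         // hypothetical accessor;
    {                                              // requires a non-empty list
        return items.front();
    }

  protected:
    vector<ItemType> items;                        // visible to derived classes
};

// Trimmed-down derived class: only Add is overridden.
template <class ItemType> class SortedList : public List<ItemType>
{
  public:
    virtual void Add(const ItemType & item)        // adds in sorted order
    {
        this->items.push_back(item);
        int i;
        for (i = static_cast<int>(this->items.size()) - 1;
             i > 0 && this->items[i-1] > item; i--) {
            this->items[i] = this->items[i-1];
        }
        this->items[i] = item;
    }
};

// Hypothetical client function written against the parent type only.
void AddAll(List<string> & list, const vector<string> & words)
{
    for (size_t k = 0; k < words.size(); k++) {
        list.Add(words[k]);                        // virtual: picks the right Add
    }
}
```

Passing a List<string> to AddAll with the words pear, apple, fig leaves "pear" first, while passing a SortedList<string> leaves "apple" first, because the call list.Add(...) is resolved at run time to the override in the object's actual class. Without virtual on Add, both calls would use List's version and the SortedList would end up unsorted.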

20. Spell checker (spell2.cpp)

    #include <iostream>
    #include <fstream>
    #include <string>
    #include <cassert>
    #include <cctype>
    #include "SortedList.h"
    using namespace std;

    const string DICTIONARY_FILE = "dict.txt";

    void OpenFile(ifstream & myin);
    void ReadDictionary(List<string> & dict);
    string Normalize(string word);

    int main()
    {
        SortedList<string> dictionary;
        ReadDictionary(dictionary);      // a SortedList IS A List

        ifstream textFile;
        OpenFile(textFile);

        cout << endl << "MISSPELLED WORDS" << endl << " " << endl;

        string word;
        while (textFile >> word) {
            word = Normalize(word);
            if (!dictionary.IsStored(word)) {
                cout << word << endl;
            }
        }
        return 0;
    }

    void OpenFile(ifstream & myin)
    // Results: myin is opened to the user's file
    {
        // AS BEFORE
    }

    void ReadDictionary(List<string> & dict)
    // Results: every word in the dictionary file is added to dict
    {
        ifstream myDict(DICTIONARY_FILE.c_str());
        assert(myDict);

        cout << "Please wait while file loads... ";
        string word;
        while (myDict >> word) {
            dict.Add(word);              // virtual call: uses SortedList's Add
        }
        myDict.close();
        cout << "DONE!" << endl << endl;
    }

    string Normalize(string word)
    // Returns: copy of word with punctuation removed and letters in lowercase
    {
        string copy;
        for (int i = 0; i < word.length(); i++) {
            if (!ispunct(word[i])) {
                copy += tolower(word[i]);
            }
        }
        return copy;
    }

21. In-class exercise
copy the following files: [file list not preserved]
create a project and spell check the Gettysburg address
  • is it noticeably faster than sequential search?
now download the larger dictionary and spell check the Gettysburg address with that
  • doubledict.txt is twice as big (dummy value inserted between each word)
  • how much longer does it take?