Chapter 6 Using Library Algorithms

Objectives Explore the use of predefined library functions that are designed to be used with sequential container classes and strings. Examine prefix and postfix increment notation. Develop an efficient method for deleting entries from a vector.

C++ “For a long time it puzzled me how something so expensive, so leading edge, could be so useless. And then it occurred to me that a computer is a stupid machine with the ability to do incredibly smart things, while computer programmers are smart people with the ability to do incredibly stupid things. They are, in short, a perfect match.” – Bill Bryson “C++ is more of a rube-goldberg type thing full of high-voltages, large chain-driven gears, sharp edges, exploding widgets, and spots to get your fingers crushed. And because of its complexity many (if not most) of its users don’t know how it works, and can’t tell ahead of time what’s going to cause them to lose an arm.” – Grant Edwards

Generic Algorithms Including <algorithm> gives you access to a large number of generic algorithms that can be used across many of the container classes. In the last section we concatenated two vectors with a loop. for (vector<string>::const_iterator it = bottom.begin(); it != bottom.end(); ++it) ret.push_back(*it); This appends the entries of bottom onto the end of ret.

Copy There is a generic algorithm that does the same thing. copy(bottom.begin(), bottom.end(), back_inserter(ret)); Notice that generic algorithms work with iterators rather than indexes because not all container classes support indexing. This call copies all the entries in the range [bottom.begin(), bottom.end()). back_inserter(ret) produces an iterator that can be used to append entries onto ret.

Copy back_inserter is an example of an iterator adaptor, a function that produces an iterator with special properties. It is important that we use back_inserter(ret) instead of ret.end(). – ret.end() is one past the last element in ret. – There is no element at that position, so we cannot write new entries there. What if the memory is already being used for something else?
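The concatenation described above can be sketched as a small helper. The name append_all is hypothetical and chosen only for this illustration; copy and back_inserter are the standard library facilities from <algorithm> and <iterator>.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper: append every element of bottom to the end of ret.
// back_inserter(ret) turns each assignment into ret.push_back(value),
// so ret grows as needed. Writing through ret.end() instead would be
// undefined behavior because no element exists at that position.
void append_all(std::vector<std::string>& ret,
                const std::vector<std::string>& bottom)
{
    std::copy(bottom.begin(), bottom.end(), std::back_inserter(ret));
}
```

After the call, ret holds its original entries followed by a copy of every entry of bottom, and bottom itself is unchanged.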

Copy copy(begin, end, out); This has the same effect as while (begin != end){ *out = *begin; out++; begin++; } Using a little notational magic we can shorten this to while (begin != end) *out++ = *begin++;

Postfix Increment Operator Consider the line *out++ = *begin++; Both out++ and ++out increment the iterator; the difference between them is their return value. it = begin++; This is equivalent to it = begin; ++begin; In this case it gets the value of begin before it is incremented.

Prefix Increment Operator it = ++begin; This is equivalent to ++begin; it = begin; In this case it gets the value of begin after it is incremented. One way to remember this: when the ++ is on the left (before the operand), the increment is done before the value is returned. When the ++ is on the right (after the operand), the increment is done after the value is taken.
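A minimal sketch of the difference in return value, using vector iterators. The function names take_postfix and take_prefix are made up for this example.

```cpp
#include <vector>

typedef std::vector<int>::const_iterator iter;

// Postfix: the caller receives the OLD position; begin has moved on by one.
iter take_postfix(iter& begin) { return begin++; }

// Prefix: begin moves first; the caller receives the NEW position.
iter take_prefix(iter& begin) { return ++begin; }
```

In both cases begin ends up one position further along; only the value handed back differs.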

Increment Operator *out++ = *begin++; This line performs *out = *begin and then increments both out and begin. The name C++ is a joke about incrementing C. C++ programmers often combine several logical operations together like this. It is fine if you are comfortable with it, but don’t be so clever that you confuse yourself. There is really no penalty for using three lines of code instead of one, if that makes the problem easier to think about.
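To see the one-line loop in action, here is a sketch of a copy-like function template. The names copy_sketch and duplicate are hypothetical; the real library algorithm is std::copy, and this is not its actual implementation.

```cpp
#include <iterator>
#include <vector>

// A copy-like loop: assign through out, then advance both iterators.
// Returns the output iterator positioned after the last element written.
template <class In, class Out>
Out copy_sketch(In begin, In end, Out out)
{
    while (begin != end)
        *out++ = *begin++;
    return out;
}

// Usage with back_inserter: build a second vector holding the same values.
std::vector<int> duplicate(const std::vector<int>& v)
{
    std::vector<int> result;
    copy_sketch(v.begin(), v.end(), std::back_inserter(result));
    return result;
}
```

Because the loop is written only in terms of *, ++, and !=, it works with plain pointers as well as with container iterators.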

Split Revisited Another place we could use generic functions to simplify our code would be the split program from the last chapter. A big part of that problem was keeping track of i and j which we use to delimit each word in the input. We can use the generic function find_if to do the searching for us. find_if(begin, end, predicate) This searches in the range [begin, end) for an entry where the predicate is true. The return value is an iterator for the first such entry. If there is no such entry then the return value is end.
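As a small illustration of find_if on its own, the sketch below locates the first digit in a string. The helper names is_digit and first_digit_pos are assumptions made for this example, and the cast to unsigned char is a precaution that the slides do not use.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Predicate: true if c is a decimal digit.
bool is_digit(char c)
{
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

// Index of the first digit in s, or s.size() if s contains no digit.
std::string::size_type first_digit_pos(const std::string& s)
{
    std::string::const_iterator it = std::find_if(s.begin(), s.end(), is_digit);
    return it - s.begin();   // it == s.end() when nothing matched
}
```

When no entry satisfies the predicate, find_if returns its second argument, which is why the "not found" result here equals s.size().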

Split Revisited Here is the code for the new split function. vector<string> split(const string& str) { typedef string::const_iterator iter; vector<string> ret; iter i = str.begin(); while (i != str.end()) { i = find_if(i, str.end(), not_space); iter j = find_if(i, str.end(), space); if (i != str.end()) ret.push_back(string(i, j)); i = j; } return ret; }

Split Revisited This is essentially the same algorithm as before. Instead of indexes, i and j are now iterators. We use a typedef to create a synonym for string::const_iterator. It would be nice to use isspace as the predicate, but it is a heavily overloaded function (languages other than English, etc.), so there might be confusion about which version to use. We need a predicate for “not a space” too. Let’s create our own predicate functions.

Split Revisited // `true' if the argument is whitespace, `false' otherwise bool space(char c) { return isspace(c); } // `false' if the argument is whitespace, `true' otherwise bool not_space(char c) { return !isspace(c); }

Split Revisited Previously we used s.substr(i, j - i) to create a substring. In that function i and j are indexes. Since we are working with iterators, we will use the constructor string(i, j), which works with iterators and copies all the entries in the range [i, j) into a new string. Notice that with this version we do not need to worry about i and j walking off the end of the string. The library algorithms take care of this for us.
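Putting the pieces of split together gives the following self-contained sketch. It follows the slides' code; the std:: qualifications and the cast to unsigned char before calling isspace are additions made here so that the file compiles on its own and behaves predictably for characters outside the basic range.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// true if the argument is whitespace, false otherwise
bool space(char c)
{
    return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// false if the argument is whitespace, true otherwise
bool not_space(char c)
{
    return !space(c);
}

std::vector<std::string> split(const std::string& str)
{
    typedef std::string::const_iterator iter;
    std::vector<std::string> ret;

    iter i = str.begin();
    while (i != str.end()) {
        // skip leading whitespace
        i = std::find_if(i, str.end(), not_space);
        // find the end of the next word
        iter j = std::find_if(i, str.end(), space);
        // copy the characters in [i, j) if a word was found
        if (i != str.end())
            ret.push_back(std::string(i, j));
        i = j;
    }
    return ret;
}
```

For an input such as "  the quick  fox " this yields the three words the, quick, and fox, with all runs of whitespace discarded.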

Palindromes We can use the rbegin iterator to march backward through a structure. The following function uses the equal library function to check for palindromes. bool is_palindrome(const string& s) { return equal(s.begin(), s.end(), s.rbegin()); }

equal equal(first.begin(), first.end(), second.begin()) compares two sequences. – The first one is the range [first.begin(), first.end()). – The second begins at second.begin(). – The second sequence is assumed to be at least as long as the first. Since we used s.rbegin() in place of second.begin(), we will be marching backward through the same string and comparing it to itself.
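A compilable version of the palindrome check, written with explicit std:: qualification. Note, as an observation about this sketch rather than something the slides state, that the comparison is exact: case, spaces, and punctuation all count.

```cpp
#include <algorithm>
#include <string>

// Compare the string read forward with the same string read backward.
bool is_palindrome(const std::string& s)
{
    return std::equal(s.begin(), s.end(), s.rbegin());
}
```

An empty string is reported as a palindrome because equal returns true for an empty range.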

Finding URLs Let’s create a function that does something very useful. Let’s find URLs embedded in a string. A URL has the form protocol-name://resource-name We can scan our string looking for :// with a “word” both before and after it. – Protocol-name has only letters. – Resource-name has letters, some punctuation, and numbers. We will, of course, use library functions to help.

Finding URLs There may be several URLs, so let’s return a vector of strings where each string is a URL. Start by creating ret, which will be the vector we return. Next, create iterators for the beginning and end of the string. The ending iterator is fixed, but we will walk the beginning iterator along the entire string until it reaches the end. When we find the substring “://” we will back up to locate the beginning of the protocol-name. Next, we create a new iterator, after, which we will use to indicate the end of the resource-name. The entire URL is then copied to the vector and the process repeats.

Finding URLs vector<string> find_urls(const string& s) { vector<string> ret; typedef string::const_iterator iter; iter b = s.begin(), e = s.end(); while (b != e) { b = url_beg(b, e); if (b != e) { iter after = url_end(b, e); ret.push_back(string(b, after)); b = after; } } return ret; }

Finding URLs Notice we moved much of the work to helper functions. – url_beg – finds the string “://” and returns an iterator associated with the first character of the protocol-name. – url_end – returns an iterator associated with the first character after the resource-name. – not_url_char – a predicate to determine if a character could not be part of a URL. The first one is the most challenging so we will leave it for last.

url_end string::const_iterator url_end(string::const_iterator b, string::const_iterator e) { return find_if(b, e, not_url_char); } This just uses find_if to find the first character that could not possibly be part of a URL. The return value is an iterator for this character because it is located one character past the end of our URL. This uses the predicate not_url_char we still need to define.

not_url_char bool not_url_char(char c) { static const string url_ch = "~;/?:@=&$-_.+!*'(),"; return !(isalnum(c) || find(url_ch.begin(), url_ch.end(), c) != url_ch.end()); } This accepts a character and checks whether it is an alphanumeric character or is in the string of other characters that could be in a URL. If it is neither, the function returns true. It uses find, which is similar to find_if except that it searches for a specific value (instead of using a predicate).

Static Variables Notice that the type of url_ch is static const string. We will use this same string the next time we call this function. static is a storage class specifier. Since this string is static it will be created the first time the function is called and will exist across repeated invocations of the function. This will save a little time. static variables can also be used to pass information from one invocation of the function to the next.
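The last point, passing information from one invocation to the next, can be sketched with a counter. The function call_count is a made-up example and does not appear in the chapter's programs.

```cpp
// Returns how many times this function has been called so far.
int call_count()
{
    static int count = 0;   // initialized once, the first time control reaches here
    return ++count;         // the updated value survives until the next call
}
```

Without the static keyword, count would be re-created and reset to 0 on every call, and the function would always return 1.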

url_beg url_beg will accept two iterators as parameters, marking the beginning and end of the string to search. The return value will be an iterator indicating the beginning of a URL if there is one and the end of the string if there is not. We will search the string for the URL separator “://” and indicate its beginning location with the iterator i. We will then make sure the separator is not at the beginning or end of the string, because in that case it couldn’t be part of a URL.

url_beg Next, we walk backward from the separator to find the first character that is not a letter, since the protocol-name has only letters. – The first character after this would be the beginning of our URL. – Create iterator beg to indicate this location. We need to make sure we actually have one or more characters in both the protocol-name and the resource-name. If all this lines up we return beg as the beginning of a URL. If not, we move past the separator and search again.

url_beg string::const_iterator url_beg(string::const_iterator b, string::const_iterator e) { static const string sep = "://"; typedef string::const_iterator iter; iter i = b; while ((i = search(i, e, sep.begin(), sep.end())) != e) { if (i != b && i + sep.size() != e) { iter beg = i; while (beg != b && isalpha(beg[-1])) --beg; if (beg != i && !not_url_char(i[sep.size()])) return beg; } i += sep.size(); } return e; }

search This function makes use of the library function search, which accepts two pairs of iterators that denote two sequences. – The first pair denotes the string we want to search in. – The second pair denotes the string we are searching for. – If search fails it returns the ending iterator of the first pair.
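A minimal sketch of calling search directly, with the range to be searched given first and the sequence to look for given second. The wrapper name find_offset is hypothetical.

```cpp
#include <algorithm>
#include <string>

// Offset of the first occurrence of pattern in text,
// or text.size() if pattern does not occur.
std::string::size_type find_offset(const std::string& text,
                                   const std::string& pattern)
{
    std::string::const_iterator pos =
        std::search(text.begin(), text.end(),        // where to look
                    pattern.begin(), pattern.end()); // what to look for
    return pos - text.begin();
}
```

On failure search returns text.end(), so the "not found" result is text.size(), mirroring how url_beg compares the result of search against e.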

Iterator Magic Since strings support indexing, so do their iterators. – beg[-1] is the character before beg. – This is the same as *(beg - 1). – i[sep.size()] is the character immediately after the separator (if it exists). – This is the same as *(i + sep.size()). The decrement operator can be used with iterators. – --beg means move the iterator beg back one space in the string. – Like the increment operator, we can use prefix or postfix notation with similar results.
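The URL-finding functions from the preceding slides can be assembled into one file as sketched below. The logic follows the slides; the std:: qualifications, the file-level typedef, and the casts to unsigned char before calling isalnum and isalpha are additions made here.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

typedef std::string::const_iterator iter;

// true if c cannot appear in a URL
bool not_url_char(char c)
{
    static const std::string url_ch = "~;/?:@=&$-_.+!*'(),";
    return !(std::isalnum(static_cast<unsigned char>(c)) ||
             std::find(url_ch.begin(), url_ch.end(), c) != url_ch.end());
}

// first position at or after b that cannot be part of a URL
iter url_end(iter b, iter e)
{
    return std::find_if(b, e, not_url_char);
}

// start of the first URL in [b, e), or e if there is none
iter url_beg(iter b, iter e)
{
    static const std::string sep = "://";
    iter i = b;
    while ((i = std::search(i, e, sep.begin(), sep.end())) != e) {
        // the separator must not be at the very start or very end
        if (i != b && i + sep.size() != e) {
            // walk back over the letters of the protocol-name
            iter beg = i;
            while (beg != b && std::isalpha(static_cast<unsigned char>(beg[-1])))
                --beg;
            // need at least one character before and after the separator
            if (beg != i && !not_url_char(i[sep.size()]))
                return beg;
        }
        // not a URL here; continue searching after this separator
        i += sep.size();
    }
    return e;
}

std::vector<std::string> find_urls(const std::string& s)
{
    std::vector<std::string> ret;
    iter b = s.begin(), e = s.end();
    while (b != e) {
        b = url_beg(b, e);
        if (b != e) {
            iter after = url_end(b, e);
            ret.push_back(std::string(b, after));
            b = after;
        }
    }
    return ret;
}
```

Note that trailing punctuation such as a period or comma is in the allowed character set, so a URL at the end of a sentence will include the final period.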

Back to Grading Notice that using the median to determine the homework grade may not be the best method. – Any score below the median counts the same, even 0’s. Most professors use the mean. Should we even count 0’s in the mean and median? Does the choice of grading scheme affect the grade?

Back to Grading Let’s write a program that will do the following. First, read in all the student records. Separate the students who did all the homework from the others. Apply the two grading schemes (median and mean) to both groups. Report the median grade for each group.

Back to Grading We already have code to read in the student grades. We need a predicate for determining which students did all the homework. bool did_all_hw(const Student_info& s) { return ((find(s.homework.begin(), s.homework.end(), 0)) == s.homework.end()); }
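To try the predicate in isolation, the sketch below pairs it with a minimal Student_info. The struct layout is an assumption that includes only the members this chapter refers to (midterm, final, homework) plus a name; the real definition comes from the earlier chapters.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Assumed layout, for illustration only.
struct Student_info {
    std::string name;
    double midterm, final;
    std::vector<double> homework;
};

// true if no homework score is 0
bool did_all_hw(const Student_info& s)
{
    return std::find(s.homework.begin(), s.homework.end(), 0)
           == s.homework.end();
}
```

A student with an empty homework vector also passes this test, since find has nothing to find; whether that is the desired behavior depends on how the records are read.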

main The main function will read student records and place them in one of two vectors: did for those with no 0’s and didnt for those with one or more 0’s. Next we check to see if either vector is empty and return with an error if this is the case, because our analysis would be meaningless.

main vector<Student_info> did, didnt; cout << "Input: "; Student_info student; while (read(cin, student)) { if (did_all_hw(student)) did.push_back(student); else didnt.push_back(student); } if (did.empty()) { cout << "No student did all the homework!" << endl; return 1; } if (didnt.empty()) { cout << "Every student did all the homework!" << endl; return 1; }

empty The only new function here is the empty method from the vector class. We use this to determine if either vector is empty. We could compare the result of the size method to 0, but for some containers this might take longer. It is generally a good idea to use the function or method that does what you need but no more. In many cases it will be faster.

Analysis We want to do three analyses. – Compare the medians of each group. – Compare the means of each group. – Compare the medians of each group where we only count assignments turned in (omit 0’s entirely). The output will be similar in each case so let’s create a common output function and pass it the name of the appropriate function to do the analysis.

Analysis Let’s create a function with 5 arguments. – The stream on which to write. – A string that will be used as a label in the output. – The function to use for the analysis. – The two vectors to analyze. write_analysis(cout, "median", median_analysis, did, didnt);

Median Analysis So, we have a list of students and we want to find the grade for each student, and then find the median of all the grades. We could do this with a loop and a little code, but there is a library function called transform that will process all the entries between two iterators using the function of our choice. The results are stored in another container, so we must provide an iterator into this new container.

Median Analysis double median_analysis(const vector<Student_info>& students) { vector<double> grades; transform(students.begin(), students.end(), back_inserter(grades), grade_aux); return median(grades); }

Transform transform(begin, end, result, function) transform runs through all the entries in the range [begin, end). function is applied to each of them. The results are stored in the container associated with result. You must be sure it is OK to write the results to the location associated with result. This is why it is good to use back_inserter.
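A small sketch of transform with back_inserter, independent of the grading code. The functions square and squares are invented for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

double square(double x) { return x * x; }

// Apply square to every entry of v and collect the results in a new vector.
std::vector<double> squares(const std::vector<double>& v)
{
    std::vector<double> out;
    // out starts empty, so back_inserter is needed to create room
    // for each result as it is produced.
    std::transform(v.begin(), v.end(), std::back_inserter(out), square);
    return out;
}
```

The input vector is left untouched; transform only reads from the range [begin, end) and writes through the result iterator.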

Calculating grades Notice we used a new function grade_aux to calculate the grades. We already have a grade function, but it could throw an exception if the homework list for some student were empty. Also, we have several grade functions with different signatures, and the compiler doesn’t know which is the right one to pass to transform. We need to write a wrapper for the grade function.

grade_aux double grade_aux(const Student_info& s) { try { return grade(s); } catch (domain_error) { return grade(s.midterm, s.final, 0); } } This handles any exceptions and makes it clear which version of grade to use. Students who do no homework get a 0.

write_analysis void write_analysis(ostream& out, const string& name, double analysis(const vector<Student_info>&), const vector<Student_info>& did, const vector<Student_info>& didnt) { out << name << ": median(did) = " << analysis(did) << ", median(didnt) = " << analysis(didnt) << endl; }

Passing a Function as a Parameter When you pass a function as a parameter, the parameter definition looks just like a function declaration. In this case the type of the parameter analysis is double analysis(const vector<Student_info>&) It is a function that accepts a constant reference to a vector of Student_info records and returns a double.
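The same mechanism can be sketched with plain vectors of doubles. The names total, how_many, and difference are hypothetical; only the shape of the parameter declaration matters here.

```cpp
#include <numeric>
#include <vector>

double total(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0.0);
}

double how_many(const std::vector<double>& v)
{
    return static_cast<double>(v.size());
}

// The first parameter is declared like a function. The compiler treats it
// as a pointer to a function with that signature, so any function taking
// a const vector<double>& and returning double can be passed in.
double difference(double analysis(const std::vector<double>&),
                  const std::vector<double>& a,
                  const std::vector<double>& b)
{
    return analysis(a) - analysis(b);
}
```

Calling difference(total, a, b) and difference(how_many, a, b) runs the same comparison code with two different analyses, which is the role write_analysis plays for the grading schemes.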

void Return Type This function does not need to return anything to where it was called. Its only job is to print output. In this case there is a specific return type called void. You can end a void function in two ways. – A return statement with no value. – Falling off the end of the function by encountering the closing }.

Completing main The last few lines of main are the following. write_analysis(cout, "median", median_analysis, did, didnt); write_analysis(cout, "average", average_analysis, did, didnt); write_analysis(cout, "median of homework turned in", optimistic_median_analysis, did, didnt); system("pause"); return 0;

Average First we need a function to compute the average (mean) of a vector. We could easily do this with a for loop, but there is yet another library function that can help. double average(const vector<double>& v) { return accumulate(v.begin(), v.end(), 0.0) / v.size(); } accumulate adds up the entries between begin and end, beginning with the starting value 0.0. You need to include the header <numeric> to use this function.
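The sketch below shows why the starting value is written as 0.0 rather than 0. The empty-vector check and the function truncating_sum are additions for illustration and are not part of the slides' version of average.

```cpp
#include <numeric>
#include <stdexcept>
#include <vector>

// Mean of v. The check for an empty vector is an addition: without it,
// an empty vector leads to a division of 0.0 by 0.
double average(const std::vector<double>& v)
{
    if (v.empty())
        throw std::domain_error("average of an empty vector");
    return std::accumulate(v.begin(), v.end(), 0.0) / v.size();
}

// For contrast: with the int starting value 0, the running sum is an int,
// so the fractional part is discarded after every addition.
int truncating_sum(const std::vector<double>& v)
{
    return std::accumulate(v.begin(), v.end(), 0);
}
```

The type of the third argument determines the type of the running total, which is why a vector holding three copies of 0.5 has an average of 0.5 but a truncating_sum of 0.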

average_analysis We now need a wrapper for the grade function with average in place of median. double average_grade(const Student_info& s) { return grade(s.midterm, s.final, average(s.homework)); } Once we have this we can write average_analysis to be similar to median_analysis. double average_analysis(const vector<Student_info>& students) { vector<double> grades; transform(students.begin(), students.end(), back_inserter(grades), average_grade); return median(grades); }

optimistic_median Finally, we need to compute the homework grade where only the completed homework (non-zero) is used to compute the median homework grade. double optimistic_median(const Student_info& s) { vector<double> nonzero; remove_copy(s.homework.begin(), s.homework.end(), back_inserter(nonzero), 0); if (nonzero.empty()) return grade(s.midterm, s.final, 0); else return grade(s.midterm, s.final, median(nonzero)); }

optimistic_median All the 0’s are removed from the list and then the median is computed. The library function remove_copy is helpful here. remove_copy(begin, end, copy, item) This function runs in the range [begin, end) and copies all the entries that are not equal to item. The copies are placed in the location associated with the iterator copy. Again, we need to be careful here and use back_inserter.
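A minimal sketch of remove_copy on its own. The function name without_zeros is hypothetical.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Copy every score except the zeros into a new vector.
// Despite its name, remove_copy does not modify the input range.
std::vector<double> without_zeros(const std::vector<double>& hw)
{
    std::vector<double> nonzero;
    std::remove_copy(hw.begin(), hw.end(), std::back_inserter(nonzero), 0);
    return nonzero;
}
```

If every entry is 0 the result is empty, which is the case optimistic_median has to handle separately.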

optimistic_median_analysis optimistic_median_analysis is similar to median_analysis and average_analysis. double optimistic_median_analysis(const vector<Student_info>& students) { vector<double> grades; transform(students.begin(), students.end(), back_inserter(grades), optimistic_median); return median(grades); }

Deleting Entries from a Vector As a final example of clever ways to use library functions, let’s revisit the problem of deleting entries from a vector. We know that deleting entries from a vector using the erase method is inefficient. Using the list structure solves this problem but at the expense of losing easy random access. Now we develop a method for efficiently removing entries from a vector.

Deleting Entries from a Vector The problem we had before was that when we call erase, every element of the vector after the entry we are erasing gets moved. The last entry of the vector will be moved one time for every entry erased. A better way is to only move each entry one time to its final position. There are two possible library functions that we could use to do this, remove_if and stable_partition. We will give two solutions to the problem.

remove_copy_if The function remove_copy_if runs through the structure and copies all the entries for which a predicate is false to a new vector. It uses a predicate instead of just a single value (like remove_copy). Because we want the failing grades to be copied, the predicate must be true for passing grades. Here is our predicate. bool pgrade(const Student_info& s) { return !fgrade(s); }

remove_if Next we need to remove the failing grades from the list. To do this we use remove_if. This function runs through the list and “removes” failing grades by overwriting them with passing grades from later in the list.

remove_if The important point is that remove_if moves each entry one time to its final location. The length of the vector remains unchanged. The return value of remove_if is an iterator that is associated with the first entry after all the ones that have been kept. We use the erase method from the vector class to remove the unneeded entries.

remove_if vector<Student_info> extract_fails(vector<Student_info>& students) { vector<Student_info> fail; remove_copy_if(students.begin(), students.end(), back_inserter(fail), pgrade); students.erase(remove_if(students.begin(), students.end(), fgrade), students.end()); return fail; }
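The remove_if step followed by erase can be tried on a vector of ints, as sketched below. The names is_negative and drop_negatives are invented for this example.

```cpp
#include <algorithm>
#include <vector>

bool is_negative(int x) { return x < 0; }

// Erase all negative values from v; returns how many were erased.
std::vector<int>::size_type drop_negatives(std::vector<int>& v)
{
    std::vector<int>::size_type before = v.size();

    // remove_if shifts the kept values toward the front, each at most once,
    // and returns an iterator just past the last kept value.
    // v.size() is still the same at this point.
    std::vector<int>::iterator new_end =
        std::remove_if(v.begin(), v.end(), is_negative);

    // erase is what actually shrinks the vector.
    v.erase(new_end, v.end());

    return before - v.size();
}
```

Omitting the erase call leaves leftover values between new_end and v.end(), which is a common source of bugs when remove_if is used alone.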

remove_if This method approaches the performance of the list algorithm. It does need to make two complete passes through the vector. – One to copy the failing entries – One to remove the failing entries We can use stable_partition to do everything in one pass.

stable_partition This function rearranges the entries so the passing grades are located in the vector before all the failing grades, keeping the original order within each group. The return value is an iterator associated with the first failing entry. Then we copy the failing entries into a new vector and erase them from the original.

stable_partition vector<Student_info> extract_fails(vector<Student_info>& students) { vector<Student_info>::iterator iter = stable_partition(students.begin(), students.end(), pgrade); vector<Student_info> fail(iter, students.end()); students.erase(iter, students.end()); return fail; }
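The same partition-then-erase pattern, sketched on a vector of ints so it can be run without the Student_info code. The names is_even and extract_odds are hypothetical stand-ins for pgrade and extract_fails.

```cpp
#include <algorithm>
#include <vector>

bool is_even(int x) { return x % 2 == 0; }

// Move the odd values out of v and return them.
// Both groups keep their original relative order.
std::vector<int> extract_odds(std::vector<int>& v)
{
    // Even values first, odd values after; mid marks the first odd value.
    std::vector<int>::iterator mid =
        std::stable_partition(v.begin(), v.end(), is_even);

    std::vector<int> odds(mid, v.end());   // copy the second group
    v.erase(mid, v.end());                 // then drop it from the original
    return odds;
}
```

The copy must be made before the erase, because erase invalidates mid and every iterator after it.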

Performance The performance of both these algorithms is similar in nature to the performance of the list algorithm. The stable_partition method in general (for long lists) runs in about half the time of the remove_if method.

Algorithms, Containers and Iterators Algorithms in the standard libraries act on container elements, not on containers. – We need to pass them iterators (not containers) telling which elements to operate on. – They cannot change the properties (like size) of the containers. This is why we needed to use the erase method to actually change the size of the vector in the remove_if example. erase or insert may invalidate iterators associated with entries at or after the change. Functions that move elements around (remove_if, partition, stable_partition) may change the values associated with previously existing iterators.

Iterator Adaptors These are defined in <iterator>. back_inserter(c) – an iterator to append elements to c. front_inserter(c) – an iterator to insert elements at the front of c. inserter(c, it) – an iterator to insert elements into c before iterator it.
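The two adaptors not used earlier in the chapter can be sketched as follows. The function names copy_to_front and insert_in_middle are made up for this illustration. front_inserter is shown with a list because it relies on push_front, which vector does not provide.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <vector>

// front_inserter calls push_front for each element,
// so the copied values end up in reverse order.
std::list<int> copy_to_front(const std::vector<int>& src)
{
    std::list<int> dst;
    std::copy(src.begin(), src.end(), std::front_inserter(dst));
    return dst;
}

// inserter inserts before a chosen position and keeps that position
// up to date, so the copied values stay in their original order.
std::vector<int> insert_in_middle(std::vector<int> dst,
                                  const std::vector<int>& src)
{
    std::copy(src.begin(), src.end(),
              std::inserter(dst, dst.begin() + dst.size() / 2));
    return dst;
}
```

Copying 1, 2, 3 with front_inserter into an empty list gives 3, 2, 1, while inserting 2, 3 into the middle of 1, 4 gives 1, 2, 3, 4.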

Algorithms This is defined in <numeric>. accumulate(b, e, t) – add all the entries in the range [b, e) to the value t. These are defined in <algorithm>. find(b, e, t) – search the range [b, e) to find the value t. find_if(b, e, p) – search the range [b, e) to find an entry where p is true. search(b, e, b2, e2) – search the range [b, e) to find the sequence denoted by [b2, e2).

Algorithms These are defined in <algorithm>. copy(b, e, d) – copy entries in the range [b, e) to the location indicated by d. remove_copy(b, e, d, t) – copy entries in the range [b, e) to the location indicated by d, except for those equal to t. remove_copy_if(b, e, d, p) – copy entries in the range [b, e) to the location indicated by d, except for those where p is true.

Algorithms These are defined in <algorithm>. remove_if(b, e, p) – rearranges the container in the range [b, e) so that all the entries where p is false are at the beginning. Returns an iterator one past the last of these “unremoved” elements. remove(b, e, t) – rearranges the container in the range [b, e) so that all the entries not equal to t are at the beginning. Returns an iterator one past the last of these “unremoved” elements.

Algorithms These are defined in <algorithm>. transform(b, e, d, f) – runs the function f on elements in the range [b, e) and stores the results in the location d. partition(b, e, p) – partitions the container in the range [b, e) so that all the elements where p is true are at the beginning and the elements for which p is false are at the end. stable_partition(b, e, p) – same as partition except that the ordering of the elements in each of the two groups is preserved.

Homework Chapter 6 (page 122) Total 40 pts possible. – 6-0 – 6-1 (email, 10 pts) – 6-3 (paper, 5 pts) – 6-4 (email, 10 pts) – 6-8 (email, 15 pts) – 6-9 (email, 15 pts)

Project 1 (part 3) Generate performance data for the two new versions of the extract failures algorithm and compare the results to the ones for the last chapter. (email and paper, 20 pts)
