Working with Batches of Data Lecture 4 Hartmut Kaiser

Slides:



Advertisements
Similar presentations
Chapter 4 Computation Bjarne Stroustrup
Advertisements

Program Design and Development
Computer Science 1620 Programming & Problem Solving.
Computer Science II Exam I Review Monday, February 6, 2006.
Libraries Programs that other people write that help you. #include // enables C++ #include // enables human-readable text #include // enables math functions.
Standard library types Practical session # 3 Software Engineering
C++ crash course Class 3 designing C++ programs, classes, control structures.
Basic Elements of C++ Chapter 2.
C++ / G4MICE Course Session 3 Introduction to Classes Pointers and References Makefiles Standard Template Library.
Organizing Programs and Data 2 Lecture 7 Hartmut Kaiser
Organizing Programs and Data Lecture 5 Hartmut Kaiser
A Computer Science Tapestry 1 Recursion (Tapestry 10.1, 10.3) l Recursion is an indispensable technique in a programming language ä Allows many complex.
Working with Strings Lecture 2 Hartmut Kaiser
Chapter 9 Defining New Types. Objectives Explore the use of member functions when creating a struct. Introduce some of the concepts behind object-oriented.
ECE 264 Object-Oriented Software Development Instructor: Dr. Honggang Wang Fall 2012 Lecture 4: Continuing with C++ I/O Basics.
Goals of Course Introduction to the programming language C Learn how to program Learn ‘good’ programming practices.
CHAPTER 07 Arrays and Vectors (part I). OBJECTIVES 2 In this part you will learn:  To use the array data structure to represent a set of related data.
Basic Notions Review what is a variable? value? address? memory location? what is an identifier? variable name? keyword? what is a legal identifier? what.
CSE 332: C++ Type Programming: Associated Types, Typedefs and Traits A General Look at Type Programming in C++ Associated types (the idea) –Let you associate.
Chapter 1 Working with strings. Objectives Understand simple programs using character strings and the string library. Get acquainted with declarations,
Project 1 Due Date: September 25 th Quiz 4 is due September 28 th Quiz 5 is due October2th 1.
Summary of what we learned yesterday Basics of C++ Format of a program Syntax of literals, keywords, symbols, variables Simple data types and arithmetic.
CSC1201: Programming Language 2 Lecture 1 Level 2 Course Nouf Aljaffan Snd Term Nouf Aljaffan (C) CSC 1201 Course at KSU1.
Defining New Types Lecture 21 Hartmut Kaiser
1 INTRODUCTION TO PROBLEM SOLVING AND PROGRAMMING.
File I/O ifstreams and ofstreams Sections 11.1 &
1 09/20/04CS150 Introduction to Computer Science 1 Let ’ s all Repeat Together.
Looping and Counting Lecture 3 Hartmut Kaiser
File I/O 1 ifstreams and ofstreams Sections 11.1 & 11.2.
 2003 Prentice Hall, Inc. All rights reserved. 1 Arrays Outline Introduction Arrays Declaring Arrays Examples Using Arrays.
Slides adapted from: Bjarne Stroustrup, Programming – Principles and Practice using C++ Chapter 6 Writing a Program Hartmut Kaiser
1 CISC181 Introduction to Computer Science Dr. McCoy Lecture 13 October 13, 2009.
Writing Generic Functions Lecture 20 Hartmut Kaiser
 2003 Prentice Hall, Inc. All rights reserved. 1 Vectors.
Chapter 3 Working with Batches of Data. Objectives Understand vector class and how it can be used to collect, store and manipulate data. Become familiar.
 2008 Pearson Education, Inc. All rights reserved. 1 Arrays and Vectors.
CSC 143A 1 CSC 143 Introduction to C++ [Appendix A]
1 Becoming More Effective with C++ … Day Two Stanley B. Lippman
1 Advanced Topics in Functions Lecture Unitary Scope Resolution Operator Unary scope resolution operator ( :: )  Access global variable if.
Using Sequential Containers Lecture 8 Hartmut Kaiser
Manipulator example #include int main (void) { double x = ; streamsize prec = cout.precision(); cout
C++ Programming Lecture 14 Arrays – Part I The Hashemite University Computer Engineering Department (Adapted from the textbook slides)
1 ENERGY 211 / CME 211 Lecture 7 October 6, 2008.
CHAPTER 2.2 CONTROL STRUCTURES (ITERATION) Dr. Shady Yehia Elmashad.
CMSC 104, Section 301, Fall Lecture 18, 11/11/02 Functions, Part 1 of 3 Topics Using Predefined Functions Programmer-Defined Functions Using Input.
Lecture 2 Arrays. Topics 1 Arrays hold Multiple Values 2 Accessing Array Elements 3 Inputting and Displaying Array Contents 4 Array Initialization 5 Using.
1 COMS 261 Computer Science I Title: C++ Fundamentals Date: September 23, 2005 Lecture Number: 11.
CHAPTER 2.2 CONTROL STRUCTURES (ITERATION) Dr. Shady Yehia Elmashad.
Working with Batches of Data
Variables, Identifiers, Assignments, Input/Output
Chapter Topics The Basics of a C++ Program Data Types
Basic Elements of C++.
Working with Strings Lecture 2 Hartmut Kaiser
Bjarne Stroustrup Chapter 4 Computation Bjarne Stroustrup
Organizing Programs and Data 2
Announcements Final Exam on August 17th Wednesday at 16:00.
Basic Elements of C++ Chapter 2.
Modification of Slides by
Working with Strings Lecture 2 Hartmut Kaiser
Programming Funamental slides
Defining New Types Lecture 22 Hartmut Kaiser
CISC181 Introduction to Computer Science Dr
4.9 Multiple-Subscripted Arrays
Announcements Final Exam on August 19th Saturday at 16:00.
Variables, Identifiers, Assignments, Input/Output
Bjarne Stroustrup Chapter 4 Computation Bjarne Stroustrup
Announcements Final Exam on December 25th Monday at 16:00.
Chapter 4 Computation Bjarne Stroustrup
Bjarne Stroustrup Chapter 4 Computation Bjarne Stroustrup
Presentation transcript:

Working with Batches of Data Lecture 4 Hartmut Kaiser

Abstract So far we looked at simple ‘read a string – print a string’ problems. Now we will look at more complex problems involving multiple pieces of similar data. 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 2

Computing Student Grades Calculate the overall grade of a student (20% midterm, 40% final exam, 40% homework)  Read values interactively  Arbitrary number of homework grades 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 3

Computing Student Grades // #include directives // using directives int main() { // ask for and read the student's name // ask and read the midterm and final grades // ask for the homework grades // write the result return 0; } 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 4

#include and using directives // #include directives #include // std::setprecision #include // std::streamsize #include // using directives using std::cin; using std::setprecision; using std::cout; using std::string; using std::endl; using std::streamsize; int main() { // ask for and read the student's name cout << "Please enter your first name: "; string name; cin >> name; cout << "Hello, " << name << "!" << endl; //... return 0; } 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 5

Ask and Read the Grades int main() { //... // ask and read the midterm and final grades cout << "Please enter your midterm and final grades: "; double midterm, final; cin >> midterm >> final; // ask for the homework grades cout << "Enter all your homework grades, " "followed by end-of-file: "; int count = 0;// the number and sum of grades read so far double sum = 0, x; // a variable to read the grades into // invariant: // we have read 'count' number of grades so far // 'sum' is the sum of the first 'count' grades while (cin >> x) { ++count; sum += x; } 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 6

Computing Student Grades Calculate and output overall grade //... // write the result streamsize prec = cout.precision(); cout << name << ", your final grade is: " << setprecision(3) << 0.2 * midterm * final * sum / count << setprecision(prec) << endl; return 0; } 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 7

Testing for end of Input We have seen: while (cin >> x) { /*…*/ } Generally istreams can be used as conditions: if (cin >> x) { /*…*/ } Which is equivalent to: cin >> x; if (cin) { /*…*/ } Detects:  Operation reached end of input  Input characters are not compatible with expected type  System has detected hardware failure 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 8

Calculating the Median Value So far we throw away the values right after having read them  Fine for averages, doesn’t work for medians Median:  Sort all values and pick the middle one (or average of two values nearest the middle)  We must store all numbers now! 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 9 Elements <= Median Elements >= Median midsize-1 mid-10

Data for Iteration - Vector To do just about anything of interest, we need a collection of data to work on. We can store this data in a vector. For example: // read some grades into a std::vector: int main() { std::vector grades; // declare a vector of type double to // store grades – like 62.4 double x; // a single grade value while (std::cin >> x) // cin reads a value and stores it in x grades.push_back(x); // store the value of x in the vector // … do something … } // std::cin >> x will return true until we reach the end of // file or encounter something that isn’t a double: like the word “end” 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 10

Data for Iteration - Vector Vector is a container holding a collection of values All values in the same vector have same type, but different vectors can hold objects of different types The type of the objects held by a vector needs to be specified: vector 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 11

Vector holds Sequences of Data Vector is the most useful standard library data type  A vector holds a sequence of values of type T  Think of a vector this way:  A vector named v contains 5 elements: {1, 4, 2, 3, 5}: 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data v: v’s elements: v[0] v[1] v[2]v[3] v[4] size()

Vector Interface template class vector { // read "for all types T" (just like in math) //... Internal members public: vector(); // default constructor explicit vector(int s) // constructor vector(const vector&); // copy constructor vector& operator=(const vector&); // copy assignment ~vector(); // destructor T& operator[] (int n); // access: return reference int size() const; // the current size bool empty() const; // test, whether vector is empty void push_back(T d); // add element at the end //... more functions here }; 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 13

Vector grows if needed std::vector v; // start off empty v.push_back(1); // add an element with the value 1 v.push_back(4); // add an element with the value 4 at end (“back”) v.push_back(3); // add an element with the value 3 at end (“back”) 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 14 v: v[0] v[1] v[2]

Vectors Once you get your data into a vector you can easily manipulate it: // compute median grade: int main() { std::vector grades; // grades, e.g // invariant: grades contains all the homework grades read so far double x; while (std::cin >> x) // read next value and put into vector grades.push_back(x); // maintains loop invariant! // sort values stored in the vector std::sort(grades.begin(), grades.end()); std::cout << "Median grade: " << grades[grades.size()/2] << std::endl; } 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 15

Calculating Median properly If number of elements is odd, pick middle: auto mid = grades.size() / 2; median = grades[mid]; 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 16 Elements <= Median Elements >= Median midsize-1 mid-10 mid+1

Calculating Median properly If number is even, pick average of 2 values nearest the middle: auto mid = grades.size() / 2; median = (grades[mid] + grades[mid-1]) / 2; 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 17 Elements <= Median Elements >= Median midsize-1 mid-10

Calculating Median properly (C++11) // calculating the median grade auto size = grades.size(); auto mid = size / 2; double median = size % 2 == 0 ? (grades[mid] + grades[mid-1]) / 2 : grades[mid]; // fully equivalent to double median; if (size % 2 == 0) median = (grades[mid] + grades[mid-1]) / 2; else median = grades[mid]; 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 18

Combining Language Features You can write many new programs by combining language features, built-in types, and user-defined types in new and interesting ways.  So far, we have  Variables and literals of types bool, char, int, double  string, vector, find(), [ ] (subscripting)  !=, ==, =, +, -, +=, <, &&, ||, !  transform(), sort(), std::cin >>, std::cout <<  if, for, while  You can write a lot of different programs with these language features! Let’s try to use them in a slightly different way… 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 19

Example – Word List (C++11) // "boilerplate" left out std::vector words; std::string s; while (std::cin >> s && s != "quit") // && means AND words.push_back(s); std::sort(words.begin(), words.end()); // sort the words we read std::for_each( words.begin(), words.end(), // print all words [](std::string s) { std::cout << s << "\n"; }); // read a bunch of strings into a vector of strings, sort // them into lexicographical order (alphabetical order), // and print the strings from the vector to see what we have. 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 20

Word list – Eliminate Duplicates // Note that duplicate words were printed multiple times. For // example "the the the". That's tedious, let's eliminate duplicates: std::vector words; std::string s; while (std::cin >> s && s != "quit") words.push_back(s); std::sort(words.begin(), words.end()); for (auto i = 1; i != words.size(); ++i) if (words[i-1] == words[i]) // get rid of words[i] // (pseudocode) std::for_each( words.begin (), words.end(), // print all words [](std::string s) { std::cout << s << "\n"; }); // there are many ways to “get rid of words[i]”; many of them are messy // (that’s typical). Our job as programmers is to choose a simple clean // solution – given constraints – time, run-time, memory. 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 21

Example (cont.) Eliminate Words! // Eliminate the duplicate words by copying only unique words: std::vector words; string s; while (std::cin >> s && s != "quit") words.push_back(s); std::sort(words.begin(), words.end()); vector w2; if (0 != words.size()) { // Note style { } w2.push_back(words[0]); for (auto i = 1; i != words.size(); ++i) if (words[i-1] != words[i]) w2.push_back(words[i]); } std::cout << "found " << words.size()-w2.size() << " duplicates\n"; std::for_each( w2.begin(), w2.end(), // print all remaining words [](std::string s) { std::cout << s << "\n"; }); 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 22

C++11 Loop Alternatives The common way of iterating over all elements of a vector: for (vector ::size_type i = 0; i != words.size(); ++i) cout << words[i] << "\n"; C++11 gives simpler alternatives:  Using auto for loop variable (requires g++ 4.4, vs2010) for (auto i = 0; i != words.size(); ++i) cout << words[i] << "\n";  Range based for loop (requires g++ 4.6, vs2012) for (string w: words) cout << w << "\n";  Using lambda functions (requires g++ 4.4, vs2010) std::for_each(words.begin(), words.end(), [](string s) { cout << s << "\n"; }); 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 23

Algorithm We just used a simple algorithm An algorithm is (from Google search)  “a logical arithmetical or computational procedure that, if correctly applied, ensures the solution of a problem.” – Harper Collins  “a set of rules for solving a problem in a finite number of steps, as for finding the greatest common divisor.” – Random House  “a detailed sequence of actions to perform or accomplish some task. Named after an Iranian mathematician, Al-Khawarizmi. Technically, an algorithm must reach a result after a finite number of steps, …The term is also used loosely for any sequence of actions (which may or may not terminate).” – Webster’s We eliminated the duplicates by first sorting the vector (so that duplicates are adjacent), and then copying only strings that differ from their predecessor into another vector. 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 24

Ideal Basic language features and libraries should be usable in essentially arbitrary combinations.  We are not too far from that ideal.  If a combination of features and types make sense, it will probably work.  The compiler helps by rejecting some absurdities. 1/29/2015, Lecture 4 CSC 1254, Spring 2015, Working with Batches of Data 25