A Computer Science Tapestry 7.1 Why do we read from files (chapter 6)? l Lots of data stored in files  What's a terabyte? Petabyte? Where does this much.

Slides:



Advertisements
Similar presentations
Chapter 8 Technicalities: Functions, etc. Bjarne Stroustrup
Advertisements

Chapter 7: User-Defined Functions II
Computer Science 1620 Loops.
Copyright © 2008 Pearson Addison-Wesley. All rights reserved. Chapter 12A Separate Compilation and Namespaces For classes this time.
Your First C++ Program Aug 27, /27/08 CS 150 Introduction to Computer Science I C++  Based on the C programming language  One of today’s most.
Computer Programming and Basic Software Engineering 4. Basic Software Engineering 1 Writing a Good Program 4. Basic Software Engineering 3 October 2007.
A simple C++ program /* * This program prints the phrase "Hello world!" * on the screen */ #include using namespace std; int main () { cout
Computer Science 1620 Programming & Problem Solving.
A Computer Science Tapestry 7.1 Designing and Using Classes l Class implementation, summary of what we’ve seen ä Data is private and is accessible in each.
Guide To UNIX Using Linux Third Edition
C++ Functions. 2 Agenda What is a function? What is a function? Types of C++ functions: Types of C++ functions: Standard functions Standard functions.
Classes: From Use to Implementation
A Computer Science Tapestry 1 Recursion (Tapestry 10.1, 10.3) l Recursion is an indispensable technique in a programming language ä Allows many complex.
By Noorez Kassam Welcome to JNI. Why use JNI ? 1. You already have significantly large and tricky code written in another language and you would rather.
Chapter 06 (Part I) Functions and an Introduction to Recursion.
Libraries Making Functions Globally Reusable (§ 6.4) 1.
Project 1 Due Date: September 25 th Quiz 4 is due September 28 th Quiz 5 is due October2th 1.
224 3/30/98 CSE 143 Recursion [Sections 6.1, ]
Announcements Final NEXT WEEK (August 13 th Thursday at 16:00) Recitations will be held on August 12 th Wednesday We will solve sample final questions.
Rossella Lau Lecture 1, DCO10105, Semester B, DCO10105 Object-Oriented Programming and Design  Lecture 1: Introduction What this course is about:
Chapter 13. Procedural programming vs OOP  Procedural programming focuses on accomplishing tasks (“verbs” are important).  Object-oriented programming.
A Computer Science Tapestry 3.1 Programs that Respond to Input l Programs in chapters one and two generate the same output each time they are executed.
File Input and Output in C++. Keyboard and Screen I/O #include cin (of type istream) cout (of type ostream) Keyboard Screen executing program input data.
Fundamental Programming: Fundamental Programming Introduction to C++
Stream Processing and stream operations for I/O (Input / Output) We’ll see section 6.3 and 6.4 (some parts excluded) Different methods of word and number.
Libraries 1 Making Functions Globally Reusable (§ 4.6)
C++ Functions. Objectives 1. Be able to implement C++ functions 2. Be able to share data among functions 2.
ADTs and C++ Classes Classes and Members Constructors The header file and the implementation file Classes and Parameters Operator Overloading.
Chapter 4: Subprograms Functions for Problem Solving Mr. Dave Clausen La Cañada High School.
More About Objects and Methods Chapter 5. Outline Programming with Methods Static Methods and Static Variables Designing Methods Overloading Constructors.
CPSC 252 Operator Overloading and Convert Constructors Page 1 Operator overloading We would like to assign an element to a vector or retrieve an element.
File I/O 1 ifstreams and ofstreams Sections 11.1 & 11.2.
C++ Classes and Data Structures Jeffrey S. Childs
A Computer Science Tapestry 7.1 Designing and Using Classes l Class implementation, summary of what we’ve seen ä Data is private and is accessible in each.
1 Chapter 2 C++ Syntax and Semantics, and the Program Development Process.
A Computer Science Tapestry 7.1 Designing and Using Classes l Class implementation, summary of what we’ve seen  Data is private and is accessible in each.
CPS Inheritance and the Yahtzee program l In version of Yahtzee given previously, scorecard.h held information about every score-card entry, e.g.,
Chapter 3 Functions. 2 Overview u 3.2 Using C++ functions  Passing arguments  Header files & libraries u Writing C++ functions  Prototype  Definition.
Copyright © 2000, Department of Systems and Computer Engineering, Carleton University 1 Introduction An array is a collection of identical boxes.
C++ Programming: Program Design Including Data Structures, Fourth Edition Chapter 6: User-Defined Functions I.
1 Becoming More Effective with C++ … Day Two Stanley B. Lippman
1 Program Development  The creation of software involves four basic activities: establishing the requirements creating a design implementing the code.
Chapter 9 Separate Compilation and Namespaces. Copyright © 2005 Pearson Addison-Wesley. All rights reserved. Slide 2 Overview Separate Compilation (9.1)
Copyright © 2008 Pearson Addison-Wesley. All rights reserved. Chapter 12 Separate Compilation and Namespaces.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 12 Separate Compilation and Namespaces.
A Computer Science Tapestry 6.1 Classes: From Use to Implementation l We’ve used several classes, a class is a collection of objects sharing similar characteristics.
Extra Recitations Wednesday 19:40-22:30 FENS L055 (tomorrow!) Friday 13:40-16:30 FENS L063 Friday 17: :30 FENS L045 Friday 19:40-22:30 FENS G032.
CS201 – Introduction to Computing – Sabancı University 1 Announcements l Homework 5 ä Due this Wednesday (November 17), 19:00 l Common Problems and Questions.
Libraries 1 Making Functions Globally Reusable (§ 6.4)
CLASSES AND OBJECTS Chapter 3 : constructor, Separate files, validating data.
1 A more complex example Write a program that sums a sequence of integers and displays the result. Assume that the first integer read specifies the number.
FUNCTIONS (C) KHAERONI, M.SI. OBJECTIVE After this topic, students will be able to understand basic concept of user defined function in C++ to declare.
C++ Functions A bit of review (things we’ve covered so far)
Announcements Extra Lecture Tuesday ( today ) 20 :40 for 1 hour Tuesday regular lecture at 16:40 for 1 hour You may see Midterm1 exam papers Wednesday.
Software Engineering Algorithms, Compilers, & Lifecycle.
Announcements Midterm2 on April 18 th Monday at 19:40 This week recitations, we will solve sample midterm questions. I will also send previous years’ exams.
Lecture 3: Getting Started & Input / Output (I/O)
Announcements Midterm2 Grades to be announced THIS WEEK
Announcements Final Exam will be on August 19 at 16:00
Designing and Using Classes
User-Defined Functions
Topics Decomposition Parameter passing Reading and writing from files
Separate Compilation and Namespaces
Introduction to Classes
Using, Understanding, Updating, Designing and Implementing Classes
Announcements Midterm2 Grades to be announced NEXT Monday
Announcements Final will be NEXT WEEK on August 17
Classes: From Use to Implementation
SPL – PS1 Introduction to C++.
Classes: From Use to Implementation
Presentation transcript:

A Computer Science Tapestry 7.1 Why do we read from files (chapter 6)? l Lots of data stored in files  What's a terabyte? Petabyte? Where does this much data come from?  What's a URL? Difference between and file://…  Can we read strings at a time, ints at a time, lines at a time, chars at a time?  Can we read objects at a time? Why? l What's a database?  Who knows about your Duke data? Other personal data?  Why are "database-backed websites" better?

A Computer Science Tapestry 7.2 Streams for reading files We’ve seen the standard input stream cin, and the standard output streams cout (and cerr )  Accessible from, used for reading from the keyboard, writing to the screen l Other streams let us read from files and write to files  Syntax of reading is the same: a stream is a stream  Syntax for writing is the same: a stream is a stream l To use a file stream the stream must be opened  Opening binds stream to a file, then I/O to/from is ok  Should close file streams (with only one or two files, doesn’t matter since closed on program exit)

A Computer Science Tapestry 7.3 Input file stream: Note similarity to cin string word; int numWords = 0; // # words read so far while (cin >> word) { // read succeeded numWords++; } cout << "number of words read = " << numWords << endl; ifstream input; // need for this string filename = "c:/data/poe.txt"; input.open(filename.c_str()); while (input >> word){ // read succeeded numWords++; } cout << "number of words read = " << numWords << endl;

A Computer Science Tapestry 7.4 Find longest word in file (most letters) l Idea for algorithm/program  Read every word, remember the longest word read so far  Each time a word is read, compare to longest-so-far, if longer then there’s a new longest-so-far l What should longest-so-far be initialized to?  Short word? Long word? First word? l In general, when solving extreme-value problems like this use the first value as the initial value  Always works, but leads to some duplicate code  For some values a default initialization is possible

A Computer Science Tapestry 7.5 Maximal and Minimal Values Largest int and double values are found in and, respectively ( and )  INT_MAX and INT_MIN are max and min ints  DBL_MAX and DBL_MIN are max and min doubles l What about maximal string value alphabetically? Minimal?  l What about longest string in length? Shortest? 

A Computer Science Tapestry 7.6 Read values, find largest (smallest) l How to write code and initialize values int current; int maxValue = ; // INT_MIN or INT_MAX? int numNums = 0; // # numbers so far while (cin >> current) { // read succeeded numNums++; if (current > maxValue) { maxValue = current; } cout << “max value = “ << maxValue << endl;

A Computer Science Tapestry 7.7 What about if ints are strings? l How to write code and initialize values string current; string maxValue = ; // what here? while (cin >> current) { // read succeeded if (current > maxValue) { maxValue = current; } cout << “max value = “ << maxValue << endl;

A Computer Science Tapestry 7.8 Using classes to solve problems l Find the word in a file that occurs most frequently  What word occurs most often in Romeo and Juliet  How do we solve this problem? l Suppose a function exists with the header below: int CountOccurrences(const string& filename, const string& s) // post: return # occurrences of s in file w/filename l How can this function be called to find the maximally occurring word in Romeo and Juliet ?  Read words, update counter?  Potential problems?

A Computer Science Tapestry 7.9 Complete the code below int CountOccurrences(const string& filename, const string& s) // post: return # occurrences of s in file w/filename int main() { string filename = PromptString("enter file: "); ifstream input(filename.c_str()); string word; while (input >> word) { int occs = CountOccurrences(filename, word); } cout << "maximally occurring word is " << maxWord << endl; cout << "which occurs " << max << " times" << endl; }

A Computer Science Tapestry 7.10 Two problems l Words appear as The and the, how can we count these as the same? Other issues?  Useful utilities in “strutils.h” ToLowerreturn a lowercase version of a string ToUpperreturn an uppercase version of a string StripPuncremove leading/trailing punctuation tostring(int)return “123” for 123, int-to-string conversion atoi(string)return 123 for “123”, string-to-int conversion l We count occurrences of “the” as many times as it occurs  Lots of effort duplicated, use the class StringSet  A set doesn’t store duplicates, read file, store in set, then loop over the set counting occurrences

A Computer Science Tapestry 7.11 StringSet and WordIterator l Both classes support access via iteration  Iterating over a file using a WordIterator returns one word at-a-time from the file  Iterating over a set using a StringSetIterator returns one word at-a-time from the set l Iteration is a common pattern: A pattern is a solution to a problem in a context  We’ll study more patterns later  The pattern has a name, and it’s an idea embodied in code rather than the code itself l We can write code without knowing what we’re iterating over if the supports generalization in some way

A Computer Science Tapestry 7.12 See setdemo.cpp #include using namespace std; #include "stringset.h" int main() { StringSet sset; sset.insert("watermelon"); sset.insert("apple"); sset.insert("banana"); sset.insert("orange"); sset.insert("banana"); sset.insert("cherry"); sset.insert("guava"); sset.insert("banana"); sset.insert("cherry"); cout << "set size = " << sset.size() << endl; StringSetIterator it(sset); for(it.Init(); it.HasMore(); it.Next()) { cout << it.Current() << endl; } if (sset.contains("banana")){ cout << "has we have a banana" << endl; } return 0; }

A Computer Science Tapestry 7.13 Designing and Using Classes l Class implementation, summary of what we’ve seen  Data is private and is accessible in each member function  Each object has it’s own data, so that each of five Dice objects has its own mySides and myRollCount  Member function implementations are in a.cpp file, interface is in a.h file l Compiling and linking, interface and implementation  Client programs #include a.h file, this is the interface  Client programs link the implementation, which is a compiled version of the.cpp file (.o or.obj suffix), implementations are often combined in a library, e.g., libtapestry, and the library is linked

A Computer Science Tapestry 7.14 Implementing Classes l Determining what classes are needed, and how they should be implemented is difficult; designing functions is difficult  Experience is a good teacher, failure is a good teacher Good design comes from experience, experience comes from bad design attributed to Fred Brooks, Henry Petroski  Design and implementation combine into a cyclical process: design, implement, re-visit design, implement, test, redesign, … Grow a working program, don’t do it all at the same time l One design methodology says “look for nouns, those are classes”, and “look for verbs or scenarios, those are member functions”  Not every noun is a class, not every verb is a method

A Computer Science Tapestry 7.15 Programming Tips, Heuristics, Help l Develop a core working program, add to it slowly  Iterative enhancement, test as you go, debug as you go l Do the hard part first, or do the easy part first  Which is best? It depends. l Concentrate on behavior first when designing classes, then on state  State is useful for communicating between method calls l If you’re using several classes, you’ll need to modify the Makefile or your project in an IDE: Visual C++/Eclipse

A Computer Science Tapestry 7.16 Common interfaces are a good thing l The class WordStreamIterator iterates over a file returning one word/string at a time string filename = PromptString("enter file name: "); WordStreamIterator ws; ws.Open(filename); for(ws.Init(); ws.HasMore(); ws.Next()) { cout << ws.Current() << endl; } l The class StringSet and StringSetIterator allow sets of strings to be iterated over one string at a time StringSet sset; sset.insert("banana"); sset.insert("cherry"); StringSetIterator it(sset); for(it.Init(); it.HasMore(); it.Next()) { cout << it.Current() << endl; }

A Computer Science Tapestry 7.17 Reuse concepts as well as code l Using the same syntax for iterating saves time in learning about new classes, will save coding when we learn how to exploit the commonality l We can develop different Question classes and “plug” them into a quiz program if the member functions have the same name  See quiz.cpp, mathquest.cpp, and capquest.cpp  Programs must #include different headers, and link in different implementations, but quiz.cpp doesn’t change l Random walk classes: one- and two-dimensional, can use the same driver program if the classes use the same method names

A Computer Science Tapestry 7.18 Random walks l Throwing darts (randomness in programs) is a technique for simulating events/phenomena that would be otherwise difficult  Molecular motion is too time-consuming to model exactly, use randomness to approximate behavior Consider the number of molecules in liters of a gas, each affects the other if we’re simulating motion 6.023x10 23 molecules/22.4 liters is (approx) 2.7e+12molecules  If we can do 100 megaflops, what does this mean? l Simulations are important in modeling nature, airplanes, airports, …, require pseudo-random numbers and some mathematics as well as programming

A Computer Science Tapestry 7.19 Simulations and solving problems We can find the area under a curve in two dimensions by integrating, see integrate.cpp  Sometimes can’t get exact solution, need approximations  Can use rectangles, trapezoids, parabolas, etc.  Can also use randomization, especially useful in multidimensional integrals where other techniques fail 3 10 y = x

A Computer Science Tapestry 7.20 Random walks l Interesting from a mathematical viewpoint, leads to understanding other random simulations l Two-dimensional random molecular simulations are fundamental to understanding physics (nuclear reactions) l Basic idea in one dimension: frog, object, molecule “decides” to step left or right at random  Where does object end up after N steps?  How many times does from visit each lily pad Assuming pads located at integer values, not real values  See frogwalk.cpp for details

A Computer Science Tapestry 7.21 Walking behavior (see frogwalk2.cpp) int main() { int numSteps = PromptRange("enter # steps",0,30000); RandomWalk frog(numSteps); // define two random walkers RandomWalk toad(numSteps); int samePadCount = 0; // # times at same location frog.Init(); // initialize both walks toad.Init(); while (frog.HasMore() && toad.HasMore()) { if (frog.Current() == toad.Current()) { samePadCount++; } frog.Next(); toad.Next(); } cout << "frog position = " << frog.Position() << endl; cout << "toad position = " << toad.Position() << endl; cout << "# times at same location = " << samePadCount << endl; return 0; }

A Computer Science Tapestry 7.22 Two-dimensional walker l One-d walker Current() returns an int as position l Two-d walker Current() returns a Point as position  Both int and Point can be compared using ==  Both int and Point can be printed using << l Same program works for two-d walker, even though underneath the implementation is very different  Since the interfaces are the same/similar, client programs are easier to write once, use many times  Client code still needs to #include a different header and must link in a different (two-d) walker implementation

A Computer Science Tapestry 7.23 What’s the Point? The two-dimensional walker uses #include " point.h "  This provides access to class Point declaration/interface  The class Point is actually defined using struct Point  In C++, a struct is a class in which everything is public by default In a class, everything is private by default A struct is really a hold-over from C, used in C++ for plain old data  Some programmers/designers don’t like to use structs in C++, but use classes only l We’ll use struct when data is public, when the state is really more important than the behavior  Guideline, data is private accept in a struct, other options?

A Computer Science Tapestry 7.24 point.h struct Point { Point(); Point(double px, double py); string tostring() const; double distanceFrom(const Point& p) const; double x; double y; }; l Two constructors, data is public, how is the (0,0) defined?  How is distance from (3,5) to (11,20) calculated?  How is a Point p printed?

A Computer Science Tapestry 7.25 Other details from point.h Points can be compared with each other using ==, =, etc. Point p can be printed using cout << p << endl;  Later we’ll learn how to overload operators like this  For now we’ll be clients, using Points like ints, BigInts, etc. The struct Point has constructors and other behavior  distanceFrom and tostring constitute the behavior  Some programmers think structs shouldn’t have any functions, holdover from C rather than C++ What is implemention of Point::distanceFrom like?

A Computer Science Tapestry 7.26 Other uses of structs l In a program using free (non-class) functions, lots of data is often passed from one function to another  In class-based programs data is often, though not always, part of a class and a class object is passed l Using structs to collect related data makes programs easier to read, modify, and maintain  Suppose you want to find mean, mode, and median of the lengths of words in a file, two alternatives: void doFileStats(const string& filename, double & mean, int & mode, int & median); void doFileStats(const string& filename, FileData& data);

A Computer Science Tapestry 7.27 More struct conventions It’s almost always worth including a constructor in a struct struct FileData { FileData() { myMean = 0.0; myMode = 0; myMedian = 0; } double myMean; int myMode; int myMedian; }; What other data might be included in FileData, what about other constructors?

A Computer Science Tapestry 7.28 Class (and struct) conventions For debugging and printing it’s useful for classes to implement a function tostring(), that " stringizes " an object  Also useful in overloading operator << for an object Point p; string s = p.tostring(); cout << s << " " << p << endl; l When initializing data in a constructor, it’s better to use an initializer list than to set values in the constructor body  Sometimes initializer lists are required (see next example), so using them at all times leads to more uniform coding that works in more situations

A Computer Science Tapestry 7.29 Initializer lists are sometimes required Consider a class that has a private Dice data member class Game { public: Game(); // more functions private: Dice myDie; // more data }; The instance variable myDie must be given a # sides, this cannot be given in the.h file/declaration, must be provided in the.cpp file/class implementation  It’s an error if an initializer list isn’t use

A Computer Science Tapestry 7.30 Initializer lists Here are two versions of an initializer list for Game::Game() Game::Game() : myDie(6) { } // if there’s more data, use initializer list Game::Game() : myDie(6), myName(“roulette”) { } l There can be code in constructor body to do more, e.g., read from a file  Sometimes it’s useful to call private, helper functions

A Computer Science Tapestry 7.31 Mary Shaw l Software engineering and software architecture  Tools for constructing large software systems  Development is a small piece of total cost, maintenance is larger, depends on well-designed and developed techniques l Interested in computer science, programming, curricula, and canoeing

A Computer Science Tapestry 7.32 Three phases of creating a program l The preprocessor is a program that processes a file, processing all #include directives (and other preprocessor commands)  Takes a file, and creates a translation unit  Replaces #include “foo.h” with contents of file foo.h, and does this recursively, for all #includes that foo includes and so on  Produces input to the next phase of program creation l The compiler has a translation unit as input and produces compiled object code as output  The object code is platform/architecture specific, the source code is (in theory at least) the same on all platforms  Some compilers require special treatment, not up to standard C++

A Computer Science Tapestry 7.33 From compiling to linking l The compilation phase creates an object file, but libraries and other files still need to be linked to create an executable  Header files like “dice.h” provide only the interface, enough for the compiler to know that a function call has the right parameters and is used correctly  The implemention file, “dice.cpp”, must be compiled and included in the final executable, or the program won’t work (call a dice function, but no one is home?) l Linking combines object files, some of which may be collected in a library of related files, to create an executable  Link the standard library (iostream, for example)  Link other libraries depending on program, graphics, tapestry, other application-specific libraries

A Computer Science Tapestry 7.34 Issues in creating a program l Programming environments create optimized or debug code  Use debug version to facilitate development  If you need optimization, use it only after a program works l Some errors are compilation errors, typically language syntax or failure to find a #include’d header file  The preprocessor looks in standard places for header files, sometimes this list needs to be changed l Other errors are linker errors, libraries or object files that are needed aren’t included  Change programming environment parameters to find the libraries