> word){ // read succeeded numWords++; } cout << "number of words read = " << numWords << endl;"> > word){ // read succeeded numWords++; } cout << "number of words read = " << numWords << endl;">
Download presentation
Presentation is loading. Please wait.
Published byChad Chapman Modified over 9 years ago
1
A Computer Science Tapestry 7.1 Why do we read from files (chapter 6)? l Lots of data stored in files What's a terabyte? Petabyte? Where does this much data come from? What's a URL? Difference between http://… and file://… Can we read strings at a time, ints at a time, lines at a time, chars at a time? Can we read objects at a time? Why? l What's a database? Who knows about your Duke data? Other personal data? Why are "database-backed websites" better?
2
A Computer Science Tapestry 7.2 Streams for reading files We’ve seen the standard input stream cin, and the standard output streams cout (and cerr ) Accessible from, used for reading from the keyboard, writing to the screen l Other streams let us read from files and write to files Syntax of reading is the same: a stream is a stream Syntax for writing is the same: a stream is a stream l To use a file stream the stream must be opened Opening binds stream to a file, then I/O to/from is ok Should close file streams (with only one or two files, doesn’t matter since closed on program exit)
3
A Computer Science Tapestry 7.3 Input file stream: Note similarity to cin string word; int numWords = 0; // # words read so far while (cin >> word) { // read succeeded numWords++; } cout << "number of words read = " << numWords << endl; ifstream input; // need for this string filename = "c:/data/poe.txt"; input.open(filename.c_str()); while (input >> word){ // read succeeded numWords++; } cout << "number of words read = " << numWords << endl;
4
A Computer Science Tapestry 7.4 Find longest word in file (most letters) l Idea for algorithm/program Read every word, remember the longest word read so far Each time a word is read, compare to longest-so-far, if longer then there’s a new longest-so-far l What should longest-so-far be initialized to? Short word? Long word? First word? l In general, when solving extreme-value problems like this use the first value as the initial value Always works, but leads to some duplicate code For some values a default initialization is possible
5
A Computer Science Tapestry 7.5 Maximal and Minimal Values Largest int and double values are found in and, respectively ( and ) INT_MAX and INT_MIN are max and min ints DBL_MAX and DBL_MIN are max and min doubles l What about maximal string value alphabetically? Minimal? l What about longest string in length? Shortest?
6
A Computer Science Tapestry 7.6 Read values, find largest (smallest) l How to write code and initialize values int current; int maxValue = ; // INT_MIN or INT_MAX? int numNums = 0; // # numbers so far while (cin >> current) { // read succeeded numNums++; if (current > maxValue) { maxValue = current; } cout << “max value = “ << maxValue << endl;
7
A Computer Science Tapestry 7.7 What about if ints are strings? l How to write code and initialize values string current; string maxValue = ; // what here? while (cin >> current) { // read succeeded if (current > maxValue) { maxValue = current; } cout << “max value = “ << maxValue << endl;
8
A Computer Science Tapestry 7.8 Using classes to solve problems l Find the word in a file that occurs most frequently What word occurs most often in Romeo and Juliet How do we solve this problem? l Suppose a function exists with the header below: int CountOccurrences(const string& filename, const string& s) // post: return # occurrences of s in file w/filename l How can this function be called to find the maximally occurring word in Romeo and Juliet ? Read words, update counter? Potential problems?
9
A Computer Science Tapestry 7.9 Complete the code below int CountOccurrences(const string& filename, const string& s) // post: return # occurrences of s in file w/filename int main() { string filename = PromptString("enter file: "); ifstream input(filename.c_str()); string word; while (input >> word) { int occs = CountOccurrences(filename, word); } cout << "maximally occurring word is " << maxWord << endl; cout << "which occurs " << max << " times" << endl; }
10
A Computer Science Tapestry 7.10 Two problems l Words appear as The and the, how can we count these as the same? Other issues? Useful utilities in “strutils.h” ToLowerreturn a lowercase version of a string ToUpperreturn an uppercase version of a string StripPuncremove leading/trailing punctuation tostring(int)return “123” for 123, int-to-string conversion atoi(string)return 123 for “123”, string-to-int conversion l We count occurrences of “the” as many times as it occurs Lots of effort duplicated, use the class StringSet A set doesn’t store duplicates, read file, store in set, then loop over the set counting occurrences
11
A Computer Science Tapestry 7.11 StringSet and WordIterator l Both classes support access via iteration Iterating over a file using a WordIterator returns one word at-a-time from the file Iterating over a set using a StringSetIterator returns one word at-a-time from the set l Iteration is a common pattern: A pattern is a solution to a problem in a context We’ll study more patterns later The pattern has a name, and it’s an idea embodied in code rather than the code itself l We can write code without knowing what we’re iterating over if the supports generalization in some way
12
A Computer Science Tapestry 7.12 See setdemo.cpp #include using namespace std; #include "stringset.h" int main() { StringSet sset; sset.insert("watermelon"); sset.insert("apple"); sset.insert("banana"); sset.insert("orange"); sset.insert("banana"); sset.insert("cherry"); sset.insert("guava"); sset.insert("banana"); sset.insert("cherry"); cout << "set size = " << sset.size() << endl; StringSetIterator it(sset); for(it.Init(); it.HasMore(); it.Next()) { cout << it.Current() << endl; } if (sset.contains("banana")){ cout << "has we have a banana" << endl; } return 0; }
13
A Computer Science Tapestry 7.13 Designing and Using Classes l Class implementation, summary of what we’ve seen Data is private and is accessible in each member function Each object has it’s own data, so that each of five Dice objects has its own mySides and myRollCount Member function implementations are in a.cpp file, interface is in a.h file l Compiling and linking, interface and implementation Client programs #include a.h file, this is the interface Client programs link the implementation, which is a compiled version of the.cpp file (.o or.obj suffix), implementations are often combined in a library, e.g., libtapestry, and the library is linked
14
A Computer Science Tapestry 7.14 Implementing Classes l Determining what classes are needed, and how they should be implemented is difficult; designing functions is difficult Experience is a good teacher, failure is a good teacher Good design comes from experience, experience comes from bad design attributed to Fred Brooks, Henry Petroski Design and implementation combine into a cyclical process: design, implement, re-visit design, implement, test, redesign, … Grow a working program, don’t do it all at the same time l One design methodology says “look for nouns, those are classes”, and “look for verbs or scenarios, those are member functions” Not every noun is a class, not every verb is a method
15
A Computer Science Tapestry 7.15 Programming Tips, Heuristics, Help l Develop a core working program, add to it slowly Iterative enhancement, test as you go, debug as you go l Do the hard part first, or do the easy part first Which is best? It depends. l Concentrate on behavior first when designing classes, then on state State is useful for communicating between method calls l If you’re using several classes, you’ll need to modify the Makefile or your project in an IDE: Visual C++/Eclipse
16
A Computer Science Tapestry 7.16 Common interfaces are a good thing l The class WordStreamIterator iterates over a file returning one word/string at a time string filename = PromptString("enter file name: "); WordStreamIterator ws; ws.Open(filename); for(ws.Init(); ws.HasMore(); ws.Next()) { cout << ws.Current() << endl; } l The class StringSet and StringSetIterator allow sets of strings to be iterated over one string at a time StringSet sset; sset.insert("banana"); sset.insert("cherry"); StringSetIterator it(sset); for(it.Init(); it.HasMore(); it.Next()) { cout << it.Current() << endl; }
17
A Computer Science Tapestry 7.17 Reuse concepts as well as code l Using the same syntax for iterating saves time in learning about new classes, will save coding when we learn how to exploit the commonality l We can develop different Question classes and “plug” them into a quiz program if the member functions have the same name See quiz.cpp, mathquest.cpp, and capquest.cpp Programs must #include different headers, and link in different implementations, but quiz.cpp doesn’t change l Random walk classes: one- and two-dimensional, can use the same driver program if the classes use the same method names
18
A Computer Science Tapestry 7.18 Random walks l Throwing darts (randomness in programs) is a technique for simulating events/phenomena that would be otherwise difficult Molecular motion is too time-consuming to model exactly, use randomness to approximate behavior Consider the number of molecules in 10 -10 liters of a gas, each affects the other if we’re simulating motion 6.023x10 23 molecules/22.4 liters is (approx) 2.7e+12molecules If we can do 100 megaflops, what does this mean? l Simulations are important in modeling nature, airplanes, airports, …, require pseudo-random numbers and some mathematics as well as programming
19
A Computer Science Tapestry 7.19 Simulations and solving problems We can find the area under a curve in two dimensions by integrating, see integrate.cpp Sometimes can’t get exact solution, need approximations Can use rectangles, trapezoids, parabolas, etc. Can also use randomization, especially useful in multidimensional integrals where other techniques fail 3 10 y = x 2 3 10
20
A Computer Science Tapestry 7.20 Random walks l Interesting from a mathematical viewpoint, leads to understanding other random simulations l Two-dimensional random molecular simulations are fundamental to understanding physics (nuclear reactions) l Basic idea in one dimension: frog, object, molecule “decides” to step left or right at random Where does object end up after N steps? How many times does from visit each lily pad Assuming pads located at integer values, not real values See frogwalk.cpp for details
21
A Computer Science Tapestry 7.21 Walking behavior (see frogwalk2.cpp) int main() { int numSteps = PromptRange("enter # steps",0,30000); RandomWalk frog(numSteps); // define two random walkers RandomWalk toad(numSteps); int samePadCount = 0; // # times at same location frog.Init(); // initialize both walks toad.Init(); while (frog.HasMore() && toad.HasMore()) { if (frog.Current() == toad.Current()) { samePadCount++; } frog.Next(); toad.Next(); } cout << "frog position = " << frog.Position() << endl; cout << "toad position = " << toad.Position() << endl; cout << "# times at same location = " << samePadCount << endl; return 0; }
22
A Computer Science Tapestry 7.22 Two-dimensional walker l One-d walker Current() returns an int as position l Two-d walker Current() returns a Point as position Both int and Point can be compared using == Both int and Point can be printed using << l Same program works for two-d walker, even though underneath the implementation is very different Since the interfaces are the same/similar, client programs are easier to write once, use many times Client code still needs to #include a different header and must link in a different (two-d) walker implementation
23
A Computer Science Tapestry 7.23 What’s the Point? The two-dimensional walker uses #include " point.h " This provides access to class Point declaration/interface The class Point is actually defined using struct Point In C++, a struct is a class in which everything is public by default In a class, everything is private by default A struct is really a hold-over from C, used in C++ for plain old data Some programmers/designers don’t like to use structs in C++, but use classes only l We’ll use struct when data is public, when the state is really more important than the behavior Guideline, data is private accept in a struct, other options?
24
A Computer Science Tapestry 7.24 point.h struct Point { Point(); Point(double px, double py); string tostring() const; double distanceFrom(const Point& p) const; double x; double y; }; l Two constructors, data is public, how is the (0,0) defined? How is distance from (3,5) to (11,20) calculated? How is a Point p printed?
25
A Computer Science Tapestry 7.25 Other details from point.h Points can be compared with each other using ==, =, etc. Point p can be printed using cout << p << endl; Later we’ll learn how to overload operators like this For now we’ll be clients, using Points like ints, BigInts, etc. The struct Point has constructors and other behavior distanceFrom and tostring constitute the behavior Some programmers think structs shouldn’t have any functions, holdover from C rather than C++ What is implemention of Point::distanceFrom like?
26
A Computer Science Tapestry 7.26 Other uses of structs l In a program using free (non-class) functions, lots of data is often passed from one function to another In class-based programs data is often, though not always, part of a class and a class object is passed l Using structs to collect related data makes programs easier to read, modify, and maintain Suppose you want to find mean, mode, and median of the lengths of words in a file, two alternatives: void doFileStats(const string& filename, double & mean, int & mode, int & median); void doFileStats(const string& filename, FileData& data);
27
A Computer Science Tapestry 7.27 More struct conventions It’s almost always worth including a constructor in a struct struct FileData { FileData() { myMean = 0.0; myMode = 0; myMedian = 0; } double myMean; int myMode; int myMedian; }; What other data might be included in FileData, what about other constructors?
28
A Computer Science Tapestry 7.28 Class (and struct) conventions For debugging and printing it’s useful for classes to implement a function tostring(), that " stringizes " an object Also useful in overloading operator << for an object Point p; string s = p.tostring(); cout << s << " " << p << endl; l When initializing data in a constructor, it’s better to use an initializer list than to set values in the constructor body Sometimes initializer lists are required (see next example), so using them at all times leads to more uniform coding that works in more situations
29
A Computer Science Tapestry 7.29 Initializer lists are sometimes required Consider a class that has a private Dice data member class Game { public: Game(); // more functions private: Dice myDie; // more data }; The instance variable myDie must be given a # sides, this cannot be given in the.h file/declaration, must be provided in the.cpp file/class implementation It’s an error if an initializer list isn’t use
30
A Computer Science Tapestry 7.30 Initializer lists Here are two versions of an initializer list for Game::Game() Game::Game() : myDie(6) { } // if there’s more data, use initializer list Game::Game() : myDie(6), myName(“roulette”) { } l There can be code in constructor body to do more, e.g., read from a file Sometimes it’s useful to call private, helper functions
31
A Computer Science Tapestry 7.31 Mary Shaw l Software engineering and software architecture Tools for constructing large software systems Development is a small piece of total cost, maintenance is larger, depends on well-designed and developed techniques l Interested in computer science, programming, curricula, and canoeing
32
A Computer Science Tapestry 7.32 Three phases of creating a program l The preprocessor is a program that processes a file, processing all #include directives (and other preprocessor commands) Takes a file, and creates a translation unit Replaces #include “foo.h” with contents of file foo.h, and does this recursively, for all #includes that foo includes and so on Produces input to the next phase of program creation l The compiler has a translation unit as input and produces compiled object code as output The object code is platform/architecture specific, the source code is (in theory at least) the same on all platforms Some compilers require special treatment, not up to standard C++
33
A Computer Science Tapestry 7.33 From compiling to linking l The compilation phase creates an object file, but libraries and other files still need to be linked to create an executable Header files like “dice.h” provide only the interface, enough for the compiler to know that a function call has the right parameters and is used correctly The implemention file, “dice.cpp”, must be compiled and included in the final executable, or the program won’t work (call a dice function, but no one is home?) l Linking combines object files, some of which may be collected in a library of related files, to create an executable Link the standard library (iostream, for example) Link other libraries depending on program, graphics, tapestry, other application-specific libraries
34
A Computer Science Tapestry 7.34 Issues in creating a program l Programming environments create optimized or debug code Use debug version to facilitate development If you need optimization, use it only after a program works l Some errors are compilation errors, typically language syntax or failure to find a #include’d header file The preprocessor looks in standard places for header files, sometimes this list needs to be changed l Other errors are linker errors, libraries or object files that are needed aren’t included Change programming environment parameters to find the libraries
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.