CSc 110, Spring 2018 Lecture 22: Line-Based File Input


CSc 110, Spring 2018 Lecture 22: Line-Based File Input Adapted from slides by Marty Stepp and Stuart Reges

IMDb movies problem
Consider the following Internet Movie Database (IMDb) data:
    1 9.1 196376 The Shawshank Redemption (1994)
    2 9.0 139085 The Godfather: Part II (1974)
    3 8.8 81507 Casablanca (1942)
Write a program that displays any movies containing a phrase:
    Search word? part

    Rank  Votes   Rating  Title
    2     139085  9.0     The Godfather: Part II (1974)
    40    129172  8.5     The Departed (2006)
    95    20401   8.2     The Apartment (1960)
    192   30587   8.0     Spartacus (1960)
    4 matches.
Is this a token or line-based problem?
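One way to look at the data above: each movie sits on its own line, and within a line the first three tokens are fixed while the title has a varying number of words. The following is a minimal sketch of splitting one such line, assuming whitespace-separated fields in the order rank, rating, votes, title. The helper name parse_movie_line is hypothetical and is not part of the lecture code.

```python
def parse_movie_line(line):
    """Split one IMDb line into (rank, rating, votes, title).

    Assumes the first three whitespace-separated tokens are rank, rating,
    and votes, and that everything after them is the title.
    """
    # maxsplit=3 keeps the title (which contains spaces) as one piece.
    parts = line.split(None, 3)
    rank = int(parts[0])
    rating = float(parts[1])
    votes = int(parts[2])
    title = parts[3].strip()  # drop the trailing newline
    return rank, rating, votes, title


sample = "2 9.0 139085 The Godfather: Part II (1974)\n"
rank, rating, votes, title = parse_movie_line(sample)
print(rank, votes, rating, title, sep="\t")
```

Limiting the split to three cuts avoids rebuilding the title word by word, at the cost of assuming the first three fields are always present.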

"Chaining"
main should be a concise summary of your program.
It is bad if each function calls the next without ever returning (we call this chaining).
[Diagram: main, functionA, functionB, functionC, functionD]
A better structure has main make most of the calls.
Functions must return values to main to be passed on later.
[Diagram: main, functionA, functionB, functionD]

Bad IMDb "chained" code 1
# Displays IMDB's Top 250 movies that match a search string.
def main():
    get_word()

# Asks the user for their search word and returns it.
def get_word():
    search_word = input("Search word: ")
    search_word = search_word.lower()
    print()
    file = open("imdb.txt")
    search(file, search_word)

# Breaks apart each line, looking for lines that match the search word.
def search(file, search_word):
    matches = 0
    for line in file:
        line_lower = line.lower()   # case-insensitive match
        if (search_word in line_lower):
            matches += 1
            print("Rank\tVotes\tRating\tTitle")
            display(line)

Bad IMDb "chained" code 2
# Displays the line in the proper format on the screen.
def display(line):
    parts = line.split()
    rank = parts[0]
    rating = parts[1]
    votes = parts[2]
    title = ""
    for i in range(3, len(parts)):
        title += parts[i] + " "   # the rest of the line
    print(rank + "\t" + votes + "\t" + rating + "\t" + title)

Better IMDb answer 1
# Displays IMDB's Top 250 movies that match a search string.
def main():
    search_word = get_word()
    file = open("imdb.txt")
    line = search(file, search_word)

    if (len(line) > 0):
        print("Rank\tVotes\tRating\tTitle")
        matches = 0
        while (len(line) > 0):
            display(line)
            line = search(file, search_word)   # advance to the next match
            matches += 1
        print(str(matches) + " matches.")

# Asks the user for their search word and returns it.
def get_word():
    search_word = input("Search word: ")
    search_word = search_word.lower()
    print()
    return search_word
...

Better IMDb answer 2
...
# Breaks apart each line, looking for lines that match the search word.
# Returns the next matching line, or "" if no more lines match.
def search(file, search_word):
    for line in file:
        line_lower = line.lower()   # case-insensitive match
        if (search_word in line_lower):
            return line
    return ""   # not found

# Displays the line in the proper format on the screen.
def display(line):
    parts = line.split()
    rank = parts[0]
    rating = parts[1]
    votes = parts[2]
    title = ""
    for i in range(3, len(parts)):
        title += parts[i] + " "   # the rest of the line
    print(rank + "\t" + votes + "\t" + rating + "\t" + title)

Output to files
Open a file in write or append mode:
'w' - write mode - replaces everything in the file
'a' - append mode - adds to the bottom of the file, preserving what is already in it
name = open("filename", "w")   # write
name = open("filename", "a")   # append
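The difference between the two modes can be seen by writing to the same file several times. This is a minimal sketch; the file name notes.txt and the scratch directory are assumptions chosen so the example does not touch real files.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt")

out = open(path, "w")      # write mode: creates the file or empties it
out.write("first\n")
out.close()

out = open(path, "a")      # append mode: keeps "first" and adds after it
out.write("second\n")
out.close()

with open(path) as f:
    after_append = f.read()

out = open(path, "w")      # write mode again: earlier contents are discarded
out.write("third\n")
out.close()

with open(path) as f:
    after_rewrite = f.read()

print(repr(after_append))
print(repr(after_rewrite))
```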

Output to files
name.write(str) - writes the given string to the file
name.close() - closes the file once writing is done
Example:
out = open("output.txt", "w")
out.write("Hello, world!\n")
out.write("How are you?")
out.close()
text = open("output.txt").read()   # Hello, world!\nHow are you?
Common bug: opening the file in write mode inside a function that gets called many times. Each call re-opens the file and wipes its past contents, so only the last line written shows up in the file.
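The re-opening bug can be reproduced in a few lines. This is a sketch under assumptions: the function names log_bad and log_good and the file names are hypothetical, and passing the open file as a parameter is only one possible fix (opening in append mode is another).

```python
import os
import tempfile

# Hypothetical scratch files in a temporary directory.
folder = tempfile.mkdtemp()
bad_path = os.path.join(folder, "bad.txt")
good_path = os.path.join(folder, "good.txt")


def log_bad(message):
    # Bug: "w" mode re-opens and empties the file on every call.
    out = open(bad_path, "w")
    out.write(message + "\n")
    out.close()


def log_good(out, message):
    # The caller opens the file once and passes it in.
    out.write(message + "\n")


for word in ["one", "two", "three"]:
    log_bad(word)

out = open(good_path, "w")
for word in ["one", "two", "three"]:
    log_good(out, word)
out.close()

with open(bad_path) as f:
    bad_text = f.read()    # only the last message survives
with open(good_path) as f:
    good_text = f.read()   # all three messages are present

print(repr(bad_text))
print(repr(good_text))
```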