File Processing Upsorn Praphamontripong CS 1110 Introduction to Programming Spring 2017
Overview: File Input and Output Opening files Reading from files Writing to files Appending to files Closing files CS 1110
Opening Files (Local) Opening a file The location of a file Example file_variable = open(filename, mode) The location of a file In general, we need to specify a path to a file. If a data file and your program are in different folders, an absolute path must be used. If a data file and your program is in the same folder, a relative path can be used. In this course, for homework submission/testing purpose and to avoid a “file not found” problem, you *need* to put all data files in the same folder as your python files and specify a relative path. Example test_file = open('test.txt', 'r') # relative path ‘r’ = read only ‘w’ = write (If the file exists, erase its contents. If it does not exist, create a new file to be written to) ‘a’ = append a file (If the file exists, append data to the file. If the file does not exist, create a new file) CS 1110
Reading from Files: readline( ) Reading from a file file_variable = open(filename, 'r') file_variable.readline() def readline_file(): # open a file named students.txt to be read infile = open('students.txt', 'r') # read 2 lines from the file line1 = infile.readline() line2 = infile.readline() print(line1, line2) print(line1, line2) # close the file infile.close() CS 1110
Reading from Files: readline( ) with loops Reading from a file with loops file_variable = open(filename, 'r') file_variable.readline() def readline_file_with_loop(): # open a file named students.txt to be read infile = open('students.txt', 'r') line = infile.readline() while line != '': print(line.rstrip('\n')) line = infile.readline() # what happen if this line is commented out # close the file infile.close() CS 1110
Reading from Files: read( ) Reading from a file file_variable = open(filename, 'r') file_variable.read() def read_file(): # open a file named students.txt to be read infile = open('students.txt', 'r') # read the file's contents file_contents = infile.read() print(file_contents) # close the file infile.close() CS 1110
Writing to Files Writing to a file file_variable = open(filename, 'w') file_variable = write(string_to_be_written) def main(): # open a file named students.txt outfile = open('students.txt', 'w') # write the names of three students to the file outfile.write('John\n') outfile.write('Jack\n') outfile.write('Jane\n') # close the file outfile.close() Where is this file? What does it look like? John\nJack\nJane\n Beginning of the file End of the file CS 1110
Appending to Files Appending to a file file_variable = open(filename, 'a') file_variable = write(string_to_be_written) def main(): # open a file named students.txt outfile = open('students.txt', 'a') # write the names of three students to the file outfile.write('Mary\n') # close the file outfile.close() Where is this file? What does it look like? John\nJack\nJane\n Beginning of the file End of the file John\nJack\nJane\nMary\n Beginning of the file End of the file CS 1110
Closing Files Closing a file file_variable.close() def read_file(): # open a file named students.txt to be read infile = open('students.txt', 'r') # read the file's contents file_contents = infile.read() print(file_contents) # close the file infile.close() CS 1110
Wrap Up File Operations argument Function header (or signature) Why isn’t a file opened with a “write” mode ? If we want to open a file with a “write” mode, how should we modify the code ? def read_list_of_names(filename): names = [] datafile = open(filename, "r") outfile = open(filename, "a") for line in datafile: line = line.strip() names.append(line) outfile.write(line) datafile.close() outfile.close() return names print(read_list_of_names("names.txt")) Open a file with a “read” mode Open a file with an “append” mode For each line in datafile Strip leading and tailing spaces Append string to a list Write string to outfile (must be string) Close files Value-return function Function call parameter names CS 1111
Validating a File from os.path import * # get file name file_name = input("Enter csv file name: ") # check if a file is csv, is valid, and exists while (not file_name.endwith("csv") or not isfile(file_name) ): file_name = input(file_name + " is not csv or does not exist. Enter csv file name: ") CS 1111
Exercise (Local File) Down the following file and save it as “weather.csv” in the same folder where you have your .py file (Do not double click!! Do not open it with Excel!!) Charlottesville Weather for September 2015 - http://cs1110.cs.virginia.edu/code/cville_weather_sept15.csv Write a function to read the weather data and compute the average “Max TemperatureF” of Sep 2015. Hint: burn header, use split(“,”), and cast data CS 1111
Opening Files (Internet, via URLs) import urllib.request link = input ( 'Web page: ' ) stream = urllib.request.urlopen( link ) for line in stream: decoded = line.decode("UTF-8") print(decoded.strip()) CS 1111
Exercise (URL) Write a function to read a dataset from the Internet and display the data on the screen. This dataset contains history temperature. (Do not save the dataset to your computer) http://www.wunderground.com/history/airport/KCHO/2012/03/17/DailyHistory.html?format=1 Hint: use urllib, decode("UTF-8"), strip(), then split() CS 1111