CSV File Manipulation
Structured Text Files

Simple text files are a collection of lines, each ending with an end-of-line escape sequence (such as \n). There is no definitive way to identify specific pieces of information unless the file follows a specified format.

Ex. /etc/passwd
username:*:UID:GID:name:home path:shell

However, there are several structured file formats:
- Tab Delimited: values separated with a tab
- CSV: values separated with a ','
- HTML/XML: values enclosed in tags, '< >'
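As a minimal sketch of why a specified format matters, the snippet below splits one colon-delimited line in the /etc/passwd layout into named fields. The sample line, the field labels, and the helper name parse_passwd_line are made up for illustration; they are assumptions, not values from a real system.

```python
# Hypothetical sample line in the /etc/passwd layout described above.
sample_line = "jdoe:*:1001:1001:Jane Doe:/home/jdoe:/bin/bash\n"

# Field labels (assumed names) in the order given by the format.
field_names = ["username", "password", "uid", "gid", "name", "home", "shell"]

def parse_passwd_line(line):
    """Split one colon-delimited line into a dict of named fields."""
    values = line.rstrip("\n").split(":")  # drop the end-of-line sequence first
    return dict(zip(field_names, values))

record = parse_passwd_line(sample_line)
print(record["username"], record["home"])
```

Without the agreed field order, the seven pieces returned by split would have no meaning; the format is what turns them into data.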
Comma Separated Values

Delimited files are a common exchange format for spreadsheets and databases.
- Each line in a CSV file represents a row in the spreadsheet.
- Usually there is a header row that gives each of the column names.
- Since CSV files are formatted text files, they still have end-of-line escape sequences.

CSV vs. Excel and other spreadsheets
- No types: all values are strings
- No fonts, sizes, or colors
- No multiple worksheets
- No cell widths or heights
- No merged cells
- No images or charts

ID         Term    Course  Grade
800412564  201652  ISY150  A
800798465          CIS120
800125498                  C
800174658          CIS150  F
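The "no types" point can be illustrated with a short sketch. It assumes Python 3 and uses an in-memory string (csv_text, a made-up stand-in for a file on disk) so that it runs without any external file.

```python
import csv
import io

# Hypothetical CSV text standing in for a file on disk.
csv_text = "ID,Term,Course,Grade\n800412564,201652,ISY150,A\n"

rows = list(csv.reader(io.StringIO(csv_text)))
header, first = rows[0], rows[1]

# Every field comes back as a str, even ones that look numeric.
print(type(first[0]))

# Convert explicitly when a number is needed.
student_id = int(first[0])
print(student_id + 1)
```

Any conversion to int, float, or a date is the responsibility of the program reading the file, since the CSV itself carries no type information.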
Manipulating CSV Files vs. Plain Text Files

Since CSV files are just formatted text files, the process for reading them is similar to processing text files:
- Create a file stream, create a reader/writer object, process the reader/writer, close the stream.
- When a file is read in, each row is returned as a list (array), and each field is already a separate element that does not need to be split.
- There is a dedicated module for processing CSV files.

Code:
import csv
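One reason to prefer the csv module over splitting lines by hand is quoted fields. The sketch below compares the two approaches on a single made-up line (the name "Doe, Jane" is a hypothetical value chosen because it contains a comma).

```python
import csv

# Hypothetical line with a quoted field that contains a comma.
line = '800412564,"Doe, Jane",ISY150,A'

naive = line.split(",")            # also splits inside the quotes
parsed = next(csv.reader([line]))  # csv.reader accepts any iterable of lines

print(len(naive))   # 5 pieces
print(len(parsed))  # 4 fields
print(parsed[1])
```

The reader keeps the quoted name together as one field and strips the quotes, which a plain split cannot do.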
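Read CSV Example

import csv

exFile = open('example.csv', 'r', newline='')
exReader = csv.reader(exFile)
for row in exReader:        # each row is a list of strings
    print(row)
exFile.close()

import csv

exFile = open('example.csv', 'r', newline='')
exReader = csv.reader(exFile)
exRows = list(exReader)     # load every row so rows can be indexed
for i in range(0, 10, 1):   # first 10 rows; assumes the file has at least 10
    print(exRows[i])
exFile.close()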
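Because a CSV file usually starts with a header row, it is often convenient to separate that row from the data. The following is one possible sketch, assuming Python 3; it writes its own small throwaway file (with made-up contents) so that it runs anywhere, and it uses a with block, which closes the stream automatically in place of an explicit close().

```python
import csv
import os
import tempfile

# Build a small throwaway file so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "example.csv")
with open(path, "w", newline="") as f:
    f.write("ID,Course,Grade\n800412564,ISY150,A\n800174658,CIS150,F\n")

with open(path, "r", newline="") as exFile:
    exReader = csv.reader(exFile)
    header = next(exReader)      # first row holds the column names
    data_rows = list(exReader)   # remaining rows are the records

print(header)
print(len(data_rows))
```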
Write CSV Example

import csv

# newline='' lets the csv module control the line endings
outFile = open('outputFile.csv', 'w', newline='')
outWriter = csv.writer(outFile)
outWriter.writerow(['Date', 'ID', 'GPA'])                 # header row
outWriter.writerow(['01/12/2015', '700514323', '3.0'])
outWriter.writerow(['01/12/2015', '700645798', '2.64'])
outFile.close()
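A quick way to check a writer is to read the file back and compare. The sketch below, assuming Python 3, writes the same three rows with a single writerows call into a temporary directory (the temporary location is an assumption made so the example leaves no files behind in the working directory).

```python
import csv
import os
import tempfile

rows = [
    ["Date", "ID", "GPA"],
    ["01/12/2015", "700514323", "3.0"],
    ["01/12/2015", "700645798", "2.64"],
]

out_path = os.path.join(tempfile.mkdtemp(), "outputFile.csv")

with open(out_path, "w", newline="") as outFile:
    csv.writer(outFile).writerows(rows)   # write all rows in one call

with open(out_path, "r", newline="") as inFile:
    read_back = list(csv.reader(inFile))

print(read_back == rows)
```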
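Process CSV Files in a Directory Example

import csv, os

docDir = os.path.expanduser('~/Documents')   # os.listdir does not expand '~'
for currFile in os.listdir(docDir):
    if not currFile.endswith('.csv'):
        continue                             # skip files that are not CSV
    csvPath = os.path.join(docDir, currFile)
    # process the CSV file at csvPath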
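To make the "process csv file" step concrete, here is one possible sketch, assuming Python 3, that counts the rows in every .csv file in a directory. The helper name count_csv_rows and the demo files are hypothetical; the demo runs against a temporary directory instead of ~/Documents so it does not depend on what is on the machine.

```python
import csv
import os
import tempfile

def count_csv_rows(directory):
    """Return {file name: number of rows} for each .csv file in directory."""
    counts = {}
    for name in os.listdir(directory):
        if not name.endswith(".csv"):
            continue  # skip files that are not CSV
        with open(os.path.join(directory, name), "r", newline="") as f:
            counts[name] = sum(1 for _ in csv.reader(f))
    return counts

# Throwaway directory with one CSV file and one non-CSV file.
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "grades.csv"), "w", newline="") as f:
    csv.writer(f).writerows([["ID", "Grade"], ["800412564", "A"]])
with open(os.path.join(demo_dir, "notes.txt"), "w") as f:
    f.write("not a csv file\n")

print(count_csv_rows(demo_dir))
```

Note that os.listdir returns bare file names, so each name has to be joined with the directory path before it can be opened.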