CS2021 Week 2 Off and Running with Python
Two ways to run Python The Python interpreter – You type one expression at a time – The interpreter evaluates the expression and prints its value. No need for print statements. Running a Python program – Python evaluates all the statements in the file, in order – Python does not print their values (but does execute print statements) Writing an expression outside a statement (assignment, print, etc.) is useless, unless it is a function call that has a side effect
The Python interpreter The interpreter is a loop that does: – Read an expression – Evaluate the expression – Print the result If the result is None, the interpreter does not print it This inconsistency can be confusing! (Jargon: An interpreter is also called a “read-eval-print loop”, or a REPL)
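One pass of the read-eval-print loop can be sketched roughly as follows. The helper name repl_step is hypothetical and exists only for illustration; the real interpreter also handles statements, multi-line input, and errors.

```python
def repl_step(source):
    """Evaluate one expression and return the text the prompt would show.

    Hypothetical helper, not part of Python itself.
    """
    value = eval(source)      # the "eval" step
    if value is None:         # a None result is silently skipped
        return None
    return repr(value)        # the "print" step displays the value's repr


shown = repl_step("1 + 2")
hidden = repl_step("print('hello')")   # print runs, but its result is None
```

Here shown holds the text the prompt would display, while hidden is None because print() returns None after producing its side effect.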
How to launch the Python interpreter Two ways to launch the interpreter: – Run IDLE; the interpreter is called the “Python shell” – Type python at the operating system command line Type exit() to return to the operating system command line Python shell and OS shell are not the same: Operating system command line, or “shell” or “command prompt” (cmd.exe under Windows) or “terminal” – Runs programs (Python, others), moves around the file system – Does not understand Python code like 1+2 or x = 22 Python interpreter – Executes Python statements and expressions – Does not understand program names like python or cd
Interpreter: Command History Common to re-evaluate statements, or modify past statements Use command history to cycle back to previous commands Key bindings defaults in IDLE Preferences-Keys – Classic bindings for both OSX and Windows history-previous is history-next is
Two ways to run a Python program Python evaluates each statement one-by-one Python produces no output beyond the print statements in the program Two ways to run a program: – While editing a program within IDLE, press F5 (menu item “Run >> Run Module”) Must save the program first, if it is modified – Type at operating system command line: python myprogram.py
Python interpreter vs. Python program Running a Python file as a program gives different results from pasting it line-by-line into the interpreter – The interpreter prints more output than the program would In the Python interpreter, evaluating a top-level expression prints its value – Evaluating a sub-expression generally does not print any output – The interpreter does not print a value for an expression that evaluates to None This is primarily code that is executed for side effect: assignments, print statements, calls to “non-fruitful” functions In a Python program, evaluating an expression generally does not print any output
What are Modules? Modules are files containing Python definitions and statements (ex. name.py) A module’s definitions can be imported into other modules by using “import name” Within a module, the module’s name is available as the value of the global variable __name__ To access a module’s functions, we use dot notation and type “name.function()”
More on Modules Modules can import other modules Each module is imported once per interpreter session – Repeated import statements for the same module do not run it again. However, importlib.reload(name) re-executes the module’s code We can also directly import names from a module into the importing module’s symbol table – from mod import m1, m2 (or *) – If m1 is a function, then call it with m1()
Executing Modules python name.py – Runs the code in the module just as if it was imported, but with __name__ set to "__main__" – Guarding code with if __name__ == "__main__": lets the file be used both as a script and as an importable module
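The import forms above can be tried on a standard library module. This is a small sketch that uses the string module purely as a stand-in for “name”; the variable names are illustrative.

```python
import importlib
import sys

import string                                # binds the module object to the name "string"
from string import ascii_lowercase, digits   # copies two names into this module

letters = string.ascii_lowercase    # dot notation through the module name
same_letters = ascii_lowercase      # direct access, no prefix needed

cached = sys.modules["string"]      # every imported module is cached here
import string                       # a repeated import reuses the cached module
still_cached = sys.modules["string"] is cached

reloaded = importlib.reload(string)  # forces the module's code to run again
```

importlib.reload returns the module object it re-executed, which is the same object the name string already refers to.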
Example: person.py
class Person:
    def __init__(self, name, age, pay=0, job=None):
        self.name = name
        self.age = age
        self.pay = pay
        self.job = job

    def lastName(self):
        # The last word of the full name
        return self.name.split()[-1]

if __name__ == '__main__':
    # Runs only when the file is executed as a script, not when imported
    bob = Person('Bob Smith', 42, 30000, 'software')
    sue = Person('Sue Jones', 45, 40000, 'hardware')
    print(bob.name, sue.lastName())

$ python person.py
Bob Smith Jones
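To see what __name__ holds in each case, the sketch below writes a one-line module to a temporary directory and then both runs and imports it. The file name whoami.py is hypothetical, and the subprocess calls assume Python 3.7 or later.

```python
import os
import subprocess
import sys
import tempfile

source = 'print("running as", __name__)\n'

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "whoami.py")   # hypothetical module name
    with open(path, "w") as handle:
        handle.write(source)

    # Case 1: run the file as a script
    as_script = subprocess.run(
        [sys.executable, path], capture_output=True, text=True
    ).stdout.strip()

    # Case 2: import the same file from a fresh interpreter started in that folder
    as_import = subprocess.run(
        [sys.executable, "-c", "import whoami"],
        cwd=folder, capture_output=True, text=True,
    ).stdout.strip()

print(as_script)
print(as_import)
```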
The Module Search Path When import name is executed the interpreter searches for a file named name.py in several locations – Current working directory given by os.getcwd() – System path given by variable sys.path – sys.path will include a list of directories specified by environment variable called PYTHONPATH The script being run should not have the same name as a standard library module: the script’s directory is searched first, so the script would be imported in place of the library module.
sys.path A list of strings that specifies the search path for modules. Initialized from the environment variable PYTHONPATH, plus an installation-dependent default which includes the directory of all installed packages. As initialized upon program startup, the first item of this list, path[0], is the directory containing the script that was used to invoke the Python interpreter. If the script directory is not available (e.g. if the interpreter is invoked interactively or if the script is read from standard input), path[0] is the empty string, which directs Python to search modules in the current working directory first. Notice that the script directory is inserted before the entries inserted as a result of PYTHONPATH.
Example: showing system path
Python 3.5.0b4 (v3.5.0b4:c0d , Jul , 16:26:13) [GCC (Apple Inc. build 5666) (dot 3)]
>>> import sys
>>> sys.path
['', '/Users/fred/Documents',
 '/Library/Frameworks/Python.framework/Versions/3.5/lib/python35.zip',
 '/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5',
 '/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/plat-darwin',
 '/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/lib-dynload',
 '/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages']
>>> import os
>>> os.getcwd()
'/Users/fred/Documents'
>>>
Ex: changing working directory and path
# Suppose I have directory Preview that contains file initdata.py
>>> import initdata
Traceback (most recent call last):
  File " ", line 1, in <module>
    import initdata
ImportError: No module named 'initdata'
>>> os.chdir('./Preview')    # also try sys.path.append('./Preview')
>>> os.getcwd()
'/Users/fred/Documents/Preview'
>>> import initdata    # has an attribute db
>>> db
Traceback (most recent call last):
  File " ", line 1, in <module>
NameError: name 'db' is not defined
>>> initdata.db
{'sue': {'job': 'hdw', 'pay': 40000, 'age': 45, 'name': 'Sue Jones'}, 'bob': {'job': 'dev', 'pay': 30000, 'age': 42, 'name': 'Bob Smith'}}
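The same experiment can be reproduced without the course files. This sketch creates a throwaway directory holding a module named demo_initdata (a made-up name standing in for initdata.py) and shows the import failing before the directory is on sys.path and succeeding afterwards.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical stand-in for the Preview directory and its initdata.py
folder = tempfile.mkdtemp()
with open(os.path.join(folder, "demo_initdata.py"), "w") as handle:
    handle.write("db = {'bob': {'age': 42}}\n")

try:
    import demo_initdata          # the folder is not on sys.path yet
    found_before = True
except ImportError:
    found_before = False

sys.path.append(folder)           # extend the module search path
importlib.invalidate_caches()     # make sure the new file is noticed
import demo_initdata              # now the import succeeds

bob_age = demo_initdata.db["bob"]["age"]   # names need the module prefix
```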
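“Compiled” Python Files If files mod.pyc and mod.py are in the same directory, mod.pyc is assumed to hold a byte-compiled version of the module mod (Python 3.2 and later keep these files in a __pycache__ subdirectory instead) The modification time of the version of mod.py used to create mod.pyc is stored in mod.pyc Normally, the user does not need to do anything to create the .pyc file Whenever a .py file is successfully compiled, Python attempts to write the compiled version to the .pyc file A failed attempt is not an error; an incompletely written .pyc file is recognized as invalid and ignored Contents of the .pyc file are platform independent, so they can be shared by different machines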
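Byte-compilation normally happens automatically on import, but it can also be triggered by hand to see where the file ends up. A minimal sketch using the standard py_compile module, with a temporary mod.py created just for the demonstration:

```python
import os
import py_compile
import tempfile

folder = tempfile.mkdtemp()
source_path = os.path.join(folder, "mod.py")   # throwaway module for the demo
with open(source_path, "w") as handle:
    handle.write("x = 1\n")

# compile() returns the path of the byte-compiled file it wrote
compiled_path = py_compile.compile(source_path)
print(compiled_path)
```

On Python 3 the printed path sits inside a __pycache__ directory next to mod.py.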
The dir() Function
Lists the names a module defines, returned as a sorted list of strings
>>> import mod
>>> dir(mod)
['__name__', 'm1', 'm2']
Without arguments, it lists the names currently defined (variables, modules, functions, etc.)
Use dir to list names of all built-in functions and variables
>>> dir(__builtins__)
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException',…..
Packages Collections of modules within a directory – Directory must contain a special file named __init__.py Client uses “dotted module names” (ex. a.b) – Submodule b in package a Saves authors of multi-module packages from worrying about each other’s module names Python searches through sys.path directories for the package subdirectory Users of the package can import individual modules from the package Ways to import submodules – import PP4E.Preview.initdata Submodules must be referenced by full name An ImportError exception is raised when the package cannot be found
Importing Modules from Packages
This term we will be working with the package associated with the textbook, called PP4E. Notice the __init__.py file contained in the PP4E directory.
>>> sys.path.append('./PP4E')
>>> import Preview.initdata
>>> Preview.initdata.db
{'sue': {'age': 45, 'job': 'hdw', 'name': 'Sue Jones', 'pay': 40000}, 'tom': {'age': 50, 'job': None, 'name': 'Tom', 'pay': 0}, 'bob': {'age': 42, 'job': 'dev', 'name': 'Bob Smith', 'pay': 30000}}
>>> from Preview.initdata import *
>>> db
{'sue': {'age': 45, 'job': 'hdw', 'name': 'Sue Jones',
USING THE BOOK’s PP4E PACKAGE
There are a number of ways to enable imports from this directory tree:
1) Set or change the working directory to one that includes PP4E
2) Change sys.path by appending the directory containing PP4E
3) Add PP4E's container directory to your PYTHONPATH module search-path setting
4) Copy the PP4E directory to your Python installation’s Lib\site-packages standard library subdirectory
5) Include PP4E's container directory in a ".pth" path file; see Learning Python for more on module search path configuration
The Lib\site-packages directory is automatically included in the Python module search path, and is where 3rd-party software is normally installed by pip and setup.py distutils scripts. If you do copy PP4E into site-packages then the following should work:
$ python
>>> import PP4E.Gui.Tools.spams
spam spamspamspam
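A short sketch of dir() applied to a real module; string is used here only because it ships with every Python installation.

```python
import string

names = dir(string)   # every name the module defines, as sorted strings

# Names starting with an underscore are internal bookkeeping
public = [name for name in names if not name.startswith("_")]
print(public)
```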
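The package rules can be checked with a throwaway package built on the fly. In this sketch the names demo_pkg, sub, and the db contents are all made up for illustration; only the layout (directories containing __init__.py) mirrors the slide.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
package_dir = os.path.join(root, "demo_pkg")       # hypothetical package
sub_dir = os.path.join(package_dir, "sub")         # hypothetical subpackage
os.makedirs(sub_dir)

# Each package directory gets its (empty) __init__.py
for folder in (package_dir, sub_dir):
    open(os.path.join(folder, "__init__.py"), "w").close()

with open(os.path.join(sub_dir, "initdata.py"), "w") as handle:
    handle.write("db = {'tom': {'age': 50}}\n")

sys.path.append(root)             # the directory that CONTAINS the package
importlib.invalidate_caches()

import demo_pkg.sub.initdata                  # must be referenced by full name
from demo_pkg.sub.initdata import db          # or pull the name in directly

full_name_value = demo_pkg.sub.initdata.db
```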
Errors when running python
The Call Stack
As functions are called, their names are placed on the stack, and as they return, their names are removed. The Traceback presents us with the list of called functions (from the first called to the most recent called [most recent call last]), telling us the file where the call occurred, the line in that file, and the name of the function the call was made from if any (otherwise '?'). On the next line slightly indented it tells us the name of the function called.
Traceback (most recent call last):
  File "test.py", line 25, in ?
    triangle()
  File "test.py", line 12, in triangle
    inc_total_height()
  File "test.py", line 8, in inc_total_height
    total_height = total_height + height
UnboundLocalError: local variable 'total_height' referenced before assignment
We see that execution started in the file test.py and proceeded to line 25, where the function 'triangle' was called. Within the function triangle, execution proceeded until line 12, where the function 'inc_total_height' was called. Within 'inc_total_height' an error occurred on line 8.
Carefully Read the Error message
Traceback (most recent call last):
  File "nx_error.py", line 41, in <module>
    print friends_of_friends(rj, myval)
  File "nx_error.py", line 30, in friends_of_friends
    f = friends(graph, user)
  File "nx_error.py", line 25, in friends
    return set(graph.neighbors(user))#
  File "/Library/Frameworks/…/graph.py", line 978, in neighbors
    return list(self.adj[n])
TypeError: unhashable type: 'list'
Call stack or traceback:
– First function that was called (<module> means the interpreter)
– Second function that was called
– Last function that was called (this one suffered an error)
The error message: daunting but useful. You need to understand:
– the literal meaning of the error
– the underlying problems certain errors tend to suggest
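A traceback of the same shape can be produced on purpose. The sketch below reuses the function names from the slide, but the bodies are a guess at what test.py might contain, not the original program.

```python
import traceback

total_height = 0


def inc_total_height(height):
    # Assigning to total_height makes it local to this function,
    # so reading it on the right-hand side fails.
    total_height = total_height + height


def triangle():
    inc_total_height(3)


try:
    triangle()
except UnboundLocalError:
    report = traceback.format_exc()   # the traceback as a string
    print(report)
```

Reading the printed report from top to bottom follows the calls in order: the top-level code, then triangle, then inc_total_height, where the error was raised. The usual fix is to declare global total_height inside the function or to pass the value in and return the result.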
Simple Debugging Tools print – shows what’s happening whether there’s a problem or not – does not stop execution assert – Raises an exception if some condition is not met – Does nothing if everything works – Example: assert len(rj.edges()) == 16 – Use this liberally! Not just for debugging! input – Stops execution – (Designed to accept user input, but I rarely use it for this.)
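A small sketch of assert in use. The function average and its message text are illustrative only; note that assert statements are skipped entirely when Python is started with the -O option.

```python
def average(values):
    # Does nothing when the condition holds; raises AssertionError otherwise
    assert len(values) > 0, "average() needs at least one value"
    return sum(values) / len(values)


result = average([2, 4, 6])

try:
    average([])
    failed = False
except AssertionError as error:
    failed = True
    message = str(error)
    print("assertion failed:", message)
```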
Working with Files and Persistent Data
Files and filenames A file object represents data on your disk drive – Can read from it and write to it A filename (usually a string) states where to find the data on your disk drive – Can be used to find/create a file – Examples: "/home/mernst/class/140/lectures/file_io.pptx" "C:\Users\mernst\My Documents\cute_cat.gif" "lectures/file_io.pptx" "cute_cat.gif"
Read a file in python
# Open takes a filename and returns a file.
# This fails if the file cannot be found & opened.
myfile = open("datafile.dat")
# Approach 1:
for line_of_text in myfile:
    ...  # process line_of_text
# Approach 2:
all_data_as_a_big_string = myfile.read()
Assumption: file is a sequence of lines
Where does Python expect to find this file (note the relative pathname)?
Two types of filename An Absolute filename gives a specific location on disk: "/home/mernst/class/140/13wi/lectures/file_io.pptx" or "C:\Users\mernst\My Documents\cute_cat.gif" – Starts with “/” (Unix) or “C:\” (Windows) – Warning: code will fail to find the file if you move/rename files or run your program on a different computer A Relative filename gives a location relative to the current working directory: "lectures/file_io.pptx" or "cute_cat.gif" – Warning: code will fail to find the file unless you run your program from a directory that contains the given contents A relative filename is usually a better choice
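The relationship between the two kinds of filename can be inspected with os.path. This sketch only manipulates the path strings; the file lectures/file_io.pptx does not need to exist.

```python
import os

relative = os.path.join("lectures", "file_io.pptx")
absolute = os.path.abspath(relative)   # resolved against the current working directory

print(os.getcwd())
print(relative, os.path.isabs(relative))
print(absolute, os.path.isabs(absolute))
```

Running the script from a different directory changes the absolute result, which is exactly why a relative filename depends on where the program is started.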
Reading a file multiple times
You can iterate over a list as many times as you like:
mylist = [ 3, 1, 4, 1, 5, 9 ]
for elt in mylist:
    ...  # process elt
for elt in mylist:
    ...  # process elt
Iterating over a file uses it up:
myfile = open("datafile.dat")
for line_of_text in myfile:
    ...  # process line_of_text
for line_of_text in myfile:
    ...  # process line_of_text
    # This loop body will never be executed!
Solution 1: Read into a list, then iterate over it
myfile = open("datafile.dat")
mylines = []
for line_of_text in myfile:
    mylines.append(line_of_text)
...  # use mylines
Solution 2: Re-create the file object (slower, but a better choice if the file does not fit in memory)
myfile = open("datafile.dat")
for line_of_text in myfile:
    ...  # process line_of_text
myfile = open("datafile.dat")
for line_of_text in myfile:
    ...  # process line_of_text
Writing to a file in python
# Replaces any existing file of this name
myfile = open("output.dat", "w")    # "w": open for Writing (no argument, or "r", for Reading)
# Just like printing output
myfile.write("a bunch of data")
myfile.write("a line of text\n")    # "\n" means end of line (Newline)
myfile.write(4)         # Wrong; results in: TypeError: expected a character buffer object
                        # (Python 3 words it: write() argument must be str, not int)
myfile.write(str(4))    # Right. Argument must be a string
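The “used up” behaviour is easy to observe on a small temporary file. The sketch also shows a third option that is not on the slide, rewinding the same file object with seek(0); treat it as an addition, not part of the original material.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "datafile.dat")
with open(path, "w") as handle:
    handle.write("first\nsecond\n")

with open(path) as myfile:
    first_pass = [line for line in myfile]    # reads both lines
    second_pass = [line for line in myfile]   # file is exhausted: nothing left
    myfile.seek(0)                            # rewind to the start (not on the slide)
    third_pass = [line for line in myfile]    # reads both lines again

print(len(first_pass), len(second_pass), len(third_pass))
```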
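A runnable variant of the writing example, offered as a sketch: it writes to a temporary directory instead of the current one and uses a with block so the file is closed automatically.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "output.dat")

with open(path, "w") as myfile:        # "w" replaces any existing file
    myfile.write("a bunch of data")
    myfile.write("a line of text\n")   # write() adds no newline on its own
    myfile.write(str(4))               # numbers must be converted first
    try:
        myfile.write(4)                # not a string
        rejected = False
    except TypeError:
        rejected = True

with open(path) as myfile:             # reopen for reading to check the result
    contents = myfile.read()

print(repr(contents))
```

Unlike print, write() does not insert spaces or newlines between pieces, so the first two strings run together in the file.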
GO to Homework #2