Structured programming 4 Day 34 LING Computational Linguistics Harry Howard Tulane University
16-Nov-2009LING , Prof. Howard, Tulane University2 Course organization
Structured programming NLPP §4
Today's topics: defensive programming; debugging; algorithm design.
Defensive programming: brainstorm with pseudo-code; careful naming conventions; bottom-up construction; functional decomposition; comment, comment, comment; regression testing.
Brainstorm with pseudo-code: Before you write the first line of Python code, write what your program does as pseudo-code. That is to say, before writing a program that Python understands, write it in a way that people understand.
An example of pseudo-code: SPOT, move forward about 10 inches, turn left 90 degrees, and start moving forward, then start looking for a black object with your ultrasonic sensor, because I want you to stop when you find a black object, then turn right 90 degrees, and move backward 2 feet, OK? What is good or bad about this example?
A different phrasing of the example: SPOT, move forward about 10 inches and stop. Now turn left 90 degrees. Start moving forward, and turn on your ultrasonic sensor. Stop when you find a black object. Turn right 90 degrees and stop. Move backward 2 feet and stop. What is good or bad about this example?
Pseudo-code and real code: The main advantage of the second phrasing is that the command in each sentence can be matched to an element of the programming language.
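As a rough illustration of that matching, the second phrasing can be lined up with one call per sentence. The Robot class and its methods below are hypothetical stand-ins invented for this sketch (they only record commands); they are not the API of any real robot kit.

```python
class Robot:
    """Hypothetical stand-in for SPOT: it only records the commands it receives."""

    def __init__(self):
        self.log = []

    def move(self, inches):
        # Positive values mean forward, negative values mean backward.
        self.log.append(("move", inches))

    def turn(self, degrees):
        # Negative values mean left, positive values mean right.
        self.log.append(("turn", degrees))

    def move_until(self, condition):
        # Keep moving forward until the (hypothetical) sensor reports the condition.
        self.log.append(("move_until", condition))


spot = Robot()
spot.move(10)                    # Move forward about 10 inches and stop.
spot.turn(-90)                   # Now turn left 90 degrees.
spot.move_until("black object")  # Start moving forward; stop when you find a black object.
spot.turn(90)                    # Turn right 90 degrees and stop.
spot.move(-24)                   # Move backward 2 feet (24 inches) and stop.
```

Each sentence of the pseudo-code survives as the comment on the line that implements it.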
Careful naming conventions: Choose meaningful variable and function names.
Bottom-up construction: Instead of writing a 20-line program and then testing it, build and test smaller units, and then combine them. In general, these smaller units should be functions.
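A minimal sketch of bottom-up construction, assuming a toy task (finding the most frequent word in a string) chosen only for illustration. The function names are hypothetical; each small function is checked on its own before the pieces are combined.

```python
def tokenize(text):
    """Split text into lowercase tokens on whitespace."""
    return text.lower().split()


def count_tokens(tokens):
    """Map each token to the number of times it occurs."""
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


def most_frequent(counts):
    """Return the token with the highest count."""
    return max(counts, key=counts.get)


# Test each small unit on its own first...
assert tokenize("The cat saw the dog") == ["the", "cat", "saw", "the", "dog"]
assert count_tokens(["a", "b", "a"]) == {"a": 2, "b": 1}
assert most_frequent({"a": 2, "b": 1}) == "a"


# ...and only then combine the tested units.
def most_frequent_word(text):
    return most_frequent(count_tokens(tokenize(text)))


print(most_frequent_word("The cat saw the dog"))  # prints: the
```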
NLP pipeline: Fig. 3.1
Commenting: Add comments to every line, unless what a line does is so obvious that a comment would get in the way. Your pseudo-code could become the comments on your real code.
Regression testing: Keep a suite of test cases. As your program gets bigger, it should still work on previous test cases. If it stops working, it has 'regressed': a change in the code has had the (unintended) side effect of breaking something that used to work. Python's doctest module supports this kind of testing. It runs the examples written in a docstring as if they were typed in interactive mode and checks that the output matches. See the doctest documentation.
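A small sketch of how doctest can serve as a regression suite. The plural function and its deliberately naive rules are invented for this illustration; the examples in its docstring are the test cases that must keep passing as the function grows.

```python
import doctest


def plural(word):
    """Return a naive English plural of word.

    >>> plural("dog")
    'dogs'
    >>> plural("fly")
    'flies'
    >>> plural("box")
    'boxes'
    """
    if word.endswith("y"):
        return word[:-1] + "ies"
    if word.endswith(("s", "x")):
        return word + "es"
    return word + "s"


if __name__ == "__main__":
    # Silent when every docstring example still passes; reports any that regress.
    doctest.testmod()
```

If a later change to plural breaks one of the three examples, running the file prints the failing example together with the expected and actual output.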
Debugging topics: check your assumptions; exception and stack trace; interactive debugging; Python's debugger; prediction.
Debugging: "Most code errors result from the programmer making incorrect assumptions" (NLPP:158). When you find an error, first check your assumptions. Add print statements to show the values of variables and how far the program progresses. Reduce the input to the smallest amount needed to cause the error.
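One way to put these steps into practice, sketched with a hypothetical function: the assumption about the input is stated as an assert, a temporary print shows intermediate values, and the failing input is cut down to a single short string.

```python
def average_word_length(words):
    # Assumption made explicit: words is a list of strings, not one long string.
    assert isinstance(words, list), "expected a list of words"
    total = sum(len(w) for w in words)
    # Temporary trace: shows the values and proves execution got this far.
    print("DEBUG total =", total, "count =", len(words))
    return total / len(words)


print(average_word_length(["the", "cat", "sat"]))  # prints 3.0 after the DEBUG line

# Smallest input that exposes the wrong assumption: a bare string.
# Without the assert, the function would quietly average over characters.
try:
    average_word_length("the cat")
except AssertionError as error:
    print("assumption violated:", error)
```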
Stack trace: A runtime error (Python exception) gives a stack trace that pinpoints the location of program execution at the time of the error. But the error may actually be upstream.
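A contrived sketch of an upstream error, with hypothetical function names: the exception is raised inside total, but the faulty value was created earlier, in load_counts, which has already returned by the time the error occurs.

```python
import traceback


def load_counts():
    # The actual bug is here: one count is stored as a string, not an int.
    return {"the": 2, "cat": "1"}


def total(counts):
    result = 0
    for word in counts:
        result += counts[word]  # the exception is raised on this line
    return result


try:
    total(load_counts())
except TypeError:
    # The trace points at the line inside total, not at the faulty data in load_counts.
    trace = traceback.format_exc()
    print(trace)
```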
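Python's debugger: Invoke it with import pdb, then import mymodule, then pdb.run('mymodule.myfunction()'), where mymodule and myfunction stand for your own module and function. The argument to pdb.run is a string containing the statement to execute under the debugger. The debugger lets you monitor the execution of the program, specify line numbers where the program should stop (breakpoints), and step through sections of code while inspecting the values of variables.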
Prediction: Try to predict the effect of a potential bugfix before re-running the program. "If the bug isn't fixed, don't fall into the trap of blindly changing the code in the hope that it will magically start working again." (NLPP:159) For each change, try to articulate what is wrong and how the change will fix the problem. Undo the change if it doesn't work. "Programs don't magically work; they magically don't work." (Robert Goldman)
Algorithm design NLPP §4.7
Algorithms: divide and conquer; start with something that works; iteration; recursion.
Divide and conquer: Divide a problem of size n into two problems of size n/2. Example: binary search, as when looking up a word in a dictionary by opening it near the middle and deciding which half must contain the word.
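A minimal sketch of binary search over a sorted word list, assuming the list is already in alphabetical order. Each comparison discards half of the remaining range, which is the halving described above.

```python
def binary_search(sorted_words, target):
    """Return the index of target in sorted_words, or -1 if it is absent."""
    low, high = 0, len(sorted_words) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_words[mid] == target:
            return mid
        if sorted_words[mid] < target:
            low = mid + 1   # target can only be in the upper half
        else:
            high = mid - 1  # target can only be in the lower half
    return -1


words = sorted(["dog", "cat", "emu", "ant", "bee"])
print(binary_search(words, "dog"))  # prints 3
print(binary_search(words, "fox"))  # prints -1
```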
Start with the known: Transform the task into something that already works. To find duplicates in a list, first sort the list, then check whether adjacent items are identical.
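The duplicate-finding recipe can be sketched as follows. The function name is hypothetical, and reporting each duplicated item only once is a choice made for this example.

```python
def duplicates(items):
    """Return each item that occurs more than once, in sorted order."""
    ordered = sorted(items)  # sorting puts identical items next to each other
    found = []
    for previous, current in zip(ordered, ordered[1:]):
        # Adjacent identical items are duplicates; record each one only once.
        if previous == current and (not found or found[-1] != current):
            found.append(current)
    return found


print(duplicates(["the", "cat", "saw", "the", "dog", "the", "cat"]))
# prints ['cat', 'the']
```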
Iteration vs. recursion: For some function ƒ, iteration repeats ƒ some number of times, for example by calling ƒ in a for loop. In recursion, ƒ calls itself some number of times. A linguistic example is a pair of rules that lead back to each other: NP → the N PP. PP → P NP.
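A short sketch contrasting the two styles, using factorial as a stand-in task. The last function mimics the NP and PP rules; the depth limit is an assumption added here so that the recursion stops, since the rules as written would expand forever.

```python
def factorial_iterative(n):
    """Iteration: repeat a multiplication inside a for loop."""
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


def factorial_recursive(n):
    """Recursion: the function calls itself on a smaller problem."""
    if n <= 1:  # base case ends the chain of calls
        return 1
    return n * factorial_recursive(n - 1)


def noun_phrase(depth):
    """Expand NP -> the N PP and PP -> P NP, embedding depth PPs.

    The depth limit is an assumption of this sketch, not part of the rules.
    """
    if depth == 0:
        return "the N"
    return "the N P " + noun_phrase(depth - 1)


print(factorial_iterative(5), factorial_recursive(5))  # prints 120 120
print(noun_phrase(2))  # prints: the N P the N P the N
```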
Next time: Start NLPP §6, Learning to classify text.