Making Good Code AKA: So, You Wrote Some Code. Now What? Ray Haggerty July 23, 2015
Step 1: Debugging
What is Debugging?
The term “debugging” was popularized in the 1940s by Admiral Grace Hopper.
Debugging is, essentially, the fixing of errors in your code.
The Three Major Types of Bugs
Compilation Errors: These prevent your code from executing at all. Examples: mismatched delimiters, loops that are never closed.
Run-time Errors: These occur when your code attempts an impossible operation. Examples: dividing by zero, calling a function that doesn’t exist.
Logic Errors: These do not throw an error, but give unexpected results. Examples: using ‘and’ and ‘or’ incorrectly, off-by-one mistakes.
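To make the last two categories concrete, here is a minimal Python sketch. The function names are hypothetical and exist only for illustration: one function raises a run-time error, and another contains an off-by-one logic error that runs without complaint.

```python
# Run-time error: valid Python that attempts an impossible operation.
def mean(values):
    return sum(values) / len(values)  # raises ZeroDivisionError for an empty list


# Logic error: no exception is raised, but the answer is wrong.
def sum_first_n_buggy(values, n):
    total = 0
    for i in range(n - 1):  # off by one: stops one element early
        total += values[i]
    return total


# Corrected version of the same function.
def sum_first_n(values, n):
    total = 0
    for i in range(n):
        total += values[i]
    return total


try:
    mean([])
except ZeroDivisionError as err:
    print("run-time error:", err)

print(sum_first_n_buggy([1, 2, 3, 4], 3))  # prints 3, although 6 was intended
print(sum_first_n([1, 2, 3, 4], 3))        # prints 6
```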
Quick & Dirty: Using Print Statements
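A minimal sketch of print-statement debugging in Python. The function `normalize` is a hypothetical example; the idea is simply to print intermediate values so you can compare them with what you expect.

```python
def normalize(values):
    """Scale values so they sum to 1 (hypothetical example function)."""
    total = sum(values)
    print("DEBUG total =", total)      # check an intermediate value
    result = [v / total for v in values]
    print("DEBUG result =", result)    # check the final value before returning
    return result


normalize([2, 2, 4])
```

Remember to remove or disable the print statements once the bug is found.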
[Figure: two panels, “Expectation” and “Reality”]
Interpreting Error Messages
An error message references the script or function the error was generated in.
It gives the line at which the error was detected. This is not necessarily the line the error is actually on.
It gives a short explanation of what the computer attempted and couldn’t accomplish.
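The three parts of an error message can be pulled apart in Python with the standard `traceback` module. This is only a sketch; `parse_ages` and `report` are hypothetical functions. Note that the error is detected inside `parse_ages`, while the actual mistake is the bad value passed in by the caller.

```python
import traceback


def parse_ages(records):
    ages = []
    for r in records:
        ages.append(int(r))  # the error is detected on this line
    return ages


def report(records):
    return parse_ages(records)


try:
    report(["31", "27", "n/a"])  # the real problem is the "n/a" entry here
except ValueError as err:
    frames = traceback.extract_tb(err.__traceback__)
    last = frames[-1]            # innermost frame: where the error was detected
    message = str(err)
    print("function:", last.name)
    print("line:", last.lineno)
    print("message:", message)
```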
Absolutely Correct: Using a Debugger
Debuggers act a bit like version control for a running program: they let you pause execution and inspect the state of every variable at any instruction.
When the program breaks, you can execute instructions against the last state of the program.
They also allow backtracing to find the faulty piece of code directly.
Absolutely Correct: Using a Debugger
In compiled languages, debuggers require a debug-mode compile; in general they are slow and memory intensive.
Check your code in release mode after running it through the debugger.
Sample debuggers:
Python includes the extensible pdb debugger.
MATLAB has sldebug, the Simulink debugger.
R can use the Dr. Mingw “just-in-time” debugger.
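As a small illustration with Python's pdb, the sketch below pauses inside a loop when a flag is set. The `DEBUG` flag and the function `running_total` are assumptions made for this example; they are not part of pdb itself.

```python
import pdb

DEBUG = False  # hypothetical flag; set to True to drop into the debugger


def running_total(values):
    total = 0
    for v in values:
        if DEBUG:
            # Pauses here. At the (Pdb) prompt, `p total` prints a variable,
            # `n` runs the next line, and `c` continues execution.
            pdb.set_trace()
        total += v
    return total


print(running_total([1, 2, 3]))
```

Alternatively, an unmodified script can be started under the debugger with `python -m pdb script.py`.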
Step 2: Testing
What is Testing?
Just because your code worked on one specific problem DOES NOT mean it will work on other, similar problems.
Just because you didn’t see an error message DOES NOT mean it is running correctly.
Testing is a way to ensure that your code behaves appropriately under the circumstances it promises to handle.
You also need to make sure that your output is actually scientifically correct.
The Four Major Testing Cases
Base Case: expected input.
Edge Case: a minimum or maximum acceptable parameter.
Corner Case: the intersection of multiple Edge Cases.
Boundary Case: a value near an Edge Case, on either side of it.
Quick & Dirty: Test Various Inputs
Manually test your code by trying each type of testing case and comparing the results to what you expect.
For example, if the function expects the input to be between 0 and 100, we might test the following inputs:
22, 50, 89 (Base Cases)
0, 100 (Edge Cases)
-1, 1, 99, 101 (Boundary Cases)
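The inputs listed above could be tried by hand roughly as follows. The function `percent_to_fraction` is a hypothetical stand-in for whatever function expects an input between 0 and 100.

```python
def percent_to_fraction(p):
    """Hypothetical function that promises to accept 0 <= p <= 100."""
    if not 0 <= p <= 100:
        raise ValueError(f"p must be between 0 and 100, got {p}")
    return p / 100


# Base cases, edge cases, and the boundary cases that fall inside the range.
for p in [22, 50, 89, 0, 100, 1, 99]:
    print(p, "->", percent_to_fraction(p))

# Boundary cases just outside the range should be rejected.
for p in [-1, 101]:
    try:
        percent_to_fraction(p)
        print(p, "-> accepted (unexpected)")
    except ValueError as err:
        print(p, "-> rejected:", err)
```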
Absolutely Correct: Unit Testing
Unit tests are pieces of test code that exist solely to test your code.
Since they exist outside of the code, they are durable: the same unit test can check multiple versions of your code.
However, you need to be very careful writing them, because a wrong test can hide a bug or flag correct code.
Good idea: write a unit test every time you fix a bug.
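A minimal sketch of a unit test using Python's standard `unittest` module. The function under test, `clamp`, is a hypothetical example; the tests cover a base case, the edge cases, and the boundary cases.

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: limit value to the range [low, high]."""
    return max(low, min(value, high))


class TestClamp(unittest.TestCase):
    def test_base_case(self):
        self.assertEqual(clamp(50, 0, 100), 50)

    def test_edge_cases(self):
        self.assertEqual(clamp(0, 0, 100), 0)
        self.assertEqual(clamp(100, 0, 100), 100)

    def test_boundary_cases(self):
        self.assertEqual(clamp(-1, 0, 100), 0)
        self.assertEqual(clamp(101, 0, 100), 100)


# Collect and run the tests explicitly so the script keeps running afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestClamp)
result = unittest.TextTestRunner().run(suite)
```

In a real project the tests would normally live in a separate file, so that they can be rerun unchanged against each new version of the code.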
Exception Handling
When your code encounters an error, you want it to alert the user to the problem and exit as gracefully as possible.
Exception handling changes the flow of a program to handle abnormal or unexpected conditions.
For example, if a FASTA file contains an unexpected symbol, report where it is and give an informative message.
Exception Handling
Try/Catch statements allow you to override the default error behavior.
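In Python, try/catch is spelled `try`/`except`. The sketch below replaces the default error with a more informative one; `read_number` is a hypothetical helper written for this example.

```python
def read_number(text):
    """Convert text to a float, replacing the default error with a clearer one."""
    try:
        return float(text)
    except ValueError:
        # Override the default behavior: raise an error with more context.
        raise ValueError(f"expected a number but got {text!r}") from None


try:
    read_number("12.5x")
except ValueError as err:
    print("Error:", err)
```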
Exception Handling
Asserts allow you to throw custom errors if a condition is not met.
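A small sketch of asserts in Python, loosely based on the idea of reporting an unexpected symbol in a sequence. The function `gc_content` and its rules (only A, C, G, and T are accepted) are assumptions made for illustration.

```python
def gc_content(sequence):
    """Fraction of G and C in a DNA string (hypothetical example function)."""
    assert len(sequence) > 0, "sequence must not be empty"
    # Collect (position, symbol) pairs for anything that is not A, C, G, or T.
    bad = [(i, ch) for i, ch in enumerate(sequence) if ch not in "ACGT"]
    assert not bad, f"unexpected symbol {bad[0][1]!r} at position {bad[0][0]}"
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


print(gc_content("ACGT"))

try:
    gc_content("ACXT")
except AssertionError as err:
    print("Error:", err)
```

Note that Python skips `assert` statements when run with the `-O` option, so checks on user input that must always run are usually better written as explicit `raise` statements.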
Step 3: Using Best Practices
Why Should You Use Best Practices?
In all likelihood, you will not be the only person using or looking at your code.
In all likelihood, you will want to make revisions to your own code at some point.
In all likelihood, you will want to have fast code.
Encapsulation
Encapsulation is breaking your code down into smaller parts and assembling them like LEGO bricks.
This is actually a somewhat complex topic, but in the simplest sense, it means breaking out sections of your code into new subfunctions and calling those within your main script or function.
Encapsulation has several advantages:
Code readability
Better for testing
Future-proofing
Code reuse
Controls variable scope issues
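In the simplest sense described above, encapsulation might look like the following Python sketch. All three function names are hypothetical; the point is that the main routine is assembled from small subfunctions that can each be tested, reused, or replaced on their own.

```python
def read_values(text):
    """Parse a comma-separated string into a list of floats."""
    return [float(item) for item in text.split(",")]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def summarize(text):
    """Main routine: assembled from the small pieces above."""
    values = read_values(text)
    return f"n={len(values)}, mean={mean(values):.2f}"


print(summarize("1.0, 2.0, 6.0"))  # prints n=3, mean=3.00
```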
Code Testing
Since encapsulation breaks your code down into individual functions, it makes unit testing very straightforward.
Ensuring that each subfunction works appropriately is critical for the other benefits of encapsulation.
Future-Proofing
Since your code is encapsulated into small, independent blocks, it is very easy to reuse and replace each subfunction.
If you decide you want to change the way part of your code works, you can easily replace it without worrying about the entire script breaking.
Code Reuse
Conversely, you may find yourself needing to write a lot of different scripts that perform the same function(s).
You could just copy/paste, but that is sloppy and could introduce errors.
If you’ve already used encapsulation, then you can easily reuse a subfunction in a new script.
Variable Scope
The scope of a variable refers to which parts of the code can “see” and access that variable.
Generally, you want your variables to have the smallest scope possible, so they are protected from accidental changes elsewhere in the code.
There are lots of different scopes, but the two main ones are:
Local: only visible to a small section of the code (e.g. a subfunction).
Non-local: visible to a larger section or all of the code.
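The two scopes can be seen in a short Python sketch. The names `scale` and `scaled_sum` are hypothetical; `scale` is visible everywhere in the file, while `total` exists only inside the function.

```python
scale = 10  # non-local (module level): visible to every function in this file


def scaled_sum(values):
    total = 0  # local: exists only while scaled_sum is running
    for v in values:
        total += v * scale
    return total


print(scaled_sum([1, 2, 3]))  # prints 60

try:
    print(total)  # the local variable cannot be seen from out here
except NameError as err:
    print("total is not visible here:", err)
```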
Header Comments
A header comment is a block comment at the top of your code that helps its readers.
It should include:
Explanation of what the code does (Input/Output)
Name and contact info
Date edited
Requirements (packages, software versions)
Header Comment Examples
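One possible header comment in Python, written as a docstring at the top of the file. The file name, function, and version requirement are hypothetical, and the bracketed fields are placeholders to fill in.

```python
"""
column_means.py

Purpose:      Compute the mean of each column in a table of numbers.
Input:        A list of rows, each a list of numbers of equal length.
Output:       A list with one mean per column.
Author:       [your name], [your email]
Last edited:  [date]
Requirements: Python 3.8 or later; standard library only.
"""


def column_means(rows):
    """Return the mean of each column in rows."""
    return [sum(col) / len(col) for col in zip(*rows)]


print(column_means([[1.0, 2.0], [3.0, 4.0]]))  # prints [2.0, 3.0]
```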