Lane Medical Library & Knowledge Management Center How to Write a Program Yannick Pouliot, PhD Bioresearch Informationist © 2009 The Board of Trustees of The Leland Stanford Junior University
The Bioresearch Informationist: At Your Service
- Yannick Pouliot, PhD, Lane Medical Library & Knowledge Management Center
- Bioresearch Informationist ≈ computational biologist in residence
- Role: Support laboratory researchers regarding biocomputational resources and their use
  - …especially postdocs
- Contact:
Objectives
- Understand the thinking process behind writing a program.
Check Point
- Q: Do you have Perl installed?
So What’s Needed for Writing a Program?
- Knowledge of:
  1. Language instructions (FOR loop, CASE, IF, etc.)
  2. Syntax: correctly invoking instructions
- A (very) clear strategy
  - Programming is giving instructions to an idiot savant
  - The instructions must be complete
  - The program must work in every aspect
- We’ll use a simple method that can greatly facilitate the process and increase the quality of your work
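To make the difference between instructions and syntax concrete, here is a minimal Perl sketch (Perl being the language used later in this talk). The variable names and the word list are hypothetical, chosen only for illustration; the point is that a loop and a condition are instructions, and the braces, parentheses, and semicolons are the syntax needed to invoke them correctly.

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my @words = ('hello', 'world', '');

# Lines of output are collected here before being printed.
my @output;

# FOR loop: repeat the same instructions for every element of the list.
foreach my $word (@words) {
    # IF: choose what to do based on a condition.
    if ($word eq '') {
        push @output, '(empty word skipped)';
    }
    else {
        push @output, $word;
    }
}

# Print each collected line on its own row.
print "$_\n" for @output;
```

Leaving out a single brace or semicolon stops the program from running at all, which is why the instructions must be complete.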
Our Example: Writing a Program that Prints “Hello World”
- Applying “Ten Steps to Write a Program Without Tears” to print text
Step 1: Start With The End: What is the Outcome?
- Visualize what the end result of your program will be
- Be highly specific, e.g.
  - Silly: print “hello” whenever the program runs
  - Real: calculate the average frequency of CpG islands in human genes known to code for interleukins
    - Does a definition of “interleukin protein” exist?
    - Everywhere in the genome?
    - What about pseudogenes? Do they count?
- Useful to create a flowchart diagram of the process
Flowcharting to Understand the Process
- Lots of tools, some free
- I use MS Visio (Win only)
Step 3: Break Down Tasks into Chunks (= “Subroutines”)
- For us, we want to print a single word: “hello”
  - Chunk = print whatever word is requested
- Use individual subroutines to handle chunks
  - Subroutines are used whenever a complex and/or repetitive step is involved
  - Why? Subs simplify the writing and reading of code because they subsume a set of instructions under a human-understandable name, e.g. “PrintText()”, “QueryDB()”
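As a rough illustration of a chunk, here is one way a “PrintText()” subroutine might look in Perl. Only the name comes from the slide; the body, the parameter name, and the choice to return the printed string are assumptions made for this sketch.

```perl
use strict;
use warnings;

# PrintText: print whatever text is requested, followed by a newline.
# Returning the printed string is an assumption of this sketch; it makes
# the subroutine easy to check.
sub PrintText {
    my ($text) = @_;
    my $line = "$text\n";
    print $line;
    return $line;
}

# One readable call now stands in for the instructions inside the sub.
PrintText('hello');
```

The same subroutine can be called again with any other word, which is what makes it a reusable chunk rather than a one-off instruction.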
Step 6: How is Our Data Stored?
- We’ll store text to be printed in a string variable
- Q: what should it be called?
$TextToPrint = 'Hello…
… = ('Gene1','Gene2','Gene3','Gene4');
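The two kinds of storage hinted at on this slide can be sketched as follows. This is an illustration under stated assumptions: the value 'Hello World' is taken from the title of the example, and @GeneNames is a hypothetical name, since the array's name is not legible on the slide.

```perl
use strict;
use warnings;

# A scalar ($) holds a single value, such as the text to print.
# 'Hello World' is assumed from the title of the example.
my $TextToPrint = 'Hello World';

# An array (@) holds an ordered list of values.
# @GeneNames is a hypothetical name chosen for this sketch.
my @GeneNames = ('Gene1', 'Gene2', 'Gene3', 'Gene4');

print "$TextToPrint\n";
print "Number of genes: ", scalar(@GeneNames), "\n";
print "First gene: $GeneNames[0]\n";
```

A descriptive name such as $TextToPrint answers the question “what should it be called?”: the name says what the variable holds, so the code reads almost like a sentence.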
Putting It into Practice
Step 7: Update Generic Program to Put Everything Together
1. Download and save “GenericProgram.pl” under a new name, e.g., “printtext.pl”
2. Open it in a vanilla (plain) text editor (NOT MS Word)
   - Windows: use NotePad
   - Mac: use Applications/TextEdit, then Format/Make Plain Text to ensure … plain text
3. Save the file
   - If Mac user, remove the first line of text (“#!c:/…”)
Updating printtext.pl
[Figure: PrintText in printtext.pl]
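The contents of “GenericProgram.pl” are not reproduced in these slides, so the following is only a hypothetical sketch of what an updated printtext.pl could look like once the outcome, the chunk, and the variable have been decided. The layout (data, main program, subroutines) and the value 'Hello World' are assumptions, not the original file.

```perl
# printtext.pl: print a piece of text (hypothetical sketch, not the original file).
use strict;
use warnings;

# --- Data: the text to print (value assumed from the example's title) ---
my $TextToPrint = 'Hello World';

# --- Main program: each step is one readable subroutine call ---
PrintText($TextToPrint);

# --- Subroutines ---

# PrintText: print the requested text followed by a newline.
sub PrintText {
    my ($text) = @_;
    print "$text\n";
    return 1;
}
```

Keeping the main program at the top and the subroutines below it lets a reader see the overall flow first and the details second.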
Run The Program
- Type: perl -f "printtext.pl"
In Summary…
- Think in great detail about what the outcome looks like
- Think in terms of chunks
- Think about your variables (“data structures”)