Perl for Bioinformatics Part 2 Stuart Brown NYU School of Medicine.


1 Perl for Bioinformatics Part 2 Stuart Brown NYU School of Medicine

2 Sources
– Beginning Perl for Bioinformatics, James Tisdall, O’Reilly Press, 2000
– Using Perl to Facilitate Biological Analysis, in Bioinformatics: A Practical Guide (2nd Ed.), Lincoln Stein, Wiley-Interscience, 2001
– Introduction to Programming and Perl, Alan M. Durham, Computer Science Dept., Univ. of São Paulo, Brazil

3 Debugging
Hopefully you were lucky enough to have some bugs in your programs from the first Perl exercise.
Test each line as you write it
– insert extra print statements to check the values of your variables
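As a minimal sketch of the print-statement approach, the script below prints each variable right after it is set. The variable names and the sequence are made up for illustration. The debug lines go to STDERR so that they stay separate from the program's normal output.

```perl
use strict;
use warnings;

my $seq = "GATTACA";
print STDERR "DEBUG: seq is '$seq'\n";             # check the starting value

my $length = length($seq);
print STDERR "DEBUG: length is $length\n";         # check after each step

my $first = substr($seq, 0, 1);
print STDERR "DEBUG: first letter is '$first'\n";  # quotes make stray spaces visible
```

Once the program works, the DEBUG lines can be deleted or commented out with #.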

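4 Perl Debugging Help
Add -w to the first line of your programs:
    #!/usr/bin/perl -w
(use the path where perl is installed on your system)
– turns on ‘warnings’, for example when a variable is used before it has been given a value
Add use strict; as the 2nd line of your programs
– every variable must be declared with my before it is used, which catches misspelled variable names
– set variables to some initial value (such as 0 or an empty string) before using them
(the short examples in these slides leave out my, so they run only without use strict)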
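A small sketch of what these two settings look like in practice. The variable names are hypothetical, and `use warnings;` is used here as the in-file equivalent of the -w switch.

```perl
use strict;      # variables must be declared with "my"
use warnings;    # same kind of checks as -w, turned on inside the file

my $total   = 0;     # declared and given an initial value
my $reading = 5;

$total = $total + $reading;
print "Total is $total\n";

# With "use strict", a misspelled name such as $totl would stop the
# program before it runs, instead of silently creating a new variable.
```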

5 Variable “Interpolation”
A variable holds a value:
    $value = 6;
When you print the variable, Perl prints the value rather than the name of the variable:
    print $value;
    6
If you put a variable inside double quotes, Perl substitutes its value (this is called variable interpolation):
    print "The result is $value\n";
    The result is 6
If you use single quotes, nothing is interpolated; the variable name and the \n are printed literally:
    print 'The result is $value\n';
    The result is $value\n

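6 Input
A Perl program can take input from the keyboard
– The angle bracket operator ( <> ) reads one line of input
– With no file names on the command line, <> reads from the keyboard; <STDIN> always does
– Usually the input is assigned to a variable
    print "Please type a number: ";
    $num = <>;
    print "Your number is $num\n";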

7 chomp
When data is entered from the keyboard, Perl waits for the Enter key to be typed.
But the string that is captured includes a newline character at its end.
Use the function chomp to remove the newline:
    print "Enter your name: ";
    $name = <>;
    print "Hello $name, happy to meet you!\n";    # newline still attached: the output breaks after the name
    chomp $name;
    print "Hello $name, happy to meet you!\n";    # now prints on one line
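To see what chomp does without typing anything, the sketch below starts from a string that already ends in a newline, as keyboard input would. The name is made up, and length is used only to show that exactly one character is removed.

```perl
use strict;
use warnings;

my $name   = "Ada\n";            # what <> returns after typing Ada and Enter
my $before = length($name);      # three letters plus the newline

chomp $name;                     # removes the trailing newline, if there is one
my $after  = length($name);      # just the three letters

print "Length before chomp: $before, after chomp: $after\n";
print "Hello $name, happy to meet you!\n";
```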

8 Working with Text Files
To do real work, Perl has to read data from text files and write results to output files.
This is done in two steps: open the file, then read from it or write to it.
First, you must give the file a name within the script; this is known as a filehandle.
Use the open command:
    open FILE1, '/u/schmoj01/Seqs/protein1.seq';

9 Read From the File
Once the file is open, you can read from it using the <> operator
– put the filehandle between the angle brackets: <FILE1>
Perl reads files one line at a time. Each time you read from the filehandle, the next line is returned:
    open FILE1, '/u/prot1.seq';
    $line1 = <FILE1>;
    chomp $line1;
    $line2 = <FILE1>;
    …etc

10 Write to a File
Writing to a file is similar to reading from it.
Put > in front of the file name to open the file for writing:
    open FILE1, '>/u/prot1.seq';
This creates a new file with that name, or overwrites an existing file.
Use >> instead to append text to an existing file.
Print to the file using the filehandle (no comma after the filehandle):
    print FILE1 $data1;
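The difference between > and >> can be checked with a short script. This is a sketch, not the form used on the slide: it uses the three-argument form of open with a lexical filehandle and an `or die` check, and it writes to a scratch file created with the core File::Temp module so that no real data is touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file so the example does not touch real data.
my ($tmp, $path) = tempfile();
close $tmp;

# ">" creates the file, or overwrites it if it already exists.
open(my $out, '>', $path) or die "Cannot write $path: $!";
print $out "ATGC\n";
close $out;

# ">>" adds to the end of the file instead of replacing it.
open($out, '>>', $path) or die "Cannot append to $path: $!";
print $out "GGCC\n";
close $out;

# Read the file back to confirm that both lines are there.
open(my $in, '<', $path) or die "Cannot read $path: $!";
my @lines = <$in>;
close $in;

print @lines;
unlink $path;    # remove the scratch file
```

If the second open used > instead of >>, only the GGCC line would remain in the file.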

11 Making Decisions
Useful programs must be able to make some decisions on their own.
The if statement is very powerful.
It is generally used together with numerical or string comparison operators:
    numerical: ==, !=, >, <, >=, <=
    strings:   eq, ne, gt, lt, ge, le
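Choosing the right family of operators matters, because the two can disagree on the same values. A minimal sketch with made-up variables:

```perl
use strict;
use warnings;

my $x = 10;
my $y = 9;

# Numeric comparison: 10 is greater than 9.
my $numeric = "no";
$numeric = "yes" if $x > $y;

# String comparison goes character by character, like alphabetical
# order, so "10" sorts before "9" because "1" comes before "9".
my $string = "no";
$string = "yes" if $x gt $y;

print "numeric: $numeric, string: $string\n";
```

The numeric test reports yes and the string test reports no, so use ==, <, > and friends for numbers and eq, lt, gt and friends for text.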

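12 True/False
Perl relies on the concept of True/False decisions.
A comparison is true if it holds (for example, 3 < 5 is true). The values 0, '0', the empty string, and undef count as false; everything else is true.
The not operator ! reverses true and false:
    print "not a negative number" if ! ($a < 0);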
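Any value can be used as a test, not only a comparison: in Perl, 0 and the empty string count as false, while other numbers and non-empty text count as true. The sketch below collects one line of text per test; the variable names are made up.

```perl
use strict;
use warnings;

my $zero  = 0;
my $empty = '';
my $count = 5;
my $seq   = 'ACGT';

my $report = '';
$report = $report . "count is true\n"  if $count;      # non-zero number
$report = $report . "seq is true\n"    if $seq;        # non-empty string
$report = $report . "zero is false\n"  if ! $zero;     # 0 is false
$report = $report . "empty is false\n" if ! $empty;    # '' is false

print $report;
```

All four lines are printed, because each test comes out true once ! has reversed the two false values.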

13 Conditional Blocks
An if test can be used to control multiple lines of commands:
    print "Enter your age: ";
    $age = <>;
    chomp $age;
    if ($age < 21) {
        print "You are too young for this kind of work!\n";
        die "too young";
    }
    print "You are old enough to know better!\n";
If the test is true, Perl executes all the command lines inside the {} brackets. If not, it goes on past the closing } to the statements below.

14 if evaluates the expression in parentheses (the result must be true or false)
Note: the conditional block is indented
– Perl doesn’t care about indents, but they make your code easier for humans to read
die is a special function: it stops your script and prints its message
– Often used to test whether keyboard input is valid or whether an input file exists.
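One common way to use die for the input-file check is to attach it to open with `or`. The sketch below assumes a file name that does not exist, chosen only for illustration. The eval block is there only so the example can show the message and keep running; an ordinary script would simply let die end the program.

```perl
use strict;
use warnings;

# Hypothetical file name, assumed not to exist.
my $file = 'no_such_file_for_this_example.seq';

# open returns false when it fails; "or die" then stops the script.
# $! holds the system's reason for the failure.
my $ok = eval {
    open(my $in, '<', $file) or die "Cannot open $file: $!\n";
    close $in;
    1;
};
my $message = $@;    # the text that die would have printed

print "Script would have stopped with: $message" if ! $ok;
```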

15 Else & Elsif
Instead of just letting the script go on if it fails the if test, you can designate a second block of code for the “or else” condition.
You can also perform multiple tests using elsif (note the spelling: Perl has no elseif):
    if ($A == 10) {
        print "yadda yadda";    # do some stuff
    } elsif ($A > 10) {
        print "yowsa yowsa";    # do different stuff
    } elsif ($A < 10) {
        print "do this other stuff";
    } else {
        print "if it ain't =, >, or <, then I'm stumped";
        die "not a number";
    }

16 Loops
OK, we’ve got variables, input & output, and decisions. Now we need loops.
Loops test a condition and repeat a block of code based on the result
– while loops repeat while the condition is true
    $count = 1;
    while ($count <= 10) {
        print "$count bottles of pop\n";
        $count = $count + 1;
    }
    print "POP!\n";
[Try this program yourself]

17 Read a File: line by line
    open FILE1, '/u/doej01/prot1.seq';
    while ($line = <FILE1>) {
        chomp($line);
        $my_sequence = $my_sequence . $line;
    }
    close FILE1;
Appends every line of the file, without its newline, to the variable $my_sequence, so the whole file ends up in one string
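The same loop can be tried without a sequence file on disk. In this sketch the file is replaced by a made-up three-line string, which Perl can open as an in-memory file by passing a reference to it; the lexical filehandle $fh plays the role of FILE1.

```perl
use strict;
use warnings;

# Stand-in for a sequence file: opening a reference to a string lets
# the example run without a real file on disk.
my $text = "ATGGCC\nTTAGGA\nCCA\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";

my $my_sequence = '';             # start with an empty string
while (my $line = <$fh>) {        # the loop ends when there are no more lines
    chomp $line;                  # drop the newline at the end of the line
    $my_sequence = $my_sequence . $line;
}
close $fh;

print "$my_sequence\n";           # prints ATGGCCTTAGGACCA
```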

18 Arrays
It is awkward to store a large DNA sequence in one variable, or to create many variables for a list of numbers.
Perl has a type of variable called an “array” that can store a list of data
– multiple lines of a text file
– a list of numbers
– a list of words
Array variables are referred to with an “@” symbol:
    @numbers = (1,2,45,234,11);

19 Bioinformatics Uses Arrays
Bioinformatics data often comes in the form of arrays
– tab delimited lists
– multi-line text files
Arrays are handy because the entries are indexed
– You can grab any entry directly by its index; a single entry is written with $ rather than @
    @numbers = (1, 2, 45, 234, 11);
    print "$numbers[3]\n";
    234
# Note - the index starts with zero, so $numbers[3] is the fourth entry!
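For tab delimited lists, Perl's built-in split function turns one line of text into an array. split does not appear on these slides, and the record below is invented for illustration, so treat this as a sketch of one possible approach.

```perl
use strict;
use warnings;

# Hypothetical tab-delimited record: name, chromosome, start, end.
my $record = "geneA\tchr1\t100\t250";

my @fields = split(/\t/, $record);    # cut the line at every tab

my $count = scalar(@fields);          # number of entries in the array
print "Number of fields: $count\n";
print "Name:  $fields[0]\n";          # index 0 is the first entry
print "Start: $fields[2]\n";          # index 2 is the third entry
```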

20 Read a File into an Array
Rather than read a file one line at a time into a scalar variable, it is often helpful to read the entire file into an array; each line becomes one entry:
    open FILE1, '/u/doej01/prot1.seq';
    @DNA = <FILE1>;

21 join & substr
join combines the elements of an array into a single scalar variable (a string):
    $DNA = join('', @DNA);
    – '' is the spacer placed between the elements (empty here)
    – @DNA is which array to join
substr takes characters out of a string:
    $letter = substr($DNA, $position, 1);
    – $DNA is which string
    – $position is where in the string to start (counting from 0)
    – 1 is how many letters to take
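Reading into an array, join, and substr fit together as in the sketch below. The two-line sequence is made up and is opened as an in-memory file so that the example runs without a file on disk. One detail to watch: every line read from a file still ends in a newline, so the array is chomped before joining.

```perl
use strict;
use warnings;

# Stand-in for a sequence file (hypothetical data).
my $text = "ATGGCC\nTTAGGA\n";
open(my $fh, '<', \$text) or die "Cannot open in-memory file: $!";
my @DNA = <$fh>;                  # one array entry per line, newline included
close $fh;

chomp @DNA;                       # chomp removes the newline from every entry
my $DNA = join('', @DNA);         # ATGGCCTTAGGA

my $position = 3;
my $letter = substr($DNA, $position, 1);   # positions count from 0: the fourth letter

print "$DNA\n";
print "Letter at position $position: $letter\n";
```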

22 Exercise
Read a DNA sequence from a text file.
Calculate the %GC content.
What about non-DNA characters in the file?
– carriage returns and blank spaces
– N’s or X’s or unexpected letters
Write the output to the screen and to a file
– use append so that the file will grow as you run this program on additional sequences
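One possible starting point for the %GC step only, not a full solution. It rests on an assumption that the exercise leaves open: anything other than A, C, G, or T (newlines, spaces, N's, X's) is simply left out of the total. The helper name gc_percent is hypothetical, and tr is used here as a character counter.

```perl
use strict;
use warnings;

# Hypothetical helper: percentage of G and C among the A, C, G and T
# characters of a string. All other characters are ignored.
sub gc_percent {
    my ($seq) = @_;
    $seq = uc $seq;                     # treat lower and upper case alike
    my $gc   = ($seq =~ tr/GC//);       # tr returns how many characters matched
    my $acgt = ($seq =~ tr/ACGT//);     # count only recognized bases
    return 0 if $acgt == 0;             # avoid dividing by zero
    return 100 * $gc / $acgt;
}

printf "%.1f%% GC\n", gc_percent("ATGCNN gc\n");
```

Reading the sequence from a file and appending the result to an output file are left as in the exercise.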

