1
Perl for Bioinformatics
Stuart Brown NYU School of Medicine
2
Sources
Beginning Perl for Bioinformatics. James Tisdall, O’Reilly Press, 2000
Using Perl to Facilitate Biological Analysis, in Bioinformatics: A Practical Guide (2nd Ed.). Lincoln Stein, Wiley-Interscience, 2001
Introduction to Programming and Perl. Alan M. Durham, Computer Science Dept., Univ. of São Paulo, Brazil
3
Why Write Programs?
Automate computer work that you do by hand: save time and reduce errors
Run the same analysis on lots of similar data files (scale up)
Analyze data and make decisions
    e.g. sort BLAST results by e-value and/or species of best match
Build a pipeline
Create new analysis methods
4
Why Perl?
Fairly easy to learn the basics
Many powerful functions for working with text: search & extract, modify, combine
Can control other programs
Free and available for all operating systems
Most popular language in bioinformatics
Many pre-built “modules” are available that do useful things
5
Get Perl
www.perl.org
You can install Perl on any type of computer
Your account on mcrcr0 already has Perl
    Just log in; you don’t even need to type any command to make Perl active.
Download and install Perl on your own computer:
6
Programming Concepts
Program = a text file that contains instructions for the computer to follow
Programming Language = a set of commands that the computer understands (via a “command interpreter”)
Input = data that is given to the program
Output = something that is produced by the program
7
Programming
Write the program (with a text editor)
Run the program
Look at the output
Correct the errors (debugging)
Repeat
(Computers are VERY dumb: they do exactly what you tell them to do, so be careful what you ask for…)
8
Strings
Text is handled in Perl as a string
This basically means that you have to put quotes around any piece of text that is not an actual Perl instruction.
Perl has two kinds of quotes: single ' ' and double " " (they behave differently; more about this later)
Type plain, straight quotes in your programs. Perl does not accept the curly quotes that word processors insert.
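As a brief preview of how the two kinds of quotes differ, here is a minimal sketch. The variable names are made up for illustration, and `use strict`, `use warnings` and `my` are common Perl practice rather than something taken from the slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes keep the text exactly as typed: \n stays a backslash and an n.
my $single = 'ACGT\n';

# Double quotes interpret escapes: \n becomes one newline character.
my $double = "ACGT\n";

print length($single), "\n";   # 6 characters: A C G T \ n
print length($double), "\n";   # 5 characters: A C G T plus the newline
```

The `length` function counts characters, which makes the difference visible even though a newline cannot be seen directly.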
9
Print
Perl uses the term “print” to create output
Without a print statement, you won’t know what your program has done
You need to tell Perl to put a line break at the end of a printed line
    Use the "\n" (newline) escape sequence
    Include the quotes, and make them double quotes: inside single quotes, \n is printed literally
The “\” character is called an escape character; Perl uses it a lot
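The effect of the newline can be sketched as follows. This is an illustrative example rather than part of the original slides; the `$line` variable and the tab escape `\t` are additions used only to show a second escape sequence.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Without "\n", the next print continues on the same line.
print "Hello ";
print "world";
print "\n";                      # now the line is finished

# "\t" is another escape sequence: it prints a tab.
my $line = "gene\tlength\n";
print $line;
```

The first three print statements together produce a single line of output, because only the third one ends the line.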
10
Your First Perl Program
Log in to mcrcr0
Open a new text file
    >emacs my_perl1.pl
Type:
    #!/usr/bin/perl
    # my first Perl program
    print "Hello world \n";
Awesome, isn’t it!
11
Program details
Perl programs on Unix systems conventionally start with the line:
    #!/usr/bin/perl
This tells the computer that this is a Perl program and where to find the Perl interpreter (the path may be different on other systems)
All other lines that start with # are considered comments, and are ignored by Perl
Lines that are Perl commands end with a ;
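The three rules above (first line, comments, semicolons) can be seen together in one short sketch. The variable names and values are hypothetical, and `use strict`, `use warnings` and `my` are additions that the slides do not cover.

```perl
#!/usr/bin/perl
# The line above is only special because it is the first line of the file.
# This line is a comment, so Perl ignores it.
use strict;
use warnings;

my $message = "Perl ignores comments";   # a comment can also follow a command

# A command may be spread over several lines; it ends at the semicolon.
my $total = 1
          + 2
          + 3;

print $message, "\n";
print $total, "\n";
```

Leaving out a semicolon is one of the most common beginner errors; Perl usually reports it as a syntax error on the line after the one where it is missing.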
12
Run your Perl program
>chmod u+x *.pl [#make the file executable]
>perl my_perl1.pl [#use the perl interpreter to run your script]
>./my_perl1.pl [#or, once the file is executable, run it directly]
13
Numbers and Functions
Perl handles numbers in most common formats:
    456
    5.6743
    6.3E-26
Mathematical functions work pretty much as you would expect:
    4+7
    6*4
    43-27
    256/12
    2/(3-5)
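The expressions on this slide can be tried out directly. The sketch below stores each result in a variable (the names are invented for illustration) and adds one extra expression, `2 / 3 - 5`, to show what the parentheses change.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sum        = 4 + 7;         # 11
my $product    = 6 * 4;         # 24
my $difference = 43 - 27;       # 16
my $quotient   = 256 / 12;      # division gives a decimal result, not a whole number
my $grouped    = 2 / (3 - 5);   # parentheses first: 2 / -2 = -1
my $ungrouped  = 2 / 3 - 5;     # without them, the division happens first
my $tiny       = 6.3E-26;       # scientific notation: 6.3 x 10^-26

print $sum, "\n";
print $product, "\n";
print $difference, "\n";
print $quotient, "\n";
print $grouped, "\n";
print $ungrouped, "\n";
print $tiny, "\n";
```

As in ordinary arithmetic, multiplication and division are carried out before addition and subtraction unless parentheses say otherwise.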
14
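Do the Math (your 2nd Perl program)
    #!/usr/bin/perl
    print "4+5\n";
    print 4+5 , "\n";
    print "4+5=" , 4+5 , "\n";
[Note: text inside quotes is printed exactly as written, while 4+5 outside quotes is calculated and printed as 9. Use commas to separate multiple items in a print statement; whitespace between the items is ignored]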
15
Variables
To be useful at all, a program needs to be able to store information from one line to the next
Perl stores information in variables
A variable name starts with the “$” symbol, and it can store strings or numbers
Variable names are case sensitive
Give them sensible names
Use the “=” sign to assign values to variables
    $one_hundred = 100;
    $my_sequence = "ttattagcc";
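A minimal sketch of assignment and case sensitivity follows. `$count` and `$Count` are hypothetical names chosen to show that they are two separate variables; `my`, `use strict` and `use warnings` are additions to what the slides show.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $one_hundred = 100;
my $my_sequence = "ttattagcc";

# Case matters: these are two different variables.
my $count = 1;
my $Count = 50;

# A variable keeps its value until a new one is assigned.
$count = $count + 1;             # $count is now 2; $Count is still 50

print $one_hundred, "\n";
print $my_sequence, "\n";
print $count, "\n";
print $Count, "\n";
```

With `use strict` in place, Perl refuses to run a program that uses a variable that was never declared with `my`, which catches many typing mistakes in variable names.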
16
You can do Math with Variables
    #!/usr/bin/perl
    #put some values in variables
    $sequences_analyzed = 200 ;
    $new_sequences = 21 ;
    #now we will do the work
    $percent_new_sequences = ( $new_sequences / $sequences_analyzed ) * 100 ;
    print "% of new sequences = " , $percent_new_sequences , "\n" ;
Output:
    % of new sequences = 10.5
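Percentages rarely come out as neatly as in this example, so it can help to control the number of decimal places. The sketch below uses Perl's built-in `sprintf` function, which the slides do not cover; the `$report` and `$one_third` variables are illustrative additions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same values as in the program above
my $sequences_analyzed = 200;
my $new_sequences      = 21;

my $percent_new_sequences = ($new_sequences / $sequences_analyzed) * 100;

# "%.1f" formats a number with one decimal place; "%%" prints a literal % sign.
my $report = sprintf("%.1f%% of sequences are new", $percent_new_sequences);
print $report, "\n";

# A value with many decimals, rounded to two places for display
my $one_third = sprintf("%.2f", (1 / 3) * 100);
print $one_third, "\n";
```

`sprintf` returns the formatted text as a string, so the result can be stored in a variable and printed later.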
17
String Operations
Strings (text) in variables can be used for some math-like operations
Concatenate (join): use the dot . operator
    $seq1 = "ACTG";
    $seq2 = "GGCTA";
    $seq3 = $seq1 . $seq2;
    print $seq3;
Output:
    ACTGGGCTA
String comparison (are they the same, > or <)
    eq (equal)
    ne (not equal)
    ge (greater or equal)
    gt (greater than)
    lt (less than)
    le (less or equal)
Comparison uses the character codes (ASCII values) of the letters, which can be non-intuitive: for example, every uppercase letter sorts before every lowercase letter
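The comparison operators can be explored with a short sketch. The `? :` construct used here is not covered in the slides; it simply picks "yes" when the comparison is true and "no" when it is false. The variable names other than `$seq1`, `$seq2` and `$seq3` are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $seq1 = "ACTG";
my $seq2 = "GGCTA";
my $seq3 = $seq1 . $seq2;        # joined: "ACTGGGCTA"

# eq is true only when the two strings are identical, including case.
my $same_text = ($seq1 eq "ACTG") ? "yes" : "no";   # yes
my $same_case = ("actg" eq "ACTG") ? "yes" : "no";  # no

# lt compares character codes: "Z" is 90 and "a" is 97, so "Z" comes first.
my $upper_first = ("Z" lt "a") ? "yes" : "no";      # yes

# Strings are compared one character at a time, so "10" comes before "9"
# as text, even though 10 is the larger number.
my $as_text   = ("10" lt "9") ? "yes" : "no";       # yes
my $as_number = (10 < 9)      ? "yes" : "no";       # no

print $seq3, "\n";
print "ACTG eq ACTG: ", $same_text, "\n";
print "actg eq ACTG: ", $same_case, "\n";
print "Z lt a: ", $upper_first, "\n";
print "10 lt 9 (text): ", $as_text, "\n";
print "10 < 9 (number): ", $as_number, "\n";
```

The last two lines show why Perl has separate operators for text (`lt`, `gt`, `eq`) and for numbers (`<`, `>`, `==`): the same values can compare differently depending on which kind is used.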