1 An Introduction to Perl Part 1 CSC8304 – Computing Environments for Bioinformatics - Lecture 7
2 Objectives
  To introduce the Perl programming language
  Why Perl in bioinformatics?
  To examine Perl syntax, operators and constructs
Recommended books:
  Sams Teach Yourself Perl in 24 Hours, Clinton Pierce
  Beginning Perl for Bioinformatics, James Tisdall
The best way to learn Perl is to read the books, work through the numerous tutorials and practise. These notes are not a comprehensive tutorial; reading extra material is essential.

3 Why Perl
Benefits:
  Ease of programming
  Rapid prototyping
  Portability, speed and maintenance
Compared to Java:
  No separate compilation step
  A scripting language; object-oriented concepts are not required
  Less rigorous in promoting 'good' software engineering
  No Java-style exception handling (errors are typically handled with die and eval)
  Perl is much less strongly typed

4 Perl for Bioinformatics
  Wealth of existing code
  Perl is good at:
    Handling strings
    Detecting patterns in data
  Easy for biologists to learn
  Many books and tutorials
  Its own specialist flavours, e.g. BioPerl
  see

5 Installing Perl
Perl is available for a variety of operating systems, including Windows and Unix
Visit
Writing a Perl program:
  Write the program using an editor
  Save the program as xxx.pl
  Run it using perl xxx.pl
Perl is interpreted; no separate compilation step is necessary

6 First Perl Program: Hello World
    #!/usr/bin/perl -w
    # Say Hello
    print "Hello, World! \n";
Notes:
  #!/usr/bin/perl -w is the shebang line; it tells the system that this is a Perl program (-w turns on warnings)
  Comments are prefixed with #
  Statements are terminated with ;
  print writes a string to output
  \n inserts a newline

7 Print
  Used to write output to a "file handle"
  The default file handle is STDOUT
  Takes a comma-separated list of parameters as input

8 Print: Examples
    > print "Hello","out","there";
    Hellooutthere> print "Hello out there\n";
    Hello out there
    > print STDOUT "Hello out there\n";
    Hello out there
    >

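The first example above runs the words together because print places nothing between its arguments. The following is a minimal sketch of two ways to control that; the use of join and of the STDERR file handle is an addition for illustration and is not part of the original slides.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# print outputs its arguments one after another with nothing in between
print "Hello", "out", "there", "\n";

# join (a built-in function) glues a list together with a chosen separator
my $line = join(" ", "Hello", "out", "there");
print STDOUT $line, "\n";

# STDERR is a second predefined file handle, commonly used for diagnostics
print STDERR "This message goes to the error stream\n";
```

Naming STDOUT explicitly behaves the same as leaving the file handle out; naming STDERR sends the text to a separate stream that can be redirected independently.
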
9 Numbers and Strings
Numeric literals: integers (6), floating point (15.4), scientific notation (6.6E-33)
Strings: sequences of characters, e.g.
    "foo"
    "Fourscore and seven\n years ago"
    'I don\'t like writing lectures\n'
    "MRes Bioinformatics is too easy… \n"
A backslash (\) inside a string tells Perl that the next character is special.
In single quotes strings are taken literally (only \' and \\ are recognised). In double quotes Perl checks for escape sequences.

10 Sample String Escape Sequences
  \n    Newline
  \r    Carriage return
  \t    Tab
  \b    Backspace
  \u    Change next character to upper case
  \l    Change next character to lower case

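To see the effect of quoting on these escape sequences, the sketch below stores the same characters in a double-quoted and a single-quoted string and compares their lengths. The variable names and sample text are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $double = "name:\tperl\n";      # \t and \n become a tab and a newline
my $single = 'name:\tperl\n';      # backslashes and letters are kept as typed
my $cased  = "\uperl and \lPERL";  # \u and \l affect only the next character

print $double;
print $single, "\n";
print $cased, "\n";

# The double-quoted string is shorter: each escape sequence is one character
print length($double), " versus ", length($single), "\n";
```
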
11 Scalar Variables
Scalar data in Perl is stored in scalar variables.
A scalar variable holds a single piece of data for later use.
The name of a scalar variable always starts with a dollar sign ($).
For example:
    $name  $a  $Date  $serial_number  $cat450

12 Scalar Variables
  Variable names can contain alphabetic characters, digits and underscores
  The first character after the $ cannot be a digit
  Variable names are case sensitive
  Single-character names that are not a letter are special variables in Perl and should not be used as normal variables, e.g. $_ $" $/ $2 $$
  More on these later

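As a small illustration of the naming rules, the sketch below declares two variables whose names differ only in case. The values are hypothetical, and the use of my with use strict is an assumption about good practice rather than something the slides require.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $date = "lower-case name";   # hypothetical value
my $Date = "capitalised name";  # a different variable: names are case sensitive
my $serial_number_2 = 450;      # digits and underscores are fine after the first character

print "$date\n";
print "$Date\n";
print "$serial_number_2\n";
```
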
13 Expressions and Operators
Expressions:
    $title = "Gone with the wind";
    $pi = 3.14159;
    $area = $pi * ($radius ** 2);
Numeric operators:
  Example     Operator          Result
  5 + $t      Addition          Sum of 5 and $t
  $y - $x     Subtraction       Difference between $y and $x
  $e * $pi    Multiplication    Product of $e and $pi
  $f / 6      Division          Quotient of $f divided by 6
  24 % 5      Modulus           Remainder of 24 divided by 5 (4)
  4 ** 2      Exponentiation    4 raised to the 2nd power

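The operators in the table can be tried out directly. This is a minimal sketch; the approximate value of pi and the radius are chosen here purely for illustration, and int is a built-in function that discards the fractional part.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $pi     = 3.14159;               # approximate value, assumed for this example
my $radius = 2;                     # hypothetical radius
my $area   = $pi * ($radius ** 2);

my $remainder = 24 % 5;             # remainder of 24 divided by 5
my $power     = 4 ** 2;             # 4 raised to the 2nd power
my $quotient  = 7 / 2;              # division is not truncated to an integer
my $whole     = int(7 / 2);         # int keeps only the integer part

print "area=$area\n";
print "remainder=$remainder power=$power\n";
print "quotient=$quotient whole=$whole\n";
```
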
14 Expressions and Operators
String operators:
Strings can also be manipulated using operators.
The concatenation operator '.' joins two strings:
    $a="Hello, World!";
    $b="Nice to meet you";
    $c=$a.$b;
If Perl finds a variable inside a double-quoted string, it is interpolated:
    $name="Fred";
    print "I went to the pictures with $name";

15 Expressions and Operators
To prevent interpolation, put a backslash in front of the variable identifier or put the string in single quotes:
    $name="Fred";
    print "I went to the pictures with \$name";
    print 'I went to the pictures with $name';
Many more operators and functions are available, e.g. int, length, lc, uc, cos; see a Perl reference book for details.
Increment and decrement:
    $counter = 10;
    $counter = $counter + 1;   # 11
    $counter++;                # 12
    $counter--;                # 11

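The sketch below puts the quoting variants side by side so the differences can be inspected. It also shows what happens when a backslash is used inside single quotes, where the backslash is kept as an ordinary character. The variable names other than $name are invented for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $name = "Fred";

my $interpolated = "I went to the pictures with $name";    # $name replaced by Fred
my $escaped      = "I went to the pictures with \$name";   # backslash stops interpolation
my $single       = 'I went to the pictures with $name';    # single quotes: no interpolation
my $both         = 'I went to the pictures with \$name';   # the backslash itself is printed
my $joined       = "Hello, " . $name . "!";                # concatenation with '.'

print "$interpolated\n";
print "$escaped\n";
print "$single\n";
print "$both\n";
print "$joined\n";
```

Combining both techniques is therefore not needed: either a backslash in double quotes or single quotes alone is enough.
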
16 Expressions and Operators
For the most part Perl allows you to use numbers and strings interchangeably.
How a variable is interpreted depends on what Perl is looking for at the time.
If something looks like a number, Perl can use it as a number when a number is needed:
    $a=42;          # A number
    print $a+18;    # Displays 60
    $b="50";
    print $b-10;    # Displays 40

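It is the operator, not the variable, that decides whether a value is treated as a number or as a string. A minimal sketch with made-up variable names:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = "50";           # a string that looks like a number

my $sum   = $count - 10;    # numeric operator: "50" is used as the number 50
my $glued = $count . 10;    # string operator: 10 is used as the string "10"
my $added = "10" + "20";    # both strings are used as numbers

print "sum=$sum glued=$glued added=$added\n";
```
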
17 Flow control
Up to now all statements have been executed in order, from top to bottom.
Perl's control structures allow statements to be grouped into statement blocks and run conditionally or repeatedly.

18 Blocks
The simplest grouping of statements is a block:
    {
        statement_a;
        statement_b;
        statement_c;
    }

19 The if statement
Used to control whether statements are executed, based on a condition.
Syntax: if (expression) BLOCK
e.g.
    if ( $r == 5 ) {
        print 'The value of $r is equal to 5.';
    }

20 The if statement
Can also use else:
    if ( $r == 5 ) {
        print 'The value of $r is equal to 5.';
    } else {
        print '$r is something other than 5.';
    }

21 Relational Operators
Perl, like Java, has operators for numerical conditions:
  Operator   Example     Explanation
  ==         $x == $y    True if $x equals $y
  >          $x > $y     True if $x is greater than $y
  <          $x < $y     True if $x is less than $y
  >=         $x >= $y    True if $x is greater than or equal to $y
  <=         $x <= $y    True if $x is less than or equal to $y
  !=         $x != $y    True if $x is not equal to $y

22 Alphanumeric Relational Operators
Perl has a separate set of operators for comparing strings:
  Operator   Example     Explanation
  eq         $s eq $t    True if $s is equal to $t
  gt         $s gt $t    True if $s is greater than $t
  lt         $s lt $t    True if $s is less than $t
  ge         $s ge $t    True if $s is greater than or equal to $t
These operators decide "greater than" and "less than" by examining each character from left to right and comparing them in ASCII order. Strings therefore sort in ascending order: most punctuation first, then digits, then uppercase, then lowercase.

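Choosing between the numeric and the string operators matters, because the same two values can compare differently. The sketch below is illustrative only; the values and variable names are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $x = "10";
my $y = "9";

my $numeric_less = "no";
my $string_less  = "no";
my $upper_first  = "no";

# Numeric comparison: 10 is not less than 9
if ($x < $y)  { $numeric_less = "yes"; }

# String comparison: "1" comes before "9" in ASCII order, so "10" lt "9"
if ($x lt $y) { $string_less = "yes"; }

# Uppercase letters sort before lowercase letters
if ("Zebra" lt "apple") { $upper_first = "yes"; }

print "numeric: $numeric_less, string: $string_less, uppercase first: $upper_first\n";
```
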
23 Looping
Looping with while:
    while ( expression ) BLOCK

    $counter=0;
    while ($counter < 10) {
        print "Still counting...$counter\n";
        $counter++;
    }

24 Looping
Looping with for:
    for ( initialisation; test; increment ) BLOCK

    for ($counter=0; $counter < 10; $counter++) {
        print "Still counting...$counter\n";
    }
  The initialisation expression is evaluated once
  The test expression is evaluated; if it is true, the block of code is run
  After the block is executed, the increment is performed and the test is evaluated again
  If the test is still true, the block is run again; otherwise the loop ends

25 Looping
Fine-grained control with last:
The last statement causes the innermost currently running loop block to be exited. It is usually combined with a condition, as below, so that the loop is left only when the condition evaluates to true.
    $counter=0;
    while ($counter < 10) {
        print "Still counting...$counter\n";
        last if ($counter==6);
        $counter++;
    }

26 Looping
Fine-grained control with next:
The next statement passes control back to the top of the loop, and the next iteration begins if the loop isn't finished.
    for ($i=0; $i<100; $i++) {
        next if (not $i % 2);
        print "An odd number=$i\n";
    }

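The two statements can be combined in one loop. The following sketch adds up the odd numbers below 10; the limit and the variable names are chosen for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $sum  = 0;
my $seen = "";

for (my $i = 0; $i < 100; $i++) {
    next if ($i % 2 == 0);    # even number: skip to the next iteration
    last if ($i > 9);         # first odd number above 9: leave the loop
    $sum  = $sum + $i;
    $seen = $seen . "$i ";
}

print "Odd numbers seen: $seen\n";
print "Sum: $sum\n";
```

In a for loop, next still runs the increment expression before the test is evaluated again, so the counter keeps advancing.
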
27 Summary
  Numbers, strings, escape sequences
  Print, file handles
  Variables, expressions and operators
  Flow control: blocks, if, else
  Looping: while, for, last, next

28 Q & A - 1
  Is it true that the default file handle is STDOUT?
  Can we get the output
      Hello out there
  by typing and executing
      print "Hello","out","there";
  ?
  Is it true that the following are names of variables: $name, $a, $Date, $serial_number?
  What is the value of $c after:
      $a="Hello, World!";
      $b="Nice to meet you";
      $c=$a.$b;
  ?

29 Q & A - 2
  What do we get by executing
      $name="Fred";
      print "I went to the pictures with \$name";
  ?
  What do we get if we write
      print 'I went to the pictures with \$name';
  ?
  Can we use 'else' in an if-statement?
  Can we use 'last' in a for-statement? What about 'next'?