Books
Perl
Perl (Practical Extraction and Report Language), by Larry Wall.
Perl 1.0 was released to Usenet's alt.comp.sources in 1987.
Perl 5 was released in 1994.
Low-level languages (C/C++, Pascal): hard to write, fast at runtime, unlimited.
High-level languages (shell, awk, …): hard, slow, very limited.
Perl: easy, mostly fast, nearly unlimited.
Perl
Optimized to work with text.
Free source code, very good support (Perl source, docs, extensions).
Perl for Windows
Just for Start
sort.pl:
    #!/usr/bin/perl
    print sort <>;
<> returns a list of strings (the lines of the input files), sort sorts the list, and print prints the list.
Run it:
    >sort.pl *.txt
Hello Perl
On Unix, create a text file "hello.pl" (any name or extension is OK):
    #!/usr/bin/perl
    print "Hello, Perl!\n";            # comment your code
    $var1 = "Hello, World!" . "\n";    # no variable declaration is needed
    print $var1;
    $ivar += 10;                       # an unset scalar acts as 0 in numeric context
                                       # (and as the empty string "" in string context)
    print "This is $ivar \n";
Hello Perl (2)
Set the executable flag:
    >chmod +x hello.pl
Run it:
    >hello.pl
Always start the script with the header line #!/usr/bin/perl, with no spaces before it.
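Strings
'Hello Perl\n' is not equal to "Hello Perl\n": inside single quotes, \n stays as two literal characters.
'Hello $var' is not equal to "Hello $var": only double quotes interpolate variables.
Concatenation: "Hello " . "Perl\n" gives "Hello Perl\n"
Repetition: "Perl" x 3 gives "PerlPerlPerl"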
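The quoting rules on this slide can be checked with a short script. This is a minimal sketch; the variable names are illustrative and not from the slides, and it uses `use strict` and `my`, which the slides do not.

```perl
use strict;
use warnings;

my $var = "Perl";

my $single = 'Hello $var\n';        # no interpolation: literal $var and literal \n
my $double = "Hello $var\n";        # interpolation: $var is expanded, \n is a newline
my $joined = "Hello " . "Perl\n";   # concatenation with the . operator
my $copies = "Perl" x 3;            # repetition with the x operator

print length($single), "\n";        # 12: the text is kept character for character
print length($double), "\n";        # 11: "Hello Perl" plus one newline character
print $copies, "\n";                # PerlPerlPerl
```

Comparing the two lengths is a quick way to see that the single-quoted string was left untouched.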
Print
    print "hello ";
    print 3*$ivar;
    print "\n";
    print "hello ", 3*$ivar, "\n";
    $world = "world";
    print "hello ", $world, "\n";
    print "hello " . $world . "\n";
    print "hello $world \n";
    print "hello ${world}s \n";
Comparison Operators
Comparison                  Numeric   String
Equal                       ==        eq
Not equal                   !=        ne
Less than                   <         lt
Greater than                >         gt
Less than or equal to       <=        le
Greater than or equal to    >=        ge
    if ($ivar != 5) …
    if ($str ne "hello") …
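Picking the wrong column of the table gives silently different answers. The sketch below compares the same values numerically and as strings; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Numeric operators compare values; string operators compare character by character.
my $num_equal = (10 == 10.0)      ? 1 : 0;   # 1: same number
my $str_equal = ("10" eq "10.0")  ? 1 : 0;   # 0: different strings
my $num_less  = (9 < 10)          ? 1 : 0;   # 1: 9 is less than 10
my $str_less  = ("9" lt "10")     ? 1 : 0;   # 0: the character "9" sorts after "1"

print "$num_equal $str_equal $num_less $str_less\n";   # 1 0 1 0
```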
Binary Operators
    a + b
    a - b
    a * b
    a / b
    a ** b    exponentiation (a to the power b)
    a % b     modulus
Unary Operators
Changing sign:
    +a     positive operand
    -a     negative operand
Changing value before usage:
    ++a    $a = 3; $b = ++$a;   # b=4, a=4
    --a
Changing value after usage:
    a++    $a = 3; $b = $a++;   # b=3, a=4
    a--
User Input
    $line = <STDIN>;                # read one line
    while ($line = <STDIN>) {       # read lines until end of input
        chomp($line);               # remove newline char \n
        if ($line eq "quit") {
            exit(0);
        }
    }
Arrays
    $rocks[0] = "bedrock";
    $rocks[1] = "lava";
    $rocks[99] = "rock";            # now there are 100 elements
    print "$rocks[ $#rocks ] \n";   # prints last element 'rock'
    $size = $#rocks + 1;            # number of elements
Lists
    (0,5,6,7,8);
    (0,5..8);      # (..) range operator, same list as above
    (0,"5..8");    # contains two elements: 0 and the string "5..8"
List Assignment
    ($color, $tree, $list) = ("green", "red-black", "linked");
    # swap
    ($color, $tree) = ($tree, $color);
    ($i, $j) = (1..3);        # 3 is ignored
    ($i, $j, $k) = (1,2);     # $k gets undef
List Assignment (2)
    ($color[0], $color[1], $color[2]) = ("red", "blue", "yellow");
    @more_colors = ("white", @color[0,1], "green", $color[2]);
    # more_colors contains "white", "red", "blue", "green", "yellow"
    print "Five colors: @more_colors \n";
    # Five colors: white red blue green yellow
    @colors = ("red", "blue", "green");
    for ($i = 0; $i <= $#colors; $i++) {
        print "$colors[$i] \n";
    }
    # other way to do the same,
    # but the array is changed
    foreach $col (@colors) {
        $col .= "\n";
    }
    print @colors;
    # much better
    foreach $col (@colors) {
        print "$col \n";
    }
Push and Pop
    @array = 1..5;
    push @array, 6;         # array contains 1..6
    $six = pop @array;      # array contains 1..5
    push @array, 6..10;     # array contains 1..10
Shift and Unshift
Push and Pop work on the end of an array.
Shift and Unshift work on the start of an array.
    @array = 1..5;
    $one = shift @array;           # $one is 1, array contains 2..5
    unshift @array, 1;             # array contains 1..5
    unshift @array, (-2,-1,0);     # array contains -2..5
Note: Shift and Unshift, unlike Push and Pop, change the indices of all array elements.
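The four operations can be traced in one short run. This is a sketch with an illustrative array name, showing which end each function touches and how the indices move.

```perl
use strict;
use warnings;

my @queue = (1 .. 5);

push @queue, 6;                # add at the end:      1 2 3 4 5 6
my $last = pop @queue;         # remove from the end: 1 2 3 4 5
my $first = shift @queue;      # remove from the start: 2 3 4 5
unshift @queue, -2, -1, 0;     # add at the start: -2 -1 0 2 3 4 5

print "$last $first\n";        # 6 1
print "@queue\n";              # -2 -1 0 2 3 4 5
print "$queue[3]\n";           # 2: the element that was at index 0 is now at index 3
```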
Default variable
    @colors = ("red", "blue", "green");
    foreach (@colors) {
        $_ .= " \n";        # $_ is the default loop variable
    }
    $_ = "default variable\n";
    print;                  # prints "default variable\n"
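Reverse, Sort
    @array = 1..100;
    @array = reverse @array;    # array contains 100, 99, …, 1
    # sort orders the elements as strings (ASCII order)
    @array = sort @array;       # array contains 1, 10, 100, 11, 12, …, 19, 2, 20, …, 9, 90, …, 99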
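When numeric order is wanted, `sort` accepts a comparison block. The slide only shows the default string sort, so the block with the `<=>` operator below is an addition for contrast, and the array names are illustrative.

```perl
use strict;
use warnings;

my @numbers = reverse(1 .. 100);               # 100, 99, ..., 1

my @ascii   = sort @numbers;                   # default: compares elements as strings
my @numeric = sort { $a <=> $b } @numbers;     # comparison block: compares as numbers

print "@ascii[0 .. 3]\n";                      # 1 10 100 11
print "@numeric[0 .. 3]\n";                    # 1 2 3 4
```

`$a` and `$b` are the two elements `sort` hands to the block; they are built-in names and need no declaration.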
<FILE> in List Context
    open FILE, "readme.txt" or die "Cannot open file: $!";
    while ($line = <FILE>) {
        chomp($line);            # remove newline char \n
        push @lines, $line;
    }
    # or this way
    chomp(@lines = <FILE>);
    # better
    @lines = <FILE>;
    chomp(@lines);
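Perl is Context Sensitive
    @array = (1..5);        # list context
    $ii = 3 + @array;       # scalar context: 3 + array size
    ($ii) = @array;         # list context: $ii gets the first element
    @array = 38;            # list context: array contains the single element 38
    $str = <STDIN>;         # scalar context: returns the next line from the input
    @lines = <STDIN>;       # list context: returns all remaining lines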
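The same array gives different results depending on what the left-hand side asks for. A minimal sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my @array = (10, 20, 30);

my $count   = @array;        # scalar context: number of elements
my ($first) = @array;        # list context: first element, the rest is ignored
my $sum     = 3 + @array;    # + forces scalar context: 3 + 3
my @single  = 38;            # list assignment: a one-element array

print "$count $first $sum\n";    # 3 10 6
print scalar(@single), "\n";     # 1
```

The parentheses around `$first` are what turn the assignment into a list assignment.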
Example
Task: Write a program that prints each input line right-justified in a 20-character column. First print a "ruler line" of digits.
Sample input lines: hello, test-test (column width 20).
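Example (solution)
    chomp(@lines = <STDIN>);
    print "1234567890" x 7, "\n";      # ruler line
    foreach (@lines) {
        printf "%20s\n", $_;
    }
    # or build one format for all lines
    $format = "%20s\n" x @lines;
    printf $format, @lines;
For more info on printf run: perldoc -f sprintf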
Control Statements: unless, until
    if ($a != $b) { }
    unless ($a == $b) { }     # same as the if above
    while ($a > $b) { }
    until ($a <= $b) { }      # same as the while above
Control Statements: elsif
    if (expression1) {
    } elsif (expression2) {
    } elsif (expression3) {
    } else {
    }
Only the block of the first true conditional expression is executed. If none is true, the else block is executed.
Control Statements: next, last
    while (<FILE>) {
        if (/Protein ID/) {
            print;
            next;        # go to the next line
        }
        ………..
        if (/Remarks/) {
            last;        # leave the loop
        }
    }
Control Statements: redo
    while ( … ) {
        if ( … ) {
            redo;    # starts the current iteration from the beginning
                     # (the loop condition is not evaluated again);
                     # compare with next, which starts the next iteration
        }
    }
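Control Statements: logical operator
    $proteinID = $proteins{ $id } || "Unknown";
    # or, like in C: expression ? TrueExpression : FalseExpression;
    $proteinID = $proteins{ $id } ? $proteins{ $id } : "Unknown";
Use || here, not the word form or: or binds more loosely than =, so the default would never be assigned.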
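The choice between `or` and `||` matters when supplying a default, because the two differ in precedence relative to `=`. The sketch below uses a made-up hash and key to show both forms side by side.

```perl
use strict;
use warnings;

my %proteins = (P1 => "kinase");   # illustrative data, not from the slides
my $id = "P2";                     # a key that is not in the hash

my $with_or;
{
    no warnings 'void';            # silences "Useless use of a constant in void context"
    # Parsed as: ($with_or = $proteins{$id}) or "Unknown";
    $with_or = $proteins{$id} or "Unknown";
}

# Parsed as: $with_pipes = ($proteins{$id} || "Unknown");
my $with_pipes = $proteins{$id} || "Unknown";

print defined($with_or) ? $with_or : "(undef)", "\n";   # (undef)
print "$with_pipes\n";                                  # Unknown
```

The word form `or` is best kept for flow control, such as `open ... or die`, where the low precedence is exactly what is wanted.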
Example: Parsing FASTA file
>roa1_drome Rea guano receptor type III >> 0.1
MVNSNQNQNGNSNGHDDDFPQDSITEPEHMRKLFIGGLDYRTTDENLKAHEKWGNIVDVV
VMKDPRTKRSRGFGFITYSHSSMIDEAQKSRPHKIDGRVEPKRAVPRQDIDSPNAGATVK
KLFVGALKDDHDEQSIRDYFQHFGNIVDNIVIDKETGKKRGFAFVEFDDYDPVDKVVLQK
QHQLNGKMVDVKKALPKNDQQGGGGGRGGPGGRAGGNRGNMGGGNYGNQNGGGNWNNGGN
NWGNNRGNDNWGNNSFGGGGGGGGGYGGGNNSWGNNNPWDNGNGGGNFGGGGNNWNGGND
FGGYQQNYGGGPQRGGGNFNNNRMQPYQGGGGFKAGGGNQGNYGNNQGFNNGGNNRRY
>roa2_drome Rea guano ligand
MVNSNQNQNGNSNGHDDDFPQDSITEPEHMRKLFIGGLDYRTTDENLKAHEKWGNIVDVV
VMKDPTSTSTSTSTSTSTSTSTMIDEAQKSRPHKIDGRVEPKRAVPRQDIDSPNAGATVK
KLFVGALKDDHDEQSIRDYFQHLLLLLLLDLLLLDLLLLDLLLFVEFDDYDPVDKVVLQK
QHQLNGKMVDVKKALPKNDQQGGGGGRGGPGGRAGGNRGNMGGGNYGNQNGGGNWNNGGN
NWGNNRGNDNWGNNSFGGGGGGGGGYGGGNNSWGNNNPWDNGNGGGNFGGGGNNWNGGND
FGGYQQNYGGGPQRGGGNFNNNRMQPYQGGGGFKAGGGNQGNYGNNQGFNNGGNNRRY
Example: Parsing FASTA file (2)
    print "Input file name:";
    $filename = <STDIN>;
    chomp($filename);                  # remove the trailing newline
    open FASTA, $filename;
    @lines = <FASTA>;
    foreach $line (@lines) {
        if ($line =~ /^\s*$/) {
            next;                      # skip empty lines
        }
        if ($line =~ />/) {
            print "Header:", $line;
        } else {
            print "Seq:", $line;
        }
    }
Parses a file in FASTA format. The filename is typed in as input to the program.
Example: Parsing FASTA file (3)
    die "Can't open file:$ARGV[0]" unless open(FASTA, $ARGV[0]);
    @lines = <FASTA>;
    foreach (@lines) {
        unless (/^\s*$/) {             # skip empty lines
            if (/>/) {
                print "Header:", $_;
            } else {
                print "Seq:", $_;
            }
        }
    }
Parses a file in FASTA format. The filename is an argument to the program.
Example: Parsing FASTA file (4)
    @lines = <>;
    foreach (@lines) {
        unless (/^\s*$/) {             # skip empty lines
            if (/>/) {
                print "Header:", $_;
            } else {
                print "Seq:", $_;
            }
        }
    }
Parses files in FASTA format. The filenames are arguments to the program.
An error message is printed automatically for an invalid filename.
Reads all input files from the argument list.
Example: Parsing FASTA file (5)
    while (<>) {
        unless (/^\s*$/) {             # skip empty lines
            if (/>/) {
                print "Header:", $_;
            } else {
                print "Seq:", $_;
            }
        }
    }
Parses files in FASTA format. The filenames are arguments to the program.
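The parsers on these slides print each line as it is read. A natural next step is to join the sequence lines that belong to one header. The sketch below is one possible way to do that; `parse_fasta` is a hypothetical helper name, the input lines are made-up sample data, and the header test is anchored to the start of the line, which is slightly stricter than the `/>/` match used on the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: groups FASTA lines into [header, sequence] pairs.
sub parse_fasta {
    my @lines = @_;
    my @records;
    foreach my $line (@lines) {
        chomp(my $text = $line);            # work on a copy without the newline
        next if $text =~ /^\s*$/;           # skip empty lines
        if ($text =~ /^>(.*)/) {
            push @records, [$1, ""];        # header line starts a new record
        } elsif (@records) {
            $records[-1][1] .= $text;       # sequence line extends the current record
        }
    }
    return @records;
}

# Made-up sample input; a real program could pass the lines read from <> instead.
my @records = parse_fasta(">seq1 demo\n", "MVNS\n", "\n", "NQNQ\n", ">seq2\n", "ACGT\n");

foreach my $record (@records) {
    print "Header: $record->[0]\n";
    print "Seq: $record->[1]\n";
}
```

Keeping each record as a header and sequence pair makes later steps, such as reversing a sequence, work on the whole sequence rather than on one line at a time.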
Homework
(a) Parsing FASTA file. Write two separate programs:
    Read a FASTA file and output the reversed sequence in FASTA format. Hint: use the split function (perldoc -f split).
    For a DNA sequence, output the reverse complement. Hint: use the substr function (perldoc -f substr).
(c) Create your personal I-net Home Page.