Lecture 7: You’re on your own now...
More reading files
Using regexes within while (<FH>):
    while (<FH>) {
        if (/>(.+)/) {            # regex acts on $_ by default
            push @anlines, $1;    # keep annotation, only when the match succeeded
        }
    }
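last and next
    while (<FH>) {
        next if (/>/);             # skip header lines
        last unless (/[ACGT]/);    # stop at the first line with no sequence
        chomp;                     # chomp edits $_ in place; its return value is a count, not the string
        push @anlines, $_;
    }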
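The two loops above can be combined into one pass over a FASTA record. What follows is only a minimal sketch: the sample data and the @seqlines array are made up for illustration, and the filehandle is opened on a string so that the script runs without a file on disk.

```perl
use strict;
use warnings;

# Hypothetical sample data; a real script would open a file instead.
my $fasta = ">seq1 example protein\nACGTACGT\nGGCC\n\n>seq2 not read\nTTTT\n";

# Open a filehandle on the string (in-memory file).
open(my $fh, '<', \$fasta) or die "Cannot open in-memory file: $!";

my (@anlines, @seqlines);
while (my $line = <$fh>) {
    chomp $line;                     # strips the line ending in place
    if ($line =~ /^>(.+)/) {         # header line: keep the annotation
        push @anlines, $1;
        next;                        # go straight to the next line
    }
    last unless $line =~ /[ACGT]/;   # blank or non-sequence line: stop reading
    push @seqlines, $line;
}
close $fh;

my $sequence = join '', @seqlines;
print "Annotation: $anlines[0]\n";
print "Sequence:   $sequence\n";
```

Because of the last, the loop stops at the blank line and never reaches the second record.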
Confounding problems
Windows, Mac and UNIX use different file line endings.
Perl looks for the UNIX line ending on UNIX, and for the Windows one under ActivePerl for Windows.
So \n means different things depending on the computer.
Reformatting in Perl
UNIX line ending is LF (line feed, \012)
Windows line ending is CR LF (\015\012)
    Need to remove the CRs (s/\015//g)
Mac line ending (up to OS 9) is CR (carriage return, \015)
    Need to replace CR with LF (s/\015/\012/g)
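Both substitutions can be wrapped in one small routine. This is a sketch only: normalise_newlines is a hypothetical helper name, and the two test strings are invented. The order matters, because CR LF pairs have to be handled before any lone CRs.

```perl
use strict;
use warnings;

# normalise_newlines is a hypothetical helper, not a Perl builtin.
sub normalise_newlines {
    my ($text) = @_;
    $text =~ s/\015\012/\012/g;   # Windows CR LF becomes LF
    $text =~ s/\015/\012/g;       # any remaining lone CR (old Mac) becomes LF
    return $text;
}

my $dos = "ACGT\015\012TTGA\015\012";   # Windows-style input
my $mac = "ACGT\015TTGA\015";           # Mac OS 9-style input

for my $raw ($dos, $mac) {
    my $clean = normalise_newlines($raw);
    my @lines = split /\012/, $clean;
    print scalar(@lines), " lines: @lines\n";
}
```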
Reformatting in UNIX
Can also be done on the UNIX command line:
    dos2unix file.txt
Linux has become quite good at dealing with DOS files, but you will still have problems on other UNIXes (e.g. Solaris).
If there is a problem, the extra character (CR) will print as “^M”.
System calls
There are three commands that can be used to make system calls, i.e. to run UNIX commands from within Perl:
    exec
    system
    `` (“backticks”)
Examples
    exec "ls";
Tell UNIX “ls”, then die: exec replaces the running Perl program, so nothing after it is executed. Results will be printed to the terminal.
    system "ls";
Tell UNIX “ls”, wait for it to finish, then continue with the program. Results will be printed to the terminal.
    my $ls = `ls`;
Tell UNIX “ls”, but don’t print any output to the terminal. Wait until the command has finished, then capture the output in the variable $ls.
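The difference between the three calls is easiest to see side by side. The sketch below assumes a UNIX-like system and uses echo only because it is available almost everywhere. The exec is run in a forked child process so that the main script can carry on afterwards; that is a choice made for this illustration, not something the slides prescribe.

```perl
use strict;
use warnings;

$| = 1;   # unbuffered output, so our prints and the commands' output stay in order

# system: run the command, wait for it, output goes to the terminal.
# The return value is the exit status; 0 means success.
my $status = system("echo", "hello from system");
warn "command failed with status $status\n" if $status != 0;

# Backticks in scalar context: run, wait, capture all output in one string.
my $captured = `echo hello from backticks`;
chomp $captured;
print "Captured: [$captured]\n";

# Backticks in list context: one element per line of output.
my @lines = `echo one && echo two`;
print "Got ", scalar(@lines), " lines\n";

# exec: replaces the current process and never returns on success,
# so here it is called in a child created with fork.
my $pid = fork();
die "fork failed: $!" unless defined $pid;
if ($pid == 0) {
    exec("echo", "hello from exec") or die "exec failed: $!";
}
waitpid($pid, 0);   # parent waits for the child to finish
print "Parent script is still running\n";
```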
Doing BLAST from within Perl
So now we can run BLAST, or any other command, from inside our Perl script:
    open FASTA, ">seq.fas" or die "Cannot write seq.fas: $!";
    print FASTA ">seq\n$sequence\n";    # no comma after the filehandle
    close FASTA;
    my $res = `blastall -i seq.fas -p blastp -d nr`;
    if ($res =~ /Sequences producing significant/) {
        print "BLAST hits found!\n";
    } else {
        print "No BLAST hits!\n";
    }
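Backticks return whatever the command printed, even if the command failed, so it can be worth checking the exit status in $? before trusting the output. A rough sketch follows; run_and_capture and has_hits are hypothetical helper names, and an echo command stands in for BLAST so that the example runs on a machine without BLAST installed.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command, die if it fails, return its output.
sub run_and_capture {
    my ($command) = @_;
    my $output = `$command`;
    die "Could not run '$command': $!\n" if $? == -1;            # never started
    die "'$command' exited with status " . ($? >> 8) . "\n" if $? != 0;
    return $output;
}

# Hypothetical helper: same test as on the slide, returning 1 or 0.
sub has_hits {
    my ($report) = @_;
    return $report =~ /Sequences producing significant/ ? 1 : 0;
}

# Stand-in command; this is NOT real BLAST output.
my $res = run_and_capture('echo Sequences producing significant alignments:');
print has_hits($res) ? "BLAST hits found!\n" : "No BLAST hits!\n";
```

In a real script the stand-in would be replaced by the blastall command line from the slide.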
Rules for successful programming
Trust no-one. (I mean, to write code for you. I am especially thinking of Bioperl! Often it is quicker in the long run to write it yourself.)
Don’t get bogged down in complexity. There is probably a simpler way to do it.
Try alternative approaches.
Be vicariously lazy (Larry Wall).