Published by Adam Turner. Modified over 8 years ago.
12.1 Running Other Programs And CGI Scripts
12.2 Teaching Survey
Please fill in the teaching survey at: http://www.ims.tau.ac.il/tal/login.asp
I read it closely, and I change the course from year to year according to the feedback.
12.3 Exam
The exam will be held on the computers in the PC classroom, on 31/1/2007 at 9:00.
The computers will be disconnected from the network (i.e. no internet access. Sorry…).
You will receive a floppy disk (diskette) with some files, and the exam questions on paper.
You will write your solutions as normal Perl scripts and save them to the floppy, which you will submit at the end of the exam.
2 A4 pages
Everything except BioPerl and CGI
12.4 Some exam questions
1. Write a script that reads a DNA sequence from STDIN and prints its reverse complement. The sequence may be in either lowercase or uppercase letters.
2. The file exam1.pl contains a script that reads a sequence file in GenBank format. Add the missing regular expression in line 25 so that it finds all CDS lines. The regular expression should extract the coordinates of the start and stop codons. Fill in the appropriate variables in lines 27 and 28.
3. The file exam2.pl contains a script that reads a file in PDB format (see example in EHD1.pbd) and finds all the "ATOM…" lines. Write the subroutine getAtomInfo that is called for each such line. The subroutine has one parameter: the scalar string of the ATOM line. It should return the following data structure:
    {'amino_acid' => AMINO_ACID, 'coordinates' => [X,Y,Z], 'amino_acid_number' => N}
4. Make a copy of exam2.pl and name it exam3.pl. Add a new section at the end of the script that builds an array of arrays. Each internal array should hold all the hashes of the ATOMs that belong to a single amino acid of the protein.
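For the reverse-complement question, one possible approach is sketched below. It is only an illustration, not the official solution; the subroutine name reverse_complement is made up for this example, and the input is a fixed string rather than STDIN so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Return the reverse complement of a DNA string, preserving letter case.
# Characters other than A, C, G, T (in either case) are left unchanged.
sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;          # scalar context: reverses the string
    $rc =~ tr/ACGTacgt/TGCAtgca/;   # complement each base, keeping its case
    return $rc;
}

print reverse_complement("ATGCaa"), "\n";   # prints "ttGCAT"
```

To work on STDIN as the question asks, the script would read a line with <STDIN>, chomp it, and pass it to the subroutine.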
12.5 Running Other Programs
12.6 Dealing with less common formats
e.g. Rate4Site: still not very widely used (54 citations so far…), so there are no BioPerl modules that will run it for you and read its output:
#POS SEQ  SCORE    QQ-INTERVAL         STD     MSA DATA
#The alpha parameter 1.5
1    K    -0.9763  [-1.6621,-0.5750]   0.8777  6/6
2    V     0.9820  [-0.1107,2.2169]    1.5983  6/6
3    F     0.0035  [-0.9640,0.4935]    1.3195  6/6
4    S     0.2010  [-0.7766,0.8962]    1.3975  6/6
5    K    -0.3480  [-1.1423,0.1673]    1.0990  6/6
6    C    -0.7887  [-1.4855,-0.3560]   1.0182  6/6
7    E    -0.9894  [-1.6621,-0.5750]   0.8714  6/6
8    L     0.0153  [-0.9640,0.4935]    1.3378  6/6
9    A    -1.1347  [-1.6621,-0.7766]   0.7487  6/6
10   H    -0.3200  [-1.1423,0.1673]    1.1252  6/6
11   K    -0.3557  [-1.1423,0.1673]    1.1077  6/6
12   L    -0.8331  [-1.4855,-0.3560]   0.9965  6/6
13   K    -0.9763  [-1.6621,-0.5750]   0.8777  6/6
14   A     1.6809  [0.4935,2.2169]     1.6672  6/6
15   Q     1.4315  [0.1673,2.2169]     1.7297  6/6
16   E     0.1025  [-0.9640,0.8962]    1.3784  6/6
17   M     0.5006  [-0.5750,1.4226]    1.4456  6/6
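Since no module reads this format, a script has to parse it itself. The following is a minimal sketch that assumes every data line looks exactly like the sample above; real Rate4Site output may contain additional header lines or columns. The subroutine name parse_rate4site_line and the hash keys are invented for this example.

```perl
use strict;
use warnings;

# Parse one line of the Rate4Site output shown above.
# Returns a hash reference for a data line, and nothing for '#' comment
# lines or for lines that do not look like data.
sub parse_rate4site_line {
    my ($line) = @_;
    return if $line =~ /^\s*#/;
    my $num = qr/-?\d+(?:\.\d+)?/;   # a number such as -0.9763 or 6
    if ($line =~ m{^\s*(\d+)\s+(\S+)\s+($num)\s+\[\s*($num)\s*,\s*($num)\s*\]\s+($num)\s+(\d+)/(\d+)}) {
        return {
            pos         => $1,
            seq         => $2,
            score       => $3,
            qq_interval => [$4, $5],
            std         => $6,
            msa_data    => "$7/$8",
        };
    }
    return;
}

# Example: print position, residue and score for each data line
my @lines = (
    "#POS SEQ SCORE QQ-INTERVAL STD MSA DATA",
    "1 K -0.9763 [-1.6621,-0.5750] 0.8777 6/6",
    "2 V 0.9820 [-0.1107,2.2169] 1.5983 6/6",
);
foreach my $line (@lines) {
    my $site = parse_rate4site_line($line) or next;   # skip comment lines
    printf "%d %s %.4f\n", $site->{pos}, $site->{seq}, $site->{score};
}
```

Returning one hash per line keeps the parsing separate from whatever analysis comes next, for example collecting all positions whose score is below some cutoff.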
12.7 Running programs from a script
You may run programs using the system function:
    $exitValue = system("blast.exe...");
    if ($exitValue != 0) { die "blast failed!"; }
This way the output of blast will be seen on the screen.
Another way is to use "back-ticks" (the key to the left of the "1" key on your keyboard):
    @blastOutput = `blast.exe...`;
This way the output of blast is stored in the array, one line per element.
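Both ways of running a program can be tried without blast installed. In the sketch below the Perl interpreter itself ($^X) stands in for the external program; the commands it runs are made up for the illustration, and the quoting in the back-tick example assumes a Unix-like shell.

```perl
use strict;
use warnings;

# $^X is the path of the Perl interpreter running this script. It is used
# here only as a stand-in for an external program such as blast.exe.

# 1. system(): the program's output goes straight to the screen.
#    The return value is the wait status; the exit code is in its high byte.
my $status = system($^X, '-e', 'print "hello from the child\n"; exit 3');
die "could not start the program: $!" if $status == -1;
my $exitCode = $status >> 8;
print "exit code: $exitCode\n";

# 2. Back-ticks: the program's output is captured, one line per element.
my @output = `"$^X" -e "print qq(first\\nsecond\\n)"`;
chomp @output;
print "captured: $_\n" foreach @output;
```

Comparing the return value of system with zero, as on the slide, is enough to detect failure; shifting it right by 8 bits is only needed when the actual exit code matters.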
12.8 Class exercise 15
1. Write a script that runs clustalw on a given protein FASTA file (use ex15.zip from the website, and use the help file in there!).
2. Modify the script: now do both the multiple sequence alignment and build an NJ tree.
3. Modify the script: now add a rate4site run on the output of clustalw (type "rate4site.exe -h" for help).
12.9 CGI Scripts: Producing Web Pages
12.10 CGI: Common Gateway Interface
A CGI script is a script that is intended to be used over the internet.
A user can use a CGI script on a web server to obtain data from databases (e.g. the GenBank web server) or to run analyses (e.g. Blast at NCBI).
The output of the script is an HTML page.
12.11 HTML: What is a web page?
All web pages that you see on the internet are written in HTML. HTML (HyperText Markup Language) is a computer language that defines how a web page will look in your web browser.
Web browsers (such as Microsoft Internet Explorer) read HTML text files and produce colorful graphical pages.
You can see the HTML source code of a web page in Explorer by clicking: View->Source
Try it on the course web page: Perl Programming By Eyal Privman …
12.12 HTML basics
HTML uses tags. Tags are always enclosed in angle brackets and are case-insensitive. For example, <B> and <b> are the same tag.
Tags typically occur in begin-end pairs. These pairs are in the form <tag>...</tag>
For example, if you want some text to be underlined in your page:
    <u>Aim:</u> The aim of this course is to introduce the participant …
12.13 Structure of HTML documents
The whole document should be between <html>...</html>
The text between <head>...</head> includes general information about the page.
Inside the "head" section, use <title>...</title> to write the title of the page.
The text between <body>...</body> is the actual contents of the page.
12.14 Class exercise 16
1. Create the following HTML file and view it with Internet Explorer:
    <html>
    <head><title>Hello World Page</title></head>
    <body>Hello World!</body>
    </html>
(name your file "class_ex16.1.html")
12.15 Running a CGI over the web
The easiest way to get yourself a web server is to have an account at the bioinformatics unit (on the bioinfo server).
You should place your HTML files and CGI scripts in your home directory on the bioinfo server.
You will have to ask the staff of the bioinfo unit to open your account to web access (they will create the needed directories for you).
12.16 Producing HTML page with a script
Any Perl script can output its results in HTML, using simple print commands. The Perl CGI module can make it easier for you:
    #!/usr/local/bin/perl
    # (the line above is necessary on a UNIX server)
    use CGI;
    my $cgi = CGI->new;
    print $cgi->header .
          $cgi->start_html('Hello World Page') .
          $cgi->h1('Hello World!') .
          $cgi->end_html;
    exit(0);   # tells the server everything is fine
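For comparison, the "simple print commands" route without the CGI module might look like the sketch below. The subroutine name hello_page is invented for this example, and the header line is the minimal one that web servers commonly accept; the CGI module's own header output contains more detail.

```perl
use strict;
use warnings;

# Build a complete CGI response by hand: an HTTP header line, an empty
# line, and then the HTML document itself.
sub hello_page {
    my ($title, $heading) = @_;
    return "Content-Type: text/html\n\n"
         . "<html>\n"
         . "<head><title>$title</title></head>\n"
         . "<body>\n<h1>$heading</h1>\n</body>\n"
         . "</html>\n";
}

print hello_page('Hello World Page', 'Hello World!');
```

The empty line after the header is what separates it from the page contents; forgetting it is a common reason for a server error.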
12.17 Class exercise 16
2. Create the Perl script from the previous slide and test it.
12.18 An HTML form can run a CGI script
12.19 An HTML form can run a CGI script
This form takes input (a name) and invokes a CGI script named script.pl, which should be placed in the directory cgi-bin.
The page is titled "HTML Form Example". It shows the prompt "Enter your name:" with a text field, and two buttons: "Submit this Form" and "Reset this Form".
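A form of this kind could be written roughly as follows. This is a sketch with assumptions: the subroutine name name_form, the method="get" attribute and the relative path cgi-bin/script.pl are guesses made for the illustration, while the field name userName is the one the CGI script on the next slide reads. The HTML is wrapped in a Perl subroutine only so that it can be printed and checked from a script; it can equally be saved as a plain .html file.

```perl
use strict;
use warnings;

# Return the HTML of a one-field form that sends "userName" to the
# CGI script given in $action (path is an assumption, see above).
sub name_form {
    my ($action) = @_;
    return <<"HTML";
<html>
<head><title>HTML Form Example</title></head>
<body>
<form action="$action" method="get">
Enter your name: <input type="text" name="userName">
<input type="submit" value="Submit this Form">
<input type="reset" value="Reset this Form">
</form>
</body>
</html>
HTML
}

print name_form('cgi-bin/script.pl');
```

The name attribute of the input field is what ties the form to the script: the script asks for the parameter by that name.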
12.20 Using the input in the CGI script
Use the CGI function param to get the input that was entered into the form.
To get a list of all parameter names:
    my @params = $cgi->param();
To get the value for a specific parameter name:
    my $value = $cgi->param(PARAM_NAME);
For the example form in the previous slide, the CGI script could do this:
    print $cgi->h1('Hello ' . $cgi->param("userName") . '!');
12.21 Class exercise 17
Create the HTML form and the Perl script from the previous slides on the bioinfo server (it's a UNIX system!):
1. Log in to bioinfo using TeraTerm (Start->Tera Term): the host is "bioinfo.tau.ac.il", choose SSH, click OK, click Yes, user-name is "symp", password is "turj".
2. In UNIX you can use "cd" as in Windows, and "ls" or "ls -l" are like "dir".
3. Use the command "mkdir DIR_NAME" to create a directory named after your first name inside the directory "public_html". The HTML file should be in there.
4. To create and edit files use the editor pico ("pico FILE_NAME"). To paste into TeraTerm click the middle mouse button.
5. To access this HTML from your browser use this address: http://bioinfo.tau.ac.il/~symp/YOUR_NAME/form.html
12.22 Class exercise 17
6. Create another directory for yourself inside the directory "cgi-bin". The CGI script should be in there.
7. After creating the script you have to give it execution permissions: "chmod +x SCRIPT_NAME". Use "ls -l" to check that it now has x's like this:
    (bioinfo:symp)~/cgi-bin/eyal>ls -l
    -rwxr-xr-x 1 symp staff 167 Jan 23 13:14 hello.pl*
8. The reference to the CGI script in the HTML form should point to the script in this new directory.
Bonus* Write another HTML form that asks the user for a FASTA file of DNA sequences, and runs a CGI version of ex3.4 (find ORFs in each sequence).
12.23 Installing packages: Do it yourself!
12.24 Download and install a package (Class exercise 17)
If you find a package in CPAN or elsewhere, you can usually download a compressed archive of all the files of the package, which is usually a .tar.gz file.
For example:
Search for BioPerl version 1.4 in CPAN. It should be called something like "bioperl-1.4.tar.gz".
Unzip it (extract the files from the compressed archive).
Place the unzipped files or directories in the site\lib\ directory of the ActivePerl directory on your computer (…\ActivePerl-5.8.7.813\site\lib\).
For example, the "Bio" directory of BioPerl should be moved to: …\ActivePerl-5.8.7.813\site\lib\Bio
Now you should be able to use modules named like Bio::SeqIO. Test it with SeqIO_example.pl (available on the webpage).
12.25 Using packages from other directories (Class exercise 17)
The command "use lib" asks Perl to also look in a certain directory when searching for packages that are used in the script:
    use lib 'D:\perl\myPackages';
    use myPackage;
(Assuming that the directory "myPackages" contains "myPackage.pm")
Move the "Bio" directory of BioPerl to 'D:\test' and make SeqIO_example.pl find it by adding "use lib".
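The effect of "use lib" can be seen in a self-contained sketch. It assumes nothing about your computer: it creates a temporary directory, writes a tiny myPackage.pm into it (its contents are invented for this example), and then loads it. Because the directory only exists once the script is running, the sketch uses the run-time equivalents of "use lib" and "use".

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Create a throw-away directory holding a tiny module, myPackage.pm
my $dir  = tempdir(CLEANUP => 1);
my $file = File::Spec->catfile($dir, 'myPackage.pm');
open(my $out, '>', $file) or die "cannot write $file: $!";
print {$out} "package myPackage;\nsub greet { return 'hello from myPackage' }\n1;\n";
close($out) or die "cannot close $file: $!";

# Run-time equivalent of:  use lib $dir;  use myPackage;
# ("use" is processed at compile time, before $dir has been created)
require lib;
lib->import($dir);
require myPackage;

print myPackage::greet(), "\n";
print "directories searched for packages:\n";
print "  $_\n" foreach @INC;
```

The printed list is the array @INC, which holds the directories Perl searches for packages; "use lib" simply adds a directory to the front of it.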
12.26 Running Blast Remotely and Locally
12.27 BioPerl: run blast over the web
BioPerl lets us blast our sequences at the NCBI website: use Bio::Tools::Run::RemoteBlast instead of Bio::Tools::Blast (which I showed you before).
    use Bio::Tools::Run::RemoteBlast;
    # here we define the parameters and input of blast
    my %runParam = (-method   => 'remote',
                    -prog     => 'blastp',
                    -database => 'swissprot',
                    -seqs     => [$seqObj1, $seqObj2]);
    # here we run it
    my $blastObj = Bio::Tools::Blast->new(-run    => \%runParam,
                                          -parse  => 1,        # ask to parse the report
                                          -signif => '1e-10',  # the cutoff
                                          -strict => 1);
12.28 Running a local blast
1. You could install blast on your computer from ftp.ncbi.nlm.nih.gov (there, go to the directory blast/executables/release/). But this may be difficult, and you will also need to download and install the databases you want to search.
2. You can also work on the Unix servers of the bioinformatics unit, where you can use the local blast that is already installed. The GenBank databases that are installed there can be used for blast and for any other work, such as getting a sequence by its accession.
12.29 Class exercise 18
1. Write a script that runs blast over the web on a given protein FASTA file (use the same FASTA file as in ex. 14), and prints the accessions of the first 20 hits for each input sequence.
2. Modify the script: take the accession of a sequence as a command-line argument, fetch this sequence from GenBank over the web, and then blast it.