Sup.1 Supplemental Material (NOT part of the material for the exam)


Sup.2 More about Blast

Sup.3 Running programs from a script
You may run programs using the system function:
    $exitValue = system("blastall.exe...");
    if ($exitValue != 0) {die "blast failed!";}
This way the output of blast will be seen on the screen. You can use '>' to redirect the output to a file:
    $exitValue = system("blastall.exe... > out.blast");
If you want to capture the output, use "back-ticks" (left of the "1" key on your keyboard):
    @ARRAY = `blastall.exe...`;
In this case the output of blast is stored in the array, one line per element.
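The difference between system and back-ticks can be tried without blast installed. The following is a minimal sketch, assuming the shell command `echo` as a hypothetical stand-in for blastall; the variable names are illustrative and not from the slides.

```perl
use strict;
use warnings;

# Hypothetical stand-in: 'echo' replaces blastall so the sketch runs anywhere.
my $command = 'echo hit1';

# system() lets the output go to the screen and returns 0 on success.
my $exitValue = system($command);
if ($exitValue != 0) { die "command failed with status $exitValue\n"; }

# Back-ticks capture the output; in list context each line becomes one element.
my @outputLines = `$command`;
chomp(@outputLines);
print "Captured ", scalar(@outputLines), " line(s): @outputLines\n";
```

With system the text appears once on the screen; with back-ticks nothing is printed until the script prints the captured lines itself.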

Sup.4 Running a local blast
1. You could install blast on your computer from: ftp.ncbi.nlm.nih.gov
There, go to the directory: blast/executables/release/
The current version can be downloaded here: ftp://ftp.ncbi.nih.gov/blast/executables/release/2.2.21/blast ia32-win32.exe
Some non-official, but useful help can be found in:
And the official help is here:

Sup.5 Running a local blast
You can also work on the Unix servers of the bioinformatics unit, where you can use the local blast that is already installed. The GenBank databases installed there can be used for blast and for any other work, such as getting a sequence by its accession.

Sup.6 Advanced Sorting

Sup.7 Subroutine revision
A subroutine receives its arguments and may return a scalar or a list value:
    sub reverseComplement {
        my ($seq) = @_;
        $seq =~ tr/ACGT/TGCA/;
        $seq = reverse $seq;
        return $seq;
    }
    my $revSeq = reverseComplement("GCAGTG");    # CACTGC
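The slide's example returns a single scalar. As a sketch of the list case, the hypothetical subroutine baseCounts below (not part of the course material) returns two values at once, and the caller unpacks them with a list assignment.

```perl
use strict;
use warnings;

# Hypothetical helper: counts G/C and A/T bases and returns both as a list.
sub baseCounts {
    my ($seq) = @_;
    my $gc = ($seq =~ tr/GC//);   # tr with an empty replacement only counts
    my $at = ($seq =~ tr/AT//);
    return ($gc, $at);
}

my ($gcCount, $atCount) = baseCounts("GCAGTG");
print "GC: $gcCount, AT: $atCount\n";   # GC: 4, AT: 2
```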

Sup.8 Passing variables by reference
If we want to pass arrays or hashes to a subroutine, we must pass a reference:
    %gene = ("protein_id" => "E4a", "strand" => "-", "CDS" => [126,523]);
    printGeneInfo(\%gene);
    sub printGeneInfo {
        my ($geneRef) = @_;
        print "Protein $geneRef->{'protein_id'}\n";
        print "Strand $geneRef->{'strand'}\n";
        print "From: $geneRef->{'CDS'}[0] ";
        print "to: $geneRef->{'CDS'}[1]\n";
    }
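A subroutine that receives a reference can also compute with the data it points to. The sketch below uses a hypothetical helper, cdsLength, and assumes the CDS coordinates are inclusive (hence the + 1); neither the name nor that convention comes from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper: CDS length, assuming inclusive start and end coordinates.
sub cdsLength {
    my ($geneRef) = @_;
    my ($start, $end) = @{ $geneRef->{'CDS'} };   # dereference the inner array
    return $end - $start + 1;
}

my %gene = ("protein_id" => "E4a", "strand" => "-", "CDS" => [126,523]);
my $length = cdsLength(\%gene);
print "$gene{'protein_id'}: $length bp\n";   # E4a: 398 bp
```

Only the reference is copied into the subroutine, so the hash itself is not duplicated.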

Sup.9 Advanced sorting
We learned the default sort, which is lexicographic:
    print sort("Yossi","Bracha","Moshe");    # Bracha Moshe Yossi
    print sort(8,3,45,8.5);                  # 3 45 8 8.5
To sort by a different ordering rule we need to give a comparison subroutine, that is, a subroutine that compares two scalars and says which comes first:
    sort COMPARE_SUB (LIST);
Note that there is no comma between COMPARE_SUB and (LIST).
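The effect of the lexicographic default is easy to check. This short sketch (variable names are illustrative) sorts numbers and mixed-case names with the default rule, assuming no locale pragma is in effect.

```perl
use strict;
use warnings;

# Default sort compares the values as strings, character by character.
my @asStrings = sort (8, 3, 45, 8.5);
print "@asStrings\n";    # 3 45 8 8.5

# Uppercase letters come before lowercase ones in this ordering.
my @names = sort ("moshe", "Yossi", "Bracha");
print "@names\n";        # Bracha Yossi moshe
```

"45" comes before "8" because the strings are compared from their first character, and "4" precedes "8".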

Sup.10 Sorting numbers
    sort COMPARE_SUB (LIST);
COMPARE_SUB is a special subroutine that compares two scalars, $a and $b, and says which comes first (by returning 1, 0 or -1). For example:
    sub compareNumber {
        if ($a > $b) {return 1;}
        elsif ($a == $b) {return 0;}
        else {return -1;}
    }
    print sort compareNumber (8,3,45,8.5);    # 3 8 8.5 45
Again, there is no comma between compareNumber and the list.

Sup.11 The operator <=>
The <=> operator does exactly that: it returns 1 for "greater than", 0 for "equal" and -1 for "less than":
    sub compareNumber {
        return $a <=> $b;
    }
    print sort compareNumber (8,3,45,8.5);
For easier use, you can use a temporary subroutine definition in the same line:
    print sort {return $a <=> $b;} (8,3,45,8.5);
or just:
    print sort {$a <=> $b;} (8,3,45,8.5);
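Swapping $a and $b in the comparison block reverses the order. A minimal sketch with illustrative variable names:

```perl
use strict;
use warnings;

my @numbers = (8, 3, 45, 8.5);

# Numeric ascending order.
my @ascending = sort { $a <=> $b } @numbers;
print "@ascending\n";     # 3 8 8.5 45

# Swapping $a and $b gives numeric descending order.
my @descending = sort { $b <=> $a } @numbers;
print "@descending\n";    # 45 8.5 8 3
```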

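Sup.12 Now we can also sort complex data structures, such as an array of gene hashes of the form:
    {protein_id => PROTEIN_ID, strand => STRAND, CDS => [START, END]}
For example, by the start of the CDS:
    @SORTED_GENES = sort compareGenes @GENES;
    sub compareGenes {
        return $a->{"CDS"}[0] <=> $b->{"CDS"}[0];
    }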

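Sup.13 Now we can also sort complex data structures. Here genes with the same CDS start are ordered by the CDS end:
    @SORTED_GENES = sort compareGenes @GENES;
    sub compareGenes {
        if ($a->{"CDS"}[0] != $b->{"CDS"}[0]) {
            return $a->{"CDS"}[0] <=> $b->{"CDS"}[0];
        } else {
            return $a->{"CDS"}[1] <=> $b->{"CDS"}[1];
        }
    }
Each gene is a hash of the form:
    {protein_id => PROTEIN_ID, strand => STRAND, CDS => [START, END]}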
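A two-level comparison like this one can be run on a small array of gene hashes. The sketch below uses made-up gene records (geneA, geneB and geneC are hypothetical, not from the adeno genome file) and writes the tie-break with `||`, which falls through to the second comparison only when the first returns 0. That is a compact alternative to an if/else, not necessarily the form used in class.

```perl
use strict;
use warnings;

# Hypothetical gene records, each shaped {protein_id, strand, CDS => [START, END]}.
my @genes = (
    {protein_id => "geneA", strand => "+", CDS => [100, 500]},
    {protein_id => "geneB", strand => "-", CDS => [100, 300]},
    {protein_id => "geneC", strand => "+", CDS => [50, 900]},
);

# Compare by CDS start; if the starts are equal (result 0), compare by CDS end.
sub compareGenes {
    return $a->{"CDS"}[0] <=> $b->{"CDS"}[0]
        || $a->{"CDS"}[1] <=> $b->{"CDS"}[1];
}

my @sortedGenes = sort compareGenes @genes;
my @sortedIds = map { $_->{protein_id} } @sortedGenes;
print "@sortedIds\n";    # geneC geneB geneA
```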

Sup.14 The operator cmp
The <=> operator returns 1, 0 or -1 if the first value is numerically greater than, equal to or less than the second, respectively. The equivalent alphabetical operator is cmp. It returns 1, 0 or -1 if the first value is alphabetically greater than, equal to or less than the second. The default sort:
    sort (LIST);
is equivalent to:
    sort {$a cmp $b} (LIST);
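Both operators can appear in one comparison block. As a sketch, the list of names below (illustrative data) is sorted first by length with <=> and then alphabetically with cmp when the lengths are equal.

```perl
use strict;
use warnings;

my @names = ("Yossi", "Bracha", "Moshe", "Eyal");

# Shorter names first; names of equal length in alphabetical order.
my @byLength = sort { length($a) <=> length($b) || $a cmp $b } @names;
print "@byLength\n";           # Eyal Moshe Yossi Bracha

# Reverse alphabetical order: swap $a and $b around cmp.
my @reverseAlpha = sort { $b cmp $a } @names;
print "@reverseAlpha\n";       # Yossi Moshe Eyal Bracha
```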

Sup.15 Class exercise 12
Write scripts that read an input file with the following data, sort them and print them in sorted order to the screen:
1. Sort a file of grades and names according to the grades (e.g. grades.txt from the course website).
2. Sort a file where each line is a date, e.g. 24/7/2003 (e.g. dates.txt).
3. Sort the proteins in the file from ex. 9.1 by their lengths (create an array of keys sorted by the protein lengths).
4.* From home exercise 4: sort the CDSs from the adeno genome file:
   - First by the number of exons
   - Then by the length of the CDS (without the introns!)
   e.g. E1B 55K (1 exon, 1449bp) comes before E1A (2 exons, 801bp), but after E1B 19K (1 exon, 492bp).
   Use an array of gene hashes as in class ex. 10, and an appropriate comparison subroutine. Print the sorted protein IDs with their number of exons and lengths of CDS.