

5.1 Revision: Ifs and Loops

5.2 if, elsif, else
It’s convenient to test several conditions in one if structure:

    print "Please enter your grades average:\n";
    my $number = <STDIN>;
    if ($number < 0 or $number > 100) {
        print "ERROR: The average must be between 0 and 100.\n";
    } elsif ($number > 90) {
        print "wow!\n";
    } elsif ($number > 80) {
        print "well done.\n";
    } else {
        print "oh well...\n";
    }

Notes:
- The "or" test is true if at least one of its conditions is true.
- Indentation: use a single tab on each line of a new block. The ‘}’ that ends the block should be at the same indentation as the line where the block started.
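To try the branches without typing input, here is a minimal sketch that runs the same test over a fixed list of averages. The list of values and the @messages array are assumptions made for this illustration, and the range check is assumed to be "$number < 0 or $number > 100".

```perl
use strict;
use warnings;

# Sample averages chosen for illustration (not from the slides)
my @averages = (95, 85, 50, 120);
my @messages;

foreach my $number (@averages) {
    my $message;
    if ($number < 0 or $number > 100) {
        $message = "ERROR: The average must be between 0 and 100.";
    } elsif ($number > 90) {
        $message = "wow!";
    } elsif ($number > 80) {
        $message = "well done.";
    } else {
        $message = "oh well...";
    }
    # Only the first matching branch runs, so each average gets one message
    push @messages, $message;
    print "$number: $message\n";
}
```

Because the branches are tested in order, an average of 85 never reaches the else branch, and an average of 120 is caught by the first test before the "> 90" test is tried.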

5.3 if, elsif, else
The same test as a flowchart:

    my $number = <STDIN>;
    $number < 0 or $number > 100 ?   Yes: "ERROR"
      No: $number > 90 ?             Yes: "wow!"
        No: $number > 80 ?           Yes: "well done"
          No:                        "oh well…"

    if ($number < 0 or $number > 100) {
        print "ERROR";
    } elsif ($number > 90) {
        print "wow!\n";
    } elsif ($number > 80) {
        print "well done.\n";
    } else {
        print "oh well...\n";
    }

5.4 Comparison operators

    String   Numeric   Comparison
    eq       ==        Equal
    ne       !=        Not equal
    lt       <         Less than
    gt       >         Greater than
    le       <=        Less than or equal to
    ge       >=        Greater than or equal to

Examples:

    if ($age == 18)...
    if ($name eq "Yossi")...
    if ($name ne "Yossi")...
    if ($name lt "n")...

Common mistakes, and the warnings they produce:

    if ($age = 18)...
        Found = in conditional, should be == at...
    if ($name == "Yossi")...
        Argument "Yossi" isn't numeric in numeric eq (==) at...
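The difference between the two operator families shows up when the same values are compared both ways. This is a small sketch; the variable names and sample values are made up for the illustration.

```perl
use strict;
use warnings;

my $x = "10";
my $y = "10.0";

# Numeric comparison converts both values to numbers: 10 == 10
my $numericEqual = ($x == $y) ? "yes" : "no";
# String comparison compares the characters: "10" differs from "10.0"
my $stringEqual = ($x eq $y) ? "yes" : "no";

# 9 is less than 10 as a number...
my $numericLess = (9 < 10) ? "yes" : "no";
# ...but as strings "9" comes after "10", because "9" sorts after "1"
my $stringLess = ("9" lt "10") ? "yes" : "no";

print "10 == 10.0 : $numericEqual\n";
print "10 eq 10.0 : $stringEqual\n";
print "9 <  10    : $numericLess\n";
print "9 lt 10    : $stringLess\n";
```

A rule of thumb that follows from this: use the numeric operators for values such as ages or counts, and the string operators for names and other text.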

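5.5 If
Commands inside an if block are executed at most once, only if the condition is true:

    my $luckyNum = 42;
    print "Guess a number\n";
    my $num = <STDIN>;
    if ($num != $luckyNum) {
        print "Wrong...\n";
    }
    print "Correct!!\n";

Flowchart: Guess a number, then $num != 42 ? Yes: "Wrong…", then "Correct!!". No: "Correct!!".
Note that "Correct!!" is printed even after a wrong guess, because the if tests the condition only once and the script then simply continues.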

5.6 Loops: while
Commands inside a loop are executed repeatedly (iteratively):

    my $luckyNum = 42;
    print "Guess a number\n";
    my $num = <STDIN>;
    while ($num != $luckyNum) {
        print "Wrong. Guess again.\n";
        $num = <STDIN>;
    }
    print "Correct!!\n";

Flowchart: Guess a number, then $num != 42 ? Yes: "Wrong…", read $num again and repeat the test. No: "Correct!!".
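The loop can be traced without a keyboard by feeding it a fixed list of guesses. In this sketch the @guesses array and the $wrongGuesses counter are assumptions added for the illustration; the list is chosen so that it contains the lucky number.

```perl
use strict;
use warnings;

my $luckyNum = 42;
# A fixed list of guesses stands in for keyboard input
my @guesses = (7, 99, 42, 13);

my $i = 0;
my $wrongGuesses = 0;
# The condition is re-tested before every pass through the block
while ($guesses[$i] != $luckyNum) {
    print "Wrong. Guess again.\n";
    $wrongGuesses++;
    $i++;
}
print "Correct!! ($wrongGuesses wrong guesses first)\n";
```

The last guess in the list (13) is never examined: the loop stops as soon as the condition becomes false.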

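5.7 Loops: while (defined …)
Let's observe the following code:

    open (IN, "<numbers.txt");
    my $line = <IN>;
    while (defined $line) {
        chomp $line;
        if ($line > 10) {
            print $line;
        }
        $line = <IN>;
    }
    close (IN);

Flowchart: Start, read $line, then defined ? No: End. Yes: > 10 ? (Yes: print $line), read $line and repeat the defined test.
Reading past the end of the file returns undef, so the loop ends after the last line.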
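The same loop can be tried without a numbers.txt file by opening a file handle on a string. This is a sketch under assumptions: the sample data, the lexical handle $in and the @aboveTen array are not from the slides.

```perl
use strict;
use warnings;

# Sample data standing in for numbers.txt, one number per line
my $data = "3\n12\n10\n47\n";
# Opening a reference to a string gives an in-memory file handle
open(my $in, "<", \$data) or die "Cannot open in-memory data: $!";

my @aboveTen;
my $line = <$in>;
while (defined $line) {      # undef means there are no more lines
    chomp $line;
    if ($line > 10) {
        push @aboveTen, $line;
    }
    $line = <$in>;           # read the next line before re-testing
}
close($in);

print "@aboveTen\n";
```

The line "10" is skipped because the test is "greater than", not "greater than or equal to".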

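5.8 Loops: foreach
The foreach loop passes through all the elements of an array:

    my @arr = (1,1,2,3,5);
    foreach my $num (@arr) {
        $num++;
    }

On each pass $num refers to the next element: $arr[0], $arr[1], $arr[2], $arr[3], $arr[4].
Note: the array is actually changed, because $num is an alias for the current element and not a copy of it.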
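A short sketch that makes the aliasing visible by printing the array after the loop, and contrasts it with a loop that works on a copy. The second array and the $copy variable are assumptions added for the comparison.

```perl
use strict;
use warnings;

my @arr = (1,1,2,3,5);
foreach my $num (@arr) {
    $num++;                 # changes the array element itself
}
print "@arr\n";

my @untouched = (1,1,2,3,5);
foreach my $num (@untouched) {
    my $copy = $num;        # a separate variable holding the value
    $copy++;                # changes only the copy
}
print "@untouched\n";
```

If the original values are still needed after the loop, copy each element into a new variable as in the second loop.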

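5.9 Loops: while
The same loop written with while:

    my @arr = (1,1,2,3,5);
    my $num;
    my $i = 0;
    while ($i < scalar(@arr)) {
        $num = $arr[$i];
        $num++;
        $arr[$i] = $num;
        $i++;
    }

and with foreach:

    my @arr = (1,1,2,3,5);
    foreach my $num (@arr) {
        $num++;
    }

Flowchart: Start, $i = 0, then test the condition. Yes: $num = $arr[$i], $num++, $arr[$i] = $num, and test again. No: End.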

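5.10 Breaking out of loops
next: skip to the next iteration
last: skip out of the loop

    open (IN, "<numbers.txt");
    my @lines = <IN>;
    foreach my $num (@lines) {
        if ($num <= 10) {
            next;
        }
        print $num;
    }
    close (IN);

With next, numbers of 10 or less are skipped and all the other numbers are printed.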

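5.11 Breaking out of loops
next: skip to the next iteration
last: skip out of the loop

    open (IN, "<numbers.txt");
    my @lines = <IN>;
    foreach my $num (@lines) {
        if ($num <= 10) {
            last;
        }
        print $num;
    }
    close (IN);

With last, the loop ends at the first number that is 10 or less, and nothing after it is printed.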
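Running both versions on the same list shows the difference side by side. A minimal sketch: the sample numbers and the two result arrays are assumptions made for the illustration.

```perl
use strict;
use warnings;

# Sample numbers standing in for the lines of numbers.txt
my @nums = (12, 30, 7, 45, 3, 20);

my @withNext;
foreach my $num (@nums) {
    if ($num <= 10) {
        next;               # skip this number, continue with the following one
    }
    push @withNext, $num;
}

my @withLast;
foreach my $num (@nums) {
    if ($num <= 10) {
        last;               # leave the loop entirely
    }
    push @withLast, $num;
}

print "next: @withNext\n";
print "last: @withLast\n";
```

With next the loop still visits every element; with last it stops at 7 and never sees 45, 3 or 20.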

5.12 More loops

5.13 Scope of variable declaration
If you declare a variable inside a loop, it will only exist in that loop. This is true for every { block }:

    my $name = "";
    while ($name ne "Nimrod") {
        $name = <STDIN>;
        chomp($name);
        print "Hello $name, what is your age?\n";
        my $age;
        $age = <STDIN>;
    }
    print $name;
    print $age;

The last line fails, because $age does not exist outside the loop:

    Global symbol "$age" requires explicit package name
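One way around the error is to declare the variable before the block, so that it is still visible afterwards. This sketch uses a fixed list of ages instead of keyboard input; the names @ages, $lastAge and $doubled are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @ages = (31, 27, 45);

my $lastAge;                    # declared outside the loop
foreach my $age (@ages) {
    my $doubled = $age * 2;     # exists only inside this block
    print "$age doubled is $doubled\n";
    $lastAge = $age;            # assigns to the outer variable
}

# $lastAge is still visible here; $age and $doubled are not
print "Last age seen: $lastAge\n";
```

Declare a variable in the smallest block that needs it, and move the declaration outward only when the value is needed after the block ends.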

5.14 Never declare the same variable name twice
If you declare a variable name twice, outside and inside a block, you are creating two distinct variables… don’t do it!

    my $name = "Ruti";
    print "Hello $name!\n";
    my $num;
    my @arr = (1,2,3);
    foreach $num (@arr) {
        my $name = "Nimrod";
        print "$num. Hello $name!\n";
    }
    print "Hello $name!\n";

Output:

    Hello Ruti!
    1. Hello Nimrod!
    2. Hello Nimrod!
    3. Hello Nimrod!
    Hello Ruti!

5.15 Never declare the same variable name twice
Without the second "my" inside the loop there is only one $name variable, so the assignment inside the loop changes it for the rest of the script:

    my $name = "Ruti";
    print "Hello $name!\n";
    my $num;
    my @arr = (1,2,3);
    foreach $num (@arr) {
        $name = "Nimrod";
        print "$num. Hello $name!\n";
    }
    print "Hello $name!\n";

Output:

    Hello Ruti!
    1. Hello Nimrod!
    2. Hello Nimrod!
    3. Hello Nimrod!
    Hello Nimrod!

5.16 Fasta format
A sequence in Fasta format begins with a single-line description, which starts with '>', followed by lines of sequence data that are broken by new-lines after a fixed number of characters:

    >gi| |ref|NP_ | thr operon leader peptide…
    MKRISTTITTTITITTGNGAG
    >gi| |ref|NP_ | fused aspartokinase I and homoserine… MG1655]
    MRVLKFGGTSVANAERFLRVADILESNARQGQVATVLSAPAKITNHLVAMIEKTISGQDALPN
    AKFFAALARANINIVAIAQGSSERSISVVVNNDDATTGVRVTHQMLFNTDQVIEVFVIGVGGVGGALLEQ
    NAGDELMKFSGILSGSLSYIFGKLDEGMSFSEATTLAREMGYTEPDPRDDLSGMDVARKLLILARETGRE
    LELADIEIEPVLPAEFNAEGDVAAFMANLSQLDDLFAARVAKARDEGKVLRYVGNIDEDGVCRVKIAEVD
    GNDPLFKVKNGENALAFYSHYYQPLPLVLRGYGAGNDVTAAGVFADLLRTLSWKLGV
    >gi| |ref|NP_ | homoserine kinase [Escherichia coli… MG1655]
    MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETFSLNNLGRFADKLPSEPREN
    IVYQCWERFCQELGKQIPVAMTLEKNMPIGSGLGSSACSVVAALMAMNEHCGKPLNDTRLLALMGELEGR
    ISGSIHYDNVAPCFLGGMQLMIEENDIISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAH
    GRHLAGFIHACYSRQPELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKP
    ETAQRVADWLGKNYLQNQEGFVHICRLDTAGARVLEN

5.17 GenBank files…
GenBank and GenPept are two NCBI formats for representing information on genes and proteins (respectively). Here is a sample record.

5.18 Class exercise 4b
1. Read a file containing several protein sequences in FASTA format, and print only their header lines using a while loop (see the example FASTA file on the course webpage).
2. Read a file containing several protein sequences in FASTA format, and print only their header lines using a foreach loop (see the example FASTA file on the course webpage).
3. (Ex 3.1b) Read a file containing numbers, one on each line, and print the sum of these numbers (use number.txt from the website as an example).
4*. Read the "fight club.txt" file and print the 1st word of the 1st line, the 2nd word of the 2nd line, and so on, until the last line. (If the i-th line does not have i words, print nothing.)

5.19 Class exercise 5a
1*. Read the "fight club.txt" file and print for each line the number of words in the line.
2*. Read a file containing several protein sequences in FASTA format, and print only the gi numbers (the number that appears after 'gi|'). Note that the number of digits in the gi number may vary.
3*. Read the "fight club.txt" file and print for each line the number of times the letter 'i' appears in it.

5.20 The substr function
The substr function extracts a substring out of a string. It receives 3 arguments:

    substr(EXPR,OFFSET,LENGTH)

Note: the OFFSET count starts from 0.
For example:

    my $str = "university";
    my $sub = substr($str, 3, 5);

$sub is now "versi", and $str remains unchanged.
Also note: you can use variables as the offset and length parameters.
The substr function can do a lot more. Google it and you will see…
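A few more calls on the same string, as a sketch of two standard behaviours of substr: leaving out LENGTH takes everything up to the end of the string, and a negative OFFSET counts from the end. The variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $str = "university";

my $middle    = substr($str, 3, 5);   # 5 characters starting at offset 3
my $toTheEnd  = substr($str, 3);      # no LENGTH: everything from offset 3
my $lastFour  = substr($str, -4);     # negative OFFSET: count from the end
my $firstChar = substr($str, 0, 1);   # the first character

print "$middle\n";      # versi
print "$toTheEnd\n";    # versity
print "$lastFour\n";    # sity
print "$firstChar\n";   # u
```

The last two forms are the ones used later in this lesson: substr($seq,-30) for the last 30 amino acids and substr($fastaLine,0,1) to test for a header line.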

5.21 Documentation of Perl functions
Another good place to start is the list of all basic Perl functions on the Perl documentation site: click the link “Functions” on the left (let's try it…)

5.22 PerlDoc in Eclipse
Also note a little treat: at the bottom you have a 'PerlDoc' tab that contains information about all of Perl's functions (and much more).

5.23 FASTA: Analyzing complex input
Assignment: Write a script that reads several protein sequences in FASTA format, and prints for each sequence its header and its 30 C-terminal (last) amino acids.
Obtain from the assignment:
- Input
- Required output
- Required processes (functions)

5.24 FASTA: Analyzing complex input
Let's start with something easier: print the header and the last 30 aa of the first protein only.
1. Read the first FASTA sequence:
   1.1. Read the FASTA header
   1.2. Read each line until the next FASTA header
2. Do something (print output):
   2.1. Get the last 30 aa
   2.2. Print the header and the last 30 aa
Let’s see how it’s done…
Flowchart: Start, read line, save header, read line. While the line is defined and not a header: concatenate it to the sequence and read the next line. Otherwise: do something, End.

5.25 FASTA: Analyzing complex input

    ## IN is assumed to be a file handle already opened on the FASTA file
    ## 1.1. Read FASTA header and save it
    my $fastaLine = <IN>;
    chomp $fastaLine;
    my $header = substr($fastaLine,1);

    ## 1.2. Read sequence until next FASTA header
    $fastaLine = <IN>;
    my $seq = "";
    while ((defined $fastaLine) and (substr($fastaLine,0,1) ne ">" )) {
        chomp $fastaLine;
        $seq = $seq.$fastaLine;
        $fastaLine = <IN>;
    }

    ## 2.1 get last 30aa
    my $subseq = substr($seq,-30);

    ## 2.2 print header and last 30aa
    print "$header\n$subseq\n";

Flowchart: Start, read line, save header, read line. While the line is defined and not a header: concatenate it to the sequence and read the next line. Otherwise: do something, End.

5.26 FASTA: Analyzing complex input
Overall design: read the FASTA file (several sequences). For each sequence:
1. Read the FASTA sequence:
   1.1. Read the FASTA header
   1.2. Read each line until the next FASTA header
2. Do something:
   2.1. Get the last 30 aa
   2.2. Print the header and the last 30 aa
Let’s see how it’s done…
Flowchart: Start, read line, then defined ? No: End. Yes: save header, read line. While the line is defined and not a header: concatenate it to the sequence and read the next line. Otherwise: do something and return to the defined test.

5.27 FASTA: Analyzing complex input

    ## IN is assumed to be a file handle already opened on the FASTA file
    ## 1.1. Read FASTA header and save it
    my $fastaLine = <IN>;
    while (defined $fastaLine) {
        chomp $fastaLine;
        my $header = substr($fastaLine,1);

        ## 1.2. Read seq until next header
        $fastaLine = <IN>;
        my $seq = "";
        while ((defined $fastaLine) and (substr($fastaLine,0,1) ne ">" )) {
            chomp $fastaLine;
            $seq = $seq.$fastaLine;
            $fastaLine = <IN>;
        }

        ## 2.1 get last 30aa
        my $subseq = substr($seq,-30);

        ## 2.2 print header and last 30aa
        print "$header\n$subseq\n";
    }

Flowchart: Start, read line, then defined ? No: End. Yes: save header, read line. While the line is defined and not a header: concatenate it to the sequence and read the next line. Otherwise: do something and return to the defined test.
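The nested loops can be tried end to end on a small in-memory sample. This is a sketch under stated assumptions: the two records are made up (they are not real database entries), the lexical handle $in replaces the file, the tail length is 5 instead of 30 so that the short sample sequences are long enough, and results are collected in @report before printing.

```perl
use strict;
use warnings;

# Made-up sample records standing in for a FASTA file
my $fasta = ">seq1 first sample\nMKRISTTITT\nTITITTGNGAG\n"
          . ">seq2 second sample\nMRVLKFGG\n";
open(my $in, "<", \$fasta) or die "Cannot open in-memory data: $!";

my $tailLength = 5;     # the assignment asks for 30
my @report;

my $fastaLine = <$in>;
while (defined $fastaLine) {
    # The current line is a header: drop the leading '>'
    chomp $fastaLine;
    my $header = substr($fastaLine, 1);

    # Collect sequence lines until the next header or the end of input
    $fastaLine = <$in>;
    my $seq = "";
    while ((defined $fastaLine) and (substr($fastaLine, 0, 1) ne ">")) {
        chomp $fastaLine;
        $seq = $seq . $fastaLine;
        $fastaLine = <$in>;
    }

    # $fastaLine now holds the next header (or undef), ready for the outer loop
    push @report, "$header\t" . substr($seq, -$tailLength);
}
close($in);

print join("\n", @report), "\n";
```

The key design point is that the inner loop stops on a header line without consuming it, so the outer loop finds that header waiting in $fastaLine on its next pass.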

5.28 Class exercise 5b
1. (Ex 3.2) Read a Fasta file (you can use as an example Ecoli.prot.fasta from the course web-site) and print for each sequence the header and the sequence length.
2. Read a Fasta file (such as Ecoli.prot.fasta) and print the headers of the proteins whose sequences start with MAD or MAN.
3*. Write a script that reads a file containing names and expenses on separate lines (such as expenses.txt from the course web site). Sum the numbers while there is a '+' sign before them, and print for each name the total of expenses. For example:

    Input:      Output:
    Nimrod      Nimrod
    Dana        Dana