Computer Programming for Biologists
Class 9, Dec 4th, 2014
Karsten Hokamp


Overview

    mock exam
    revision
    variable scope
    extensions
    file handles

Mock Exam

Revision - Subroutines

    my $prot = &translate($seq);    # call

    sub translate {                 # definition
        my $seq = shift;            # parameters
        …
        return $prot;               # return value(s)
    }

scope

The area of the script in which a variable is visible.

Different blocks defined by:
    main program
    subroutines
    loops
    branches
→ different namespaces

scope

    # main part
    my $global_1;
    …
    while (my $input = <>) {        # block
        statement1;
    }
    if (condition) {                # block
        my $local_1 = 'xxx';
    }

    sub subroutine {                # block
        my $local_1;
        foreach my $nuc (list) {    # block
            statement2;
        }
    }

scope

Tip: Keep local variables within subroutines
→ explicitly pass content between main part and subs, e.g.:

    $protein = &translate($seq);

  $seq:     value passed into subroutine
  $protein: value returned from subroutine
→ avoid accidentally changing global variables
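The tip above can be tried out with a small, self-contained sketch. The subroutine name revcomp and the example sequence are assumptions made for illustration; they are not taken from the course script. The point is only that the subroutine works on its own local copy and hands the result back through return.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the slides): returns the reverse
# complement of a DNA string without touching the caller's variable.
sub revcomp {
    my ($seq) = @_;                    # local copy of the parameter
    $seq = reverse $seq;               # scalar context: reverses the string
    $seq =~ tr/ACGTacgt/TGCAtgca/;     # complement each base
    return $seq;                       # value handed back to the caller
}

my $dna = 'ATGCCG';                    # example data, assumed
my $rc  = revcomp($dna);               # $dna itself stays unchanged

print "original:           $dna\n";
print "reverse complement: $rc\n";
```

Because $seq is declared with my inside the subroutine, nothing in the main part can be changed by accident.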

scope

Wrong:

    # extract header
    while (my $input = <>) {
        if ($input =~ /^>(.+)/) {
            my $header = $1;    # visible only inside the if block
        }
    }
    print "sequence ID: $header\n";

Error message:
    Global symbol "$header" requires explicit package name

scope

Correct:

    # initialize global variable
    my $header = '';
    # extract header
    while (my $input = <>) {
        if ($input =~ /^>(.+)/) {
            $header = $1;
        }
    }
    print "sequence ID: $header\n";

course project
common errors: (scope)

Wrong (different variables):

    my $dna = '';
    # read input
    while (my $input = <>) {
        # remove line ending
        chomp $input;
        # append to sequence string
        my $dna .= $input;
    }

Correct (same variable):

    my $dna = '';
    # read input
    while (my $input = <>) {
        # remove line ending
        chomp $input;
        # append to sequence string
        $dna .= $input;
    }
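To see the effect of the stray my without needing an input file, the following sketch replaces the <> loop with a small array of lines. The array @lines and the variable names $shadowed and $copy are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Stand-in for lines read from a file (assumed example data)
my @lines = ("ACGT\n", "TTGA\n");

# Wrong: 'my' inside the loop declares a new variable on every pass,
# so the outer variable never receives any sequence.
my $shadowed = '';
foreach my $line (@lines) {
    chomp(my $copy = $line);
    my $shadowed = '';        # inner variable hides the outer one
    $shadowed .= $copy;       # appended text is lost at the end of the block
}

# Correct: no 'my' inside the loop, the outer variable is extended.
my $dna = '';
foreach my $line (@lines) {
    chomp(my $copy = $line);
    $dna .= $copy;
}

print "outer \$shadowed: '$shadowed'\n";
print "\$dna:            '$dna'\n";
```

The first loop leaves the outer variable empty, the second one collects the whole sequence.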

course project
common errors: (arrangement)

    # print output in chunks of 60 bp width
    while ($dna) {
        $out = substr $dna, 0, 60, '';    # empties $dna
        print "$i $out\n";
        $i += length($out);
    }

    # change string to array:
    @dna = split //, $dna;                # too late: $dna is already empty
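One possible way around this arrangement problem is to cut the chunks from a copy, so that the original string is still available afterwards. This is only a sketch; the variable names $copy, $pos, @chunks and @bases and the 160 bp test sequence are assumptions.

```perl
use strict;
use warnings;

my $dna = 'ACGT' x 40;        # 160 bp example sequence (assumed)

# Work on a copy so that $dna keeps its content
my $copy = $dna;
my $pos  = 1;
my @chunks;
while (length $copy) {
    # four-argument substr removes the first 60 characters and returns them
    my $out = substr $copy, 0, 60, '';
    push @chunks, $out;
    print "$pos $out\n";
    $pos += length $out;
}

# $dna is untouched, so it can still be split into single bases
my @bases = split //, $dna;
print scalar(@bases), " bases\n";
```

Alternatively, the split could simply be moved in front of the printing loop.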

course project
considerations: order is important

This version translates the reverse complement:

    # form the reverse complement
    $dna = reverse($dna);
    $dna =~ tr/ACTG/TGAC/;
    # translate
    my $protein = &translate($dna);

This version translates the original strand:

    # translate
    my $protein = &translate($dna);
    # form the reverse complement
    $dna = reverse($dna);
    $dna =~ tr/ACTG/TGAC/;

course project
considerations: make actions optional

    # define variables:
    my $do_revcomp     = '';
    my $do_composition = '';
    my $do_translate   = '';

    # read sequence
    my $dna = '';
    …
    # calculate GC content
    if ($do_composition) {
        &composition($dna);
    }
    # form the reverse complement
    if ($do_revcomp) {
        $dna = reverse($dna);
        $dna =~ tr/ACTG/TGAC/;
    }
    # translate the protein
    if ($do_translate) {
        &translate($dna);
    }

To switch an action on, set its variable to a true value, e.g.:

    my $do_revcomp = '1';
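A runnable miniature of this pattern might look as follows. The subroutines gc_content and revcomp, the array @report and the example sequence are assumptions for illustration and are not the course's own sequanto.pl code.

```perl
use strict;
use warnings;

# "choice" variables at the top: a true value switches an action on
my $do_composition = 1;
my $do_revcomp     = '';

my $dna = 'ATGCGC';           # example sequence (assumed)

# Hypothetical helper: GC content in percent
sub gc_content {
    my ($seq) = @_;
    my $gc = ($seq =~ tr/GCgc//);    # tr with empty replacement only counts
    return 100 * $gc / length($seq);
}

# Hypothetical helper: reverse complement
sub revcomp {
    my ($seq) = @_;
    $seq = reverse $seq;
    $seq =~ tr/ACGTacgt/TGCAtgca/;
    return $seq;
}

my @report;
if ($do_composition) {
    push @report, sprintf("GC content: %.1f%%", gc_content($dna));
}
if ($do_revcomp) {
    push @report, "reverse complement: " . revcomp($dna);
}
print "$_\n" for @report;
```

With the settings shown, only the GC content line is produced; setting $do_revcomp to 1 adds the second line.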

course project

Work on your course project (sequanto.pl):
    1. fix bugs
    2. add "choice" variables at the top
    3. move code blocks into subroutines (GC-content, composition)

Control through options

The Perl module Getopt::Long allows processing of command line options.

Control through options

    $ man Getopt::Long

    NAME
        Getopt::Long - Extended processing of command line options

    SYNOPSIS
        use Getopt::Long;
        my $data    = "file.dat";
        my $length  = 24;
        my $verbose = '';
        $result = GetOptions ("length=i" => \$length,     # numeric
                              "file=s"   => \$data,       # string
                              "verbose"  => \$verbose);   # flag

Each entry consists of:
    name of parameter     (e.g. length)
    type of argument      (e.g. =i)
    reference to variable (e.g. \$length)

Control through options

Command line parameters (with arguments):

    perl test.pl -verbose -length 20 -file input.txt

→ $verbose set to '1', $length set to '20', $data set to 'input.txt'

Reorder:

    perl test.pl -file input.txt -length 20 -verbose

Long version:

    perl test.pl --verbose --length=20 --file=input.txt

Short version:

    perl test.pl -v -l 20 -f input.txt

Control through options

Try this in your script:

    use Getopt::Long;                               # 1. Import module

    my $do_translation = '';                        # 2. Initialise variables
    my $do_revcomp     = '';

    &GetOptions ("translate" => \$do_translation,   # 3. Call function:
                 "revcomp"   => \$do_revcomp,       #    define flags and associate
                );                                  #    them with referenced variables

To allow the following execution:

    perl sequanto.pl -gc -revcomp test.fa

Note: the -gc flag needs its own entry and variable in the GetOptions call.
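The three steps can be tested without typing a command line by letting Getopt::Long parse an ordinary array through its GetOptionsFromArray function. In this sketch the array @args simulates the arguments of the execution shown above, and the variable $do_gc is an assumed name for the extra flag.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);    # 1. import module

# Simulated command line: perl sequanto.pl -gc -revcomp test.fa
my @args = ('-gc', '-revcomp', 'test.fa');

# 2. initialise variables ($do_gc is an assumed name)
my $do_gc          = '';
my $do_revcomp     = '';
my $do_translation = '';

# 3. call function: flags on the left, referenced variables on the right
GetOptionsFromArray(
    \@args,
    'gc'        => \$do_gc,
    'revcomp'   => \$do_revcomp,
    'translate' => \$do_translation,
) or die "Could not parse options\n";

print "gc: '$do_gc', revcomp: '$do_revcomp', translate: '$do_translation'\n";
print "remaining arguments: @args\n";
```

Recognised options are removed from the array, so only the file name is left over for reading. In a real script, GetOptions does the same with @ARGV.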

Data Input/Output
Redirect output

Program prints output to screen:

    $ translate.pl seq.fa
    MGSAILSALLSRRSQRATTIIYHYARITTQRAHGLCDII…

Redirect into file:

    $ translate.pl seq.fa > seq.aa

Append to file:

    $ translate.pl seq.fa >> seq.aa

Data Input/Output
Filehandles

Reading from STDIN, the default input stream (or from files named on the command line):

    my $in = <>;

Use a filehandle (here: IN) to read input from a specific file:

    open (IN, 'input.txt');          # open file for reading
    while (my $in = <IN>) { … }      # read content line by line
    close IN;                        # close filehandle when finished

Data Input/Output
Filehandles

Syntax:

    open (FH, filename);         # open file for reading
    open (FH, "< filename");     # open file for reading
    open (FH, "> filename");     # open file for writing
    open (FH, ">> filename");    # append to file
    close FH;                    # empties buffer

Write and append mode will create files if necessary.
Write mode will empty the file first.

Data Input/Output
Writing to files

    $file_name = 'results.txt';

    if ($write_modus eq 'append') {
        # append to file (creates file if necessary)
        open (OUT, ">>$file_name");
    } else {
        # normal write (erases content if file exists)
        open (OUT, ">$file_name");
    }

    print OUT 'some text';

    close OUT;    # output might not appear until FH is closed!

Data Input/Output
Error check!

Always test if an important operation worked out:

    open (IN, $file_name) or die "Can't read from $file_name: $!";

    open (OUT, ">>$file_out") or die "Can't append to $file_out: $!";

    # Note: special variable $! contains error message
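Write mode, append mode, reading and the error check can be combined into one small test. The sketch below swaps in the three-argument form of open with lexical filehandles, which is a common modern alternative to the bareword handles used on the slides, and writes to a temporary file created with the core module File::Temp so that no existing file is touched. All variable names are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file for the demonstration, removed when the program exits
my ($tmp_fh, $file_name) = tempfile(UNLINK => 1);
close $tmp_fh;

# Write mode: creates the file or empties an existing one
open(my $out, '>', $file_name) or die "Can't write to $file_name: $!";
print $out "first line\n";
close $out or die "Can't close $file_name: $!";

# Append mode: keeps the existing content
open(my $app, '>>', $file_name) or die "Can't append to $file_name: $!";
print $app "second line\n";
close $app or die "Can't close $file_name: $!";

# Read the file back line by line
open(my $in, '<', $file_name) or die "Can't read from $file_name: $!";
my @lines = <$in>;
close $in;
chomp @lines;

print "$_\n" for @lines;
```

If any open fails, die stops the program and $! reports the reason, for example a missing file or missing permissions.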

Data Input/Output
Reading from Filehandle

One or more file names are specified after the program; loop over each argument:

    foreach my $file (@ARGV) {           # special array holding the command line arguments
        open (IN, $file) or die;         # open filehandle
        while (my $in = <IN>) {          # read file line by line
            # do something
        }
        close IN;                        # close filehandle
    }

Reading sequence from a file

    $ perl split.pl gcctg test.fa

Note: two command line arguments!

With an explicit filehandle:

    # read pattern and sequence
    my ($pattern, $file) = @ARGV;
    open (IN, $file) or die "$!";
    my $sequence = '';
    while (<IN>) {
        next if (/^>/);
        chomp;
        $sequence .= $_;
    }
    close IN;

With the default input stream:

    # get pattern
    my $pattern = shift @ARGV;
    my $sequence = '';
    while (<>) {
        next if (/^>/);
        chomp;
        $sequence .= $_;
    }
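The reading loop can be wrapped into a subroutine and tried on a small temporary FASTA file. The subroutine name read_sequence, the test data and the final split on the pattern are assumptions; the slides do not show what split.pl does with the pattern.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the sequence of a FASTA file as one string
sub read_sequence {
    my ($file) = @_;
    open(my $in, '<', $file) or die "Can't read from $file: $!";
    my $sequence = '';
    while (my $line = <$in>) {
        next if $line =~ /^>/;    # skip header line
        chomp $line;              # remove line ending
        $sequence .= $line;       # append to sequence string
    }
    close $in;
    return $sequence;
}

# Small test file standing in for test.fa (assumed content)
my ($fh, $file) = tempfile(UNLINK => 1);
print $fh ">test sequence\nAAGCCTGTT\nGGGCCTGA\n";
close $fh;

my $pattern  = 'gcctg';
my $sequence = read_sequence($file);

# Assumed use of the pattern: cut the sequence at every match, ignoring case
my @fragments = split /$pattern/i, $sequence;

print "sequence:  $sequence\n";
print "fragments: @fragments\n";
```

In a real script the pattern and the file name would come from @ARGV instead of being set in the code.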

course project

Work on your course project (sequanto.pl):
    1. Add explicit opening of file-handle
    2. Store translated sequence into a new file

Exam Reminder

Exam: Thu, Dec 11th, pm