Lecture 6.1

Presentation transcript:


An Introduction to Perl for Bioinformatics – Part 1
Will Hsiao
Simon Fraser University, Department of Molecular Biology and Biochemistry

Outline
Session 1
  – Review of the previous day
  – Perl – historical perspective
  – Expand on Regular Expression
  – General Use of Perl
  – Expand on Perl Functions and introduce Modules
  – Interactive demo on Modules
Break
Session 2
  – Use of Perl in Bioinformatics
  – Object Oriented Perl
  – Bioperl Overview
  – Interactive demo on Bioperl
  – Introduction to the Perl assignment

Today’s Goals
By the end of today you will have:
Become familiar with a few more advanced programming concepts
  – Regular Expression
  – Functions and Modules
  – Object Oriented Perl
Heard about a few common uses of Perl
Learned how Perl can be used in bioinformatics
Discovered Bioperl

Recap from Yesterday
Which of the items below are variables? 74, ‘I knew this’, %seq_id, “exciting stuff”
What are functions? Which part of the statement below is a function?
    = split (/\t/, $genome);
Other issues?

What does this program do?
    #!/usr/bin/perl -w
    # a mystery subroutine that does something
    sub mystery_function {
        my ($seq1, $seq2) = @_;
        my $rDNA = reverse $seq1;
        $seq2 =~ tr/T/U/;
        my $hybrid = $rDNA . $seq2;
        return $hybrid;
    }
    # body of the main program
    $DNA1 = "GATACAATAC";
    $DNA2 = "ATCGTAATCC";
    $answer = mystery_function($DNA1, $DNA2);
    print "$answer\n";

“use strict”
    #!/usr/bin/perl -w
    use strict;
    # a mystery subroutine that does something
    sub mystery_function {
        my ($seq1, $seq2) = @_;
        my $rDNA = reverse $seq1;
        $seq2 =~ tr/T/U/;
        my $hybrid = $rDNA . $seq2;
        return $hybrid;
    }
    # body of the main program
    my $DNA1 = "GATACAATAC";
    my $DNA2 = "ATCGTAATCC";
    my $answer = mystery_function($DNA1, $DNA2);
    print "$answer\n";

Effects of “use strict”
Requires you to declare variables
    Correct:    my $DNA; $DNA = "ATCG";   or   my $DNA = "ATCG";
    Incorrect:  $DNA = "ATCG";
Warns you about possible typos in variable names (under strict, an undeclared variable such as $DAN is reported as a compile-time error)
    No warning: my $DNA = "ATCG"; $DNA =~ tr/ATCG/TAGC/;
    Warning:    my $DNA = "ATCG"; $DAN =~ tr/ATCG/TAGC/;
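The effect of the typo can be checked without crashing the whole script. The sketch below is only an illustration: it compiles the two snippets from the slide at run time with a string eval, so the error raised by "use strict" can be captured in $@. The variable names ($good_code, $bad_code and so on) are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Compile two tiny snippets at run time with string eval, so that the
# compile-time error raised by "use strict" can be caught and inspected.
my $good_code = q{
    use strict;
    my $DNA = "ATCG";
    $DNA =~ tr/ATCG/TAGC/;
    $DNA;
};
my $bad_code = q{
    use strict;
    my $DNA = "ATCG";
    $DAN =~ tr/ATCG/TAGC/;   # typo: $DAN was never declared
    $DNA;
};

my $good_result = eval $good_code;   # value of the last statement
my $good_error  = $@;                # empty string when the eval succeeded
my $bad_result  = eval $bad_code;    # undef, because compilation fails
my $bad_error   = $@;                # message naming the undeclared variable

print "good snippet returned: $good_result\n";
print "bad snippet failed with: $bad_error";
```

Without "use strict", the second snippet would silently create a new empty variable $DAN and leave $DNA unchanged, which is exactly the kind of bug that is hard to spot in a long program.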

Why bother with “use strict”?
Enforces some good programming rules
Helps to prevent silly errors
Makes troubleshooting your program easier
Becomes essential as your code becomes longer
We will use strict in all the code you see today and in your assignment
Bottom line: ALWAYS use strict

Perl – a brief history
Purpose: “… for scanning arbitrary text files, extracting information from those text files, and printing reports based on that information…” - from perl manpage
1987 – Perl 1.0 released
1993 – CPAN conceived
1995 – Perl released
  – Object oriented perl
  – Modules for creating interactive web pages (CGI)
  – Modules for connection to databases (DBI)
Current stable version of Perl is 5.8.5

How do we manipulate text?
Regular Expression

What is a Regular Expression?
REGEX provides pattern matching ability
Tells you whether a string contains a pattern or not (Note: it’s a yes or no question!)
Regular Expression looking for “dog”:
    “I have a golden retriever”          “No” or “False”
    “Yesterday I saw a big black dog”    “Yes” or “True”
    “My dog ate my homework”             “Yes” or “True”
    “Dog! Human’s best friend”           “No” b/c REGEX is case sensitive
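The four "dog" sentences from this slide can be tried directly. This is a minimal sketch; the hash %contains_dog is just a convenient place to keep the yes/no answers and is not part of the lecture material.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The four example sentences from the slide.
my @sentences = (
    "I have a golden retriever",
    "Yesterday I saw a big black dog",
    "My dog ate my homework",
    "Dog! Human's best friend",
);

# Record a yes/no (1/0) answer for each sentence.
my %contains_dog;
for my $sentence (@sentences) {
    $contains_dog{$sentence} = ($sentence =~ /dog/) ? 1 : 0;
    printf "%-35s %s\n", $sentence, $contains_dog{$sentence} ? "Yes" : "No";
}
```

The last sentence gives "No" because the pattern is lowercase "dog" while the sentence has "Dog".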

Why do we need Regular Expressions?
Humans do this quite well
But… imagine trying to find all ATG’s in the human genome by hand
Furthermore, imagine trying to find all EcoRI digestion sites (GAATTC) in the human genome
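On a small scale, the counting task looks like this. The sequence below is a short made-up string, not real genome data, and count_matches is a hypothetical helper written for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count non-overlapping occurrences of a pattern in a sequence.
# The "() =" trick forces list context, so the global match returns
# every hit and the outer scalar assignment receives how many there were.
sub count_matches {
    my ($sequence, $pattern) = @_;
    my $count = () = $sequence =~ /$pattern/g;
    return $count;
}

# A short invented sequence (assumption: for illustration only).
my $sequence = "ATGAATTCGGATGCCGAATTCTTATGA";

my $atg_count   = count_matches($sequence, "ATG");
my $ecori_count = count_matches($sequence, "GAATTC");

print "ATG occurs $atg_count times\n";
print "EcoRI site (GAATTC) occurs $ecori_count times\n";
```

The same two lines of matching code work unchanged whether the sequence is 27 letters or a whole chromosome read from a file.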

Perl REGEX example
    my $text = "Bioinformatics Kicks Ass";
    if ($text =~ /Kicks/) {
        print "The text contains Kicks\n";
    }
=~ is the binding operator
  – It says: “does the string on the left contain the pattern on the right?”
/Kicks/ is my pattern
The matching operation results in a true or false answer

More Regular Expression
A pattern that matches only one string is not very useful!
We need symbols to represent classes of strings
REGEX is its own little language inside Perl
  – Has different syntax and symbols!
  – Symbols which you have used in Perl such as $ . { } [ ] have totally different meanings in REGEX

REGEX Metacharacters
Metacharacters allow a pattern to match different strings
  – Wildcards are examples of metacharacters
  – /.ick/ will match “kick”, “sick”, “tick”, “stick”, “kicks”, etc.
  – Perl REGEX has much more powerful metacharacters used to represent classes of characters

Types of Metacharacters
. matches any one character (including a space) except “\n”
[ ] denotes a selection of characters and matches ONE of the characters in the selection
  – What does [ATCG] match?
\t, \s, \n match a tab, any whitespace character and a newline respectively
\w matches any character in [a-zA-Z0-9_]
\d matches [0-9]
\D matches anything except [0-9]

An Example of Metacharacters
V1S 5A6?
    /\w\d\D\s\d.[0-9]/
Is it a good pattern for postal code? What else does it match?
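One way to explore the question is to run the slide's pattern against a few strings that are clearly not postal codes. The tighter pattern below is only an assumption about the letter-digit-letter, space, digit-letter-digit layout seen in "V1S 5A6"; it is not an official validation rule, and the candidate strings are invented.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from the slide, and a tighter alternative (an assumption:
# uppercase letter, digit, letter, one whitespace, digit, letter, digit,
# with nothing before or after).
my $loose_pattern   = qr/\w\d\D\s\d.[0-9]/;
my $tighter_pattern = qr/^[A-Z]\d[A-Z]\s\d[A-Z]\d$/;

my @candidates = ("V1S 5A6", "_9- 0#0", "a1b 2c3", "call V1S 5A6 now");

# Store 1/0 for each candidate under each pattern.
my (%loose_hit, %tighter_hit);
for my $candidate (@candidates) {
    $loose_hit{$candidate}   = ($candidate =~ $loose_pattern)   ? 1 : 0;
    $tighter_hit{$candidate} = ($candidate =~ $tighter_pattern) ? 1 : 0;
    printf "%-18s loose=%d tighter=%d\n",
        "'$candidate'", $loose_hit{$candidate}, $tighter_hit{$candidate};
}
```

The loose pattern accepts all four strings, because \w also matches an underscore, \D and . match punctuation, and nothing ties the pattern to the start or end of the string.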

REGEX Quantifiers
What if you want to match a character more than once?
What if you want to match an mRNA with a polyA tail that is between 5 and 12 A’s long?
    “ATG……AAAAAAAAAAA”

REGEX Quantifiers
+ matches one or more copies of the previous character
* matches zero or more copies of the previous character
? matches zero or one copy of the previous character
{min,max} matches a number of copies within the specified range
    “ATG……AAAAAAAAAAA”
    /ATG[ATCG]+A{5,12}/
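A quick check of the quantifier pattern on three made-up sequences (the sequences and variable names are assumptions for this sketch, not lecture data):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The quantifier pattern from the slide.
my $polyA_pattern = qr/ATG[ATCG]+A{5,12}/;

my @sequences = (
    "ATGGCCTTTCCG" . ("A" x 7),   # ATG, some bases, then 7 A's
    "ATGGCCTTTCCG" . ("A" x 3),   # only 3 A's in a row
    "CCATGGCCAAAAAAGG",           # ATG and the A's sit in the middle
);

# 1 if the pattern is found somewhere in the sequence, otherwise 0.
my @found = map { /$polyA_pattern/ ? 1 : 0 } @sequences;

for my $i (0 .. $#sequences) {
    print "$sequences[$i] => ", ($found[$i] ? "match" : "no match"), "\n";
}
```

The third sequence matches even though it neither starts with ATG nor ends with A's, because the pattern is allowed to match anywhere inside the string.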

REGEX Anchors
The previous pattern is not strictly correct because:
  – It’ll match a string that doesn’t start with ATG
  – It’ll match a string that doesn’t end with poly A’s
Anchors tell REGEX that a pattern must occur at the beginning or at the end of a string

REGEX Anchors
^ anchors the pattern to the start of a string
$ anchors the pattern to the end of a string
    /^ATG[ATCG]+A{5,12}$/
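The difference the anchors make can be seen by running the same strings through both versions of the pattern. The test sequences here are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $unanchored = qr/ATG[ATCG]+A{5,12}/;
my $anchored   = qr/^ATG[ATCG]+A{5,12}$/;

my @sequences = (
    "ATGGCCTTTCCG" . ("A" x 7),          # starts with ATG, ends with A's
    "CCATGGCCAAAAAAGG",                  # extra bases on both sides
    "ATGGCCTTTCCG" . ("A" x 7) . "GG",   # extra bases after the A's
);

# Keep a 1/0 result per sequence for each pattern.
my @unanchored_hit = map { /$unanchored/ ? 1 : 0 } @sequences;
my @anchored_hit   = map { /$anchored/   ? 1 : 0 } @sequences;

for my $i (0 .. $#sequences) {
    print "$sequences[$i]: unanchored=$unanchored_hit[$i] anchored=$anchored_hit[$i]\n";
}
```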

REGEX is greedy!
The revised pattern is still incorrect because
  – It’ll match a string that has more than 12 A’s at the end
Quantifiers will try to match as many copies of a sub-pattern as possible!
    /^ATG[ATCG]+A{5,12}$/
    “ATGGCCCGGCCTTTCCCAAAAAAAAAAAA”

Curb that Greed!
? after a quantifier prevents REGEX from being greedy
Note this is the second use of the question mark
  – What is the other use of ? in REGEX?
    /^ATG[ATCG]+?A{5,12}$/
    “ATGGCCCGGCCTTTCCGAAAAAAAAAAAA”

REGEX Capture
What if you want to keep the part of a string that matches your pattern?
Use ( ) “memory parentheses”
    “ATGGCCCGGCCTTTCCGAAAAAAAAAAAA”
    /^ATG([ATCG]+?)A{5,12}$/

REGEX Capture
What’s inside the first ( ) is assigned to $1
What’s inside the second ( ) is assigned to $2, and so on
    /^ATG([ATCG]+?)(A{5,12})$/
         $1        $2
So $2 eq “AAAAAAAAAAAA”
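Capturing both groups with and without the extra ? shows what greed does to $1 and $2. This is a sketch under one assumption: the example mRNA is built as the 17 leading bases from the slide followed by exactly 12 A's. A second string with 14 A's is added to show that the non-greedy version changes how the match is divided, not whether a longer tail is accepted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed test sequence: leading bases from the slide plus exactly 12 A's.
my $mrna = "ATGGCCCGGCCTTTCCG" . ("A" x 12);

# In list context a successful match returns the captured groups ($1, $2).
my ($greedy_body, $greedy_tail) = $mrna =~ /^ATG([ATCG]+)(A{5,12})$/;
my ($lazy_body,   $lazy_tail)   = $mrna =~ /^ATG([ATCG]+?)(A{5,12})$/;

print "greedy:     body=$greedy_body tail=$greedy_tail\n";
print "non-greedy: body=$lazy_body tail=$lazy_tail\n";

# With 14 A's the non-greedy pattern still matches: the first group simply
# absorbs the two extra A's, because [ATCG] also includes A.
my $long_mrna = "ATGGCCCGGCCTTTCCG" . ("A" x 14);
my ($long_body, $long_tail) = $long_mrna =~ /^ATG([ATCG]+?)(A{5,12})$/;
print "14 A's:     body=$long_body tail=$long_tail\n";
```

In the greedy version the first group swallows seven of the A's and leaves only the minimum of five for the tail; in the non-greedy version the tail gets all twelve.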

REGEX Modifiers
Modifiers come after a pattern and affect the entire pattern
You have seen //g already, which does global matching (/T/g) and global replacement (s/T/U/g)
Other useful modifiers:
    //i    makes the pattern case insensitive
    //s    lets . match a newline
    //m    lets ^ and $ (anchors) match next to an embedded newline
    s///e  lets the replacement be evaluated as a Perl expression
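Each modifier can be tried on a one-line example. The strings below are invented for this sketch; each result is stored as 1 or 0 (or as the modified string) so it can be printed or checked.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# //i : ignore case, so "Dog" now matches /dog/.
my $case_insensitive = ("Dog! Human's best friend" =~ /dog/i) ? 1 : 0;

# //s : by default . does not match a newline; with //s it does.
my $two_lines    = "ATG\nTAA";
my $dot_default  = ($two_lines =~ /ATG.TAA/)  ? 1 : 0;
my $dot_with_s   = ($two_lines =~ /ATG.TAA/s) ? 1 : 0;

# //m : by default ^ only matches at the very start of the string;
# with //m it also matches right after an embedded newline.
my $record         = ">seq1\nATGCC";
my $caret_default  = ($record =~ /^ATG/)  ? 1 : 0;
my $caret_with_m   = ($record =~ /^ATG/m) ? 1 : 0;

# //g : replace every T, not just the first one.
(my $rna = "GATTACA") =~ s/T/U/g;

# s///e : the replacement is evaluated as Perl code.
(my $doubled = "3 exons") =~ s/(\d+)/$1 * 2/e;

print "i=$case_insensitive s=$dot_default/$dot_with_s ",
      "m=$caret_default/$caret_with_m g=$rna e=$doubled\n";
```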

REGEX Demo
Demonstrate quantifiers
Demonstrate anchors
Demonstrate //i
Demonstrate capture
Demonstrate the effect of greedy vs. non-greedy
Demonstrate metacharacters

Other binding operators
=~ is called the binding operator, which “binds” a string on the left to a pattern on the right
  – E.g. $text =~ /PATTERN/
Two other operators are used with =~: s/// and tr///
  – =~ s/// (substitution) replaces a matched pattern with a string (kind of like the “replace” function in MS Word)
  – =~ tr/// (translation) translates each listed character to another
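A short side-by-side of the two operators on a made-up DNA string. Copying into a new variable first, as in (my $rna = $dna), keeps the original sequence untouched; that idiom is a choice made for this sketch.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $dna = "GATACA";

# s/// works on patterns: replace every T with U (DNA -> RNA).
(my $rna = $dna) =~ s/T/U/g;

# tr/// works character by character: A->T, T->A, C->G, G->C.
(my $complement = $dna) =~ tr/ATCG/TAGC/;

# tr/// also returns how many characters it matched; with an empty
# replacement list it counts without changing the string.
my $a_count = ($dna =~ tr/A//);

print "DNA:        $dna\n";
print "RNA:        $rna\n";
print "Complement: $complement\n";
print "Number of A's: $a_count\n";
```

Note that tr/// does not understand metacharacters or quantifiers; it only maps one list of characters onto another.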

Summary on REGEX
REGEX is its own little language!!!
REGEX is used in some functions (e.g. split)
Perl REGEX: extremely powerful and fast
REGEX is one of the main strengths of Perl
To learn more:
  – Learning Perl (3rd ed.) Chapters 7, 8, 9
  – Programming Perl (3rd ed.) Chapter 5
  – Mastering Regular Expressions (2nd ed.)

Common Uses of Perl
REGEX
  – Complete set of tools for pattern matching text
System administration
  – Perl scripts can be written to automate many system administration tasks
CGI.pm
  – Module for designing interactive web pages
DBI.pm
  – Database Interface – allows communication with all major RDBMS systems (Oracle, MySQL, etc.)

Review on Functions
How do we “call” a function?
    my $sum = add (2, 3)
    $sum = return value; add = function name; (2, 3) = parameter list
Functions can take some input values (parameters) and can return some output values
You need to assign the return values to a variable in order to use them

More Review on Functions
Benefits of subroutines
  – Decompose a big problem into smaller, more manageable problems
  – Organize your code
  – Improve code reuse
  – Easier to test and debug your code
    sub add {
        # some code that adds numbers here
        # return the sum
    }
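One possible way to fill in the body of add is sketched below. The slide leaves the body open, so this is just one reasonable version; accepting any number of arguments rather than exactly two is a design choice made here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Add up all the numbers passed in and return the total.
sub add {
    my @numbers = @_;      # copy the parameter list
    my $sum = 0;
    $sum += $_ for @numbers;
    return $sum;
}

# Assign the return value to a variable in order to use it.
my $sum = add(2, 3);
print "2 + 3 = $sum\n";
```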

What are Modules?
A “logical” collection of functions
Each collection (or module) has its own “name space”
  – Name space: a table containing the names of variables and functions used in your code

Why is name space important?
Package: SEQanalysis_DNA
    $DNA = "ATGAATACTACTAT…"
    $polyAtail = "AAAAAAAAAA"
    sub Revcom{}   # reverse complement sequence
    sub concat{}   # concatenate two DNA sequences
Package: SEQanalysis_Prot
    $exon1 = "MEDAVRSKNTMI"
    $exon2 = "RSVADEGFLSMIRQH"
    sub findmotif{}   # find a peptide motif
    sub concat{}      # concatenate two exon sequences
The two concat functions do not clash: SEQanalysis_DNA::concat vs. SEQanalysis_Prot::concat
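The two packages can be written in a single file to see the name spaces at work. The package names and the $polyAtail, $exon1 and $exon2 values come from the slide; the bodies of the two concat subroutines are placeholders invented for this sketch, since the slide shows them empty.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package SEQanalysis_DNA;
our $polyAtail = "AAAAAAAAAA";
# Concatenate two DNA sequences (placeholder body).
sub concat {
    my ($first, $second) = @_;
    return $first . $second;
}

package SEQanalysis_Prot;
our $exon1 = "MEDAVRSKNTMI";
our $exon2 = "RSVADEGFLSMIRQH";
# Concatenate two exon sequences (placeholder body).
sub concat {
    my ($first, $second) = @_;
    return $first . $second;
}

package main;
# Both subroutines are called "concat", but each lives in its own name
# space, so the fully qualified names pick out the one we mean.
my $dna_with_tail = SEQanalysis_DNA::concat("ATGAATACTACTAT", $SEQanalysis_DNA::polyAtail);
my $protein       = SEQanalysis_Prot::concat($SEQanalysis_Prot::exon1, $SEQanalysis_Prot::exon2);

print "$dna_with_tail\n$protein\n";
```

Without packages, the second "sub concat" would simply replace the first one, and Perl would only warn about a redefined subroutine.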

Why Use Modules
Modules allow you to use others’ code to extend the functionality of your program
But using other people’s modules is like going to other people’s houses
  – Not everything will be the way you like it
Read the module documentation
  – Be nice: use a module as it is intended
In Perl, each module is a file stored in some directory in your system
  – E.g. you can find CGI.pm in /usr/lib/… on your system (ask Graeme where it is)

Use Modules
To use a module:
  – use <module name>;
Examples:
  – use strict;
  – use Env;
  – use CGI qw(:standard);
To find out where modules are installed: perl -V
To find out what standard modules are available: perldoc perlmodlib
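As a small illustration of "use" with an import list, the sketch below loads List::Util, a module that ships with Perl but is not one of the examples on the slide. The exon lengths are invented numbers. The %INC hash is Perl's own record of which file each loaded module came from.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Import only the two functions we need from the module.
use List::Util qw(sum max);

my @exon_lengths = (120, 450, 87);   # made-up values for illustration

my $total   = sum(@exon_lengths);
my $longest = max(@exon_lengths);

# %INC maps each loaded module file to the path it was loaded from.
my $loaded_from = $INC{'List/Util.pm'};

print "Total exon length: $total\n";
print "Longest exon:      $longest\n";
print "List::Util was loaded from: $loaded_from\n";
```

Before the "use List::Util" line is added, calling sum or max fails because those names do not exist in your program's name space.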

Module Demo
Demonstrate perldoc as a method to read module documentation
Demonstrate the difference before and after using a module (use strict and use Env)
Demonstrate perl -V and an example of the directory structure of modules

Where to find modules
CPAN: Comprehensive Perl Archive Network
Central repository for Perl modules and more
“If it’s written in Perl, and it’s helpful and free, it’s probably on CPAN”
To install modules from CPAN
  – perl -MCPAN -e "install 'Some::Module'"
Module dependencies are taken care of automatically
You’ll (usually) need to be root to install a module successfully
For details see your notes

CPAN Web Demo
Demonstrate how to search for a module and how to access the online documentation
We’ll use Getopt::Long as an example

Interactive Demo on Getopt::Long
Open your laptop!
Open a terminal window
Type cd ~/perl_two
Type emacs ./getopt_demo.pl &
Let’s go over the example together
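The contents of getopt_demo.pl are not reproduced in this transcript, so the following is a separate minimal sketch of what Getopt::Long does, not the demo script itself. The option names (--input, --min-length, --verbose) and the parse_args helper are hypothetical. GetOptionsFromArray is used instead of GetOptions so the example can be run on a fixed list of arguments rather than the real command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse a list of command-line style arguments into a hash of options.
sub parse_args {
    my @args = @_;
    my %options = (min_length => 0, verbose => 0);   # defaults
    GetOptionsFromArray(
        \@args,
        'input=s'      => \$options{input},       # =s : takes a string value
        'min-length=i' => \$options{min_length},  # =i : takes an integer value
        'verbose'      => \$options{verbose},     # no value: a simple on/off flag
    ) or die "Could not parse options\n";
    return \%options;
}

my $options = parse_args('--input', 'seqs.fasta', '--min-length', '50', '--verbose');

print "input file: $options->{input}\n";
print "min length: $options->{min_length}\n";
print "verbose:    $options->{verbose}\n";
```

In a real script the same specification would normally be passed to GetOptions, which reads the arguments from @ARGV.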

Summary for Session 1
Always “use strict”
Regular Expression is its own language inside Perl
I encourage you to read the chapters on REGEX in Learning Perl
A module is a logical collection of functions
You can find module documentation by using perldoc (command line) or by going online to CPAN

Break