Strings and Patterns in Perl Ellen Walker Bioinformatics Hiram College.



Finding a Fixed Pattern
    my $string = "ATAAGCTTATCG";
    my $pattern = "GCT";
    print index($string, $pattern), "\n";                  # 4
    print index(scalar reverse($string), $pattern), "\n";  # 0

Finding multiple occurrences
    my $start = 0;
    print index($string, $pattern, $start);
    $start = index($string, $pattern, $start) + length($pattern);
    print index($string, $pattern, $start);
    $start = index($string, $pattern, $start) + length($pattern);
When do you stop searching?

Finding all (non-overlapping) occurrences
    my $start = 0;
    my $found = index($string, $pattern, $start);
    while ($found > -1) {
        print "$pattern found at $found\n";
        $start = $found + length($pattern);
        $found = index($string, $pattern, $start);
    }
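The search loop can be wrapped in a reusable subroutine. This is a minimal sketch: the name find_all and the sample sequence are illustrative assumptions, not part of the original slides.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start positions of all
# non-overlapping occurrences of $pattern in $string.
sub find_all {
    my ($string, $pattern) = @_;
    my @positions;
    return @positions if $pattern eq '';   # avoid looping forever on an empty pattern
    my $found = index($string, $pattern);
    while ($found > -1) {                  # index returns -1 when nothing is found
        push @positions, $found;
        $found = index($string, $pattern, $found + length($pattern));
    }
    return @positions;
}

my @hits = find_all("GCTAGCTGCT", "GCT");
print "found at: @hits\n";    # found at: 0 4 7
```

To report overlapping occurrences as well, the next search would start at $found + 1 instead of $found + length($pattern).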

Pattern Matching Operators
Three types of operators (so far)
– Translation: tr
– Substitution: s (with the optional g modifier)
– Matching: m
Used with =~ to bind the operator to a string (tr and s modify the string; m only tests it)
Example:
– $complement =~ tr/ACGT/TGCA/;

Translation
The tr operator takes two lists of characters of the same length
Every character of the target string that appears in the first list is changed to the character at the same position in the second list
This is destructive; save the old string before you use it!

Translation examples
    my $string = "actgTGCA";
    my $capitalizedString = $string;
    $capitalizedString =~ tr/actg/ACTG/;    # "ACTGTGCA"
    my $lowerCaseString = $string;
    $lowerCaseString =~ tr/ACTG/actg/;      # "actgtgca"
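Because tr is destructive, a common approach is to translate a copy. The sketch below combines tr with reverse to build a reverse complement, and also uses the fact that tr returns the number of characters it matched. The subroutine name reverse_complement is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: works on a copy so the caller's string is untouched.
sub reverse_complement {
    my ($dna) = @_;
    my $rc = reverse $dna;            # scalar context: reverses the characters
    $rc =~ tr/ACGTacgt/TGCAtgca/;     # complement each base, preserving case
    return $rc;
}

my $dna = "ATAAGCTTATCG";
print reverse_complement($dna), "\n";   # CGATAAGCTTAT
print "$dna\n";                         # ATAAGCTTATCG (original is unchanged)

# tr returns the number of characters matched, so it can also count.
# With an empty replacement list the string itself is left as it was.
my $gc = ($dna =~ tr/GC//);
print "GC count: $gc\n";                # GC count: 4
```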

Substitution
Replaces the text matched by a pattern with a replacement string
The matched text and the replacement need not be the same length
s changes only the first occurrence
Add g to change all occurrences
Example:
– $string =~ s/T/U/g;
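The difference between s with and without g can be seen in a short sketch. The sample sequence and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $dna = "ATGTTT";

# Copy first, then substitute on the copy, so $dna stays intact.
(my $first_only = $dna) =~ s/T/U/;     # no g: only the first T changes
(my $all        = $dna) =~ s/T/U/g;    # g: every T changes

print "$first_only\n";   # AUGTTT
print "$all\n";          # AUGUUU

# The replacement may differ in length from the matched text.
(my $named = $dna) =~ s/ATG/[start]/;
print "$named\n";        # [start]TTT
```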

Substitution Examples
    my $aminoAcids = $dna;
    $aminoAcids =~ s/AUG/Met/g;
    $aminoAcids =~ s/GGU/Gly/g;
    $aminoAcids =~ s/GGG/Gly/g;
A sequence of these substitutions will not really work to translate RNA (why not?)
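One way to explore the question is to compare chained substitutions with a translation that reads the sequence three bases at a time. This is only a sketch: the codon table holds just the three codons from the slide, the name translate_in_frame is hypothetical, and Xaa is used here as a placeholder for any codon missing from the table.

```perl
use strict;
use warnings;

# Partial codon table: only the codons mentioned on the slide.
my %codon_table = (AUG => 'Met', GGU => 'Gly', GGG => 'Gly');

# Hypothetical helper: read codons in frame from the first base.
sub translate_in_frame {
    my ($rna) = @_;
    my @residues;
    while ($rna =~ /(...)/g) {                       # consume three bases per step
        push @residues, $codon_table{$1} // 'Xaa';   # Xaa: codon not in the table
    }
    return join '-', @residues;
}

my $rna = "AAUGGGU";

# Chained substitutions match anywhere, regardless of reading frame.
(my $naive = $rna) =~ s/AUG/Met/g;
$naive =~ s/GGU/Gly/g;
$naive =~ s/GGG/Gly/g;

print "substitutions: $naive\n";                       # substitutions: AMetGly
print "in frame:      ", translate_in_frame($rna), "\n";   # in frame:      Xaa-Gly
```

In frame, the codons are AAU and GGG, yet the substitutions find AUG and GGU by straddling codon boundaries. Text inserted by an earlier substitution can also be matched by a later one.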

Matching
Not destructive to the string
Tests whether the string matched (can be used as a condition in an if statement)
Example:
    if ($string =~ m/T/) {
        print "String is DNA, not RNA\n";
    }

Non-Exact Patterns
Can be used with s or m
Include
– wildcard characters
– multiple option matches
– capturing

Wildcard characters
.  matches any single character (except a newline, by default)
*  matches 0 or more repetitions of the preceding character
+  matches 1 or more repetitions of the preceding character
^  matches at the beginning of the string
$  matches at the end of the string
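Each of these characters can be tried against a short sequence. In this sketch the helper name matches and the sample sequence are assumptions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $string matches the compiled pattern, else 0.
sub matches {
    my ($string, $re) = @_;
    return $string =~ $re ? 1 : 0;
}

my $seq = "ATGGGA";

print matches($seq, qr/A.G/),   "\n";   # 1: . stands for the T between A and G
print matches($seq, qr/ATG+A/), "\n";   # 1: G+ covers the run GGG
print matches($seq, qr/ATC*G/), "\n";   # 1: C* is allowed to match zero C's
print matches($seq, qr/^ATG/),  "\n";   # 1: ATG is at the beginning
print matches($seq, qr/GGA$/),  "\n";   # 1: GGA is at the end
print matches($seq, qr/^GGA/),  "\n";   # 0: GGA is not at the beginning
```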

Multiple option matches
[actg]  matches one character in the set a, c, t, g
[^A-Z]  matches one character that is not A-Z
TAG|TGA|TAA  matches either TAG, TGA or TAA
– Example: my $Rpattern = 'A|G';
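A pattern stored in a variable, as in the $Rpattern example, can be interpolated into a match. The sketch below uses an alternation to locate stop-codon triplets in any reading frame and a negated character class to spot unexpected characters. The sequences and variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $stop_pattern = 'TAG|TGA|TAA';     # alternation kept in a string
my $seq = "ATGCCCTAGGGGTGA";

# With g in a while loop, each pass finds the next match.
# $-[0] holds the start position of the most recent match.
my @stops;
while ($seq =~ m/($stop_pattern)/g) {
    push @stops, $1 . '@' . $-[0];
}
print "@stops\n";                     # TAG@6 TGA@12

# A negated class finds characters outside the expected alphabet.
my $dirty = "ACGTNNAC";
my @odd = $dirty =~ m/[^ACGT]/g;      # list context returns all matches
print scalar(@odd), " unexpected characters\n";   # 2 unexpected characters
```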

Capturing Patterns
Any pattern in parentheses is "captured"
The captured text can be recovered with \1, \2 etc. inside the pattern, and with $1, $2 etc. in the replacement
Example: s/(...)(...)/$2$1/ switches the first two codons in the string.
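Both uses of captured text can be sketched briefly: $1 and $2 in a replacement, and \1 as a backreference inside the pattern itself. The sample sequence is an assumption for illustration.

```perl
use strict;
use warnings;

my $seq = "ATGGGATTT";

# In the replacement, captures are read with $1, $2, ...
(my $swapped = $seq) =~ s/(...)(...)/$2$1/;
print "$swapped\n";                     # GGAATGTTT

# Inside the pattern, \1 refers back to what the first group matched.
if ($seq =~ m/(.)\1\1/) {
    print "first run of three identical bases: $1\n";   # first run of three identical bases: G
}
```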

Slides are not Complete!
Page of the Perl book has an extensive list of regular expression examples.

Examples
6-mer palindrome: (.)(.)(.)\3\2\1
Pair of nucleotides repeated at least three times: (.)(.).*\1\2.*\1\2
Strings that end with GGA: GGA$
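These three patterns can be checked against a few test strings. The sketch below is illustrative only; the test sequences and variable names are assumptions.

```perl
use strict;
use warnings;

my $palindrome = qr/(.)(.)(.)\3\2\1/;       # reads the same in both directions
my $pair_x3    = qr/(.)(.).*\1\2.*\1\2/;    # same pair occurs at least three times
my $ends_gga   = qr/GGA$/;

my @checks = (
    [ "ACGGCA",     $palindrome, "6-mer palindrome" ],
    [ "ACGTAC",     $palindrome, "6-mer palindrome" ],
    [ "ATCCATGGAT", $pair_x3,    "pair three times" ],
    [ "ATCCAT",     $pair_x3,    "pair three times" ],
    [ "TTTGGA",     $ends_gga,   "ends with GGA"    ],
    [ "GGATTT",     $ends_gga,   "ends with GGA"    ],
);

for my $check (@checks) {
    my ($seq, $re, $label) = @$check;
    printf "%-10s %-17s %s\n", $seq, $label, ($seq =~ $re ? "yes" : "no");
}
```

The first string of each pair is expected to match and the second is not. Note that the palindrome pattern tests for a textual palindrome, not for a sequence equal to its own reverse complement.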