Regular Expressions in Perl


Regular Expressions in Perl Pat Morin COMP 2405

The =~ Operator
- In Perl, we can use regular expressions to match (parts of) strings.
- This is done with the =~ operator.
- This operator evaluates to true if the expression matches the string and false otherwise.
- Note that the text between the / and / is processed as a double-quoted string.

    if ($line =~ /__FILENAME__/) {
        print("This line contains __FILENAME__\n");
    }
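
Because the pattern is processed like a double-quoted string, a variable can supply the text to search for. The following is a minimal sketch of that behaviour; the variable names and sample lines are made up for illustration and are not from the slides.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the slides.
my $placeholder = "__FILENAME__";
my @lines = ("Save as __FILENAME__ now", "nothing to replace here");

# $placeholder is interpolated into the pattern before matching,
# so each line is tested for the text __FILENAME__.
my @hits = grep { $_ =~ /$placeholder/ } @lines;

print "This line contains $placeholder: $_\n" for @hits;
```

This works as written only because the placeholder contains no special characters; text with special characters would need escaping first.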

What is a regular expression?
- The simplest regular expressions are just strings.
- This is the case if the RE doesn't contain any special characters such as \ ^ . $ | ( ) [ ] { } * + ? or the delimiter /
- If you want to include a special character, just escape it with \

    if ($formula =~ /e=mc\^2/) { .... }
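
The effect of escaping can be checked directly. This sketch compares the unescaped and escaped forms of the pattern from the slide, and also shows Perl's \Q...\E quoting, which is not mentioned in the slides but escapes every special character in an interpolated string. The variable names are illustrative.

```perl
use strict;
use warnings;

my $formula = "e=mc^2";

# Unescaped ^ is an anchor (start of string), so this cannot match here.
my $unescaped = ($formula =~ /e=mc^2/) ? 1 : 0;

# Escaped \^ matches a literal caret.
my $escaped = ($formula =~ /e=mc\^2/) ? 1 : 0;

# \Q...\E quotes every special character in the interpolated text.
my $literal = "e=mc^2";
my $quoted = ($formula =~ /\Q$literal\E/) ? 1 : 0;

print "unescaped=$unescaped escaped=$escaped quoted=$quoted\n";
```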

The Alternation Operator
- The choice | operator creates a regular expression that matches one of two things.

    if ($lastName =~ /Morin|Lebowsky/) {
        PrepareWhiteRussian();
    }

The Character Class Operator
- The character class operator [] lets you match any one character within the class.
- [abcdefg] is equivalent to a|b|c|d|e|f|g
- There are a few predefined character classes:

    .    (almost) any character (any character except newline)
    \d   any digit 0-9: [0123456789]
    \s   any whitespace, such as [ \t\n\r\f]
    \w   any word character: [_0123456789abc...zABC...Z]

    if ($lastName =~ /[Ll1]ebowsky/) { ... }
    if ($year =~ /\d\d/) { ... }
    if ($isoDate =~ /\d\d\d\d-\d\d-\d\d/) { ... }
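
As a rough illustration of character classes in use, the sketch below filters two small lists with the patterns from this slide. The sample data and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample data.
my @names = ("Lebowsky", "lebowsky", "Rebowsky");
my @dates = ("2006-03-15", "06-3-15", "abcd-ef-gh");

# [Ll] accepts either an upper-case or a lower-case first letter.
my @matched_names = grep { $_ =~ /[Ll]ebowsky/ } @names;

# \d matches one digit, so this needs four digits, a dash,
# two digits, a dash, and two more digits somewhere in the string.
my @iso_like = grep { $_ =~ /\d\d\d\d-\d\d-\d\d/ } @dates;

print "names: @matched_names\n";
print "dates: @iso_like\n";
```

Note that these patterns are unanchored, so they only check that the text occurs somewhere in the string.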

More Character Classes
- There are lots of predefined character classes:

    \W   any non-word character
    \S   any non-space character
    \D   any non-digit character

- There are also POSIX character classes.
- Syntax is [:class:], used inside a bracketed character class (for example [[:alpha:]]), where class is one of alpha, alnum, ascii, blank, cntrl, digit, graph, lower, print, punct, space, upper, word, xdigit.
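
A short sketch of how the POSIX classes and the negated classes might be used. The sample string is made up, and the g modifier used here to collect every match is covered on a later slide.

```perl
use strict;
use warnings;

my $text = "Hello, world!";

# POSIX classes go inside an outer pair of brackets.
my @punct = ($text =~ /[[:punct:]]/g);   # every punctuation character
my @upper = ($text =~ /[[:upper:]]/g);   # every upper-case letter

# \S+ matches a run of non-space characters.
my @chunks = ($text =~ /\S+/g);

print "punct: @punct\n";
print "upper: @upper\n";
print "chunks: @chunks\n";
```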

The Range Operator
- The range operator {} and the related quantifiers can be used to match the same expression repeatedly:

    *       match 0 or more times
    +       match 1 or more times
    ?       match 1 or 0 times
    {n}     match exactly n times
    {n,}    match at least n times
    {n,m}   match at least n but not more than m times

- These operators are greedy: they will try to match as many times as possible.

The Range Operator - Examples

    if ($year =~ /\d{4}/) { ... }
    if ($date =~ /\d{4}-\d{1,2}-\d{1,2}/) { ... }
    if ($word =~ /\w+/) { ... }
    if ($oneOrMany =~ /\sabstract(s?)\s/) { ... }
    if ($numbers =~ /((\d+)\s+)*/) { ... }
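
Greediness is easiest to see on a string with more than one possible end point. The sketch below uses a made-up HTML fragment and captures what the quantifier consumed. The non-greedy form +? is standard Perl but is not covered in the slides, so treat it as an aside.

```perl
use strict;
use warnings;

my $html = "<b>bold</b>";

# Greedy: .+ grabs as much as it can, so the match runs to the LAST '>'.
my ($greedy) = ($html =~ /<(.+)>/);

# Non-greedy: .+? stops at the FIRST '>' that lets the match succeed.
my ($lazy) = ($html =~ /<(.+?)>/);

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```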

Special Characters
- There are some important special characters:

    ^   is the beginning of the string
    $   is the end of the string (or just before a newline at the end of the string)

- You can use ^ and $ to make sure your strings don't contain garbage.
- This is good practice for validating user input.

    if ($filename =~ /^\w*$/) {
        print("$filename is a safe filename");
    }
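
The difference anchors make can be sketched with a small validation helper. The function name is_safe_filename is hypothetical, and it uses \w+ rather than the slide's \w* so that an empty string is rejected; that change is an assumption about what a caller would usually want.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if the WHOLE name is word characters.
sub is_safe_filename {
    my ($name) = @_;
    return ($name =~ /^\w+$/) ? 1 : 0;
}

# Without anchors, a match anywhere in the string is enough.
my $unanchored = ("../etc/passwd" =~ /\w+/) ? 1 : 0;

print "report_2006:   ", is_safe_filename("report_2006"), "\n";
print "../etc/passwd: ", is_safe_filename("../etc/passwd"), "\n";
print "empty string:  ", is_safe_filename(""), "\n";
print "unanchored:    $unanchored\n";
```

Since . is not a word character, a name such as report.txt would also be rejected by this check.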

Modifiers
- Modifiers that appear after the second / control aspects of the RE matching process:

    i   makes the RE case-insensitive
    x   ignores whitespace in the RE (for readability)
    m   treats string as a multiline string (so ^ and $ match at the start and end of each line)
    s   treats string as single line string (so . also matches \n)
    g   performs a global search (for repeated searches)

    $date =~ /\d{4} - \d{1,2} - \d{1,2}/x;
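
Each modifier can be tried in isolation. This is a small sketch with made-up input strings; the capturing parentheses used in two of the patterns are explained on the next slide.

```perl
use strict;
use warnings;

# i: case-insensitive match.
my $ci = ("LEBOWSKY" =~ /lebowsky/i) ? 1 : 0;

# x: whitespace in the pattern is ignored, so it can be spaced out.
my @ymd = ("2006-3-15" =~ / (\d{4}) - (\d{1,2}) - (\d{1,2}) /x);

# m and g: ^ and $ match at each line, and g finds every match.
my $text = "first\nsecond\n";
my @line_words = ($text =~ /^(\w+)$/mg);

# s: without it . does not match a newline, with it . does.
my $dot_default = ("a\nb" =~ /a.b/)  ? 1 : 0;
my $dot_single  = ("a\nb" =~ /a.b/s) ? 1 : 0;

print "ci=$ci ymd=@ymd words=@line_words ";
print "dot_default=$dot_default dot_single=$dot_single\n";
```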

The Return Value
- The =~ operator actually returns a value.
- In scalar context, =~ returns true or false (1 or the empty string), whether or not the RE contains parentheses.
- If parentheses appear in the RE and the match is evaluated in list context, =~ returns a list whose elements are the parenthesized parts of the expression that matched.

    $year = ($line =~ /\d{4}/);   # true or false, not the matched digits
    ($lastName, $firstName) = ($input =~ /\s*(\w+)\s*\,\s*(\w+)\s*$/);
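
The two kinds of return value can be compared side by side. In this sketch the input strings are invented; note that putting parentheses around the variable on the left-hand side is what places the match in list context.

```perl
use strict;
use warnings;

my $input = "Lebowsky, Jeffrey";

# Scalar context: only success or failure is reported.
my $matched = ($input =~ /\w+\s*,\s*\w+/) ? 1 : 0;

# List context with parentheses in the RE: the captured parts come back.
my ($last, $first) = ($input =~ /\s*(\w+)\s*,\s*(\w+)\s*$/);

# my ($year) forces list context, so $year receives the captured digits.
my ($year) = ("Filed in 2006" =~ /(\d{4})/);

# A failed match in list context returns an empty list.
my @none = ("no comma here" =~ /(\w+)\s*,\s*(\w+)/);

print "matched=$matched last=$last first=$first year=$year\n";
print "captures on failure: ", scalar(@none), "\n";
```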

Substitutions
- The RE mechanism also allows for substitutions using the s/// operator.
- Usage: s/pattern/replacement/
- The g modifier allows substitution of all occurrences of pattern with replacement (otherwise only the first is substituted).
- Note: These modify the string.

    $line =~ s/__TITLE__/My Document Title/;
    $line =~ s/__USERNAME__/$username/;
    $leet =~ s/[Ll]/1/g;
    $leet =~ s/[Ee]/3/g;
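
A brief sketch of substitution with and without the g modifier, using invented strings. It also relies on the fact that s/// returns the number of substitutions it made, which the slides do not mention.

```perl
use strict;
use warnings;

# Without g, only the first occurrence is replaced.
my $first_only = "a-b-c";
$first_only =~ s/-/+/;

# With g, every occurrence is replaced.
my $all = "a-b-c";
$all =~ s/-/+/g;

# s/// modifies the string in place and returns the number of replacements.
my $leet = "Hello Lebowsky";
my $count = ($leet =~ s/[Ll]/1/g);
$leet =~ s/[Ee]/3/g;

print "first_only=$first_only all=$all\n";
print "leet=$leet (replaced $count l's)\n";
```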

Summary
- Perl regular expressions allow us to:
    - Search for patterns in strings
    - Handle user input in a robust way
    - Substitute portions of strings with other strings
- The pattern is treated like a double-quoted string (with variable substitution and escape characters).
- The matched parts of the pattern can be returned as a list (when using parentheses in list context).
- The substitution operator is a great way to manipulate text.
- See Chapter 9 of Perl 5 Tutorial for more info.