Published by Victor Neal. Modified over 9 years ago.
1
Pattern Matching II
2
Greedy Matching
When dealing with quantifiers, Perl’s pattern matcher is greedy by default: each quantifier matches as much text as it can while still letting the overall pattern succeed. For example,
– $_ = "Bob sat next to the Bobcat and listened to the Bobolink"; /.*Bob/
– $_ = "Freddie's hot dogs"; /Fred+/
– $_ = "Freddie's hot dogs are really hot!"; /.*hot/
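A minimal sketch of the three greedy matches above, using the special variable $& to show how much text each pattern consumed. The variable names (@cases, @greedy) are illustrative and not from the slides.

```perl
use strict;
use warnings;

# Each pair holds a subject string and a pattern from the slide.
my @cases = (
    [ "Bob sat next to the Bobcat and listened to the Bobolink", qr/.*Bob/ ],
    [ "Freddie's hot dogs",                                      qr/Fred+/ ],
    [ "Freddie's hot dogs are really hot!",                      qr/.*hot/ ],
);

my @greedy;    # text matched by each pattern, in order
for my $case (@cases) {
    my ($text, $re) = @$case;
    if ($text =~ $re) {
        push @greedy, $&;    # $& holds the matched substring
        print "'$&'\n";
    }
}
```

The first pattern runs to the last "Bob" in the string, and the third runs to the last "hot", because .* gives back only as many characters as it must.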
3
Minimal Matching
Minimal (non-greedy) mode is specified by placing ? after the quantifier, so that the quantifier matches as little text as possible. For example,
– $_ = "Freddie's hot dogs"; /Fred+?/
– $_ = "Freddie's hot dogs are really hot!"; /.*?hot/
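The difference is easiest to see side by side. The sketch below runs the greedy and minimal forms of the same patterns against the slide's strings; the capture parentheses and variable names are added here for illustration only.

```perl
use strict;
use warnings;

my $short = "Freddie's hot dogs";
my $long  = "Freddie's hot dogs are really hot!";

# Parentheses are added only to capture what each pattern matched.
my ($greedy_d)    = $short =~ /(Fred+)/;     # takes both d's
my ($minimal_d)   = $short =~ /(Fred+?)/;    # takes one d
my ($greedy_hot)  = $long  =~ /(.*hot)/;     # runs to the last "hot"
my ($minimal_hot) = $long  =~ /(.*?hot)/;    # stops at the first "hot"

print "$greedy_d | $minimal_d\n";
print "$greedy_hot | $minimal_hot\n";
```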
4
Multiple Quantifiers
The leftmost quantifier is the greediest. For example,
– $_ = "Bob sat next to the Bobcat and listened to the Bobolink"; /Bob.*Bob.*link/
The first .* matches:
– " sat next to the Bobcat and listened to the "
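To check which quantifier gets what, each .* can be wrapped in capturing parentheses. This is a sketch with added captures; the names $first and $second are not from the slides.

```perl
use strict;
use warnings;

my $text = "Bob sat next to the Bobcat and listened to the Bobolink";

# Capture what each .* consumed.
my ($first, $second) = $text =~ /Bob(.*)Bob(.*)link/;

print "first:  '$first'\n";
print "second: '$second'\n";
```

The leftmost .* takes everything up to the final "Bob", leaving the second .* only the single "o" before "link".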
5
Anchors
More complicated patterns can be created with anchors. An anchor requires a pattern to match at a specific place in a string: it aligns a particular position in the pattern with a particular position in the string.
6
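(^) Anchor
(^) requires the pattern to match at the beginning of the string. For example,
– /^Shelley/ matches "Shelley has red hair" but not "What color is Shelley's hair?"
– /^[^!]^/
The meaning of (^) depends on the context: outside a character class it is an anchor, while as the first character inside [ ] it negates the class.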
7
($) Anchor
($) requires the pattern to match at the end of the string. For example,
– /hair$/ matches "Shelley has red hair" but not "What color is Shelley's hair?"
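A short sketch of both anchors applied to the two example strings. The grep-based filtering and the array names are illustrative choices, not part of the slides.

```perl
use strict;
use warnings;

my @strings = (
    "Shelley has red hair",
    "What color is Shelley's hair?",
);

# Keep only the strings where the anchored pattern matches.
my @starts_with = grep { /^Shelley/ } @strings;
my @ends_with   = grep { /hair$/ }    @strings;

print "starts with Shelley: $_\n" for @starts_with;
print "ends with hair:      $_\n" for @ends_with;
```

Only the first string passes either test: the second contains "Shelley" and "hair", but not at the anchored positions.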
8
(\b) Anchor
(\b) matches at a word boundary: the position between a word character and a non-word character. For example, /\bwear\b/
– matches "I wear shoes"
– does not match "Swimwear for sale."
– does not match "Molly wears green sweaters."
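The word-boundary example can be checked with a small filter. This is a sketch; the array names are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @phrases = (
    "I wear shoes",
    "Swimwear for sale.",
    "Molly wears green sweaters.",
);

# \b on both sides means "wear" must stand alone as a word.
my @whole_word = grep { /\bwear\b/ } @phrases;
# Without the anchors, any occurrence of the letters matches.
my @substring  = grep { /wear/ } @phrases;

print scalar(@whole_word), " whole-word match(es)\n";
print scalar(@substring),  " substring match(es)\n";
```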
9
Binding Operators
A pattern can be matched against any string with the binding operators (=~) and (!~). The left operand must evaluate to a string, and the return value is a Boolean. For example,
– $string =~ /[,;:]/
– $string !~ /[,;:]/
– if (<STDIN> =~ /^[Yy]/) { … }
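A minimal sketch of both binding operators. The sample values are made up, and $answer is a hypothetical stand-in for a line that a real program would read from <STDIN>.

```perl
use strict;
use warnings;

my $string = "red, green; blue";

# =~ is true when the pattern matches; !~ is its negation.
my $has_sep   = ($string =~ /[,;:]/) ? 1 : 0;
my $lacks_sep = ($string !~ /[,;:]/) ? 1 : 0;

# Hypothetical input; a real program might use: my $answer = <STDIN>;
my $answer = "Yes\n";
my $is_yes = ($answer =~ /^[Yy]/) ? 1 : 0;

print "has separator: $has_sep, lacks separator: $lacks_sep, yes: $is_yes\n";
```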
10
Pattern Modifiers
A pattern can be followed by a modifier. The modifier changes how:
– The pattern is interpreted.
– The pattern matcher works while using the pattern.
The most common modifiers are:
– i, m, s, o, x
11
(i) Modifier
The (i) modifier tells the pattern matcher to ignore case. For example, /apples/i matches
– "apples"
– "Apples"
– "APPLES"
– "ApPlEs"
12
(m) and (s) Modifiers
(m) treats a string as multiple lines:
– (^) matches just after any newline.
– ($) matches just before any newline.
(s) treats a string as a single line:
– (.) will also match newline characters.
If both (m) and (s) are specified:
– (.) matches any character.
– (^) and ($) match positions after and before a newline.
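The effect of the two modifiers can be seen on a two-line string. This is a sketch; the sample text and variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";

# With /m, ^ matches at the start of every line; without it, only at
# the start of the string.
my @with_m    = $text =~ /^(\w+)/mg;
my @without_m = $text =~ /^(\w+)/g;

# Without /s, . refuses to cross the newline; with /s it does.
my $dot_default = ($text =~ /first.*second/)  ? 1 : 0;
my $dot_single  = ($text =~ /first.*second/s) ? 1 : 0;

print "with /m:    @with_m\n";
print "without /m: @without_m\n";
print "default dot: $dot_default, /s dot: $dot_single\n";
```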
13
(o) Modifier
Patterns can include scalar variables:
– The variables are interpolated.
Patterns containing variables are recompiled every time they are used. This provides dynamic patterns, but it can be expensive.
Include the (o) modifier if the variable never changes.
– It tells Perl not to recompile the pattern.
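The sketch below illustrates what "not recompiled" means in practice: with (o), the pattern keeps the value the variable had the first time the match ran, even if the variable later changes. The loop and names are illustrative. Note also that current Perl versions already avoid recompiling when the interpolated value is unchanged, so (o) is rarely needed today.

```perl
use strict;
use warnings;

my (@without_o, @with_o);

for my $word (qw(cat dog)) {
    # Re-interpolated on each pass: tests "cat", then "dog".
    push @without_o, ("hotdog" =~ /$word/  ? 1 : 0);
    # Compiled once: keeps testing "cat" even after $word changes.
    push @with_o,    ("hotdog" =~ /$word/o ? 1 : 0);
}

print "without /o: @without_o\n";
print "with /o:    @with_o\n";
```

The second list never reports a match, which is why (o) is safe only when the variable truly never changes.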
14
(x) Modifier
(x) tells the pattern matcher to ignore whitespace in the pattern. For example, /\d+ \. \d+/x is equivalent to /\d+\.\d+/
It also allows comments to be included in patterns:
/\d+   # digits before the decimal point
 \.    # the decimal point
 \d+   # digits after the point
/x
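A commented pattern is often stored with qr// so it can be reused. The sketch below assumes a made-up input string; $decimal and $number are illustrative names.

```perl
use strict;
use warnings;

# The same decimal-number pattern, laid out with /x.
my $decimal = qr/
    \d+    # digits before the decimal point
    \.     # the decimal point
    \d+    # digits after the point
/x;

my ($number) = "pi is about 3.14159" =~ /($decimal)/;
print "$number\n";
```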
15
Remembering Matches
Sometimes a pattern needs to reference a part of the string it matched earlier.
This is done by parenthesizing the parts of interest.
They are referenced by implicitly defined variables
– e.g. \1, \2, \3, …
For example,
– /(\w+).*\1/ matches "jo likes joanne."
– /(.)\1/
– /(['"])(.*?)\1/
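The three patterns can be tried on small inputs. Only "jo likes joanne." comes from the slide; the other two strings are assumptions chosen to exercise the patterns.

```perl
use strict;
use warnings;

# A word that reappears later in the string.
my ($word) = "jo likes joanne." =~ /(\w+).*\1/;

# Any character immediately followed by itself.
my ($double) = "bookkeeper" =~ /(.)\1/;

# A quoted string whose closing quote matches its opening quote.
my ($quote, $inside) = q{say "hi" or 'bye'} =~ /(['"])(.*?)\1/;

print "word: $word, double: $double, quoted: $inside\n";
```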
16
References Outside a Pattern
Parts of a match are sometimes needed outside the pattern.
They can be referenced by implicit variables:
– e.g. $1, $2, $3, …
For example,
"VY ran for 267 yards Saturday" =~ /(\d+) (\w+) (\w+)/;
print "$1 $2 $3 \n";
17
Nested Parentheses
Patterns can have nested parentheses. Groups are numbered by counting opening parentheses from the left. For example:
$_ = "31 Oct 2005";
/((\d+) (\w+) (\d+))/;
print "$1 \n $2 $3 $4 \n";
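In list context a match returns its captured groups in numbering order, which makes the counting rule easy to inspect. A small sketch, with @groups as an illustrative name:

```perl
use strict;
use warnings;

my $date = "31 Oct 2005";

# Group 1 is the outer pair; groups 2 to 4 are the inner pairs, left to right.
my @groups = $date =~ /((\d+) (\w+) (\d+))/;

for my $i (0 .. $#groups) {
    printf "\$%d = '%s'\n", $i + 1, $groups[$i];
}
```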
18
Backreferences
\n and $n (where n is a group number) are called backreferences.
– They refer to the result of the previous match.
Perl also includes 3 implicit variables.
– $` – part before the match.
– $& – part that matched.
– $' – part after the match.
It is costly for the matcher to save these for every match.
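A brief sketch of the three implicit variables, reusing a string from an earlier slide. The copies into $before, $matched, and $after are an illustrative choice so the values survive later matches.

```perl
use strict;
use warnings;

my $text = "VY ran for 267 yards Saturday";
my ($before, $matched, $after);

if ($text =~ /\d+/) {
    # Copy the values right away; the next successful match overwrites them.
    ($before, $matched, $after) = ($`, $&, $');
}

print "before: '$before'\nmatch:  '$matched'\nafter:  '$after'\n";
```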
19
RegEx Extensions
Perl includes several extensions to the regular expression syntax of its earlier versions.
The general form is: (?xPattern), where x is a one- or two-character code.
20
Look Ahead
Sometimes a pattern should match only if it is (or is not) followed by a subpattern, without making the subpattern part of the match.
(?=) and (?!) provide this look-ahead behavior. For example,
– /\d+(?=\.)/
– /\d+(?!\.)/
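The sketch below applies both look-ahead patterns to a made-up string. The capture parentheses and variable names are added for illustration.

```perl
use strict;
use warnings;

my $text = "Price: 42.50 for 7 items";

# Digits that are followed by a dot; the dot is not part of the match.
my ($followed) = $text =~ /(\d+(?=\.))/;

# Digits not followed by a dot. The matcher backtracks from "42" to "4",
# because "4" is followed by "2" rather than by a dot.
my ($not_followed) = $text =~ /(\d+(?!\.))/;

print "followed by dot:     $followed\n";
print "not followed by dot: $not_followed\n";
```

The second result shows that a negative look-ahead after a quantifier can succeed on a shorter match than expected.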
21
Look Behind
Perl also allows look-behinds.
(?<=) and (?<!) provide this behavior. For example,
– /(?<=\.)\d+/
– /(?<!\.)\d+/
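A matching sketch for look-behind, assuming the intended digit class is \d and using a made-up input string:

```perl
use strict;
use warnings;

my $text = "3.14";

# Digits preceded by a dot; the dot is not part of the match.
my ($after_dot) = $text =~ /((?<=\.)\d+)/;

# Digits not preceded by a dot.
my ($no_dot_before) = $text =~ /((?<!\.)\d+)/;

print "after a dot:     $after_dot\n";
print "no dot in front: $no_dot_before\n";
```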
22
Substitution
It is often necessary to find a substring and replace it with another. Perl has a substitution operator for this.
The general form, where dl is the delimiter character, is:
– s dl Pattern dl New_string dl Modifiers
The common form is:
– s/Pattern/New_string/
The return value is the number of substitutions made.
23
Examples
Example 1:
$_ = "No more apples!";
s/apples/applets/;
Example 2:
$_ = "Who are Jack and Jill?";
s/(\w+) and (\w+)/$2 & $1/;
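The sketch below runs both examples on named variables and records the return values. Using braces as delimiters in the first substitution is an illustrative choice to show the general form; the variable names are assumptions.

```perl
use strict;
use warnings;

my $fruit = "No more apples!";
# Any delimiter works; braces are used here instead of slashes.
my $n_fruit = ($fruit =~ s{apples}{applets});

my $pair = "Who are Jack and Jill?";
my $n_pair = ($pair =~ s/(\w+) and (\w+)/$2 & $1/);

# A substitution that finds nothing returns a false value.
my $n_none = ($pair =~ s/Humpty/Dumpty/) ? 1 : 0;

print "$fruit ($n_fruit)\n$pair ($n_pair)\nno match: $n_none\n";
```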
24
Substitution with Modifiers
Modifiers can be used with the substitution operator.
i, o, m, s, and x have the same effect as in matching.
There are two common modifiers for substitutions:
– g: perform the substitution everywhere it applies.
– e: treat the replacement part as a Perl expression.
25
(g) and (e) Examples
Example 1:
$_ = "12034005";
s/0//g;
Example 2:
$_ = "Molly and Mary were cold.";
s/(\w+)/"$1"/g;
Example 3:
$_ = "Is it Sum, SUM, sum, or suM?";
s/sum/sum/ig;
Example 4:
s/(\w+)/uc($1)/e;
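A sketch of the four examples on named variables, so each result can be inspected. The input for Example 4 ("hello world") is made up, since the slide does not give one.

```perl
use strict;
use warnings;

my $digits = "12034005";
my $zeros  = ($digits =~ s/0//g);      # returns how many zeros were removed

my $names = "Molly and Mary were cold.";
$names =~ s/(\w+)/"$1"/g;              # put every word in double quotes

my $sums = "Is it Sum, SUM, sum, or suM?";
$sums =~ s/sum/sum/ig;                 # normalize every spelling to lower case

my $shout = "hello world";             # hypothetical input
$shout =~ s/(\w+)/uc($1)/e;            # no /g, so only the first word changes

print "$digits ($zeros removed)\n$names\n$sums\n$shout\n";
```

Adding g to the last substitution (s/(\w+)/uc($1)/ge) would upper-case every word.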