Published by Iris Harrell. Modified over 9 years ago.
1
CPTG286K Programming - Perl Chapter 7: Regular Expressions
2
Regular Expressions (aka regex)
Regular expressions are patterns that are matched against a string
Regular expressions are written between slashes
The outcome is either a successful match or a failure to match
Substitution and split operations act on the text a regex matches; join reassembles the pieces with a plain glue string
3
Simple Uses of regex
    while (<>)              # similar to: grep "abc" filename
    {
        if (/abc/)          # regex /abc/ is matched against $_
        {
            print;          # prints $_ if it contains abc
        }
    }
Replacing regex /abc/ with:
  – /ab*c/ matches an a, followed by 0 or more b's, followed by a c; same as /ab{0,}c/
  – /ab+c/ matches an a, followed by 1 or more b's, followed by a c; same as /ab{1,}c/
  – /ab?c/ matches an a, followed by 0 or 1 b's, followed by a c; same as /ab{0,1}c/
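The three quantified patterns above can be compared side by side. This is a minimal sketch: the sample strings and the names @samples and %hits are made up for illustration, and the patterns are interpolated from strings (a feature covered on a later slide).

```perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only for illustration.
my @samples = ('ac', 'abc', 'abbc', 'adc');

my %hits;
for my $pattern ('ab*c', 'ab+c', 'ab?c') {
    # Keep every sample string that the current pattern matches.
    $hits{$pattern} = [ grep { /$pattern/ } @samples ];
    print "/$pattern/ matches: @{ $hits{$pattern} }\n";
}
```

Only /ab*c/ and /ab?c/ accept "ac", because they are the two patterns that allow zero b's.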
4
Quantifiers
Symbol   Meaning
+        Match 1 or more times
*        Match 0 or more times
?        Match 0 or 1 time
{n}      Match exactly n times
{n,}     Match at least n times
{n,m}    Match at least n but not more than m times
5
Patterns
Single-character patterns
  – Character class
  – Negated character class
Grouping patterns
  – Parentheses
  – Multipliers
  – Sequence and anchoring
  – Alternation
6
Single-Character Patterns
Specific single-character match: /a/
Any non-newline character: /./
Character class: /[valid_list]/
  – /[0-9]/          # or \d, any single digit
  – /[a-zA-Z0-9_]/   # or \w, any single word character
  – /[ \r\t\n\f]/    # or \s, any single whitespace character
Negated class: /[^valid_list]/
  – /[^0-9]/         # or \D, any single non-digit
  – /[^a-zA-Z0-9_]/  # or \W, any single non-word character
  – /[^ \r\t\n\f]/   # or \S, any single non-whitespace character
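A short sketch of the shorthand classes in action. The sample string and variable names are assumptions for illustration, and the /g modifier (not covered in these slides) is used in list context only to collect every match.

```perl
use strict;
use warnings;

my $text = "Room 42\tis_open";   # hypothetical sample string

# In list context, /g returns every match of the pattern.
my @digits     = $text =~ /\d/g;   # single digits
my @non_word   = $text =~ /\W/g;   # characters outside [a-zA-Z0-9_]
my @whitespace = $text =~ /\s/g;   # whitespace characters

print "digits: @digits\n";
print "non-word count: ", scalar(@non_word), "\n";
print "whitespace count: ", scalar(@whitespace), "\n";
```

The underscore in "is_open" counts as a word character, so only the space and the tab show up in @non_word.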
7
Parentheses grouping
Parentheses “memorize” the text matched by the enclosed pattern, so it can be referenced later
A memorized match is referenced with a backslash followed by the group number
Examples:
    /(a)(b)c\2d\1/;    # matches abcbda
    /a(.*)b\1c/;       # matches aFREDbFREDc but
                       # does not match aXXbXXXc
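The backreference example can be checked directly. This sketch reuses the two strings from the slide; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $doubled = 'aFREDbFREDc';
my $uneven  = 'aXXbXXXc';

# \1 must repeat exactly what the first pair of parentheses matched.
my $first  = ($doubled =~ /a(.*)b\1c/) ? 1 : 0;
my $second = ($uneven  =~ /a(.*)b\1c/) ? 1 : 0;

print "aFREDbFREDc: ", ($first  ? "match" : "no match"), "\n";
print "aXXbXXXc: ",    ($second ? "match" : "no match"), "\n";
```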
8
Multiplier grouping
    /x{5}/                 # matches exactly 5 consecutive x's
    /x{5,10}/              # matches 5 to 10 x's
    /fo+ba?r*/             # matches f followed by one or more o's, a b,
                           # an optional a, and zero or more r's
    /fo{1,}ba{0,1}r{0,}/   # same as /fo+ba?r*/ using a general multiplier
By default, multipliers are greedy; appending ? makes a multiplier non-greedy:
    $_ = "Nuts sold here. Come here!";
    /N.*here/              # matches "Nuts sold here. Come here"
    /N.*?here/             # matches "Nuts sold here" (non-greedy)
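To see exactly how much text each form consumes, the match can be captured with parentheses. A minimal sketch using the slide's string; $greedy and $lazy are illustrative names.

```perl
use strict;
use warnings;

my $text = 'Nuts sold here. Come here!';

# Wrap each pattern in parentheses so the matched text is returned.
my ($greedy) = $text =~ /(N.*here)/;    # .* takes as much as possible
my ($lazy)   = $text =~ /(N.*?here)/;   # .*? takes as little as possible

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```

The greedy match runs to the last "here" in the string, while the non-greedy one stops at the first.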
9
Anchor grouping
\b requires a word boundary for a match
\B requires that there be NO word boundary for a match
^ matches the beginning of the string
$ matches the end of the string
Examples:
    /\bFred\b/;    # matches Fred, not Frederick or alFred
    /\bFred\B/;    # matches Frederick, not Fred Flintstone
    /^a/;          # matches strings beginning with a
    /c$/;          # matches strings ending in c (a final \n may follow)
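The word-boundary examples can be tried against a small list. This is a sketch; the list @names simply collects the strings mentioned on the slide.

```perl
use strict;
use warnings;

my @names = ('Fred', 'Frederick', 'alFred', 'Fred Flintstone');

# \b on both sides: Fred must stand alone as a word.
my @whole_word = grep { /\bFred\b/ } @names;

# \b before, \B after: Fred must start a longer word.
my @word_start = grep { /\bFred\B/ } @names;

print "whole word: ", join(', ', @whole_word), "\n";
print "word start: ", join(', ', @word_start), "\n";
```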
10
Alternatives grouping
    /al|bert|c/;           # matches al or bert or c
    /^x|y/;                # x at beginning of line,
                           # or y anywhere
    /^(x|y)/;              # either x or y at
                           # beginning of line
    /songbird|bluebird/;   # songbird or bluebird
    /(song|blue)bird/;     # same, using parentheses
    /(a|b)(c|d)/;          # ac, ad, bc, or bd
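The difference between /^x|y/ and /^(x|y)/ is easy to miss, so here is a small sketch. The word list is hypothetical and chosen so that the two patterns disagree.

```perl
use strict;
use warnings;

my @words = ('xylophone', 'yak', 'oxygen', 'box');   # hypothetical samples

# ^ applies only to the x alternative: x at the start, or y anywhere.
my @loose = grep { /^x|y/ } @words;

# Parentheses make ^ apply to both alternatives.
my @anchored = grep { /^(x|y)/ } @words;

print "loose:    @loose\n";
print "anchored: @anchored\n";
```

"oxygen" is the word that separates the two: it contains a y but starts with neither x nor y.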
11
Regex Grouping Precedence
Arranged from highest to lowest precedence:
Name                     Representation
Parentheses              ( ) (?: )
Multipliers              ? + * {m,n} ?? +? *? {m,n}?
Sequence and Anchoring   abc ^ $ \A \Z (?= ) (?! )
Alternation              |
Example:
    /a|b*/;        # interpreted as /a|(b*)/, not /(a|b)*/
    /a|(?:b*)/;    # same, but does not trigger memory
                   # to store into \1
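A sketch of how precedence changes what is matched. The test string 'abab' and the variable names are assumptions for illustration; the outer parentheses are added only to capture the matched text.

```perl
use strict;
use warnings;

# * binds tighter than |, so this reads as "a, or any number of b's".
my ($alternation_last) = 'abab' =~ /(a|b*)/;

# Explicit non-capturing grouping applies * to the whole alternation.
my ($grouped_first) = 'abab' =~ /((?:a|b)*)/;

print "a|b*     matched: $alternation_last\n";
print "(?:a|b)* matched: $grouped_first\n";
```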
12
The pattern binding =~ operator
Use =~ to bind a pattern to a scalar variable other than the default $_ variable
To match the regex against $name read from the keyboard:
    print "Proceed (y/Y)? ";      # produce prompt
    chomp($name = <STDIN>);       # chomp input
    if ($name =~ /^[yY]/)         # test both cases
    {
        print "Proceeding.";      # display decision
    }
13
Ignoring case & other delimiters
Append an i to the regex to ignore case:
    print "Proceed (y/Y)? ";      # produce prompt
    chomp($name = <STDIN>);       # chomp input
    if ($name =~ /^y/i)           # use either case
    {
        print "Proceeding.";      # display decision
    }
To use a different delimiter:
  – Place an m followed by the new delimiter character (e.g. a #) in place of the slashes
    print "Proceed (y/Y)? ";      # produce prompt
    chomp($name = <STDIN>);       # chomp input
    if ($name =~ m#^y#i)          # new # delimiter
    {
        print "Proceeding.";      # display decision
    }
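The binding operator, the i option, and a custom delimiter can be exercised without keyboard input. In this sketch the answers are hard-coded (a stand-in for what a user might type), and %proceed is an illustrative name.

```perl
use strict;
use warnings;

my %proceed;
for my $answer ('yes', 'Y', 'no') {      # hypothetical user answers
    # =~ binds the pattern to $answer instead of $_;
    # m#...# uses # as the delimiter and i ignores case.
    $proceed{$answer} = ($answer =~ m#^y#i) ? 1 : 0;
    print "$answer: ", ($proceed{$answer} ? "Proceeding." : "Stopping."), "\n";
}
```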
14
Variable Interpolation
A regex can be constructed from computed strings rather than literals:
    $sentence = "Every good bird does fly.";
    print "What should I look for? ";    # prompt
    $what = <STDIN>;                     # read keyboard
    chomp($what);                        # chomp input
    if ($sentence =~ /$what/)            # e.g. the input [bw]ird matches bird
    {
        print "I saw $what in $sentence.\n";
    }
    else
    {
        print "Nope... didn't find it.\n";
    }
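Because the interpolated string is treated as a pattern, any regex characters in it are active. The sketch below hard-codes the input [bw]ird instead of reading it from the keyboard, and contrasts it with \Q...\E, a standard Perl escape that is not covered in these slides and that makes the interpolated text literal.

```perl
use strict;
use warnings;

my $sentence = 'Every good bird does fly.';
my $what     = '[bw]ird';    # stand-in for what the user typed

# Interpolated as a pattern: [bw] is a character class, so bird matches.
my $as_pattern = ($sentence =~ /$what/) ? 1 : 0;

# \Q...\E quotes the metacharacters, so the literal text [bw]ird is required.
my $as_literal = ($sentence =~ /\Q$what\E/) ? 1 : 0;

print "as pattern: $as_pattern\n";
print "as literal: $as_literal\n";
```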
15
Special Read-only Variables
Upon a successful pattern match, $1, $2, $3… are set to the same values as \1, \2, \3…
These read-only variables can be used in later parts of the program:
    $_ = "This is a test";
    /(\w+)\W+(\w+)/;       # match first two words
                           # $1 is now "This" and
                           # $2 is now "is"
    ($first,$second) = /(\w+)\W+(\w+)/;
                           # $first is now "This" and $second is now "is"
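Both ways of retrieving the memorized values can be sketched as follows. The variable names $one, $two, $first, and $second are illustrative; the match is tested before $1 and $2 are read, since they are only meaningful after a successful match.

```perl
use strict;
use warnings;

$_ = 'This is a test';

# Option 1: read $1 and $2 after checking that the match succeeded.
my ($one, $two);
if (/(\w+)\W+(\w+)/) {
    ($one, $two) = ($1, $2);
}

# Option 2: in list context the match returns the memorized values directly.
my ($first, $second) = /(\w+)\W+(\w+)/;

print "one=$one two=$two first=$first second=$second\n";
```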
16
More Read-only Variables
$& is the part of the string that matched the regex
$` is the part of the string before the matching part
$' is the part of the string after the matching part
    $_ = "This is a sample string";
    /sa.*le/;      # matches "sample"
                   # $` is now "This is a "
                   # $& is now "sample"
                   # $' is now " string"
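The three variables can be copied into ordinary variables right after a successful match. A minimal sketch with the slide's string; $before, $matched, and $after are illustrative names.

```perl
use strict;
use warnings;

my $text = 'This is a sample string';

my ($before, $matched, $after);
if ($text =~ /sa.*le/) {
    # Copy the special variables immediately; a later match would overwrite them.
    ($before, $matched, $after) = ($`, $&, $');
}

print "[$before][$matched][$after]\n";
```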
17
Substitutions
Use the substitution operator: s/regex/new-string/
The replacement string is variable interpolated
The regex can use pattern characters, and the replacement string can use the special read-only variables
The ignore-case option and custom delimiters can be used
The pattern binding =~ operator can be used
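Each point above can be sketched in one line. All strings and variable names here are hypothetical; the (my $copy = $original) =~ s/.../.../ idiom is used only so that the original string stays unchanged.

```perl
use strict;
use warnings;

my $text = 'Hello, World';

# Read-only variables in the replacement: swap the two words.
(my $swapped = $text) =~ s/(\w+), (\w+)/$2, $1/;

# Variable interpolation in the replacement, plus the ignore-case option.
my $new = 'Perl';
(my $greeting = $text) =~ s/world/$new/i;

# Custom delimiters avoid escaping the slashes in a path.
my $path = '/usr/bin/perl';
(my $local = $path) =~ s#/usr/bin#/usr/local/bin#;

print "$swapped\n$greeting\n$local\n";
```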
18
Split Function
The split function splits a string into fields delimited by a regex
    $line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
    @fields = split(/:/,$line);    # split $line using
                                   # : as delimiter
    # @fields is now
    # ("merlyn", "", "118", "10", "Randal", "/home/merlyn",
    # "/usr/bin/perl")
19
Splitting in list context
    $line = "merlyn::118:10:Randal:/home/merlyn:";
    ($name,$password,$uid,$gid,$gcos,$home,$shell) =
        split(/:/,$line);          # split $line using : as delimiter
    # $name is now "merlyn",
    # $password is now "",
    # $uid is now "118",
    # $gid is now "10",
    # $gcos is now "Randal",
    # $home is now "/home/merlyn",
    # $shell is now undef
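A related sketch, using a deliberately shortened line rather than the slide's: when the line holds fewer fields than there are variables, the leftover variables receive undef. The six-field line below is a hypothetical variation with no trailing colon.

```perl
use strict;
use warnings;

# Hypothetical line with only six fields (no shell field, no trailing colon).
my $line = 'merlyn::118:10:Randal:/home/merlyn';

my ($name, $password, $uid, $gid, $gcos, $home, $shell) = split(/:/, $line);

print "name=$name home=$home\n";
print "password is ", ($password eq '' ? "empty" : "set"), "\n";
print "shell is ", (defined $shell ? "defined" : "undef"), "\n";
```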
20
The “Default” Split
    $_ = "some string";
    @words = split;    # like @words = split(/\s+/, $_); except that
                       # leading whitespace in $_ is skipped
                       # \s+ specifies 1 or more whitespace characters
                       # @words is now ("some","string")
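The default split and an explicit split on /\s+/ differ only when the string starts with whitespace. This sketch uses a hypothetical padded string to show the difference; @default and @explicit are illustrative names.

```perl
use strict;
use warnings;

$_ = '  some   string ';    # hypothetical input with leading whitespace

my @default  = split;             # default split: leading whitespace is skipped
my @explicit = split(/\s+/, $_);  # explicit regex: leading empty field is kept

print scalar(@default),  " fields from the default split\n";
print scalar(@explicit), " fields from split(/\\s+/, \$_)\n";
```

The explicit form returns an empty first field because the string begins with a delimiter; trailing empty fields are dropped in both cases.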
21
Join Function
The join function joins a list of values with a glue string between list elements
The $line can be reconstructed from @fields using
    $line = join(":", @fields);    # glue string ":"
                                   # is not a regex
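Split and join can be checked as a round trip. A minimal sketch using the password-style line from the Split Function slide; $rebuilt is an illustrative name.

```perl
use strict;
use warnings;

my $line = 'merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl';

my @fields  = split(/:/, $line);    # the delimiter is a regex
my $rebuilt = join(':', @fields);   # the glue is a plain string

print scalar(@fields), " fields\n";
print "round trip ", ($rebuilt eq $line ? "succeeded" : "failed"), "\n";
```

The empty password field survives the round trip because split keeps empty fields that are not at the end of the string.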
22
Glue Ahead & Trailing Glue
    $_ = "some string";                # initialize default string
    @words = split;                    # perform default split
    print "@words\n";                  # show split result
    $result = join("+","",@words);     # glue ahead
    print "$result\n";                 # $result is "+some+string"
    $output = join("\n", @words, "");  # trailing glue
    print "$output\n";                 # $output is "some\nstring\n"