CPTG286K Programming - Perl Chapter 7: Regular Expressions.


1 CPTG286K Programming - Perl Chapter 7: Regular Expressions

2 Regular Expressions (aka regex)
Regular expressions are patterns that are matched against a string.
A regular expression is normally written between slashes.
The outcome is either a successful match or a failure to match.
Substitution and split operations are driven by regex matches; join reassembles the pieces that split produces.

3 Simple Uses of regex
    while (<>)              # similar to: grep "abc" filename
    {
        if (/abc/)          # regex /abc/ is matched against $_
        {
            print;          # prints $_ if it contains abc
        }
    }
Replacing regex /abc/ with:
– /ab*c/ matches an a, followed by 0 or more b's, followed by a c; same as /ab{0,}c/
– /ab+c/ matches an a, followed by 1 or more b's, followed by a c; same as /ab{1,}c/
– /ab?c/ matches an a, followed by 0 or 1 b's, followed by a c; same as /ab{0,1}c/

4 Quantifiers
Symbol   Meaning
+        Match 1 or more times
*        Match 0 or more times
?        Match 0 or 1 time
{n}      Match exactly n times
{n,}     Match at least n times
{n,m}    Match at least n but not more than m times
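The quantifier table can be checked with a minimal sketch like the one below. The sample strings are hypothetical, and the ^ and $ anchors (covered on a later slide) are added only so that each pattern has to account for the whole string.

```perl
use strict;
use warnings;

# Hypothetical sample strings; each pattern is tried against all of them.
my @samples = ('ac', 'abc', 'abbc', 'abbbc');

my %matched;
for my $s (@samples) {
    push @{ $matched{'ab*c'} },     $s if $s =~ /^ab*c$/;       # 0 or more b's
    push @{ $matched{'ab+c'} },     $s if $s =~ /^ab+c$/;       # 1 or more b's
    push @{ $matched{'ab?c'} },     $s if $s =~ /^ab?c$/;       # 0 or 1 b
    push @{ $matched{'ab{2}c'} },   $s if $s =~ /^ab{2}c$/;     # exactly 2 b's
    push @{ $matched{'ab{2,}c'} },  $s if $s =~ /^ab{2,}c$/;    # at least 2 b's
    push @{ $matched{'ab{1,2}c'} }, $s if $s =~ /^ab{1,2}c$/;   # 1 to 2 b's
}

# Print which samples each pattern accepted.
for my $pattern (sort keys %matched) {
    print "$pattern matches: @{ $matched{$pattern} }\n";
}
```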

5 Patterns
Single-character patterns
– Character class
– Negated character class
Grouping patterns
– Parentheses
– Multipliers
– Sequence and anchoring
– Alternation

6 Single-Character Patterns
Specific single-character match: /a/
Any non-newline character: /./
Character class: /[valid_list]/
– /[0-9]/             # or \d, any single digit
– /[a-zA-Z0-9_]/      # or \w, any single word character
– /[ \r\t\n\f]/       # or \s, any single whitespace character
Negated class: /[^valid_list]/
– /[^0-9]/            # or \D, any single non-digit
– /[^a-zA-Z0-9_]/     # or \W, any single non-word character
– /[^ \r\t\n\f]/      # or \S, any single non-whitespace character
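As a rough illustration of the classes listed above, the sketch below tests a few hypothetical single characters against the explicit bracket forms and records which shorthand class each one belongs to. The labels and sample characters are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample characters, keyed by a printable label.
my %sample = (
    digit      => '7',
    letter     => 'x',
    underscore => '_',
    space      => ' ',
    tab        => "\t",
    bang       => '!',
);

my %classes_for;
for my $label (sort keys %sample) {
    my $ch = $sample{$label};
    my @hits;
    push @hits, '\d' if $ch =~ /[0-9]/;          # digit class
    push @hits, '\w' if $ch =~ /[a-zA-Z0-9_]/;   # word-character class
    push @hits, '\s' if $ch =~ /[ \r\t\n\f]/;    # whitespace class
    push @hits, '\D \W \S' unless @hits;         # only the negated classes match
    $classes_for{$label} = "@hits";
    print "$label: $classes_for{$label}\n";
}
```

Note that a digit is also a word character, so it appears under both \d and \w.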

7 Parentheses grouping
This grouping is used to "memorize" the text matched by part of a pattern, so it can be referenced later.
A memorized match is referenced using a backslash followed by the number of the parenthesis group.
Examples:
    /(a)(b)c\2d\1/;     # matches abcbda
    /a(.*)b\1c/;        # matches aFREDbFREDc but
                        # does not match aXXbXXXc
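The two backreference patterns from this slide can be tried against its own sample strings with a small sketch such as this one. The hash layout and key names are assumptions for the example.

```perl
use strict;
use warnings;

# Record, for each sample string, whether each pattern matches.
my %result;
for my $s ('aFREDbFREDc', 'aXXbXXXc', 'abcbda') {
    # \1 must repeat exactly what (.*) captured
    $result{$s}{repeat}   = ($s =~ /a(.*)b\1c/)     ? 1 : 0;
    # \2 repeats the second group (b), \1 the first group (a)
    $result{$s}{numbered} = ($s =~ /(a)(b)c\2d\1/)  ? 1 : 0;
    print "$s: repeat=$result{$s}{repeat} numbered=$result{$s}{numbered}\n";
}
```

One detail worth noticing: /a(.*)b\1c/ also matches abcbda, because (.*) may capture the empty string, which leaves the pattern matching the leading abc.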

8 Multiplier grouping
    /x{5}/                  # matches exactly 5 x's
    /x{5,10}/               # matches 5 to 10 x's
    /fo+ba?r*/              # matches f followed by one or more o's, a b,
                            # an optional a, and zero or more r's
    /fo{1,}ba{0,1}r{0,}/    # same as /fo+ba?r*/ using the general multiplier
By default, multipliers such as * and + are greedy:
    $_ = "Nuts sold here. Come here!";
    /N.*here/               # matches "Nuts sold here. Come here" (greedy)
    /N.*?here/              # matches "Nuts sold here" (non-greedy)
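To see exactly which text the greedy and non-greedy forms consume, a sketch like the following wraps each pattern in parentheses and captures the match in list context (a technique described on a later slide). The variable names are assumptions for the example.

```perl
use strict;
use warnings;

my $text = "Nuts sold here. Come here!";

# .* takes as much as it can while still letting "here" match afterwards
my ($greedy) = $text =~ /(N.*here)/;

# .*? takes as little as it can, so the match stops at the first "here"
my ($lazy) = $text =~ /(N.*?here)/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
```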

9 Anchor grouping
\b requires a word boundary for a match
\B requires NO word boundary for a match
^ matches the beginning of the string
$ matches the end of the string
Examples:
    /\bFred\b/;     # matches Fred, not Frederick or alFred
    /\bFred\B/;     # matches Frederick, not Fred Flintstone
    /^a/;           # matches strings beginning with a
    /c$/;           # matches strings ending in c (before \n)
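The anchor examples above can be exercised with a short sketch along these lines. The list of names comes from the slide; the array names and the extra test strings are assumptions.

```perl
use strict;
use warnings;

my @names = ('Fred', 'Frederick', 'alFred', 'Fred Flintstone');

my (@whole_word, @word_prefix);
for my $name (@names) {
    push @whole_word,  $name if $name =~ /\bFred\b/;   # Fred as a complete word
    push @word_prefix, $name if $name =~ /\bFred\B/;   # Fred starting a longer word
}

# $ matches at the very end, or just before a trailing newline
my $ends_in_c     = ("abc\n" =~ /c$/) ? 1 : 0;
# ^ only matches at the beginning, so the a inside "bab" does not count
my $starts_with_a = ("bab"   =~ /^a/) ? 1 : 0;

print "whole word:  @whole_word\n";
print "word prefix: @word_prefix\n";
print "ends in c: $ends_in_c, starts with a: $starts_with_a\n";
```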

10 Alternatives grouping
    /al|bert|c/;            # matches al or bert or c
    /^x|y/;                 # x at beginning of line,
                            # or y anywhere
    /^(x|y)/;               # either x or y at
                            # beginning of line
    /songbird|bluebird/;    # songbird or bluebird
    /(song|blue)bird/;      # same, using parentheses
    /(a|b)(c|d)/;           # ac, ad, bc, or bd
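The difference between /^x|y/ and /^(x|y)/ is easy to misread, so here is a minimal sketch comparing them on three hypothetical strings, followed by the (song|blue)bird pattern used with a capture.

```perl
use strict;
use warnings;

my (%loose, %grouped);
for my $s ('xa', 'ay', 'ab') {
    $loose{$s}   = ($s =~ /^x|y/)   ? 1 : 0;   # x at start, OR y anywhere
    $grouped{$s} = ($s =~ /^(x|y)/) ? 1 : 0;   # x or y, but only at start
    print "$s: loose=$loose{$s} grouped=$grouped{$s}\n";
}

# The parentheses also memorize which alternative matched.
my ($kind) = 'a bluebird sang' =~ /(song|blue)bird/;
print "kind of bird: $kind\n";
```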

11 Regex Grouping Precedence
Arranged from highest to lowest precedence:
Name                     Representation
Parentheses              ( ) (?: )
Multipliers              ? + * {m,n} ?? +? *? {m,n}?
Sequence and anchoring   abc ^ $ \A \Z (?= ) (?! )
Alternation              |
Example:
    /a|b*/;         # interpreted as /a|(b*)/, not /(a|b)*/
    /a|(?:b*)/;     # same, but does not trigger memory
                    # to store into \1
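A rough way to confirm that * binds more tightly than | is to compare the two readings on the same inputs. In this sketch the sample strings are hypothetical, and each pattern is wrapped in ^(?: )$ so it must match the whole string without creating an extra memory group.

```perl
use strict;
use warnings;

my (%as_written, %star_on_group);
for my $s ('bbb', 'ab', 'aa', '') {
    # a|b* : either a single a, or a (possibly empty) run of b's
    $as_written{$s}    = ($s =~ /^(?:a|b*)$/)  ? 1 : 0;
    # (a|b)* : any mix of a's and b's, including none
    $star_on_group{$s} = ($s =~ /^(?:a|b)*$/)  ? 1 : 0;
    print "'$s': a|b* => $as_written{$s}, (a|b)* => $star_on_group{$s}\n";
}
```

The strings ab and aa separate the two readings: they are accepted only when the star applies to the whole group.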

12 The pattern binding =~ operator
Use the =~ operator to bind a pattern to a scalar variable other than the default $_ variable.
To match the regex against $name read from the keyboard:
    print "Proceed (y/Y)? ";        # produce prompt
    chomp ($name = <STDIN>);        # read and chomp input
    if ($name =~ /^[yY]/)           # test both cases
    {
        print "Proceeding.";        # display decision
    }

13 Ignoring case & other delimiters
Append an i to the regex to ignore case:
    print "Proceed (y/Y)? ";        # produce prompt
    chomp ($name = <STDIN>);        # read and chomp input
    if ($name =~ /^y/i)             # accept either case
    {
        print "Proceeding.";        # display decision
    }
To use a different delimiter:
– Place an m followed by the new delimiter character (e.g. a #) in place of the slashes
    print "Proceed (y/Y)? ";        # produce prompt
    chomp ($name = <STDIN>);        # read and chomp input
    if ($name =~ m#^y#i)            # new # delimiter
    {
        print "Proceeding.";        # display decision
    }
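For experimenting without keyboard input, the same tests can be wrapped in small subroutines. This is only a sketch: is_yes and in_usr_bin are hypothetical helper names, and the path check is an assumed example of where a non-slash delimiter is convenient.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the answer starts with y or Y.
sub is_yes {
    my ($answer) = @_;
    return $answer =~ /^y/i ? 1 : 0;
}

# Hypothetical helper: with # as the delimiter, the slashes in the
# path need no backslash escaping.
sub in_usr_bin {
    my ($path) = @_;
    return $path =~ m#^/usr/bin/# ? 1 : 0;
}

for my $answer ('y', 'Yes', 'no') {
    print "$answer => ", is_yes($answer), "\n";
}
print "/usr/bin/perl => ", in_usr_bin('/usr/bin/perl'), "\n";
```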

14 Variable Interpolation
A regex can be constructed from computed strings rather than literals:
    $sentence = "Every good bird does fly.";
    print "What should I look for? ";   # prompt
    $what = <STDIN>;                    # read keyboard
    chomp($what);                       # chomp input
    if ($sentence =~ /$what/)           # e.g. typing [bw]ird finds bird
    {
        print "I saw $what in $sentence.\n";
    }
    else
    {
        print "Nope... didn't find it.\n";
    }
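Because the interpolated string is treated as a pattern, characters such as [ ] and . keep their special meaning. The sketch below contrasts that with Perl's \Q...\E escape, which is not covered on the slide and is shown here only as one possible way to match the input literally. The fixed value of $what stands in for keyboard input.

```perl
use strict;
use warnings;

my $sentence = "Every good bird does fly.";
my $what     = '[bw]ird';    # stands in for what the user typed

# Interpolated as a pattern: [bw] is a character class, so "bird" matches.
my $as_pattern = ($sentence =~ /$what/)      ? 1 : 0;

# \Q...\E quotes the metacharacters, so the literal text "[bw]ird" is
# searched for, and it does not occur in the sentence.
my $as_literal = ($sentence =~ /\Q$what\E/)  ? 1 : 0;

print "as pattern: $as_pattern\n";
print "as literal: $as_literal\n";
```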

15 Special Read-only Variables
Upon a successful pattern match, $1, $2, $3, ... are set to the same values as \1, \2, \3, ...
These read-only variables can be used in later parts of the program:
    $_ = "This is a test";
    /(\w+)\W+(\w+)/;        # match first two words
                            # $1 is now "This" and
                            # $2 is now "is"
    ($first,$second) = /(\w+)\W+(\w+)/;
                            # $first is now "This" and $second is now "is"
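A minimal sketch of both ways to pick up the memorized values, using a hypothetical "name=number" line. It checks that the match succeeded before reading $1 and $2, and shows that a failed match in list context yields an empty list.

```perl
use strict;
use warnings;

my $line = "width=640";      # hypothetical input line

# Style 1: test the match, then copy $1 and $2.
my ($key, $value);
if ($line =~ /(\w+)=(\d+)/) {
    ($key, $value) = ($1, $2);
}

# Style 2: a match in list context returns the memorized values directly.
my ($key2, $value2) = $line =~ /(\w+)=(\d+)/;

# A failed match returns an empty list, leaving both variables undefined.
my ($key3, $value3) = "no pair here" =~ /(\w+)=(\d+)/;

print "$key -> $value\n";
print "$key2 -> $value2\n";
print "third match failed\n" unless defined $key3;
```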

16 More Read-only Variables
Use the $& variable to examine the part of the string that matched a regex
$` is the part of the string before the matching part
$' is the part of the string after the matching part
    $_ = "This is a sample string";
    /sa.*le/;       # matches "sample"
                    # $` is now "This is a "
                    # $& is now "sample"
                    # $' is now " string"

17 Substitutions
Use the substitution operator: s/regex/new-string/
The replacement string is variable-interpolated
The regex can use pattern characters, and the replacement can use the special read-only variables
The ignore-case option and custom delimiters can be used
The pattern binding =~ operator can be used
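The slide lists these features without examples, so here is a short sketch touching each one. All strings and variable names are hypothetical.

```perl
use strict;
use warnings;

# Plain substitution, bound with =~ ; only the first match is replaced.
my $first_only = "hello world, hello perl";
$first_only =~ s/hello/goodbye/;

# Interpolated replacement string plus the ignore-case option.
my $replacement  = "Perl";
my $interpolated = "Hello World";
$interpolated =~ s/world/$replacement/i;

# Memorized values from the regex used in the replacement.
my $swapped = "first second";
$swapped =~ s/(\w+) (\w+)/$2 $1/;

# Custom delimiter, handy when the pattern contains slashes.
my $path = "/usr/bin/perl";
$path =~ s#^/usr/bin#/usr/local/bin#;

print "$first_only\n$interpolated\n$swapped\n$path\n";
```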

18 Split Function
The split function splits a string into fields delimited by a regex
    $line = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
    @fields = split(/:/,$line);     # split $line using
                                    # : as delimiter
    # @fields is now
    # ("merlyn", "", "118", "10", "Randal", "/home/merlyn",
    # "/usr/bin/perl")

19 Splitting in list context
    $line = "merlyn::118:10:Randal:/home/merlyn:";
    ($name,$password,$uid,$gid,$gcos,$home,$shell) =
        split(/:/,$line);           # split $line using : as delimiter
    # $name is now "merlyn",
    # $password is now "",
    # $uid is now "118",
    # $gid is now "10",
    # $gcos is now "Randal",
    # $home is now "/home/merlyn",
    # $shell is now undef
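$shell ends up undef because split drops trailing empty fields by default, while the empty password field in the middle is kept. The sketch below shows this, along with split's optional third argument (a limit, not covered on the slide), where a negative value keeps the trailing empty fields.

```perl
use strict;
use warnings;

my $line = "merlyn::118:10:Randal:/home/merlyn:";

# Default: the empty field after the final colon is dropped.
my @default  = split(/:/, $line);

# A negative limit keeps trailing empty fields.
my @keep_all = split(/:/, $line, -1);

print scalar(@default),  " fields by default\n";
print scalar(@keep_all), " fields with a limit of -1\n";
```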

20 The "Default" Split
    $_ = "some string";
    @words = split;     # like @words = split(/\s+/, $_);
                        # where \s+ specifies 1 or more whitespace characters,
                        # except that leading whitespace is skipped
    # @words is now ("some","string")
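The default split behaves like split(' ', $_), which differs from split(/\s+/, $_) only when the string starts with whitespace. A small sketch with a hypothetical padded string makes the difference visible.

```perl
use strict;
use warnings;

my $text = "  some   string ";    # hypothetical input with extra whitespace

# A single-space string as the separator: leading whitespace is skipped.
my @by_default = split ' ', $text;

# An explicit \s+ pattern: the leading whitespace yields an empty first field.
my @by_regex = split /\s+/, $text;

# A bare split works on $_ the same way as split ' ', $_.
$_ = $text;
my @implicit = split;

print scalar(@by_default), " words by default\n";
print scalar(@by_regex),   " fields with /\\s+/\n";
```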

21 Join Function
The join function joins a list of values with a glue string between list elements
The $line can be reconstructed from @fields using
    $line = join(":", @fields);     # glue string ":"
                                    # is not a regex

22 Glue Ahead & Trailing Glue
    $_ = "some string";                 # initialize default string
    @words = split;                     # perform default split
    print "@words\n";                   # show split result
    $result = join("+", "", @words);    # glue ahead
    print "$result\n";                  # $result is "+some+string"
    $output = join("\n", @words, "");   # trailing glue
    print $output;                      # $output is "some\nstring\n"
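Putting split and join together, a sketch of the round trip might look like this. The password-file line is taken from the earlier split slide; the other variable names are assumptions.

```perl
use strict;
use warnings;

my $line   = "merlyn::118:10:Randal:/home/merlyn:/usr/bin/perl";
my @fields = split(/:/, $line);

# Joining with the same separator rebuilds the original line.
my $rebuilt = join(":", @fields);

my @words    = ('some', 'string');
my $leading  = join("+",  "", @words);    # empty first element: glue ahead
my $trailing = join("\n", @words, "");    # empty last element: trailing glue

print "round trip ok\n" if $rebuilt eq $line;
print "$leading\n";
print $trailing;
```

The round trip is exact here because the line has no trailing empty fields, which a default split would otherwise drop.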

