REGULAR EXPRESSION IN PERL (PART 1), Thach Nguyen


1 REGULAR EXPRESSION IN PERL (PART 1) Thach Nguyen

2 OBJECTIVE
- What is Regular Expression?
- How to use Regular Expression in Perl
  - Basic tools
    - Simple word matching
    - Using character classes
    - Matching this or that
    - …
  - Power tools

3 WHAT IS REGULAR EXPRESSION (REGEX, REGEXP)?
- A big factor behind the fame of Perl
- A string that describes a pattern
- Examples of patterns:
  - Search engines finding webpages (Google)
  - Listing files in a directory (ls *.txt, dir *.*)
  - Searching, extracting parts of strings, search and replace (Microsoft Word)
- An efficient, flexible way to manipulate text
- Not as difficult to understand as its reputation suggests
  - Constructed using simple concepts (conditionals, loops)
  - Once you get used to the terse notation, you're good to go

4 HOW TO USE REGEX
- Part 1: basics (solve about 98% of your needs)
  - Simple word matching
  - Using character classes
  - Matching this or that
- Part 2: power tools (for the rest)
  - Advanced regex operators
  - Latest innovations

5 PART 1: THE BASICS
Simple word matching
- The simplest regex: a word, a string of characters
- It matches any string that contains that word
- Eg: Result: It matches
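
A minimal sketch of the simplest case. The input string "Hello World" and the variable names are illustrative assumptions, not taken from the slide:

```perl
use strict;
use warnings;

# A regex that is just a word matches any string containing that word.
my $text    = "Hello World";                # illustrative input
my $found   = $text =~ /World/ ? 1 : 0;     # 1: "World" occurs in $text
my $missing = $text =~ /world/ ? 1 : 0;     # 0: matching is case-sensitive by default

print "found=$found missing=$missing\n";
```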

6 PART 1: THE BASICS
Simple word matching
- Operators
  - =~ : returns true if the regex matches
  - !~ : returns true if the regex doesn't match
  - / … / : delimiters enclosing the string, or the variable holding the string, to search for
- Eg:
  $greeting = "World";
  if ("Hello World" =~ /$greeting/) { … }
- Other arbitrary delimiters:
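
The operators and delimiters on this slide might be exercised as follows. This is a sketch: the alternative delimiters shown (m!...! and m{...}) and the path string are illustrative choices, since the slide's own delimiter examples are not in the transcript:

```perl
use strict;
use warnings;

my $greeting = "World";                              # pattern held in a variable
my $hit  = "Hello World" =~ /$greeting/ ? 1 : 0;     # 1: =~ is true on a match
my $miss = "Hello World" !~ /$greeting/ ? 1 : 0;     # 0: !~ is true only when nothing matches

# With an explicit m, other delimiters can replace the slashes.
my $bang  = "Hello World" =~ m!World! ? 1 : 0;       # 1
my $brace = "Hello World" =~ m{World} ? 1 : 0;       # 1

# A different delimiter avoids escaping slashes inside the pattern.
my $path  = "/usr/bin/perl" =~ m{/bin/} ? 1 : 0;     # 1

print "$hit $miss $bang $brace $path\n";
```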

7 PART 1: THE BASICS
Simple word matching – Additional
- A match can use the default variable $_; then omit the "$_ =~" part
  Eg:
  $_ = "Hello World";
  if (/World/) { … }
- If the regex matches in more than one place, the earliest point is matched
  Eg: "Hello World" =~ /o/; # matches 'o' in 'Hello'
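
Both points can be checked with a short sketch. The list contents are made up for illustration, and the built-in array @- is used here only to show where the match started:

```perl
use strict;
use warnings;

my @lines = ("Hello World", "Goodbye");    # illustrative data
my @matched;
for (@lines) {                             # each element is aliased to $_
    push @matched, $_ if /World/;          # same as: $_ =~ /World/
}

# The leftmost possible match wins; $-[0] is the start offset of the last match.
my $pos = "Hello World" =~ /o/ ? $-[0] : -1;   # 4: the 'o' in "Hello", not in "World"

print "matched: @matched; first 'o' at offset $pos\n";
```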

8 PART 1: THE BASICS
Simple word matching – Special characters
- Metacharacters: {}[]()^$.|*+?\
  - Use a backslash \ to match one of them literally
- Escape sequences
  - ASCII characters (\n, \t, etc.), arbitrary bytes (octal, hexadecimal)
- Variables: substituted before matching
  Eg:
  $foo = 'house';
  'cathouse' =~ /cat$foo/; # matches
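
A sketch of the three ideas on this slide: escaping a metacharacter, escape sequences, and variable interpolation. The test strings are illustrative, and the \Q...\E quoting at the end is an addition not mentioned on the slide:

```perl
use strict;
use warnings;

# '+' is a metacharacter, so it needs a backslash to match literally.
my $escaped   = "2+2=4" =~ /2\+2/ ? 1 : 0;       # 1
# Unescaped, /2+2/ means "one or more 2s, then a 2", which "2+2=4" lacks.
my $unescaped = "2+2=4" =~ /2+2/  ? 1 : 0;       # 0

# Escape sequences: \t is a tab, \x{41} is 'A' given in hexadecimal.
my $tab = "a\tb" =~ /a\tb/     ? 1 : 0;          # 1
my $hex = "ABC"  =~ /\x{41}BC/ ? 1 : 0;          # 1

# Variables are interpolated before matching.
my $foo    = 'house';
my $interp = 'cathouse' =~ /cat$foo/ ? 1 : 0;    # 1

# \Q...\E escapes metacharacters inside an interpolated variable.
my $literal = 'a.c';
my $quoted  = "abc" =~ /\Q$literal\E/ ? 1 : 0;   # 0: the '.' is now a literal dot

print "$escaped $unescaped $tab $hex $interp $quoted\n";
```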

9 PART 1: THE BASICS
Simple word matching – Special characters
- Anchor metacharacters: ^ and $ match the beginning and the end of the string
- Overall: this is just the surface of regex technology
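
The anchors could be tried out like this. The word "housekeeper" is an illustrative input:

```perl
use strict;
use warnings;

my $at_start = "housekeeper"   =~ /^keeper/  ? 1 : 0;   # 0: "keeper" is not at the start
my $at_end   = "housekeeper"   =~ /keeper$/  ? 1 : 0;   # 1: "keeper" is at the end
my $newline  = "housekeeper\n" =~ /keeper$/  ? 1 : 0;   # 1: $ also matches before a trailing newline
my $whole    = "keeper"        =~ /^keeper$/ ? 1 : 0;   # 1: both anchors pin the whole string
my $partial  = "housekeeper"   =~ /^keeper$/ ? 1 : 0;   # 0

print "$at_start $at_end $newline $whole $partial\n";
```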

10 PART 1: THE BASICS
Using character classes:
- A character class is a set of possible characters
- Any one character of the class can match at that point in the regex
- Denoted by brackets [ … ]
- Eg:
  /item[0123456789]/; # matches 'item0' or ... or 'item9'
  "abc" =~ /[cab]/; # matches 'a'
- To match 'yes' in a case-insensitive way (yes, Yes, YES):
  /[yY][eE][sS]/
  /yes/i (i: case-insensitive, a modifier of the matching operation)
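
A sketch comparing the two ways of matching 'yes' without regard to case, plus the class examples above. The list of answers is invented for illustration:

```perl
use strict;
use warnings;

my $item = "item7" =~ /item[0123456789]/ ? 1 : 0;   # 1
my ($first) = "abc" =~ /([cab])/;                   # 'a': the earliest position wins

my @answers = ("yes", "Yes", "YES", "no");          # illustrative data
my $by_class    = grep { /[yY][eE][sS]/ } @answers; # 3 strings match
my $by_modifier = grep { /yes/i } @answers;         # 3 strings match, shorter to write

print "$item $first $by_class $by_modifier\n";
```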

11 PART 1: THE BASICS
Using character classes – Special characters:
- Special characters inside a class: - ] \ ^ $
- Each needs a backslash to be matched literally
  ]   the end of a character class
  $   scalar variable
  \   escape sequences
  -   range operator within a character class
  ^   negated character class (when it is the first character)
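
A short sketch of how these characters behave inside brackets. All test strings are illustrative:

```perl
use strict;
use warnings;

my $range     = "x5"   =~ /[0-9]/    ? 1 : 0;   # 1: '-' forms the range 0..9
my $negated   = "abc"  =~ /[^a-c]/   ? 1 : 0;   # 0: every character is in a..c
my $negated2  = "abcd" =~ /[^a-c]/   ? 1 : 0;   # 1: 'd' is outside a..c
my $bracket   = "a]b"  =~ /a[\]]b/   ? 1 : 0;   # 1: \] is a literal ']'
my $dash      = "-"    =~ /[a\-z]/   ? 1 : 0;   # 1: \- is a literal '-'
my $not_range = "m"    =~ /[a\-z]/   ? 1 : 0;   # 0: no range, only 'a', '-', 'z'
my $dollar    = '$at'  =~ /[\$x]at/  ? 1 : 0;   # 1: \$ is a literal '$'
my $caret     = "^"    =~ /[a^]/     ? 1 : 0;   # 1: '^' is literal when not first

print "$range $negated $negated2 $bracket $dash $not_range $dollar $caret\n";
```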

12 PART 1: THE BASICS
Using character classes – Special characters:
- Several abbreviations for common character classes
  \d   a digit, represents [0-9]
  \s   a whitespace character, represents [\ \t\r\n\f]
  \w   a word character (alphanumeric or _), represents [0-9a-zA-Z_]
  \D   negated \d
  \S   negated \s
  \W   negated \w
  .    any character but "\n"
  \b   matches a boundary between a word character and a non-word character: \w\W or \W\w
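
The abbreviations might be exercised as in the sketch below. The inputs are illustrative and plain ASCII, where the bracket equivalents listed above hold:

```perl
use strict;
use warnings;

my $time  = "12:30:45"  =~ /\d\d:\d\d:\d\d/ ? 1 : 0;   # 1: \d matches each digit
my $space = "a b"       =~ /a\sb/           ? 1 : 0;   # 1: \s matches the space
my $word  = "a_1"       =~ /^\w\w\w$/       ? 1 : 0;   # 1: letter, underscore, digit
my $nonw  = "a-b"       =~ /a\Wb/           ? 1 : 0;   # 1: '-' is not a word character
my $dot   = "\n"        =~ /./              ? 1 : 0;   # 0: '.' does not match "\n"
my $inner = "Housecat"  =~ /\bcat/          ? 1 : 0;   # 0: no boundary between 'e' and 'c'
my $whole = "a cat sat" =~ /\bcat\b/        ? 1 : 0;   # 1: "cat" stands alone as a word

print "$time $space $word $nonw $dot $inner $whole\n";
```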

13 PART 1: THE BASICS
Issues: why does '.' match everything but "\n"?
- Often we want to ignore newline characters when counting and matching the characters on a line
- If we want to keep track of newlines: anchors ^ and $, modifiers /…/s (single line) and /…/m (multiple lines)
  No modifier //     Default behavior
                     '.' matches any character except "\n"
                     ^ matches only at the beginning of the string; $ matches only at the end, or before a newline at the end
  s modifier //s     Treat the string as a single long line
                     '.' matches any character, even "\n"
                     ^ matches only at the beginning of the string; $ matches only at the end, or before a newline at the end
  m modifier //m     Treat the string as a set of multiple lines
                     '.' matches any character except "\n"
                     ^ and $ match at the start or end of any line in the string
  Both //sm          Treat the string as a single long line, but detect multiple lines
                     '.' matches any character, even "\n"
                     ^ and $ match at the start or end of any line within the string
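
The four combinations can be compared on one two-line string. This is a sketch; the text is an illustrative input:

```perl
use strict;
use warnings;

my $text = "first line\nsecond line\n";   # illustrative two-line string

# ^ only matches after an embedded newline when /m is given.
my $caret_plain = $text =~ /^second/  ? 1 : 0;     # 0
my $caret_m     = $text =~ /^second/m ? 1 : 0;     # 1

# '.' only matches "\n" when /s is given.
my $dot_plain = $text =~ /line.second/  ? 1 : 0;   # 0
my $dot_s     = $text =~ /line.second/s ? 1 : 0;   # 1
my $dot_m     = $text =~ /line.second/m ? 1 : 0;   # 0

# With both, '.' consumes the newline and ^ then matches at the next line start.
my $both = $text =~ /line.^second/sm ? 1 : 0;      # 1

# $ matches before the final newline by default, and at every line end under /m.
my $dollar_last  = $text =~ /second line$/ ? 1 : 0;   # 1
my $dollar_plain = $text =~ /first line$/  ? 1 : 0;   # 0
my $dollar_m     = $text =~ /first line$/m ? 1 : 0;   # 1

print "$caret_plain $caret_m $dot_plain $dot_s $dot_m $both\n";
print "$dollar_last $dollar_plain $dollar_m\n";
```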

14 PART 1: THE BASICS
Matching this or that:
- Able to match different possible words or strings
- Uses the alternation metacharacter |
- Eg:
  "cats and dogs" =~ /dog|cat|bird/; # matches "cat"
  "cats" =~ /cats|cat|ca|c/; # matches "cats"
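
A sketch of the two rules at work in these examples: the earliest position in the string wins, and at that position the alternatives are tried left to right. The capturing parentheses and the reversed-order pattern are additions made here to show which alternative matched:

```perl
use strict;
use warnings;

# "cat" starts earlier in the string than "dog", so it wins even though
# "dog" is listed first.
my ($earliest) = "cats and dogs" =~ /(dog|cat|bird)/;   # 'cat'

# At a given position the first alternative that matches is taken.
my ($longest_first)  = "cats" =~ /(cats|cat|ca|c)/;     # 'cats'
my ($shortest_first) = "cats" =~ /(c|ca|cat|cats)/;     # 'c'

print "$earliest $longest_first $shortest_first\n";
```

Ordering the alternatives from longest to shortest, as the slide does, is one way to make sure the longest word is the one matched.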

15 QUESTION

