1
Using regular expressions
Search for a single occurrence of a specific string.
Search for all occurrences of a string.
Approximate string matching.
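The first two uses can be sketched in Perl as follows. This is a minimal illustration, not part of the original slides; the sample text and variable names are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical sample text for illustration.
my $text = "Hey ho, let's go! Hey ho, let's go!";

# Single occurrence: =~ is true if the pattern matches anywhere in the string.
my $found = ($text =~ /ho/) ? 1 : 0;

# All occurrences: with the /g modifier in list context, the match
# returns every non-overlapping match.
my @all = ($text =~ /ho/g);

print "found: $found, occurrences: ", scalar(@all), "\n";
```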
2
Forming RegExps
Strings
Variables
Patterns
3
Strings and Variables
/Joey Ramone/ - match a specific string.
/$name/, where $name = "Joey Ramone" - match the string stored in a variable.
/Joey $name/ - match a pattern defined by a mixture of strings and variables.
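A short sketch of the three forms, assuming a made-up sample sentence; here $name holds only the surname so that the mixed form /Joey $name/ reads naturally.

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $text = "Joey Ramone sang for the Ramones.";
my $name = "Ramone";

my $literal  = $text =~ /Joey Ramone/   ? 1 : 0;  # fixed string
my $from_var = $text =~ /$name/         ? 1 : 0;  # variable interpolated into the pattern
my $mixed    = $text =~ /Joey $name/    ? 1 : 0;  # literal text plus variable
my $miss     = $text =~ /Johnny $name/  ? 1 : 0;  # no such string in $text

print "literal=$literal from_var=$from_var mixed=$mixed miss=$miss\n";
```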
4
Character classes
abc – match “abc”.
. – match any single character (e.g. a.b matches “a”, then any character, then “b”).
[abc] – match “a” or “b” or “c”.
[0123456789] – match “0” or “1” or … or “9”.
[0-9] – same as previous.
[a-z] – match “a” or “b” or … or “z”.
[A-Z] – same as previous, but with capital letters.
[] – match a single occurrence of any one of the characters listed within the brackets.
[0-9a-zA-Z-] – match any alphanumeric character or the minus sign.
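The following sketch tries a few of these classes on made-up strings. The /g modifier and the + quantifier (both used in the last line) are only there to collect whole tokens.

```perl
use strict;
use warnings;

# "." stands for any single character, so "ab" (nothing in between) fails.
my @dot_hits = grep { /a.b/ } ("a-b", "axb", "ab");

# A bracketed class matches exactly one character from the set.
my $has_digit = "room 7" =~ /[0-9]/ ? 1 : 0;

# A trailing "-" inside the brackets is a literal minus sign.
my @tokens = "re-use 42 times!" =~ /[0-9a-zA-Z-]+/g;

print join(",", @dot_hits), "\n";
print join(",", @tokens), "\n";
```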
5
Negated character classes
[^0-9] – match any single character that is not a numeric digit.
[^aeiouAEIOU] – match any single character that is not a vowel.
Works only for single characters.
We’ll discuss matching negated strings of characters later.
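A small sketch of both negated classes, using made-up input strings:

```perl
use strict;
use warnings;

# Collect every character of the word that is not a vowel.
my @non_vowels = "Ramones" =~ /[^aeiouAEIOU]/g;

# True only if at least one character is not a digit.
my $mixed_input  = "12a4" =~ /[^0-9]/ ? 1 : 0;
my $digits_input = "1234" =~ /[^0-9]/ ? 1 : 0;

print join("", @non_vowels), "\n";
print "12a4: $mixed_input, 1234: $digits_input\n";
```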
6
Escape characters
\ - use the backslash to match any special character as the character itself.
/\$name/ - match the literal string “$name”.
/a\.b/ - match the literal string “a.b” rather than “a” followed by any character, followed by “b”.
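To see the difference escaping makes, here is a sketch on a made-up string. The string is single-quoted so that Perl does not interpolate $name when building it.

```perl
use strict;
use warnings;

# Single quotes: the text contains a literal dollar sign followed by "name".
my $code = 'print $name; a.b; axb';

my $dollar  = $code =~ /\$name/ ? 1 : 0;  # escaped "$" matches the literal text
my @dot_any = $code =~ /a.b/g;            # unescaped ".": any character between a and b
my @dot_lit = $code =~ /a\.b/g;           # escaped ".": only a literal dot

print "dollar=$dollar any=@dot_any literal=@dot_lit\n";
```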
7
Convenience character classes
\d (a digit) - [0-9]
\D (digits, not!) - [^0-9]
\w (word char) - [a-zA-Z0-9_]
\W (words, not!) - [^a-zA-Z0-9_]
\s (space char) - [ \r\t\n\f]
\S (space, not!) - [^ \r\t\n\f]
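A brief sketch of the shorthand classes on a made-up ASCII line. The split call is an extra, not on the slide; it shows \s+ used as a field separator.

```perl
use strict;
use warnings;

# Hypothetical input: a tab and a double space separate some fields.
my $line = "Order 66\tshipped  on 2024_01";

my @numbers = $line =~ /\d+/g;      # runs of digits; "_" is not a digit
my @words   = $line =~ /\w+/g;      # runs of word characters; "_" counts as one
my @fields  = split /\s+/, $line;   # split on any run of whitespace

print "numbers: @numbers\n";
print "words:   @words\n";
print "fields:  @fields\n";
```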
8
Sequences
+ (match one or more of preceding pattern)
/[a-zA-Z]+/ (match a string of alpha characters such as a name).
? (match zero or one instance of preceding pattern).
/[a-zA-Z]+-?[a-zA-Z]+/ (now we can match hyphenated names).
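The hyphenated-name pattern can be tried on a made-up sentence. Note that it needs at least two letters, because there is a [a-zA-Z]+ on each side of the optional hyphen.

```perl
use strict;
use warnings;

# Hypothetical sample sentence.
my $text = "Joey Ramone and Anna Smith-Jones";

# Every run of letters, optionally joined by one hyphen, is returned.
my @names = $text =~ /[a-zA-Z]+-?[a-zA-Z]+/g;

# A single letter is too short for this pattern.
my $single = "I" =~ /[a-zA-Z]+-?[a-zA-Z]+/ ? 1 : 0;

print join(" | ", @names), "\n";
```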
9
Sequences
* (match zero or more of preceding pattern)
Example – list of names:
–George Harrison
–Paul McCartney
–Richard “Ringo” Starkey
–John Winston Lennon
/[a-zA-Z]+ [a-zA-Z]+/ (match first and last name)
/[a-zA-Z]+ [a-zA-Z"]* ?[a-zA-Z]+/ (match first name, middle name, if it exists, and last name; the second space is optional so that two-part names still match)
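A sketch that runs both ideas over the four names. One detail is an assumption on my part: the space after the middle name is written as " ?" (optional), because a pattern with two mandatory spaces cannot match a plain "First Last" name.

```perl
use strict;
use warnings;

my @beatles = (
    'George Harrison',
    'Paul McCartney',
    'Richard "Ringo" Starkey',
    'John Winston Lennon',
);

# First and last name only: the quoted nickname breaks this pattern.
my @first_last = grep { /[a-zA-Z]+ [a-zA-Z]+/ } @beatles;

# Optional middle part (letters or double quotes), optional second space.
my @with_middle = grep { /[a-zA-Z]+ [a-zA-Z"]* ?[a-zA-Z]+/ } @beatles;

print scalar(@first_last), " names match the two-part pattern\n";
print scalar(@with_middle), " names match the optional-middle pattern\n";
```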
10
Sequences
{k} – match k instances of preceding pattern.
Example: floating point numbers to 2 decimal places
–/[0-9]+\.[0-9]{2}/
{k,j} – match at least k instances of preceding pattern, but no more than j.
Example: floating point numbers that may or may not have a decimal component.
–/[0-9]+\.?[0-9]{0,2}/
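Both counting forms can be checked against a few made-up inputs. The patterns are not anchored, so they match anywhere inside a longer string (for instance, the first one would also match inside "3.14159").

```perl
use strict;
use warnings;

# Hypothetical test values.
my @values = ("3.14", "42", "2.5", "10.", "abc");

# Exactly two digits after a mandatory decimal point.
my @two_places = grep { /[0-9]+\.[0-9]{2}/ } @values;

# Optional decimal point followed by zero to two digits.
my @optional = grep { /[0-9]+\.?[0-9]{0,2}/ } @values;

print "two places: @two_places\n";
print "optional:   @optional\n";
```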
11
Grouping
/(John|Paul|George|Ringo)/ – matches any one of “John”, “Paul”, “George”, or “Ringo”.
/((John|Paul|George|Ringo) )+/ – matches the Beatles names listed in any order.
–John Paul George Ringo
–Paul George John Ringo
–Ringo Paul George John
Actually, this will also match:
–Paul Paul Paul Paul Paul Paul Paul Paul Paul
Be careful about what assumptions you make.
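A sketch of the caveat, using made-up input lines. The repeated group accepts any sequence of the four names, so the all-Paul line passes as well; only the line with no Beatle name in it fails.

```perl
use strict;
use warnings;

# Hypothetical input lines.
my @lines = (
    "John Paul George Ringo",
    "Ringo Paul George John",
    "Paul Paul Paul Paul",
    "Pete Best",
);

# One or more "name followed by a space" groups.
my @matched = grep { /((John|Paul|George|Ringo) )+/ } @lines;

print "$_\n" for @matched;
```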
12
Problem
Write a regular expression that will match a Social Security number.
Format: 555-55-5555
13
A solution
/[0-9]{3}-[0-9]{2}-[0-9]{4}/
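A quick check of this pattern against a few made-up candidates. Because the pattern is unanchored, it also finds a number embedded in longer text.

```perl
use strict;
use warnings;

# Hypothetical candidates; none is a real number.
my @candidates = ("555-55-5555", "555-555-555", "5555555555", "id: 123-45-6789");

# Keep only strings containing the 3-2-4 digit grouping.
my @ssn_like = grep { /[0-9]{3}-[0-9]{2}-[0-9]{4}/ } @candidates;

print "$_\n" for @ssn_like;
```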
14
Problem
Write a regular expression that will match a phone number.
Formats
–319-337-3663
–319.337.3663
15
A solution
/[0-9]{3}[\.-][0-9]{3}[\.-][0-9]{4}/
16
Add another format
–3193373663
17
A solution
/[0-9]{3}[\.-]?[0-9]{3}[\.-]?[0-9]{4}/
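The optional-separator pattern can be exercised on all three formats. In this sketch the pattern is stored with qr// so it can be reused; the extra test strings are made up. One side effect worth noticing is that mixed separators are accepted too.

```perl
use strict;
use warnings;

# Separator ("." or "-") is optional between the digit groups.
my $phone = qr/[0-9]{3}[\.-]?[0-9]{3}[\.-]?[0-9]{4}/;

my @numbers = ("319-337-3663", "319.337.3663", "3193373663",
               "319 337 3663", "337-3663");
my @ok = grep { /$phone/ } @numbers;

# Each separator is matched independently, so this also passes.
my $mixed = "319-337.3663" =~ $phone ? 1 : 0;

print "$_\n" for @ok;
```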
18
Problem
Write a regular expression that will match an email address.
Legal characters for names are:
–Letters, numbers, “-”, and “_”
Legal characters for domain names are:
–Letters only
Assume form: username@machine.domain.suffix
19
A solution
/[a-z0-9_-]+\@[a-z]+(\.[a-z]+){2}/
More general version:
/[a-z0-9_-]+\@[a-z]+(\.[a-z]+)+/
(The hyphen is placed last in the character class so that it cannot be read as part of a range.)
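A sketch comparing the two versions on placeholder addresses. The hyphen is written last in the name class here, which is one unambiguous way to include it; the addresses are illustrative only, and single quotes keep Perl from treating @ as an array sigil.

```perl
use strict;
use warnings;

my $strict  = qr/[a-z0-9_-]+\@[a-z]+(\.[a-z]+){2}/;  # exactly machine.domain.suffix
my $general = qr/[a-z0-9_-]+\@[a-z]+(\.[a-z]+)+/;    # one or more dotted parts

# Placeholder addresses for illustration.
my @addresses = (
    'username@machine.domain.suffix',
    'user_name@domain.suffix',
    'no-at-sign.domain.suffix',
);

my @strict_ok  = grep { /$strict/ }  @addresses;
my @general_ok = grep { /$general/ } @addresses;

print "strict:  @strict_ok\n";
print "general: @general_ok\n";
```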
20
Problem
Write a regular expression that will match an HTML anchor start tag.
Assume anchor tag is of the form:
–<a href="url">some anchor text</a>
21
A solution
/<a href="[^"]+">/
Actually, quotes are not required.
So it should be:
–/<a href="?[^"> ]+"?>/
How would we assign the url to a variable?
22
A solution
($url) = ($htmlText =~ m/<a href="?([^"> ]+)"?>/);
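A runnable sketch of the capture idea, assuming the anchor start tag has the form <a href=...> with optional double quotes around the URL. The HTML snippet is made up. Parentheses around part of the pattern capture that part, and a match in list context returns the captured text.

```perl
use strict;
use warnings;

# Hypothetical HTML with one quoted and one unquoted href.
my $htmlText = '<p>See <a href="http://example.com/page">the page</a> '
             . 'or <a href=/local>here</a>.</p>';

# First URL only: the parentheses capture everything up to a quote, ">" or space.
my ($url) = ($htmlText =~ m/<a href="?([^"> ]+)"?>/);

# Every URL: add /g to collect the capture from each match.
my @urls = ($htmlText =~ m/<a href="?([^"> ]+)"?>/g);

print "first: $url\n";
print "all:   @urls\n";
```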
23
Take Away
There is almost always a pattern that will match what you want it to match.
The best way to learn is to simply jump in and start writing your own patterns.
If you have a question about how to construct one, feel free to ask me.
One typically learns Perl by asking people with more experience.