Nate Brunelle Today: Regular Expressions


1 Nate Brunelle Today: Regular Expressions
CS1110

2 Questions?

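3 String.find()
Takes a string as an argument and, if exactly that string appears, gives its index (in Python, -1 means it does not appear).
Mystring.find("Purple Elephant")
Exact search misses near matches:
"purple elephant".find("Purple Elephant") (different capitalization)
"the elephant was purple" (same words, different order)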
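A minimal sketch of the exact-match behavior described above. The sentence stored in `text` is a made-up example, not from the slides.

```python
# str.find returns the index of the first exact occurrence, or -1 if none exists.
text = "I saw a Purple Elephant today"  # hypothetical example string

exact = text.find("Purple Elephant")
wrong_case = "purple elephant".find("Purple Elephant")
wrong_order = "the elephant was purple".find("Purple Elephant")

print(exact, wrong_case, wrong_order)
```

Only the first call finds anything; the other two return -1 because `find` compares characters exactly.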

4 Wildcards
[Rr]ugs?[^a-zA-Z]
Match on/find:
Will not match on/find: Rugged, rugged
We might want:
(1) A way of saying r or R
(2) Maybe there's an s
(3) Something that's not a letter
(1)ug(2)(3)
[Rr]ugs?[^a-zA-Z]
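One way to try the pattern from this slide in Python. The sample strings other than Rugged and rugged are assumptions chosen for illustration.

```python
import re

# r or R, then "ug", then an optional s, then one character that is not a letter.
rug = re.compile(r"[Rr]ugs?[^a-zA-Z]")

samples = ["Rug.", "rugs!", "the rug is red", "Rugged", "rugged", "rug"]
results = {s: bool(rug.search(s)) for s in samples}

for s, found in results.items():
    print(repr(s), found)
```

Note that a bare "rug" at the very end of the text is not found, because `[^a-zA-Z]` must consume one actual character.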

5 Alternation (or)
Goal: he, she, it, or they, followed by "went to the store"
First try: s?h?e?i?t? (too loose: it also finds a match in Sit)
Alternation (or): |
s?he|it|they
(she|he|it|they) went to the store
Matches:
she went to the store
he went to the store
it went to the store
they went to the store
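A short sketch of why the parentheses matter. The `ungrouped` pattern is an assumption added for contrast: without a group, `|` splits the whole expression, so "went to the store" attaches only to the last alternative.

```python
import re

# Without parentheses, the alternatives are: "s?he", "it", "they went to the store".
ungrouped = re.compile(r"s?he|it|they went to the store")
# With parentheses, the alternation is limited to the pronoun.
grouped = re.compile(r"(she|he|it|they) went to the store")

bare_pronoun_ungrouped = bool(ungrouped.fullmatch("it"))
bare_pronoun_grouped = bool(grouped.fullmatch("it"))

m = grouped.fullmatch("she went to the store")
pronoun = m.group(1)

other_subject = grouped.fullmatch("we went to the store")
print(bare_pronoun_ungrouped, bare_pronoun_grouped, pronoun, other_subject)
```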

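6 Star vs plus vs ?
Spo?ky: zero or one o. Matches Spky, Spoky.
Spo*ky: zero or more o's. Matches Spky, Spoky, Spooky, Spoooky, Spooooky, Spoooooooooooooky.
Spo+ky: one or more o's. Matches Spoky, Spooky, Spoooky, Spooooky, Spoooooooooooooky, but not Spky.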
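The three quantifiers can be compared side by side. This sketch uses `re.fullmatch` so that each word must match in its entirety.

```python
import re

words = ["Spky", "Spoky", "Spooky", "Spoooky"]
patterns = ["Spo?ky", "Spo*ky", "Spo+ky"]

# For each pattern, collect the words it matches completely.
matches = {p: [w for w in words if re.fullmatch(p, w)] for p in patterns}

for p in patterns:
    print(p, matches[p])
```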

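7 R string (raw string)
"\"" -> " (one character)
r"\"" -> \" (two characters: the backslash stays in the string)
r"\"this" -> \"this
r"\" -> error (a raw string cannot end in a single backslash)
r"\n" -> \n (a backslash and an n, not a newline)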
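A small sketch of how Python treats backslashes in raw strings. The helper `is_valid_literal` is a hypothetical name introduced here; it uses the built-in `compile` to check whether a piece of source text is a legal literal without crashing the script.

```python
plain = "\""        # escape sequence: one character, a double quote
raw_quote = r"\""   # raw string: two characters, a backslash and a double quote
raw_n = r"\n"       # raw string: two characters, a backslash and the letter n
newline = "\n"      # escape sequence: one newline character

def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# The source text r"\" : the backslash escapes the closing quote, so the literal never ends.
trailing_backslash_ok = is_valid_literal('r"\\"')
# The source text r"\"" : a valid two-character raw string.
escaped_quote_ok = is_valid_literal('r"\\""')

print(len(plain), len(raw_quote), len(raw_n), len(newline))
print(trailing_backslash_ok, escaped_quote_ok)
```

Raw strings are convenient for regexes because sequences such as `\b` reach the `re` module unchanged.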

8 Regex Pieces
Operation                  Example        Meaning
Character class            [Rr] or [rR]   R or r
                           [abcd]         Exactly one of a, b, c, or d
                           [\^A]          Just caret (^) or A
Character range            [a-z]          Exactly one character "between" a and z
                           [a-zA-Z]       "Between" a and z or "between" A and Z
                           [0-9]          Any one digit
Negative character class   [^a]           Any one character that's not an a
                           [^a-zA-Z]      Any one character that's not a letter
                           [^\^]          Any one character that's not a caret
Optional quantifier        s?             Maybe there's an s (0 or 1 s)
                           [Rr]?          Either have one of R or r, or neither
OR, alternation            wx|xyz         One of the strings wx or xyz
                           s?he|it        Matches one of the two regexes
Star                       [abc]*         Any number of a's, b's, and c's at all (0 or more copies of…)
Plus                       [abc]+         At least one of a's, b's, and c's (1 or more copies of…)
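The pieces above can be spot-checked with a small table of cases. The test strings are assumptions picked for illustration, and `re.fullmatch` is used so the whole string has to match.

```python
import re

# (pattern, text, whether the whole text should match)
checks = [
    (r"[abcd]", "c", True),
    (r"[abcd]", "e", False),
    (r"[\^A]", "^", True),
    (r"[^a-zA-Z]", "7", True),
    (r"[^a-zA-Z]", "q", False),
    (r"[^\^]", "^", False),
    (r"[Rr]?", "", True),      # optional: zero copies is fine
    (r"wx|xyz", "xyz", True),
    (r"[abc]*", "", True),     # star: zero copies is fine
    (r"[abc]+", "", False),    # plus: needs at least one
    (r"[abc]+", "cab", True),
]

results = [bool(re.fullmatch(p, t)) == expected for p, t, expected in checks]
print(all(results))
```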

9 Regex Pieces, Cont.
Operation           Example      Meaning
Count range         {3,5}        Between 3 and 5 (inclusive) copies of…
                    [ab]{2,3}    aa, ab, ba, bb, aaa, aab, abb, baa, …
                    [abc]{5}     Exactly 5 characters, each an a, b, or c
End of text         $            This is some text#
Beginning of text   ^            #This is some text
Word boundary       \b           #This# #is# #some# #text#
Anything            .            Any one character (except a newline, by default)
                    .*           Any number of characters
(# marks the positions where the piece matches.)
All UVA computing IDs: 2-3 letters, number, 1-3 letters
[a-z]{2,3}[2-9][a-z]{1,3}
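A brief sketch of count ranges, anchors, and word boundaries, reusing the slide's sample sentence. The short strings tested against `[ab]{2,3}` are assumptions.

```python
import re

# Count range: only strings of length 2 or 3 made of a's and b's match completely.
counted = [s for s in ["a", "ab", "bab", "abab"] if re.fullmatch(r"[ab]{2,3}", s)]

text = "This is some text"
starts_with_this = bool(re.search(r"^This", text))   # ^ anchors at the beginning
ends_with_text = bool(re.search(r"text$", text))     # $ anchors at the end
starts_with_some = bool(re.search(r"^some", text))   # "some" is not at the beginning

# "is" appears inside "This" and as its own word; \b keeps only the whole word.
anywhere = re.findall(r"is", text)
whole_word = re.findall(r"\bis\b", text)

print(counted, starts_with_this, ends_with_text, starts_with_some)
print(anywhere, whole_word)
```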

10 Give an Expression to match
All UVA computing IDs: 2-3 letters, number, 1-3 letters
[a-z][a-z][a-z]?[2-9][a-z][a-z]?[a-z]?
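The spelled-out expression on this slide should accept the same strings as the shorter count-range form `[a-z]{2,3}[2-9][a-z]{1,3}`. The sketch below compares the two on a few made-up IDs (assumptions, not real computing IDs).

```python
import re

long_form = re.compile(r"[a-z][a-z][a-z]?[2-9][a-z][a-z]?[a-z]?")
short_form = re.compile(r"[a-z]{2,3}[2-9][a-z]{1,3}")

# Hypothetical candidates: too many letters, or a digit outside 2-9, should be rejected.
candidates = ["ab2c", "abc3de", "abcd2e", "ab1c", "ab2cdef"]

agree = all(
    bool(long_form.fullmatch(s)) == bool(short_form.fullmatch(s)) for s in candidates
)
accepted = [s for s in candidates if short_form.fullmatch(s)]

print(agree, accepted)
```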

11 What does a for loop look like?
for [variable] in [collection]:
Variable: [a-zA-Z]+
Example collection: [0, 1, 5, 9]
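A possible regex for the loop header, as a sketch only. The slide gives `[a-zA-Z]+` for the variable; using `.+` for the collection is an assumption made here to keep the example short.

```python
import re

# Group 1 is the loop variable, group 2 is whatever sits between "in" and the final colon.
for_header = re.compile(r"for ([a-zA-Z]+) in (.+):")

m = for_header.fullmatch("for x in [0, 1, 5, 9]:")
variable = m.group(1)
collection = m.group(2)

# A digit is not allowed by [a-zA-Z]+, so this header is rejected.
bad_variable = for_header.fullmatch("for 9 in [0, 1, 5, 9]:")

print(variable, collection, bad_variable)
```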

12 import re
finder = re.compile(pattern)
Use the finder:
search: similar to string.find(), gives just the first matching instance (as a match object)
finditer: gives a collection of match objects
findall: a list containing, for each match m:
0 parentheses: m.group()
1 paren: m.group(1)
2+ paren: m.groups()
Match object:
group: the text we matched on
start
end
groups
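A sketch of the three finder methods and the match-object pieces listed above. The sentence in `text` and the choice to wrap each part of the computing-ID pattern in parentheses are assumptions for illustration.

```python
import re

finder = re.compile(r"([a-z]{2,3})([2-9])([a-z]{1,3})")
text = "email mst3k or njb2b today"  # hypothetical example text

# search: only the first match, returned as a match object (or None).
first = finder.search(text)
first_info = (first.group(), first.start(), first.end(), first.groups())

# finditer: one match object per match.
all_matches = [m.group() for m in finder.finditer(text)]

# findall: what each list item holds depends on the number of parentheses.
no_groups = re.findall(r"[a-z]{2,3}[2-9][a-z]{1,3}", text)    # like m.group()
one_group = re.findall(r"[a-z]{2,3}([2-9])[a-z]{1,3}", text)  # like m.group(1)
many_groups = finder.findall(text)                            # like m.groups()

print(first_info)
print(all_matches, no_groups, one_group, many_groups)
```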

13 Writing a regex
Write down some examples of strings you want to match, and some examples of similar strings that you don't want to match.
Want to match: njb2b, mst3k, aaa8bbb, aa4aa
Don't want to match: a2b, njb2, 7bb
Going left to right through your examples, try to come up with the rules that will match/not match on the correct strings.
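The two example lists can double as a quick test of a candidate regex. This sketch checks the computing-ID pattern from the earlier slides; the variable names are made up for the example.

```python
import re

uva_id = re.compile(r"[a-z]{2,3}[2-9][a-z]{1,3}")

want = ["njb2b", "mst3k", "aaa8bbb", "aa4aa"]
dont_want = ["a2b", "njb2", "7bb"]

# Strings we wanted but the regex rejected, and strings we did not want but it accepted.
false_negatives = [s for s in want if not uva_id.fullmatch(s)]
false_positives = [s for s in dont_want if uva_id.fullmatch(s)]

print(false_negatives, false_positives)
```

Both lists come back empty when the rules fit the examples; anything left in either list points at the rule to revisit.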

14 Regex for phone numbers
Shape (3n = three digits, 4n = four digits): ((3n) ?|3n-)? 3n- 4n
Area = (\([0-9]{3}\) ?|[0-9]{3}-)?
Office = [0-9]{3}-
rest = [0-9]{4}
Want to match: (434)
Don't want to match:
Also handle parentheses:
[2-9][0-9]{2}\-([2-9][0-9]{2}\-)?[0-9]{4}|\([2-9][0-9]{2}\) ?[2-9][0-9]{2}\-[0-9]{4}
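A sketch that assembles the Area, Office, and rest pieces into one pattern. The slide's own example numbers did not survive, so the phone numbers below are invented placeholders (assumptions), as is the decision to join the pieces by simple string concatenation.

```python
import re

area = r"(\([0-9]{3}\) ?|[0-9]{3}-)?"  # "(434) ", "(434)", "434-", or nothing
office = r"[0-9]{3}-"
rest = r"[0-9]{4}"
phone = re.compile(area + office + rest)

# Hypothetical test numbers, not taken from the slides.
should_match = ["(434) 555-0100", "(434)555-0100", "434-555-0100", "555-0100"]
should_not_match = ["434 555-0100", "5550100", "(434 555-0100"]

matched = [s for s in should_match if phone.fullmatch(s)]
wrongly_matched = [s for s in should_not_match if phone.fullmatch(s)]

print(matched, wrongly_matched)
```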

