Pattern matching with regular expressions A common file processing requirement is to match strings within the file to a standard form, e.g. email address.


1 Pattern matching with regular expressions
A common file processing requirement is to match strings within the file to a standard form, e.g. an email address.
Regular expressions, or 'regexes', give us the power to do this kind of matching.
At its simplest, any word is a regex:
–regex: 'email'
–test: 'email'
The regex occurs in the test string, so it matches!
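The simplest case can be checked with Python's re module. This is a minimal sketch; the two sample strings are made up for illustration.

```python
import re

# A plain word is a valid regex: it matches wherever that exact text occurs.
pattern = re.compile('email')

# search() scans the whole string for the pattern.
found = pattern.search('my email address') is not None
missing = pattern.search('postal address') is not None

print(found)    # True
print(missing)  # False
```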

2 Regular Expressions
In practice, regexes are used to search for a string that "has the form" of the regular expression.
We need some syntax that lets us specify things such as:
–'a number is in a range';
–'a letter is one of a set';
–'a certain number of characters', etc.
This requires special characters.

3 Regular Expressions
Some special characters: *, [], {}
For a complete reference see http://www.regular-expressions.info/reference.html
An asterisk * specifies that the character preceding it can appear zero or more times, e.g.
–regex: 'a*b'
–test: 'b' # Matches, as zero occurrences of 'a' are allowed
–test: 'ab' # Matches
–test: 'aaab' # Matches
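The asterisk cases can be tried out in Python as follows. This is a sketch only: fullmatch() is used here so that the whole test string has to fit the pattern, and the extra string 'aaa' is an added counter-example that is not on the slide.

```python
import re

# 'a*b' means: zero or more 'a' characters followed by a single 'b'.
pattern = re.compile('a*b')

# fullmatch() succeeds only if the entire string fits the pattern.
tests = ['b', 'ab', 'aaab', 'aaa']
results = {s: pattern.fullmatch(s) is not None for s in tests}

for s, ok in results.items():
    print(s, 'matches' if ok else 'fails')
```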

4 Regular Expressions
A range of characters, or "character class", is defined using square brackets [], e.g.
–regex: '[a-z]'
–test: 'm' # Matches as it is a lower case letter
–test: 'M' # Fails as it is an upper case letter
Multiple ranges: write them one after another inside the brackets, with no separator (a comma inside the brackets would itself be matched as a literal character)
–regex: '[a-zA-Z0-9]'
–test: 'M' # Matches
–test: '9' # Matches
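A short sketch of character classes in Python's re module. It also shows that a comma placed between ranges is treated as a literal member of the class, not as a separator. The variable names are illustrative.

```python
import re

lower = re.compile('[a-z]')               # one lower case letter
alnum = re.compile('[a-zA-Z0-9]')         # letter or digit, ranges simply concatenated
with_comma = re.compile('[a-z,A-Z,0-9]')  # same ranges, plus a literal comma

lower_m = lower.fullmatch('m') is not None
lower_M = lower.fullmatch('M') is not None
alnum_M = alnum.fullmatch('M') is not None
alnum_9 = alnum.fullmatch('9') is not None
alnum_comma = alnum.fullmatch(',') is not None
with_comma_comma = with_comma.fullmatch(',') is not None

print(lower_m, lower_M)               # True False
print(alnum_M, alnum_9)               # True True
print(alnum_comma, with_comma_comma)  # False True
```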

5 Regular Expressions
To specify an exact number of repetitions of the preceding character use braces {}, e.g.
–regex: 'a{2}'
–test: 'abab' # Fails as there are no two consecutive a's in the string
–test: 'aaaab' # Matches
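The brace example can be verified with a few lines of Python. This is a minimal sketch using search(), which looks for the pattern anywhere in the string.

```python
import re

# 'a{2}' means exactly two consecutive 'a' characters.
pattern = re.compile('a{2}')

no_match = pattern.search('abab')   # the a's are never adjacent
match = pattern.search('aaaab')     # the first two a's satisfy the pattern

print(no_match)       # None
print(match.group())  # aa
```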

6 Regular Expressions in Python
Python contains a regular expression module, called 're', that allows strings to be tested against regular expressions
–import re
–checker = re.compile('[a-z]')
–if checker.match(test) is not None: print('String matches!')
–else: print('String does not contain a match')
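One point worth checking when using the re module: match() only tests for the pattern at the start of the string, whereas search() looks for it anywhere in the string. The sketch below illustrates the difference; the helper name describe and the sample strings are hypothetical.

```python
import re

checker = re.compile('[a-z]')

def describe(test):
    """Report whether the string contains a lower case letter anywhere."""
    if checker.search(test) is not None:
        return 'String matches!'
    return 'String does not contain a match'

# match() anchors at the first character, search() does not.
at_start = checker.match('9abc')     # None: the string starts with a digit
anywhere = checker.search('9abc')    # finds 'a' at index 1

print(at_start)            # None
print(anywhere.group())    # a
print(describe('9abc'))    # String matches!
print(describe('1234'))    # String does not contain a match
```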

7 Practical example
filetestsRun = 'testResults.log'
f = open(filetestsRun, 'r')
reTestCount = re.compile("Running\\s*(\\d+)\\s*test", re.IGNORECASE)
reCrashCount = re.compile("OK!")
reFailCount = re.compile("Failed\\s*(\\d+)\\s*of\\s*(\\d+)\\s*tests", re.IGNORECASE)
The patterns above are used to search through a file for lines such as
–Running 13 tests.............OK!
Used on Mantid to keep track of build server test passes/failures
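The following sketch shows how patterns like these might be applied line by line. It is not the Mantid script itself: the function summarise, the tallying logic, and the sample log lines (other than the "Running 13 tests" line quoted on the slide) are assumptions for illustration. The exact format of a failure line is guessed from the reFailCount pattern.

```python
import re

# Same patterns as on the slide, written as raw strings.
re_test_count = re.compile(r"Running\s*(\d+)\s*test", re.IGNORECASE)
re_ok = re.compile(r"OK!")
re_fail_count = re.compile(r"Failed\s*(\d+)\s*of\s*(\d+)\s*tests", re.IGNORECASE)

def summarise(lines):
    """Return (tests_run, tests_failed, suites_ok) totals for an iterable of log lines."""
    tests_run = tests_failed = suites_ok = 0
    for line in lines:
        m = re_test_count.search(line)
        if m:
            tests_run += int(m.group(1))    # captured number of tests
        if re_ok.search(line):
            suites_ok += 1
        m = re_fail_count.search(line)
        if m:
            tests_failed += int(m.group(1)) # captured number of failures
    return tests_run, tests_failed, suites_ok

# Hypothetical log content; a real run would iterate over the open file instead.
sample_log = [
    "Running 13 tests.............OK!",
    "Running 5 tests.....",
    "Failed 2 of 5 tests",
]
summary = summarise(sample_log)
print(summary)  # (18, 2, 1)
```

Because a file object yields its lines when iterated, the same function could be called as summarise(f) on the file opened in the slide.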

