Published by Elaine Foster. Modified over 9 years ago.
Regular Expressions
Mechanics

Copyright © Software Carpentry 2010
This work is licensed under the Creative Commons Attribution License.
See http://software-carpentry.org/license.html for more information.

Notebook #1

Site     Date        Evil (millivaders)
----     ----        ------------------
Baker 1  2009-11-17  1223.0
Baker 1  2010-06-24  1122.7
Baker 2  2009-07-24  2819.0
Baker 2  2010-08-25  2971.6
Baker 1  2011-01-05  1410.0
Baker 2  2010-09-04  4671.6
⋮        ⋮           ⋮

Notebook #2

Site/Date/Evil
Davison/May 22, 2010/1721.3
Davison/May 23, 2010/1724.7
Pertwee/May 24, 2010/2103.8
Davison/June 19, 2010/1731.9
Davison/July 6, 2010/2010.7
Pertwee/Aug 4, 2010/1731.3
Pertwee/Sept 3, 2010/4981.0
⋮

'(.+)/([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})/(.+)'

This pattern matches:
- one or more characters
- a slash
- a single upper-case letter
- one or more lower-case letters
- a space
- one or two digits
- a comma if one is there
- a space
- exactly four digits
- a slash
- one or more characters
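
The element-by-element reading above can be tried out on a few Notebook #2 records. This is only a sketch: it assumes Python and its standard `re` module (the slides do not name a language), and the names `PATTERN`, `records`, and `parsed` are made up for illustration.

```python
import re

# Pattern copied from the slide: site / month day[,] year / reading
PATTERN = '(.+)/([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})/(.+)'

# Two records from Notebook #2 plus its header line
records = [
    'Davison/May 22, 2010/1721.3',
    'Pertwee/Sept 3, 2010/4981.0',
    'Site/Date/Evil',  # header: has no date, so it should not match
]

parsed = []
for record in records:
    m = re.match(PATTERN, record)
    if m:
        # groups() returns the five parenthesized pieces of the pattern
        parsed.append(m.groups())

for fields in parsed:
    print(fields)
```

Each parenthesized group in the pattern becomes one field of the result, so a record comes back as site, month, day, year, and reading.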

How?
Using finite state machines

Match a single 'a'
[Diagram: state machine with one arc labeled 'a']
- start here
- match this character
- must be here at the end
Pattern: a

Match one or more 'a'
[Diagram: state machine with two arcs labeled 'a']
- match this as before
- match again and again
- don't have to stop here the first time, just have to be here at the end
Pattern: a+

Match 'a' or nothing
[Diagram: state machine with one arc labeled 'a' and one unlabeled arc]
- transition is "free"
So this is '(a|)'
Which is 'a?'
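
The claim that '(a|)' and 'a?' describe the same machine can be spot-checked. The sketch below assumes Python's `re` module; the list of candidate strings is arbitrary and chosen only for illustration.

```python
import re

candidates = ['', 'a', 'aa', 'b']

# For each candidate, record whether the WHOLE string matches each pattern.
results = {
    text: (bool(re.fullmatch('(a|)', text)), bool(re.fullmatch('a?', text)))
    for text in candidates
}

for text, (alternation, optional) in results.items():
    print(repr(text), alternation, optional)
```

Agreement on a handful of strings is not a proof, but it shows the empty alternative playing the role of the "free" transition.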

Match zero or more 'a'
[Diagram: state machine with two arcs labeled 'a']
Combine ideas
This is 'a*'
Patterns so far: a  a+  a?  a*
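
The four machines built so far can be sketched as lists of arcs and walked by one small simulator. Everything here is an assumption made for illustration: the diagrams did not survive in this transcript, so the arc lists are one plausible encoding of them, and `FREE`, `closure`, `accepts`, and `MACHINES` are hypothetical names. Python is used as the example language.

```python
FREE = None  # label for a "free" transition that consumes no character

def closure(states, arcs):
    """Return the given states plus everything reachable by free transitions."""
    seen = set(states)
    stack = list(states)
    while stack:
        node = stack.pop()
        for src, label, dst in arcs:
            if src == node and label is FREE and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(arcs, start, end, text):
    """Walk the machine over text, looking at each character exactly once."""
    current = closure({start}, arcs)
    for ch in text:
        # Follow every arc out of a current node whose label is this character.
        following = {dst for src, label, dst in arcs
                     if src in current and label == ch}
        current = closure(following, arcs)
    # The match succeeds only if we are at the end node when input runs out.
    return end in current

# Arcs are (from_node, label, to_node); node 0 is the start, node 1 the end.
MACHINES = {
    'a':  [(0, 'a', 1)],
    'a+': [(0, 'a', 1), (1, 'a', 1)],
    'a?': [(0, 'a', 1), (0, FREE, 1)],
    'a*': [(0, 'a', 1), (1, 'a', 1), (0, FREE, 1)],
}

for name, arcs in MACHINES.items():
    print(name, [accepts(arcs, 0, 1, s) for s in ['', 'a', 'aaa']])
```

The simulator keeps only the set of nodes it could currently be in, which is the sense in which a finite state machine needs nothing beyond its arcs and the next character.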

What regular expression is this?
[Diagram: state machine with arcs labeled a, a, b, c, d]
a+|(b(c|d))
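
A quick way to get a feel for the answer is to see which strings it accepts in full. This sketch assumes Python's `re` module; the test strings are arbitrary.

```python
import re

PATTERN = 'a+|(b(c|d))'  # the answer from the slide

tests = ['a', 'aaa', 'bc', 'bd', 'b', 'ab', 'bcd', '']

# Keep only the strings that the pattern matches from start to end.
accepted = [t for t in tests if re.fullmatch(PATTERN, t)]
print(accepted)
```

Either branch has to consume the whole string: one or more 'a' on one path, or 'b' followed by exactly one of 'c' or 'd' on the other.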

Action at a node depends only on:
- arcs out of that node
- characters in target data
Finite state machines have no memory
This means it is impossible to write a regular expression that checks whether arbitrarily nested parentheses match
- "(((....)))" requires memory (or at least a counter)
Similarly, the only way to check that a word contains each vowel once is to write 5! = 120 clauses, one for each possible ordering of the five vowels
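
Both limits can be made concrete with a short sketch. It assumes Python, lower-case words, and "each vowel once" read as "each of a, e, i, o, u exactly once"; the names `parens_balanced`, `each_vowel_once_clauses`, and `vowel_re` are hypothetical.

```python
import re
from itertools import permutations

def parens_balanced(text):
    """Check nesting with a counter, the memory a finite state machine lacks."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:  # a ')' arrived before its matching '('
                return False
    return depth == 0

def each_vowel_once_clauses():
    """Build one clause per ordering of the five vowels."""
    gap = '[^aeiou]*'  # any run of non-vowels
    return [gap + gap.join(order) + gap for order in permutations('aeiou')]

clauses = each_vowel_once_clauses()
vowel_re = re.compile('|'.join(clauses))

print(len(clauses))
print(parens_balanced('(((....)))'), parens_balanced('(()'))
print(bool(vowel_re.fullmatch('education')), bool(vowel_re.fullmatch('banana')))
```

The counter needs one integer no matter how deep the nesting goes, while the vowel pattern has to spell out every ordering because the machine cannot remember which vowels it has already seen.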

Why use a tool with limits?
They're fast
- After some pre-calculation, a regular expression only has to look at each character in the input data once
They're readable
- More readable than the procedural equivalent
And regular expressions can do a lot more than what we've seen so far
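
In code, the "pre-calculation" step usually shows up as compiling the pattern once and reusing it. The sketch below assumes Python's `re` module, with `date_re` as a hypothetical name. Note that Python's engine uses backtracking rather than a pure finite state machine, so the one-look-per-character property describes the machines above, not necessarily this library.

```python
import re

# "Pre-calculation": compile the date part of the pattern once...
date_re = re.compile('([A-Z][a-z]+) ([0-9]{1,2}),? ([0-9]{4})')

# ...then reuse the compiled object on every record.
records = ['Davison/May 22, 2010/1721.3', 'Pertwee/Aug 4, 2010/1731.3']
dates = [date_re.search(record).groups() for record in records]
print(dates)
```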

Created by Greg Wilson, June 2010