Regular languages, regular expressions, & finite automata (intro)


1 Regular languages, regular expressions, & finite automata (intro)
CS 350, Fall 2018. gilray.org/classes/fall2018/cs350/

2 L = {hello, bonjour, konnichiwa, …}
Σ = {a, b, c, …, y, z}

3 Σ = {a, b, c, …, y, z}; Σ* = { M_1 … M_k | k ∈ ℕ ∧ ∀i. M_i ∈ Σ }, where k = 0 gives the empty string ε

4 Σ = {a, b, c, …, y, z}; Σ* = { M_1 … M_k | k ∈ ℕ ∧ ∀i. M_i ∈ Σ }
Examples: abcabc ∈ Σ*, ε ∈ Σ* (the case k = 0), aaaa… ∉ Σ* (every string in Σ* is finite)
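To make Σ* concrete, here is a minimal Python sketch. The helper names `in_sigma_star` and `strings_up_to` are illustrative and not from the slides. Any Python string is finite, so membership reduces to checking each character against the alphabet.

```python
from itertools import product

SIGMA = "abcdefghijklmnopqrstuvwxyz"  # the alphabet from the slide

def in_sigma_star(s, alphabet=SIGMA):
    """True when every character of the (finite) string s comes from alphabet."""
    return all(ch in alphabet for ch in s)

def strings_up_to(alphabet, max_len):
    """Yield every string over alphabet of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

print(in_sigma_star("abcabc"))        # True
print(in_sigma_star(""))              # True: the empty string is in Σ*
print(list(strings_up_to("ab", 2)))   # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Σ* itself is infinite, so the generator only enumerates a finite prefix of it.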

5 Regular languages

6 Regular expressions: E

7 ∀a ∈ Σ. ℒ(a) = {a}

8 ∀a ∈ Σ. ℒ(a) = {a}: the interpretation of the regex “a” is the singleton set containing just the string “a”.

9 All characters in the alphabet are regular expressions: ∀a ∈ Σ. ℒ(a) = {a}

10 There is also a regex for the empty (null) language, Ø, and one for the empty-string language, ε: ℒ(Ø) = {} and ℒ(ε) = {ε}.

11 Composite regular expressions are built from other regular expressions, bottoming out at the null, empty-string, or single-character REs. A minimal and sufficient set of composite forms is: disjunction of REs, concatenation of REs, and Kleene star of REs.

12 Ø ∈ E, ε ∈ E, a ∈ Σ ⟹ a ∈ E
e ∈ E ⟹ (e) ∈ E
e_0 ∈ E ∧ e_1 ∈ E ⟹ e_0 e_1 ∈ E
e_0 ∈ E ∧ e_1 ∈ E ⟹ e_0 | e_1 ∈ E
e ∈ E ⟹ e* ∈ E

13 e ∈ E ::= M | Ø | ε | (e) | e_0 e_1 | e_0 + e_1 | e*, where M ∈ Σ = {…}
There are three base cases defining regexes (a single character M, Ø, and ε) and four inductive cases. Both “|” and “+” are commonly used to signify disjunction in regexes.

14

15 Interpreting Regexes

16 Precedence, from tightest to loosest binding: Kleene star (*), concatenation (ab), then disjunction (a|b). Thus, a|bc|bcd* is the same as (a)|(bc)|(bc(d*)).
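Python's `re` module follows the same precedence, so the claim can be spot-checked. This is a small sketch on hand-picked sample strings, not a proof of equivalence.

```python
import re

# The regex from the slide, and the same regex with every grouping made explicit.
implicit = re.compile(r"a|bc|bcd*")
explicit = re.compile(r"(a)|(bc)|(bc(d*))")

samples = ["a", "bc", "bcd", "bcddd", "abc", "bcbcd", "ad", ""]

# fullmatch requires the whole string to be in the language.
accepted = [s for s in samples if implicit.fullmatch(s)]
agree = all(
    bool(implicit.fullmatch(s)) == bool(explicit.fullmatch(s)) for s in samples
)

print(accepted)  # ['a', 'bc', 'bcd', 'bcddd']
print(agree)     # True
```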

17 Juxtaposition is language concatenation, disjunction is language union, and Kleene star is interpreted as Kleene closure:
ℒ(a) = {a}
ℒ(ε) = {ε}
ℒ(Ø) = {}
ℒ(e_0 e_1) = ℒ(e_0) ∘ ℒ(e_1)
ℒ(e_0 | e_1) = ℒ(e_0) ∪ ℒ(e_1)
ℒ(e_0*) = ℒ(e_0)*
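The interpretation function ℒ can be written as a short recursive evaluator. Because ℒ(e) may be infinite, this sketch computes only the strings of length at most n. The class names (`Null`, `Eps`, `Char`, `Cat`, `Alt`, `Star`) and the function `lang` are illustrative choices, not part of the slides.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Null:            # Ø
    pass

@dataclass(frozen=True)
class Eps:             # ε
    pass

@dataclass(frozen=True)
class Char:            # a single character of the alphabet
    c: str

@dataclass(frozen=True)
class Cat:             # e_0 e_1
    left: object
    right: object

@dataclass(frozen=True)
class Alt:             # e_0 | e_1
    left: object
    right: object

@dataclass(frozen=True)
class Star:            # e*
    inner: object

def lang(e, n):
    """Return the strings of length <= n in the language of regex e."""
    if isinstance(e, Null):
        return set()
    if isinstance(e, Eps):
        return {""}
    if isinstance(e, Char):
        return {e.c} if n >= 1 else set()
    if isinstance(e, Alt):
        return lang(e.left, n) | lang(e.right, n)
    if isinstance(e, Cat):
        return {s + t for s in lang(e.left, n) for t in lang(e.right, n)
                if len(s) + len(t) <= n}
    if isinstance(e, Star):
        base = lang(e.inner, n)
        result = {""}
        while True:  # grow until no new bounded strings appear
            bigger = result | {s + t for s in result for t in base
                               if len(s) + len(t) <= n}
            if bigger == result:
                return result
            result = bigger
    raise TypeError(f"not a regex: {e!r}")

# a|bc|bcd* built explicitly as (a)|(bc)|(bc(d*))
bc = Cat(Char("b"), Char("c"))
example = Alt(Alt(Char("a"), bc), Cat(bc, Star(Char("d"))))
print(sorted(lang(example, 4)))  # ['a', 'bc', 'bcd', 'bcdd']
```

Bounding by length keeps every set finite, which is what lets the `Star` loop terminate.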

18 Language concatenation: L ∘ L′ = { s_0 s_1 | s_0 ∈ L ∧ s_1 ∈ L′ }
Kleene closure of a language: L* = { s_1 s_2 … s_k | k ∈ ℕ ∧ s_i ∈ L }, where k = 0 gives ε, so ε ∈ L* for every L

19 Kleene closure can also be defined as a fixed point!
ℒ(e)* = L, where L is the least set satisfying f_e(L) = L, and f_e(L) = (L ∘ ℒ(e)) ∪ {ε}

20 This means Kleene star is idempotent: (L*)* = L*
Recall: ℒ(e)* = L, where f_e(L) = L and f_e(L) = (L ∘ ℒ(e)) ∪ {ε}
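The fixed-point view suggests an iterative computation: start from the empty set and apply f repeatedly until nothing changes. The sketch below restricts every set to strings of length at most n so the iteration terminates. The names `concat` and `kleene_star` are illustrative.

```python
def concat(L1, L2, n):
    """Language concatenation, keeping only results of length <= n."""
    return {s + t for s in L1 for t in L2 if len(s) + len(t) <= n}

def kleene_star(L, n):
    """Least fixed point of f(X) = (X ∘ L) ∪ {ε}, restricted to length <= n."""
    X = set()
    while True:
        nxt = concat(X, L, n) | {""}
        if nxt == X:      # f(X) = X: fixed point reached
            return X
        X = nxt

print(sorted(kleene_star({"0"}, 3)))  # ['', '0', '00', '000']

# Idempotence, checked on one bounded example: (L*)* = L*
star = kleene_star({"ab", "c"}, 4)
print(kleene_star(star, 4) == star)   # True
```

Starting the iteration from the empty set is what makes the result the least fixed point rather than some larger one.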

21 Try an example: L is the language of odd-length strings of zeros. Give a regex for L. L = { s_1 … s_{2k+1} | k ∈ ℕ ∧ s_i = 0 }

22 Try an example: L is the language of odd-length strings of zeros. Give a regex for L. Answer: 0(00)*
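A quick way to gain confidence in the answer is to test it on strings of zeros of each small length. This sketch uses Python's `re.fullmatch`; the helper name `is_odd_zeros` is illustrative.

```python
import re

ODD_ZEROS = re.compile(r"0(00)*")

def is_odd_zeros(s):
    """True when the whole string s matches 0(00)*."""
    return ODD_ZEROS.fullmatch(s) is not None

# Strings of k zeros should match exactly when k is odd.
results = {k: is_odd_zeros("0" * k) for k in range(8)}
print(results[1], results[2], results[7])  # True False True
print(is_odd_zeros("010"))                 # False: contains a 1
```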

23 Try an example: L is the language of all strings over alphabet {0,1} where every 1 has an adjacent 1.

24 Try an example: L is the language of all strings over alphabet {0,1} where every 1 has an adjacent 1. (0|111*)* = (0|11+)*, where e+ is syntactic sugar for ee*.
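The regex can be checked against a direct reading of the English definition by brute force over all short binary strings. The predicate `every_one_has_neighbor` is an illustrative helper, and the length bound of 8 is arbitrary.

```python
import re
from itertools import product

PLAIN = re.compile(r"(0|111*)*")
SUGARED = re.compile(r"(0|11+)*")   # same language, written with e+

def every_one_has_neighbor(s):
    """Direct check: each '1' has a '1' immediately to its left or right."""
    for i, ch in enumerate(s):
        if ch == "1":
            left = i > 0 and s[i - 1] == "1"
            right = i + 1 < len(s) and s[i + 1] == "1"
            if not (left or right):
                return False
    return True

mismatches = []
for n in range(9):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        expected = every_one_has_neighbor(s)
        for rx in (PLAIN, SUGARED):
            if (rx.fullmatch(s) is not None) != expected:
                mismatches.append((rx.pattern, s))

print(mismatches)  # []
```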

25 Try an example: L is the language of odd decimal integers greater than zero. Give a regex for L.

26 Try an example: L is the language of odd decimal integers greater than zero. Give a regex for L. (1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*(1|3|5|7|9)|1|3|5|7|9

27 Try an example: L is the language of odd decimal integers greater than zero. Give a regex for L. [1-9][0-9]*(1|3|5|7|9)|1|3|5|7|9, where [a-g] is a character class, syntactic sugar for (a|b|c|d|e|f|g).

28 Try an example: L is the language of odd decimal integers greater than zero. Give a regex for L. ([1-9][0-9]*)?(1|3|5|7|9), where e? is syntactic sugar for (e|ε).
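The three versions of this regex should describe the same language. The sketch below compares each of them with a reference check on a range of candidate strings. It assumes, as the regexes themselves do, that numerals carry no leading zeros. The name `is_odd_positive` is illustrative.

```python
import re

SPELLED_OUT = re.compile(
    r"(1|2|3|4|5|6|7|8|9)(0|1|2|3|4|5|6|7|8|9)*(1|3|5|7|9)|1|3|5|7|9")
WITH_CLASSES = re.compile(r"[1-9][0-9]*(1|3|5|7|9)|1|3|5|7|9")
WITH_OPTION = re.compile(r"([1-9][0-9]*)?(1|3|5|7|9)")

def is_odd_positive(s):
    """Reference check: ASCII digits only, no leading zero, odd value."""
    return s.isascii() and s.isdigit() and s[0] != "0" and int(s) % 2 == 1

candidates = [str(n) for n in range(2000)] + ["", "007", "13a", "-3"]
mismatches = [
    (rx.pattern, s)
    for s in candidates
    for rx in (SPELLED_OUT, WITH_CLASSES, WITH_OPTION)
    if (rx.fullmatch(s) is not None) != is_odd_positive(s)
]
print(mismatches)  # []
```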

29 Regexes in Python (import re)
r'a' denotes a
r'b' denotes b
r'ab' denotes ab
r'a|b' denotes a|b

30 Regexes in Python (import re)
r'a*' denotes a*
r'b+' denotes b+
r'c?' denotes c?

31 Regexes in Python (import re)
r'\d' denotes (0|1|2|3|4|5|6|7|8|9) on ASCII input
r'[0-9]' denotes (0|1|2|3|4|5|6|7|8|9)
r'z{2,4}' denotes (zz|zzz|zzzz)
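These equivalences can be checked directly. The sketch also shows one caveat: for `str` patterns, Python's `\d` matches any Unicode decimal digit unless the `re.ASCII` flag is given, so it coincides with `[0-9]` only on ASCII input.

```python
import re

# r'z{2,4}' against its expansion (zz|zzz|zzzz), on z-strings of length 0..6.
for k in range(7):
    s = "z" * k
    sugar = re.fullmatch(r"z{2,4}", s) is not None
    expanded = re.fullmatch(r"zz|zzz|zzzz", s) is not None
    assert sugar == expanded == (2 <= k <= 4)

arabic_three = "\u0663"  # ARABIC-INDIC DIGIT THREE, a non-ASCII decimal digit
unicode_digit = re.fullmatch(r"\d", arabic_three) is not None
class_digit = re.fullmatch(r"[0-9]", arabic_three) is not None
ascii_digit = re.fullmatch(r"\d", arabic_three, flags=re.ASCII) is not None
print(unicode_digit, class_digit, ascii_digit)  # True False False
```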

32 Regexes in Python (import re)
>>> m = re.match(r'\d', '5')
>>> m.group(0)
'5'

33 Regexes in Python (import re)
>>> m = re.match(r'\d', '')
>>> m is None
True

34 Regexes in Python (import re)
>>> m = re.match(r'(\d)\d\d', '456')
>>> m.group(0)
'456'
>>> m.group(1)
'4'

35 Regexes in Python (import re)
>>> m = re.match(r'(\d)\d\d', '4567')
>>> m.group(0)
'456'
>>> m.group(1)
'4'

36 Regexes in Python (import re)
>>> m = re.match(r'(\d)\d\d', '4567')
>>> m.group(0)
'456'
>>> m.group(1)
'4'
>>> m = re.match(r'^(\d)\d\d$', '4567')
>>> m is None
True
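As the last example shows, `re.match` only anchors at the start of the string, so testing membership in a regular language needs an end anchor too. A sketch of the alternatives in Python's standard `re` module follows. `re.fullmatch` is not used in the slides and is offered here as one option.

```python
import re

pattern = r"(\d)\d\d"

prefix = re.match(pattern, "4567")       # anchored at the start only
anywhere = re.search(pattern, "x4567")   # scans for the first match
whole = re.fullmatch(pattern, "4567")    # must consume the entire string

print(prefix.group(0))    # 456
print(anywhere.group(0))  # 456
print(whole)              # None

# '$' also matches just before a trailing newline, so '^...$' is
# slightly looser than fullmatch.
print(re.match(r"^(\d)\d\d$", "456\n") is not None)   # True
print(re.fullmatch(pattern, "456\n") is not None)     # False
```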

37 Finite Automata

38 Every automaton has a set of states, one of which must be the designated start state. This state is marked by an incoming arrow. [Diagram: state q0 with an incoming arrow]

39 There may also be zero or more final states, also called accept states. These are shown with an extra circle around them. [Diagram: states q0 and q1]

40 The starting state may also be an accept state. [Diagram: state q0]

41 These are also two of the simplest languages: ℒ(Ø) and ℒ(ε). [Diagram: two single-state automata, one per language, each with start state q0]

42 Edges are labeled with characters from Σ. This DFA is equivalent to the RE a: it reads the character a and then accepts. [Diagram: q0 to q1 on an edge labeled a]

43 This encodes the language {a, ab}. [Diagram: states q0, q1, q2; an edge labeled b is visible]

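44 This encodes the language a+b* (one or more a's followed by zero or more b's; here + is the postfix "one or more" operator). [Diagram: states q0, q1, q2 with edges labeled a and b]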

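45 Edges not shown implicitly reach a dead state. [Diagram: states q0, q1, q2 plus an explicit state "dead"; the previously implicit edges, labeled b and a, lead to it, and it loops to itself on a,b]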
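The dead-state convention maps naturally onto a transition table with a default. The diagram did not survive extraction, so the table below is an assumed reconstruction of a DFA for a+b* (one or more a's, then zero or more b's), with q1 and q2 as accept states. It is cross-checked against Python's `re` on all short strings.

```python
import re
from itertools import product

START = "q0"
ACCEPTING = {"q1", "q2"}

# Only the edges one would draw; every missing edge goes to "dead".
DELTA = {
    ("q0", "a"): "q1",
    ("q1", "a"): "q1",
    ("q1", "b"): "q2",
    ("q2", "b"): "q2",
}

def accepts(s):
    """Run the DFA on s; 'dead' has no outgoing entries, so it is absorbing."""
    state = START
    for ch in s:
        state = DELTA.get((state, ch), "dead")
    return state in ACCEPTING

# Compare against the regex on every string over {a, b} of length <= 7.
mismatches = [
    "".join(chars)
    for n in range(8)
    for chars in product("ab", repeat=n)
    if accepts("".join(chars)) != (re.fullmatch(r"a+b*", "".join(chars)) is not None)
]
print(mismatches)  # []
```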

46 Try an example: L is the language of all strings over {0,1} where there are an even number of 1s.

47 Try an example: L is the language of all strings over {0,1} where there are an even number of 1s. [Diagram: states q0 and q1 joined by two edges labeled 1]
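A two-state parity DFA can be sketched as a table. The self-loops on 0 and the choice of q0 as the accept state are assumptions, since only the edges labeled 1 survive in the transcript. The regex used for cross-checking is not from the slides.

```python
import re
from itertools import product

# q0 = even number of 1s seen so far (start, accepting); q1 = odd.
DELTA = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q1", ("q1", "1"): "q0",
}

def accepts(s):
    """Run the parity DFA on a string over {0, 1}."""
    state = "q0"
    for ch in s:
        state = DELTA[(state, ch)]
    return state == "q0"

# One possible regex for the same language, used only as a cross-check.
EVEN_ONES = re.compile(r"(0*10*1)*0*")

mismatches = []
for n in range(9):
    for bits in product("01", repeat=n):
        s = "".join(bits)
        expected = s.count("1") % 2 == 0
        if accepts(s) != expected or (EVEN_ONES.fullmatch(s) is not None) != expected:
            mismatches.append(s)

print(mismatches)  # []
```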

48 Try an example: L is the language of all strings over {0,1,2} where there are an odd number of 1s and no 2s.

49 Try an example: L is the language of all strings over {0,1,2} where there are an odd number of 1s and no 2s. [Diagram: states q0 and q1 joined by two edges labeled 1]
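Here the implicit dead state does real work: any 2 sends the automaton there for good. The sketch below writes the transition function as a total function with an explicit "dead" state. The 0 self-loops and q1 as the accept state are assumptions, as the transcript keeps only the edges labeled 1.

```python
from itertools import product

ALPHABET = "012"
START = "q0"
ACCEPTING = {"q1"}

def step(state, ch):
    """Total transition function: q0/q1 track the parity of 1s, 'dead' absorbs any 2."""
    if state == "dead" or ch == "2":
        return "dead"
    if ch == "1":
        return "q1" if state == "q0" else "q0"
    return state  # ch == "0" leaves the parity unchanged

def accepts(s):
    state = START
    for ch in s:
        if ch not in ALPHABET:
            raise ValueError(f"character {ch!r} is not in the alphabet")
        state = step(state, ch)
    return state in ACCEPTING

# Brute-force check against the English definition, up to length 6.
mismatches = [
    "".join(chars)
    for n in range(7)
    for chars in product(ALPHABET, repeat=n)
    if accepts("".join(chars)) != ("2" not in chars and chars.count("1") % 2 == 1)
]
print(mismatches)  # []
```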

50 Deterministic Finite Automata (DFA)
Non-deterministic Finite Automata (NFA)

51 Equivalent models of regular languages
[Diagram: RE, NFA, DFA, and GNFA linked by arrows labeled “Converts to”, with one arrow labeled “Minimizes to”]
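One of these conversions, NFA to DFA, can be sketched with the standard subset construction. The slide does not say which algorithm it has in mind, so treat this as one possible reading. The example NFA (strings over {0,1} ending in 01), its state names, and the helper names are all hypothetical, and epsilon moves are not handled.

```python
from itertools import product

def nfa_to_dfa(nfa_delta, start, accepting, alphabet):
    """Subset construction for an NFA without epsilon moves.

    Each DFA state is a frozenset of NFA states; the empty frozenset,
    if reached, plays the role of the dead state.
    """
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    worklist = [start_set]
    while worklist:
        current = worklist.pop()
        for ch in alphabet:
            target = frozenset(
                t for q in current for t in nfa_delta.get((q, ch), ())
            )
            dfa_delta[(current, ch)] = target
            if target not in seen:
                seen.add(target)
                worklist.append(target)
    dfa_accepting = {S for S in seen if S & accepting}
    return dfa_delta, start_set, dfa_accepting

def dfa_accepts(dfa, s):
    delta, state, accepting = dfa
    for ch in s:
        state = delta[(state, ch)]
    return state in accepting

# Hypothetical NFA: guess where the final "01" begins.
NFA_DELTA = {
    ("p0", "0"): {"p0", "p1"},
    ("p0", "1"): {"p0"},
    ("p1", "1"): {"p2"},
}
dfa = nfa_to_dfa(NFA_DELTA, "p0", {"p2"}, "01")

mismatches = [
    "".join(bits)
    for n in range(9)
    for bits in product("01", repeat=n)
    if dfa_accepts(dfa, "".join(bits)) != "".join(bits).endswith("01")
]
print(mismatches)                    # []
print(len({S for (S, _) in dfa[0]})) # 3 reachable DFA states
```

Building only the reachable subsets keeps the DFA small here, although in the worst case the construction can produce exponentially many states.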

