1
Languages and Machines Unit two: Regular languages and Finite State Automata
2
Review of week one
A language is a set of strings (the set of different things you can say). The set may be infinite.
A string is a finite sequence of symbols; its length may be zero.
A symbol is just some mark on the page or screen.
A language has a finite alphabet of symbols.
3
Review of week one
In a context-dependent language, the meaning of a phrase depends on the context.
In a context-sensitive language, the structure of a phrase depends on the context.
Most natural languages are context-dependent but not context-sensitive.
A context-free language is one where the structure of a phrase is always the same, independent of context.
A regular language is a context-free language which has simple rules for forming valid strings (e.g. "94", "getWidth()").
4
Classes of formal language
[Figure: nested classes of formal language, innermost first: regular, context-free, context-sensitive, phrase structure.]
5
Regular languages
Here are examples of strings from a regular language with alphabet {a,b}:
a, b, ab, aaaaa, ababab
6
Regular languages
1. The empty set is a regular language.
2. The set consisting of the empty string (ε) is a regular language.
3. The set consisting of a one-symbol string is a regular language.
4. A new regular language can be made by taking a string from a regular language and concatenating it with a string from a regular language.
5. A new regular language can be made by taking the union of two regular languages.
7
Recognizing regular languages
Regular languages can be recognized and interpreted by a finite-state machine.
For example, here is a machine to recognize a two-bit string:
[Figure: state diagram with transitions labelled 0 and 1, ending in an acceptor state.]
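The two-bit recognizer can be sketched as a transition table. This is a minimal illustration, not a transcription of the slide's diagram: the state names `start`, `one_bit` and `two_bits` are hypothetical, since the figure does not name its states.

```python
# Hypothetical state names; the slide's diagram does not label its states.
TRANSITIONS = {
    ("start", "0"): "one_bit",
    ("start", "1"): "one_bit",
    ("one_bit", "0"): "two_bits",
    ("one_bit", "1"): "two_bits",
}
ACCEPTING = {"two_bits"}


def accepts(string):
    """Return True if the machine finishes in an acceptor state."""
    state = "start"
    for symbol in string:
        state = TRANSITIONS.get((state, symbol))
        if state is None:  # no transition for this symbol: reject
            return False
    return state in ACCEPTING


print([s for s in ["", "0", "01", "10", "011", "0a"] if accepts(s)])
# ['01', '10']
```

A missing table entry plays the role of the failure paths that the diagram leaves out.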
8
Regular expressions
Wouldn't it be nice if we had a compact way of specifying a regular language?
We have! It's a special notation called a regular expression.
9
Examples of regular languages
1. the set of all two-symbol strings made from the letters a and b: (a|b)^2
2. the set of all two-bit strings: (0|1)^2
3. the set of all possible words: (a|..|z)+
4. the set of all decimal integers: (0|(1|..|9)(0|..|9)*)
5. the set of Java identifiers: JavaLetter JavaLetterOrDigit*
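These five expressions can be tried out with Python's `re` module, whose syntax differs slightly from the slide notation (`{2}` instead of a superscript 2, character classes instead of `a|..|z`). The pattern for Java identifiers is an assumption: it covers only ASCII letters, digits, `_` and `$`, which is narrower than the real JavaLetter and JavaLetterOrDigit classes.

```python
import re

# Slide notation rewritten as Python `re` patterns (ASCII-only approximations).
PATTERNS = {
    "two_symbols": r"(a|b){2}",
    "two_bits": r"(0|1){2}",
    "word": r"[a-z]+",
    "decimal_integer": r"0|[1-9][0-9]*",
    # Assumption: identifiers restricted to ASCII letters, digits, _ and $.
    "java_identifier": r"[A-Za-z_$][A-Za-z0-9_$]*",
}


def in_language(name, string):
    """True if the whole string belongs to the named language."""
    return re.fullmatch(PATTERNS[name], string) is not None


print(in_language("decimal_integer", "94"))    # True
print(in_language("decimal_integer", "094"))   # False: leading zero
print(in_language("java_identifier", "getWidth"))  # True
```

`re.fullmatch` is used rather than `re.search` because membership in a language is a statement about the whole string, not about some substring.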
10
More examples of regular languages
1. all the possible three-bit strings: (0|1)^3
2. all the single-digit decimal numbers: (0|1|2|3|4|5|6|7|8|9), abbreviated (0|..|9)
3. all the possible repetitions of the traffic-light sequence (red, amber, green, amber): (red amber green amber)*
11
Activity
Write down the regular expressions denoting the following regular languages:
The language with two strings "the cat" and "the mat"
Arithmetic expressions with two operands, e.g. 1 + 2, 3 × 4. The allowed operators are: +, -, ×, ÷. The allowed operands are: single-digit decimal numbers.
The language consisting of all possible binary strings
The language of HTML tags such as
12
Suggested Answers
The language with two strings "the cat" and "the mat": the (cat | mat) or the (c|m)at
Arithmetic expressions with two operands, e.g. 1 + 2, 3 × 4: (0|..|9) (+|-|×|÷) (0|..|9)
The language consisting of all possible binary strings: (0|1)*
The language of HTML tags such as
13
A cautionary note
You have been using a metalanguage!
The regular expression strings form a language having terminal symbols ( ) + * | plus literal symbols, e.g. a stands for the letter a.
This can cause problems when the metalanguage and the language get confused, e.g. the language consisting of strings of one to three vertical bars:
| | || | |||
14
A cautionary note
We can fix this by some ghastly escape convention, e.g. convert the above to "|" | "||" | "|||"
Now we have problems with the quote symbol!
The best idea is to choose metalanguage symbols which are rarely encountered in the language being described, and use bold-face or color to distinguish them.
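Practical regular-expression libraries face the same clash and mostly settle for an escape convention. As an illustration only, here is how Python's `re` module handles the vertical-bar language: it uses a backslash rather than the quotes suggested above, and `is_bars` is a hypothetical helper name.

```python
import re

# The language {"|", "||", "|||"}. The bar must be escaped because it is
# also the alternation operator of the metalanguage.
ONE_TO_THREE_BARS = re.compile(r"\|{1,3}")


def is_bars(string):
    """True if the string is one, two or three vertical bars."""
    return ONE_TO_THREE_BARS.fullmatch(string) is not None


# Unescaped, "|" means "empty string or empty string", so it does not
# match a literal bar at all.
print(re.fullmatch("|", "|"))              # None
print(is_bars("||"))                       # True

# re.escape produces the escaped form of a literal string.
print(re.fullmatch(re.escape("||"), "||") is not None)  # True
```

The backslash then has the same problem the quote symbol had: a literal backslash must itself be escaped.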
15
Regular languages and regular expressions
Each rule for regular languages, followed by the corresponding regular expression:
1. the empty set: ∅
2. the set consisting of the empty string: ε
3. the set consisting of a one-symbol string (e.g. "a"): a
4. a new regular language can be made by taking a string from a regular language and concatenating it with a string from a regular language: a b
5. a new regular language can be made by taking the union of two regular languages: a | b
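The five rules can be mirrored directly with Python sets of strings. This is a small sketch for finite languages only; the function names `concat` and `union` are chosen here for illustration.

```python
def concat(lang1, lang2):
    """Rule 4: every string of lang1 followed by every string of lang2."""
    return {s + t for s in lang1 for t in lang2}


def union(lang1, lang2):
    """Rule 5: strings that belong to either language."""
    return lang1 | lang2


EMPTY_SET = set()      # rule 1: no strings at all
EMPTY_STRING = {""}    # rule 2: one string, of length zero
A = {"a"}              # rule 3: a single one-symbol string
B = {"b"}

print(sorted(concat(A, B)))   # ['ab']
print(sorted(union(A, B)))    # ['a', 'b']
```

Note the difference between rules 1 and 2: concatenating with `EMPTY_STRING` leaves a language unchanged, while concatenating with `EMPTY_SET` produces the empty set.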
16
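Regular languages and regular expressions
The other ways of forming regular expressions are just shorthand:
a^0 = ε
a^1 = a
a^2 = aa
a* = ε | a | aa | aaa | ...
a+ = a | aa | aaa | ...
Note that a* and a+ stand for infinite unions, so unlike the powers they cannot be expanded in a finite number of steps; formal definitions add closure under * as a further rule.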
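The shorthand can be checked on small cases by treating a language as a Python set of strings. Because a* is infinite, the sketch below only builds a finite approximation up to a chosen number of repetitions; `power`, `star_up_to` and `plus_up_to` are hypothetical helper names.

```python
def power(lang, n):
    """lang^n: n strings from lang concatenated; lang^0 is {""}."""
    result = {""}
    for _ in range(n):
        result = {s + t for s in result for t in lang}
    return result


def star_up_to(lang, max_reps):
    """Finite approximation of lang*: lang^0 | lang^1 | ... | lang^max_reps."""
    result = set()
    for n in range(max_reps + 1):
        result |= power(lang, n)
    return result


def plus_up_to(lang, max_reps):
    """Finite approximation of lang+: as star_up_to, but without lang^0."""
    result = set()
    for n in range(1, max_reps + 1):
        result |= power(lang, n)
    return result


print(sorted(star_up_to({"a"}, 3)))   # ['', 'a', 'aa', 'aaa']
print(sorted(plus_up_to({"a"}, 3)))   # ['a', 'aa', 'aaa']
```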
17
Regular languages and regular expressions
Brackets are used to show precedence of the operations: compare (a | b)* with a | b*
Default precedence, highest first:
*, + or ^n
concatenation
|
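Python's `re` module follows the same precedence, so the effect of the brackets can be observed directly. The helper `matches` is a name chosen for this sketch.

```python
import re


def matches(pattern, string):
    """True if the whole string is in the language of the pattern."""
    return re.fullmatch(pattern, string) is not None


# (a|b)* : any string of a's and b's.
# a|b*   : parsed as a | (b*), i.e. a single "a" or a run of b's.
for s in ["", "a", "bbb", "abab"]:
    print(repr(s), matches(r"(a|b)*", s), matches(r"a|b*", s))

# Repetition also binds tighter than concatenation:
print(matches(r"ab*", "abb"), matches(r"(ab)*", "abb"))   # True False
```

Only the last test string, "abab", separates the first two expressions: it is in (a | b)* but not in a | b*.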
18
Activity
Give examples of strings in the following languages:
1. (x | y | z)^3
2. x | y | z*
3. a b^2
4. (a b)^2
19
Suggested Answers
Give examples of strings in the following languages:
1. (x | y | z)^3: xzy
2. x | y | z*
3. a b^2: abb
4. (a b)^2: abab
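Answers to exercises like these can be checked by brute force: generate every string over the alphabet up to some length and keep those that match. The sketch below does this with Python's `re` module, writing the powers as `{3}` and `{2}`; `language_sample` is a hypothetical helper name.

```python
import re
from itertools import product


def language_sample(pattern, alphabet, max_len):
    """All strings over `alphabet`, up to `max_len` long, that match `pattern`."""
    found = []
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            candidate = "".join(symbols)
            if re.fullmatch(pattern, candidate):
                found.append(candidate)
    return found


print(len(language_sample(r"(x|y|z){3}", "xyz", 3)))   # 27
print(language_sample(r"x|y|z*", "xyz", 3))   # ['', 'x', 'y', 'z', 'zz', 'zzz']
print(language_sample(r"ab{2}", "ab", 4))     # ['abb']
print(language_sample(r"(ab){2}", "ab", 4))   # ['abab']
```

The second line of output lists the shortest strings in x | y | z*, including the empty string, which comes from z*.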
20
From Regular Expressions to Finite State Automata
1. It is an amazing fact that any regular expression has an equivalent finite state automaton which recognizes the language it denotes.
2. And every finite state automaton recognizes the language of some regular expression.
We will prove these propositions later.
21
Finite State Machines
An FSM to add two binary numbers.
[Figure: state diagram with states A to F. Callouts: start state, transition, input symbol, output symbol, end state.]
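A finite state machine with output symbols can be sketched as a table mapping (state, input) to (next state, output). The example below is a standard two-state serial adder, where the state is the current carry. It is an assumption made for illustration and not a reconstruction of the six-state machine in the figure.

```python
def build_adder():
    """Table: (carry, (bit_a, bit_b)) -> (next_carry, output_bit)."""
    table = {}
    for carry in (0, 1):
        for a in (0, 1):
            for b in (0, 1):
                total = a + b + carry
                table[(carry, (a, b))] = (total // 2, total % 2)
    return table


ADDER = build_adder()


def add_binary(x, y):
    """Add two binary strings written most significant bit first."""
    width = max(len(x), len(y)) + 1   # one extra column flushes the last carry
    x, y = x.zfill(width), y.zfill(width)
    state, out = 0, []                # start state: no carry
    # The machine reads one pair of input bits per step, lowest bits first.
    for a, b in zip(reversed(x), reversed(y)):
        state, bit = ADDER[(state, (int(a), int(b)))]
        out.append(str(bit))
    return "".join(reversed(out)).lstrip("0") or "0"


print(add_binary("101", "11"))   # 1000
```

Each input symbol here is a pair of bits, and each transition emits one output bit, which is what distinguishes this kind of machine from a pure recognizer.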
22
Finite state automata
These are simple machines with no output symbols.
They can only recognize strings of input symbols.
Acceptance is shown by a special state.
23
NFAs
The kind of finite state automata we shall be using are called nondeterministic finite automata.
"Nondeterministic" means we can do naughty things like:
have a transition without a symbol label
have two exit transitions with the same symbol
not show the paths which lead to failure
24
Example of an NFA
What regular language does this NFA represent?
[Figure: NFA with transitions labelled a, b and c.]
Answer: a b | a b c | a+
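An NFA can be simulated by tracking the set of all states it might be in. The sketch below encodes one possible NFA for a b | a b c | a+; the state numbers and the exact layout are assumptions, since the slide's diagram is not recoverable from the text. It uses two features of nondeterminism: several exits on the same symbol, and a transition without a symbol label (`EPS`).

```python
EPS = None  # label for a transition without a symbol

# Hypothetical NFA for  a b | a b c | a+ ; state numbers are arbitrary.
NFA = {
    (0, "a"): {1, 3, 6},   # three exits from state 0 on the same symbol
    (1, "b"): {2},         # branch for "ab"
    (3, "b"): {4},         # branch for "abc"
    (4, "c"): {5},
    (6, EPS): {7},         # branch for a+: loop back without reading input
    (7, "a"): {6},
}
START = 0
ACCEPTING = {2, 5, 6}


def epsilon_closure(states):
    """All states reachable from `states` using only unlabelled transitions."""
    stack, closure = list(states), set(states)
    while stack:
        state = stack.pop()
        for nxt in NFA.get((state, EPS), ()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def accepts(string):
    current = epsilon_closure({START})
    for symbol in string:
        moved = set()
        for state in current:
            moved |= NFA.get((state, symbol), set())
        current = epsilon_closure(moved)   # empty set: every path has failed
    return bool(current & ACCEPTING)


print([s for s in ["", "a", "aaa", "ab", "abc", "abab", "b"] if accepts(s)])
# ['a', 'aaa', 'ab', 'abc']
```

Missing entries in the table correspond to the failure paths that an NFA diagram is allowed to omit.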
25
Examples of conversion from REs to NFAs
(a b)^2
a b^2
(a | b)^2
(a | b)*
[Figure: one NFA diagram per expression, with transitions labelled a and b.]
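One systematic way to carry out such conversions is to build a small NFA fragment for each symbol and glue fragments together with unlabelled transitions, in the style of Thompson's construction. The slides do not name a method, so the sketch below is an assumption about how it could be done; `Fragment`, `symbol`, `concat`, `union` and `star` are hypothetical names.

```python
from itertools import count

EPS = None          # label for a transition without a symbol
_ids = count()      # supplies fresh state numbers


class Fragment:
    """An NFA piece with one start state and one accepting state."""

    def __init__(self):
        self.start, self.end = next(_ids), next(_ids)
        self.edges = []   # (source, label, target) triples


def symbol(ch):
    f = Fragment()
    f.edges = [(f.start, ch, f.end)]
    return f


def concat(left, right):
    f = Fragment()
    f.edges = left.edges + right.edges + [
        (f.start, EPS, left.start), (left.end, EPS, right.start),
        (right.end, EPS, f.end)]
    return f


def union(left, right):
    f = Fragment()
    f.edges = left.edges + right.edges + [
        (f.start, EPS, left.start), (f.start, EPS, right.start),
        (left.end, EPS, f.end), (right.end, EPS, f.end)]
    return f


def star(inner):
    f = Fragment()
    f.edges = inner.edges + [
        (f.start, EPS, inner.start), (inner.end, EPS, f.end),
        (f.start, EPS, f.end),             # zero repetitions
        (inner.end, EPS, inner.start)]     # another repetition
    return f


def accepts(frag, string):
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            state = stack.pop()
            for src, label, dst in frag.edges:
                if src == state and label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({frag.start})
    for ch in string:
        current = closure({dst for src, label, dst in frag.edges
                           if src in current and label == ch})
    return frag.end in current


def ab():
    """A fresh fragment for the expression a b."""
    return concat(symbol("a"), symbol("b"))


ab_squared = concat(ab(), ab())                    # (a b)^2
a_or_b_star = star(union(symbol("a"), symbol("b")))  # (a | b)*
print(accepts(ab_squared, "abab"), accepts(a_or_b_star, "abba"))  # True True
```

Each fragment must be built afresh, as `ab()` does, because reusing one fragment object twice would make two parts of the automaton share states.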
26
Activity
Convert the following regular expressions to NFAs:
1. JavaLetter JavaLetterOrDigit*
2. (red amber green amber)*
Convert the following NFAs to REs:
3. [Figure: NFA with transitions labelled a and b.]
4. [Figure: NFA with transitions labelled a, b, c and d.]
27
Suggested answer
1. [Figure: NFA with transitions labelled javaLetter and javaLetterOrDigit.]
2. [Figure: NFA with transitions labelled red, amber, green, amber.]
3. (ab)*
4. (ac|bd)+
28
Summary
Regular expressions give us a neat notation for describing regular languages.
Nondeterministic finite automata (NFAs) provide a diagrammatic version of regular expressions.
These notations are equivalent.
Finite automata theory is crucial in generating lexical analyzers from regular expressions.
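To make the last point concrete, here is a minimal sketch of a lexical analyzer driven by regular expressions, using Python's `re` module. The token classes and their names are assumptions chosen for illustration, and the identifier pattern is an ASCII-only approximation.

```python
import re

# Hypothetical token classes, each described by a regular expression.
TOKEN_SPEC = [
    ("INTEGER", r"0|[1-9][0-9]*"),
    ("IDENTIFIER", r"[A-Za-z_$][A-Za-z0-9_$]*"),
    ("OPERATOR", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
# One combined expression: a union of named groups, one per token class.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Split text into (token_class, lexeme) pairs, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(
                f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("getWidth() + 94"))
```

Python's `re` engine uses backtracking rather than an explicit automaton, so this shows only the specification side; generators in the lex family typically compile such expressions into finite automata.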