
1 Regular Expressions CIS 361

2 Fundamental Problems: We need finite descriptions of infinite sets of strings, and a way to discover and specify “regularity”. The set of languages over a finite alphabet is uncountable, while the set of descriptions is countable, so not every language has a finite description.

3 Regular Expressions: A language L is regular if there exists a finite acceptor for it. Any language that is described by a regular expression can be accepted by some finite automaton.

4 Regular Expressions: A regular expression is a combination of strings of symbols from some alphabet, parentheses, and the operators ∪, . and *.
∪ is union (some literature uses +).
. (or nothing) is concatenation.
* is star closure, or Kleene star: a superscripted repetition, 0 or more times.
+ is positive closure: a superscripted repetition, 1 or more times.

5 Specifying Lexical Structure Using Regular Expressions: Have some alphabet Σ = set of symbols. Regular expressions are built from:
ε: the empty string
any letter from Σ
r₁r₂: a string matching r₁ followed by a string matching r₂ (concatenation)
r₁ ∪ r₂ (also written r₁ + r₂): either regular expression r₁ or r₂ (union)
r*: iterated sequence and choice, ε | r | rr | …
parentheses to indicate grouping/precedence
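The inductive definition above can be turned into a small evaluator. The following is a minimal sketch, not part of the original slides: the nested-tuple encoding (`"eps"`, `"sym"`, `"cat"`, `"alt"`, `"star"`) and the function name `lang` are assumptions made for illustration. Because a language may be infinite, the sketch only enumerates strings up to a length bound `n`.

```python
def lang(r, n):
    """Set of strings of length <= n denoted by regular expression r.

    r is a nested tuple (hypothetical encoding):
      ("eps",), ("sym", c), ("cat", r1, r2), ("alt", r1, r2), ("star", r1)
    """
    kind = r[0]
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {r[1]} if n >= 1 else set()
    if kind == "alt":  # union
        return lang(r[1], n) | lang(r[2], n)
    if kind == "cat":  # concatenation, truncated at the length bound
        return {u + v for u in lang(r[1], n) for v in lang(r[2], n)
                if len(u + v) <= n}
    if kind == "star":  # epsilon | r | rr | ...
        base = lang(r[1], n) - {""}
        result = {""}
        frontier = {""}
        while frontier:
            # Append one more non-empty piece; keep only strings not seen yet.
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= n} - result
            result |= frontier
        return result
    raise ValueError(kind)
```

For example, `lang(("star", ("alt", ("sym", "a"), ("sym", "b"))), 2)` enumerates every string over {a, b} of length at most 2.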

6 Regular Expressions: Operations: union, complement, intersection, difference, concatenation, repetition (Kleene star, plus operator).

7 Regular Expressions: Union, L ∪ M. The union of two regular expressions Q and R is Q ∪ R. In terms of their automata A and B, respectively: create a new initial state q and connect it to the initial states of A and B by ε-transitions.
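The union construction can be sketched in code. This is an illustrative sketch under stated assumptions: an automaton is a dict with keys `"start"`, `"accept"` and `"delta"`, where `"delta"` maps `(state, symbol)` to a set of states and the empty string `""` stands for ε. That representation and the helper names `accepts` and `union` are not from the slides.

```python
def accepts(nfa, s):
    """Simulate an epsilon-NFA; the empty string "" labels epsilon moves."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p in nfa["delta"].get((q, ""), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    current = closure({nfa["start"]})
    for ch in s:
        moved = set()
        for q in current:
            moved |= set(nfa["delta"].get((q, ch), ()))
        current = closure(moved)
    return bool(current & nfa["accept"])


def union(a, b):
    """New initial state q with epsilon moves to the initial states of a and b."""
    delta = {}
    # Tag the states of each automaton so the two state sets stay disjoint.
    for tag, m in (("A", a), ("B", b)):
        for (q, ch), targets in m["delta"].items():
            delta[((tag, q), ch)] = {(tag, p) for p in targets}
    start = "q"  # the new initial state
    delta[(start, "")] = {("A", a["start"]), ("B", b["start"])}
    accept = {("A", q) for q in a["accept"]} | {("B", q) for q in b["accept"]}
    return {"start": start, "accept": accept, "delta": delta}
```

Tagging the states is a design choice of this sketch: it avoids accidental name clashes when A and B happen to use the same state names.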

8 Regular Expressions: Complement, Σ* − L. To construct the complement of a regular expression L, inspect the automaton that accepts its strings: convert the automaton for L to a deterministic automaton (with a transition for every symbol in every state); flip favorable and non-favorable states; construct a regular expression for the strings accepted by the updated automaton.
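The state-flipping step can be sketched as follows. The dict layout, the state names and the example automaton (bit strings with at least one “1”) are assumptions chosen for illustration. The sketch also assumes the deterministic automaton is complete, that is, every state has a transition on every symbol; without that, flipping states does not give the complement.

```python
def run_dfa(dfa, s):
    """Run a complete DFA on s and report whether it ends in a favorable state."""
    q = dfa["start"]
    for ch in s:
        q = dfa["delta"][(q, ch)]
    return q in dfa["accept"]


def complement(dfa):
    """Flip favorable and non-favorable states of a complete DFA."""
    return {
        "states": dfa["states"],
        "start": dfa["start"],
        "accept": dfa["states"] - dfa["accept"],
        "delta": dfa["delta"],
    }


# Complete DFA over {0, 1} for "at least one 1" (illustrative state names).
at_least_one_1 = {
    "states": {"none", "seen"},
    "start": "none",
    "accept": {"seen"},
    "delta": {
        ("none", "0"): "none", ("none", "1"): "seen",
        ("seen", "0"): "seen", ("seen", "1"): "seen",
    },
}
no_1 = complement(at_least_one_1)
```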

9 Regular Expressions: Complement of bit strings with at least one “1” = bit strings containing no “1”s = 0*. Complement of bit strings with exactly one “1” = bit strings containing no “1”s ∪ bit strings with at least two “1”s = 0* ∪ (0*10*10*)(0 ∪ 1)*.
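The second complement can be sanity-checked by brute force. The sketch below uses Python's `re` module, where union is written `|`. The pattern `0*10*` for “exactly one 1” and the helper names are assumptions, not from the slides. A bounded check like this is evidence for the identity, not a proof.

```python
import re
from itertools import product


def bit_strings(max_len):
    """Yield every bit string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


exactly_one = re.compile(r"0*10*")
# 0* U (0*10*10*)(0 U 1)* in Python's re syntax, where | is union.
complement = re.compile(r"0*|(0*10*10*)(0|1)*")


def complement_checks_out(max_len=8):
    """Each bit string must match exactly one of the two patterns."""
    return all(
        (exactly_one.fullmatch(s) is None) == (complement.fullmatch(s) is not None)
        for s in bit_strings(max_len)
    )
```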

10 Regular Expressions: Intersection, L ∩ M. Apply De Morgan’s law: L ∩ M is the complement of the union of the complements of L and M.
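De Morgan's law for languages can be illustrated on finite sets. In this sketch, which is an illustration rather than the slides' automaton-based algorithm, all strings up to a fixed length stand in for Σ*; the function names are hypothetical.

```python
from itertools import product


def universe(alphabet, max_len):
    """All strings over alphabet up to max_len, a finite stand-in for Sigma*."""
    return {"".join(t)
            for n in range(max_len + 1)
            for t in product(alphabet, repeat=n)}


def intersection_by_de_morgan(L, M, sigma_star):
    """L ∩ M computed as the complement of (complement of L ∪ complement of M)."""
    comp_L = sigma_star - L
    comp_M = sigma_star - M
    return sigma_star - (comp_L | comp_M)
```

With automata, the same recipe applies the complement construction three times and the union construction once.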

11 Regular Expressions: Difference, L − M. Can be expressed as the intersection of the languages L and Σ* − M.

12 Regular Expressions: Concatenation. The concatenation of strings u and v over alphabet Σ is the string uv. Languages L₁ and L₂ concatenated: L₁L₂ = {uv | u ∈ L₁, v ∈ L₂}. Can be extended to any finite number of languages.
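For finite languages the definition translates directly into a set comprehension. This is a small sketch; the names `concat` and `concat_all` are chosen here for illustration.

```python
from functools import reduce


def concat(L1, L2):
    """L1 L2 = { uv | u in L1, v in L2 }."""
    return {u + v for u in L1 for v in L2}


def concat_all(*languages):
    """Concatenation of any finite number of languages, left to right."""
    # {""} is the identity for concatenation, so it is a safe starting value.
    return reduce(concat, languages, {""})
```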

13 Regular Expressions: Concatenation, LM. The algorithm connects every favorable state of L to the initial state of M by an arrow labeled ε. Favorable states of L become non-favorable. Favorable states of M become the favorable states of the new automaton.

14 Regular Expressions: Kleene star, L*. In terms of the automaton: connect every favorable state of L to the initial state of L by a transition labeled ε; create a new initial state s, make it favorable as well (so that the empty string is accepted), and connect it to the old initial state by an ε-transition.

15 Regular Expressions: Plus (+), L⁺. In terms of the automaton: connect every favorable state of L to the initial state of L by a transition labeled ε. That’s it. This allows a favorable state to be reached one or more times.
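The plus and Kleene star constructions can be sketched together, since star is plus with one extra state. The dict representation (`"start"`, `"accept"`, `"delta"`, with `""` standing for ε) and the function names are assumptions for illustration. Note one choice made here: the new initial state s is marked favorable in addition to the old favorable states, so that both the empty string and the looped strings are accepted.

```python
def accepts(nfa, s):
    """Simulate an epsilon-NFA; the empty string "" labels epsilon moves."""
    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for p in nfa["delta"].get((q, ""), ()):
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
        return seen

    current = closure({nfa["start"]})
    for ch in s:
        moved = set()
        for q in current:
            moved |= set(nfa["delta"].get((q, ch), ()))
        current = closure(moved)
    return bool(current & nfa["accept"])


def plus(m):
    """Loop every favorable state back to the initial state by an epsilon move."""
    delta = {key: set(targets) for key, targets in m["delta"].items()}
    for q in m["accept"]:
        delta.setdefault((q, ""), set()).add(m["start"])
    return {"start": m["start"], "accept": set(m["accept"]), "delta": delta}


def star(m):
    """Plus construction, then a new favorable initial state s for the empty string."""
    p = plus(m)
    s = ("new", "s")  # assumed not to clash with existing state names
    p["delta"][(s, "")] = {p["start"]}
    p["accept"].add(s)
    p["start"] = s
    return p
```

Adding a fresh state s, rather than simply marking the old initial state favorable, matters when the old initial state has incoming transitions: marking it favorable could accept strings that are not in L*.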

16 Naming Languages: Regular sets can be named using their derivation in terms of the seed elements and the closure operations. Regular expressions formalize this approach. Regular sets ↔ regular expressions; numbers ↔ numerals; semantics ↔ syntax.

17 Example: Regular expressions for strings over {a, b} containing at least one “a”.
Focus on the one “a”: (a ∪ b)*a(a ∪ b)*
Focus on the leftmost “a”: b*a(a ∪ b)*
Focus on the “a”s: b*ab*(ab*)*
Further optimization: b*(ab*)⁺

18 Equivalence of regular expressions: Two regular expressions are equivalent if they represent the same regular set.
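Equivalence can be spot-checked by comparing the languages up to a length bound. The sketch below does this for the four expressions for “at least one a” over {a, b}, written in Python's `re` syntax with `|` for union. The helper names are hypothetical, and agreement up to a bound is only evidence of equivalence, not a proof.

```python
import re
from itertools import product

candidates = [
    r"(a|b)*a(a|b)*",  # focus on one "a"
    r"b*a(a|b)*",      # focus on the leftmost "a"
    r"b*ab*(ab*)*",    # focus on the "a"s
    r"b*(ab*)+",       # further optimization
]


def language(pattern, max_len, alphabet="ab"):
    """Strings over alphabet of length <= max_len that fully match pattern."""
    rx = re.compile(pattern)
    return {
        "".join(t)
        for n in range(max_len + 1)
        for t in product(alphabet, repeat=n)
        if rx.fullmatch("".join(t))
    }


def agree_up_to(patterns, max_len):
    """True if all patterns denote the same set of strings up to max_len."""
    langs = [language(p, max_len) for p in patterns]
    return all(l == langs[0] for l in langs)
```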

19 Concept of Language Generated by Regular Expressions: The set of all strings generated by a regular expression is the language of the regular expression. In general, a language may be (countably) infinite. A string in a language is often called a token.

20 Examples of Languages and Regular Expressions
Σ = {0, 1, .}
(0 ∪ 1)*.(0 ∪ 1)*: binary floating point numbers
(00)*: even-length all-zero strings
1*(01*01*)*: strings with an even number of zeros
Σ = {A,…,Z, a,…,z, 0,…,9, _}
(A ∪ … ∪ z)(A ∪ … ∪ z ∪ 0 ∪ … ∪ 9 ∪ _)*: identifiers
(1 ∪ … ∪ 9)(0 ∪ … ∪ 9)*: natural numbers (no negatives, no leading zeros)
(0 | 1 | 2)*: trinary (base 3) numbers
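These examples can be tried out with Python's `re` module. The translations below are an approximation, not part of the slides: union becomes `|`, the literal dot is escaped as `\.`, and the long unions over letters and digits are abbreviated with character classes. The dictionary keys and the helper `in_language` are names made up for this sketch.

```python
import re

examples = {
    "binary_float": r"(0|1)*\.(0|1)*",
    "even_length_zeros": r"(00)*",
    "even_number_of_zeros": r"1*(01*01*)*",
    "identifier": r"[A-Za-z][A-Za-z0-9_]*",
    "natural_number": r"[1-9][0-9]*",
    "base3": r"(0|1|2)*",
}


def in_language(name, s):
    """True if the whole string s is in the language of the named expression."""
    return re.fullmatch(examples[name], s) is not None
```

Trying a few strings shows some consequences of the expressions as written: a lone `.` counts as a binary floating point number, and `0` is not matched by the natural-number expression.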

21 Finite-State Automata: An alphabet Σ; a set of states with initial and accepting states; transitions between states, labeled with symbol(s). [Diagram: automaton with transitions labeled 0 and 1, for (0 | 1)*.(0 | 1)*]
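A finite-state automaton for (0 | 1)*.(0 | 1)* can be simulated with a transition table. The slide's diagram is not available in this transcript, so the two-state automaton below is a reconstruction, and the state names `before_dot` and `after_dot` are assumptions.

```python
def accepts(s):
    """Table-driven automaton for (0 | 1)*.(0 | 1)*; state names are illustrative."""
    delta = {
        ("before_dot", "0"): "before_dot",
        ("before_dot", "1"): "before_dot",
        ("before_dot", "."): "after_dot",
        ("after_dot", "0"): "after_dot",
        ("after_dot", "1"): "after_dot",
    }
    state = "before_dot"  # initial state
    for ch in s:
        state = delta.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state == "after_dot"  # the only accepting state
```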

