COP4620 – Programming Language Translators Dr. Manuel E. Bermudez

Presentation transcript:

Regular expressions
COP4620 – Programming Language Translators
Dr. Manuel E. Bermudez

Topics
Define Regular Expressions
Conversion from Right-Linear Grammar to Regular Expression

Regular expressions
A compact, easy-to-read language description.
Use operators to denote the language constructors described earlier, to build complex languages from simple atomic ones.

Regular expressions
Definition: A regular expression over an alphabet Σ is recursively defined as follows:
ø denotes the language ø (the empty set).
ε denotes the language {ε}.
a denotes the language {a}, for all a ∈ Σ.
(P + Q) denotes L(P) ∪ L(Q), where P and Q are r.e.'s.
(PQ) denotes L(P)·L(Q), where P and Q are r.e.'s.
P* denotes L(P)*, where P is an r.e.
To prevent excessive parentheses, we assume left associativity and the following operator precedence: * (highest), · , + (lowest).
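The recursive definition can be read as a small evaluator. The following is only a sketch under assumptions: the nested-tuple encoding and the name `language` are hypothetical (not from the slides), and because L(P)* is usually infinite the result is cut off at a maximum string length.

```python
# Hypothetical encoding of regular expressions as nested tuples:
#   ("empty",)      ø          ("alt", P, Q)   (P + Q)
#   ("eps",)        ε          ("cat", P, Q)   (PQ)
#   ("sym", "a")    a          ("star", P)     P*

def language(regex, max_len):
    """Return the strings of L(regex) whose length is at most max_len."""
    kind = regex[0]
    if kind == "empty":
        return set()
    if kind == "eps":
        return {""}
    if kind == "sym":
        return {regex[1]} if max_len >= 1 else set()
    if kind == "alt":
        return language(regex[1], max_len) | language(regex[2], max_len)
    if kind == "cat":
        left = language(regex[1], max_len)
        right = language(regex[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":
        # Dropping ε from the base makes every round strictly longer,
        # so the loop stops once max_len is exceeded.
        base = language(regex[1], max_len) - {""}
        result = {""}
        frontier = {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown regular expression node: {kind}")


# (0 + 1)*1, written with the tuple encoding.
ENDS_IN_ONE = ("cat",
               ("star", ("alt", ("sym", "0"), ("sym", "1"))),
               ("sym", "1"))
```

For instance, `language(ENDS_IN_ONE, 2)` collects the strings of length at most 2 that end in 1.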

Regular expressions
Examples:
(0 + 1)*: any string of 0's and 1's.
(0 + 1)*1: any string of 0's and 1's, ending with a 1.
1*01*: any string of 1's with a single 0 inserted.
Letter (Letter + Digit)*: an identifier.
Digit Digit*: an integer.
Quote Char* Quote: a string. †
# Char* Eoln: a comment. †
{Char*}: another comment. †
† Assuming that Char does not contain quotes, eoln's, or } .
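Some of these examples can be tried out with Python's `re` module, where `|` plays the role of `+` and catenation is juxtaposition. This is a sketch under assumptions: Letter is taken to mean ASCII letters, Digit the digits 0 to 9, Quote the double-quote character, and the names `EXAMPLES` and `matches` are made up for illustration.

```python
import re

# Slide notation translated to Python regex syntax (assumed character classes).
EXAMPLES = {
    "binary":      r"(0|1)*",                 # (0 + 1)*
    "ends_in_1":   r"(0|1)*1",                # (0 + 1)*1
    "single_zero": r"1*01*",                  # 1*01*
    "identifier":  r"[A-Za-z][A-Za-z0-9]*",   # Letter (Letter + Digit)*
    "integer":     r"[0-9][0-9]*",            # Digit Digit*
    "string":      r'"[^"\n}]*"',             # Quote Char* Quote
}

def matches(name, text):
    """True if the whole of text belongs to the named example language."""
    return re.fullmatch(EXAMPLES[name], text) is not None
```

`re.fullmatch` is used rather than `re.search` because a regular expression here denotes a set of complete strings, not a pattern to find inside a longer one.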

Regular expressions
Additional regular expression operators:
a+ = aa* (one or more a's)
a? = a + ε (one or zero a's, i.e. a is optional)
a list b = a (b a)* (a list of a's, separated by b's)
Examples:
Syntax for a function call: Name '(' Expression list ',' ')'
Identifier:
Floating-point constant:
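The additional operators are shorthand, so each can be expanded into the basic ones. A minimal sketch using Python regex strings follows; the helper names and the toy definitions of `NAME` and `EXPRESSION` are assumptions chosen only to make the function-call example runnable.

```python
import re

def group(p):
    """Wrap p in a non-capturing group so operators apply to all of it."""
    return f"(?:{p})"

def star(p):
    return group(p) + "*"

def plus(p):
    return group(p) + star(p)                  # a+ = aa*

def optional(p):
    return group(group(p) + "|")               # a? = a + ε

def list_of(a, b):
    return group(a) + star(group(b) + group(a))  # a list b = a (b a)*

# Toy stand-ins (assumptions): a name is letters, an expression is digits.
NAME = "[A-Za-z]+"
EXPRESSION = "[0-9]+"

# Name '(' Expression list ',' ')'
CALL = NAME + r"\(" + list_of(EXPRESSION, ",") + r"\)"
```

Under this definition `f(1,2,3)` is a function call, while `f()` is not, because `a list b` requires at least one `a`.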

Regular expressions
Conversion from right-linear grammars to regular expressions
Grammar:
S → aS
  → bR
  → ε
R → aS
S → aS means L(S) ⊇ {a}·L(S)
S → bR means L(S) ⊇ {b}·L(R)
S → ε means L(S) ⊇ {ε}
Together, they mean that L(S) = {a}·L(S) + {b}·L(R) + {ε}, or S = aS + bR + ε.
Similarly, R → aS means L(R) = {a}·L(S), or R = aS.
Thus we obtain a system of simultaneous equations, in which the variables are the nonterminals:
S = aS + bR + ε
R = aS

Regular expressions
Solving a system of simultaneous equations:
S = aS + bR + ε
R = aS
Back substitute R = aS:
S = aS + baS + ε
S = (a + ba) S + ε
S = (a + ba)* ε
S = (a + ba)*
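One way to gain confidence in the solution S = (a + ba)* is to plug it back into the equations and compare both sides on all short strings. The sketch below does this by brute force; the function names are hypothetical, and a bounded check of this kind is evidence rather than a proof.

```python
import re
from itertools import product

def bounded_language(pattern, max_len, alphabet="ab"):
    """All strings over alphabet, up to max_len long, that match pattern."""
    return {"".join(chars)
            for n in range(max_len + 1)
            for chars in product(alphabet, repeat=n)
            if re.fullmatch(pattern, "".join(chars))}

def satisfies_equations(max_len):
    """Check S = aS + bR + ε with R = aS, for strings up to max_len."""
    s = bounded_language(r"(a|ba)*", max_len)   # candidate solution for S
    r = {"a" + x for x in s}                    # R = aS
    rhs = {"a" + x for x in s} | {"b" + y for y in r} | {""}
    # Compare only strings within the length bound.
    return {w for w in rhs if len(w) <= max_len} == s
```

Calling `satisfies_equations(6)` compares the two sides on every string over {a, b} of length at most 6.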

Regular expressions
In general, what to do with equations of the form X = αX + β ?
Answer: β ∈ L(X), so αβ ∈ L(X), ααβ ∈ L(X), αααβ ∈ L(X), …
Thus L(X) = L(α*β), i.e. X = α*β.
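As a check, substituting X = α*β into the right-hand side of X = αX + β gives back α*β. The short derivation below uses only distributivity of catenation over + and the identity αα* + ε = α*. This rule is commonly known as Arden's rule; when ε is not in L(α) the solution is the only one, and otherwise it is the smallest one.

```latex
\begin{align*}
\alpha(\alpha^{*}\beta) + \beta
  &= (\alpha\alpha^{*} + \varepsilon)\,\beta \\
  &= \alpha^{*}\beta
\end{align*}
```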

Regular expressions
Conversion from right-linear grammars to regular expressions
1. Set up equations: A = α1 + α2 + … + αn if
A → α1
  → α2
  . . .
  → αn

Regular expressions
2. If an equation is of the form X = α, and X does not appear in α, then replace every occurrence of X with α in all other equations, and delete the equation X = α.
3. If an equation is of the form X = αX + β, and X does not occur in α or β, then replace the equation with X = α*β.
Note: Some algebraic manipulations may be needed to obtain the form X = αX + β.
Important: Catenation is not commutative!
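The three steps can be sketched as a small program. This is one possible arrangement, not necessarily the order used in the course: it eliminates every nonterminal other than the start symbol in turn, applying step 3 before step 2 for each. The grammar encoding, the function names, and the choice to emit Python regex syntax are all assumptions made for illustration.

```python
import re

def alt(p, q):
    """Regex for p + q; None stands for a missing term."""
    if p is None:
        return q
    if q is None:
        return p
    return f"(?:{p}|{q})"

def grammar_to_regex(grammar, start):
    """grammar maps A to a list of (terminals, B) for A -> terminals B,
    with B = None for A -> terminals. Returns a Python regex string,
    or None if the language is empty."""
    # Step 1: one equation per nonterminal, as {variable or None: coefficient}.
    eqs = {}
    for lhs, productions in grammar.items():
        eq = {}
        for terminals, nxt in productions:
            eq[nxt] = alt(eq.get(nxt), re.escape(terminals))
        eqs[lhs] = eq
    # Eliminate the other nonterminals first, and the start symbol last.
    for x in [v for v in eqs if v != start] + [start]:
        eq = eqs[x]
        # Step 3: X = αX + β becomes X = α*β.
        if x in eq:
            loop = f"(?:{eq.pop(x)})*"
            for var in eq:
                eq[var] = loop + f"(?:{eq[var]})"
        # Step 2: replace X in every other equation. The coefficient goes
        # on the left, because catenation is not commutative.
        for name, other in eqs.items():
            if name == x or x not in other:
                continue
            coeff = other.pop(x)
            for var, term in eq.items():
                other[var] = alt(other.get(var), f"(?:{coeff})(?:{term})")
    return eqs[start].get(None)

# The grammar S -> aS | bR | ε, R -> aS used earlier in the slides.
GRAMMAR = {
    "S": [("a", "S"), ("b", "R"), ("", None)],
    "R": [("a", "S")],
}
REGEX = grammar_to_regex(GRAMMAR, "S")
```

The resulting string is heavily parenthesized and is not simplified, but it should accept the same strings as (a + ba)*.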

Regular expressions
Example:
S → a
  → bU
  → bR
R → abaU
  → U
U → aS
  → b
Equations:
S = a + bU + bR
R = abaU + U = (aba + ε) U
U = aS + b
Back substitute R:
S = a + bU + b(aba + ε) U

Regular expressions
S = a + bU + b(aba + ε) U
U = aS + b
Back substitute U:
S = a + b(aS + b) + b(aba + ε)(aS + b)
  = a + baS + bb + babaaS + babab + baS + bb   (the terms baS and bb repeat)
  = a + baS + bb + babaaS + babab
  = (ba + babaa) S + (a + bb + babab)
and therefore
S = (ba + babaa)*(a + bb + babab)
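The result can be cross-checked against the original grammar by testing every short string both ways. The following is a sketch under assumptions: the grammar encoding and the names `derives` and `agree` are hypothetical, and the recursive recognizer relies on this grammar having no cycle of unit productions (R → U is the only one).

```python
import re
from itertools import product

# Productions as (terminal prefix, next nonterminal or None).
GRAMMAR = {
    "S": [("a", None), ("b", "U"), ("b", "R")],
    "R": [("aba", "U"), ("", "U")],
    "U": [("a", "S"), ("b", None)],
}

RESULT = r"(ba|babaa)*(a|bb|babab)"   # (ba + babaa)*(a + bb + babab)

def derives(nonterminal, text):
    """True if text can be derived from nonterminal in GRAMMAR."""
    for prefix, nxt in GRAMMAR[nonterminal]:
        if nxt is None:
            if text == prefix:
                return True
        elif text.startswith(prefix) and derives(nxt, text[len(prefix):]):
            return True
    return False

def agree(max_len):
    """Compare grammar and regular expression on all strings up to max_len."""
    for n in range(max_len + 1):
        for chars in product("ab", repeat=n):
            text = "".join(chars)
            in_regex = re.fullmatch(RESULT, text) is not None
            if derives("S", text) != in_regex:
                return False
    return True
```

A call such as `agree(9)` covers strings long enough to exercise both the `babaa` loop and the `babab` ending.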

Regular expressions
Summarizing: [Diagram relating RGR, RGL, RE, NFA, DFA, and Minimum DFA; labels: "Done", "Soon".]