Context-Free Grammars 1

Slides:



Advertisements
Similar presentations
Chapter 5: Languages and Grammar 1 Compiler Designs and Constructions ( Page ) Chapter 5: Languages and Grammar Objectives: Definition of Languages.
Advertisements

Context-Free Grammars
Grammars, constituency and order A grammar describes the legal strings of a language in terms of constituency and order. For example, a grammar for a fragment.
ISBN Chapter 3 Describing Syntax and Semantics.
CS5371 Theory of Computation
Chapter 3 Describing Syntax and Semantics Sections 1-3.
1 Context-Free Grammars Formalism Derivations Backus-Naur Form Left- and Rightmost Derivations.
Normal forms for Context-Free Grammars
Chapter 3: Formal Translation Models
Context-Free Grammars Chapter 3. 2 Context-Free Grammars and Languages n Defn A context-free grammar is a quadruple (V, , P, S), where  V is.
(2.1) Grammars  Definitions  Grammars  Backus-Naur Form  Derivation – terminology – trees  Grammars and ambiguity  Simple example  Grammar hierarchies.
1 Syntax and Semantics The Purpose of Syntax Problem of Describing Syntax Formal Methods of Describing Syntax Derivations and Parse Trees Sebesta Chapter.
1 Introduction to Parsing Lecture 5. 2 Outline Regular languages revisited Parser overview Context-free grammars (CFG’s) Derivations.
Languages & Strings String Operations Language Definitions.
Context Free Grammars CIS 361. Introduction Finite Automata accept all regular languages and only regular languages Many simple languages are non regular:
Grammars CPSC 5135.
C H A P T E R TWO Syntax and Semantic.
CMSC 330: Organization of Programming Languages Context-Free Grammars.
Context Free Grammars CFGs –Add recursion to regular expressions Nested constructions –Notation expression  identifier | number | - expression | ( expression.
Context Free Grammars 1. Context Free Languages (CFL) The pumping lemma showed there are languages that are not regular –There are many classes “larger”
Grammars Hopcroft, Motawi, Ullman, Chap 5. Grammars Describes underlying rules (syntax) of programming languages Compilers (parsers) are based on such.
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
Syntax analysis Jianguo Lu School of Computer Science University of Windsor.
5. Context-Free Grammars and Languages
Chapter 3: Describing Syntax and Semantics
Chapter 3 – Describing Syntax
Context-Free Grammars: an overview
Formal Language & Automata Theory
CS510 Compiler Lecture 4.
Properties of Context-Free Languages
Fall Compiler Principles Context-free Grammars Refresher
Chapter 3 Context-Free Grammar and Parsing
Chapter 3 – Describing Syntax
Syntax (1).
L-systems L-systems are grammatical systems introduced by Lyndenmayer to describe biological developments such as the growth of plants and cellular organisms.
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Context-Free Grammars
Context free grammar.
CSE 105 theory of computation
PARSE TREES.
Syntax Analysis Sections :.
Lecture 14 Grammars – Parse Trees– Normal Forms
Context-Free Languages
Context-Free Grammars
Context-Free Grammars
Context-Free Grammars
5. Context-Free Grammars and Languages
CS21 Decidability and Tractability
CHAPTER 2 Context-Free Languages
CSC 4181Compiler Construction Context-Free Grammars
R.Rajkumar Asst.Professor CSE
Context–free Grammar CFG is also called BNF (Backus-Naur Form) grammar
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
BNF 23-Feb-19.
CSC 4181 Compiler Construction Context-Free Grammars
Context-Free Grammars
Fall Compiler Principles Context-free Grammars Refresher
BNF 9-Apr-19.
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
CSE 105 theory of computation
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Formal Languages Context free languages provide a convenient notation for recursive description of languages. The original goal of formalizing the structure.
Conversion of CFG to PDA Conversion of PDA to CFG
Context-Free Grammars
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Programming Languages 2nd edition Tucker and Noonan
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
Language translation Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Sections
COMPILER CONSTRUCTION
CSE 105 theory of computation
Presentation transcript:

Context-Free Grammars 1 Formalism Derivations Backus-Naur Form Left- and Rightmost Derivations

Informal Comments A context-free grammar is a notation for describing languages. It is more powerful than finite automata or RE’s, but still cannot define all possible languages. Useful for nested structures, e.g., parentheses in programming languages. based on Stanford InfoLab Total: 28

Informal Comments – (2) Basic idea is to use “variables” to stand for sets of strings (i.e., languages). These variables are defined recursively, in terms of one another. Recursive rules (“productions”) involve only concatenation. Alternative rules for a variable allow union. based on Stanford InfoLab Total: 28

Example: CFG for { 0n1n | n > 1} Productions: S  01 S  0S1 Basis: 01 is in the language. Induction: if w is in the language, then so is 0w1. based on Stanford InfoLab Total: 28

CFG Formalism Terminals = symbols of the alphabet of the language being defined. Variables = nonterminals = a finite set of other symbols, each of which represents a language (terminal words that can be derived from it) Start symbol = the variable whose language is the one being defined. based on Stanford InfoLab Total: 28

Productions A production has the form variable  string of variables and terminals. Convention: A, B, C,… are variables. a, b, c,… are terminals. …, X, Y, Z are either terminals or variables. …, w, x, y, z are strings of terminals only. , , ,… are strings of terminals and/or variables. based on Stanford InfoLab Total: 28

Example: Formal CFG Here is a formal CFG for { 0n1n | n > 1}. Terminals = {0, 1}. Variables = {S}. Start symbol = S. Productions = S  01 S  0S1 based on Stanford InfoLab Total: 28

Derivations – Intuition We derive strings in the language of a CFG by starting with the start symbol, and repeatedly replacing some variable A by the right side of one of its productions. That is, the “productions for A” are those that have A on the left side of the . based on Stanford InfoLab Total: 28

Derivations – Formalism We say A =>  if A   is a production. Example: S  01; S  0S1. S => 0S1 => 00S11 => 000111. based on Stanford InfoLab Total: 28

Iterated Derivation =>* means “zero or more derivation steps.” Basis:  =>*  for any string . Induction: if  =>*  and  => , then  =>* . based on Stanford InfoLab Total: 28

Example: Iterated Derivation S  01; S  0S1. S => 0S1 => 00S11 => 000111. S =>* S; S =>* 0S1; S =>* 00S11; S =>* 000111. based on Stanford InfoLab Total: 28

Sentential Forms Any string of variables and/or terminals derived from the start symbol is called a sentential form. Formally,  is a sentential form iff S =>* . based on Stanford InfoLab Total: 28

Language of a Grammar If G is a CFG, then L(G), the language of G, is {w | S =>* w}. Note: w must be a terminal string, S is the start symbol. Example: G has productions S  ε and S  0S1. L(G) = {0n1n | n > 0}. Note: ε is a legitimate right side. based on Stanford InfoLab Total: 28

Context-Free Languages A language that is defined by some CFG is called a context-free language. There are CFL’s that are not regular languages, such as the example just given. But not all languages are CFL’s. based on Stanford InfoLab Total: 28

BNF Notation Grammars for programming languages are often written in BNF (Backus-Naur Form ). Variables are words in <…>; Example: <statement>. Terminals are often multicharacter strings indicated by boldface or underline; Example: while or WHILE. based on Stanford InfoLab Total: 28

BNF Notation – (2) Symbol ::= is often used for  . Symbol | is used for “or.” A shorthand for a list of productions with the same left side. Example: S  0S1 | 01 is shorthand for S  0S1 and S  01. based on Stanford InfoLab Total: 28

BNF Notation – Kleene Closure Symbol … is used for “one or more.” Example: <digit> ::= 0|1|2|3|4|5|6|7|8|9 <unsigned integer> ::= <digit>… Note: that’s not exactly the * of RE’s. Translation: Replace … with a new variable A and productions A  A | . based on Stanford InfoLab Total: 28

Example: Kleene Closure Grammar for unsigned integers can be replaced by: U  UD | D D  0|1|2|3|4|5|6|7|8|9 based on Stanford InfoLab Total: 28

BNF Notation: Optional Elements Surround one or more symbols by […] to make them optional. Example: <statement> ::= if <condition> then <statement> [; else <statement>] Translation: replace [] by a new variable A with productions A   | ε. based on Stanford InfoLab Total: 28

Example: Optional Elements Grammar for if-then-else can be replaced by: S  iCtSA A  ;eS | ε based on Stanford InfoLab Total: 28

BNF Notation – Grouping Use {…} to surround a sequence of symbols that need to be treated as a unit. Typically, they are followed by a … for “one or more.” Example: <statement list> ::= <statement> [{;<statement>}…] based on Stanford InfoLab Total: 28

Translation: Grouping You may, if you wish, create a new variable A for {}. One production for A: A  . Use A in place of {}. based on Stanford InfoLab Total: 28

Example: Grouping L  S [{;S}…] Replace by L  S [A…] A  ;S A stands for {;S}. Then by L  SB B  A… | ε A ;S B stands for [A…] (zero or more A’s). Finally by L  SB B  C | ε C  AC | A A  ;S C stands for A… . based on Stanford InfoLab Total: 28

Leftmost and Rightmost Derivations Derivations allow us to replace any of the variables in a string. Leads to many different derivations of the same string. By forcing the leftmost variable (or alternatively, the rightmost variable) to be replaced, we avoid these “distinctions without a difference.” based on Stanford InfoLab Total: 28

Leftmost Derivations Say wA =>lm w if w is a string of terminals only and A   is a production. Also,  =>*lm  if  becomes  by a sequence of 0 or more =>lm steps. based on Stanford InfoLab Total: 28

Example: Leftmost Derivations Balanced-parentheses grammmar: S  SS | (S) | () S =>lm SS =>lm (S)S =>lm (())S =>lm (())() Thus, S =>*lm (())() S => SS => S() => (S)() => (())() is a derivation, but not a leftmost derivation. based on Stanford InfoLab Total: 28

Rightmost Derivations Say Aw =>rm w if w is a string of terminals only and A   is a production. Also,  =>*rm  if  becomes  by a sequence of 0 or more =>rm steps. based on Stanford InfoLab Total: 28

Example: Rightmost Derivations Balanced-parentheses grammmar: S  SS | (S) | () S =>rm SS =>rm S() =>rm (S)() =>rm (())() Thus, S =>*rm (())() S => SS => SSS => S()S => ()()S => ()()() is neither a rightmost nor a leftmost derivation. based on Stanford InfoLab Total: 28