Compilers Principles, Techniques, & Tools Taught by Jing Zhang


Compilers Principles, Techniques, & Tools Taught by Jing Zhang (jzhang@njust.edu.cn)

Grammar Description

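2.1 Some Concepts
Let $\Sigma$ be an alphabet, where each element is called a symbol. A string on $\Sigma$ is a finite sequence of symbols from $\Sigma$. The string that contains no symbols is called the ε-string (or ε). Let $\Sigma^*$ be the set of all strings on $\Sigma$, including ε. Let φ be the empty set {}.
The concatenation of two sets of strings U and V is defined as $UV = \{\alpha\beta \mid \alpha \in U,\ \beta \in V\}$.
Self-concatenation: $V^0 = \{\varepsilon\}$, $V^n = V^{n-1}V$.
The closure of V is denoted by $V^* = V^0 \cup V^1 \cup V^2 \cup \cdots$.
The regular (positive) closure: $V^+ = VV^*$.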
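The set operations above can be tried out on small finite sets. The following is a minimal sketch, assuming strings are Python `str` values and ε is the empty string `""`; the function names are illustrative, and because the full closure is infinite, `closure_up_to` only builds a finite slice of it.

```python
def concat(u, v):
    """Concatenation UV = {ab | a in U, b in V} of two sets of strings."""
    return {a + b for a in u for b in v}


def power(v, n):
    """Self-concatenation V^n, where V^0 is the set holding only the empty string."""
    result = {""}
    for _ in range(n):
        result = concat(result, v)
    return result


def closure_up_to(v, max_power, positive=False):
    """Finite slice of the closure: union of V^k for k up to max_power.

    With positive=True the union starts at k = 1 instead of k = 0.
    """
    start = 1 if positive else 0
    out = set()
    for k in range(start, max_power + 1):
        out |= power(v, k)
    return out


print(sorted(concat({"a", "ab"}, {"b", ""})))   # ['a', 'ab', 'abb']
print(sorted(power({"0", "1"}, 2)))             # ['00', '01', '10', '11']
print(sorted(closure_up_to({"a"}, 3)))          # ['', 'a', 'aa', 'aaa']
```

Note that ε appears in the slice of the closure but not in the positive variant, unless V itself contains ε.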

2.2 Context-free Grammar
A grammar is a set of formal rules that describes the syntactic structure of a language. A context-free grammar has four components:
(1) A set of terminal symbols, sometimes referred to as "tokens." The terminals are the elementary symbols of the language defined by the grammar.
(2) A set of nonterminals, sometimes called "syntactic variables." Each nonterminal represents a set of strings of terminals.
(3) A set of productions, where each production consists of a nonterminal, called the head or left side of the production, an arrow, and a sequence of terminals and/or nonterminals, called the body or right side of the production. The intuitive intent of a production is to specify one of the written forms of a construct; if the head nonterminal represents a construct, then the body represents a written form of the construct.
(4) A designation of one of the nonterminals as the start symbol.

2.2 Context-free Grammar – Formal Definition
A context-free grammar G is a 4-tuple $G = (V_T, V_N, S, \mathcal{P})$, where:
$V_T$ is a non-empty finite set, each element of which is a terminal.
$V_N$ is a non-empty finite set, each element of which is a nonterminal.
$S$ is a nonterminal, called the start symbol.
$\mathcal{P}$ is a finite set of productions, where each production has the form $A \to \alpha$ with $A \in V_N$ and $\alpha \in (V_T \cup V_N)^*$.
S must appear on the left side of at least one production.
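One way to make the 4-tuple concrete is to store it as a small record and check the stated conditions mechanically. This is only a sketch under some assumptions that are not in the slides: every symbol is a single character, a production body is a string of symbols, and the terminal and nonterminal sets are required to be disjoint. The names `Grammar` and `check_grammar` and the expression grammar are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset     # the terminal set
    nonterminals: frozenset  # the nonterminal set
    start: str               # the start symbol S
    productions: tuple       # (head, body) pairs; body is a string of symbols


def check_grammar(g):
    """Return a list of violations of the definition; an empty list means well formed."""
    problems = []
    if not g.terminals:
        problems.append("terminal set is empty")
    if not g.nonterminals:
        problems.append("nonterminal set is empty")
    if g.terminals & g.nonterminals:
        # Assumption: the two sets must not share symbols.
        problems.append("terminals and nonterminals overlap")
    if g.start not in g.nonterminals:
        problems.append("start symbol is not a nonterminal")
    symbols = g.terminals | g.nonterminals
    for head, body in g.productions:
        if head not in g.nonterminals:
            problems.append(f"head {head!r} is not a nonterminal")
        for symbol in body:
            if symbol not in symbols:
                problems.append(f"unknown symbol {symbol!r} in body {body!r}")
    if not any(head == g.start for head, _ in g.productions):
        problems.append("start symbol never appears as a head")
    return problems


# Hypothetical example grammar: E -> E+E | E*E | i
EXPR = Grammar(frozenset("+*i"), frozenset("E"), "E",
               (("E", "E+E"), ("E", "E*E"), ("E", "i")))
print(check_grammar(EXPR))  # []
```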

2.2 Context-free Grammar – Notational Conventions

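2.2 Context-free Grammar – Derivations
A grammar derives strings by beginning with the start symbol and repeatedly replacing a nonterminal by the body of a production for that nonterminal.
Strictly, we say that $\alpha A \beta$ derives $\alpha \gamma \beta$ in one step, written $\alpha A \beta \Rightarrow \alpha \gamma \beta$, if and only if $A \to \gamma$ is a production and $\alpha, \beta \in (V_T \cup V_N)^*$. The symbol $\Rightarrow$ means "derives in one step".
When a sequence of derivation steps $\alpha_1 \Rightarrow \alpha_2 \Rightarrow \cdots \Rightarrow \alpha_n$ rewrites $\alpha_1$ to $\alpha_n$, we say that $\alpha_1$ derives $\alpha_n$. We use the symbol $\Rightarrow^*$ to represent "derives in zero or more steps" and the symbol $\Rightarrow^+$ to represent "derives in one or more steps".
If $S \Rightarrow^* \alpha$, where S is the start symbol of a grammar G, we say that $\alpha$ is a sentential form of G.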
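The one-step relation can be sketched directly as string rewriting. The code below assumes single-character symbols and a hypothetical grammar E → E+E | i; `derive_one_step` lists every form reachable in exactly one step, and `reachable` collects the forms derivable in at most a given number of steps (a bounded stand-in for "zero or more steps").

```python
def derive_one_step(form, productions):
    """All sentential forms obtained from `form` by replacing one nonterminal once."""
    results = []
    for i, symbol in enumerate(form):
        for head, body in productions:
            if symbol == head:
                results.append(form[:i] + body + form[i + 1:])
    return results


def reachable(start, productions, max_steps):
    """Forms derivable from `start` in zero up to `max_steps` steps."""
    seen = {start}
    frontier = [start]
    for _ in range(max_steps):
        next_frontier = []
        for form in frontier:
            for new_form in derive_one_step(form, productions):
                if new_form not in seen:
                    seen.add(new_form)
                    next_frontier.append(new_form)
        frontier = next_frontier
    return seen


# Hypothetical grammar: E -> E+E | i
PRODUCTIONS = [("E", "E+E"), ("E", "i")]
print(derive_one_step("E", PRODUCTIONS))        # ['E+E', 'i']
print("i+i" in reachable("E", PRODUCTIONS, 3))  # True
```

Every string in the returned set is a sentential form of the example grammar; those with no `E` left in them are its sentences.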

2.2 Context-free Grammar – Derivations
Note that a sentential form may contain both terminals and nonterminals, and may be empty. A sentence of G is a sentential form with no nonterminals. The language generated by a grammar G is its set of sentences, denoted by L(G). Thus, a string of terminals w is in L(G) if and only if w is a sentence of G (that is, $S \Rightarrow^* w$). If two grammars generate the same language, the grammars are said to be equivalent.

2.2 Context-free Grammar – Derivations
We consider derivations in which the nonterminal to be replaced at each step is chosen as follows:
1. In leftmost derivations, the leftmost nonterminal in each sentential form is always chosen. If $\alpha \Rightarrow \beta$ is a step in which the leftmost nonterminal in $\alpha$ is replaced, we write $\alpha \Rightarrow_{lm} \beta$.
2. In rightmost derivations, the rightmost nonterminal is always chosen; we write $\alpha \Rightarrow_{rm} \beta$ in this case. Rightmost derivations are sometimes called canonical derivations.
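To see the two orders side by side, the sketch below replays a list of production choices, always rewriting the leftmost or the rightmost nonterminal. It assumes single-character symbols and a hypothetical grammar E → E+E | E*E | i; the function `derive` is illustrative, not part of the lecture.

```python
def derive(start, choices, nonterminals, leftmost=True):
    """Apply the productions in `choices` in order and return every sentential form.

    Each step rewrites the leftmost (or rightmost) nonterminal of the current form.
    """
    forms = [start]
    for head, body in choices:
        form = forms[-1]
        positions = [i for i, s in enumerate(form) if s in nonterminals]
        if not positions:
            raise ValueError("no nonterminal left to replace")
        i = positions[0] if leftmost else positions[-1]
        if form[i] != head:
            raise ValueError(f"expected {form[i]!r} as head, got {head!r}")
        forms.append(form[:i] + body + form[i + 1:])
    return forms


# Hypothetical grammar: E -> E+E | E*E | i
PLUS, TIMES, ID = ("E", "E+E"), ("E", "E*E"), ("E", "i")

left = derive("E", [PLUS, ID, TIMES, ID, ID], {"E"}, leftmost=True)
right = derive("E", [PLUS, TIMES, ID, ID, ID], {"E"}, leftmost=False)
print(" => ".join(left))   # E => E+E => i+E => i+E*E => i+i*E => i+i*i
print(" => ".join(right))  # E => E+E => E+E*E => E+E*i => E+i*i => i+i*i
```

Both derivations end in the same sentence and use the same productions; only the order in which the nonterminals are expanded differs.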

Parse Tree
A parse tree pictorially shows how the start symbol of a grammar derives a string in the language. If nonterminal A has a production $A \to XYZ$, then a parse tree may have an interior node labeled A with three children labeled X, Y, and Z, from left to right:

      A
    / | \
   X  Y  Z

A parse tree has the following properties:
(1) The root is labeled by the start symbol.
(2) Each leaf is labeled by a terminal or by ε.
(3) Each interior node is labeled by a nonterminal.
(4) If A is the nonterminal labeling some interior node and $X_1, X_2, \ldots, X_n$ are the labels of the children of that node from left to right, then there must be a production $A \to X_1 X_2 \cdots X_n$. Here, $X_1, X_2, \ldots, X_n$ each stand for a symbol that is either a terminal or a nonterminal. As a special case, if $A \to \varepsilon$ is a production, then a node labeled A may have a single child labeled ε.
From left to right, the leaves of a parse tree form the yield of the tree, which is the string generated or derived from the nonterminal at the root of the parse tree. The process of finding a parse tree for a given string of terminals is called parsing that string.
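The parse-tree conditions and the notion of yield can be sketched with a small tree type. The code assumes single-character symbols, writes an ε leaf as the label `"ε"`, and stores an ε-production as a pair with an empty body; `Node`, `tree_yield` and `check_tree` are hypothetical names, and the grammar E → E+E | i is only an example.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)


def tree_yield(node):
    """Concatenate the leaf labels from left to right (an ε leaf contributes nothing)."""
    if not node.children:
        return "" if node.label == "ε" else node.label
    return "".join(tree_yield(child) for child in node.children)


def check_tree(node, productions):
    """True if every interior node A with children X1..Xn matches a production A -> X1...Xn."""
    if not node.children:
        return True
    body = "".join(child.label for child in node.children)
    if body == "ε":
        body = ""  # a single ε child stands for the production A -> ε
    if (node.label, body) not in productions:
        return False
    return all(check_tree(child, productions) for child in node.children)


# Hypothetical grammar: E -> E+E | i
PRODUCTIONS = {("E", "E+E"), ("E", "i")}
tree = Node("E", [Node("E", [Node("i")]), Node("+"), Node("E", [Node("i")])])
print(tree_yield(tree))               # i+i
print(check_tree(tree, PRODUCTIONS))  # True
```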

Ambiguity
A grammar can have more than one parse tree generating a given string of terminals. Such a grammar is said to be ambiguous. Since a string with more than one parse tree usually has more than one meaning, we need to design unambiguous grammars for compiling applications, or to use ambiguous grammars with additional rules to resolve the ambiguities.
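Ambiguity can be observed by counting the parse trees of a single string. The sketch below is one simple way to do that, not a general parsing algorithm: it assumes single-character symbols, no ε-productions, and no cycles of unit productions (otherwise the recursion would not terminate). Both example grammars are hypothetical.

```python
from functools import lru_cache


def make_counter(productions, nonterminals):
    """Return count(symbol, s): the number of parse trees rooted at `symbol` with yield `s`."""
    rules = {}
    for head, body in productions:
        rules.setdefault(head, []).append(body)

    @lru_cache(maxsize=None)
    def count(symbol, s):
        if symbol not in nonterminals:
            return 1 if s == symbol else 0
        return sum(count_seq(body, s) for body in rules.get(symbol, []))

    @lru_cache(maxsize=None)
    def count_seq(body, s):
        # Number of ways the symbols of `body` derive consecutive pieces of `s`.
        if len(body) == 1:
            return count(body[0], s)
        rest = len(body) - 1  # each remaining symbol needs at least one character
        total = 0
        for k in range(1, len(s) - rest + 1):
            ways_first = count(body[0], s[:k])
            if ways_first:
                total += ways_first * count_seq(body[1:], s[k:])
        return total

    return count


# Ambiguous: E -> E+E | E*E | i
ambiguous = make_counter([("E", "E+E"), ("E", "E*E"), ("E", "i")], frozenset("E"))
# Layered by precedence: E -> E+T | T, T -> T*F | F, F -> i
layered = make_counter(
    [("E", "E+T"), ("E", "T"), ("T", "T*F"), ("T", "F"), ("F", "i")],
    frozenset("ETF"),
)
print(ambiguous("E", "i+i*i"))  # 2
print(layered("E", "i+i*i"))    # 1
```

The two trees found for the first grammar correspond to grouping the string as i+(i*i) or as (i+i)*i; the second grammar admits only the first grouping.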

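An overview of formal languages
Chomsky Hierarchy: Type-0 grammar (recursively enumerable) > Type-1 grammar (context-sensitive) > Type-2 grammar (context-free) > Type-3 grammar (regular)
$G = (V_T, V_N, S, \mathcal{P})$ is a type-0 grammar if each production has the form $\alpha \to \beta$, where $\alpha, \beta \in (V_T \cup V_N)^*$ and $\alpha$ contains at least one nonterminal.
If we apply the following i-th constraint to G, we have a type-i grammar:
(1) Every production $\alpha \to \beta$ satisfies $|\alpha| \le |\beta|$, with the exception of $S \to \varepsilon$.
(2) Every production has the form $A \to \beta$, where $A \in V_N$ and $\beta \in (V_T \cup V_N)^*$.
(3) Every production has the form $A \to \alpha B$ or $A \to \alpha$ (right-linear grammar), or every production has the form $A \to B\alpha$ or $A \to \alpha$ (left-linear grammar), where $A, B \in V_N$ and $\alpha \in V_T^*$.
Context-free grammars can describe the syntactic structure of most modern programming languages.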
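The constraints can be read as checks on individual productions. The following sketch, assuming single-character symbols and productions given as (left side, right side) string pairs, returns the largest i whose constraint every production satisfies; it applies each constraint on its own, as the slide states them, and classifies the grammar as written, not the language it generates. The function name and the sample grammars are hypothetical.

```python
def chomsky_type(productions, nonterminals, start):
    """Return 3, 2, 1 or 0 for the most restrictive type that fits, or None if not type-0."""
    def is_nt(symbol):
        return symbol in nonterminals

    def type0(left, right):
        return any(is_nt(s) for s in left)

    def type1(left, right):
        return len(left) <= len(right) or (left == start and right == "")

    def type2(left, right):
        return len(left) == 1 and is_nt(left)

    def right_linear(left, right):
        # A -> xB or A -> x, where x is a (possibly empty) string of terminals
        return type2(left, right) and not any(is_nt(s) for s in right[:-1])

    def left_linear(left, right):
        # A -> Bx or A -> x
        return type2(left, right) and not any(is_nt(s) for s in right[1:])

    def every(check):
        return all(check(left, right) for left, right in productions)

    if not every(type0):
        return None
    if every(right_linear) or every(left_linear):
        return 3
    if every(type2):
        return 2
    if every(type1):
        return 1
    return 0


print(chomsky_type([("S", "aS"), ("S", "b")], {"S"}, "S"))    # 3
print(chomsky_type([("S", "aSb"), ("S", "ab")], {"S"}, "S"))  # 2
```

A grammar that mixes left-linear and right-linear productions is reported as type 2 here, since constraint (3) asks for one of the two forms throughout.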

Homework
(1) Grammar G is [grammar not shown in the transcript]. What is the language L(G) specified by G? Write the leftmost and rightmost derivations of the sentences 0127, 34 and 568.
(2) Write a grammar whose language is the set of odd numbers, where no odd number starts with 0.
(3) Write grammars for the following languages: [languages not shown in the transcript]