Programming Languages and Design Lecture 2 Syntax Specifications of Programming Languages Instructor: Li Ma Department of Computer Science Texas Southern.

Presentation transcript:

Programming Languages and Design
Lecture 2: Syntax Specifications of Programming Languages
Instructor: Li Ma
Department of Computer Science, Texas Southern University, Houston
January, 2008

2 Review and Preview
- Last lecture: introduction to programming languages
  - Fundamental concepts
  - Computation models
  - Programming models/paradigms
  - Program processing
- Today’s lecture: syntax specifications of programming languages
  - Reference: Chapter 4 of “Foundations of Programming Languages: Design and Implementation”, S. H. Roosta
  - Three mechanisms: regular expressions, formal grammars, attribute grammars

3 Language Description
- A formal language is any set of character strings whose characters are drawn from a fixed, finite alphabet of symbols
  - The strings that belong to the language are called its constructs, or phrases
- Any programming language description can be classified according to its
  - Syntax, which deals with the formation of phrases
  - Semantics, which deals with the meaning of phrases
  - Pragmatics, which deals with the practical use of phrases

4 Syntax
- Syntax refers to the formation of constructs in the language and defines the relations between them
  - It describes the structure of the language without addressing the meaning of its constructs
  - The syntax of a programming language is similar to the grammar of a natural language
- Three mechanisms describe the design and implementation of programming languages
  - Regular expressions
  - Formal grammars
  - Attribute grammars

5 Regular Expressions
- Invented by Stephen Kleene in the early 1950s
- Represent a form of language definition
  - Each regular expression E denotes some language L(E) defined over the alphabet of the language
- Defined by the following set of rules
  - Alternation
    - If a and b are regular expressions, then so is (a+b)
    - The language defined by (a+b) contains all the strings from the language identified by a and all the strings from the language identified by b
  - Concatenation
    - If a and b are regular expressions, then so is (a*b); here the * written between two expressions denotes concatenation, not the postfix Kleene closure
    - The language defined by (a*b) contains all the strings formed by appending a string from the language identified by b to the end of a string from the language identified by a

6 Regular Expressions (cont’)
- Defined by the following set of rules (cont’)
  - Kleene closure
    - If a is a regular expression, then so is a*
    - The language of a* consists of all the strings formed by concatenating zero or more strings from the language identified by a
  - Positive closure
    - If a is a regular expression, then so is a⁺
    - The language of a⁺ consists of all the strings formed by concatenating one or more strings from the language identified by a
    - a⁺ is the same as a* except that ε is excluded (unless ε already belongs to the language identified by a)
  - Empty
    - ø is a regular expression whose language contains no strings
  - Atom
    - Any single symbol such as a, or ε, is a regular expression whose language consists of the single string: {a} or {ε}

7 Defined Language of the Regular Expressions

Regular Expression    Denoted Language
ø                     L_ø = { }
ε                     L_0 = { ε }
a                     L_1 = { a }
(A*B)                 L(A)*L(B) = { ab | a in L(A) and b in L(B) }
(A+B)                 L(A)+L(B) = { a | a in L(A) or a in L(B) }
(A*)                  L(A)* = { a_1 a_2 … a_n | a_1, a_2, …, a_n in L(A) and n ≥ 0 }
(A⁺)                  L(A)⁺ = { a_1 a_2 … a_n | a_1, a_2, …, a_n in L(A) and n > 0 }
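
The denoted languages listed on this slide can be tried out on small finite sets. The following is a minimal Python sketch, not part of the original lecture: the function names (`concat`, `alternation`, `closure`) and the `max_n` cut-off are assumptions made for illustration, since a true Kleene closure is infinite and has to be truncated to be printed. The empty Python string stands for ε.

```python
from itertools import product


def concat(la, lb):
    """L(A*B): each string of L(A) followed by each string of L(B)."""
    return {x + y for x, y in product(la, lb)}


def alternation(la, lb):
    """L(A+B): every string that is in L(A) or in L(B)."""
    return la | lb


def closure(la, max_n, positive=False):
    """Kleene closure (n >= 0) or positive closure (n >= 1).

    Assumption for illustration: only concatenations of at most
    max_n strings are generated, because the real closure is infinite.
    """
    result = set() if positive else {""}   # "" plays the role of epsilon
    power = {""}
    for _ in range(max_n):
        power = concat(power, la)          # one more factor from L(A)
        result |= power
    return result


A = {"a"}
B = {"b", "c"}
print(sorted(concat(A, B)))
print(sorted(alternation(A, B)))
print(sorted(closure(A, 3)))
print(sorted(closure(A, 3, positive=True)))
```

With A = {a}, the only difference between the two closures is the empty string, which matches the remark that the positive closure excludes ε.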

8 Formal Grammars
- A grammar is a notation that you can use to specify a structural description of the various constructs in the language
- Four components of the grammar of a programming language
  - Terminal symbols
  - Variable symbols (nonterminals)
  - Production rules
  - Start symbol

9 Production Rules
- Each production rule has
  - symbols as its left side
  - the symbol =>
  - a string over the set of terminals and variables as its right side
- A production rule indicates that the left-side symbols derive, or simply imply, the right-side symbols
- A derivation begins with the start symbol
  - Each successive string in the sequence is derived from the preceding string by applying a production rule

10 Definitions for Grammar
- The grammar of a programming language can be defined as a quadruple, G = (T, V, P, S)
  - T is a finite set of terminal symbols, written as lowercase characters
  - V is a finite set of variable symbols (V ∩ T = ø), written as uppercase characters
  - P is a finite set of production rules of the form α.X.β => δ, where α, β, and δ are in (V ∪ T)* and X is in V
  - S in V is the start symbol of the phrase
- Two grammars, G1 and G2, are equivalent if and only if L(G1) = L(G2)
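
As an illustration of the quadruple and of derivation from the start symbol, here is a small Python sketch. The example grammar (S => aSb, S => ab) and the names `one_step` and `language` are assumptions chosen for this example and do not come from the lecture. The length-based pruning is only valid because no rule in this grammar shortens a string.

```python
from collections import deque

# G = (T, V, P, S) for an assumed example grammar generating a^n b^n, n >= 1.
T = {"a", "b"}
V = {"S"}
P = [("S", "aSb"), ("S", "ab")]   # (left side, right side) pairs
START = "S"


def one_step(form):
    """Yield every string obtained from `form` by applying one rule once."""
    for lhs, rhs in P:
        i = form.find(lhs)
        while i != -1:
            yield form[:i] + rhs + form[i + len(lhs):]
            i = form.find(lhs, i + 1)


def language(max_len):
    """Terminal strings of length <= max_len derivable from START."""
    seen, found = {START}, set()
    queue = deque([START])
    while queue:
        form = queue.popleft()
        for nxt in one_step(form):
            # Pruning by length assumes no rule shrinks the string.
            if len(nxt) > max_len or nxt in seen:
                continue
            seen.add(nxt)
            if all(ch in T for ch in nxt):
                found.add(nxt)          # only terminals left: a phrase
            else:
                queue.append(nxt)       # still contains a variable
    return found


print(sorted(language(6), key=len))
```

Two grammars could be compared for (bounded) equivalence in the same spirit by checking that they produce the same set of strings up to some length, although this is only evidence and not a proof that L(G1) = L(G2).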

11 Classification of Grammars
- Type 0: unrestricted grammar
  - Requires at least one nonterminal symbol on the left side of a production rule
    - Form α => β, where α is in (V ∪ T)⁺ and β is in (V ∪ T)*
  - Also called a recursively enumerable grammar, or phrase-structured grammar
- Type 1: context-sensitive grammar
  - Requires that the right side of a production rule have no fewer symbols than the left side
    - Form α => β, where α = δ_1 A δ_2, β = δ_1 ω δ_2, A is in V, ω is in (V ∪ T)⁺, and δ_1, δ_2 are in (V ∪ T)*

12 Classification of Grammars (cont’)
- Type 2: context-free grammar
  - Requires that the left side of a production rule be a single variable symbol and the right side be a combination of terminal and variable symbols
    - Form A => α, where A is in V and α is in (V ∪ T)*
  - Backus-Naur Form (BNF) grammar
    - Equivalent to a context-free grammar
    - Differs only in the notation
      - Nonterminals are enclosed in angle brackets, < and >
      - The symbol ::= is used in place of => in production rules

13 Classification of Grammars (cont’)
- Type 3: regular grammar
  - Restricted to only one terminal, or one terminal and one variable, on the right side of a production rule
  - The most restrictive grammar
  - Right-linear grammar
    - Form A => xB or A => x, where A, B are in V and x is in T
    - Rightmost derivation
  - Left-linear grammar
    - Form A => Bx or A => x, where A, B are in V and x is in T
    - Leftmost derivation
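
Because every rule of a right-linear grammar consumes exactly one terminal and leaves at most one variable, such a grammar can be run directly as a recogniser. The sketch below assumes an example grammar (S => aS | bB | b, B => bB | b) that is not from the lecture; the `RULES` table and the `accepts` function are hypothetical names, and `None` marks a rule of the form A => x.

```python
# Assumed right-linear grammar: S => aS | bB | b,  B => bB | b
# Each entry is (terminal x, following variable B), with None for A => x.
RULES = {
    "S": [("a", "S"), ("b", "B"), ("b", None)],
    "B": [("b", "B"), ("b", None)],
}


def accepts(word, start="S"):
    """Return True if `word` can be derived from `start`."""
    current = {start}                  # variables that may still be expanded
    for i, ch in enumerate(word):
        is_last = i == len(word) - 1
        nxt = set()
        for var in current:
            for terminal, follow in RULES.get(var, ()):
                if terminal != ch:
                    continue
                if follow is not None:
                    nxt.add(follow)    # rule A => xB: continue with B
                elif is_last:
                    nxt.add(None)      # rule A => x: derivation ends here
        current = nxt
    return None in current


for w in ["b", "ab", "aabbb", "ba", "a", ""]:
    print(repr(w), accepts(w))
```

The set `current` tracks all variables the derivation could be in after each symbol, which is the same bookkeeping a nondeterministic finite automaton would do.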

14 Syntax Tree
- Two parts of programming language syntax
  - Lexical syntax: describes the smallest units with significance, called tokens
  - Phrase-structure syntax: explains how tokens are arranged into programs
- The syntactic structure of a phrase can be represented with a syntax tree (derivation tree or parse tree)
  - Terminal nodes: terminal symbols
  - Internal nodes: variable symbols
  - Root: start symbol
  - The label of an internal node is the left side of a production rule; the labels of its children (from left to right) are the right side of that production rule

15 Syntax Tree (cont’)
- Recognition/representation
  - Determining whether the phrase is syntactically valid
  - Production rules are used to construct a syntax tree
- The grammar-oriented compiling technique consists of two components
  - A lexical analyzer: converts the stream of input characters to a stream of tokens
  - A syntactic analyzer: forms a derivation tree from the token list; it is a combination of
    - A parser
    - An intermediate code generator
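
The lexical analyzer step can be sketched with Python's standard `re` module. The token classes in `TOKEN_SPEC` (NUMBER, IDENT, OP, SKIP) and the function name `tokenize` are assumptions made for this example; a real language would define its own lexical syntax.

```python
import re

# Assumed token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Convert a stream of characters into a list of (kind, lexeme) tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        if m.lastgroup != "SKIP":      # whitespace separates tokens only
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("x = 3 + y1"))
```

The resulting token list is what the syntactic analyzer would consume; each token class here is itself described by a regular expression.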

16 Parsers
- Parsing: deriving the parse tree
- Two basic approaches to deriving parse trees
  - Top-down parsers
    - Begin with the start symbol as the root of the tree
    - Repeatedly replace variable symbols with the right side of a production rule until only terminal symbols remain
  - Bottom-up parsers
    - Begin with a string of terminal symbols
    - Repeatedly replace sequences in the string with variable symbols
    - The process continues until the start symbol is produced
- In both cases, the tree is the result of a syntactic analysis of the phrase with respect to the grammar
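
A top-down parser can be sketched as a set of mutually recursive functions, one per variable symbol. The grammar used here (E => T + E | T, T => F * T | F, F => ( E ) | digit) is an assumed example rather than one from the lecture, and the names `parse` and `leaves` are hypothetical. Each tree node is a tuple whose first element is the variable and whose remaining elements are its children, left to right.

```python
def parse(text):
    """Recursive-descent parse of `text`; returns a parse tree of tuples."""
    tokens = list(text.replace(" ", ""))   # one character per token
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():                            # E => T + E | T
        left = term()
        if peek() == "+":
            expect("+")
            return ("E", left, "+", expr())
        return ("E", left)

    def term():                            # T => F * T | F
        left = factor()
        if peek() == "*":
            expect("*")
            return ("T", left, "*", term())
        return ("T", left)

    def factor():                          # F => ( E ) | digit
        nonlocal pos
        tok = peek()
        if tok == "(":
            expect("(")
            inner = expr()
            expect(")")
            return ("F", "(", inner, ")")
        if tok is not None and tok.isdigit():
            pos += 1
            return ("F", tok)
        raise SyntaxError(f"unexpected {tok!r} at position {pos}")

    tree = expr()                          # start symbol is the root
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r} at position {pos}")
    return tree


def leaves(tree):
    """Read the terminal nodes of the tree from left to right."""
    if isinstance(tree, str):
        return tree
    return "".join(leaves(child) for child in tree[1:])


print(parse("1+2*3"))
```

Reading the leaves of the returned tree from left to right gives back the input phrase, which is the defining property of a derivation tree.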

17 Ambiguity
- Ambiguous grammar: a grammar that represents some phrase of its language with two or more distinct derivation trees
  - Due to a lack of syntactic structure
  - Ambiguity should be eliminated whenever possible
    - Revise the grammar
    - Introduce a disambiguating rule
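
Ambiguity can be made concrete by counting derivation trees. The sketch below uses an assumed textbook-style grammar, E => E + E | a, and a revised left-recursive version, E => E + a | a; neither grammar nor the function names come from the lecture.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_trees(s):
    """Number of derivation trees for s under E => E + E | a."""
    total = 1 if s == "a" else 0           # rule E => a
    for i, ch in enumerate(s):
        if ch == "+":                      # rule E => E + E, split at this +
            total += count_trees(s[:i]) * count_trees(s[i + 1:])
    return total


def count_trees_revised(s):
    """Number of derivation trees for s under E => E + a | a."""
    if s == "a":
        return 1
    if s.endswith("+a"):                   # only one way to apply E => E + a
        return count_trees_revised(s[:-2])
    return 0


for phrase in ["a", "a+a", "a+a+a", "a+a+a+a"]:
    print(phrase, count_trees(phrase), count_trees_revised(phrase))
```

Any phrase with a count above one shows that the first grammar is ambiguous, while the revised grammar yields exactly one tree per phrase by forcing the additions to group to the left.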

18 BNF Variations
- Other notational variations
  - Example: the notation { … }_i^j can be used to express any number n of occurrences of the enclosed sequence of symbols, for i ≤ n ≤ j
  - Extended BNF grammar
    - Adds some extra notations to allow easier description of languages
    - Anything that can be specified with BNF can also be specified with an Extended BNF (EBNF) grammar
    - Increases the readability and writability of the production rules
  - Syntax diagram
    - A pictorial technique, equivalent to a BNF grammar
    - In this approach, each production rule is represented as a directed graph whose vertices are symbols
      - Terminal symbols: circles
      - Variable symbols: rectangles
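
The bounded-repetition notation has a close analogue in the `{i,j}` quantifier of Python's `re` module. The helper below is a hypothetical illustration for a fixed sequence of terminal symbols only; the name `repeats_between` is an assumption and it does not handle variable symbols.

```python
import re


def repeats_between(unit, i, j):
    """Recogniser for between i and j occurrences of the terminal string `unit`."""
    # re.escape treats `unit` literally; {i,j} is the bounded quantifier.
    return re.compile(f"(?:{re.escape(unit)}){{{i},{j}}}")


one_to_three_ab = repeats_between("ab", 1, 3)
for phrase in ["", "ab", "ababab", "abababab"]:
    print(repr(phrase), one_to_three_ab.fullmatch(phrase) is not None)
```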

19 Attribute Grammars
- Developed by Donald Knuth in 1968
- Powerful and elegant mechanisms that formalize both the context-free and context-sensitive aspects of a language’s syntax
  - Can be used to determine whether a variable has been declared and whether the use of the variable is consistent with its declaration
- An extension to a context-free grammar with certain formal primitives
  - Enables syntax aspects of a language to be specified more precisely
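
The declaration check mentioned on this slide can be sketched informally in Python. This is not Knuth's formalism: it only mimics the idea of passing an attribute (a symbol table, here `env`) along the statements of a program and collecting errors. The statement shapes ('decl' and 'use' tuples) and the function name `check` are assumptions made for this example.

```python
def check(program):
    """Check a list of ('decl' | 'use', name, type) statements.

    `env` plays the role of an attribute handed from each statement
    to the next; the returned error list is computed from it.
    """
    env = {}
    errors = []
    for kind, name, typ in program:
        if kind == "decl":
            if name in env:
                errors.append(f"{name}: declared twice")
            else:
                env[name] = typ
        elif kind == "use":
            if name not in env:
                errors.append(f"{name}: used before declaration")
            elif env[name] != typ:
                errors.append(f"{name}: declared {env[name]}, used as {typ}")
    return errors


program = [
    ("decl", "x", "int"),
    ("use", "x", "int"),
    ("use", "y", "int"),
    ("use", "x", "float"),
]
print(check(program))
```

A context-free grammar alone accepts all four statements; the extra attribute is what lets the last two be rejected.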