Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse.

Presentation transcript:

Programming Languages, 2nd edition, Tucker and Noonan
Chapter 2: Syntax
"A language that is simple to parse for the compiler is also simple to parse for the human programmer." N. Wirth

Contents
2.1 Grammars
– Backus-Naur Form
– Derivations
– Parse Trees
– Associativity and Precedence
– Ambiguous Grammars
2.2 Extended BNF
2.3 Syntax of a Small Language: Clite
– Lexical Syntax
– Concrete Syntax
2.4 Compilers and Interpreters
2.5 Linking Syntax and Semantics
– Abstract Syntax
– Abstract Syntax Trees
– Abstract Syntax of Clite

Translation/Execution – Compiler review

Thinking about Syntax
The syntax of a programming language is a precise description of all its structurally correct programs.
Grammar rules are a common technique for describing language syntax precisely.
A precise syntax definition was first used for Algol 60.
Three levels:
– Lexical syntax
– Concrete syntax
– Abstract syntax

Levels of Syntax
– Lexical syntax: describes the basic symbols of the language (names, values, operators, etc.).
– Concrete syntax: rules for writing expressions, statements, and programs.
– Abstract syntax: describes an internal representation of the program, emphasizing content over form.
The authors define Clite, a mini-language, as a teaching tool for the study of syntax and semantics.

2.1 Grammars
A metalanguage is a language used to define other languages (cf. metaknowledge).
A grammar is a set of rules, written in a metalanguage, that defines the syntax of a language.

Backus-Naur Form (BNF)
– Notation for describing a context-free grammar (see the Chomsky hierarchy)
– Sometimes called Backus Normal Form
– First used to define the syntax of Algol 60
– Now used to define the syntax of most major languages

Elements of a Context-Free Grammar
– Set of productions: P
– Terminal symbols: T
– Nonterminal symbols: N
– Start symbol: S ∈ N
A production has the form A → ω, where A ∈ N and ω is a string of symbols from N and T, i.e., ω ∈ (N ∪ T)*.

Example: Binary Digits
Consider the grammar:
binaryDigit → 0
binaryDigit → 1
or equivalently:
binaryDigit → 0 | 1
Here, | and → are metacharacters (metasymbols).

Derivations
Consider the grammar:
Integer → Digit | Integer Digit
Digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
We can derive any unsigned integer, like 352, from this grammar. Derivations can (1) produce all legal integers or (2) show that a particular integer is correctly formed.

Derivation of 352 as an Integer
A 6-step process that begins with the start symbol. At each step, replace a nonterminal by the right-hand side (RHS) of one of its rules:
Step 1: Integer ⇒ Integer Digit
Step 2: ⇒ Integer Digit Digit
Step 3: ⇒ Digit Digit Digit
Step 4: ⇒ 3 Digit Digit
Step 5: ⇒ 3 5 Digit
Step 6: ⇒ 3 5 2
The derivation is finished when only terminals remain on the RHS.

A Different Derivation of 352
Integer ⇒ Integer Digit
⇒ Integer 2
⇒ Integer Digit 2
⇒ Integer 5 2
⇒ Digit 5 2
⇒ 3 5 2
This is called a rightmost derivation, since at each step the rightmost nonterminal is replaced. (The first one was a leftmost derivation.)
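
Both derivations of 352 can be replayed mechanically. The following is a minimal Python sketch, not code from the textbook: the helper name `derive` and the dictionary encoding of the grammar are assumptions made for illustration. It rewrites the leftmost or rightmost nonterminal with a chosen right-hand side and records every sentential form.

```python
# Grammar from the slides, encoded as a dict (an illustrative encoding).
GRAMMAR = {
    "Integer": ["Digit", "Integer Digit"],
    "Digit": list("0123456789"),
}

def derive(start, choices, leftmost=True):
    """Apply each chosen right-hand side to the leftmost (or rightmost)
    nonterminal and return every sentential form along the way."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        i = positions[0] if leftmost else positions[-1]
        if rhs not in GRAMMAR[form[i]]:
            raise ValueError(f"{form[i]} -> {rhs} is not a production")
        form[i:i + 1] = rhs.split()
        steps.append(" ".join(form))
    return steps

leftmost_steps = derive(
    "Integer",
    ["Integer Digit", "Integer Digit", "Digit", "3", "5", "2"],
    leftmost=True,
)
rightmost_steps = derive(
    "Integer",
    ["Integer Digit", "2", "Integer Digit", "5", "Digit", "3"],
    leftmost=False,
)

for step in rightmost_steps:
    print(step)
```

Both calls end in the same terminal string, 3 5 2. Only the order in which nonterminals are replaced differs.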

Notation for Derivations
Integer ⇒* 352
Means that 352 can be derived in a finite number of steps using the grammar for Integer.
352 ∈ L(G)
Means that 352 is a member of the language defined by grammar G.
Definition: the language L defined by a BNF grammar G is the set of all terminal strings that can be derived from the start symbol.
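
Membership in L(G) can be tested directly for a small grammar. The sketch below is one possible approach, not the textbook's: the function names `derives`, `matches`, and `in_language` are hypothetical. It assumes the grammar has no empty productions and no cyclic unit productions, which holds for the Integer grammar.

```python
from functools import lru_cache

GRAMMAR = {
    "Integer": [("Digit",), ("Integer", "Digit")],
    "Digit": [(d,) for d in "0123456789"],
}

@lru_cache(maxsize=None)
def derives(symbol, s):
    """True if symbol =>* s. Assumes no empty or cyclic unit productions."""
    if symbol not in GRAMMAR:          # terminal: must match exactly
        return symbol == s
    return any(matches(rhs, s) for rhs in GRAMMAR[symbol])

def matches(rhs, s):
    """True if the symbols in rhs can jointly derive the string s."""
    if len(rhs) == 1:
        return derives(rhs[0], s)
    # Give the first symbol a nonempty prefix and the rest a nonempty suffix.
    return any(
        derives(rhs[0], s[:i]) and matches(rhs[1:], s[i:])
        for i in range(1, len(s))
    )

def in_language(s, start="Integer"):
    """True if s is in L(G), i.e. start =>* s."""
    return derives(start, s)

print(in_language("352"))   # True
print(in_language("3x2"))   # False
```

Every recursive call on the left-recursive rule receives a strictly shorter string, so the search terminates.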

Parse Trees
A parse tree is a graphical representation of a derivation.
– The root node of the tree is the start symbol.
– Each internal node of the tree corresponds to a nonterminal.
– The children of a node represent a right-hand side of a production for which the node is the left-hand side.
– Each leaf node represents a terminal symbol of the derived string, reading from left to right.

E.g., the step Integer ⇒ Integer Digit appears in the parse tree as a node labeled Integer with two children, Integer and Digit.

Parse Tree for 352 as an Integer (Figure 2.1)
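
A parse tree like this can be written down as a nested data structure. The sketch below is illustrative only: the tuple encoding and the helper names `digit` and `leaves` are assumptions, and the tree is built by hand from the Integer grammar rather than copied from Figure 2.1. Reading the leaves from left to right recovers the derived string.

```python
# A node is (symbol, children); a leaf is a plain terminal string.
def digit(d):
    return ("Digit", [d])

# Tree for 352 under Integer -> Digit | Integer Digit.
# The rule is left-recursive, so the tree leans to the left.
tree_352 = ("Integer", [
    ("Integer", [
        ("Integer", [digit("3")]),
        digit("5"),
    ]),
    digit("2"),
])

def leaves(node):
    """Return the terminal symbols of the tree, read left to right."""
    if isinstance(node, str):
        return [node]
    _symbol, children = node
    return [leaf for child in children for leaf in leaves(child)]

print("".join(leaves(tree_352)))
```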

Arithmetic Expression Grammar
The following grammar defines the language of arithmetic expressions with 1-digit integers, addition, and subtraction.
Expr → Expr + Term | Expr - Term | Term
Term → 0 | ... | 9 | ( Expr )

Parse of the String (Figure 2.2)

Associativity and Precedence
Grammars define associativity and precedence among the operators in an expression.
– Precedence: which operator is evaluated first; e.g., in the expression “a + b / c”.
– Associativity: evaluation order for adjacent operators of equal precedence; e.g., in the expression “a - b + c”.

Grammar G1
Consider the more interesting grammar G1:
Expr → Expr + Term | Expr - Term | Term
Term → Term * Factor | Term / Factor | Term % Factor | Factor
Factor → Primary ** Factor | Primary
Primary → 0 | ... | 9 | ( Expr )

Parse of 4**2**3+5*6+7 for Grammar G1 (Figure 2.3)

Associativity and Precedence for Grammar G1 (Table 2.1)

Precedence  Associativity  Operators
3           right          **
2           left           * / %
1           left           + -

The structure of the parse tree shows operator precedence and associativity:
– Operators lower in the tree are evaluated first.
– An operation can’t be performed until its operands are evaluated.

Precedence & Associativity in Grammars
– An operator’s precedence is determined by the length of the shortest derivation from the start symbol to the operator (see Figure 2.3): the longer that derivation, the higher the precedence.
– Left or right associativity is determined by left or right recursion in the operator’s rule; compare the operators ** and + in Figure 2.3.
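
One way to see how G1 encodes precedence and associativity is a small recursive-descent evaluator with one function per nonterminal. This is a sketch under stated assumptions, not the textbook's implementation: the names `tokenize` and `evaluate` are hypothetical, the left-recursive rules are implemented as loops (a direct recursive call would never terminate), `/` is taken to be integer division, and characters outside the grammar are silently skipped by the tokenizer.

```python
import re

def tokenize(text):
    # '**' is listed before the single-character operators so it is one token.
    return re.findall(r"\*\*|[-+*/%()]|\d", text)

def evaluate(text):
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():      # Expr -> Expr + Term | Expr - Term | Term
        value = term()
        while peek() in ("+", "-"):        # loop: groups to the left
            if advance() == "+":
                value = value + term()
            else:
                value = value - term()
        return value

    def term():      # Term -> Term * Factor | Term / Factor | Term % Factor | Factor
        value = factor()
        while peek() in ("*", "/", "%"):
            op = advance()
            rhs = factor()
            if op == "*":
                value = value * rhs
            elif op == "/":
                value = value // rhs       # assumption: integer division
            else:
                value = value % rhs
        return value

    def factor():    # Factor -> Primary ** Factor | Primary
        base = primary()
        if peek() == "**":                 # right recursion: groups to the right
            advance()
            return base ** factor()
        return base

    def primary():   # Primary -> 0 | ... | 9 | ( Expr )
        tok = advance()
        if tok == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        return int(tok)

    result = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return result

print(evaluate("4**2**3+5*6+7"))   # 4**(2**3) + (5*6) + 7 = 65573
```

The call chain expr, term, factor, primary mirrors the derivation depth: `**` is reached last, so it binds most tightly.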

Ambiguous Grammars
A grammar is ambiguous if one of its strings has two or more different parse trees.
C, C++, and Java have a large number of operators and precedence levels.
Instead of using a large grammar, we can:
– Write a smaller ambiguous grammar, and
– Give separate precedence and associativity rules (e.g., Table 2.1)

An Ambiguous Expression Grammar G2
Expr → Expr Op Expr | ( Expr ) | Integer
Op → + | - | * | / | % | **
Notes:
– G2 is equivalent to G1, i.e., its language is the same.
– G2 has fewer productions and nonterminals than G1.
– However, G2 is ambiguous.

Ambiguous Parse Using Grammar G2 (Figure 2.4)
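
The practical cost of ambiguity shows up when each parse tree is evaluated. The sketch below enumerates every parse of a token list under Expr → Expr Op Expr | Integer and collects the resulting values. The function name `all_values` and the example string 5 - 4 + 3 are chosen here for illustration and are not taken from the slide; parentheses and the operators /, %, and ** are left out to keep the sketch short.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def all_values(tokens):
    """Values of every parse tree of tokens under Expr -> Expr Op Expr | Integer."""
    if len(tokens) == 1:
        return [int(tokens[0])]
    results = []
    # Any operator can be the root of the tree; each choice is a different parse.
    for i in range(1, len(tokens), 2):
        for left in all_values(tokens[:i]):
            for right in all_values(tokens[i + 1:]):
                results.append(OPS[tokens[i]](left, right))
    return results

values = all_values(["5", "-", "4", "+", "3"])
print(values)   # one value per parse tree
```

The string has two parse trees, 5 - (4 + 3) and (5 - 4) + 3, and they evaluate to different results. This is why an ambiguous grammar needs separate precedence and associativity rules.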

The Dangling Else
IfStatement → if ( Expression ) Statement
| if ( Expression ) Statement else Statement
where
Statement → Assignment | IfStatement | Block
What if one of the statements is itself an IfStatement?

Example
With which ‘if’ does the following ‘else’ associate?
if (x < 0) if (y < 0) y = y - 1; else y = 0;
Answer: either one!

The Dangling Else Ambiguity (Figure 2.5)
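
The two parse trees correspond to two programs with different behavior. Python's indentation forces a choice between them, so each reading of the example can be written out explicitly. This is only an illustration; the function names are hypothetical.

```python
def else_with_inner_if(x, y):
    # Reading 1: the else belongs to the nearest (inner) if.
    if x < 0:
        if y < 0:
            y = y - 1
        else:
            y = 0
    return y

def else_with_outer_if(x, y):
    # Reading 2: the else belongs to the outer if.
    if x < 0:
        if y < 0:
            y = y - 1
    else:
        y = 0
    return y

# The readings disagree whenever x >= 0, or x < 0 and y >= 0.
print(else_with_inner_if(1, 5), else_with_outer_if(1, 5))     # 5 0
print(else_with_inner_if(-1, 5), else_with_outer_if(-1, 5))   # 0 5
```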

Solving the Dangling Else Ambiguity
Algol 60, C, C++: associate each else with the closest if; use {} or begin…end to override.
Algol 68, Modula, Ada: use an explicit delimiter to end every conditional (e.g., if…fi), so the two readings are written differently.

else paired with the inner if:
    if (x < 0)
        if (y < 0)
            y = y - 1;
        else
            y = x / y;
        fi;
    fi;

else paired with the outer if:
    if (x < 0)
        if (y < 0)
            y = y - 1;
        fi;
    else
        y = x / y;
    fi;

Solving the Dangling Else Ambiguity
Java: rewrite the grammar to limit what can appear in a conditional:
IfThenStatement → if ( Expression ) Statement
IfThenElseStatement → if ( Expression ) StatementNoShortIf else Statement
The category StatementNoShortIf includes all statement types except IfThenStatement.

2.2 Extended BNF (EBNF)
BNF: uses recursion to represent iteration.
EBNF: additional metacharacters represent iteration and choice:
– { } braces: a series of zero or more occurrences
– ( ) parens: pick exactly one from the enclosed list
– [ ] brackets: pick zero or one from the enclosed list
How are metacharacters distinguished from terminal symbols?

Compare BNF/EBNF Examples
BNF
Expr → Term | Expr + Term | Expr - Term
IfStatement → if ( Expr ) Statement
| if ( Expr ) Statement else Statement
EBNF
Expr → Term { ( + | - ) Term }
IfStatement → if ( Expr ) Statement [ else Statement ]
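
The EBNF form of the Expr rule maps almost directly onto code: braces become a loop and the parenthesized alternatives become a choice between tokens. The sketch below is hypothetical. `parse_expr` is not from the textbook, and Term is restricted to a single digit to keep the example short. It builds a tree as nested tuples.

```python
def parse_expr(tokens):
    """Expr -> Term { ( + | - ) Term }, with Term limited to one digit."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("expected a digit")
        pos += 1
        return int(tokens[pos - 1])

    tree = term()
    # { ... } means zero or more repetitions, so it becomes a while loop.
    while pos < len(tokens) and tokens[pos] in ("+", "-"):   # ( + | - ): pick one
        op = tokens[pos]
        pos += 1
        tree = (op, tree, term())      # fold to the left as each Term arrives
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree

print(parse_expr(list("5-4+3")))   # ('+', ('-', 5, 4), 3)
```

Folding each new Term into the tree built so far yields the same left-leaning structure that the left-recursive BNF rule describes.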

C-style EBNF
C-style EBNF lists alternatives on separate lines and uses the subscript opt to signify optional parts, e.g.:
IfStatement:
    if ( Expression ) Statement ElsePart opt
ElsePart:
    else Statement

EBNF to BNF
We can always rewrite an EBNF grammar as a BNF grammar. E.g.,
A → x { y } z
can be rewritten:
A → x A' z
A' → ε | y A'
(Rewriting EBNF rules with ( ) and [ ] is left as an exercise.)
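
A quick check that the rewrite preserves the language is to expand both forms up to a bound and compare the results. This is a sketch, not a proof: the helper names `expand_a_prime` and `expand_a` and the bound of three y's are arbitrary choices for illustration.

```python
def expand_a_prime(max_ys):
    """Strings derivable from A' -> ε | y A' using at most max_ys copies of y."""
    if max_ys == 0:
        return [""]                                   # A' -> ε
    # A' -> ε, or A' -> y A' with one fewer y left to spend.
    return [""] + ["y" + rest for rest in expand_a_prime(max_ys - 1)]

def expand_a(max_ys):
    """Strings derivable from A -> x A' z."""
    return ["x" + middle + "z" for middle in expand_a_prime(max_ys)]

bnf_strings = expand_a(3)
ebnf_strings = ["x" + "y" * n + "z" for n in range(4)]   # A -> x { y } z

print(bnf_strings == ebnf_strings)   # True
```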

Syntax Diagram for Expressions with Addition (Figure 2.6)
Syntax diagrams are another way to describe grammar rules. They were popularized by their use in describing the grammar of Pascal.

All Three are Equally Powerful
– BNF is equivalent in power to context-free grammars: any context-free production can be written in BNF.
– EBNF is no more (or less) powerful or expressive than BNF. Its virtue is compactness.
– Syntax diagrams are equally expressive.

Summary & Preview
Grammars:
– BNF notation
– Grammars & parse trees
– Grammars, parse trees, associativity & precedence
– Ambiguity in grammars
Next up:
– Clite syntax
– Lexical and concrete syntax
– Compilers & interpreters
– Abstract syntax