Download presentation
Presentation is loading. Please wait.
Published byPhilip Sanders Modified over 9 years ago
1
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 2 Syntax A language that is simple to parse for the compiler is also simple to parse for the human programmer. N. Wirth
2
Copyright © 2006 The McGraw-Hill Companies, Inc. Contents 2.1 Grammars 2.1.1 Backus-Naur Form 2.1.2 Derivations 2.1.3 Parse Trees 2.1.4 Associativity and Precedence 2.1.5 Ambiguous Grammars 2.2 Extended BNF 2.3 Syntax of a Small Language: Clite 2.3.1 Lexical Syntax 2.3.2 Concrete Syntax 2.4 Compilers and Interpreters 2.5 Linking Syntax and Semantics 2.5.1 Abstract Syntax 2.5.2 Abstract Syntax Trees 2.5.3 Abstract Syntax of Clite
3
Copyright © 2006 The McGraw-Hill Companies, Inc. Translation/Execution – Compiler review
4
Copyright © 2006 The McGraw-Hill Companies, Inc. Thinking about Syntax The syntax of a programming language is a precise description of all its grammatically correct programs. Grammar rules are a common technique for describing language syntax precisely. Precise syntax was first used with Algol 60. Three levels: –Lexical syntax –Concrete syntax –Abstract syntax
5
Copyright © 2006 The McGraw-Hill Companies, Inc. Levels of Syntax Lexical syntax: describes the basic symbols of the language (names, values, operators, etc.) Concrete syntax: rules for writing expressions, statements and programs. Abstract syntax: describes an internal representation of the program, emphasizes content over form. The authors define Clite, a mini-language, to use as a teaching tool in the study syntax and semantics.
6
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1 Grammars A metalanguage is a language used to define other languages. cf metaknowledge A grammar is a set of rules, written in a metalanguage, and used to define the syntax of a language.
7
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1.1 Backus-Naur Form (BNF) Notation for describing a context-free grammar (see Chomsky hierarchy) Sometimes called Backus Normal Form First used to define syntax of Algol 60 Now used to define syntax of most major languages
8
Copyright © 2006 The McGraw-Hill Companies, Inc. Elements of a Context-Free Grammar Set of productions: P terminal symbols: T nonterminal symbols: N start symbol: A production has the form and ω is a string from N and T.
9
Copyright © 2006 The McGraw-Hill Companies, Inc. Example: Binary Digits Consider the grammar: binaryDigit 0 binaryDigit 1 or equivalently: binaryDigit 0 | 1 Here, | and are metacharacters (metasymbols)
10
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1.2 Derivations Consider the grammar: Integer Digit | Integer Digit Digit 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 We can derive any unsigned integer, like 352, from this grammar. (Derivations can (1) produce all legal integers or (2) show that a particular integer is correctly formed)
11
Copyright © 2006 The McGraw-Hill Companies, Inc. Derivation of 352 as an Integer A 6-step process, begins with the start symbol Step 1: Integer Integer Digit Replace a nonterminal by a RHS of one of its rules: Step 1: Integer Integer Digit Step 2: Integer Digit Digit Step 3: Digit Digit Digit Step 4: 3 Digit Digit Step 5: 3 5 Digit Step 6: 3 5 2 Finished when there are only terminals on the RHS
12
Copyright © 2006 The McGraw-Hill Companies, Inc. A Different Derivation of 352 Integer Integer Digit Integer 2 Integer Digit 2 Integer 5 2 Digit 5 2 3 5 2 This is called a rightmost derivation, since at each step the rightmost nonterminal is replaced. (The first one was a leftmost derivation.)
13
Copyright © 2006 The McGraw-Hill Companies, Inc. Notation for Derivations Integer * 352 Means that 352 can be derived in a finite number of steps using the grammar for Integer. 352 L(G) Means that 352 is a member of the language defined by grammar G. Definition: the language L defined by a BNF grammar G is the set of all terminal strings that can be derived from the start symbol.
14
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1.3 Parse Trees A parse tree is a graphical representation of a derivation. The root node of the tree is the start symbol. Each internal node of the tree corresponds to a non-terminal The child(ren) of a node represent a right-hand side of a production for which the node is the left-hand side. Each leaf node represents a terminal symbol of the derived string, reading from left to right.
15
Copyright © 2006 The McGraw-Hill Companies, Inc. E.g., The step Integer Integer Digit appears in the parse tree as: Integer Digit
16
Copyright © 2006 The McGraw-Hill Companies, Inc. Parse Tree for 352 as an Integer Figure 2.1
17
Copyright © 2006 The McGraw-Hill Companies, Inc. Arithmetic Expression Grammar The following grammar defines the language of arithmetic expressions with 1-digit integers, addition, and subtraction. Expr Expr + Term | Expr – Term | Term Term 0 |... | 9 | ( Expr )
18
Copyright © 2006 The McGraw-Hill Companies, Inc. Parse of the String 5-4+3 Figure 2.2
19
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1.4 Associativity and Precedence Grammars define associativity and precedence among the operators in an expression. Precedence: which operator is evaluated first; e.g., in the expression “a + b / c” Associativity: evaluation order for equal precedence (adjacent) operators; e.g., in the expression “a + b + c”
20
Copyright © 2006 The McGraw-Hill Companies, Inc. Grammar G 1: Consider the more interesting grammar G 1 : Expr Expr + Term | Expr – Term | Term Term Term * Factor | Term / Factor | Term % Factor | Factor Factor Primary ** Factor | Primary Primary 0 |... | 9 | ( Expr )
21
Copyright © 2006 The McGraw-Hill Companies, Inc. Parse of 4**2**3+5*6+7 for Grammar G 1 Figure 2.3 Expr Expr + Term |Expr – Term | Term Term Term * Factor | Term / Factor | Term % Factor | Factor Factor Primary ** Factor | Primary Primary 0 |... | 9 | ( Expr )
22
Copyright © 2006 The McGraw-Hill Companies, Inc. PrecedenceAssociativityOperators 3right ** 2left * / % 1left + - The structure of the parse tree shows operator precedence & associativity: Operators lower in the tree are evaluated first. An operation can’t be performed until its operands are evaluated Associativity and Precedence for Grammar G 1 Table 2.1
23
Copyright © 2006 The McGraw-Hill Companies, Inc. Precedence & Associativity in Grammars An operator’s precedence is determined by the length of the shortest derivation from the start symbol to the operator (see Figure 2.3) Left- or right- associativity is determined by left- or right- recursion. compare the operators ** and + in Figure 2.3
24
Copyright © 2006 The McGraw-Hill Companies, Inc. 2.1.5 Ambiguous Grammars A grammar is ambiguous if one of its strings has two or more different parse trees. C, C++, and Java have a large number of –operators and –precedence levels Instead of using a large grammar, we can: –Write a smaller ambiguous grammar, and –Give separate precedence and associativity rules (e.g., Table 2.1)
25
Copyright © 2006 The McGraw-Hill Companies, Inc. An Ambiguous Expression Grammar G 2 Expr -> Expr Op Expr | ( Expr ) | Integer Op -> + | - | * | / | % | ** Notes: –G 2 is equivalent to G 1. i.e., its language is the same. –G 2 has fewer productions and nonterminals than G 1. –However, G 2 is ambiguous.
26
Copyright © 2006 The McGraw-Hill Companies, Inc. Ambiguous Parse of 5-4+3 Using Grammar G 2 Figure 2.4
27
Copyright © 2006 The McGraw-Hill Companies, Inc. The Dangling Else IfStatement → if ( Expression ) Statement | if ( Expression ) Statement else Statement where Statement → Assignment | IfStatement | Block Suppose one of the statements was another If?
28
Copyright © 2006 The McGraw-Hill Companies, Inc. Example With which ‘if’ does the following ‘else’ associate? if (x < 0) if (y < 0) y = y - 1; else y = 0; Answer: either one!
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.