1
ISBN 0-321-33025-0 Chapter 3 Describing Syntax and Semantics
2
Copyright © 2006 Addison-Wesley. All rights reserved.
Chapter 3 Topics
– Introduction
– The General Problem of Describing Syntax
– Formal Methods of Describing Syntax
– Attribute Grammars
– Describing the Meanings of Programs: Dynamic Semantics
3
Introduction
Syntax: the form or structure of the expressions, statements, and program units
Semantics: the meaning of the expressions, statements, and program units
Syntax and semantics together provide a language's definition
Users of a language definition include:
– Other language designers
– Implementers (e.g., compiler writers)
– Programmers (the users of the language)
These users need agreement!
4
The General Problem of Describing Syntax: Terminology
A sentence is a string of characters over some alphabet
A language is a set of sentences
A lexeme is the lowest-level syntactic unit of a language (e.g., *, sum, begin)
A token is a category of lexemes (e.g., identifier)
5
The General Problem of Describing Syntax: Terminology Example
Example: index = 2 * count + 17;
Lexemes    Tokens
index      identifier
=          equal_sign
2          int_literal
*          mult_op
count      identifier
+          plus_op
17         int_literal
;          semi_colon
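The lexeme/token split above can be sketched as a small tokenizer. This is only an illustration: the token names come from the slide, but the regular expressions, the `skip` category, and the function name `tokenize` are assumptions made for this example.

```python
import re

# Token categories from the slide, each paired with a regular expression.
# The regexes are assumptions; the slide only names the token categories.
TOKEN_SPEC = [
    ("int_literal", r"\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("equal_sign", r"="),
    ("mult_op", r"\*"),
    ("plus_op", r"\+"),
    ("semi_colon", r";"),
    ("skip", r"\s+"),  # whitespace separates lexemes and is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKEN_SPEC))


def tokenize(text):
    """Return (lexeme, token) pairs; raise ValueError on an unknown character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "skip":
            pairs.append((match.group(), match.lastgroup))
        pos = match.end()
    return pairs


for lexeme, token in tokenize("index = 2 * count + 17;"):
    print(f"{lexeme:<8}{token}")
```

Each lexeme is the matched text and each token is the name of the category that matched, which mirrors the two columns of the table.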
6
Formal Definition of Languages
Recognizers
– A recognition system reads input strings and decides whether each one belongs to the language
– Suppose we have a language L that uses an alphabet Σ of characters
– Construct a mechanism R capable of reading strings over Σ and indicating whether a given string is or is not in L. If R accepts only the sentences of L, then R is a description of L.
– Example: the syntax analysis part of a compiler
– Detailed discussion in Chapter 4
7
Formal Definition of Languages
Generators
– A device that generates sentences of a language
– One can determine whether the syntax of a particular sentence is correct by comparing it to the structure of the generator
– Cuts down on the trial-and-error required by a recognizer
8
Context-Free Grammars
– Developed by noted linguist Noam Chomsky in the mid-1950s
– Language generators, meant to describe the syntax of natural languages
– Define a class of languages called context-free languages
– NOTE: the forms of the tokens of programming languages can be described by regular grammars
9
Copyright © 2006 Addison-Wesley. All rights reserved.9 Context-Free Grammar Example -> ->John ->Jill ->car ->hamburger ->a ->the -> ->eats ->drives ->slowly ->frequently Look at it intuitively here, we’ll get to formal definition in a minute….
10
Copyright © 2006 Addison-Wesley. All rights reserved.10 Context-Free Grammar Derivation Example -> *example from Sudkamp, Languages and Machines: An Introduction to the Theory of Computer Science
11
Formal Methods of Describing Syntax
Backus-Naur Form (BNF) and Context-Free Grammars
– Most widely known method for describing programming language syntax
Extended BNF
– Improves readability and writability of BNF
12
Backus-Naur Form (BNF)
Backus-Naur Form (1959)
– Invented by John Backus to describe Algol 58
– Updated by Peter Naur for Algol 60
– BNF is equivalent to context-free grammars
– BNF is a metalanguage used to describe another language
– In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols)
13
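An Example Grammar
  <program> -> begin <stmt_list> end
  <stmt_list> -> <stmt> | <stmt> ; <stmt_list>
  <stmt> -> <var> = <expr>
  <var> -> a | b | c | d
  <expr> -> <term> + <term> | <term> - <term> | <term>
  <term> -> <var> | const
Symbols in angle brackets are non-terminals; all other symbols are terminals.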
14
BNF Fundamentals
Terminals: lexemes and tokens
Non-terminals: BNF abstractions, constructed by grouping smaller constructs. Each nonterminal has a rule that shows how it is constructed.
Grammar: a collection of rules
– Examples of BNF rules:
  <ident_list> → identifier | identifier, <ident_list>
  <if_stmt> → if <logic_expr> then <stmt>
15
BNF Rules
A rule has a symbol being defined (the left-hand side (LHS)), an arrow, and a right-hand side (RHS)
A grammar is a finite nonempty set of rules
A nonterminal symbol can have more than one RHS
  <stmt> -> <single_stmt> | begin <stmt_list> end
16
Describing Lists
Syntactic lists are described using recursion
  <ident_list> -> ident | ident, <ident_list>
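A recursive rule maps directly onto a recursive function. The sketch below is a minimal recognizer for the list rule `<ident_list> -> ident | ident, <ident_list>`. It assumes the input has already been reduced to a list of token strings, and the function name `is_ident_list` is hypothetical.

```python
def is_ident_list(tokens):
    """Recognize <ident_list> -> ident | ident , <ident_list> over a token list."""
    if not tokens or tokens[0] != "ident":
        return False
    if len(tokens) == 1:
        return True  # first alternative: a single ident
    # Second alternative: ident , <ident_list>. The recursive call mirrors
    # the recursive use of <ident_list> on the right-hand side of the rule.
    return tokens[1] == "," and is_ident_list(tokens[2:])


print(is_ident_list(["ident", ",", "ident", ",", "ident"]))
print(is_ident_list(["ident", ","]))
```

The first call describes a well-formed list and the second a list with a trailing comma, which the rule does not generate.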
17
Derivation
A derivation is a repeated application of rules, starting with the start symbol and ending with a sentence (all terminal symbols)
  <program> => begin <stmt_list> end
            => begin <stmt> end
            => begin <var> = <expr> end
            => begin a = <expr> end
            => begin a = <term> + <term> end
            => begin a = <var> + <term> end
            => begin a = b + <term> end
            => begin a = b + const end
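A derivation can be replayed mechanically. The sketch below stores a grammar as a dictionary and expands the leftmost nonterminal at each step. The grammar is an assumed reconstruction of the example grammar (the nonterminal names are not guaranteed to match the original slide), and `leftmost_derivation` and its `choices` argument are hypothetical names introduced here.

```python
# Assumed reconstruction of the example grammar: nonterminal -> list of RHSs.
GRAMMAR = {
    "<program>": [["begin", "<stmt_list>", "end"]],
    "<stmt_list>": [["<stmt>"], ["<stmt>", ";", "<stmt_list>"]],
    "<stmt>": [["<var>", "=", "<expr>"]],
    "<var>": [["a"], ["b"], ["c"], ["d"]],
    "<expr>": [["<term>", "+", "<term>"], ["<term>", "-", "<term>"], ["<term>"]],
    "<term>": [["<var>"], ["const"]],
}


def leftmost_derivation(start, choices):
    """Apply one rule per step, always to the leftmost nonterminal.

    choices[i] is the index of the RHS alternative to use at step i.
    Returns every sentential form, beginning with the start symbol.
    """
    form = [start]
    steps = [" ".join(form)]
    for choice in choices:
        # Position of the leftmost symbol that is still a nonterminal.
        i = next(k for k, sym in enumerate(form) if sym in GRAMMAR)
        form[i:i + 1] = GRAMMAR[form[i]][choice]
        steps.append(" ".join(form))
    return steps


for step in leftmost_derivation("<program>", [0, 0, 0, 0, 0, 0, 1, 1]):
    print("=>", step)
```

Every printed line is a sentential form; the last one contains only terminals and is therefore a sentence.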
18
Derivation
Every string of symbols in the derivation is a sentential form
A sentence is a sentential form that has only terminal symbols
A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded; a rightmost derivation expands the rightmost nonterminal instead
A derivation may be neither leftmost nor rightmost
Most languages are infinite
19
Parse Tree
A hierarchical representation of a derivation
[Figure: parse tree for begin a = b + const end]
20
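Correspondence to Code
Candidate programs:
  begin a = b + const end
  begin c = a + b; d = b - const end
  begin c = a * b; d = b + const; end
Grammar:
  <program> -> begin <stmt_list> end
  <stmt_list> -> <stmt> | <stmt> ; <stmt_list>
  <stmt> -> <var> = <expr>
  <var> -> a | b | c | d
  <expr> -> <term> + <term> | <term> - <term> | <term>
  <term> -> <var> | const
Which ones match the grammar?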
21
BNF Exercise
Add rules to the example expression grammar to create a BNF for the following constructs:
– while loop
– for loop
Use the BNF to create a derivation and parse tree for:
  begin
    a = const;
    while (a > const)
      b = b + a;
      a = a - const;
  end
– NOTE: depending on how you defined your while, you may need to add { } or begin/end to this code, remove ; etc.
22
Review: where are we?
Context-Free Grammars
– describe a language by specifying how a sentence can be derived from a set of rules or productions
– consist of rules built from terminals and non-terminals, plus a designated start symbol
BNF
– formal notation for representing context-free grammars
Why do we care?
– we can use context-free grammars to describe many (but not all!) constructs of programming languages
23
Ambiguity in Grammars
A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees
A compiler chooses the code to generate based on the parse tree; if the grammar is ambiguous, the compiler cannot determine the intended structure uniquely
Characteristics that show a grammar is ambiguous:
– it generates a sentence with more than one leftmost derivation
– it generates a sentence with more than one rightmost derivation
What constructs do you think might lead to ambiguity?
24
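Two Grammars for Assignment
First grammar:
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
Second grammar:
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
The second grammar allows growth on both left and right.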
25
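An Unambiguous Grammar
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>
[Figure: the single parse tree for A * B + C under this grammar; * is at the top and + is below it, so the expression is grouped as A * (B + C)]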
26
An Ambiguous Expression Grammar
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>
[Figure: two distinct parse trees for A * B + C, one with + at the root and one with * at the root]
NOTE: these are two leftmost derivations?
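One way to see the ambiguity concretely is to enumerate every parse tree the `<expr>` rule admits for a given token sequence. The sketch below does this by brute force over all split points. It assumes the rule `<expr> -> <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>` with `<id> -> A | B | C`; the function name `parse_trees` and the nested-tuple tree encoding are choices made for this illustration only.

```python
from functools import lru_cache


def parse_trees(tokens):
    """Return all parse trees for tokens under the ambiguous <expr> rule.

    A tree is an operand name, ("()", subtree), or (operator, left, right).
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def trees(i, j):
        """All trees that derive exactly tokens[i:j]."""
        found = []
        if j - i == 1 and tokens[i] in ("A", "B", "C"):
            found.append(tokens[i])                       # <expr> -> <id>
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            found += [("()", t) for t in trees(i + 1, j - 1)]
        for k in range(i + 1, j - 1):                     # try each operator position
            if tokens[k] in ("+", "*"):
                found += [(tokens[k], left, right)
                          for left in trees(i, k)
                          for right in trees(k + 1, j)]
        return tuple(found)

    return list(trees(0, len(tokens)))


for tree in parse_trees("A * B + C".split()):
    print(tree)
```

More than one tree for the same sentence is exactly the condition in the definition of an ambiguous grammar.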
27
Another issue: Operator Precedence
x + y * z
– Must ensure that operators with higher precedence are performed first
– Higher-precedence operations should be lower in the parse tree
– The unambiguous grammar does not yield correct precedence: the rightmost operator will always be lowest in the parse tree
– Must use separate nonterminals for the operands of operators with different precedence
28
Grammar with Precedence
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <expr> + <term> | <term>
  <term> -> <term> * <factor> | <factor>
  <factor> -> ( <expr> ) | <id>
Notice that the lower-precedence operator is listed in the earlier rule.
29
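Operator Precedence Derivation
A * B + C
  <expr> => <expr> + <term>
         => <term> + <term>
         => <term> * <factor> + <term>
         => <factor> * <factor> + <term>
         => <id> * <factor> + <term>
         => A * <factor> + <term>
         => A * <id> + <term>
         => A * B + <term>
         => A * B + <factor>
         => A * B + <id>
         => A * B + C
Grammar:
  <assign> -> <id> = <expr>
  <id> -> A | B | C
  <expr> -> <expr> + <term> | <term>
  <term> -> <term> * <factor> | <factor>
  <factor> -> ( <expr> ) | <id>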
30
Operator Precedence Parse Tree
A * B + C
[Figure: parse tree for A * B + C under the precedence grammar; + is at the root and * is lower in the tree, below the left operand of +]
31
Associativity of Operators
When two operators have the same precedence, associativity determines the order of evaluation:
– A / B * C could be (A / B) * C or A / (B * C)
Addition is typically associative, meaning that the left-associative and right-associative orders give the same value. This is not necessarily true for floating-point addition.
Subtraction and division are not associative.
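The effect of grouping is easy to check numerically. The snippet below is a small illustration with arbitrarily chosen operand values (they are not from the slides); it compares the left-grouped and right-grouped results for floating-point addition, integer subtraction, and mixed division and multiplication.

```python
# Grouping the same floating-point operands differently can change the result.
left_grouped = (0.1 + 0.2) + 0.3
right_grouped = 0.1 + (0.2 + 0.3)
print(left_grouped == right_grouped)

# Subtraction is not associative, even with exact integers.
sub_left, sub_right = (8 - 4) - 2, 8 - (4 - 2)
print(sub_left, sub_right)

# A / B * C depends on grouping as well.
div_left, div_right = (8 / 4) * 2, 8 / (4 * 2)
print(div_left, div_right)
```

Because the groupings disagree, a language definition has to state which one a chain of equal-precedence operators means.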
32
Associativity of Operators
When a grammar rule has its LHS also appearing at the beginning of its RHS, the rule is left recursive. Left recursion specifies left associativity.
Left recursion disallows the use of some important syntax analysis algorithms. To use those algorithms, the grammar must be modified to remove the left recursion. The compiler must then enforce left associativity itself, even though the modified grammar no longer dictates it.
Right associativity is specified by right recursion.
We'll do an exercise in Chapter 4 to remove left recursion
33
Associativity of Operators
Right operator associativity can also be indicated by a grammar
  <factor> -> <exp> ** <factor> | <exp>
  <exp> -> ( <expr> ) | <id>
The LHS <factor> appears at the right end of the RHS, so ** is right associative
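A right-recursive rule turns into a function that recurses on whatever follows the operator. The sketch below handles only chains of the form operand `**` operand `**` ..., assuming each `<exp>` is a single operand token (parenthesized expressions are left out to keep it short). The name `parse_factor` is hypothetical.

```python
def parse_factor(tokens):
    """Parse <factor> -> <exp> ** <factor> | <exp>, with <exp> a single operand."""
    if not tokens:
        raise SyntaxError("operand expected")
    head, rest = tokens[0], tokens[1:]
    if not rest:
        return head                      # second alternative: <exp>
    if rest[0] != "**":
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    # First alternative: the recursion is on the right, so the right-hand
    # operand becomes a complete subtree before it is combined with head.
    return ("**", head, parse_factor(rest[1:]))


print(parse_factor(["A", "**", "B", "**", "C"]))

# Python's own ** operator follows the same convention.
print(2 ** 3 ** 2 == 2 ** (3 ** 2))
```

The nested tuple groups B ** C first, which is the right-associative reading.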
34
Associativity of Operators
A + B + C showing associativity
[Figure: parse tree for A + B + C; the + joining A and B is lower in the tree than the + that adds C]
Notice that A is added to B, then the sum is added to C
Why? Because <expr> is leftmost in the RHS of the rule <expr> -> <expr> + <term>
35
Ambiguity Exercise
Consider the following If-Else Grammar
  <stmt> -> if <expr> then <stmt> | if <expr> then <stmt> else <stmt> | …
  (assume other types of stmts are possible)
Show that this grammar is ambiguous.
  Hint: try if Expr1 then if Expr2 then Stmt1 else Stmt2
Rewrite the grammar to remove the ambiguity.
  Hint: Add non-terminals to distinguish two cases (with and without else)
36
Extended BNF
Extensions to BNF enhance its readability, not its descriptive power
Optional parts are placed in brackets [ ]
  <proc_call> -> ident [(<expr_list>)]
Alternative parts of RHSs are placed inside parentheses and separated via vertical bars
  <term> → <term> (+|-) const
Repetitions (0 or more) are placed inside braces { }
  <ident> → letter {letter|digit}
37
BNF and EBNF Example
BNF
  <expr> -> <expr> + <term> | <expr> - <term> | <term>
  <term> -> <term> * <factor> | <term> / <factor> | <factor>
EBNF
  <expr> -> <term> {(+ | -) <term>}
  <term> -> <factor> {(* | /) <factor>}
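The EBNF braces correspond naturally to loops, which gives a parser with no left recursion. The sketch below follows `<expr> -> <term> {(+ | -) <term>}` and `<term> -> <factor> {(* | /) <factor>}`. Since `<factor>` is not defined on this slide, the sketch assumes it is a single operand token; the function names and the nested-tuple tree encoding are also assumptions.

```python
OPERATORS = ("+", "-", "*", "/")


def parse_expr(tokens):
    """Parse a token list and return a nested-tuple tree (operator, left, right)."""
    tree, rest = _expr(list(tokens))
    if rest:
        raise SyntaxError(f"unexpected token {rest[0]!r}")
    return tree


def _expr(toks):
    left, toks = _term(toks)
    while toks and toks[0] in ("+", "-"):     # the { (+ | -) <term> } repetition
        op = toks[0]
        right, toks = _term(toks[1:])
        left = (op, left, right)              # fold leftward: left associativity
    return left, toks


def _term(toks):
    left, toks = _factor(toks)
    while toks and toks[0] in ("*", "/"):     # the { (* | /) <factor> } repetition
        op = toks[0]
        right, toks = _factor(toks[1:])
        left = (op, left, right)
    return left, toks


def _factor(toks):
    # Assumption: a factor is one operand token; parentheses are not handled.
    if not toks or toks[0] in OPERATORS:
        raise SyntaxError("operand expected")
    return toks[0], toks[1:]


print(parse_expr("A * B + C".split()))
print(parse_expr("A - B - C".split()))
```

Because `_term` is called from inside `_expr`, multiplication and division end up lower in the tree than addition and subtraction, and folding each loop result into `left` keeps operators of equal precedence left associative.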
38
Attribute Grammars
Context-free grammars (CFGs) and BNF cannot easily describe all of the structure of programming languages
– type compatibility (e.g., float = int OK, int = float not OK) would require too many rules
– the requirement that all variables be declared before they are referenced can't be specified in BNF
Static semantics
– rules about the legal form of programs (closer to syntax than to runtime semantics)
– called static because they are checked at compile time
39
Attribute Grammars
Knuth designed attribute grammars to describe both syntax and static semantics
The basic concepts are used at least informally in every compiler*
CFGs plus:
– attributes (values assigned to grammar symbols)
– attribute computation functions (how to compute attribute values)
– predicate functions (static semantic rules)
* Ad hoc methods are more often used in compiler implementation, but attribute grammars are used in other applications, such as natural-language processing
40
Attribute Grammars: Definition
An attribute grammar is a context-free grammar G = (V_t, V_n, P, S) with the following additions:
– For each grammar symbol x there is a set A(x) of attribute values
– A(x) consists of two disjoint sets, synthesized attributes S(x) and inherited attributes I(x)
– Each grammar rule has a set of semantic functions that define certain attributes of the nonterminals in the rule
– Each rule has a (possibly empty) set of predicates to check for attribute consistency
41
Attribute Grammars: Attributes
Synthesized attributes: pass semantic information up a parse tree (e.g., calculated at a child node and passed to the parent)
Inherited attributes: pass semantic information down and across a parse tree (e.g., passed from parent to child)
Intrinsic attributes: synthesized attributes of leaf nodes whose values are determined outside the parse tree (e.g., type info stored in a symbol table)
42
Attribute Grammars: An Example
Idea: define an attribute grammar that can enforce the type rules of an assignment statement
Syntax
  <assign> -> <var> = <expr>
  <expr> -> <var> + <var> | <var>
  <var> -> A | B | C
Attributes
– actual_type: synthesized for <var> and <expr> (from child)
– expected_type: inherited for <expr> (from parent; think about the declaration of the lhs variable)
Type Requirements
– Variables can be either int or real.
– Mixed expression type is always real.
– If operands are the same, result type matches types of operands.
43
Attribute Grammar (Example cont)
1. Syntax rule: <assign> -> <var> = <expr>
   Semantic rule: <expr>.expected_type ← <var>.actual_type
2. Syntax rule: <expr> -> <var>[2] + <var>[3]
   Semantic rule: <expr>.actual_type ← if (<var>[2].actual_type = int) and (<var>[3].actual_type = int) then int else real
   Predicate: <expr>.expected_type == <expr>.actual_type
44
Attribute Grammar (Example cont)
3. Syntax rule: <expr> -> <var>
   Semantic rule: <expr>.actual_type ← <var>.actual_type
   Predicate: <expr>.expected_type == <expr>.actual_type
4. Syntax rule: <var> -> A | B | C
   Semantic rule: <var>.actual_type ← look-up(<var>.string) (intrinsic)
45
Example Parse Tree
A = A + B
[Figure: parse tree for A = A + B; the root <assign> has children <var>, =, and <expr>, and <expr> has children <var>[2], +, and <var>[3], with leaves A, A, and B]
46
Attribute Grammars (continued)
How are attribute values computed?
– If all attributes were inherited, the tree could be decorated in top-down order.
– If all attributes were synthesized, the tree could be decorated in bottom-up order.
– In many cases, both kinds of attributes are used, and some combination of top-down and bottom-up order must be used.
47
Decorated Parse Tree
A = A + B
Order:
1. Rule 4 (lookup)
2. Rule 1 (var = expr)
3. Rule 4 (lookup)
4. Rule 2 (expr = var + var)
5. Rule 2 (predicate)
[Figure: parse tree for A = A + B decorated with attributes; each <var> gets its actual_type by lookup, <expr> gets its expected_type from the left-hand <var> and its actual_type from its operands, and the predicate checks expected_type = actual_type]
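The evaluation order above can be sketched as a few plain functions. This is an illustration under stated assumptions, not how any particular compiler does it: the symbol table contents (which variables are int or real) are invented for the example, and `lookup`, `expr_actual_type`, and `check_assign` are hypothetical names standing in for the semantic functions and predicate of rules 1 through 4.

```python
# Hypothetical symbol table standing in for look-up(); the declared types
# are assumptions made for this example.
SYMBOLS = {"A": "real", "B": "int", "C": "int"}


def lookup(name):
    """Rule 4: the intrinsic actual_type of a variable."""
    return SYMBOLS[name]


def expr_actual_type(operands):
    """Rules 2 and 3: the synthesized actual_type of <expr>."""
    types = [lookup(name) for name in operands]
    if len(types) == 1:
        return types[0]                                    # rule 3: <expr> -> <var>
    return "int" if types == ["int", "int"] else "real"    # rule 2: <var> + <var>


def check_assign(target, operands):
    """Decorate target = operands and evaluate the predicate."""
    expected_type = lookup(target)              # rule 1: inherited from the lhs <var>
    actual_type = expr_actual_type(operands)    # synthesized from the operands
    return expected_type == actual_type         # predicate


print(check_assign("A", ["A", "B"]))   # A = A + B
print(check_assign("B", ["A", "B"]))   # B = A + B
```

With the assumed declarations, the first assignment has a real target and a mixed (therefore real) expression, while the second has an int target and the same real expression.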
48
Evaluation
Checking static semantics is an essential part of every compiler
The large number of attributes and rules makes attribute grammars difficult to read and write
Attribute values on a large parse tree are costly to evaluate
This is one area where the formalism is good to understand, but ad hoc methods (e.g., a symbol table) have prevailed
49
EBNF Exercise
Do the EBNF COBOL Exercise
50
Summary
BNF and context-free grammars are equivalent meta-languages
– Well-suited for describing the syntax of programming languages
An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language