
1 ISBN 0-321-33025-0 Chapter 3 Describing Syntax and Semantics

2 Chapter 3 Topics –Introduction –The General Problem of Describing Syntax –Formal Methods of Describing Syntax –Attribute Grammars –Describing the Meanings of Programs: Dynamic Semantics

3 Introduction Syntax: the form or structure of the expressions, statements, and program units Semantics: the meaning of the expressions, statements, and program units Syntax and semantics provide a language’s definition –Users of a language definition: other language designers, implementers, and programmers (the users of the language)

4 An Example of Syntax and Semantics The syntax of a Java while statement: while (<boolean_expr>) <statement> The semantics of the above statement: when the current value of the Boolean expression is true, the embedded statement is executed; otherwise, control continues after the while construct. Then control implicitly returns to the Boolean expression to repeat the process.

5 The General Problem of Describing Syntax: Terminology A sentence is a string of characters over some alphabet A language is a set of sentences A lexeme is the lowest level syntactic unit of a language (e.g., *, sum, begin) A token is a category of lexemes (e.g., identifier) –a token may have only a single lexeme –a lexeme of a token can also be called an instance of the token

6 Lexemes and Tokens [csusb], [Lee] A lexeme is a string of characters in a language that is treated as a single unit in the syntax and semantics. –For example, identifiers, numbers, and operators are often lexemes In a programming language there are a very large number of lexemes, perhaps even an infinite number; however, there are only a small number of tokens.

7 Examples of Lexemes and Tokens [Lee] The statement while (y >= t) y = y - 3 ; will be represented by a set of (lexeme, token) pairs, in which each lexeme is paired with the token it belongs to; for example, y and t are lexemes of the token identifier, while a keyword such as while is a token with only one lexeme.

8 Ways to Define a Language - Recognition Recognition –Construct a recognition device that reads strings of characters from the alphabet of the language and decides whether the input strings belong to the language –Example: the syntax analysis part of a compiler is a recognizer for the language the compiler translates –Recognizers are not used to enumerate all of the sentences of a language –Detailed discussion in Chapter 4

9 Ways to Define a Language - Generation Generation –Create a device that generates sentences of a language –One can determine if the syntax of a particular sentence is correct by comparing it to the structure of the generator –People learn a language from examples of its sentences

10 Grammars [wiki] In computer science and linguistics, a formal grammar, or sometimes simply grammar, is a precise description of a formal language, that is, of a set of strings. Commonly used to describe the syntax of programming languages

11 Formal Grammars [wiki] A formal grammar, or sometimes simply grammar, consists of: –a finite set of terminal symbols –a finite set of nonterminal symbols –a finite set of production rules, each with a left-hand side and a right-hand side consisting of sequences of these symbols –a start symbol

12 Grammar Example [wiki] Nonterminals are usually represented by uppercase letters, terminals by lowercase letters, and the start symbol by S. For example, the grammar with terminals {a, b}, nonterminals {S, A, B}, production rules –S → ABS –S → ε (where ε is the empty string) –BA → AB –BS → b –Bb → bb –Ab → ab –Aa → aa and start symbol S defines the language of all words of the form a^n b^n (i.e., n copies of a followed by n copies of b).
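
A minimal sketch (not from the slides) of how such a grammar generates sentences: the productions are treated as string-rewriting rules, and a breadth-first search over sentential forms finds one derivation of aabb from the start symbol S.

```python
# Treat the example grammar as a string-rewriting system and search for a
# derivation of "aabb"; the search strategy is an illustration, not part of
# the grammar itself.
from collections import deque

RULES = [                       # (left-hand side, right-hand side)
    ("S", "ABS"), ("S", ""),    # S -> ABS | epsilon
    ("BA", "AB"), ("BS", "b"),
    ("Bb", "bb"), ("Ab", "ab"), ("Aa", "aa"),
]

def derive(target, start="S", max_len=8):
    """Breadth-first search over sentential forms; returns one derivation."""
    queue = deque([(start, [start])])
    seen = {start}
    while queue:
        form, path = queue.popleft()
        if form == target:
            return path
        for lhs, rhs in RULES:
            i = form.find(lhs)
            while i != -1:                      # apply the rule at each position
                new = form[:i] + rhs + form[i + len(lhs):]
                if new not in seen and len(new) <= max_len:
                    seen.add(new)
                    queue.append((new, path + [new]))
                i = form.find(lhs, i + 1)
    return None

print(" => ".join(derive("aabb")))   # e.g. S => ABS => ABABS => ... => aabb
```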

13 Formal Grammars and Formal Languages [wiki] A formal grammar defines (or generates) a formal language, which is a (possibly infinite) set of sequences of symbols that may be constructed by applying production rules to a sequence of symbols which initially contains just the start symbol.

14 Grammar Classes [wiki] In the mid-1950s, Chomsky described four classes of generative devices (grammars), distinguished by the format of their production rules, that define four classes of languages: –recursively enumerable –context-sensitive –context-free –regular

15 Application of Context-free and Regular Grammars The tokens of programming languages can be described by regular grammars. Whole programming languages, with minor exceptions, can be described by context-free grammars.

16 Review of Context-Free Grammars Context-Free Grammars –Developed by Noam Chomsky in the mid-1950s –Language generators, meant to describe the syntax of natural languages –Define a class of languages called context-free languages

17 Terminology - metalanguage A metalanguage is a language that is used to describe another language.

18 Backus-Naur Form (BNF) Backus-Naur Form (1959) –Invented by John Backus to describe Algol 58 –Most widely known method for describing programming language syntax –BNF is equivalent to context-free grammars –BNF is a metalanguage used to describe another language

19 BNF Abstraction In BNF, abstractions are used to represent classes of syntactic structures; they act like syntactic variables (also called nonterminal symbols)

20 Abstraction Example A simple Java assignment statement might be represented by the abstraction <assign>.

21 Extended BNF –Improves readability and writability of BNF

22 BNF Fundamentals Non-terminals: BNF abstractions Terminals: lexemes and tokens Grammar: a collection of rules –Examples of BNF rules:
<ident_list> → identifier | identifier, <ident_list>
<if_stmt> → if <logic_expr> then <stmt>

23 BNF Rules A rule, also called a production, has a left-hand side (LHS) and a right-hand side (RHS), and consists of terminal and nonterminal symbols A grammar is a finite nonempty set of rules An abstraction (or nonterminal symbol) can have more than one RHS:
<stmt> → <single_stmt> | begin <stmt_list> end

24 A Rule Example
<assign> → <var> = <expression>
The above rule specifies that the abstraction <assign> is defined as an instance of the abstraction <var>, followed by the lexeme =, followed by an instance of the abstraction <expression>. A sentence example: total = subtotal1 + subtotal2

25 Recursive Rule A rule is recursive if its LHS appears in its RHS.

26 Describing Lists Syntactic lists are described using recursion:
<ident_list> → ident | ident, <ident_list>
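
A minimal sketch, not from the slides, of how the recursive rule above can be recognized directly by a recursive routine; the token representation and helper name are assumed for illustration.

```python
# Recognizer for  <ident_list> -> ident | ident , <ident_list>
def parse_ident_list(tokens, pos=0):
    """Return the position just past an <ident_list>, or raise SyntaxError."""
    if pos >= len(tokens) or tokens[pos][0] != "ident":
        raise SyntaxError("identifier expected")
    pos += 1                                        # consumed one ident
    if pos < len(tokens) and tokens[pos] == (",", ","):
        return parse_ident_list(tokens, pos + 1)    # recursion handles the rest
    return pos

tokens = [("ident", "A"), (",", ","), ("ident", "B"), (",", ","), ("ident", "C")]
assert parse_ident_list(tokens) == len(tokens)
```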

27 Derivation A derivation is a repeated application of rules, starting with a special nonterminal of the grammar called the start symbol and ending with a sentence (all terminal symbols)

28 Example 3.1 - An Example Grammar
<program> → begin <stmt_list> end
<stmt_list> → <stmt> | <stmt> ; <stmt_list>
<stmt> → <var> = <expression>
<var> → A | B | C
<expression> → <var> + <var> | <var> - <var> | <var>
P.S.: <program> is the start symbol.

29 An Example Derivation
<program> => begin <stmt_list> end
=> begin <stmt> ; <stmt_list> end
=> begin <var> = <expression> ; <stmt_list> end
=> begin A = <expression> ; <stmt_list> end
=> begin A = <var> + <var> ; <stmt_list> end
=> begin A = B + <var> ; <stmt_list> end
=> begin A = B + C ; <stmt_list> end
=> begin A = B + C ; <stmt> end
=> begin A = B + C ; <var> = <expression> end
=> begin A = B + C ; B = <expression> end
=> begin A = B + C ; B = <var> end
=> begin A = B + C ; B = C end

30 Derivation - Terminology Every string of symbols in a derivation is a sentential form A sentence is a sentential form that has only terminal symbols A leftmost derivation is one in which the leftmost nonterminal in each sentential form is the one that is expanded A rightmost derivation is one in which the rightmost nonterminal in each sentential form is the one that is expanded A derivation may be neither leftmost nor rightmost

31 Example 3.2 - A Grammar for Simple Assignment Statements
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <id> + <expr> | <id> * <expr> | ( <expr> ) | <id>

32 The Leftmost Derivation for A = B * (A + C)
<assign> => <id> = <expr>
=> A = <expr>
=> A = <id> * <expr>
=> A = B * <expr>
=> A = B * ( <expr> )
=> A = B * ( <id> + <expr> )
=> A = B * ( A + <expr> )
=> A = B * ( A + <id> )
=> A = B * ( A + C )

33 Hierarchical Syntactic Structure One of the most attractive features of grammars is that they naturally describe the hierarchical syntactic structure of the sentences of the languages they define. These hierarchical structures are called parse trees. Because a grammar describes a syntactic structure hierarchically, part of the meaning of the structure can follow from its parse tree.

34 Figure 3.1 - a Parse Tree for A = B * (A + C) A hierarchical representation of a derivation

35 Mapping between a Grammar and Its Corresponding Parse Tree Every internal node of a parse tree is labeled with a nonterminal symbol; every leaf is labeled with a terminal symbol. Every subtree of a parse tree describes one instance of an abstraction in the statement.

36 Ambiguity in Grammars A grammar is ambiguous if and only if it generates a sentential form that has two or more distinct parse trees.

37 Example 3.3 - An Ambiguous Grammar for Simple Assignment Statements
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <expr> | <expr> * <expr> | ( <expr> ) | <id>

38 Figure 3.2 - Two Distinct Parse Trees for the Same Sentence A = B + C * A

39 Causes of Ambiguity Rather than allowing the parse tree of an expression to grow only on the right, this grammar allows growth on both the left and the right.

40 Drawbacks of Ambiguous Grammars Syntactic ambiguity of language structures is a problem because compilers often base the semantics of those structures on their syntactic form. In particular, the compiler decides what code to generate for a statement by examining its parse tree. If a language structure has more than one parse tree, then the meaning of the structure cannot be determined uniquely.

41 Operator Precedence If an operator in an arithmetic expression is generated lower in the parse tree (and therefore must be evaluated first), then this fact can be used to indicate that it has precedence over an operator produced higher up in the tree.

42 Conflicting Operator Precedence due to an Ambiguous Grammar In the first parse tree of Figure 3.2, for example, the multiplication operator is generated lower in the tree, which could indicate that it has precedence over the addition operator in the expression. The second parse tree, however, indicates just the opposite.

43 Unusual Operator Precedence Created by the Grammar in Example 3.2 In the grammar of Example 3.2, a parse tree of a sentence with multiple operators, regardless of the particular operators involved, has the rightmost operator in the expression at the lowest point in the parse tree, with the other operators in the tree moving progressively higher as one moves to the left in the expression. –For example, in the parse tree in Figure 3.1, the plus operator is the rightmost operator in the expression and the lowest in the tree, giving it precedence over the multiplication operator to its left.

44 Example 3.4 - An Unambiguous Grammar for Assignment Statements
<assign> → <id> = <expr>
<id> → A | B | C
<expr> → <expr> + <term> | <term>
<term> → <term> * <factor> | <factor>
<factor> → ( <expr> ) | <id>
The above grammar generates the same language as the grammars of Examples 3.2 and 3.3.
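
A minimal sketch (function names assumed, not from the slides) of a recursive-descent parser that follows the grammar of Example 3.4. The left-recursive <expr> and <term> rules are rewritten as loops, since recursive descent cannot call itself on a left-recursive rule directly, but the trees it builds still give * precedence over + and make both operators left associative.

```python
def parse_expr(toks, pos):                 # <expr> -> <term> { + <term> }
    value, pos = parse_term(toks, pos)
    while pos < len(toks) and toks[pos] == "+":
        rhs, pos = parse_term(toks, pos + 1)
        value = ("+", value, rhs)          # left operand built first => left assoc
    return value, pos

def parse_term(toks, pos):                 # <term> -> <factor> { * <factor> }
    value, pos = parse_factor(toks, pos)
    while pos < len(toks) and toks[pos] == "*":
        rhs, pos = parse_factor(toks, pos + 1)
        value = ("*", value, rhs)
    return value, pos

def parse_factor(toks, pos):               # <factor> -> ( <expr> ) | <id>
    if toks[pos] == "(":
        value, pos = parse_expr(toks, pos + 1)
        return value, pos + 1              # skip ')'
    return toks[pos], pos + 1

tree, _ = parse_expr(list("B+C*A"), 0)
print(tree)                                # ('+', 'B', ('*', 'C', 'A'))
```

The printed tree has * lower (nested deeper) than +, matching the unique parse tree the unambiguous grammar assigns to B + C * A.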

45 Usual Operator Precedence Created by the Grammar in Example 3.4 The grammar of Example 3.4 indicates the usual precedence order of multiplication and addition operators.

46 Derivations vs. Parse Trees Every derivation with an unambiguous grammar has a unique parse tree, although that tree can be represented by different derivations. For example, both the leftmost and the rightmost derivation of A = B + C * A below (using the grammar of Example 3.4) correspond to the parse tree of Figure 3.3.
Leftmost derivation:
<assign> => <id> = <expr> => A = <expr> => A = <expr> + <term> => A = <term> + <term> => A = <factor> + <term> => A = <id> + <term> => A = B + <term> => A = B + <term> * <factor> => A = B + <factor> * <factor> => A = B + <id> * <factor> => A = B + C * <factor> => A = B + C * <id> => A = B + C * A
Rightmost derivation:
<assign> => <id> = <expr> => <id> = <expr> + <term> => <id> = <expr> + <term> * <factor> => <id> = <expr> + <term> * <id> => <id> = <expr> + <term> * A => <id> = <expr> + <factor> * A => <id> = <expr> + <id> * A => <id> = <expr> + C * A => <id> = <term> + C * A => <id> = <factor> + C * A => <id> = <id> + C * A => <id> = B + C * A => A = B + C * A

47 Figure 3.3 - the Unique Parse Tree for A = B + C * A Using an Unambiguous Grammar

48 Question? Do the parse trees for expressions with two or more adjacent occurrences of operators with equal precedence have those occurrences in proper hierarchical order? An example of an assignment statement with such an expression is A = B + C + A

49 Associativity of Operators Operator associativity can also be indicated by a grammar

50 Figure 3.4 - Parse Tree for A = B + C + A

51 Operator Associativity vs. Parse Trees The parse tree of Figure 3.4 shows the left addition operator lower than the right addition operator. This is the correct order if addition is meant to be left associative, which is typical.

52 Left Recursion When a grammar rule has its LHS also appearing at the beginning of its RHS, the rule is said to be left recursive. This left recursion specifies left associativity. For example, the left recursion of the rules of the grammar of Example 3.4 causes it to make both addition and multiplication left associative.

53 Right Recursion In most languages that provide it, the exponentiation operator is right associative. To indicate right associativity, right recursion can be used: a grammar rule is right recursive if the LHS appears at the right end of the RHS. Rules such as
<factor> → <exp> ** <factor> | <exp>
<exp> → ( <expr> ) | id
could be used to describe exponentiation as a right-associative operator.
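
A minimal sketch (not from the slides) showing how the right recursion in <factor> translates into a parser that groups exponentiation to the right; token handling is simplified to bare identifiers and literals.

```python
def parse_factor(toks, pos):          # <factor> -> <exp> ** <factor> | <exp>
    base, pos = parse_exp(toks, pos)
    if pos < len(toks) and toks[pos] == "**":
        rhs, pos = parse_factor(toks, pos + 1)   # recurse on the RIGHT operand
        return ("**", base, rhs), pos
    return base, pos

def parse_exp(toks, pos):             # <exp> -> ( <expr> ) | id  (ids only here)
    return toks[pos], pos + 1

tree, _ = parse_factor(["2", "**", "3", "**", "2"], 0)
print(tree)                           # ('**', '2', ('**', '3', '2')), i.e. 2 ** (3 ** 2)
```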

54 Motivation for an Extended BNF Because of a few minor inconveniences in BNF, it has been extended in several ways. Most extended versions are called Extended BNF, or simply EBNF, even though they are not all exactly the same. The extensions do not enhance the descriptive power of BNF; they only increase its readability and writability.

55 Commonly Used Extensions in Various Extended BNF Optional parts of an RHS are placed in brackets [ ]:
<if_stmt> → if ( <expression> ) <statement> [else <statement>]
Repetitions (0 or more) are placed inside braces { }:
<ident_list> → <identifier> {, <identifier>}
Alternative parts of RHSs are placed inside parentheses and separated via vertical bars:
<term> → <term> (* | / | %) <factor>
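
A minimal sketch (not from the slides): the EBNF repetition {, <identifier>} maps naturally onto a loop in a hand-written parser, in contrast to the recursive routine sketched earlier for the plain BNF list rule.

```python
def parse_ident_list(toks, pos):
    names = [toks[pos]]                  # the first <identifier> is required
    pos += 1
    while pos < len(toks) and toks[pos] == ",":   # {, <identifier>}: zero or more
        names.append(toks[pos + 1])
        pos += 2
    return names, pos

print(parse_ident_list(["A", ",", "B", ",", "C"], 0))   # (['A', 'B', 'C'], 5)
```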

56 BNF and EBNF Versions of an Expression Grammar For example:
BNF: <expr> → <expr> + <term> | <expr> - <term> | <term>
     <term> → <term> * <factor> | <term> / <factor> | <factor>
EBNF: <expr> → <term> {(+ | -) <term>}
      <term> → <factor> {(* | /) <factor>}

57 Metasymbols Brackets, braces, and parentheses in the EBNF extensions are metasymbols, which means they are notational tools and not terminal symbols in the syntactic entities they help describe. In cases where these metasymbols are also terminal symbols in the language being described, the instances that are terminal symbols can be underlined or quoted.

58 More Extensions in Some EBNF Some versions of EBNF allow a numeric superscript to be attached to the right brace to indicate an upper limit on the number of times the enclosed part can be repeated. Also, some versions use a plus (+) superscript to indicate one or more repetitions. –For example, <compound> → begin <stmt> {<stmt>} end and <compound> → begin {<stmt>}+ end are equivalent.

59 Grammars and Recognizers There is a close relationship between generation and recognition devices for a given language. In fact, given a context-free grammar, a recognizer for the language generated by the grammar can be algorithmically constructed. A number of software systems allow the quick creation of the syntax analysis part of a compiler for a new language and are therefore quite valuable. –One of the first of these syntax analyzer generators is named yacc.

60 Attribute Grammars An attribute grammar is an extension to a context-free grammar. Additions to CFGs to carry some semantic info along parse trees Primary value of attribute grammars (AGs) –Static semantics specification –Compiler design (static semantics checking)

61 Limitations of BNF Context-free grammars (CFGs) cannot describe all of the syntax of programming languages There are some characteristics of the structure of programming languages that are difficult to describe with BNF, and some that are impossible.

62 Example - a Language Rule Difficult to Specify with BNF Consider type compatibility rules. In Java, for example, a floating-point value cannot be assigned to an integer type variable, although the opposite is legal. Although this restriction can be specified in BNF, it requires additional nonterminal symbols and rules. If all of the typing rules of Java were specified in BNF, the grammar would become too large to be useful, because the size of the grammar determines the size of the parser.

63 Example – a Language Rule That Cannot Be Specified in BNF Consider the common rule that all variables must be declared before they are referenced. It can be proven that this rule cannot be specified in BNF.

64 Static Semantics Rules The example in the previous slide exemplifies the category of language rules called static semantics rules. The static semantics of a language is only indirectly related to the meaning of programs during execution; rather, it has to do with the legal forms of programs (syntax rather than semantics). In many cases, the static semantic rules of a language state its type constraints. Static semantics is so named because the analysis required to check these specifications can be done at compile time.

65 Motivation for the Creation of Attribute Grammars Because of the problems of describing static semantics with BNF, a variety of more powerful mechanisms has been devised for that task. –One such mechanism, attribute grammars, was designed by Knuth (1968a) to describe both the syntax and the static semantics of programs.

66 Usage of Attribute Grammars Attribute grammars are a formal approach to both describing and checking the correctness of the static semantics rules of a program. Although they are not always used in a formal way in compiler design, the basic concepts of attribute grammars are at least informally used in every compiler.

67 Components of Attribute Grammars Attribute grammars are grammars to which the following have been added: –attributes –attribute computation functions –predicate functions

68 Attributes Attributes, which are associated with grammar symbols, are similar to variables in the sense that they can have values assigned to them.

69 Attribute Computation Functions Attribute computation functions, sometimes called semantic functions, are associated with grammar rules. They are used to specify how attribute values are computed.

70 Predicate Functions Predicate functions, which state some of the syntax and static semantic rules of the language, are associated with grammar rules.

71 Attribute Grammars: Definition An attribute grammar is a context-free grammar G = (S, N, T, P) with the following additions: –For each grammar symbol x there is a set of attributes A(x) –Associated with each grammar rule is a set of semantic functions and a possibly empty set of predicate functions over the attributes of the symbols in the grammar rule –Each rule has a (possibly empty) set of predicates to check for attribute consistency

72 Attributes of Grammar Symbols Associated with each grammar symbol X is a set of attributes A(X). The set A(X) consists of two disjoint sets S(X) and I(X), called synthesized and inherited attributes, respectively. –Synthesized attributes are used to pass semantic information up a parse tree –Inherited attributes pass semantic information down a tree.

73 Attribute Grammars: Definition Let X0 → X1 ... Xn be a rule. The synthesized attributes of X0 are computed with semantic functions of the form S(X0) = f(A(X1), ..., A(Xn)). –So the value of a synthesized attribute on a parse tree node depends only on the values of the attributes on that node's children nodes. Inherited attributes of the symbols Xj, 1 ≤ j ≤ n (in the rule above), are computed with a semantic function of the form I(Xj) = f(A(X0), ..., A(Xn)). –So the value of an inherited attribute on a parse tree node depends on the attribute values of that node's parent node and those of its sibling nodes. Initially, there are intrinsic attributes on the leaves.

74 Circularity To avoid circularity, inherited attributes are often restricted to functions of the form I(Xj) = f(A(X0), ..., A(Xj-1)) The above form prevents an inherited attribute from depending on itself or on attributes to the right in the parse tree.

75 Predicate Functions A predicate function has the form of a Boolean expression on the attribute set {A(X0), ..., A(Xn)} and a set of literal attribute values. The only derivations allowed with an attribute grammar are those in which every predicate associated with every nonterminal is true. –A false predicate function value indicates a violation of the syntax or static semantics rules of the language.

76 A Parse Tree of an Attribute Grammar A parse tree of an attribute grammar is the parse tree based on its underlying BNF grammar, with a possibly empty set of attribute values attached to each node. If all the attribute values in a parse tree have been computed, the tree is said to be fully attributed. –Although in practice it is not always done this way, it is convenient to think of attribute values as being computed after the complete unattributed parse tree has been constructed.

77 Intrinsic Attributes Intrinsic attributes are synthesized attributes of leaf nodes whose values are determined outside the parse tree. –For example, the type of an instance of a variable in a program could come from the symbol table, which is used to store variable names and their types. The contents of the symbol table are determined from earlier declaration statements.

78 From Intrinsic Attributes to All Attributes Initially, assuming that an unattributed parse tree has been constructed and that attribute values are desired, the ONLY attributes with values are the intrinsic attributes of the leaf nodes. Given the intrinsic attribute values on a parse tree, the semantic functions can be used to compute the remaining attribute values.

79 Subscript Notice that when there is more than one occurrence of a nonterminal in a syntax rule in an attribute grammar, the nonterminals are subscripted with brackets to distinguish them. Neither the subscripts nor the brackets are part of the described language.

80 An Attribute Grammar Example The following is a fragment of an attribute grammar that describes the rule that the name on the end of an Ada procedure must match the procedure's name. The string attribute of <proc_name>, denoted by <proc_name>.string, is the actual string of characters that were found by the lexical analyzer.
Syntax rule: <proc_def> → procedure <proc_name>[1] <proc_body> end <proc_name>[2];
Predicate: <proc_name>[1].string == <proc_name>[2].string
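
A minimal sketch (record and field names assumed, not from the slides) of how a compiler might check this predicate once the two <proc_name> strings have been attached to the parse-tree node for <proc_def>.

```python
from dataclasses import dataclass

@dataclass
class ProcDef:
    name_at_head: str      # string attribute of <proc_name>[1]
    name_at_end: str       # string attribute of <proc_name>[2]

def predicate_holds(node: ProcDef) -> bool:
    # <proc_name>[1].string == <proc_name>[2].string
    return node.name_at_head == node.name_at_end

print(predicate_holds(ProcDef("Sort", "Sort")))   # True  -> derivation allowed
print(predicate_holds(ProcDef("Sort", "Sorte")))  # False -> static semantics error
```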

81 Use an Attribute Grammar to Check the Type Rules of a Simple Assignment Statement – Assumptions (1) The syntax and semantics of this assignment statement are as follows: –The only variable names are A, B, and C –The right side of the assignment can be either a variable or an expression in the form of a variable added to another variable –The variables can be one of two types: int or real –When there are two variables on the right side of an assignment, they need not be the same type

82 Use an Attribute Grammar to Check the Type Rules of a Simple Assignment Statement – Assumptions (2) –The type of the expression when the operand types are not the same is always real –When they are the same, the expression type is that of the operands –The type of the left side of the assignment must match the type of the right side –So the types of operands in the right side can be mixed, but the assignment is valid only if the LHS and the value resulting from evaluating the RHS have the same type

83 Use an Attribute Grammar to Check the Type Rules of a Simple Assignment Statement - Syntax The syntax portion of our example attribute grammar is
<assign> → <var> = <expr>
<expr> → <var> + <var> | <var>
<var> → A | B | C

84 Use an Attribute Grammar to Check the Type Rules of a Simple Assignment Statement - Attributes The attributes for the nonterminals in the example attribute grammar are described as follows:
actual_type –A synthesized attribute associated with the nonterminals <var> and <expr> –It is used to store the actual type, int or real in the example, of a variable or expression –In the case of a variable, the actual type is intrinsic –In the case of an expression, it is determined from the actual types of the child node or children nodes of the nonterminal <expr>
expected_type –An inherited attribute associated with the nonterminal <expr> –It is used to store the type, either int or real, that is expected for the expression, as determined by the type of the variable on the left side of the assignment statement

85 Example 3.6 - An Attribute Grammar for Simple Assignment Statements
1. Syntax rule: <assign> → <var> = <expr>
   Semantic rule: <expr>.expected_type ← <var>.actual_type
2. Syntax rule: <expr> → <var>[2] + <var>[3]
   Semantic rule: <expr>.actual_type ← if (<var>[2].actual_type == int) and (<var>[3].actual_type == int) then int else real
   Predicate: <expr>.actual_type == <expr>.expected_type
3. Syntax rule: <expr> → <var>
   Semantic rule: <expr>.actual_type ← <var>.actual_type
   Predicate: <expr>.actual_type == <expr>.expected_type
4. Syntax rule: <var> → A | B | C
   Semantic rule: <var>.actual_type ← look-up(<var>.string)

86 Figure 3.6 - A Parse Tree for A = A + B Based on the Grammar of Example 3.6

87 The Process of Decorating the Parse Tree with Attributes This could proceed in a completely top-down order, from the root to the leaves, if all attributes were inherited. Alternatively, it could proceed in a completely bottom-up order, from the leaves to the root, if all the attributes were synthesized. Because our grammar has both synthesized and inherited attributes, the evaluation process cannot be in any single direction.

88 An Example of Attribute Evaluation The following is an evaluation of the attributes in Figure 3.6, in an order in which it is possible to compute them:
1. <var>.actual_type ← look-up(A) (Rule 4)
2. <expr>.expected_type ← <var>.actual_type (Rule 1)
3. <var>[2].actual_type ← look-up(A) (Rule 4); <var>[3].actual_type ← look-up(B) (Rule 4)
4. <expr>.actual_type ← either int or real (Rule 2)
5. <expr>.expected_type == <expr>.actual_type is either TRUE or FALSE (Rule 2)
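
A minimal sketch (not from the slides) that carries out the same evaluation for A = A + B in code; the symbol table standing in for look-up() is an assumption for illustration.

```python
SYMBOL_TABLE = {"A": "real", "B": "int"}          # source of the intrinsic attributes

def look_up(name):
    return SYMBOL_TABLE[name]

def check_assign(lhs_var, rhs_var2, rhs_var3):
    # Rule 4: <var>.actual_type comes from the symbol table (intrinsic)
    lhs_type = look_up(lhs_var)
    v2_type  = look_up(rhs_var2)
    v3_type  = look_up(rhs_var3)
    # Rule 1: <expr>.expected_type <- <var>.actual_type (inherited, passed down)
    expected = lhs_type
    # Rule 2: <expr>.actual_type is int only if both operands are int (synthesized)
    actual = "int" if v2_type == "int" and v3_type == "int" else "real"
    # Rule 2 predicate: the computed type must match the expected type
    return actual == expected

print(check_assign("A", "A", "B"))   # True:  real expected, real computed
print(check_assign("B", "A", "B"))   # False: int expected, real computed
```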

89 The Flow of Attributes in the Tree Solid lines are used for the parse tree. Dashed lines show attribute flow in the tree.

90 A Fully Attributed Parse Tree A is defined as a real and B is defined as an int.

91 Evaluation Order for the General Case of an Attribute Grammar Determining attribute evaluation order for the general case of an attribute grammar is a complex problem, requiring the construction of a dependency graph to show all attribute dependencies.

92 Pro and Con One of the main difficulties in using an attribute grammar to describe all of the syntax and static semantics of a real contemporary programming language is its SIZE and COMPLEXITY. The large number of attributes and semantic rules required for a complete programming language make such grammars difficult to write and read. Furthermore, the attribute values on a large parse tree are costly to evaluate. On the other hand, less formal attribute grammars are a powerful and commonly used tool for compiler writers, who are more interested in the process of producing a compiler than they are in formalism.

93 Semantics There is no single widely acceptable notation or formalism for describing semantics. People usually find out the semantics of a programming language by reading English explanations in language manuals.

94 Material in the following slides is not covered in this course !!!

95 Operational Semantics – (1) Describe the meaning of a program by executing its statements on a machine, either simulated or actual. The change in the state of the machine (memory, registers, etc.) defines the meaning of the statement

96 Operational Semantics – (2) To use operational semantics for a high-level language, a virtual machine is needed A hardware pure interpreter would be too expensive A software pure interpreter also has problems –The detailed characteristics of the particular computer would make actions difficult to understand –Such a semantic definition would be machine-dependent

97 Operational Semantics (continued) A better alternative: a complete computer simulation The process: –Build a translator (translates source code to the machine code of an idealized computer) –Build a simulator for the idealized computer Evaluation of operational semantics: –Good if used informally (language manuals, etc.) –Extremely complex if used formally (e.g., VDL, which was used to describe the semantics of PL/I)

98 Axiomatic Semantics Based on formal logic (predicate calculus) Original purpose: formal program verification Axioms or inference rules are defined for each statement type in the language (to allow transformations of expressions to other expressions) The expressions are called assertions

99 Axiomatic Semantics (continued) An assertion before a statement (a precondition) states the relationships and constraints among variables that are true at that point in execution An assertion following a statement is a postcondition A weakest precondition is the least restrictive precondition that will guarantee the postcondition

100 Axiomatic Semantics Form Pre-, post form: {P} statement {Q} An example –a = b + 1 {a > 1} –One possible precondition: {b > 10} –Weakest precondition: {b > 0}

101 Program Proof Process The postcondition for the entire program is the desired result –Work back through the program to the first statement. If the precondition on the first statement is the same as the program specification, the program is correct.

102 Axiomatic Semantics: Axioms An axiom for assignment statements (x = E): {Qx→E} x = E {Q}, where Qx→E denotes Q with all occurrences of x replaced by E The Rule of Consequence: if {P} S {Q}, P' => P, and Q => Q', then {P'} S {Q'}
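
As a worked check of the assignment axiom (not on the original slide): for the earlier example a = b + 1 with postcondition {a > 1}, substituting b + 1 for a in the postcondition gives the precondition {b + 1 > 1}, which simplifies to {b > 0}, the weakest precondition quoted on the Axiomatic Semantics Form slide.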

103 Axiomatic Semantics: Axioms An inference rule for sequences S1; S2: from {P1} S1 {P2} and {P2} S2 {P3}, infer {P1} S1; S2 {P3}

104 Axiomatic Semantics: Axioms An inference rule for logical pretest loops of the form {P} while B do S end {Q}: from {I and B} S {I}, infer {I} while B do S end {I and (not B)}, where I is the loop invariant (the inductive hypothesis)

105 Axiomatic Semantics: Axioms Characteristics of the loop invariant: I must meet the following conditions: –P => I -- the loop invariant must be true initially –{I} B {I} -- evaluation of the Boolean must not change the validity of I –{I and B} S {I} -- I is not changed by executing the body of the loop –(I and (not B)) => Q -- if I is true and B is false, Q is implied –The loop terminates

106 Loop Invariant The loop invariant I is a weakened version of the loop postcondition, and it is also a precondition. I must be weak enough to be satisfied prior to the beginning of the loop, but when combined with the loop exit condition, it must be strong enough to force the truth of the postcondition

107 Evaluation of Axiomatic Semantics Developing axioms or inference rules for all of the statements in a language is difficult It is a good tool for correctness proofs and an excellent framework for reasoning about programs, but its usefulness in describing the meaning of a programming language is limited for language users and compiler writers

108 Denotational Semantics Based on recursive function theory The most abstract semantics description method Originally developed by Scott and Strachey (1970)

109 Denotational Semantics (continued) The process of building a denotational specification for a language: –Define a mathematical object for each language entity –Define a function that maps instances of the language entities onto instances of the corresponding mathematical objects The meaning of language constructs is defined by only the values of the program's variables

110 Denotational Semantics vs. Operational Semantics In operational semantics, the state changes are defined by coded algorithms In denotational semantics, the state changes are defined by rigorous mathematical functions

111 Denotational Semantics: Program State The state of a program is the values of all its current variables: s = {<i1, v1>, <i2, v2>, ..., <in, vn>} Let VARMAP be a function that, when given a variable name and a state, returns the current value of the variable: VARMAP(ij, s) = vj

112 Decimal Numbers
<dec_num> → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | <dec_num> (0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9)
M_dec('0') = 0, M_dec('1') = 1, ..., M_dec('9') = 9
M_dec(<dec_num> '0') = 10 * M_dec(<dec_num>)
M_dec(<dec_num> '1') = 10 * M_dec(<dec_num>) + 1
...
M_dec(<dec_num> '9') = 10 * M_dec(<dec_num>) + 9
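
A minimal sketch (not from the slides) of M_dec written directly as a recursive function over decimal-numeral strings.

```python
def m_dec(numeral: str) -> int:
    if len(numeral) == 1:                      # M_dec('0') = 0, ..., M_dec('9') = 9
        return "0123456789".index(numeral)
    # M_dec(<dec_num> 'd') = 10 * M_dec(<dec_num>) + M_dec('d')
    return 10 * m_dec(numeral[:-1]) + m_dec(numeral[-1])

print(m_dec("305"))   # 305
```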

113 Expressions Map expressions onto Z ∪ {error} We assume expressions are decimal numbers, variables, or binary expressions having one arithmetic operator and two operands, each of which can be an expression

114 3.5 Semantics (cont.)
M_e(<expr>, s) Δ= case <expr> of
  <dec_num> => M_dec(<dec_num>, s)
  <var> => if VARMAP(<var>, s) == undef then error else VARMAP(<var>, s)
  <binary_expr> =>
    if (M_e(<binary_expr>.<left_expr>, s) == undef OR M_e(<binary_expr>.<right_expr>, s) == undef)
    then error
    else if (<binary_expr>.<operator> == '+')
    then M_e(<binary_expr>.<left_expr>, s) + M_e(<binary_expr>.<right_expr>, s)
    else M_e(<binary_expr>.<left_expr>, s) * M_e(<binary_expr>.<right_expr>, s)
...
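
A minimal sketch (the expression encoding is assumed, not from the slides) of M_e over a small representation of expressions: ints for decimal numbers, strings for variables, and ('+' or '*', left, right) tuples for binary expressions.

```python
ERROR, UNDEF = "error", "undef"

def varmap(name, state):
    return state.get(name, UNDEF)

def m_e(expr, state):
    if isinstance(expr, int):                      # <dec_num>
        return expr
    if isinstance(expr, str):                      # <var>
        value = varmap(expr, state)
        return ERROR if value == UNDEF else value
    op, left, right = expr                         # <binary_expr>
    lv, rv = m_e(left, state), m_e(right, state)
    if lv in (ERROR, UNDEF) or rv in (ERROR, UNDEF):
        return ERROR
    return lv + rv if op == "+" else lv * rv

print(m_e(("+", "x", ("*", 2, "y")), {"x": 3, "y": 4}))   # 11
```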

115 Assignment Statements Maps state sets to state sets
M_a(x := E, s) Δ= if M_e(E, s) == error
  then error
  else s' = {<i1, v1'>, <i2, v2'>, ..., <in, vn'>}, where for j = 1, 2, ..., n,
    vj' = VARMAP(ij, s) if ij <> x
        = M_e(E, s) if ij == x

116 Logical Pretest Loops Maps state sets to state sets
M_l(while B do L, s) Δ= if M_b(B, s) == undef
  then error
  else if M_b(B, s) == false
  then s
  else if M_sl(L, s) == error
  then error
  else M_l(while B do L, M_sl(L, s))
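
A minimal sketch (helper-function signatures assumed, not from the slides) of M_l as a recursive function over states, mirroring the "iteration becomes recursion" point made on the next slide.

```python
ERROR, UNDEF = "error", "undef"

def m_l(m_b, m_sl, state):
    """m_b: state -> bool or UNDEF (loop test); m_sl: state -> state or ERROR (loop body)."""
    test = m_b(state)
    if test == UNDEF:
        return ERROR
    if test is False:
        return state                       # loop exits: the meaning is the current state
    new_state = m_sl(state)
    if new_state == ERROR:
        return ERROR
    return m_l(m_b, m_sl, new_state)       # recurse on the state after one iteration

# e.g. while x > 0 do x := x - 1, with a state that is just {"x": ...}
print(m_l(lambda s: s["x"] > 0, lambda s: {"x": s["x"] - 1}, {"x": 3}))   # {'x': 0}
```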

117 Loop Meaning The meaning of the loop is the value of the program variables after the statements in the loop have been executed the prescribed number of times, assuming there have been no errors In essence, the loop has been converted from iteration to recursion, where the recursive control is mathematically defined by other recursive state mapping functions Recursion, when compared to iteration, is easier to describe with mathematical rigor

118 Evaluation of Denotational Semantics Can be used to prove the correctness of programs Provides a rigorous way to think about programs Can be an aid to language design Has been used in compiler generation systems Because of its complexity, it is of little use to language users

119 Summary BNF and context-free grammars are equivalent metalanguages –Well-suited for describing the syntax of programming languages An attribute grammar is a descriptive formalism that can describe both the syntax and the semantics of a language Three primary methods of semantics description –Operational, axiomatic, denotational

