Presentation is loading. Please wait.

Presentation is loading. Please wait.

Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,

Similar presentations


Presentation on theme: "Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,"— Presentation transcript:

1 Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis, where we delve deeper to check whether they form a sensible set of instructions in the programming language. For a program to be semantically correct, all variables, functions, classes, etc. are properly defined, expressions and variables are used in ways that respect the type system, access control isn’t violated, and so on. Semantic analysis is the next-to-last phase of the front end and the compiler’s last chance to weed out incorrect programs. We need to ensure the program is well-formed enough to continue on to the next phase where we generate code.

2 semantic analysis and symbol table A large part of semantic analysis consists of tracking variable/function/type declarations.As we enter each new identifier in our symbol table, we need to record the type information of the declaration. Then, as we continue parsing the rest of the program, we make sure that the type of each identifier and expression is respected in terms of the operations being performed

3 Examples of the things we check in the semantic analysis phase.. The type of the right-side expression of an assignment statement should match the type of the left-side, and the left-side needs to be a properly declared and assignable identifier (i.e. not some sort of constant). The parameters of a function should match the arguments of a function call in both number and type. The language may require that identifiers are unique, disallowing a global variable and function of the same name. The operands to multiplication operation will need to be of numeric type, perhaps even the exact same type depending on the strictness of the language.

4 As we encounter identifiers in a program, we need to determine if the identifier is accessible at that point in the program. This is called scope checking One additional issue in semantic analysis is dealing with scopes.. A scope is a section of program text enclosed by basic program delimiters, e.g., {} in C, or begin- end in Pascal. Many languages allow nested scopes that are scopes defined within other scopes. The scope defined by the innermost such unit is called the current scope. The scope defined by the current scope and by any enclosing program units are known as open scopes. Any other scope is a closed scope.

5 Syntax-directed translation refers to a method of compiler implementation where the source language translation is completely driven by the parser. In other words, the parsing process and parse trees are used to direct semantic analysis and the translation of the source program. This can be a separate phase of a compiler or we can augment our conventional grammar with information to control the semantic analysis and translation. Such grammars are called attribute grammars.

6 attribute grammars. We augment a grammar by associating attributes with each grammar symbol that describes its properties. An attribute has a name and an associated value— a string, a number, a type, a memory location, an assigned register, whatever information we need. With each production in a grammar, we give semantic rules or actions, which describe how to compute the attribute values associated with each grammar symbol in a production. The attribute value for a parse node may depend on information from its children nodes below or its siblings and parent node above.

7 There are two types of attributes we might encounter: synthesized or inherited. Synthesized attributes are those attributes that are passed up a parse tree, i.e., the left-side attribute is computed from the right-side attributes. Values for the attributes of terminals are usually supplied by the lexical analyzer and the synthesized ones are passed up from there. Inherited attributes are those that are passed down a parse tree, i.e., the right-side attributes are derived from the left-side attributes (or other right-side attributes). These attributes are used for passing information about the context to nodes further down the tree.

8 S - attribute grammars , Only have synthesized attributes L - attribute grammars , for every production A→X 1 X 2 …X n , each attribute is a synthesized attribute or is a Inherited attribute of X j ( 1≤j≤n ),and the Inherited attribute of X j ( 1≤j≤n ) depends on the Inherited attribute of A or depends on the attributes of X 1 , X 2 , … , X j-1 in every semantics rule

9 We can implement syntax-directed translation in either a top-down or a bottom-up parser. Access to attributes in yacc. The syntax $1, $2 is used to access the attribute of the nth token on the right side of the production. The global variable yylval is set by the scanner and that value is saved with the token when placed on the parse stack. When a rule is reduced, a new state is placed on the stack, the default behavior is to just copy the attribute of $1 for that new state, this can be controlled by assigning to $$ in the action for the rule.

10 intermediate representation Most compilers translate the source first to some form of intermediate representation and convert from there into machine code. The intermediate representation is a machine- and languageindependent version of the original source code. Although converting the code twice introduces another step and thus incurs loss in compiler efficiency, use of an intermediate representation provides advantages in increased abstraction, cleaner separation between the front and back ends, adds possibilities for re-targeting/cross- compilation, and works well with many advanced optimization techniques.

11 Intermediate representations are usually categorized according to where they fall between a high-level language like C, and machine code. IRs that are close to a high-level language are called high-level IRs, and IRs that are close to assembly are called low-level IRs. For example, a high-level IR might preserve things like array subscripts or field accesses whereas a low-level IR converts those into explicit addresses and offsets.

12 abstract syntax tree You can think of a parse tree as an example of a high-level intermediate representation. In fact, it is often possible to reconstruct the actual source code from a parse tree and the corresponding symbol table. If we were to build a tree during the parsing phase, it could form the basis of a syntax tree representation of the input program. Typically, this is not quite the literal parse tree recognized by the parser (intermediate nodes may be collapsed, groupings units can be dispensed with, etc.),but it is winnowed down to the sufficient structure to drive the semantic processing and code generation. Such a tree is usually referred to as an abstract syntax tree. In the abstract syntax tree there are not leaf-nodes for operators and keywords,only inter-nodes associated with them.


Download ppt "Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,"

Similar presentations


Ads by Google