Published by Jasper Briggs.
Chapter 5: Compilation of Imperative, Functional, Logic and Object-Oriented Languages
Compilation of Imperative, Functional, Logic and Object-Oriented Languages: The compilation of imperative, functional, logic and object-oriented languages differs in nature. We will explore all four categories:
–Compilation of Imperative Languages: the P machine architecture
–Compilation of Functional Languages: the MaMa machine architecture
–Compilation of Logic Programming Languages: the WiM machine architecture
–Compilation of Object-Oriented Languages
Compilation of Imperative Languages: Imperative programming languages possess the following constructs and concepts, which can be mapped onto the constructs, concepts and instruction sequences of an abstract or real computer:
–"Variables" are containers for data objects whose contents (values) may change during the execution of the program. The values are changed by the execution of "statements" such as assignments.
–"Expressions" are terms formed from constants, names and operators, which are evaluated during execution.
–"Explicit specification of the control flow": the branch instruction goto, which exists in most imperative programming languages, can be compiled directly into the unconditional branch instruction of the target machine.
The P Machine Architecture: The (abstract) P machine was developed to make the Zurich implementation of Pascal portable. Anyone wishing to implement Pascal on a real computer had only to write an interpreter for the instructions of this abstract machine. The Pascal compiler, itself written in Pascal and compiled into P-code, could then be run on the real computer. The P machine is implemented using a stack, a main memory, and a small set of registers.
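The stack discipline of the P machine can be illustrated with a minimal interpreter sketch. This is not real P-code: the mnemonics (ldc, lod, sto, add, mul) only loosely follow P-instruction names, and the storage model is reduced to a value stack plus a variable store.

```python
# Minimal stack-machine interpreter in the spirit of the P machine.
# Hypothetical, heavily simplified model: no frames and no code-pointer
# manipulation; just a value stack and a variable store.

def run(program):
    stack, store = [], {}
    for op, *args in program:
        if op == "ldc":                  # load a constant onto the stack
            stack.append(args[0])
        elif op == "lod":                # load the value of a variable
            stack.append(store[args[0]])
        elif op == "sto":                # pop the stack into a variable
            store[args[0]] = stack.pop()
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
    return store

# Roughly what "a = 2; b = a * a + 1;" compiles to:
code = [("ldc", 2), ("sto", "a"),
        ("lod", "a"), ("lod", "a"), ("mul",),
        ("ldc", 1), ("add",), ("sto", "b")]
```

Running run(code) leaves a = 2 and b = 5 in the store; the interpreter loop is exactly the "write an interpreter for the abstract machine" step described above.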
Compilation of Functional Languages: Functional programming languages originated with LISP. LaMa, the language compiled in this chapter, is also a functional programming language. Imperative languages have (at least) two worlds: the world of expressions and the world of statements. Expressions provide values; statements alter the state of variables or determine the flow of control. Functional languages contain only expressions, and the execution of a functional program consists of the evaluation of the associated program expression, which defines the program result. This evaluation may in turn involve the evaluation of many other expressions, if an expression calls other expressions via function application; however, there is no explicit statement-defined control flow.
Compilation of Functional Languages Contd.: A variable in a functional program identifies an expression; unlike in imperative languages, it does not identify one or more storage locations. Its value cannot change as a result of the execution of the program; the only possibility is the "reduction" of the expression it identifies to its value. The MaMa machine architecture was introduced to compile the LaMa language.
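To make "reduction to a value" concrete, here is a small sketch; the representation (nested tuples for expressions, an environment binding names to expressions) is our own illustration and has nothing to do with MaMa's actual machine model. Evaluating a variable means reducing the expression it names, not reading a storage location.

```python
# A variable names an expression; evaluation is reduction of that
# expression to its value. Expressions are integers, names, or
# ("+"/"*", left, right) tuples -- an illustrative encoding only.

def reduce_expr(expr, env):
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):             # a name: reduce what it denotes
        return reduce_expr(env[expr], env)
    op, left, right = expr
    a, b = reduce_expr(left, env), reduce_expr(right, env)
    return a + b if op == "+" else a * b

# b names the expression a*a + 1; it is not a mutable storage cell.
env = {"a": 2, "b": ("+", ("*", "a", "a"), 1)}
```

Here reduce_expr("b", env) reduces b's expression to 5; no statement ever "assigns" to b.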
Compilation of Logic Programming Languages: Three different terminologies are used in discussions of logic programs. When programming is involved, we speak of procedures, alternatives of procedures, calls, variables, and so on. When explaining the logical foundations, we use words such as variable, function and predicate symbols, terms, atomic formulae, and so on. Finally, terms such as literal, Horn clause, unification and resolution come from the mechanization of logic in automated theorem-proving procedures. The WiM machine architecture was introduced to compile the Prolog language.
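The unification mentioned above can be sketched in a few lines. The term representation (strings starting with an uppercase letter as logic variables, tuples as compound terms) is our own choice, and the occurs check is omitted for brevity, as it is in many real Prolog implementations.

```python
# Sketch of first-order unification on an illustrative term encoding:
# logic variables are uppercase strings, compound terms are tuples.

def walk(t, s):
    """Follow variable bindings in substitution s to a representative."""
    while isinstance(t, str) and t[0].isupper() and t in s:
        t = s[t]
    return t

def unify(x, y, s):
    """Return an extended substitution unifying x and y, or None."""
    x, y = walk(x, s), walk(y, s)
    if x == y:
        return s
    if isinstance(x, str) and x[0].isupper():
        return {**s, x: y}                 # bind variable x (no occurs check)
    if isinstance(y, str) and y[0].isupper():
        return {**s, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for a, b in zip(x, y):             # unify argument by argument
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None                            # clash: different functors/arities
```

For example, unifying f(X, 2) with f(1, Y) yields the substitution {X: 1, Y: 2}; resolution in the WiM builds on exactly this operation.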
Compilation of Object-Oriented Languages: Software systems are becoming increasingly large and complex. Thus, there is a growing need to make the development of such systems more efficient and more transparent. The ultimate objective is to construct software systems, like present-day hardware systems (e.g. cars, washing machines), from ready-made standard building blocks. Attempts to progress towards this objective cover the following areas (among others):
–Modularization
–Reusability of modules
–Extensibility of modules
–Abstraction
Object-oriented languages afford new possibilities in these areas. Thus, object orientation is viewed as an important paradigm for managing the complexity of software systems.
The Structure of Compilers: Compilers for high-level programming languages are large, complex software systems. The development of large software systems should always begin with the decomposition of the overall system into subsystems (modules) with a well-defined and understood functionality, and the division used should provide sensible interfaces between the modules. The compiler structure described in what follows is a conceptual structure; that is, it identifies the subtasks of the compilation of a source language into a target language and specifies possible interfaces between the modules implementing these subtasks. The real module structure of a compiler is derived from this conceptual structure later.
The Structure of Compilers Contd.: The first coarse structuring of the compilation process is the division into an "analysis phase" and a "synthesis phase". In the analysis phase, the syntactic structure and some of the semantic properties of the source program are computed. The semantic properties that can be computed by a compiler are called the "static" semantics; this includes all semantic information that can be determined from the program alone, without executing it on input data. The result of the analysis phase is either a set of messages about syntactic or semantic errors in the program (that is, a rejection of the program) or an appropriate representation of the syntactic structure and the static semantic properties of the program. This phase is (ideally) independent of the properties of the target language and the target machine. The synthesis phase takes this program representation and converts it (possibly in several steps) into an equivalent target program.
Compiler Subtasks: The compilation process decomposes into a sequence of sub-processes. Each sub-process receives a representation of the program and produces a further representation of a different type, or of the same type but with modified content. We shall now follow the sequence of sub-processes step by step to explain their tasks and the structure of the program representations.
Lexical Analysis: A module, usually called the "SCANNER", carries out the lexical analysis of a source program. It reads the source program in from a file in the form of a character string and decomposes this character string into a sequence of lexical units of the programming language, called "SYMBOLS". Typical lexical units include the standard representations of objects of type integer, real, char, boolean and string, together with identifiers, comments, punctuation symbols and single- or multiple-character operators such as =, ==, (, ), [, ], and so on. The scanner can distinguish between sequences of space characters and/or line feeds that only have a meaning as separators, and can subsequently be ignored, and relevant occurrences of such characters (e.g. inside strings). The output of the scanner, if it does not encounter an error, is a representation of the source program as a sequence of symbols or encoded symbols.
Lexical Analysis Contd.: For example, the source fragment
int a,b; NL
a=2; NL
b=a*a+1; NL
(where NL stands for a line feed) is decomposed into the symbol sequence
id("int") sep id("a") com id("b") sem sep
id("a") eq int("2") sem sep
id("b") eq id("a") mul id("a") add int("1") sem sep
where each sep at the end of a line includes the NL.
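As an illustration, a scanner for this fragment can be sketched with Python's re module. The symbol-class names follow the example above; the regular expressions themselves, and the single combined pattern, are our own simplification of what a generated scanner would contain.

```python
import re

# Sketch of a scanner: decompose a character string into (class, text)
# symbols. Keywords are not distinguished from identifiers here; that
# is the screener's job in the conceptual structure.
TOKEN_RE = re.compile(r"""
    (?P<id>[A-Za-z_]\w*)      # identifiers (and reserved words alike)
  | (?P<int>\d+)              # integer constants
  | (?P<op>[=*+])             # single-character operators
  | (?P<punct>[,;])           # punctuation symbols
  | (?P<sep>\s+)              # separators: spaces and line feeds
""", re.VERBOSE)

def scan(source):
    symbols, pos = [], 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if not m:
            raise SyntaxError(f"illegal character at position {pos}")
        symbols.append((m.lastgroup, m.group()))
        pos = m.end()
    return symbols
```

For instance, scan("int a,b;") yields the sequence id("int") sep id("a") com id("b") sem, encoded here as ("id", "int"), ("sep", " "), ("id", "a"), ("punct", ","), ("id", "b"), ("punct", ";").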
Screening: The task of the screener is to recognize the following symbols in the symbol string produced by the scanner:
1.Symbols that have a special meaning in the programming language, for example, among the identifiers, the reserved symbols of the language such as {, }, int, float, etc.
2.Symbols that are irrelevant for the subsequent processing and will be eliminated, for example strings of space characters and line feeds, which have been used as separators between symbols, and comments.
3.Symbols that are not part of the program but directives to the compiler, for example the type of diagnosis to be performed, the type of compilation protocol desired, and so on.
In addition, the screener is often given the task of encoding the symbols of certain symbol classes (such as the identifiers) in a unique way and replacing each occurrence of a symbol by its code.
Screening Contd.: Thus, for example, if all the occurrences of an identifier in a program are replaced by the same natural number, the character-string representation of the identifier need only be stored once, and the problem of having to store identifiers of different lengths is concentrated in a specialized part of the program. In practice, the scanner and screener are usually combined into a single procedure (which is then simply called the scanner). Conceptually, however, they should be separated, because the task of the scanner can be accomplished by a finite automaton, while that of the screener must (sensibly) be carried out by other means. For the earlier example, the screened representation is:
int id(1) com id(2) sem
id(1) eq int(2) sem
id(2) eq id(1) mul id(1) add int(1)
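The screener's tasks (dropping separators, recognizing reserved symbols, numbering identifiers) can be sketched as follows. The (class, text) token shape and the RESERVED set are illustrative assumptions, not part of any particular language definition.

```python
# Sketch of a screener pass over scanner output: eliminate separators,
# mark reserved words, and replace each identifier with a unique
# natural number so its character string is stored only once.

RESERVED = {"int", "float", "if", "while"}   # language-dependent; illustrative

def screen(symbols):
    table, out = {}, []                      # table: identifier -> code
    for cls, text in symbols:
        if cls == "sep":                     # separators are now irrelevant
            continue
        if cls == "id" and text in RESERVED:
            out.append(("kw", text))         # reserved symbol, kept as-is
        elif cls == "id":
            code = table.setdefault(text, len(table) + 1)
            out.append(("id", code))         # all occurrences share one code
        else:
            out.append((cls, text))
    return out, table
```

On the symbols for "int a,b;" this produces int id(1) com id(2) sem, with the identifier table {a: 1, b: 2} stored in one specialized place.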
Syntax Analysis: Syntax analysis determines the structure of the program over and above the lexical structure. It knows the structure of expressions, statements, declarations, and lists of these constructs, and attempts to recognize this structure in the given symbol string. The corresponding module, called the "PARSER", must also be able to detect, locate and diagnose errors in the syntactic structure; there is a wealth of methods for syntax analysis. There are various equivalent forms of parser output. In our conceptual compiler structure we use the syntax tree of the program as the output.
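One of the many syntax-analysis methods, recursive descent, can be sketched for a tiny expression grammar (E -> T { + T }, T -> F { * F }, F -> id | int). The (class, text) token pairs and tuple-shaped tree nodes are our own illustrative conventions.

```python
# Recursive-descent parser producing a syntax tree as nested tuples.
# Each function consumes a prefix of the token list and returns
# (tree, remaining_tokens).

def parse_expr(tokens):                       # E -> T { + T }
    tree, rest = parse_term(tokens)
    while rest and rest[0] == ("op", "+"):
        right, rest = parse_term(rest[1:])
        tree = ("+", tree, right)             # left-associative sum
    return tree, rest

def parse_term(tokens):                       # T -> F { * F }
    tree, rest = parse_factor(tokens)
    while rest and rest[0] == ("op", "*"):
        right, rest = parse_factor(rest[1:])
        tree = ("*", tree, right)             # left-associative product
    return tree, rest

def parse_factor(tokens):                     # F -> id | int
    cls, text = tokens[0]
    if cls in ("id", "int"):
        return (cls, text), tokens[1:]
    raise SyntaxError(f"unexpected symbol of class {cls}")
```

Parsing the symbols for a*a+1 yields the tree ("+", ("*", id(a), id(a)), int(1)), with * binding more tightly than + because it is recognized at the lower grammar level.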
Semantic Analysis: The task of semantic analysis is to determine those properties of programs, above and beyond the (context-free) syntactic properties, that can be computed using only the program text. These properties are often called "static semantic" properties, unlike "dynamic" properties, which can only be determined when the compiled program is run. Thus, the two terms static and dynamic are associated with the two times, compile time and run time. The static semantic properties include:
1.The type correctness or incorrectness of programs in strongly typed languages such as Pascal. A necessary condition for type correctness is that every identifier is declared (implicitly or explicitly) and that there are no double declarations.
2.The existence of a consistent type assignment to all functions of a program in (functional) languages with polymorphism. Here, a function whose type is only partially specified, for example using type variables, can be applied to combinations of arguments of different types and essentially does the same thing in each case.
Semantic Analysis Contd.: For example, for the first statement, a=2;, one checks whether there is a variable name on the left side and whether the type of the right side matches that of the left side. Both questions are answered positively, since a is declared as a variable and, lexically, the character string '2' is recognized as a representation of an integer constant. In the second statement, b=a*a+1;, the type of the right side has to be computed. This computation involves the types of the terminal operands (all integer) and rules that compute the type of a sum or a product from the types of the operands. Here, we note that the arithmetic operators in most programming languages are "overloaded": they stand for the operations they designate over both integer and real numbers, possibly even with different precisions. In the type computation this overloading is resolved. In our example, it is established that an integer multiplication and an integer addition are involved. Thus, the result of the whole expression is of type integer.
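The type computation just described can be sketched as a bottom-up walk over the syntax tree. The tuple tree shape, the declaration table, and the simplified Pascal-like int/real rule for resolving overloading are our own assumptions.

```python
# Sketch of type computation with overloading resolution: the type of
# a sum or product is computed from the operand types; an operator is
# the integer operation only if both operands are integer.

def type_of(expr, decls):
    kind = expr[0]
    if kind == "int":
        return "integer"                     # lexically an integer constant
    if kind == "real":
        return "real"
    if kind == "id":
        return decls[expr[1]]                # declared type of the variable
    op, left, right = expr                   # ("+", l, r) or ("*", l, r)
    lt, rt = type_of(left, decls), type_of(right, decls)
    return "integer" if lt == rt == "integer" else "real"

decls = {"a": "integer", "b": "integer"}     # from the declarations int a,b;
rhs = ("+", ("*", ("id", "a"), ("id", "a")), ("int", "1"))   # a*a + 1
```

For the right side of b=a*a+1; the walk establishes integer multiplication and integer addition, so the whole expression has type integer, matching the declared type of b.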
Machine-Independent Optimization: This is an optional phase and does not exist in all compilers. It does not necessarily belong to either the analysis phase or the synthesis phase; however, it uses information computed by semantic analysis and, unlike the subtasks of the synthesis phase, it is machine independent.
Address Assignment (Part of the Code Generator): The synthesis phase of the compilation begins with storage allocation and address assignment. This involves properties of the target machine such as the word length, the address length, the directly addressable units of the machine and the existence or non-existence of instructions giving efficient access to parts of directly addressable units.
Generation of the Target Program (Part of the Code Generator): The code generator generates the instructions of the target program. For this, it uses the addresses assigned in the previous step to address variables. However, the time efficiency of the target program can often be increased if it is possible to hold the values of variables and expressions in machine registers, since access to these is generally faster than access to memory locations. Because each machine has only a limited number of such registers, the code generator must use them to the greatest advantage to store frequently used values. This task is called "register allocation".
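The code generator's core task can be illustrated by translating an expression tree into stack-machine instructions. Register allocation is deliberately left out, so every intermediate value goes through the stack; the tuple-shaped tree and the instruction names are our own simplified conventions, and variables are addressed by name rather than by assigned addresses.

```python
# Sketch of code generation: postorder walk over an expression tree,
# emitting stack-machine instructions (operands first, then operator).

def gen(tree):
    kind = tree[0]
    if kind == "int":
        return [("ldc", int(tree[1]))]       # push the constant
    if kind == "id":
        return [("lod", tree[1])]            # push the variable's value
    op, left, right = tree
    instr = ("add",) if op == "+" else ("mul",)
    return gen(left) + gen(right) + [instr]  # code for operands, then op

tree = ("+", ("*", ("id", "a"), ("id", "a")), ("int", "1"))   # a*a + 1
```

Applied to the tree for a*a+1, gen emits lod a, lod a, mul, ldc 1, add, i.e. exactly the operand-then-operator order a stack machine evaluates.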
Real Compiler Structures: So far, we have considered a conceptual compiler structure. Its modular structure was characterized by the following properties:
1.The compilation process is divided into a sequence of sub-processes.
2.Each sub-process communicates with its successor without feedback; the information flows in one direction only.
3.The intermediate representations of the source program can be described by mechanisms from the theory of formal languages, such as regular expressions, context-free grammars, attribute grammars, and so on.
4.The distribution of tasks among the sub-processes is in part based on the correspondence between the description mechanisms referred to above and automaton models, and is in part carried out pragmatically in order to split a complex task into two separate, more manageable subtasks.
Real Compiler Structures Contd.: Why is this not a good structure for a real compiler? In the design of a real compiler (one that is to be implemented), the structure is influenced by the complexity of the subtasks, the requirements on the compiler, and the constraints of the computer and the operating system. The correspondence between description mechanisms and automaton models led to the idea of compiler generation, which in turn led to the development of further description mechanisms and generation procedures; several subtasks of the conceptual compiler structure discussed earlier can (in part) be described by formal specifications for which generation procedures exist.