Published by Carmel Palmer. Modified over 9 years ago.
1
COMPILER DESIGN (170701)
Unit-1: Introduction
Prepared by: Prof. Harish I Rathod
Computer Engineering Department, Gujarat Power Engineering & Research Institute
2
Introduction
Programming languages are notations for describing computations to people and to machines. The world depends on programming languages because all the software running on all computers is written in some programming language. Before a program can be run, it must first be translated into a form in which a computer can execute it. The software systems that do this translation are called compilers.
GPERI – CD - UNIT-1
3
Language Processors
Compiler: A compiler is a program that reads a program in one language (the source language) and translates it into an equivalent program in another language (the target language). An important role of the compiler is to report any errors in the source program that it detects during the translation process.
4
Language Processors
Compiler: If the target program is an executable machine-language program, the user can then run it to process inputs and produce outputs.
Fig 1: A compiler (source program → compiler → target program)
Fig 2: Running the target program (input → target program → output)
5
Language Processors
Interpreter: Instead of producing a target program as a translation, an interpreter appears to directly execute the operations specified in the source program on inputs supplied by the user.
Fig 3: An interpreter (source program and input → interpreter → output)
6
Language Processors
Difference between compiler and interpreter: The machine-language target program produced by a compiler is usually much faster than an interpreter at mapping inputs to outputs. An interpreter can usually give better error diagnostics than a compiler, because it executes the source program statement by statement. With a compiler, several other programs may be required to create an executable target program.
7
Language Processors
Fig: A language-processing system. Source program → Preprocessor → modified source program → Compiler → target assembly program → Assembler → re-locatable machine code → Linker/Loader (which also takes library files and re-locatable object files) → target machine code.
8
Language Processors
A source program may be divided into modules stored in separate files. The task of collecting the source program is sometimes entrusted to a separate program, called a preprocessor. The modified source program is then fed to a compiler. The compiler produces an assembly-language program as its output.
9
Language Processors
The assembly-language program is then processed by a program called an assembler, which produces re-locatable machine code as its output. Large programs are often compiled in pieces, so the re-locatable machine code may have to be linked together with other re-locatable object files and library files into the code that actually runs on the machine.
10
Language Processors
The linker resolves external memory addresses, where the code in one file may refer to a location in another file. The loader then puts together all of the executable object files into memory for execution.
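To make "resolving external addresses" concrete, here is a minimal sketch of what a linker does with symbols. The object-file layout, the file names `main.o` and `math.o`, and the symbol `square` are all hypothetical; real object formats carry far more detail.

```python
# Each hypothetical "object file" has a code size, the symbols it defines
# (as offsets from its own start), and the external symbols it refers to.
OBJECTS = [
    {"name": "main.o", "size": 40, "defines": {"main": 0}, "refers": ["square"]},
    {"name": "math.o", "size": 24, "defines": {"square": 8}, "refers": []},
]


def link(objects, base=0):
    """Lay the objects out back to back and resolve external references."""
    addresses = {}
    start = base
    for obj in objects:
        for symbol, offset in obj["defines"].items():
            if symbol in addresses:
                raise ValueError(f"duplicate symbol {symbol!r}")
            addresses[symbol] = start + offset   # final address of the symbol
        start += obj["size"]

    resolved = {}
    for obj in objects:
        for symbol in obj["refers"]:
            if symbol not in addresses:
                raise ValueError(f"undefined symbol {symbol!r} in {obj['name']}")
            resolved[(obj["name"], symbol)] = addresses[symbol]
    return addresses, resolved


addresses, resolved = link(OBJECTS)
print(addresses)   # {'main': 0, 'square': 48}
print(resolved)    # {('main.o', 'square'): 48}
```

The reference to `square` inside `main.o` can only be filled in once `math.o` has been placed, which is why this step happens after each piece has been compiled separately.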
11
Structure of Compiler (Front end and Back end)
So far we have treated a compiler as a single box that maps a source program into a semantically equivalent target program. If we open this box, we find two parts to this mapping: analysis and synthesis.
12
Structure of Compiler (Front end and Back end)
Analysis part: It breaks up the source program into constituent pieces and imposes a grammatical structure on them. It then uses this structure to create an intermediate representation of the source program. If this part detects that the source program is either syntactically ill-formed or semantically unsound, it must provide informative messages so that the user can take corrective action.
13
Structure of Compiler (Front end and Back end)
Analysis part: This part also collects information about the source program and stores it in a data structure called a symbol table. Analysis determines the operations implied by the source program, which are recorded in a tree structure. The analysis part is often called the front end of the compiler.
14
Structure of Compiler (Front end and Back end)
Synthesis part: Synthesis takes the tree structure and translates the operations therein into the target program. Put another way, it constructs the target program from the intermediate representation and the information in the symbol table. The synthesis part is the back end.
15
Analysis of the source program
Lexical Analysis (Linear Analysis): The source program is read from left to right and its characters are grouped into tokens, e.g. constants, variable names, keywords (checks for a valid token set).
16
Analysis of the source program
Hierarchical Analysis (Syntax Analysis or Parsing): Tokens are grouped into grammatical phrases and a parse tree is constructed (checks for valid syntax).
Semantic Analysis: Certain checks are performed to ensure that the components of a program fit together meaningfully, i.e. its task is to determine the meaning of the source program (checks for semantic errors).
17
Phases of compiler
Fig: Phases of a compiler. Character stream → Lexical Analyzer → token stream → Syntax Analyzer → syntax tree → Semantic Analyzer → syntax tree → Intermediate Code Generator → intermediate representation → Machine-Independent Code Optimizer → intermediate representation → Code Generator → target machine code → Machine-Dependent Code Optimizer → target machine code. The symbol table is shared by all phases.
18
Lexical Analysis
The first phase of a compiler, also called linear analysis or scanning. The lexical analyzer reads the stream of characters of the source program and groups the characters into meaningful sequences called lexemes. For each lexeme, the lexical analyzer produces a token as output. The form of a token is:
(token-name, attribute-value)
19
Lexical Analysis
The token is passed to the next phase, syntax analysis. In a token, the first component, token-name, is an abstract symbol that is used during syntax analysis. The second component, attribute-value, points to an entry in the symbol table for this token.
20
Lexical Analysis
Example: Suppose a source program contains the assignment statement
position = initial + rate * 60
Its characters could be grouped into the following lexemes and mapped into the following tokens.
21
Lexical Analysis
position is a lexeme, mapped into the token (id,1), where id (identifier) is an abstract symbol and 1 points to the symbol-table entry for position. The assignment symbol = is a lexeme, mapped into the token (=); it needs no attribute value, so the second component is omitted. initial is a lexeme, mapped into the token (id,2), where 2 points to the symbol-table entry for initial.
22
Lexical Analysis
+ is a lexeme, mapped into the token (+). rate is a lexeme, mapped into the token (id,3), where 3 points to the symbol-table entry for rate. * is a lexeme, mapped into the token (*). 60 is a lexeme, mapped into the token (60). The statement therefore becomes the token sequence:
(id,1) (=) (id,2) (+) (id,3) (*) (60)
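The grouping of characters into lexemes and tokens can be sketched with a small regular-expression scanner. This is only an illustration under assumptions: the pattern table `TOKEN_SPEC` and the function `tokenize` are hypothetical names, the token set covers just this one statement, and the constant is emitted as `("number", 60)` rather than the slide's shorthand `(60)`.

```python
import re

# Hypothetical token patterns for this one-statement example; a real
# scanner would cover the full lexical grammar of the source language.
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+*]"),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for a source string."""
    symbol_table = []   # entry number k is stored at index k - 1
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue                      # whitespace produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            tokens.append(("number", int(lexeme)))
        else:
            tokens.append((lexeme, None))  # operators carry no attribute
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
print(tokens)
print(table)   # ['position', 'initial', 'rate']
```

Each identifier is entered into the symbol table the first time it is seen, so the attribute value of an `id` token is simply its entry number.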
24
Syntax Analysis (parsing)
The second phase of a compiler. It uses the first components of the tokens produced by the lexical analyzer to create a tree-like intermediate representation, known as a syntax tree, in which each interior node represents an operation and the children of the node represent the arguments of that operation.
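As a rough sketch of how a syntax tree can be built from a token stream, the recursive-descent parser below handles only the assignment `position = initial + rate * 60`. The grammar, the `Parser` class, and the tuple encoding of tree nodes are assumptions made for illustration, not something the slides prescribe.

```python
# Token stream for: position = initial + rate * 60
# (("number", 60) stands in for the slide's shorthand token (60)).
TOKENS = [("id", 1), ("=", None), ("id", 2), ("+", None),
          ("id", 3), ("*", None), ("number", 60)]


class Parser:
    """Recursive-descent parser for a tiny assumed grammar:
         stmt   -> id = expr
         expr   -> term { + term }
         term   -> factor { * factor }
         factor -> id | number
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Only the token name (first component) drives parsing decisions.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        token = self.tokens[self.pos]
        if expected is not None and token[0] != expected:
            raise SyntaxError(f"expected {expected!r}, got {token[0]!r}")
        self.pos += 1
        return token

    def statement(self):
        target = self.take("id")
        self.take("=")
        tree = ("=", target, self.expr())
        if self.peek() is not None:
            raise SyntaxError("trailing tokens after statement")
        return tree

    def expr(self):
        node = self.term()
        while self.peek() == "+":
            self.take("+")
            node = ("+", node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() == "*":
            self.take("*")
            node = ("*", node, self.factor())
        return node

    def factor(self):
        if self.peek() in ("id", "number"):
            return self.take()
        raise SyntaxError(f"unexpected token {self.peek()!r}")


tree = Parser(TOKENS).statement()
print(tree)
```

Because `term` is called from inside `expr`, the multiplication ends up deeper in the tree than the addition, which is how the tree records that `*` binds tighter than `+`.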
26
Semantic Analysis
The semantic analyzer uses the syntax tree and the information in the symbol table to check the source program for semantic consistency. It also gathers type information and saves it in either the syntax tree or the symbol table for use by later phases. An important task is type checking, where the compiler checks that each operator has matching operands.
27
Semantic Analysis
For example: Many programming languages require an array index to be an integer; the compiler must report an error if a floating-point number is used to index an array. A language may also permit some type conversions. For example, a binary arithmetic operator may be applied to either a pair of integers or a pair of floating-point numbers; if it is applied to one of each, the compiler may convert the integer into a floating-point number.
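The type-conversion rule can be sketched as a walk over the syntax tree that inserts an explicit `inttofloat` node wherever an integer meets a floating-point operand. The declared types in `SYMBOL_TYPES` are an assumption (the slides do not state them), and `check` is a hypothetical helper that knows only the two types `int` and `float`.

```python
# Assumed declared types for symbol-table entries 1..3
# (position, initial, rate); the slides do not state them.
SYMBOL_TYPES = {1: "float", 2: "float", 3: "float"}

# Syntax tree for: position = initial + rate * 60
TREE = ("=", ("id", 1), ("+", ("id", 2), ("*", ("id", 3), ("number", 60))))


def check(node):
    """Return (annotated_node, type), inserting inttofloat where needed."""
    kind = node[0]
    if kind == "id":
        return node, SYMBOL_TYPES[node[1]]
    if kind == "number":
        return node, "float" if isinstance(node[1], float) else "int"

    left, left_type = check(node[1])
    right, right_type = check(node[2])

    if kind == "=":
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        elif left_type != right_type:
            raise TypeError(f"cannot assign {right_type} to {left_type}")
        return (kind, left, right), left_type

    # Binary arithmetic: widen the int operand when the other is float.
    if left_type != right_type:
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (kind, left, right), "float"
    return (kind, left, right), left_type


annotated, result_type = check(TREE)
print(annotated)
print(result_type)   # float
```

Only the constant 60 is wrapped, because under the assumed declarations it is the only integer operand in the statement.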
28
Lexical Analysis . GPERI – CD - UNIT-1
29
Intermediate Code Generation
During translation, a compiler may construct one or more intermediate representations. Syntax trees are one such form, commonly used during syntax and semantic analysis. After syntax and semantic analysis of the source program, many compilers generate an explicit low-level or machine-like intermediate representation.
30
Intermediate Code Generation
This intermediate representation should have two important properties: it should be easy to produce, and it should be easy to translate into the target machine. We consider an intermediate form called three-address code, which consists of a sequence of assembly-like instructions with at most three operands per instruction.
31
Intermediate Code Generation
Each operand can act like a register. For position = initial + rate * 60, the three-address code is:
t1 = inttofloat(60)
t2 = id3 * t1
t3 = id2 + t2
id1 = t3
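One way to produce three-address code is a post-order walk of the annotated syntax tree that allocates a fresh temporary for every operation. The sketch below is illustrative only; the tuple tree encoding and the `generate` function are assumptions. For `position = initial + rate * 60` the addition comes out as `t3 = id2 + t2`.

```python
# Annotated syntax tree for: position = initial + rate * 60
TREE = ("=", ("id", 1),
        ("+", ("id", 2),
         ("*", ("id", 3), ("inttofloat", ("number", 60)))))


def generate(tree):
    """Return a list of three-address instructions for an assignment tree."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        kind = node[0]
        if kind == "id":
            return f"id{node[1]}"
        if kind == "number":
            return str(node[1])
        if kind == "inttofloat":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttofloat({operand})")
            return temp
        # Binary operator: evaluate both children first, then combine.
        left = walk(node[1])
        right = walk(node[2])
        temp = new_temp()
        code.append(f"{temp} = {left} {kind} {right}")
        return temp

    _, target, expr = tree
    dest = walk(target)
    source = walk(expr)
    code.append(f"{dest} = {source}")
    return code


for line in generate(TREE):
    print(line)
# t1 = inttofloat(60)
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```

Each instruction has at most one operator on the right-hand side, so the order of the list fixes the order of evaluation.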
33
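Code Optimization
This phase attempts to improve the intermediate code so that better target code will result. For position = initial + rate * 60, the conversion of 60 can be done once at compile time and the final copy can be eliminated:
t1 = id3 * 60.0
id1 = id2 + t1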
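The two improvements involved here, converting the integer constant at compile time and removing the final copy, can be sketched as small passes over a list of instructions. The tuple encoding `(dest, op, arg1, arg2)` and the function names are assumptions, and the copy-merging pass assumes each temporary is used exactly once; a real optimizer would verify that with data-flow analysis.

```python
# Three-address code as (dest, op, arg1, arg2) tuples; "copy" and
# "inttofloat" use only arg1.
CODE = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]


def is_temp(name):
    return name is not None and name[0] == "t" and name[1:].isdigit()


def optimize(code):
    # Pass 1: fold inttofloat of an integer literal into a float literal
    # and substitute it wherever the temporary is used.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a, b = constants.get(a, a), constants.get(b, b)
        if op == "inttofloat" and a.isdigit():
            constants[dest] = str(float(int(a)))
        else:
            folded.append((dest, op, a, b))

    # Pass 2: if a temporary is computed and then immediately copied,
    # write the result straight into the copy's destination.
    # (Assumes the temporary has no other use.)
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and is_temp(a) and merged[-1][0] == a:
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Pass 3: renumber the surviving temporaries from t1.
    renamed = {}
    final = []
    for dest, op, a, b in merged:
        a, b = renamed.get(a, a), renamed.get(b, b)
        if is_temp(dest):
            renamed[dest] = f"t{len(renamed) + 1}"
            dest = renamed[dest]
        final.append((dest, op, a, b))
    return final


def show(instr):
    dest, op, a, b = instr
    if op == "copy":
        return f"{dest} = {a}"
    if op == "inttofloat":
        return f"{dest} = inttofloat({a})"
    return f"{dest} = {a} {op} {b}"


for instr in optimize(CODE):
    print(show(instr))
# t1 = id3 * 60.0
# id1 = id2 + t1
```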
35
Code Generation
The final phase of the compiler generates the target code. Memory locations are selected for each variable used by the program. Intermediate instructions are translated into sequences of machine instructions that perform the same task. For example, using registers R1 and R2.
36
Code Generation
LDF R2, id3
MULF R2, R2, #60.0
LDF R1, id2
ADDF R1, R1, R2
STF id1, R1
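A very naive translation from three-address code to this kind of register code can be sketched as follows. The instruction tuples, the helper names, and the register-allocation policy (hand out the last free register first, never spill) are all assumptions chosen to keep the sketch short; the mnemonics LDF, MULF, ADDF and STF are the ones used on the slide.

```python
OPCODES = {"+": "ADDF", "*": "MULF"}

# Optimized three-address code as (dest, op, arg1, arg2) tuples.
CODE = [("t1", "*", "id3", "60.0"), ("id1", "+", "id2", "t1")]


def is_temp(name):
    return name[0] == "t" and name[1:].isdigit()


def generate(code, registers=("R1", "R2")):
    free = list(registers)   # pop() hands out the last register first
    location = {}            # temporary -> register holding its value
    asm = []

    def operand(name):
        """Return the assembly operand for a three-address operand."""
        if is_temp(name):
            return location[name]
        if name.startswith("id"):
            reg = free.pop()          # no spilling: fails if none are free
            asm.append(f"LDF {reg}, {name}")
            return reg
        return f"#{name}"             # literal becomes an immediate operand

    for dest, op, a, b in code:
        left = operand(a)
        if left.startswith("#"):
            raise ValueError("sketch assumes the first operand is not a literal")
        right = operand(b)
        # The result overwrites the register that held the first operand.
        asm.append(f"{OPCODES[op]} {left}, {left}, {right}")
        if is_temp(dest):
            location[dest] = left
        else:
            asm.append(f"STF {dest}, {left}")
            free.append(left)
    return asm


for line in generate(CODE):
    print(line)
# LDF R2, id3
# MULF R2, R2, #60.0
# LDF R1, id2
# ADDF R1, R1, R2
# STF id1, R1
```

Real code generators choose registers far more carefully; this sketch only shows how each intermediate instruction expands into a load, an arithmetic instruction, and possibly a store.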
38
Symbol Table Management
The symbol table is a data structure that contains a record for each identifier, with fields for its attributes. As soon as an identifier is recognized by the scanner (lexical analyzer), it is entered into the symbol table. An essential function of a compiler is to record the identifiers together with their attributes (type, scope, storage location, etc.).
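A minimal sketch of such a table, assuming one record per identifier and entry numbers starting at 1 (as in the (id,1), (id,2), (id,3) tokens). The class and field names are hypothetical, and real compilers usually add nested scopes and hashing tuned for fast lookup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Symbol:
    """One record per identifier; the attribute fields are illustrative."""
    name: str
    type: str = "unknown"
    scope: str = "global"
    location: Optional[int] = None   # storage location, filled in later


class SymbolTable:
    def __init__(self):
        self._records = []    # entry number k is stored at index k - 1
        self._numbers = {}    # identifier name -> entry number

    def enter(self, name):
        """Return the entry number for name, creating the record if new."""
        if name not in self._numbers:
            self._records.append(Symbol(name))
            self._numbers[name] = len(self._records)
        return self._numbers[name]

    def record(self, number):
        return self._records[number - 1]


table = SymbolTable()
entries = [table.enter(name) for name in ("position", "initial", "rate", "rate")]
table.record(3).type = "float"   # a later phase fills in attributes
print(entries)                   # [1, 2, 3, 3]
print(table.record(3))
```

Entering `rate` twice returns the same entry number, so every occurrence of an identifier shares one record that later phases can update.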
39
The grouping of phases
Compiler front and back ends:
Front end (analysis): It consists of those phases, or parts of phases, that depend primarily on the source language and are largely independent of the target machine.
40
The grouping of phases
Compiler front and back ends:
Back end (synthesis, machine dependent): It includes those portions of the compiler that depend on the target machine; generally, these portions do not depend on the source language.
41
The grouping of phases
Advantage of the analysis-synthesis concept: One can take the front end of a compiler and redo its associated back end to produce a compiler for the same source language on a different machine. If the back end is designed carefully, it may not even be necessary to redesign too much of the back end.
42
Compiler Construction Tools
Software development tools are available to implement one or more compiler phases: scanner generators, parser generators, syntax-directed translation engines, automatic code generators, and data-flow engines.