Compilers.

Compiler. A compiler is a program that takes a program in one language (the source program) as input and translates it into an equivalent program in another language (the target program). If errors are encountered during translation, the compiler reports them as error messages. It takes programs written in languages such as C, Pascal, or FORTRAN and converts them into lower-level languages such as assembly language.

Analysis and synthesis phases. Analysis: the source program is read and broken into its constituent pieces, and intermediate code is created. Synthesis: the target program is generated from the intermediate code.

Phases of a compiler: 1) Lexical analysis 2) Syntax analysis 3) Semantic analysis 4) Intermediate code generation 5) Code optimization 6) Code generation. Two further activities accompany these phases: symbol table management, and error detection and handling.

Lexical analysis. Also called scanning. The complete source code is scanned and broken up into groups of characters called tokens.
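As a rough illustration of scanning, the sketch below splits a source string into tokens with regular expressions. The token classes (NUMBER, IDENT, ASSIGN, OP) and the function name `tokenize` are assumptions chosen for this example, not part of the slides.

```python
import re

# Hypothetical token classes for a tiny expression language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r":="),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Scan the whole source string and return a list of (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("x := -a * b")` yields one IDENT, one ASSIGN, and then the operator and identifier tokens of the right-hand side.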

Syntax analysis. Also called parsing. It determines the structure of the source string by grouping the tokens together into a hierarchical structure. The hierarchical structure generated is called a parse tree or syntax tree.

Semantic analysis. Determines the meaning of the source string, for example by matching parentheses, matching if…else statements, checking that arithmetic operations are performed on type-compatible operands, and checking the scope of operations.

Intermediate code generation. Intermediate code is code that can easily be converted into target code. It may take the form of three-address code.

Code optimization. Improves the intermediate code so that the program executes faster or consumes less memory.

Code generation. The target code is generated as a sequence of machine instructions.

Semantic gap. The semantic gap is the difference between the semantics of two domains. For a compiler, the two domains are the domain of the source language and the execution domain. The semantics of these two domains are very different, so a gap exists between them. The following four features show how the compiler bridges the semantic gap.

Causes of semantic gap: data types. A data type is a specification of (1) the values that entities of the type may have, and (2) the operations that may be performed on entities of the type. We refer to these as legal values and legal operations. The compiler must check whether variables of a particular data type are assigned legal values and whether variables and values of a type are manipulated through legal operations, and it must issue error messages when these requirements are not met. Type conversion converts a value of one type into another type; conversion is possible only between certain types. For example, an int is converted to the real data type to perform some arithmetic computations.

Data structures. A program may use data structures such as an array, stack, record, or list. To generate code for a reference to a specific element of a data structure, the compiler must develop a memory mapping that finds the memory words corresponding to the required data element. A record or structure, which is a heterogeneous data structure, requires a more complex memory mapping.
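To make the idea of a memory mapping concrete, here is a minimal sketch of two common mappings. It assumes a row-major two-dimensional array with 0-based indices and a record whose fields are packed back to back with no alignment padding; the function names and the example sizes are hypothetical.

```python
def array_element_address(base, i, j, n_cols, elem_size):
    """Address of a[i][j] in a row-major 2-D array with 0-based indices."""
    return base + (i * n_cols + j) * elem_size


def record_field_offsets(fields):
    """Map each field name to its byte offset within the record.

    fields is a list of (name, size_in_bytes) pairs. Fields are packed
    back to back; real compilers usually also insert alignment padding.
    Returns the offset table and the total record size.
    """
    offsets = {}
    offset = 0
    for name, size in fields:
        offsets[name] = offset
        offset += size
    return offsets, offset
```

The record mapping is more involved than the array mapping because each field can have a different size, so offsets must be stored per field rather than computed from a single element size.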

Scope. The scope of a program entity (e.g. a data item) is the part of a program in which the entity is accessible. Scope rules determine whether a variable is accessible at a specific place in the program. Generally, the scope of a data item is restricted to the program block in which the data item is declared.

Control structures. The control structure of a language is the collection of language features that can be used to alter the flow of control during execution of a program. It includes unconditional and conditional transfer of control, iteration control, and procedure calls. A compiler must ensure that a source program does not violate the semantics of a control structure.

Binding and binding times. Each program entity pe in a program P has a set of attributes. If pe is an identifier, it has an attribute kind whose value indicates whether it is a variable, a procedure, or a reserved identifier (keyword). A variable has attributes such as type, dimensionality, scope, and memory address. Note: an attribute of one program entity may itself be another program entity. For example, a type can have attributes such as its size in bytes of memory.

Binding. A binding is the association of an attribute of a program entity with a value. For example, when the compiler processes the declaration my_type alpha; it binds the type attribute of the variable alpha to my_type. To allocate memory for alpha, the size of my_type must be known, so the size attribute of my_type must have been bound at some earlier time.

Binding times. 1) Language definition time of a programming language L: the time at which the features of the language are specified. 2) Language implementation time of a programming language L: the time at which the design of a language translator for L is finalized. 3) Compilation time of a program P. 4) Execution init time of a procedure proc. 5) Execution time of a procedure proc.

Language specification. The specification of a language L may specify binding times for the attributes of various program entities. For example, the specification of a block-structured language may state that the binding of the local variables of a procedure should be performed at the execution init time of the procedure.

Static and dynamic binding. A static binding is a binding performed before the execution of a program begins. A dynamic binding is a binding performed after the execution of a program has begun. Static binding generally leads to more efficient execution than dynamic binding, because the binding work is not repeated at run time.

Data structures used in a compiler. Two data structures are used to manage run-time storage. Stack: used for activation records. Using LIFO order, activation records and data objects are pushed onto the stack when a procedure is called and popped when it returns. Heap: used for dynamic allocation and deallocation of memory. The heap is flexible because blocks can be allocated and freed in any order.

Fields of an activation record. An activation record is a block of memory used for managing the information needed by a single execution of a procedure. Return value: stores the result of a function call. Actual parameters: holds information about the actual parameters. Control link (optional): points to the activation record of the calling procedure. Access link (optional): refers to non-local data held in other activation records. Saved machine status: the status of the machine just before the procedure is called. Local variables: data that is local to the execution of the procedure. Temporaries: temporary values, such as those arising in the evaluation of expressions.

Heap data structure. The heap is used for allocation and deallocation of objects. When an object is created, the required amount of memory is allocated for it from the heap. After the object is no longer needed, the allocated memory can be freed and returned to the heap. In C, the malloc function allocates memory and the free function deallocates it.

Due to frequent allocation and deallocation, small free areas or holes are created in memory. Memory management techniques are therefore required to collect all such free areas and reuse them effectively. Two popular techniques of memory management are reference counting and garbage collection.

Reference count. In this technique, the system associates a reference count with each memory area to indicate how many users or programs are currently using it. The count is incremented when a user gains access to the area and decremented when a user releases it. When the reference count reaches zero, the memory area is free.
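The bookkeeping described above can be sketched as follows. The class name `RefCountedAreas` and its methods are assumptions made for this illustration; a real memory manager would track addresses rather than string names.

```python
class RefCountedAreas:
    """Track how many users currently hold each memory area."""

    def __init__(self):
        self.counts = {}  # area name -> number of current users

    def acquire(self, area):
        """A user gains access to the area: increment its count."""
        self.counts[area] = self.counts.get(area, 0) + 1

    def release(self, area):
        """A user gives up the area: decrement its count."""
        if self.counts.get(area, 0) == 0:
            raise ValueError(f"area {area!r} is not in use")
        self.counts[area] -= 1

    def is_free(self, area):
        """An area is free when no user holds a reference to it."""
        return self.counts.get(area, 0) == 0
```

An area acquired twice stays in use after one release and becomes free only after the second.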

Garbage collector. It makes two passes over memory to identify unused areas. In the first pass, it traverses all pointers that point to allocated areas and marks the areas that are in use. In the second pass, it finds all areas that are unmarked and declares them unused, or free. Garbage collection is also known as automatic memory management.
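The two passes can be sketched on a toy heap. Here the heap is modeled as a dictionary from each allocated area to the areas it points to, and `roots` stands for the pointers held directly by the program; both the representation and the function name are assumptions for illustration only.

```python
def mark_and_sweep(heap, roots):
    """Return the set of allocated areas that are unreachable from the roots.

    heap:  dict mapping each area to the list of areas it points to
    roots: areas pointed to directly by the running program
    """
    # Pass 1 (mark): follow pointers from the roots and mark areas in use.
    marked = set()
    worklist = list(roots)
    while worklist:
        area = worklist.pop()
        if area in marked:
            continue
        marked.add(area)
        worklist.extend(heap[area])

    # Pass 2 (sweep): every unmarked area is unused and can be freed.
    return {area for area in heap if area not in marked}
```

In a heap where C and D point only at each other, both are reported as free even though each is still referenced, which is a case that simple reference counting does not reclaim.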

Memory allocation strategies. A program in the OS called the 'memory manager' handles memory management by allocating the required amount of primary memory to processes. Memory allocation strategies include first-fit and best-fit.

First-fit. Suppose there are many free blocks (holes) in memory. First-fit allocates the first hole that satisfies the memory requirement of the process. Example: a process requires 15 KB of memory, and the memory manager has a list of unallocated blocks of 10 KB, 18 KB, 16 KB, 25 KB, and 19 KB. First-fit will allocate the 18 KB block.

Best-fit. Best-fit allocates the smallest hole that is large enough for the process. For a 15 KB request and a free list of 10 KB, 18 KB, 16 KB, 25 KB, and 19 KB, the memory manager will allocate the 16 KB block. The very small free memory areas that remain after allocation are called fragmentation. This problem can be resolved by using memory compaction techniques.
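The two strategies can be compared on the free list from the example (10, 18, 16, 25, and 19 KB, with a 15 KB request). This is a minimal sketch; the function names are hypothetical and each function returns the index of the chosen hole, or None if no hole is large enough.

```python
def first_fit(holes, request):
    """Index of the first hole that is large enough for the request."""
    for index, size in enumerate(holes):
        if size >= request:
            return index
    return None


def best_fit(holes, request):
    """Index of the smallest hole that is still large enough for the request."""
    best = None
    for index, size in enumerate(holes):
        if size >= request and (best is None or size < holes[best]):
            best = index
    return best


holes = [10, 18, 16, 25, 19]           # free blocks in KB
print(holes[first_fit(holes, 15)])     # 18
print(holes[best_fit(holes, 15)])      # 16
```

First-fit leaves a 3 KB remainder here while best-fit leaves 1 KB; best-fit has to scan the whole list, whereas first-fit stops at the first match.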

Compilation of expressions. An expression contains arithmetic operators and operands, and can be converted into an intermediate code form. The forms of intermediate code considered here are: 1. abstract syntax tree 2. Polish notation 3. three-address code

Abstract syntax tree. Consider the string x := -a * b + -a * b. Its natural hierarchical structure is represented by a syntax tree.

Polish notation. Postfix or prefix notation. For example, x = -a * b + -a * b has the postfix form x a - b * a - b * + =, where the - written after a denotes unary minus.
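A postfix form can be obtained by a post-order walk of the syntax tree. The sketch below assumes a tree encoded as nested tuples, where a string is an operand, a 2-tuple is a unary operator with its operand, and a 3-tuple is a binary operator with its two operands; this encoding is an assumption for illustration.

```python
def to_postfix(node):
    """Return the postfix token list for a syntax tree of nested tuples."""
    if isinstance(node, str):          # operand: a name
        return [node]
    if len(node) == 2:                 # unary operator: (op, operand)
        op, operand = node
        return to_postfix(operand) + [op]
    op, left, right = node             # binary operator: (op, left, right)
    return to_postfix(left) + to_postfix(right) + [op]


term = ("*", ("-", "a"), "b")          # -a * b
tree = ("=", "x", ("+", term, term))   # x = -a * b + -a * b
print("".join(to_postfix(tree)))       # xa-b*a-b*+=
```

Each operator is emitted only after all of its operands, which is why no parentheses are needed in the result.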

Three Address Code Quadruple representation

Triples

Indirect Triples
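As a rough sketch of the quadruple representation, the code below generates three-address code for x := -a * b + -a * b from a syntax tree. Each quadruple is (operator, argument 1, argument 2, result). The nested-tuple tree encoding, the operator name uminus, and the temporary names t1, t2, ... are assumptions for this example. In a triple representation the result field would be dropped and later instructions would refer to an earlier one by its position instead of by a temporary name.

```python
def gen_quadruples(node, quads):
    """Append quadruples computing node; return the name that holds its value."""
    if isinstance(node, str):                  # operand: already has a name
        return node
    if len(node) == 2:                         # unary minus: ("-", operand)
        op, arg1, arg2 = "uminus", gen_quadruples(node[1], quads), None
    else:                                      # binary: (op, left, right)
        op = node[0]
        arg1 = gen_quadruples(node[1], quads)
        arg2 = gen_quadruples(node[2], quads)
    result = f"t{len(quads) + 1}"              # fresh temporary for this result
    quads.append((op, arg1, arg2, result))
    return result


def translate_assignment(target, expr):
    """Return the quadruples for the assignment target := expr."""
    quads = []
    value = gen_quadruples(expr, quads)
    quads.append((":=", value, None, target))
    return quads


term = ("*", ("-", "a"), "b")
for quad in translate_assignment("x", ("+", term, term)):
    print(quad)
```

For this expression the generator emits six quadruples: two uminus, two multiplications, one addition, and the final assignment to x.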

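Code optimization techniques. 1) Common subexpression elimination. Example, before: t1 = 4 * i; t2 = a[t1]; t3 = 4 * j; t4 = 4 * i; t5 = n; t6 = b[t4] + t5; After: t1 = 4 * i; t2 = a[t1]; t3 = 4 * j; t5 = n; t6 = b[t1] + t5; The computation t4 = 4 * i is eliminated because the value of 4 * i is already available in t1.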
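A minimal sketch of common subexpression elimination within one basic block follows. It assumes every name is assigned at most once and that no operand is reassigned inside the block. Instructions are (target, op, arg1, arg2) tuples, and b[t4] + t5 is split into an index step and an add step, so the temporaries are numbered slightly differently from the example; all of these representation choices are assumptions for illustration.

```python
def eliminate_common_subexpressions(block):
    """Drop instructions that recompute a value already held by an earlier name."""
    available = {}   # (op, arg1, arg2) -> name that already holds this value
    alias = {}       # eliminated name -> surviving name
    optimized = []
    for target, op, arg1, arg2 in block:
        # Rewrite operands that referred to an eliminated temporary.
        arg1 = alias.get(arg1, arg1)
        arg2 = alias.get(arg2, arg2)
        key = (op, arg1, arg2)
        if key in available:
            alias[target] = available[key]      # reuse the earlier result
        else:
            available[key] = target
            optimized.append((target, op, arg1, arg2))
    return optimized


block = [
    ("t1", "*", "4", "i"),
    ("t2", "index", "a", "t1"),
    ("t3", "*", "4", "j"),
    ("t4", "*", "4", "i"),       # same value as t1
    ("t5", "copy", "n", None),
    ("t6", "index", "b", "t4"),
    ("t7", "+", "t6", "t5"),
]
for instruction in eliminate_common_subexpressions(block):
    print(instruction)
```

The instruction for t4 disappears from the output and the index into b uses t1 instead.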

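Code motion. Move loop-invariant code out of the loop to a point before the loop starts. Example, before: while (i <= max - 1) { sum = sum + a[i]; } After: n = max - 1; while (i <= n) { sum = sum + a[i]; } The expression max - 1 is now evaluated once instead of on every test of the loop condition.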

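Strength reduction. Replace expensive operations with cheaper ones. Example, before: for (i = 0; i <= 50; i++) { count = i * 7; } After: temp = 0; for (i = 0; i <= 50; i++) { count = temp; temp = temp + 7; } The multiplication inside the loop is replaced by an addition; temp starts at 0 so that count equals i * 7 on every iteration.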
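The sketch below checks that replacing the multiplication by a running addition produces the same sequence of values for count. It assumes the running value starts at 0 so that it matches i * 7 when i is 0; the function names are hypothetical.

```python
def counts_with_multiplication(limit=50):
    """Values of count = i * 7 for i = 0 .. limit."""
    return [i * 7 for i in range(limit + 1)]


def counts_with_addition(limit=50):
    """Same values, computed by adding 7 on each iteration."""
    counts = []
    temp = 0                      # equals i * 7 for i = 0
    for _ in range(limit + 1):
        counts.append(temp)
        temp += 7                 # cheaper than a multiplication
    return counts


print(counts_with_multiplication() == counts_with_addition())  # True
```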

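Dead code elimination. A variable is dead if its value is not used anywhere later in the program; code that can never execute, or whose result is never used, is dead code. In the example below, the assignment A = x + 5 is dead code because the condition i == 1 can never be true: i = 0; if (i == 1) { A = x + 5; }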

Copy propagation. Copy (variable) propagation means using one variable in place of another that holds the same value. Before: x = pi; … area = x * r * r; After: area = pi * r * r; Here the variable x is eliminated.

Loop optimization Techniques 1) Code Motion 2) Induction variable and strength reduction 3) Loop Unrolling 4) Loop Fusion

Induction variables and reduction in strength. A variable x is called an induction variable of a loop L if its value changes on every iteration of the loop, being incremented or decremented by some constant. For example: B1: i = i + 1; t1 = 4 * i; t2 = a[t1]; if t2 < 10 goto B1. Here i and t1 are induction variables. When a loop has several induction variables, it may be possible to get rid of all but one.

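Loop unrolling. In this method, the number of jumps and tests is reduced by writing the loop body twice per iteration. Before: int i = 1; while (i <= 100) { a[i] = b[i]; i++; } After: int i = 1; while (i <= 100) { a[i] = b[i]; i++; a[i] = b[i]; i++; } The unrolled loop performs 50 tests instead of 100.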
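The effect of unrolling by a factor of two can be sketched as below. The code uses 0-based indices, assumes the number of elements is even, and counts how many times the loop body is entered as a stand-in for the number of tests and jumps; the function names are hypothetical.

```python
def copy_rolled(b):
    """Copy b element by element; one loop test per element."""
    a = [None] * len(b)
    iterations = 0
    i = 0
    while i < len(b):
        iterations += 1
        a[i] = b[i]
        i += 1
    return a, iterations


def copy_unrolled_by_two(b):
    """Copy two elements per iteration; assumes len(b) is even."""
    a = [None] * len(b)
    iterations = 0
    i = 0
    while i < len(b):
        iterations += 1
        a[i] = b[i]
        i += 1
        a[i] = b[i]       # second copy of the loop body
        i += 1
    return a, iterations


data = list(range(100))
print(copy_rolled(data)[1], copy_unrolled_by_two(data)[1])  # 100 50
```

Both versions produce the same array; the unrolled one trades a larger loop body for half as many tests.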

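Loop fusion. In the loop fusion method, several loops are merged into one loop. For example: for i = 1 to n do for j = 1 to m do A[i,j] = 10 can be written as for i = 1 to n*m do A[i] = 10. Here the two nested loops are merged into a single loop that treats A as a one-dimensional array of n*m elements.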
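The equivalence of the nested loops and the single merged loop can be checked with a small sketch. It assumes the two-dimensional array is stored contiguously in row-major order, so that it can be viewed as a flat array of n*m elements; the function names are hypothetical and indices are 0-based.

```python
def fill_nested(n, m):
    """Fill an n-by-m array with 10 using two nested loops; return it flattened."""
    A = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            A[i][j] = 10
    return [value for row in A for value in row]


def fill_merged(n, m):
    """Fill the same n*m elements with a single loop over a flat array."""
    A = [0] * (n * m)
    for k in range(n * m):
        A[k] = 10
    return A


print(fill_nested(3, 4) == fill_merged(3, 4))  # True
```

The merged version has one loop counter and one loop test per element instead of an inner and an outer loop to maintain.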