UNIT-III By Mr. M. V. Nikum (B.E.I.T)
Programming Language
Lexical and syntactic features of a programming language are specified by its grammar.
– Language: a collection of valid sentences.
– Sentence: a sequence of words.
– Word: a sequence of letters or graphic symbols.
A language specified in this manner is called a formal language.
Programming Language Grammar
A formal grammar is a set of rules which precisely specify the sentences of a language L. A formal grammar can be represented as G = (Σ, NT, S, P), where
– Σ is the set of terminals
– NT is the set of nonterminals
– S is the distinguished symbol (starting symbol)
– P is the set of productions
Terminal symbols: symbols which cannot be further subdivided.
– Ex: Σ = {a, b, c, d, …, 0, 1, 2, 3, …}
Nonterminal symbols: combinations of terminal symbols which can be further subdivided.
– Ex:- or
Production: a production is also called a rewriting rule.
– Each production consists of a nonterminal followed by an arrow (→) or an equals sign (=), followed by a string of nonterminals and terminals.
– Ex: A → b, S → Aa
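A minimal sketch of how the four components of a grammar could be held in a program, using the productions A → b and S → Aa as the example. The class name `Grammar`, its field names and the `is_well_formed` check are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset      # Σ, the set of terminals
    nonterminals: frozenset   # NT, the set of nonterminals
    start: str                # S, the distinguished (starting) symbol
    productions: tuple        # P, as (left-hand side, right-hand side) pairs

    def is_well_formed(self):
        """Check that every production rewrites a nonterminal into known symbols."""
        symbols = self.terminals | self.nonterminals
        return (
            self.start in self.nonterminals
            and self.terminals.isdisjoint(self.nonterminals)
            and all(
                lhs in self.nonterminals and all(s in symbols for s in rhs)
                for lhs, rhs in self.productions
            )
        )


# Productions A -> b and S -> Aa over the terminals a and b.
g = Grammar(
    terminals=frozenset({"a", "b"}),
    nonterminals=frozenset({"S", "A"}),
    start="S",
    productions=(("A", ("b",)), ("S", ("A", "a"))),
)
print(g.is_well_formed())
```

Storing each right-hand side as a tuple of symbols, rather than as one string, keeps multi-character symbols such as `id` unambiguous.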
Syntax Tree
A graphical representation of any statement of a language is called a syntax tree.
[Figure: syntax tree for id+id*id]
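Derivation
The replacement of nonterminal symbols according to the given production rules is called derivation.
Types of derivation:
– Leftmost derivation: the leftmost nonterminal is replaced at each step.
– Rightmost derivation: the rightmost nonterminal is replaced at each step.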
Rules for English Language
1)
2)
3)
4)
5) The
6) student
7) studies
8) hard
9) slowly
Derivation
Structure (rule applied)
– (1)
– (2)
– (4)
– The (5)
– The student (6)
– The student studies (7)
– The student studies hard (8)
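A minimal sketch of this derivation in Python. The nonterminal names (Sentence, NounPhrase, VerbPhrase, Article, Noun, Verb, Adverb) and the bodies of rules 1, 2 and 4 are assumptions made for illustration; only the words produced by rules 5 to 8 and the order in which the rules are applied are taken from the slide. Rule 3 is not used by the derivation and is left out.

```python
# Hypothetical grammar: nonterminal names and the bodies of rules 1, 2 and 4
# are assumptions; rules 5 to 8 produce the words shown on the slide.
RULES = {
    1: ("Sentence", ["NounPhrase", "VerbPhrase"]),
    2: ("NounPhrase", ["Article", "Noun"]),
    4: ("VerbPhrase", ["Verb", "Adverb"]),
    5: ("Article", ["The"]),
    6: ("Noun", ["student"]),
    7: ("Verb", ["studies"]),
    8: ("Adverb", ["hard"]),
}


def derive(start, rule_numbers):
    """Apply the numbered rules in order, rewriting the first matching nonterminal."""
    form = [start]
    steps = [(None, list(form))]
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        position = form.index(lhs)  # raises ValueError if the rule cannot apply
        form[position:position + 1] = rhs
        steps.append((number, list(form)))
    return steps


steps = derive("Sentence", [1, 2, 4, 5, 6, 7, 8])
for number, form in steps:
    label = "start" if number is None else f"({number})"
    print(f"{label:>7}  {' '.join(form)}")
```

Each printed row pairs a sentential form with the rule that produced it, in the same structure/rule layout as the slide.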
Ex. Consider the following grammar:
E → E+E
E → E*E
E → id
Let us derive the string "id + id * id".
Using leftmost derivation:-
Using rightmost derivation:-
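Both derivations of id + id * id can be traced step by step. The sketch below is one possible way to do it in Python: the function `derive` and the production labels `plus`, `times` and `id` are hypothetical names, and the chosen rule sequences are only one valid choice for each kind of derivation.

```python
# Productions of the grammar E -> E+E | E*E | id, keyed by a hypothetical label.
PRODUCTIONS = {
    "plus": ["E", "+", "E"],
    "times": ["E", "*", "E"],
    "id": ["id"],
}


def derive(choices, leftmost=True):
    """Rewrite the leftmost (or rightmost) E at each step using the chosen production."""
    form = ["E"]
    steps = [" ".join(form)]
    for name in choices:
        positions = [i for i, symbol in enumerate(form) if symbol == "E"]
        i = positions[0] if leftmost else positions[-1]
        form[i:i + 1] = PRODUCTIONS[name]
        steps.append(" ".join(form))
    return steps


left = derive(["plus", "id", "times", "id", "id"], leftmost=True)
right = derive(["plus", "times", "id", "id", "id"], leftmost=False)
print("leftmost :", " => ".join(left))
print("rightmost:", " => ".join(right))
```

The leftmost trace passes through `id + E` after two steps, while the rightmost trace passes through `E + E * E`; both end in the same string.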
Rules
Required string is bbaa
Derive the string – babbaaaba
Required string is baab
Reduction
Reduction is the process of replacing a string, or part of a string, by a nonterminal according to a production rule.
Structure (rule applied)
– The student studies hard
– student studies hard (5)
– studies hard (6)
– hard (7)
– (8)
– (2)
– (4)
– (1)
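A minimal sketch of this reduction sequence in Python. As before, the nonterminal names and the bodies of rules 1, 2 and 4 are assumptions for illustration; the words and the order of the rules (5, 6, 7, 8, 2, 4, 1) follow the slide.

```python
# Hypothetical grammar: nonterminal names and rules 1, 2 and 4 are assumed.
RULES = {
    1: ("Sentence", ["NounPhrase", "VerbPhrase"]),
    2: ("NounPhrase", ["Article", "Noun"]),
    4: ("VerbPhrase", ["Verb", "Adverb"]),
    5: ("Article", ["The"]),
    6: ("Noun", ["student"]),
    7: ("Verb", ["studies"]),
    8: ("Adverb", ["hard"]),
}


def reduce_once(form, number):
    """Replace the first occurrence of the rule's right-hand side by its left-hand side."""
    lhs, rhs = RULES[number]
    for i in range(len(form) - len(rhs) + 1):
        if form[i:i + len(rhs)] == rhs:
            return form[:i] + [lhs] + form[i + len(rhs):]
    raise ValueError(f"rule {number} does not match: {' '.join(form)}")


form = "The student studies hard".split()
trace = [list(form)]
for number in [5, 6, 7, 8, 2, 4, 1]:
    form = reduce_once(form, number)
    trace.append(form)
    print(f"({number})  {' '.join(form)}")
```

The reduction succeeds when the whole string has been replaced by the starting symbol, which is the reverse of the derivation process.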
Ambiguity of Grammar
The grammar for a language is said to be ambiguous if there exists at least one string which can be generated (derived) in more than one way, i.e. the string has more than one leftmost derivation, more than one rightmost derivation, and more than one derivation tree.
Ex. Consider the following grammar:
E → E+E
E → E*E
E → id
Let us derive the string "id + id * id".
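The ambiguity of this grammar can be checked mechanically by enumerating every derivation tree for the string. The sketch below is a brute-force enumeration written for this one grammar, not a general parser; the function name `parse_trees` and the nested-tuple tree representation are hypothetical.

```python
def parse_trees(tokens):
    """Return every derivation tree for tokens under E -> E+E | E*E | id."""
    if tokens == ["id"]:
        return ["id"]
    trees = []
    # Try each operator as the one introduced by the topmost production.
    for i, token in enumerate(tokens):
        if token in ("+", "*"):
            for left in parse_trees(tokens[:i]):
                for right in parse_trees(tokens[i + 1:]):
                    trees.append((left, token, right))
    return trees


trees = parse_trees(["id", "+", "id", "*", "id"])
for tree in trees:
    print(tree)
print(len(trees), "derivation trees")
```

One tree groups the string as id + (id * id) and the other as (id + id) * id. Since each tree corresponds to exactly one leftmost derivation, finding two trees shows the grammar is ambiguous.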
Compiler
Compiler: a system program which automatically translates a high-level language program into a machine language program.
[Figure labels: Source program / high-level language program; Compiler; Database; Target program / machine language program]
Cross compiler: a system program which automatically translates an HLL program into a machine language program compatible with machine A, while the compiler itself runs on a different machine, machine B.
[Figure labels: Cross Compiler; Source program / HLL program compatible with machine A; Target program / machine language program; machine B]
P-Code Compiler
Source program → Compiler → Object program (p-code) → P-code interpreter → Execute
A p-code compiler is very similar in concept to an interpreter. In a pseudo-code compiler, the program is analyzed and converted into an intermediate form, which is then executed interpretively. The source program is compiled, and the resulting object program is in p-code. This p-code program is then read and executed under the control of a p-code interpreter.
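The compile-then-interpret idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the instruction names `PUSH`, `ADD`, `SUB` and `MUL` are hypothetical and do not belong to any real p-code instruction set, and the source program is given as an already-parsed nested tuple rather than as text.

```python
def compile_expr(expr):
    """Translate a nested tuple such as ('+', 2, ('*', 3, 4)) into stack-machine p-code."""
    if isinstance(expr, (int, float)):
        return [("PUSH", expr)]
    op, left, right = expr
    opcode = {"+": "ADD", "-": "SUB", "*": "MUL"}[op]
    # Operands are evaluated first, then the operator combines them.
    return compile_expr(left) + compile_expr(right) + [(opcode, None)]


def run(pcode):
    """Read and execute the p-code program one instruction at a time."""
    stack = []
    for opcode, argument in pcode:
        if opcode == "PUSH":
            stack.append(argument)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction {opcode}")
    return stack.pop()


pcode = compile_expr(("+", 2, ("*", 3, 4)))
print(pcode)
print(run(pcode))
```

The compiler's output is the object program in p-code; nothing is executed until `run` walks over that list, which is the role of the p-code interpreter.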
Compilation process overview
The compilation process is vast and complex. Hence it is divided into a series of subtasks called phases, each of which transforms the source program from one representation to another. The compilation process involves two major tasks.
Compilation process overview
Analysis of source text + Synthesis of target text = Translation of the program into object form
Analysis of source text
– Lexical analysis (scanner)
– Syntax analysis (parser)
– Semantic analysis
Synthesis of target text
– Storage allocation
– Code generation
Analysis of source text
– Determine the validity of a source statement from the analysis viewpoint.
– Determine the content of the source statement.
– Construct a suitable representation of the source statement for use by the subsequent analysis phases or by the synthesis phase of the compiler.
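For the lexical analysis step, these three tasks can be sketched as a small scanner in Python. The token categories (`NUMBER`, `ID`, `OP`) and the patterns behind them are assumptions chosen for illustration, not the token set of any particular language.

```python
import re

# Hypothetical token classes; the ERROR pattern catches any character
# that no other class accepts.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(source):
    """Check validity, determine content, and build a token list for later phases."""
    tokens = []
    for match in TOKEN_RE.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"invalid character {match.group()!r} at position {match.start()}")
        tokens.append((kind, match.group()))
    return tokens


print(scan("a = b + 42"))
```

The returned list of (class, text) pairs is the representation handed to syntax analysis; an invalid character is reported as an error instead of being passed on.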
Synthesis of target text
– Memory allocation: a simple task, given the presence of the symbol table. The memory requirement of an identifier is computed from its length, type and dimension, and memory is allocated to it.
– Code generation: code generation uses knowledge of the target architecture, its instructions and its addressing modes to select the appropriate instructions.
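The memory allocation step can be sketched as a pass over the symbol table. The type sizes below (4 bytes for `int`, 8 for `real`, 1 for `char`) and the symbol names are assumptions for illustration, and the sketch ignores alignment, which a real compiler would normally have to handle.

```python
# Assumed sizes in bytes for each type; real targets differ.
TYPE_SIZES = {"int": 4, "real": 8, "char": 1}


def allocate(symbols, base=0):
    """Assign consecutive addresses to (name, type, dimensions) symbol table entries."""
    table = {}
    address = base
    for name, type_name, dimensions in symbols:
        count = 1
        for extent in dimensions:
            count *= extent  # total number of elements in the array
        size = TYPE_SIZES[type_name] * count
        table[name] = (address, size)
        address += size
    return table


layout = allocate([
    ("i", "int", ()),        # scalar
    ("x", "real", (10,)),    # one-dimensional array of 10 elements
    ("m", "char", (3, 4)),   # 3 by 4 array
])
for name, (address, size) in layout.items():
    print(f"{name}: address {address}, size {size}")
```

A scalar is treated as an array with no dimensions, so its element count stays at 1 and its size is just the size of its type.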
Synthesis of target text
The important issues in code generation are:
– Determining where intermediate results should be kept, i.e. whether in memory locations or in machine registers.
– Determining which instructions should be used for type conversion operations.
– Determining which addressing modes should be used.
Phase Structure of Compiler
Code Optimization
This is an optional phase which improves the intermediate code so that the ultimate object program runs faster and takes less space. The aim is to improve the execution efficiency of a program by
– Eliminating redundancies
– Rearranging the computations in the program without changing its meaning
Code optimization techniques
1. Compile-time evaluation
2. Elimination of common subexpressions
3. Frequency reduction
4. Strength reduction
5. Dead code elimination
6. Boolean subexpression optimization
7. Local and global optimization
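Two of these techniques, compile-time evaluation and strength reduction, can be sketched on a small expression tree. The tuple representation `(operator, left, right)` and the function name `optimize` are hypothetical, and the only strength reduction shown is replacing a multiplication by 2 with an addition.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def optimize(expr):
    """Fold constant subexpressions and replace v * 2 by v + v for a variable v."""
    if not isinstance(expr, tuple):
        return expr  # an integer constant or a variable name
    op, left, right = expr
    left, right = optimize(left), optimize(right)
    # Compile-time evaluation: both operands are already known constants.
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)
    # Strength reduction: multiplication by 2 becomes a cheaper addition.
    # Restricted to plain variables so no computation is duplicated.
    if op == "*" and right == 2 and isinstance(left, str):
        return ("+", left, left)
    if op == "*" and left == 2 and isinstance(right, str):
        return ("+", right, right)
    return (op, left, right)


print(optimize(("+", ("*", 3, 4), "y")))
print(optimize(("*", "x", ("-", 5, 3))))
```

In the second call the constant subexpression 5 - 3 is folded first, which exposes the multiplication by 2 to strength reduction; this is why operands are optimized before the node itself.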