1
System Software Theory (5KS03)
2
System software
System software is a type of computer program that is designed to run a computer’s hardware and application programs. The system software is the interface between the hardware and user applications. The operating system (OS) is the best-known example of system software. The OS manages all the other programs in a computer.
3
Syllabus
Unit I: Introduction to Compiling: Phases of a compiler. Lexical Analysis: The role of the lexical analyzer, input buffering, specification of tokens, recognition of tokens, a language for specifying lexical analyzers, lex and yacc tools, state minimization of DFA.
Unit II: Syntax Analysis: The role of the parser, review of context-free grammars for syntax analysis. Top-down parsing: recursive descent parsing, predictive parsers, transition diagrams for predictive parsers, non-recursive predictive parsing, FIRST and FOLLOW, construction of predictive parsing tables, LL(1) grammars. Error recovery in predictive parsing.
Unit III: Bottom-up parsing: Handle pruning, stack implementation of shift-reduce parsing, conflicts during shift-reduce parsing. LR parsers: LR parsing algorithm, construction of SLR parsing tables, canonical LR parsing tables and canonical LALR parsing tables. Error recovery in LR parsing.
4
Unit IV: Syntax Directed Translation: Syntax-directed definitions, attributes, dependency graphs, construction of syntax trees. Syntax-directed definitions for constructing syntax trees, directed acyclic graphs for expressions. Bottom-up evaluation of S-attributed definitions, L-attributed definitions. Top-down translation, design of a predictive translator.
Unit V: Run Time Environments: Source language issues: activation trees, control stacks, storage organization, subdivision of run-time memory, activation records. Storage allocation strategies: static allocation, stack allocation, dangling references. Symbol table: entries, storage allocation, hash tables, scope information.
Unit VI: Code Generation: Intermediate languages, translation of declarations and assignment statements. Design issues of a code generator, target machine, run-time storage management, basic blocks and flow graphs.
5
Text Book:
A. V. Aho, R. Sethi, J. D. Ullman, “Compilers: Principles, Techniques and Tools”, Pearson Education (LPE).
Reference Books:
1. D. M. Dhamdhere, “Compiler Construction: Principles and Practice”, 2/e, Macmillan India.
2. Andrew Appel, “Modern Compiler Implementation in C”, Cambridge University Press.
3. K. C. Louden, “Compiler Construction: Principles and Practice”, India Edition, CENGAGE.
4. J. P. Bennett, “Introduction to Compiling Techniques”, 2/e, TMH.
6
Phases of Compiler
The compilation process is a sequence of phases. Each phase takes its input from the previous phase, has its own representation of the source program, and feeds its output to the next phase of the compiler.
Lexical Analysis
The first phase of the compiler works as a text scanner. It reads the source code as a stream of characters and groups the characters into meaningful lexemes. The lexical analyzer represents these lexemes as tokens of the form: <token-name, attribute-value>
8
Tokens, patterns and lexemes
Tokens: In most programming languages, keywords, constants, identifiers, strings, numbers, operators, and punctuation symbols such as commas and semicolons are treated as tokens.
Lexemes: A lexeme is a sequence of characters in the source program that is matched by the pattern for a token.
Patterns: A pattern is a rule describing the set of lexemes that can represent a particular token in the source program.
E.g.: position := initial + rate * 60
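The example statement above can be split into tokens with a small scanner. This is only a minimal sketch, assuming Python's `re` module; the token names (ID, NUM, ASSIGN, PLUS, TIMES) and the function `tokenize` are hypothetical, and the attribute value is simply the lexeme itself (a real compiler would often store a symbol-table index instead).

```python
import re

# Hypothetical token patterns (not from the slides); the first alternative that matches wins.
TOKEN_PATTERNS = [
    ("NUM",    r"\d+"),
    ("ID",     r"[A-Za-z][A-Za-z0-9]*"),
    ("ASSIGN", r":="),
    ("PLUS",   r"\+"),
    ("TIMES",  r"\*"),
    ("SKIP",   r"\s+"),   # whitespace separates lexemes and produces no token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS))

def tokenize(source):
    """Return a list of (token_name, lexeme) pairs for the source string."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"no token matches at position {pos}: {source[pos]!r}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens

for token in tokenize("position := initial + rate * 60"):
    print(token)
```

Each printed pair is one token: the first element is the token name and the second is the lexeme that matched its pattern.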
9
Token     Lexeme      Pattern
ID        x y n0      letter followed by letters and digits
NUM       e-5         any numeric constant
IF        if          if
LPAREN    (           (
LITERAL   ``Hello''   any string of characters (except ``) between `` and ``
10
Syntax Analysis
The next phase is called syntax analysis or parsing. It takes the tokens produced by lexical analysis as input and generates a parse tree (or syntax tree). In this phase, token arrangements are checked against the grammar of the source language, i.e. the parser checks whether the expression formed by the tokens is syntactically correct.
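The idea of checking tokens against a grammar and building a syntax tree can be sketched with a tiny recursive descent parser. This is an illustration under stated assumptions, not the course's parser: the grammar in the docstring, the token names, and the function `parse_expression` are all hypothetical, and the tree is represented as nested tuples.

```python
def parse_expression(tokens):
    """Parse a list of (token_name, lexeme) pairs into a nested-tuple syntax tree.

    Assumed grammar (illustrative only):
        expr   -> term   (PLUS term)*
        term   -> factor (TIMES factor)*
        factor -> ID | NUM | LPAREN expr RPAREN
    """
    pos = 0

    def peek():
        # Name of the current token, or "EOF" when the input is exhausted.
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def expect(name):
        # Consume the current token if it has the expected name, else report a syntax error.
        nonlocal pos
        if peek() != name:
            raise SyntaxError(f"expected {name}, found {peek()} at token {pos}")
        lexeme = tokens[pos][1]
        pos += 1
        return lexeme

    def factor():
        kind = peek()
        if kind in ("ID", "NUM"):
            return (kind, expect(kind))
        expect("LPAREN")
        node = expr()
        expect("RPAREN")
        return node

    def term():
        node = factor()
        while peek() == "TIMES":
            expect("TIMES")
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek() == "PLUS":
            expect("PLUS")
            node = ("+", node, term())
        return node

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected {peek()} at token {pos}")
    return tree

tokens = [("ID", "initial"), ("PLUS", "+"), ("ID", "rate"), ("TIMES", "*"), ("NUM", "60")]
print(parse_expression(tokens))
```

Because `term` is called from `expr`, multiplication binds tighter than addition, so `rate * 60` becomes a subtree of the `+` node.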
11
Semantic Analysis
Semantic analysis checks whether the constructed parse tree follows the rules of the language. For example, it checks that values are assigned between compatible data types and reports errors such as adding a string to an integer. The semantic analyzer also keeps track of identifiers, their types, and expressions, and checks whether identifiers are declared before use. The semantic analyzer produces an annotated syntax tree as output.
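The two checks mentioned above (compatible types, declaration before use) can be sketched as a recursive walk over an expression tree. This is a simplified sketch: the nested-tuple tree shape, the type names "int" and "string", and the function `check_types` are assumptions made for illustration, and the function returns only the type of the whole expression rather than a fully annotated tree.

```python
def check_types(node, declared):
    """Return the type of an expression tree, or raise TypeError on a semantic error.

    node is ("ID", name), ("NUM", text), ("STR", text) or (op, left, right);
    declared maps identifier names to type names such as "int" or "string".
    """
    kind = node[0]
    if kind == "NUM":
        return "int"
    if kind == "STR":
        return "string"
    if kind == "ID":
        name = node[1]
        if name not in declared:
            raise TypeError(f"identifier {name!r} used before declaration")
        return declared[name]
    # Interior node: both operands must have the same type in this sketch.
    op, left, right = node
    left_type = check_types(left, declared)
    right_type = check_types(right, declared)
    if left_type != right_type:
        raise TypeError(f"cannot apply {op!r} to {left_type} and {right_type}")
    return left_type

declared = {"rate": "int", "name": "string"}
print(check_types(("*", ("ID", "rate"), ("NUM", "60")), declared))
```

Passing `("+", ("ID", "name"), ("NUM", "1"))` instead raises `TypeError`, which corresponds to the "adding a string to an integer" case.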
12
Intermediate Code Generation
After semantic analysis, the compiler generates intermediate code from the source program. This code represents a program for some abstract machine and lies between the high-level language and the machine language. The intermediate code should be generated in a way that makes it easy to translate into the target machine code.
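One common form of intermediate code is three-address code, where each instruction has at most one operator. The sketch below is only an illustration: the nested-tuple tree shape, the temporary names t1, t2, ... and the function `to_three_address` are assumptions, not something defined in these slides.

```python
def to_three_address(tree):
    """Translate a nested-tuple expression tree into three-address instructions.

    Returns (instructions, result_name). Temporaries t1, t2, ... are hypothetical names.
    """
    instructions = []
    counter = 0

    def visit(node):
        nonlocal counter
        if node[0] in ("ID", "NUM"):
            return node[1]                      # leaves are used directly as operands
        op, left, right = node
        left_name = visit(left)                 # generate code for the operands first
        right_name = visit(right)
        counter += 1
        temp = f"t{counter}"                    # fresh temporary for this result
        instructions.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = visit(tree)
    return instructions, result

tree = ("+", ("ID", "initial"), ("*", ("ID", "rate"), ("NUM", "60")))
code, result = to_three_address(tree)
for line in code:
    print(line)
print("position =", result)
```

The instructions come out in post-order, so the multiplication is emitted before the addition that uses its result.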
13
Code Optimization
The next phase optimizes the intermediate code. Optimization removes unnecessary code and rearranges the sequence of statements to speed up program execution without wasting resources (CPU, memory).
Code Generation
In this phase, the code generator takes the optimized intermediate representation and maps it to the target machine language. It translates the intermediate code into a sequence of (generally) re-locatable machine instructions that perform the same task as the intermediate code.
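As one concrete example of removing unnecessary work, constant folding evaluates operations on constants at compile time. This is a minimal sketch of just that one optimization, assuming a nested-tuple expression tree and integer constants; the function `fold_constants` and the operator table `OPS` are hypothetical names.

```python
import operator

# Operators this sketch knows how to evaluate at compile time (an assumption).
OPS = {"+": operator.add, "*": operator.mul}

def fold_constants(node):
    """Return an equivalent tree in which constant subexpressions are precomputed."""
    if node[0] in ("ID", "NUM"):
        return node                              # leaves cannot be simplified further
    op, left, right = node
    left = fold_constants(left)
    right = fold_constants(right)
    if op in OPS and left[0] == "NUM" and right[0] == "NUM":
        value = OPS[op](int(left[1]), int(right[1]))
        return ("NUM", str(value))               # replace the operation by its result
    return (op, left, right)

tree = ("+", ("ID", "x"), ("*", ("NUM", "2"), ("NUM", "30")))
print(fold_constants(tree))
```

After folding, the multiplication no longer has to be executed at run time; only the addition with `x` remains.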
14
Symbol Table
The symbol table is a data structure maintained throughout all phases of a compiler. The names of all identifiers, along with their types, are stored here. The symbol table lets the compiler quickly search for an identifier's record and retrieve it. It is also used for scope management.
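A symbol table with scope management can be sketched as a stack of hash tables, one per open scope. This is one possible design under stated assumptions, not the only one: the class `SymbolTable` and its method names are hypothetical, and each record stores only a type name.

```python
class SymbolTable:
    """A stack of dictionaries: one dictionary (hash table) per open scope."""

    def __init__(self):
        self.scopes = [{}]                # the global scope is always present

    def enter_scope(self):
        self.scopes.append({})            # e.g. on entering a block or procedure

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def declare(self, name, type_name):
        scope = self.scopes[-1]           # declarations go into the innermost scope
        if name in scope:
            raise KeyError(f"{name!r} is already declared in this scope")
        scope[name] = type_name

    def lookup(self, name):
        """Search from the innermost scope outward; return None if not found."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.declare("rate", "int")
table.enter_scope()
table.declare("rate", "float")            # shadows the outer declaration
print(table.lookup("rate"))
table.exit_scope()
print(table.lookup("rate"))
```

Searching the scopes from innermost to outermost is what makes an inner declaration hide an outer one with the same name.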
15
Error Handler
Each phase can encounter errors. After detecting an error, a phase must deal with it in some way so that compilation can proceed and further errors in the source program can be detected. The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler. The lexical phase can detect errors where the characters remaining in the input do not form any token of the language. The syntax analysis phase can detect errors where the token stream violates the syntax rules of the language.
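For the lexical phase, one simple way to "deal with the error so that compilation can proceed" is to record the offending character, skip it, and keep scanning. The sketch below assumes that recovery strategy; the token patterns and the function `scan_with_recovery` are hypothetical.

```python
import re

# Hypothetical patterns for numbers, identifiers and a few operators.
TOKEN = re.compile(r"(?P<NUM>\d+)|(?P<ID>[A-Za-z][A-Za-z0-9]*)|(?P<OP>:=|[+*])")

def scan_with_recovery(source):
    """Return (tokens, errors); errors lists (position, character) for unmatched input."""
    tokens, errors = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1                              # whitespace only separates lexemes
            continue
        match = TOKEN.match(source, pos)
        if match is None:
            errors.append((pos, source[pos]))     # report the error ...
            pos += 1                              # ... skip one character and continue
            continue
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens, errors

tokens, errors = scan_with_recovery("rate @ 60 $ x")
print(tokens)
print(errors)
```

Because scanning continues after the first bad character, both errors in the input are reported in a single pass.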
16
Compiler
A compiler is a program that takes a source program as input and produces an assembly language program as output.
Assembler
An assembler is a program that converts an assembly language program into a machine language program. It produces re-locatable machine code as its output.
Loader and link-editor
• The re-locatable machine code has to be linked together with other re-locatable object files and library files into the code that actually runs on the machine.
• The linker resolves external memory addresses, where the code in one file may refer to a location in another file.
• The loader places the linked executable object files into memory for execution.
18
Input Buffer
Because a large number of characters must be processed during the compilation of a large source program, and moving characters takes time, specialized buffering techniques have been developed to reduce the overhead required to process a single input character.
Two input-buffering schemes are useful when lookahead is necessary:
• Buffer pairs
• Sentinels
19
Buffer Pair
A buffer (array) is divided into two N-character halves, where N is the number of characters on one disk block (say, 100 characters each). ‘eof’ marks the end of the source file.
Two pointers are maintained:
• the lexeme-beginning pointer
• the forward pointer
Initially, both pointers point to the first character of the next lexeme to be found. The forward pointer scans ahead until a match for a pattern is found. Once the next lexeme is determined, it is processed and both pointers are set to the character immediately past the lexeme.
20
Sentinels
A sentinel is an extra character inserted at the end of the array. It is a special, dummy character that cannot be part of the source program. With buffer pairs alone, the code for advancing the forward pointer is:
21
Algorithm:
if forward is at the end of first half then begin
    reload second half
    forward = forward + 1
end
else if forward is at the end of second half then begin
    reload first half
    move forward to beginning of first half
end
else forward = forward + 1
Instead of this, we place an extra character, a sentinel, at the end of each half of the buffer. The sentinel is not part of the source program and works as ‘eof’. Now only one test is sufficient in the common case: whether the character at forward is ‘eof’ or not.
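The buffer-pair scheme with sentinels can be sketched as follows. This is a simplified illustration under stated assumptions: the class `TwoBufferReader` and its method names are hypothetical, the sentinel is assumed to be the NUL character, the half size n is kept tiny so that reloads are easy to follow, and only the forward pointer is modelled (the lexeme-beginning pointer is left out).

```python
import io

EOF = "\0"   # sentinel character; assumed never to occur in the source text

class TwoBufferReader:
    """Two n-character halves, each followed by one sentinel slot."""

    def __init__(self, stream, n=4):
        self.stream = stream
        self.n = n
        self.buf = [EOF] * (2 * n + 2)    # slots n and 2n+1 always hold sentinels
        self._load(0)                     # fill the first half
        self.forward = 0

    def _load(self, start):
        # Read up to n characters into the half beginning at `start`;
        # unused slots get the sentinel, which marks the real end of input.
        chunk = self.stream.read(self.n)
        for i in range(self.n):
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def next_char(self):
        """Return the next source character, or None at the real end of input."""
        ch = self.buf[self.forward]
        if ch != EOF:                     # common case: a single test per character
            self.forward += 1
            return ch
        if self.forward == self.n:        # sentinel at the end of the first half
            self._load(self.n + 1)
            self.forward = self.n + 1
            return self.next_char()
        if self.forward == 2 * self.n + 1:  # sentinel at the end of the second half
            self._load(0)
            self.forward = 0
            return self.next_char()
        return None                       # sentinel inside a half: end of the source

reader = TwoBufferReader(io.StringIO("position := initial + rate * 60"), n=4)
chars = []
ch = reader.next_char()
while ch is not None:
    chars.append(ch)
    ch = reader.next_char()
print("".join(chars))
```

The extra position checks run only when a sentinel is actually seen, so for ordinary characters the reader performs one comparison instead of the two end-of-half tests in the algorithm above.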