Compiler Construction Dr. Naveed Ejaz Lecture 5. Lexical Analysis.

Slides:



Advertisements
Similar presentations
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Advertisements

A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou.
From Cooper & Torczon1 The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source language?
Compilers and Language Translation
1 Lexical Analysis. 2 Structure of a Compiler Source Language Target Language Semantic Analyzer Syntax Analyzer Lexical Analyzer Front End Code Optimizer.
1 IMPLEMENTATION OF FINITE AUTOMAT IN CODE There are several ways to translate either a DFA or an NFA into code. Consider, again the example of a DFA that.
Compiler Construction Dr. Naveed Ejaz Lecture 2. 2 Two-pass Compiler Front End Back End source code IR machine code errors.
College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation.
CS 310 – Fall 2006 Pacific University CS310 Parsing with Context Free Grammars Today’s reference: Compilers: Principles, Techniques, and Tools by: Aho,
College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation.
Chapter 2 Chang Chi-Chung Lexical Analyzer The tasks of the lexical analyzer:  Remove white space and comments  Encode constants as tokens.
Lexical Analysis Recognize tokens and ignore white spaces, comments
MIT Top-Down Parsing Martin Rinard Laboratory for Computer Science Massachusetts Institute of Technology.
Automata and Regular Expression Discrete Mathematics and Its Applications Baojian Hua
1 Scanning Aaron Bloomfield CS 415 Fall Parsing & Scanning In real compilers the recognizer is split into two phases –Scanner: translate input.
Compiler1 Chapter V: Compiler Overview: r To study the design and operation of compiler for high-level programming languages. r Contents m Basic compiler.
INTRODUCTION TO COMPUTING CHAPTER NO. 06. Compilers and Language Translation Introduction The Compilation Process Phase 1 – Lexical Analysis Phase 2 –
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
CSC 338: Compiler design and implementation
1 Outline Informal sketch of lexical analysis –Identifies tokens in input string Issues in lexical analysis –Lookahead –Ambiguities Specifying lexers –Regular.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
1 Top Down Parsing. CS 412/413 Spring 2008Introduction to Compilers2 Outline Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form.
COMP 3438 – Part II - Lecture 2: Lexical Analysis (I) Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ. 1.
Lexical Analysis I Specifying Tokens Lecture 2 CS 4318/5531 Spring 2010 Apan Qasem Texas State University *some slides adopted from Cooper and Torczon.
Lexical and Syntax Analysis
Scanning & FLEX CPSC 388 Ellen Walker Hiram College.
Lexical Analysis: Regular Expressions CS 671 January 22, 2008.
CS412/413 Introduction to Compilers Radu Rugina Lecture 4: Lexical Analyzers 28 Jan 02.
CS30003: Compilers Lexical Analysis Lecture Date: 05/08/13 Submission By: DHANJIT DAS, 11CS10012.
TRANSITION DIAGRAM BASED LEXICAL ANALYZER and FINITE AUTOMATA Class date : 12 August, 2013 Prepared by : Karimgailiu R Panmei Roll no. : 11CS10020 GROUP.
1 Languages and Compilers (SProg og Oversættere) Lexical analysis.
Lexical Analysis: Finite Automata CS 471 September 5, 2007.
Review: Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Intermediate code generator Code optimizer Code generator Symbol.
CS375 Compilers Lexical Analysis 4 th February, 2010.
Comp 311 Principles of Programming Languages Lecture 3 Parsing Corky Cartwright August 28, 2009.
ITEC 320 Lecture 11 Application Part Deux Look Ahead.
Muhammad Idrees, Lecturer University of Lahore 1 Top-Down Parsing Top down parsing can be viewed as an attempt to find a leftmost derivation for an input.
Joey Paquet, 2000, Lecture 2 Lexical Analysis.
Syntax (2).
School of Computer Science & Information Technology G6DICP - Lecture 4 Variables, data types & decision making.
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
What am I? while b != 0 if a > b a := a − b else b := b − a return a AST == Abstract Syntax Tree.
The Role of Lexical Analyzer
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
1 January 18, January 18, 2016January 18, 2016January 18, 2016 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University Azusa.
Module 2 Compiler and their Working Software Construction Lecture 10,11 and 12.
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 2: Lexical Analysis.
Prof. Necula CS 164 Lecture 31 Lexical Analysis Lecture 3-4.
Lexical Analysis: Regular Expressions CS 471 September 3, 2007.
Compiler Construction CPCS302 Dr. Manal Abdulaziz.
More yacc. What is yacc – Tool to produce a parser given a grammar – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar.
Sudeshna Sarkar, IIT Kharagpur 1 Programming and Data Structure Sudeshna Sarkar Lecture 3.
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Compiler Chapter 4. Lexical Analysis Dept. of Computer Engineering, Hansung University, Sung-Dong Kim.
Lecture #12 Parsing Types.
Lecture 2 Lexical Analysis Joey Paquet, 2000, 2002, 2012.
Parsing with Context Free Grammars
Compiler Construction
Lexical analysis Jakub Yaghob
Syntax-Directed Definition
Lecture 8.
Review: Compiler Phases:
Lexical Analysis Lecture 3-4 Prof. Necula CS 164 Lecture 3.
CMPE 152: Compiler Design August 21/23 Lab
Compiler Structures 2. Lexical Analysis Objectives
Compiler Construction
Compiler Construction
Data Types Every variable has a given data type. The most common data types are: String - Text made up of numbers, letters and characters. Integer - Whole.
Presentation transcript:

Compiler Construction Dr. Naveed Ejaz Lecture 5

Lexical Analysis

3 Recall: Front-End  Output of lexical analysis is a stream of tokens scannerparser source code tokens IR errors

4 TokensTokens Example: if( i == j ) z = 0; else z = 1;

5 TokensTokens  Input is just a sequence of characters: if( \b i == j \n\t....

6 TokensTokens Goal:  partition input string into substrings  classify them according to their role

7 TokensTokens  A token is a syntactic category  Natural language: “He wrote the program”  Words: “He”, “wrote”, “the”, “program”

8 TokensTokens  Programming language: “if(b == 0) a = b”  Words: “if”, “(”, “b”, “==”, “0”, “)”, “a”, “=”, “b”

9 TokensTokens  Identifiers: x y11 maxsize  Keywords: if else while for  Integers: L  Floats: e5  Symbols: ( ) + * / { } ==  Strings: “enter x” “error”

10 Ad-hoc Lexer  Hand-write code to generate tokens.  Partition the input string by reading left-to-right, recognizing one token at a time

11 Ad-hoc Lexer  Look-ahead required to decide where one token ends and the next token begins.

12 Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead Lexer(Inputstream _s) { s = _s; next = s.read(); }

13 Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead Lexer(Inputstream _s) { s = _s; next = s.read(); }

14 Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead Lexer(Inputstream _s) { s = _s; next = s.read(); }

15 Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead Lexer(Inputstream _s) { s = _s; next = s.read(); }

16 Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead Lexer(Inputstream _s) { s = _s; next = s.read(); }

17 Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString();...

18 Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString();...

19 Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString();...

20 Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString();...

21 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

22 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

23 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

24 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

25 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

26 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

27 Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

28 Ad-hoc Lexer boolean idChar(char c) { if( isAlpha(c) ) return true; if( isDigit(c) ) return true; if( c == ‘_’ ) return true; return false; }

29 Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

30 Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

31 Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

32 Ad-hoc Lexer Problems:  Do not know what kind of token we are going to read from seeing first character.

33 Ad-hoc Lexer Problems:  If token begins with “i”, is it an identifier “i” or keyword “if”?  If token begins with “=”, is it “=” or “==”?

34 Ad-hoc Lexer  Need a more principled approach  Use lexer generator that generates efficient tokenizer automatically.