Compiler Construction

Slides:



Advertisements
Similar presentations
A question from last class: construct the predictive parsing table for this grammar: S->i E t S e S | i E t S | a E -> B.
Advertisements

Compiler Construction Sohail Aslam Lecture StackInput ¤0¤0 id – id  id $ s4 ¤0 id 4 – id  id $ r6 F → id ¤0F3¤0F3 – id  id $ r5 T → F ¤0T2¤0T2.
Control Flow Analysis (Chapter 7) Mooly Sagiv (with Contributions by Hanne Riis Nielson)
From Cooper & Torczon1 The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source language?
1 IMPLEMENTATION OF FINITE AUTOMAT IN CODE There are several ways to translate either a DFA or an NFA into code. Consider, again the example of a DFA that.
College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation.
1 Scanning Aaron Bloomfield CS 415 Fall Parsing & Scanning In real compilers the recognizer is split into two phases –Scanner: translate input.
CPSC 388 – Compiler Design and Construction Scanners – Finite State Automata.
Compiler1 Chapter V: Compiler Overview: r To study the design and operation of compiler for high-level programming languages. r Contents m Basic compiler.
1 Week 4 Questions / Concerns Comments about Lab1 What’s due: Lab1 check off this week (see schedule) Homework #3 due Wednesday (Define grammar for your.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
1 Top Down Parsing. CS 412/413 Spring 2008Introduction to Compilers2 Outline Top-down parsing SLL(1) grammars Transforming a grammar into SLL(1) form.
Lexical Analysis (I) Compiler Baojian Hua
COMP 3438 – Part II - Lecture 2: Lexical Analysis (I) Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ. 1.
Lexical Analysis I Specifying Tokens Lecture 2 CS 4318/5531 Spring 2010 Apan Qasem Texas State University *some slides adopted from Cooper and Torczon.
Lexical Analysis: Regular Expressions CS 671 January 22, 2008.
1 Languages and Compilers (SProg og Oversættere) Lexical analysis.
Lexical Analysis: Finite Automata CS 471 September 5, 2007.
Review: Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Intermediate code generator Code optimizer Code generator Symbol.
CS375 Compilers Lexical Analysis 4 th February, 2010.
Compiler Construction Dr. Naveed Ejaz Lecture 5. Lexical Analysis.
Introduction Lecture 1 Wed, Jan 12, The Stages of Compilation Lexical analysis. Syntactic analysis. Semantic analysis. Intermediate code generation.
ITEC 320 Lecture 11 Application Part Deux Look Ahead.
YACC. Introduction What is YACC ? a tool for automatically generating a parser given a grammar written in a yacc specification (.y file) YACC (Yet Another.
Muhammad Idrees, Lecturer University of Lahore 1 Top-Down Parsing Top down parsing can be viewed as an attempt to find a leftmost derivation for an input.
Joey Paquet, 2000, Lecture 2 Lexical Analysis.
Compiler Construction Sohail Aslam Lecture 9. 2 DFA Minimization  The generated DFA may have a large number of states.  Hopcroft’s algorithm: minimizes.
Syntax (2).
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
Lexical Analysis - Scanner- Contd Computer Science Rensselaer Polytechnic Compiler Design Lecture 3(01/21/98)
CS412/413 Introduction to Compilers and Translators Spring ’99 Lecture 2: Lexical Analysis.
Lexical Analysis: Regular Expressions CS 471 September 3, 2007.
More yacc. What is yacc – Tool to produce a parser given a grammar – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar.
CC410: System Programming Dr. Manal Helal – Fall 2014 – Lecture 12–Compilers.
Compiler Chapter 4. Lexical Analysis Dept. of Computer Engineering, Hansung University, Sung-Dong Kim.
CS 3304 Comparative Languages
Lexical and Syntax Analysis
Chapter 3 Lexical Analysis.
Tutorial On Lex & Yacc.
Lecture #12 Parsing Types.
Lecture 2 Lexical Analysis Joey Paquet, 2000, 2002, 2012.
Chapter 2 Scanning – Part 1 June 10, 2018 Prof. Abdelaziz Khamis.
Finite-State Machines (FSMs)
Overview of Compilation The Compiler Front End
Overview of Compilation The Compiler Front End
PROGRAMMING LANGUAGES
Finite-State Machines (FSMs)
Parsing with Context Free Grammars
Compiler Construction
Compiler Construction
CSE 3302 Programming Languages
Lexical Analysis - An Introduction
Review: Compiler Phases:
Lecture 5: Lexical Analysis III: The final bits
Compilers B V Sai Aravind (11CS10008).
CS 3304 Comparative Languages
Lecture 4: Lexical Analysis & Chomsky Hierarchy
CMPE 152: Compiler Design August 21/23 Lab
CS 3304 Comparative Languages
Lexical Analysis - An Introduction
Compiler Structures 2. Lexical Analysis Objectives
High-Level Programming Language
Compiler Construction
Compiler Construction
Chapter 10: Compilers and Language Translation
Compiler Construction
Faculty of Computer Science and Information System
Presentation transcript:

Compiler Construction Sohail Aslam Lecture 5 compiler: intro

Lexical Analysis

Recall: Front-End Output of lexical analysis is a stream of tokens source code IR scanner parser errors Output of lexical analysis is a stream of tokens

Tokens Example: if( i == j ) z = 0; else z = 1;

Tokens Input is just a sequence of characters: i f ( \b = j \n \t ....

Tokens Goal: partition input string into substrings classify them according to their role

Tokens A token is a syntactic category Natural language: “He wrote the program” Words: “He”, “wrote”, “the”, “program”

Tokens Programming language: “if(b == 0) a = b” Words: “if”, “(”, “b”, “==”, “0”, “)”, “a”, “=”, “b”

Tokens Identifiers: x y11 maxsize Keywords: if else while for Integers: 2 1000 -44 5L Floats: 2.0 0.0034 1e5 Symbols: ( ) + * / { } < > == Strings: “enter x” “error”

Ad-hoc Lexer Hand-write code to generate tokens. Partition the input string by reading left-to-right, recognizing one token at a time

Ad-hoc Lexer Look-ahead required to decide where one token ends and the next token begins.

Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead s = _s; next = s.read(); }

Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead s = _s; next = s.read(); }

Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead s = _s; next = s.read(); }

Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead s = _s; next = s.read(); }

Ad-hoc Lexer class Lexer { Inputstream s; char next;//look ahead s = _s; next = s.read(); }

Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString(); ...

Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString(); ...

Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString(); ...

Ad-hoc Lexer Token nextToken() { if( idChar(next) ) return readId(); if( number(next) ) return readNumber(); if( next == ‘”’ ) return readString(); ...

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer Token readId() { string id = “”; while(true){ char c = input.read(); if(idChar(c) == false) return new Token(TID,id); id = id + string(c); }

Ad-hoc Lexer boolean idChar(char c) { if( isAlpha(c) ) return true; if( isDigit(c) ) if( c == ‘_’ ) return false; }

Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

Ad-hoc Lexer Token readNumber(){ string num = “”; while(true){ next = input.read(); if( !isNumber(next)) return new Token(TNUM,num); num = num+string(next); }

Ad-hoc Lexer Problems: Do not know what kind of token we are going to read from seeing first character.

Ad-hoc Lexer Problems: If token begins with “i”, is it an identifier “i” or keyword “if”? If token begins with “=”, is it “=” or “==”?

Ad-hoc Lexer Need a more principled approach Use lexer generator that generates efficient tokenizer automatically. end of lecture 5 compiler: intro