1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.

Slides:



Advertisements
Similar presentations
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Advertisements

Lex -- a Lexical Analyzer Generator (by M.E. Lesk and Eric. Schmidt) –Given tokens specified as regular expressions, Lex automatically generates a routine.
COMP-421 Compiler Design Presented by Dr Ioanna Dionysiou.
176 Formal Languages and Applications: We know that Pascal programming language is defined in terms of a CFG. All the other programming languages are context-free.
College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation.
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Lexical Analysis The Scanner Scanner 1. Introduction A scanner, sometimes called a lexical analyzer A scanner : – gets a stream of characters (source.
Scanner 1. Introduction A scanner, sometimes called a lexical analyzer A scanner : – gets a stream of characters (source program) – divides it into tokens.
Topic #3: Lexical Analysis
CPSC 388 – Compiler Design and Construction Scanners – Finite State Automata.
Lexical Analysis Natawut Nupairoj, Ph.D.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
Lexical Analysis - An Introduction. The Front End The purpose of the front end is to deal with the input language Perform a membership test: code  source.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
REGULAR EXPRESSIONS. Lexical Analysis Lexical analysers can be constructed by programs such as LEX These programs employ as input a description of the.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
Lecture # 3 Chapter #3: Lexical Analysis. Role of Lexical Analyzer It is the first phase of compiler Its main task is to read the input characters and.
Review: Regular expression: –How do we define it? Given an alphabet, Base case: – is a regular expression that denote { }, the set that contains the empty.
COMP 3438 – Part II - Lecture 2: Lexical Analysis (I) Dr. Zili Shao Department of Computing The Hong Kong Polytechnic Univ. 1.
Lexical Analysis I Specifying Tokens Lecture 2 CS 4318/5531 Spring 2010 Apan Qasem Texas State University *some slides adopted from Cooper and Torczon.
COMP313A Programming Languages Lexical Analysis. Lecture Outline Lexical Analysis The language of Lexical Analysis Regular Expressions.
Scanning & FLEX CPSC 388 Ellen Walker Hiram College.
___________________________________________ COMPILER Theory___________________________________________ Fourth Year (First Semester) Dr. Hamdy M. Mousa.
CS30003: Compilers Lexical Analysis Lecture Date: 05/08/13 Submission By: DHANJIT DAS, 11CS10012.
1 November 1, November 1, 2015November 1, 2015November 1, 2015 Azusa, CA Sheldon X. Liang Ph. D. Computer Science at Azusa Pacific University Azusa.
Lexical Analyzer in Perspective
Regular Expressions CIS 361. Need finite descriptions of infinite sets of strings. Discover and specify “regularity”. The set of languages over a finite.
Review: Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Intermediate code generator Code optimizer Code generator Symbol.
JLex Lecture 4 Mon, Jan 24, JLex JLex is a lexical analyzer generator in Java. It is based on the well-known lex, which is a lexical analyzer generator.
CPS 506 Comparative Programming Languages Syntax Specification.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
ICS312 LEX Set 25. LEX Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the C program.
Lexical Analysis S. M. Farhad. Input Buffering Speedup the reading the source program Look one or more characters beyond the next lexeme There are many.
Overview of Previous Lesson(s) Over View  Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
Scanner Introduction to Compilers 1 Scanner.
Compiler Construction By: Muhammad Nadeem Edited By: M. Bilal Qureshi.
Lexical Analysis (Scanning) Lexical Analysis (Scanning)
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Chapter 2 Scanning. Dr.Manal AbdulazizCS463 Ch22 The Scanning Process Lexical analysis or scanning has the task of reading the source program as a file.
using Deterministic Finite Automata & Nondeterministic Finite Automata
Overview of Previous Lesson(s) Over View  A token is a pair consisting of a token name and an optional attribute value.  A pattern is a description.
Scanning & Regular Expressions CPSC 388 Ellen Walker Hiram College.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture 1 Ahmed Ezzat.
Deterministic Finite Automata Nondeterministic Finite Automata.
ICS611 Lex Set 3. Lex and Yacc Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the.
June 13, 2016 Prof. Abdelaziz Khamis 1 Chapter 2 Scanning – Part 2.
LEX SUNG-DONG KIM, DEPT. OF COMPUTER ENGINEERING, HANSUNG UNIVERSITY.
CS 3304 Comparative Languages
Lexical Analyzer in Perspective
CS510 Compiler Lecture 2.
Scanner Scanner Introduction to Compilers.
Chapter 3 Lexical Analysis.
Chapter 2 Scanning – Part 1 June 10, 2018 Prof. Abdelaziz Khamis.
Compiler Construction (CS-636)
COMPILER CONSTRUCTION
PROGRAMMING LANGUAGES
Compiler Construction
JLex Lecture 4 Mon, Jan 26, 2004.
Review: Compiler Phases:
Recognition of Tokens.
CS 3304 Comparative Languages
Scanner Scanner Introduction to Compilers.
CS 3304 Comparative Languages
Scanner Scanner Introduction to Compilers.
Lexical Analysis - An Introduction
COMPILERS LECTURE(6-Aug-13)
Scanner Scanner Introduction to Compilers.
Scanner Scanner Introduction to Compilers.
Scanner Scanner Introduction to Compilers.
Presentation transcript:

1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi

Outline 1. The Scanning Process 1. The Function of the Scanner 2. The Categories of Tokens 3. Relation between Tokens and its String 4. Some Practical Issues of the Scanner 2. Regular Expressions 3. Course Project 4. Summary 2

Lexical Analysis (Scanning) Lecture: 3-4 3

The Function of the Scanner To read characters from the source code and form them into logical units called Tokens Tokens are logical entries that are usually defined as an enumerated type;  For example in C, we may define it as; Typedef enum {IF, THEN, ELSE, PLUS, NUM, ID,…} TokenType; 4

The Categories of the Tokens RESERVED WORDS  Such as IF and THEN, which represent the strings of characters “if” and “then” SPECIAL SYMBOLS  Such as PLUS and MINUS, which represent the characters “+” and “-“ OTHER TOKENS  Such as NUM and ID, which represent numbers and identifiers 5

Relationship between Tokens and its String Token string is called STRING VALUE or LEXEME Some tokens have only one lexeme, such as reserved words A token may have infinitely many lexemes, such as the token ID. Any value associated to a token is called an attributes of a token  A NUM token may have a string value such as “32767” and actual value  A PLUS token has the string value “+” as well as arithmetic operation + 6

Relationship between Tokens and its String (Continue…) The token can be viewed as the collection of all of its attributes  Only need to compute as many attributes as necessary to allow further processing  The numeric value of a NUM token need not compute immediately 7

Some Practical Issues of Scanner One structured data type to collect all the attributes of a token, called a token record The scanner returns the token value only and places the other attributes in variables Scanner may not scan all source code at once a[index] = a[index]=4+2 a[index]=4+2

Regular Expressions Regular expressions  represent patterns of strings of characters. A regular expression r  completely defined by the set of strings it matches.  The set is called the language of r written as L(r) The set elements  referred to as symbols This set of legal symbols  called the alphabet and written as the Greek symbol ∑ 9

Regular Expressions (Continue…) A regular expression r  contains characters from the alphabet, indicating patterns, such a is the character a used as a pattern A regular expression r  may contain special characters called meta-characters or meta-symbols An escape character can be used to turn off the special meaning of a meta-character.  Such as backslash and quotes 10

Regular Expression Operations Choice among alternatives, indicated by the meta-character: + Concatenation, indicated by juxtaposition without having any meta-character: ab Repetition or “closure”, indicated by the meta-character: * 11

Regular Expression Examples Example 1:  ∑={ a,b,c}  the set of all strings over this alphabet that contain exactly one b.  (a|c)*b(a|c)* Write a regular expression of language having words that end with ‘ab’  ∑={ a,b} 12

Regular Expression Examples (Continue…) Write a regular expression that starts with an ‘a’ and ends with a ‘b’ 13

Regular Expression Examples (Continue…) Write a regular expression for language of all works containing exactly one ‘a’ 14

Regular Expression Examples (Continue…) Write a regular expression for language of all words containing all ‘a’s or ‘b’s 15

Regular Expression Examples (Continue…) Write a regular expression for language of all words containing at least one ‘b’ 16

Regular Expression Examples (Continue…) Write a regular expression for language of all words containing at most one ‘b’ 17

Regular Expression Examples (Continue…) Write a regular expression for language of all words that starts and ends with different letters 18

Regular Expression Examples (Continue…) Write a regular expression for language of all words that contains odd number of ‘a’s 19

Course Project Compiler Construction 20

Course Project Devise Groups Project should be implemented in Java Project should be implemented for Java programming language Project should be a GUI based program accepting input of the code from a text file and as an output it should display all the tokens in the code and their description The first deliverable is “Lexical Analyzer’ and its demo will be conducted in 7 th week 21

22 Summary Any Questions?