College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation.

Presentation transcript:

College of Computer Science & Technology Compiler Construction Principles & Implementation Techniques -1- Compiler Construction Principles & Implementation Techniques Dr. Ying JIN Associate Professor Oct. 2007

What we have already introduced
- What is this course about?
- What is a compiler?
- The ways to design and implement a compiler
- General functional components of a compiler
- The general translating process of a compiler

What will be introduced: Scanning
- General information about a scanner (scanning)
  - Functional requirements (input, output, main function)
  - Data structures
- General techniques in developing a scanner
  - How to define the lexemes of a programming language?
  - How to implement a scanner with respect to the definition of lexemes?

What will be introduced: Scanning
- General techniques in developing a scanner
  - Two formal languages
    - Regular expressions
    - Finite automata (DFA, NFA)
  - Three algorithms
    - From regular expression to NFA
    - From NFA to DFA
    - Minimizing DFA
  - One implementation
    - Implementing DFA to get a scanner

Outline
2.1 Overview
  - General Function of a Scanner
  - Some Issues about Scanning
2.2 Finite Automata
  - Definition and Implementation of DFA
  - Non-Deterministic Finite Automata
  - Transforming NFA into DFA
  - Minimizing DFA
2.3 Regular Expressions
  - Definition of Regular Expressions
  - Regular Definition
  - From Regular Expression to DFA
2.4 Design and Implementation of a Scanner
  - Developing a Scanner from DFA
  - A Scanner Generator: Lex

Knowledge Relation Graph
[Figure: developing a scanner is based on the lexical definition, which is written using regular expressions. Regular Expression -> NFA -> (transforming) DFA -> (minimizing) minimized DFA -> (implement) scanner.]

2.1 Overview
- Status of scanning in a compiler
- Token
- General function of scanning
  - Input
  - Output
  - Functional description
- Some issues about scanning
  - Data structure
  - Blank/tab, return, newline, comments
  - Lexical errors

Status of Scanning in a Compiler
[Figure: compiler phases. Lexical Analysis (scanning), Syntax Analysis (parsing), Semantic Analysis, Intermediate Code Generation, Intermediate Code Optimization, Target Code Generation; the phases are grouped into analysis/front end and synthesis/back end.]

Token (the internal representation of a word)
- The minimal semantic unit in a programming language
- A token includes two parts
  - Token type
  - Attributes (semantic information)
- Example:

Two Forms of a Scanner
- Independent
  - The scanner is independent of the other parts of the compiler
  - Output: a sequence of tokens
- Attached
  - Acts as an auxiliary function of the parser
  - Returns one token each time it is called by the parser

Scanner: Two Forms
[Figure: Independent form: CharList -> Independent Scanner -> TokenList -> Parser. Attached form: the Parser calls the Attached Scanner, which reads the CharList and returns a Token.]

General Function of Scanning
- Input
  - Source program
- Output
  - Sequence of tokens / one token
- Functional description (similar to spelling)
  - Reads the source program
  - Recognizes words one by one according to the lexical definition of the source language
  - Builds the internal representation of words: tokens
  - Checks for lexical errors
  - Returns the sequence of tokens / one token

Some Issues about Scanning

Tokens
- Source program
  - Stream of characters
- Token
  - A sequence of characters that can be treated as a unit in the grammar of a programming language
  - Token types
    - Identifier: x, y1, ……
    - Number: 12, 12.3
    - Keywords (reserved words): int, real, main
    - Operator: +, -, *, /, >, <, ……
    - Delimiter: ;, {, }, ……
  - Each can be treated as one type!

Tokens: the content of a token
- A token has two fields: token type | semantic information
- Semantic information (attributes)
  - Identifier: the string
  - Number: the value
  - Keywords (reserved words): the number in the keyword table
  - Operator: itself
  - Delimiter: itself

Token: Data Structure
- Token type: enum
- One token: struct
- Sequence of tokens: list
- Please define these data structures in C!
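One possible answer to the exercise above, as a minimal sketch rather than the course's reference solution. The names (TokenType, Token, TokenList, token_new, token_list_append, token_list_free), the 32-character text buffer, and the choice of a singly linked list are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Token type: one enumerator per token class named on the slides. */
typedef enum {
    TOK_IDENT, TOK_NUMBER, TOK_KEYWORD, TOK_OPERATOR, TOK_DELIMITER
} TokenType;

/* One token: its type plus the semantic information (attributes). */
typedef struct Token {
    TokenType type;
    char text[32];       /* identifier string, or the operator/delimiter itself
                            (buffer size is an assumption) */
    double value;        /* number: its value */
    int keyword_index;   /* keyword: index in the keyword table, else -1 */
    int line;            /* source line, useful for error messages */
    struct Token *next;  /* link to the next token in the sequence */
} Token;

/* Sequence of tokens: a singly linked list with a tail pointer. */
typedef struct {
    Token *head;
    Token *tail;
    size_t count;
} TokenList;

/* Allocates a token; returns NULL if memory is exhausted. */
Token *token_new(TokenType type, const char *text, int line) {
    Token *t = calloc(1, sizeof *t);
    if (t == NULL) return NULL;
    t->type = type;
    strncpy(t->text, text, sizeof t->text - 1); /* calloc keeps it terminated */
    t->keyword_index = -1;
    t->line = line;
    return t;
}

/* Appends a token to the end of the list (ignores NULL). */
void token_list_append(TokenList *list, Token *t) {
    if (t == NULL) return;
    t->next = NULL;
    if (list->tail != NULL) list->tail->next = t;
    else list->head = t;
    list->tail = t;
    list->count++;
}

/* Frees every token and resets the list to empty. */
void token_list_free(TokenList *list) {
    Token *t = list->head;
    while (t != NULL) {
        Token *next = t->next;
        free(t);
        t = next;
    }
    list->head = list->tail = NULL;
    list->count = 0;
}
```

An independent scanner would append every recognized token to a TokenList, while an attached scanner could return one Token per call and skip the list entirely.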

Keywords
- Keywords
  - Words that have a special meaning
  - Cannot be used for other meanings
- Reserved words
  - Words that are reserved by a programming language for a special meaning
  - Can be used for another meaning, which overloads the previous meaning
- Keyword table
  - Records all keywords defined by the source programming language
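A keyword table can be sketched as a constant array plus a lookup function. The keyword set below is an assumption assembled from words that appear elsewhere in these slides; a real language would supply its own list, and the name keyword_lookup is hypothetical.

```c
#include <string.h>

/* Keyword table (assumed contents, taken from words used in the slides). */
static const char *const KEYWORDS[] = {
    "var", "int", "real", "read", "write", "if", "then", "else"
};
enum { KEYWORD_COUNT = sizeof KEYWORDS / sizeof KEYWORDS[0] };

/* Returns the word's index in the keyword table, or -1 if the word is not
   a keyword and should be treated as an ordinary identifier. */
int keyword_lookup(const char *word) {
    for (int i = 0; i < KEYWORD_COUNT; i++) {
        if (strcmp(KEYWORDS[i], word) == 0) return i;
    }
    return -1;
}
```

The returned index is the "number in the keyword table" that the slides name as the semantic information of a keyword token. A linear search is enough for a table this small; a sorted table with binary search or a hash table would be the usual choice for larger ones.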

Sample Source Program
    var
    int x, y;
    { read(x);
      read(y);
      if x>y then write(1);
      else write(0);
    }

Sequence of Tokens
    [var, k]
    [int, k] [x, ide] , [y, ide] ;
    { [read, k] ( [x, ide] ) ;
    [read, k] ( [y, ide] ) ;
    [if, k] [x, ide] > [y, ide] [then, k] [write, k] ( [1, num] ) ;
    [else, k] [write, k] ( [0, num] ) ;
    }
(k = keyword, ide = identifier, num = number)

Blank, tab, newline, comments
- No semantic meaning
- Only for readability
- Can be removed
- Line numbers should still be counted
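The points above can be sketched as a helper that discards blanks and comments while still counting lines. The slides do not say what comments look like in the source language, so the C-style /* ... */ form is an assumption, as is the function name.

```c
#include <stddef.h>

/* Advances *pos past blanks, tabs, newlines and comments in src.
   *line is incremented for every newline that is skipped, so later error
   messages can still report the right line number.
   Assumption: comments are written as C-style block comments. */
void skip_blanks_and_comments(const char *src, size_t *pos, int *line) {
    for (;;) {
        char c = src[*pos];
        if (c == ' ' || c == '\t' || c == '\r') {
            (*pos)++;
        } else if (c == '\n') {
            (*line)++;
            (*pos)++;
        } else if (c == '/' && src[*pos + 1] == '*') {
            *pos += 2;                       /* skip the opening marker */
            while (src[*pos] != '\0' &&
                   !(src[*pos] == '*' && src[*pos + 1] == '/')) {
                if (src[*pos] == '\n') (*line)++;
                (*pos)++;
            }
            if (src[*pos] != '\0') *pos += 2; /* skip the closing marker */
        } else {
            return;                          /* start of a real token, or end */
        }
    }
}
```

A scanner would typically call such a helper before recognizing each token. An unterminated comment simply runs to the end of the input here; reporting it as a lexical error would be a reasonable extension.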

The End of Scanning
- Two possible situations
  - The character representing the end of the program has been read (PASCAL: '.')
  - The end of the source program file has been reached

Lexical Errors
- Only limited types of errors can be found during scanning
  - Illegal character: §, ←
  - The first character is wrong: "/abc"
- Lexical error recovery
  - Once a lexical error has been found, the scanner does not stop; it takes some measures to continue scanning
  - Ignore the current character and start from the next character
  - Example: if § a then x = 12.else ……

To Develop a Scanner
- Now we know what the function of a scanner is
- How to implement a scanner?
  - Basis: the lexical rules of the source programming language
    - The set of allowed characters
    - What kinds of tokens does it have?
    - The structure of each token type
  - The scanner is developed according to the lexical rules

How to define the lexical structure of a programming language?
1. Natural language (English, Chinese, ……)
   - Easy to write; ambiguous and hard to implement
2. Formal languages
   - Need some special background knowledge
   - Concise, easy to implement
   - Automation

Two formal languages for defining the lexical structure of a programming language
- Finite automata
  - Non-deterministic finite automata (NFA)
  - Deterministic finite automata (DFA)
- Regular expressions (RE)
- Both can formally describe the lexical structure
  - The set of allowed words
- They are equivalent
- FA is easy to implement; RE is easy to define
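To illustrate why a finite automaton is easy to implement, here is a minimal table-driven sketch of a DFA for identifiers. The identifier rule, written as the regular expression letter (letter | digit)*, is an assumption based on the examples x and y1; the state names, character classes, and function names are hypothetical.

```c
#include <ctype.h>

/* States and input classes of the DFA for: letter (letter | digit)* */
enum { S_START, S_IDENT, S_ERROR, STATE_COUNT };
enum { C_LETTER, C_DIGIT, C_OTHER, CLASS_COUNT };

/* Transition table: TRANSITION[state][class] gives the next state. */
static const int TRANSITION[STATE_COUNT][CLASS_COUNT] = {
    /* S_START */ { S_IDENT, S_ERROR, S_ERROR },
    /* S_IDENT */ { S_IDENT, S_IDENT, S_ERROR },
    /* S_ERROR */ { S_ERROR, S_ERROR, S_ERROR },
};

/* Maps a character to its input class. */
static int char_class(char ch) {
    unsigned char c = (unsigned char)ch;
    if (isalpha(c)) return C_LETTER;
    if (isdigit(c)) return C_DIGIT;
    return C_OTHER;
}

/* Runs the DFA over the whole word; S_IDENT is the only accepting state.
   Returns 1 if the word is a valid identifier, 0 otherwise. */
int dfa_accepts_identifier(const char *word) {
    int state = S_START;
    for (; *word != '\0'; word++) {
        state = TRANSITION[state][char_class(*word)];
    }
    return state == S_IDENT;
}
```

The regular expression is the compact definition; the table and the loop are its direct implementation. Recognizing a different token type only changes the table, not the loop.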

Summary
- What is the main function of scanning (a scanner)?
- What are the two forms of a scanner?
- What is a token?
- What is the content of a token?
- What is the semantic information for different token types?
- What is a lexical error?
- What are the two key problems in the design and implementation of a scanner?

Reading Assignment (2)
- Topic: different methods for dealing with lexical errors
- Objectives
  - Try to find out different ways to cope with lexical errors
- Tips
  - There are different schemes (error correction; error repair; error recovery; ……)
  - Google some research papers
  - Treat it as a survey
  - Challenging!