CSC 3315 (Spring 2009): Lexical and Syntax Analysis
Hamid Harroud, School of Science and Engineering, Akhawayn University



Lexical Analysis
- Convert source file characters into a token stream.
- Remove content-free characters (comments, whitespace, ...).
- Detect lexical errors (badly formed literals, illegal characters, ...).
- The output of lexical analysis is the input to syntax analysis.
Idea: look for patterns in the input character sequence, convert them to tokens with attributes, and pass them to the parser as a stream.

Lexical Analysis Example

Specifying Lexical Analyzers
A lexical analyzer can be defined by a list of pairs (regular expression, action), where the regular expression describes a token pattern and the action is a piece of code, parameterized by the matching lexeme, that returns a (token, attribute) pair.
Example:
  (digit+, {return new Token(NUM, parseInt(lexeme));})
  (alpha(alpha|digit)*, {return new Token(ID, lexeme);})
  (space|tab|newline, {})
  (.,.)
So regular expressions can help us specify scanners.
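The (regular expression, action) list above can be sketched in Python roughly as follows. This is a minimal illustration, not the course's implementation: the `Token` type, the concrete character classes, the longest-match rule, and the use of `SyntaxError` for illegal characters are all assumptions, and the final `(.,.)` pair is not modeled.

```python
import re
from typing import Callable, NamedTuple, Optional


class Token(NamedTuple):
    kind: str          # token name, e.g. "NUM" or "ID"
    attribute: object  # attribute value computed by the action


# Each rule pairs a regular expression with an action. The action receives
# the matched lexeme and returns a Token, or None to discard the lexeme.
RULES: list[tuple[re.Pattern, Callable[[str], Optional[Token]]]] = [
    (re.compile(r"[0-9]+"), lambda lexeme: Token("NUM", int(lexeme))),
    (re.compile(r"[A-Za-z][A-Za-z0-9]*"), lambda lexeme: Token("ID", lexeme)),
    (re.compile(r"[ \t\n]"), lambda lexeme: None),  # whitespace: empty action
]


def tokenize(source: str) -> list[Token]:
    tokens = []
    pos = 0
    while pos < len(source):
        # Assumption: try every rule at the current position and keep the
        # longest match; ties go to the rule listed first.
        best_match, best_action = None, None
        for pattern, action in RULES:
            m = pattern.match(source, pos)
            if m and (best_match is None or m.end() > best_match.end()):
                best_match, best_action = m, action
        if best_match is None:
            # No rule applies: report a lexical error.
            raise SyntaxError(f"illegal character {source[pos]!r} at offset {pos}")
        token = best_action(best_match.group())
        if token is not None:
            tokens.append(token)
        pos = best_match.end()
    return tokens
```

For instance, `tokenize("count 42")` yields an `ID` token carrying the lexeme and a `NUM` token carrying the integer value, while the space between them is dropped by the empty action.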

Regular Expressions
A regular expression (R.E.) is a concise formal characterization of a regular language.
Example: the regular language containing all IDENTs is described by the regular expression letter (letter | digit)*, where “|” means “or” and “e*” means “zero or more copies of e”.
Regular languages are one particular kind of formal language.
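As a quick check of the IDENT expression, the sketch below writes letter (letter | digit)* with Python's `re` module. The character classes (ASCII letters and digits) and the helper name `is_ident` are assumptions made for illustration.

```python
import re

# letter (letter | digit)*, assuming letter = A-Z or a-z and digit = 0-9
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_ident(s: str) -> bool:
    # fullmatch requires the whole string to belong to the language,
    # not just a prefix of it.
    return IDENT.fullmatch(s) is not None
```

Under these assumptions `is_ident("x25")` holds, while `is_ident("2x")` and `is_ident("")` do not, since an identifier must start with a letter.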

Finite Automaton (figure): Input String → Finite Automaton → Output “Accept” or “Reject”

Transition Graph (figure labels: initial state, accepting state, transition)

Initial Configuration (figure: input string)

Reading the Input (sequence of figures; last frame: input finished, accept)

Rejection (sequence of figures; last frame: input finished, reject)

Another Rejection (figures; last frame: reject)

Another Example (sequence of figures; last frame: accept)

Rejection Example (sequence of figures; last frame: input finished, reject)

Languages Accepted by FAs
For an FA $M$, the language $L(M)$ contains all input strings accepted by $M$:
$L(M)$ = { strings that bring $M$ to an accepting state }

Example (two figures, each marked accept)

Formal Definition
Finite Automaton (FA): $M = (Q, \Sigma, \delta, q_0, F)$
$Q$: set of states
$\Sigma$: input alphabet
$\delta$: transition function
$q_0$: initial state
$F$: set of accepting states
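The five components of the formal definition map directly onto a small data structure. The sketch below is one possible encoding, not the one used in the course: the class name `DFA`, the dictionary representation of the transition function, and the example automaton `EVEN_A` (strings over {a, b} with an even number of a's) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    states: set       # set of states
    alphabet: set     # input alphabet
    delta: dict       # transition function: (state, symbol) -> state
    start: str        # initial state
    accepting: set    # set of accepting states

    def run(self, word: str) -> str:
        """Extended transition function: the state reached from the
        initial state after reading the whole word."""
        state = self.start
        for symbol in word:
            if symbol not in self.alphabet:
                raise ValueError(f"symbol {symbol!r} is not in the alphabet")
            state = self.delta[(state, symbol)]
        return state

    def accepts(self, word: str) -> bool:
        # A word is accepted when reading it ends in an accepting state.
        return self.run(word) in self.accepting


# Hypothetical example: strings over {a, b} with an even number of a's.
EVEN_A = DFA(
    states={"even", "odd"},
    alphabet={"a", "b"},
    delta={
        ("even", "a"): "odd", ("even", "b"): "even",
        ("odd", "a"): "even", ("odd", "b"): "odd",
    },
    start="even",
    accepting={"even"},
)
```

Here `EVEN_A.accepts("abab")` holds because the word contains two a's, and `EVEN_A.run("a")` returns the state `"odd"`.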

Input Alphabet (figure)

Set of States (figure)

Initial State (figure)

Set of Accepting States (figure)

Transition Function (sequence of figures)

Extended Transition Function (figures)

Example: accepted language = { all strings with a given prefix } (figure; marked accept)

Example: accepted language = { all strings without a given substring } (figure)
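The concrete prefix and substring used on these two slides did not survive in the transcript, so the sketch below picks its own: prefix `ab` over {a, b} and forbidden substring `001` over {0, 1}. Both choices, the state names, and the helper `dfa_accepts` are assumptions for illustration only.

```python
def dfa_accepts(delta, start, accepting, word):
    """Run a DFA given as a (state, symbol) -> state dictionary."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


# Assumed example 1: all strings over {a, b} with prefix "ab".
# "trap" is a non-accepting state that can never be left.
PREFIX_AB = {
    ("q0", "a"): "q1", ("q0", "b"): "trap",
    ("q1", "a"): "trap", ("q1", "b"): "q2",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
    ("trap", "a"): "trap", ("trap", "b"): "trap",
}
PREFIX_AB_ACCEPTING = {"q2"}

# Assumed example 2: all strings over {0, 1} without substring "001".
# States remember how much of "001" has just been read; "dead" means
# the substring has occurred, so the string can no longer be accepted.
NO_001 = {
    ("s", "0"): "s0", ("s", "1"): "s",
    ("s0", "0"): "s00", ("s0", "1"): "s",
    ("s00", "0"): "s00", ("s00", "1"): "dead",
    ("dead", "0"): "dead", ("dead", "1"): "dead",
}
NO_001_ACCEPTING = {"s", "s0", "s00"}
```

With these tables, `dfa_accepts(PREFIX_AB, "q0", PREFIX_AB_ACCEPTING, "abba")` holds, while `dfa_accepts(NO_001, "s", NO_001_ACCEPTING, "1001")` does not, because the word contains `001`.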

Example (figure)

Regular Languages
Definition: A language $L$ is regular if there is an FA $M$ such that $L = L(M)$.
Observation: All languages accepted by FAs form the family of regular languages.

Examples of Regular Languages
{ all strings with a given prefix }
{ all strings without a given substring }
There exist automata that accept these languages (see previous slides).
There exist languages which are not regular: no FA accepts such a language.

Nondeterministic Finite Automaton (NFA) (figure with its alphabet)

Two choices (figure)

First choice (figure)

Second choice: no transition, the automaton hangs (figure)

Nondeterministic Finite Automaton (NFA)
An NFA accepts a string when there is a computation of the NFA that accepts the string. A computation accepts when all the input is consumed and the automaton is in an accepting state.
An NFA rejects a string when there is no computation of the NFA that accepts the string.
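One common way to check whether some accepting computation exists is to follow all choices at once by tracking the set of states the NFA could be in. The sketch below does that for an NFA without lambda transitions; the function name `nfa_accepts` and the example automaton (strings over {a, b} ending in `ab`) are hypothetical and not taken from the slides.

```python
def nfa_accepts(delta, start, accepting, word):
    """delta maps (state, symbol) to a set of possible next states.
    A missing entry means there is no transition, so that branch of the
    computation hangs and simply drops out of the current set."""
    current = {start}
    for symbol in word:
        current = {
            nxt
            for state in current
            for nxt in delta.get((state, symbol), set())
        }
    # Accept if at least one computation consumed all the input and
    # ended in an accepting state.
    return bool(current & accepting)


# Hypothetical NFA: strings over {a, b} that end in "ab".
# In q0 on "a" there are two choices: stay in q0 or move to q1.
ENDS_IN_AB = {
    ("q0", "a"): {"q0", "q1"},
    ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
}
ENDS_IN_AB_ACCEPTING = {"q2"}
```

For example, `nfa_accepts(ENDS_IN_AB, "q0", ENDS_IN_AB_ACCEPTING, "aab")` holds, whereas the word `aba` is rejected because no computation ends in `q2`.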

Lambda Transitions (figure)
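A lambda transition lets the automaton change state without consuming input. A usual way to handle this in a simulation is to take the lambda closure of the current state set before and after each symbol. The sketch below assumes that approach; the helper names and the example automaton (accepting zero or more a's followed by zero or more b's) are hypothetical, since the slide's figure is not available.

```python
def lambda_closure(states, lambda_moves):
    """All states reachable from `states` using only lambda transitions.
    lambda_moves maps a state to the set of states reachable by one
    lambda move."""
    closure = set(states)
    stack = list(states)
    while stack:
        state = stack.pop()
        for nxt in lambda_moves.get(state, set()):
            if nxt not in closure:
                closure.add(nxt)
                stack.append(nxt)
    return closure


def lambda_nfa_accepts(delta, lambda_moves, start, accepting, word):
    current = lambda_closure({start}, lambda_moves)
    for symbol in word:
        moved = {
            nxt
            for state in current
            for nxt in delta.get((state, symbol), set())
        }
        current = lambda_closure(moved, lambda_moves)
    return bool(current & accepting)


# Hypothetical NFA: q0 loops on "a", a lambda move goes from q0 to q1,
# and q1 loops on "b"; q1 is accepting.
A_STAR_B_STAR = {("q0", "a"): {"q0"}, ("q1", "b"): {"q1"}}
A_STAR_B_STAR_LAMBDA = {"q0": {"q1"}}
```

With this automaton the empty word is accepted, because the lambda move reaches `q1` without reading anything, and `ba` is rejected, because no `a` can follow a `b`.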

Another NFA Example (two figures)