Lexical Analysis

Consider the program:

    #include <stdio.h>

    main()
    {
        double value = 0.95;
        printf("value = %f\n", value);
    }

How is this translated into meaningful machine instructions?

First, each separate entity must be recognised: e.g. the 5th line is processed as a sequence of classified tokens [token listing not preserved in the transcript].

This process is known as lexical analysis.
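As a rough illustration of what "recognising each separate entity" means, the sketch below splits a line of C into lexemes by hand and assigns each one a class. The function name next_token and the class names TOK_WORD, TOK_NUMBER, TOK_SYMBOL and TOK_END are invented for this example and do not come from the slides. It is deliberately simplified: keywords and identifiers share one class, and a number is any run of digits and dots.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Token classes (hypothetical names, chosen for this sketch only). */
typedef enum { TOK_END, TOK_WORD, TOK_NUMBER, TOK_SYMBOL } TokenKind;

/* Copy the next lexeme from *src into buf (at most cap - 1 characters)
   and return its class. *src is advanced past what was consumed. */
TokenKind next_token(const char **src, char *buf, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    TokenKind kind;

    assert(cap >= 2);                      /* room for one char and '\0' */
    while (isspace((unsigned char)*p))     /* whitespace separates tokens */
        p++;

    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        kind = TOK_WORD;                   /* keyword or identifier */
        while ((isalnum((unsigned char)*p) || *p == '_') && n + 1 < cap)
            buf[n++] = *p++;
    } else if (isdigit((unsigned char)*p)) {
        kind = TOK_NUMBER;                 /* simplified: digits and dots */
        while ((isdigit((unsigned char)*p) || *p == '.') && n + 1 < cap)
            buf[n++] = *p++;
    } else {
        kind = TOK_SYMBOL;                 /* any other single character */
        buf[n++] = *p++;
    }
    buf[n] = '\0';
    *src = p;
    return kind;
}
```

Calling next_token repeatedly on "double value = 0.95;" yields the lexemes double, value, =, 0.95 and ; in turn, followed by TOK_END. Lex generates this kind of routine automatically from regular expressions.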

Application: Lex

A program generator:

    series of regular expressions  ->  lex  ->  a lexical analyser

Lex input file:

    ... definitions ...
    %%
    ... regular expression/action pairs ...
    %%
    ... user-defined functions ...

Lex Regular Expressions

Meta-characters (do not match themselves):

    ( ) [ ] { } + / , ^ * | . \ " $ ? - %

Let c be a character, x and y regular expressions, s a string, m and n integers, and i an identifier.

Regular expressions:

    c         any character except meta-characters
    [...]     the list of chars enclosed (may be a range)
    [^...]    the list of chars not enclosed
    .         any ASCII char except newline
    xy        concatenation of x and y
    x/y       x, only if followed by y (y not read)
    x{m,n}    m to n occurrences of x
    ^x        x, only at beginning of line
    x$        x, only at end of line
    "s"       exactly what is in the quotes (except for "\" and following character)
    x*        zero or more occurrences of x
    x+        one or more occurrences of x
    x?        an optional x (zero or one occurrence)
    x|y       x or y
    {i}       definition of i

Lex Regular Expressions (cont.)

- Meta-characters are obtained by preceding them with "\".
- Regular expressions are terminated by a space or tab.
- Backslash, tab and newline are represented by \\, \t, \n.
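Several of these operators can be tried out without Lex. The sketch below uses the POSIX regcomp/regexec interface with extended syntax, which shares [...], [^...], *, +, ?, | and {m,n} with Lex; Lex-only forms such as x/y, "s" and {i} are not covered. The helper name matches_whole is an assumption made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Return 1 if the whole of text matches pattern, 0 otherwise.
   pattern is a POSIX extended regular expression; it is wrapped in
   ^( ... )$ so that a partial match does not count. */
int matches_whole(const char *pattern, const char *text)
{
    char anchored[256];
    regex_t re;
    int n, ok;

    n = snprintf(anchored, sizeof anchored, "^(%s)$", pattern);
    assert(n > 0 && (size_t)n < sizeof anchored);  /* pattern must fit */

    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return 0;                                  /* malformed pattern */
    ok = (regexec(&re, text, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}
```

For instance, matches_whole("[0-9]+", "300") is true while matches_whole("[0-9]+", "") is false, because + needs at least one occurrence; with * or ? in its place the empty string is accepted.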

Definitions

If "identifier string" appears in the definitions section, string replaces identifier wherever {identifier} is used.

    L [a-zA-Z]
    %%
    {L}+ ;

is the same as:

    %%
    [a-zA-Z]+ ;

Anything enclosed between %{ ... %} in this section is copied straight into lex.yy.c. Include and define statements, all variables, all function definitions, and any comments should be placed here. E.g.

    %{
    #include <stdio.h>
    /* an example program */
    %}

Actions

An action is a C-language statement followed by ;

Example:

    [0-9]+      printf("Integer\n");
    [a-zA-Z]+   printf("String\n");

will output "Integer" after receiving a digit string, and "String" after receiving a character string. Characters that match no rule are copied to the output unchanged.

Input:

    12+19=sum;

will result in:

    Integer
    +Integer
    =String
    ;

Note: a recognised regular expression is held in the string yytext. Its length is held in the integer yyleng.
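To see why the output interleaves "Integer" and "String" with the unmatched characters, the two rules can be imitated by hand in C. This is only a sketch of the behaviour, not what Lex actually generates; the function name scan_actions is invented here, and the output is collected in a buffer rather than printed so that it can be inspected.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Imitate the rules [0-9]+ and [a-zA-Z]+ plus the default rule that
   echoes any other character. The result is written to out (cap bytes). */
void scan_actions(const char *in, char *out, size_t cap)
{
    size_t used = 0;

    assert(cap > 0);
    out[0] = '\0';
    while (*in != '\0') {
        const char *start = in;   /* plays the role of yytext */
        int n;

        if (isdigit((unsigned char)*in)) {
            while (isdigit((unsigned char)*in))   /* longest digit run */
                in++;
            n = snprintf(out + used, cap - used, "Integer\n");
        } else if (isalpha((unsigned char)*in)) {
            while (isalpha((unsigned char)*in))   /* longest letter run */
                in++;
            n = snprintf(out + used, cap - used, "String\n");
        } else {
            in++;                                 /* default: echo one char */
            n = snprintf(out + used, cap - used, "%c", *start);
        }
        /* in - start is the lexeme length, the analogue of yyleng */
        if (n < 0 || (size_t)n >= cap - used)
            break;                                /* buffer full: stop early */
        used += (size_t)n;
    }
}
```

For the input 12+19=sum; the buffer ends up holding "Integer\n+Integer\n=String\n;". The + and = appear at the start of a line only because the preceding action printed a newline.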

Running Lex

To run a lex program "example.l", type:

    lex example.l
    cc lex.yy.c -ll
    a.out

"-ll" links the lex library. This library contains a "main" program, which calls yylex(). You can override this by defining your own "main".

Example Lex Program

    %{
    /* simple word recognition */
    %}
    L [a-zA-Z]
    %%
    [ \t]+      ;   /* ignore whitespace */
    is|are      printf("verb: %s; ", yytext);
    a|the       printf("determiner: %s; ", yytext);
    dog |
    cat |
    male |
    female      printf("noun: %s; ", yytext);
    {L}+        printf("unknown: %s; ", yytext);
    .|\n        ECHO;
    %%
    int main() { yylex(); return 0; }

Example Session

    % word
    the dog is a male
    determiner: the; noun: dog; verb: is; determiner: a; noun: male;
    female cat dog is
    noun: female; noun: cat; noun: dog; verb: is;
    catdog is male
    unknown: catdog; verb: is; noun: male;
    ^d
    %
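The classification seen in the session can be approximated in plain C with a lookup table. This is a sketch under the assumption that the input has already been split into whole words; the name classify_word is invented for the example. Because Lex prefers the longest match, "catdog" is matched as one word by {L}+ rather than as "cat" followed by "dog", and comparing whole words reproduces that outcome.

```c
#include <assert.h>
#include <string.h>

/* Return the category the example Lex program would print for a
   whole word: "verb", "determiner", "noun" or "unknown". */
const char *classify_word(const char *word)
{
    static const struct {
        const char *word;
        const char *category;
    } table[] = {
        { "is",     "verb" },       { "are",    "verb" },
        { "a",      "determiner" }, { "the",    "determiner" },
        { "dog",    "noun" },       { "cat",    "noun" },
        { "male",   "noun" },       { "female", "noun" },
    };
    size_t i;

    assert(word != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(word, table[i].word) == 0)
            return table[i].category;
    }
    return "unknown";   /* corresponds to the catch-all rule {L}+ */
}
```

The order of rules matters in the Lex version for the same reason the table is consulted before falling back to "unknown": when two rules match the same text, the one listed first wins.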

Practical 1: Lexical Analysis

Aim: To write a lexical analyser in C using Lex, for the language L, defined below.

- identifiers: sequence of one or more letters, must be declared before use, int or real.
- integers: optional sign, one or more digits.
- reals: optional sign, one or more digits, decimal point, one or more digits.
- expressions: bracketed expressions using +, -, *, / and :=.
- comments: start with !, to end of line.
- print statements: either printi or printr, for printing integers and reals, one argument.
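The integer and real definitions above translate directly into small recognisers. The sketch below checks them by hand in C, which can be useful for testing candidate regular expressions before writing the Lex rules; the function names is_integer and is_real are assumptions made for this example, and the practical itself still requires the rules to be written in Lex.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Advance past a run of decimal digits and return the new position. */
static const char *skip_digits(const char *p)
{
    while (isdigit((unsigned char)*p))
        p++;
    return p;
}

/* integer: optional sign, one or more digits */
int is_integer(const char *s)
{
    const char *end;

    assert(s != NULL);
    if (*s == '+' || *s == '-')
        s++;
    end = skip_digits(s);
    return end != s && *end == '\0';
}

/* real: optional sign, one or more digits, decimal point,
   one or more digits */
int is_real(const char *s)
{
    const char *dot, *end;

    assert(s != NULL);
    if (*s == '+' || *s == '-')
        s++;
    dot = skip_digits(s);
    if (dot == s || *dot != '.')
        return 0;                 /* no digits before the point, or no point */
    end = skip_digits(dot + 1);
    return end != dot + 1 && *end == '\0';
}
```

Under these definitions "300" and "-7" are integers, "-0.95" is a real, and "3." and ".5" are neither, since digits are required on both sides of the decimal point.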

Example L Program

    ! example L program
    real a;
    real baboon;
    int x y;
    ! end of declarations
    x := 300;
    printi(x);
    y := 7 - x;
    a := / 3 * 5 - 5;
    baboon := a * y;
    printi(5);
    printr(baboon);

Required Structure

- Output should be in the form of pairs.
- Every element of the program should be classified. Thus, output for the 9th line should be: [expected output not preserved in the transcript]
- Numbers should be converted from strings to the appropriate form.
- The input must be described by regular expressions. You must use Lex.
- A "tokens.h" file will be supplied, defining all the different tokens to be used.
- You should output the token names and not the associated numbers.
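Two of these requirements, converting numbers from strings and printing token names instead of numbers, might be handled along the following lines. The contents of "tokens.h" are not shown in the slides, so the token codes and names below (TOK_INT_CONST, TOK_REAL_CONST) are placeholders, as are the function names; the real names should be taken from the supplied header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder token codes: the real ones come from tokens.h. */
enum { TOK_INT_CONST = 1, TOK_REAL_CONST = 2 };

/* Map a token code to the name that should be printed. */
const char *token_name(int token)
{
    switch (token) {
    case TOK_INT_CONST:  return "INT_CONST";
    case TOK_REAL_CONST: return "REAL_CONST";
    default:             return "UNKNOWN";
    }
}

/* Convert an integer lexeme (e.g. the contents of yytext) to a long.
   The whole string must be consumed by the conversion. */
long to_integer(const char *lexeme)
{
    char *end;
    long value = strtol(lexeme, &end, 10);

    assert(end != lexeme && *end == '\0');
    return value;
}

/* Convert a real lexeme to a double in the same way. */
double to_real(const char *lexeme)
{
    char *end;
    double value = strtod(lexeme, &end);

    assert(end != lexeme && *end == '\0');
    return value;
}
```

Inside a Lex action, a call such as to_integer(yytext) would give the numeric value to print alongside token_name(...) for the matched rule.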