Lex Appendix B.1 -- Lex.

Slides:



Advertisements
Similar presentations
Application: Yacc A parser generator A context-free grammar An LR parser Yacc Yacc input file:... definitions... %... production rules... %... user-defined.
Advertisements

Lexical Analysis. what is the main Task of the Lexical analyzer Read the input characters of the source program, group them into lexemes and produce the.
Lexical Analysis Consider the program: #include main() { double value = 0.95; printf("value = %f\n", value); } How is this translated into meaningful machine.
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Lex -- a Lexical Analyzer Generator (by M.E. Lesk and Eric. Schmidt) –Given tokens specified as regular expressions, Lex automatically generates a routine.
 Lex helps to specify lexical analyzers by specifying regular expression  i/p notation for lex tool is lex language and the tool itself is refered to.
176 Formal Languages and Applications: We know that Pascal programming language is defined in terms of a CFG. All the other programming languages are context-free.
Tools for building compilers Clara Benac Earle. Tools to help building a compiler C –Lexical Analyzer generators: Lex, flex, –Syntax Analyzer generator:
COS 320 Compilers David Walker. Outline Last Week –Introduction to ML Today: –Lexical Analysis –Reading: Chapter 2 of Appel.
Characters and Strings. Characters In Java, a char is a primitive type that can hold one single character A character can be: –A letter or digit –A punctuation.
Basic Input/Output and Variables Ethan Cerami New York
Lecture 2: Lexical Analysis CS 540 George Mason University.
A brief [f]lex tutorial Saumya Debray The University of Arizona Tucson, AZ
CS 536 Spring Learning the Tools: JLex Lecture 6.
1 Flex. 2 Flex A Lexical Analyzer Generator  generates a scanner procedure directly, with regular expressions and user-written procedures Steps to using.
Compilers: lex/3 1 Compiler Structures Objectives – –describe lex – –give many examples of lex's use , Semester 1, Lex.
Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Machine-independent code improvement Target code generation Machine-specific.
REGULAR EXPRESSIONS. Lexical Analysis Lexical analysers can be constructed by programs such as LEX These programs employ as input a description of the.
Lesson 10 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
Review: Regular expression: –How do we define it? Given an alphabet, Base case: – is a regular expression that denote { }, the set that contains the empty.
Lecture 2: Lexical Analysis
COMP313A Programming Languages Lexical Analysis. Lecture Outline Lexical Analysis The language of Lexical Analysis Regular Expressions.
Scanning & FLEX CPSC 388 Ellen Walker Hiram College.
FLEX Fast Lexical Analyzer EECS Introduction Flex is a lexical analysis (scanner) generator. Flex is provided with a user input file or Standard.
LEX (04CS1008) A tool widely used to specify lexical analyzers for a variety of languages We refer to the tool as Lex compiler, and to its input specification.
Review: Compiler Phases: Source program Lexical analyzer Syntax analyzer Semantic analyzer Intermediate code generator Code optimizer Code generator Symbol.
JLex Lecture 4 Mon, Jan 24, JLex JLex is a lexical analyzer generator in Java. It is based on the well-known lex, which is a lexical analyzer generator.
Regular Expressions What is this line all about? while (!($search =~ /^\s*$/)) { It’s a string search just like before, but with a huge twist – regular.
Introduction to Lex Ying-Hung Jiang
IN LINE FUNCTION AND MACRO Macro is processed at precompilation time. An Inline function is processed at compilation time. Example : let us consider this.
1 Using Lex. Flex – Lexical Analyzer Generator A language for specifying lexical analyzers Flex compilerlex.yy.clang.l C compiler -lfl a.outlex.yy.c a.outtokenssource.
Introduction to Lex Fan Wu
Flex Fast LEX analyzer CMPS 450. Lexical analysis terms + A token is a group of characters having collective meaning. + A lexeme is an actual character.
Practical 1-LEX Implementation
1 Lex & Yacc. 2 Compilation Process Lexical Analyzer Source Code Syntax Analyzer Symbol Table Intermed. Code Gen. Code Generator Machine Code.
ICS312 LEX Set 25. LEX Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the C program.
Chapter 2 Variables.
COMMONWEALTH OF AUSTRALIA Copyright Regulations 1969 WARNING This material has been reproduced and communicated to you by or on behalf of Monash University.
Ajmer Singh PGT(IP) Programming Fundamentals. Ajmer Singh PGT(IP) Java Character Set Character set is a set of valid characters that a language can recognize.
Yacc. Yacc 2 Yacc takes a description of a grammar as its input and generates the table and code for a LALR parser. Input specification file is in 3 parts.
Operating System Discussion Section. The Basics of C Reference: Lecture note 2 and 3 notes.html.
More yacc. What is yacc – Tool to produce a parser given a grammar – YACC (Yet Another Compiler Compiler) is a program designed to compile a LALR(1) grammar.
17-Mar-16 Characters and Strings. 2 Characters In Java, a char is a primitive type that can hold one single character A character can be: A letter or.
ICS611 Lex Set 3. Lex and Yacc Lex is a program that generates lexical analyzers Converting the source code into the symbols (tokens) is the work of the.
9-December-2002cse Tools © 2002 University of Washington1 Lexical and Parser Tools CSE 413, Autumn 2002 Programming Languages
CS 404Ahmed Ezzat 1 CS 404 Introduction to Compiler Design Lecture Ahmed Ezzat.
2.1 The Part of a C++ Program. The Parts of a C++ Program // sample C++ program #include using namespace std; int main() { cout
C++ First Steps.
Chapter 2 Variables.
Lexical Analysis.
Chapter 3 Lexical Analysis.
Lexical Analysis CSE 340 – Principles of Programming Languages
Tutorial On Lex & Yacc.
Chapter 2 Scanning – Part 1 June 10, 2018 Prof. Abdelaziz Khamis.
Getting Started with C.
JLex Lecture 4 Mon, Jan 26, 2004.
Review: Compiler Phases:
Subject Name:Sysytem Software Subject Code: 10SCS52
Chapter 4 –Requirements for coding in Assembly Language
Compiler Structures 3. Lex Objectives , Semester 2,
Chapter 2: Introduction to C++.
Appendix B.1 Lex Appendix B.1 -- Lex.
Yacc Yacc.
Appendix B.2 Yacc Appendix B.2 -- Yacc.
Chapter 2 Variables.
Lexical Elements & Operators
Introduction to C Programming
Systems Programming & Operating Systems Unit – III
Compiler Design 3. Lexical Analyzer, Flex
Lexical Analysis - Scanner-Contd
Presentation transcript:

Lex Appendix B.1 -- Lex

Overview of Lex Input specification file is in 3 parts Definitions Token Descriptions and actions User-Written code Parts are separated by %% In the first part we define patterns, in the third part we define actions, in the second part we put them together. Appendix B.1 -- Lex

Token Definitions Elementary Operations single characters except . $ ^ [ ] - ? * + | ( ) / { } < > concatenation (put characters together) alternation (a|b|c) [ab] == a|b [a-k] == a|b|c|...|i|j|k [a-z0-9] == any letter or digit Appendix B.1 -- Lex

Elementary Operations (cont.) NOTE: . matches any character except the newline * -- Kleene Closure + -- Positive Closure Examples: [0-9]+"."[0-9]+ note: without the quotes it could be any character [ \t]+ -- is whitespace (except CR). Yes there is a space inside the box before the \t Appendix B.1 -- Lex

Special Characters: . -- matches any single character (except newline) \t -- tab \n -- newline \" -- double quote \\ -- \ ? -- this means the preceding was optional ab? == a|ab (ab)? == ab|e Appendix B.1 -- Lex

Special Characters (cont.) ^ -- means at the beginning of the line (unless it is inside of a [ ]) $ means at the end of the line [^ ] -- means anything except \"[^\"]*\" is a double quoted string Lex always chooses the longest matching substring for its tokens. Appendix B.1 -- Lex

Definitions NAME REG_EXPR digs [0-9]+ integer {digs} plain_real {digs}"."{digs} expreal {digs}"."{digs}[Ee][+-]?{digs} real {plainreal}|{expreal} Appendix B.1 -- Lex

The definitions can also contain variables and other declarations used by the Code generated by Lex. These usually go at the start of this section, marked by %{ at the beginning and %} at the end. Includes usually go here. It is usually convenient to maintain a line counter so that error messages can be keyed to the lines in which the errors are found. %{ int linecount = 1; %} Appendix B.1 -- Lex

Tokens and Actions Example: {real} return FLOAT; begin return BEGIN; {newline} linecount++; {integer} { printf("I found an integer\n"); return INTEGER; } Appendix B.1 -- Lex

identifiers used by Lex and Yacc begin with yy yytext -- a string containing the lexeme yyleng -- the length of the lexeme yylval -- holds the lexical value of the token. Example: {integer} { printf("I found an integer\n"); sscanf(yytext,"%d", &yylval); return INTEGER; } C++ Comments -- // ..... //.* ; Appendix B.1 -- Lex

User Written Code The actions associated with any given token are normally specified using statements in C. But occasionally the actions are complicated enough that it is better to describe them with a function call, and define the function elsewhere. Definitions of this sort go in the last section of the Lex input. Appendix B.1 -- Lex

A Sample Lex Specification Note: If 2 rules match the same pattern, Lex will use the first rule. Let’s go look at some code… Appendix B.1 -- Lex