Scanner Generation Using SLK and Flex++ Followed by a Demo Copyright © 2015 Curt Hill.


Introduction
There are a variety of parser generators and scanner generators.
The standards for UNIX seem to be lex and yacc.
  – Yacc seems to have replaced something earlier
The GNU versions are flex and bison.
We will use flex++, which is somewhat more parameterizable.
We will also use SLK, which seems to accept a better set of languages.

Flex++
This is a scanner generator.
It takes an input of three sections:
  – Definitions for later use
  – Rules that accept a string and produce a token
  – Code which is passed through as-is
These three are separated by a line containing only %%.
The percent sign is also used to indicate options and a variety of other things.

Theory of Operation
Almost all scanner generators produce a program that is a character-driven finite state automaton.
Each character moves the automaton from one state to another.
  – Self loops are possible
Some of these states recognize the completion of a keyword or multi-character symbol.
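The character-driven automaton described above can be sketched by hand. This is only an illustration, not the code Flex++ generates: the state names and the functions step and run are hypothetical, and the automaton recognizes just two assumed token shapes (a letter followed by letters or digits, and a run of digits).

```cpp
#include <cctype>
#include <string>

// Hypothetical states for a tiny automaton; the names are illustrative only.
enum class State { Start, InIdentifier, InNumber, Reject };

// Move from one state to the next on a single input character.
State step(State s, char c) {
    const unsigned char u = static_cast<unsigned char>(c);
    switch (s) {
    case State::Start:
        if (std::isalpha(u)) return State::InIdentifier;
        if (std::isdigit(u)) return State::InNumber;
        return State::Reject;
    case State::InIdentifier:
        // Self loop: letters and digits keep us in the same state.
        return std::isalnum(u) ? State::InIdentifier : State::Reject;
    case State::InNumber:
        // Self loop: further digits keep us in the same state.
        return std::isdigit(u) ? State::InNumber : State::Reject;
    case State::Reject:
        // Once rejected, no character can recover.
        return State::Reject;
    }
    return State::Reject;
}

// Run the automaton over a whole string and report the final state.
State run(const std::string& text) {
    State s = State::Start;
    for (char c : text) s = step(s, c);
    return s;
}
```

Calling run on a candidate lexeme returns the state the automaton ends in. A generated scanner typically encodes the same idea as a transition table rather than a switch statement.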

Regular Expressions
The typical way to describe what you want is using regular expressions.
  – Recall that a type 3 language is regular
  – Every regular language may be recognized by a finite state automaton
Thus we will define what we mean by an identifier with a regular expression.
  – Or an integral constant, floating constant, character string, etc.
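To make the idea concrete outside of Flex++, here is a minimal sketch that uses the C++ standard library's std::regex. The two patterns are assumptions chosen for illustration (an identifier as a letter followed by letters or digits, an integral constant as a nonzero digit followed by digits), and the function names are hypothetical.

```cpp
#include <regex>
#include <string>

// Assumed pattern: a letter followed by any number of letters or digits.
bool isIdentifier(const std::string& s) {
    static const std::regex re("[A-Za-z][A-Za-z0-9]*");
    return std::regex_match(s, re);   // the whole string must match
}

// Assumed pattern: a nonzero digit followed by any number of digits.
bool isIntegralConstant(const std::string& s) {
    static const std::regex re("[1-9][0-9]*");
    return std::regex_match(s, re);
}
```

std::regex_match requires the entire string to match, which is the natural check for a single lexeme. A scanner generator instead compiles all of its patterns into one automaton, so it does not pay the cost of trying each expression in turn.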

Sections
As mentioned, there are three sections:
  – Definition
  – Rules
  – Code
They are separated by the %% directive.
We will now look at these.

Definition Section
As the name suggests, we set up directives and definitions for later use.
Directives are options used in Flex++.
  – They usually start with a %
We also define regular expressions we will use later.

Directives
%name scanner name
  – This is the class name
%header{ … %}
  – Code that will be inserted at the beginning of the file: includes, etc.
  – May also use %{ … %}
%define name content
  – Defines a macro
  – The next screen shows some predefined macros

Predefined Directives
%define TEXT yytext
  – The text of something that matched a regular expression
  – Need this for identifiers and the like
%define LENG yyleng
  – The length of the text
%define LEX yylex
  – Scanner function
%define LEX_RETURN int
  – The return type of LEX

More Predefined
%define CLASS name
  – Defaults to the same as %name
%define INHERIT name
  – Only needed if your class is a derivation
%define MEMBERS mem
  – Extra member data for the class
%define CONSTRUCTOR_INIT lg
  – The initialization list, including the leading colon
Any of these predefined names may be changed to something else.

Named Expressions
The definition section may also contain any regular expressions that will be handy in the next section.
The format is: NAME expression
  – Where NAME is the name you will use later and expression is the regular expression
Example: DIGIT [0-9]
  – This is a set that matches any digit

Example Definitions
    %name Scanner
    %define MEMBERS public: int line, column;
    %define CONSTRUCTOR_INIT : line(1), column(1)
    %header{
    #include <iostream>
    #include "CDHConstants.h"
    using namespace std;
    %}
    LETTER [A-Za-z]
    DIGIT [0-9]
    DIGIT1 [1-9]

Rules Section
This is where you make rules.
Each rule matches a construct.
A match provokes an action.
Each rule matches a terminal and returns the token.
Or it disposes of things that may be ignored.
The format is: RE action
  – Where RE is a regular expression and action is C++ code to execute upon a match

Rule Examples
Recall the definitions before.
Here is a blank killer:
    " " { ++column; }
  – The column variable is defined earlier
The C++ code could be multiple lines.
  – For example, removing comments will take some work
Here is a terminal:
    "=" { ++column; cout << "equal\n"; return EQUAL_; }

More Examples
Reserved words are easy:
    "int" { column += 3; cout << "int\n"; return INT_; }
Other terminals are harder:
    {DIGIT1}{DIGIT}* { column += strlen(yytext); cout << "number: " << yytext << "\n"; return NUMBER_; }
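To show what these rules amount to at run time, here is a hand-written sketch that behaves like the four example rules (blank killer, "=", "int", and a number). It is not what Flex++ generates. The class MiniScanner, its next function, and the numeric values of the token codes are assumptions for illustration; in the slides the real codes come from CDHConstants.h, whose contents are not shown.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical token codes; the real values live in CDHConstants.h.
enum TokenCode { END_ = 0, EQUAL_ = 1, INT_ = 2, NUMBER_ = 3, ERROR_ = 4 };

class MiniScanner {
public:
    int line = 1, column = 1;   // same bookkeeping as the MEMBERS define
    std::string text;           // plays the role of yytext

    explicit MiniScanner(std::string input) : input_(std::move(input)) {}

    // Return the code of the next token, or END_ when the input is used up.
    int next() {
        // Blank killer: consume blanks without returning a token.
        while (pos_ < input_.size() && input_[pos_] == ' ') { ++pos_; ++column; }
        if (pos_ >= input_.size()) return END_;

        // Terminal "=".
        if (input_[pos_] == '=') { text = "="; ++pos_; ++column; return EQUAL_; }

        // Reserved word "int".
        if (input_.compare(pos_, 3, "int") == 0) {
            text = "int"; pos_ += 3; column += 3; return INT_;
        }

        // Number: a nonzero digit followed by digits, like {DIGIT1}{DIGIT}*.
        if (input_[pos_] >= '1' && input_[pos_] <= '9') {
            const std::size_t start = pos_;
            while (pos_ < input_.size() &&
                   std::isdigit(static_cast<unsigned char>(input_[pos_]))) ++pos_;
            text = input_.substr(start, pos_ - start);
            column += static_cast<int>(text.size());
            return NUMBER_;
        }

        // Anything else is reported as an error, one character at a time.
        text = std::string(1, input_[pos_]); ++pos_; ++column;
        return ERROR_;
    }

private:
    std::string input_;
    std::size_t pos_ = 0;
};
```

Constructing MiniScanner with the text int = 42 and calling next repeatedly yields INT_, EQUAL_, NUMBER_, and then END_, with column advancing past each lexeme and each skipped blank.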

Code Section
This is just C++ code that is copied as-is onto the end of the file.
Often it is nothing.
It may be a main function that tests the scanner.

Command Line Options
-ofilename
  – Gives the output name with extension
  – Default is lex.yy.c
-hfilename
  – Generates a .h header for using this in another file
The input file is last in the line.

Process 1
Not all that difficult.
Craft the BNF.
Feed this into your parser generator.
Start creating the input to Flex++.
  – Each parser generator generates a file that will list the terminals
  – In SLK this is the XXXKeywords.txt file or the XXXConstants.h

Process 2
Start the definition section.
  – Name the scanner
  – Set whatever definitions make sense
  – The header define gets the includes
Determine what the scanner will return:
  – Integer
  – Enumeration
  – Object
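The three choices of return type can be compared with a small sketch. Every name here (TokenKind, Token, toCode, kindName) is hypothetical and chosen only for illustration. Which form fits depends on what the generated parser expects, and the slides do not say which one SLK needs.

```cpp
#include <string>

// Enumeration: readable names for each kind of token.
enum class TokenKind { End, Equal, Int, Number };

// Object: the kind plus the matched text and where it was found.
struct Token {
    TokenKind kind;
    std::string text;
    int line;
    int column;
};

// Integer: the plain code a parser may expect from the scanner function.
int toCode(TokenKind k) { return static_cast<int>(k); }

// A printable name for each kind, handy when testing the scanner.
const char* kindName(TokenKind k) {
    switch (k) {
    case TokenKind::End:    return "end";
    case TokenKind::Equal:  return "equal";
    case TokenKind::Int:    return "int";
    case TokenKind::Number: return "number";
    }
    return "unknown";
}
```

An integer is the simplest to pass to a parser, an enumeration adds type checking, and an object carries the lexeme and position along with the kind at the cost of more copying.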

Process 3
Start the rules section.
For each terminal create a rule.
  – Punctuation and reserved words are easy
  – Numbers and names are somewhat harder
  – Comments are hardest
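One reason comments take more work is that the action must keep the line and column counts correct while discarding text that may span several lines. The sketch below assumes C-style comments delimited by /* and */, which the slides do not specify, and the function name skipBlockComment is hypothetical. It is plain C++ rather than a Flex++ action.

```cpp
#include <cstddef>
#include <string>

// Skip a block comment that starts at index pos in the input.
// Returns the index just past the closing delimiter and updates line and
// column. Returns std::string::npos, leaving line and column untouched,
// if no comment starts at pos or the comment is never closed.
std::size_t skipBlockComment(const std::string& in, std::size_t pos,
                             int& line, int& column) {
    if (pos + 1 >= in.size() || in[pos] != '/' || in[pos + 1] != '*')
        return std::string::npos;

    std::size_t i = pos + 2;       // first character after the opener
    int l = line;
    int c = column + 2;            // the opener is two columns wide

    while (i < in.size()) {
        if (in[i] == '*' && i + 1 < in.size() && in[i + 1] == '/') {
            line = l;
            column = c + 2;        // the closer is two columns wide
            return i + 2;
        }
        if (in[i] == '\n') { ++l; c = 1; }   // a new line resets the column
        else { ++c; }
        ++i;
    }
    return std::string::npos;      // unterminated comment
}
```

The counts are committed only when the closing delimiter is found, so a caller can report an unterminated comment at the position where it began.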

Finally
We generate a makefile to do everything but run it.
Next we do a demo.
Then an assignment.
Soon we will consider the generated parser.