Sung-Dong Kim, School of Computer Engineering, Hansung University


LEX

LEX
  1975, Lesk
  Input: regular expressions + action code (e.g., tiny.l)
  Output: C program (lex.yy.c or lexyy.c)
    Procedure yylex
    Table-driven implementation of a DFA
  Compilation: cc -o scanner lex.yy.c -ll
  Flow: RE + action -> Lex -> Scanner (C code)
(2017-1) Compiler
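To make "table-driven implementation of a DFA" concrete, the following is a minimal hand-written sketch in C for the single pattern [0-9]+. It is not the code that Lex emits; the names dfa_match_number, next_state, and accepting are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical sketch of a table-driven DFA for the pattern [0-9]+.
   This is not the code Lex generates; it only illustrates the idea. */

enum { S_START, S_NUM, S_DEAD, N_STATES };   /* DFA states */
enum { C_DIGIT, C_OTHER, N_CLASSES };        /* input character classes */

/* next_state[state][class] is the transition table. */
static const int next_state[N_STATES][N_CLASSES] = {
    /* S_START */ { S_NUM,  S_DEAD },
    /* S_NUM   */ { S_NUM,  S_DEAD },
    /* S_DEAD  */ { S_DEAD, S_DEAD },
};

/* accepting[state] is nonzero when the state is an accepting state. */
static const int accepting[N_STATES] = { 0, 1, 0 };

/* Return the length of the longest prefix of s that matches [0-9]+,
   or 0 when no prefix matches. */
size_t dfa_match_number(const char *s)
{
    assert(s != NULL);
    int state = S_START;
    size_t longest = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        int cls = isdigit((unsigned char)s[i]) ? C_DIGIT : C_OTHER;
        state = next_state[state][cls];
        if (state == S_DEAD)
            break;                 /* no further match is possible */
        if (accepting[state])
            longest = i + 1;       /* remember the last accepting position */
    }
    return longest;
}
```

Calling dfa_match_number("123abc") walks the table for three digits and stops at 'a'. A generated scanner works on the same principle, but with one combined table for all patterns in the rules section.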

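LEX Convention (1)
  Metacharacters
  Quotes: match the actual characters
    For non-metacharacters the quotes are optional: "if" = if
    For metacharacters the quotes are required: "("
  Backslash: removes the special meaning of a single metacharacter
    \(\* = "(*"
    Also used for escapes: \n, \t
  Example: (aa|bb)(a|b)*c? = ("aa"|"bb")("a"|"b")*"c"?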

LEX Convention (2)
  [...]: any one of the enclosed characters
    [abxz]: any one of the characters a, b, x, z
    (aa|bb)[ab]*c?
  Hyphen: ranges of characters
    [0-9]

LEX Convention (3)
  . (period)
    Represents a set of characters: any character except a newline
  ^ (caret)
    Complementary sets
    [^0-9abc]: any character that is not a digit and is not one of the letters a, b, c

LEX Convention (4)
  Square brackets
    Most of the metacharacters lose their special status inside brackets
    [-+] == ("+"|"-")
    [+-]: the hyphen may be read as a range starting from "+", so write the hyphen first
    [."?]: any of the three characters ., ", ?
    [\^\\]: ^ or \

LEX Convention (5)
  Curly brackets: refer to names of regular expressions
  Regular definitions
    nat = [0-9]+
    signedNat = ("+"|"-")? nat
  Lex definitions
    nat        [0-9]+
    signedNat  ("+"|"-")?{nat}
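The way a named definition such as {nat} is reused inside signedNat can be sketched in plain C as one matcher calling another. This is only an illustrative analogy, not how Lex expands definitions internally; the function names match_nat and match_signed_nat are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* nat = [0-9]+
   Returns the length of the matched prefix of s, or 0 if there is none. */
size_t match_nat(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* signedNat = ("+"|"-")? {nat}
   Reuses match_nat the way the Lex definition reuses {nat}. */
size_t match_signed_nat(const char *s)
{
    assert(s != NULL);
    size_t sign = (s[0] == '+' || s[0] == '-') ? 1 : 0;  /* optional sign */
    size_t digits = match_nat(s + sign);                 /* at least one digit */
    return digits ? sign + digits : 0;
}
```

For instance, match_signed_nat applied to "-42;" consumes the sign and two digits, while a lone "+" does not match because {nat} requires at least one digit.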

Format of LEX Input (1)
  Input file = regular expressions + C code
  Definitions
    Any C code that must be inserted external to any function: %{ ... %}
    Names of regular expressions
  Rules
    Regular expressions + C code (actions)
  Auxiliary routines (optional)
    C code + main program (if needed)

Format of LEX Input (2)
  Layout
    {definitions}
    %%
    {rules}
    %%
    {auxiliary routines}

Example 1: scanner that adds line numbers to text
%{
/* a Lex program that adds line numbers
   to lines of text, printing the new text
   to the standard output */
#include <stdio.h>
int lineno = 1;
%}
line .*\n
%%
{line} { printf("%5d %s", lineno++, yytext); }
%%
int main(void)
{ yylex(); return 0; }

Example 2: converts decimal numbers to hexadecimal and prints the number of replacements
%{
/* a Lex program that changes all numbers
   from decimal to hexadecimal notation,
   printing a summary statistic to stderr */
#include <stdlib.h>
#include <stdio.h>
int count = 0;
%}
digit  [0-9]
number {digit}+
%%
{number} { int n = atoi(yytext);
           printf("%x", n);
           if (n > 9) count++; }
%%
int main(void)
{ yylex();
  fprintf(stderr, "number of replacements = %d", count);
  return 0;
}

Example 3: prints all input lines that begin or end with the letter 'a'
%{
/* Selects only lines that end or
   begin with the letter 'a'.
   Deletes everything else. */
#include <stdio.h>
%}
ends_with_a   .*a\n
begins_with_a a.*\n
%%
{ends_with_a}   ECHO;
{begins_with_a} ECHO;
.*\n            ;
%%
int main(void)
{ yylex(); return 0; }

Summary (1)
  Ambiguity resolution
    Principle of longest substring: the rule that matches the longest string wins
    Substrings of equal length: the rule listed first wins
    No match: copy the next character to the output and continue
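The three ambiguity rules can be sketched in C as a selection loop over candidate rules. This is a simplified illustration under stated assumptions, not the generated scanner: each rule is modeled as a hypothetical function returning the length of the prefix it matches, and the names select_rule, match_if, and match_identifier are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Each hypothetical rule returns the length of the prefix of s that it
   matches, or 0 when it does not match at all. */
typedef size_t (*rule_fn)(const char *s);

static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t match_identifier(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would appear in the rules section. */
static const rule_fn rules[] = { match_if, match_identifier };
enum { RULE_IF = 0, RULE_ID = 1, NO_RULE = -1 };

/* Pick the rule for the input at s and store the match length in *len. */
int select_rule(const char *s, size_t *len)
{
    assert(s != NULL && len != NULL);
    int best = NO_RULE;
    *len = 0;
    for (int i = 0; i < (int)(sizeof rules / sizeof rules[0]); i++) {
        size_t n = rules[i](s);
        if (n > *len) {      /* longest match wins; strict '>' keeps the
                                earlier rule when lengths are equal */
            *len = n;
            best = i;
        }
    }
    if (best == NO_RULE && s[0] != '\0')
        *len = 1;            /* no match: one character is copied to output */
    return best;
}
```

With these rules, the input "if" selects the keyword rule because it is listed first, while "iffy" selects the identifier rule because that match is longer.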

Summary (2)
  Insertion of C code
    %{ ... %}: exact copy
    Auxiliary procedure section: exact copy at the end
    Any code following a RE (action): inserted at the appropriate place in yylex

Lex Internal Names
  lex.yy.c (or lexyy.c): Lex output file name
  yylex: Lex scanning routine
  yytext: string matched on current action
  yyin: Lex input file (default: stdin)
  yyout: Lex output file (default: stdout)
  input: Lex buffered input routine
  ECHO: Lex default action (print yytext to yyout)

LEX for TINY
%{
#include "globals.h"
#include "util.h"
#include "scan.h"
/* lexeme of identifier or reserved word */
char tokenString[MAXTOKENLEN+1];
%}
digit       [0-9]
number      {digit}+
letter      [a-zA-Z]
identifier  {letter}+
newline     \n
whitespace  [ \t]
%%

"if"      { return IF; }
"then"    { return THEN; }
"else"    { return ELSE; }
"end"     { return END; }
"repeat"  { return REPEAT; }
"until"   { return UNTIL; }
"read"    { return READ; }
"write"   { return WRITE; }
":="      { return ASSIGN; }
"="       { return EQ; }
"<"       { return LT; }
"+"       { return PLUS; }
"-"       { return MINUS; }
"*"       { return TIMES; }
"/"       { return OVER; }
"("       { return LPAREN; }
")"       { return RPAREN; }
";"       { return SEMI; }

{number}      { return NUM; }
{identifier}  { return ID; }
{newline}     { lineno++; }
{whitespace}  { /* skip whitespace */ }
"{"           { char c;
                do {
                  c = input();
                  if (c == '\n') lineno++;
                } while (c != '}');
              }
.             { return ERROR; }
%%

TokenType getToken(void)
{ static int firstTime = TRUE;
  TokenType currentToken;
  if (firstTime)
  { firstTime = FALSE;
    lineno++;
    yyin = source;
    yyout = listing;
  }
  currentToken = yylex();
  strncpy(tokenString, yytext, MAXTOKENLEN);
  if (TraceScan) {
    fprintf(listing, "\t%d: ", lineno);
    printToken(currentToken, tokenString);
  }
  return currentToken;
}

References
  Textbook: Lex & Yacc, 2nd Edition, John R. Levine, Tony Mason, Doug Brown, O'Reilly, 1992
  Examples: http://myweb.stedwards.edu/laurab/cosc4342/lex-examples.html

Exercises
  Write a lex input file for a program that counts the number of words in the input file.