Published by Austin Black. Modified over 6 years ago.
1
LEX
Sung-Dong Kim, School of Computer Engineering, Hansung University
2
LEX
Lex (Lesk, 1975)
Input: regular expressions + action code (tiny.l)
Output: C program (lex.yy.c or lexyy.c)
    Procedure yylex
    Table-driven implementation of a DFA
Compilation: cc -o scanner lex.yy.c -ll
    (-ll links the Lex library; with flex, use -lfl)
Flow: RE + action -> Lex -> Scanner (C code)
(2017-1) Compiler
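To make "table-driven implementation of a DFA" concrete, here is a minimal hand-written sketch for the single pattern [0-9]+. It is not the code Lex generates; the names next_state, accepting, char_class and scan_number are invented for this illustration, and real Lex tables are larger and compressed.

```c
#include <assert.h>
#include <stddef.h>

/* Character classes: the columns of the transition table. */
enum { CLASS_DIGIT, CLASS_OTHER, NUM_CLASSES };

/* States of a DFA for [0-9]+ (hypothetical names, for illustration). */
enum { S_START, S_IN_NUM, NUM_STATES, S_DEAD = -1 };

/* next_state[state][class]; S_DEAD means "no transition". */
static const int next_state[NUM_STATES][NUM_CLASSES] = {
    /* S_START  */ { S_IN_NUM, S_DEAD },
    /* S_IN_NUM */ { S_IN_NUM, S_DEAD },
};

/* accepting[state] is 1 when the state is a final state. */
static const int accepting[NUM_STATES] = { 0, 1 };

static int char_class(char c)
{
    return (c >= '0' && c <= '9') ? CLASS_DIGIT : CLASS_OTHER;
}

/* Returns the length of the longest prefix of s that matches [0-9]+,
   or 0 when no prefix matches. */
size_t scan_number(const char *s)
{
    int state = S_START;
    size_t i = 0, last_accept = 0;

    assert(s != NULL);
    while (s[i] != '\0') {
        int next = next_state[state][char_class(s[i])];
        if (next == S_DEAD)
            break;                 /* no transition: stop scanning */
        state = next;
        i++;
        if (accepting[state])
            last_accept = i;       /* remember the last accepting position */
    }
    return last_accept;
}
```

The loop body never inspects the pattern itself, only the two tables, which is why the same driver can serve any set of regular expressions once the tables are generated.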
3
LEX Convention (1)
Metacharacters
Quotes: the quoted characters are matched literally
    Non-metacharacters: "if" is the same as if
    Metacharacters: "(" matches a literal left parenthesis
Backslash: escapes a single metacharacter
    \(\* = "(*"
    \n, \t: newline and tab
Example: (aa|bb)(a|b)*c? = ("aa"|"bb")("a"|"b")*"c"?
4
LEX Convention (2)
[...]: any one of the enclosed characters
    [abxz]: any one of the characters a, b, x, z
    (aa|bb)(a|b)*c? = (aa|bb)[ab]*c?
Hyphen: ranges of characters
    [0-9]: any digit
5
LEX Convention (3)
Representing a set of characters
. (period): any character except a newline
^ (first character inside brackets): complementary set
    [^0-9abc]: any character that is not a digit and is not one of the letters a, b, c
6
LEX Convention (4)
Square brackets
Most metacharacters lose their special status inside brackets
    [-+] == ("+"|"-")
    [+-]: the hyphen may be read as a range starting at "+", so write the hyphen first
    [."?]: any of the three characters ., ", ?
    [\^\\]: ^ or \ (these two still need a backslash)
7
LEX Convention (5)
Curly brackets: names of regular expressions
Regular definitions:
    nat = [0-9]+
    signedNat = ("+"|"-")? nat
Lex definitions:
    nat        [0-9]+
    signedNat  ("+"|"-")?{nat}
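The way a named expression is reused inside another can be mirrored by one C function calling another. The sketch below is only an analogy written for this note: match_nat and match_signed_nat are hypothetical helpers, not something Lex produces.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* nat = [0-9]+ : length of the digit run at the start of s (0 if none). */
size_t match_nat(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* signedNat = ("+"|"-")?{nat} : optional sign followed by nat.
   Returns the length of the match, or 0 when there is no match. */
size_t match_signed_nat(const char *s)
{
    size_t sign, digits;

    assert(s != NULL);
    sign = (s[0] == '+' || s[0] == '-') ? 1 : 0;
    digits = match_nat(s + sign);          /* reuse of the named expression */
    return digits > 0 ? sign + digits : 0; /* a sign alone is not a match */
}
```

For example, match_signed_nat("-42;") consumes the three characters -42 and stops at the semicolon.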
8
Format of LEX Input (1)
Input file = regular expressions + C code
Definitions
    Any C code that must be inserted external to any function: %{ … %}
    Names of regular expressions
Rules
    Regular expressions + C code (actions)
Auxiliary routines (optional)
    C code + main program (if needed)
9
Format of LEX Input (2)
Layout
    {definitions}
    %%
    {rules}
    %%
    {auxiliary routines}
10
Example 1: scanner that adds line numbers to text
    %{
    /* a Lex program that adds line numbers to lines of text,
       printing the new text to the standard output */
    #include <stdio.h>
    int lineno = 1;
    %}
    line .*\n
    %%
    {line} { printf("%5d %s", lineno++, yytext); }
    %%
    int main(void)
    { yylex(); return 0; }
11
Example 2: converts decimal numbers to hexadecimal and prints the number of replacements
    %{
    /* a Lex program that changes all numbers from decimal to
       hexadecimal notation, printing a summary statistic to stderr */
    #include <stdlib.h>
    #include <stdio.h>
    int count = 0;
    %}
    digit [0-9]
    number {digit}+
    %%
    {number} { int n = atoi(yytext);
               printf("%x", n);
               if (n > 9) count++; }
12
Example 2 (continued)
    %%
    int main(void)
    { yylex();
      fprintf(stderr, "number of replacements = %d", count);
      return 0;
    }
13
Example 3: prints all input lines that begin or end with the letter 'a'
    %{
    /* Selects only lines that end or begin with the letter 'a'.
       Deletes everything else. */
    #include <stdio.h>
    %}
    ends_with_a .*a\n
    begins_with_a a.*\n
    %%
    {ends_with_a} ECHO;
    {begins_with_a} ECHO;
    .*\n ;
    %%
    int main(void)
    { yylex(); return 0; }
14
Summary (1)
Ambiguity resolution
    Principle of longest substring: the rule that matches the longest string wins
    Substrings of equal length: the rule listed first wins
    No match: the next character is copied to the output and scanning continues
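The two disambiguation rules can be sketched in a few lines of C. This is an illustration under stated assumptions, not Lex's actual mechanism (Lex folds all rules into one DFA): the rule set holds only the keyword "if" and the identifier pattern [a-z]+, and the names Match, match_if, match_id and choose_rule are invented for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum { RULE_IF = 0, RULE_ID = 1, NO_RULE = -1 };

typedef struct {
    int rule;    /* index of the winning rule, or NO_RULE */
    size_t len;  /* length of the matched substring */
} Match;

/* Rule 0: the keyword "if". */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Rule 1: identifier [a-z]+ */
static size_t match_id(const char *s)
{
    size_t n = 0;

    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/* Tries every rule in listing order and keeps the longest match.
   The strict '>' keeps the earlier rule when lengths are equal. */
Match choose_rule(const char *s)
{
    size_t (*rules[])(const char *) = { match_if, match_id };
    Match best = { NO_RULE, 0 };
    size_t i;

    assert(s != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);
        if (n > best.len) {
            best.len = n;
            best.rule = (int)i;
        }
    }
    return best;
}
```

On the input "if" both rules match two characters and the keyword rule wins because it is listed first; on "ifx" the identifier rule wins because it matches three characters.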
15
Summary (2)
Insertion of C code
    %{ … %}: copied exactly into the output
    Auxiliary procedure section: copied exactly at the end of the output
    Any code following a regular expression (action): placed at the appropriate point in yylex
16
Lex Internal Names
    lex.yy.c or lexyy.c: Lex output file name
    yylex: Lex scanning routine
    yytext: string matched by the current action
    yyin: Lex input file (default: stdin)
    yyout: Lex output file (default: stdout)
    input: Lex buffered input routine
    ECHO: Lex default action (print yytext to yyout)
17
LEX for TINY
    %{
    #include "globals.h"
    #include "util.h"
    #include "scan.h"
    /* lexeme of identifier or reserved word */
    char tokenString[MAXTOKENLEN+1];
    %}
    digit [0-9]
    number {digit}+
    letter [a-zA-Z]
    identifier {letter}+
    newline \n
    whitespace [ \t]
    %%
18
LEX for TINY (continued)
    "if" { return IF; }
    "then" { return THEN; }
    "else" { return ELSE; }
    "end" { return END; }
    "repeat" { return REPEAT; }
    "until" { return UNTIL; }
    "read" { return READ; }
    "write" { return WRITE; }
    ":=" { return ASSIGN; }
    "=" { return EQ; }
    "<" { return LT; }
    "+" { return PLUS; }
    "-" { return MINUS; }
    "*" { return TIMES; }
    "/" { return OVER; }
    "(" { return LPAREN; }
    ")" { return RPAREN; }
    ";" { return SEMI; }
19
LEX for TINY (continued)
    {number} { return NUM; }
    {identifier} { return ID; }
    {newline} { lineno++; }
    {whitespace} { /* skip whitespace */ }
    "{" { char c;
          do
          { c = input();
            if (c == '\n') lineno++;
          } while (c != '}');
        }
    . { return ERROR; }
    %%
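The comment rule reads characters until it sees a closing brace, so a comment that is never closed keeps the loop running past the end of the input. The sketch below shows the same loop with an end-of-input guard. It is an assumption-laden illustration: it works on a NUL-terminated string instead of Lex's input(), and CommentSkip and skip_comment are names invented here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t consumed;  /* characters read, including the closing '}' */
    int newlines;     /* how far lineno should advance */
    int terminated;   /* 1 if '}' was found, 0 if the input ran out */
} CommentSkip;

/* s points just past the opening '{' of a TINY comment. */
CommentSkip skip_comment(const char *s)
{
    CommentSkip r = { 0, 0, 0 };

    assert(s != NULL);
    while (s[r.consumed] != '\0') {      /* guard: stop at end of input */
        char c = s[r.consumed++];
        if (c == '\n')
            r.newlines++;                /* keep line numbers accurate */
        if (c == '}') {
            r.terminated = 1;
            break;
        }
    }
    return r;
}
```

A scanner built this way could return an error token when terminated is 0 instead of looping.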
20
LEX for TINY (continued)
    TokenType getToken(void)
    { static int firstTime = TRUE;
      TokenType currentToken;
      if (firstTime)
      { firstTime = FALSE;
        lineno++;
        yyin = source;
        yyout = listing;
      }
      currentToken = yylex();
      strncpy(tokenString, yytext, MAXTOKENLEN);
      if (TraceScan)
      { fprintf(listing, "\t%d: ", lineno);
        printToken(currentToken, tokenString);
      }
      return currentToken;
    }
21
Reference textbook
    John R. Levine, Tony Mason, Doug Brown, Lex & Yacc, 2nd Edition, O'Reilly, 1992
Examples
    examples.html
22
Exercises
Write a Lex input file for a program that counts the number of words in the input file.