Published by Michael Domenic Stevenson. Modified over 8 years ago.
1
LEX
SUNG-DONG KIM, DEPT. OF COMPUTER ENGINEERING, HANSUNG UNIVERSITY
2
LEX
1975, Lesk
Input: regular expressions + action code (tiny.l)
Output: C program (lex.yy.c or lexyy.c)
  Procedure yylex
  Table-driven implementation of a DFA
  Similar to "getToken"
Diagram: RE + action -> Lex -> Scanner (C code)
(2013-1) Compiler
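To make "table-driven implementation of a DFA" concrete, here is a minimal sketch in C. It is not the code Lex generates; the two-state table for [0-9]+ and the names transition, accepting, and dfa_match are assumptions made for illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical two-state DFA for the regular expression [0-9]+ .
   State 0 = start, state 1 = accepting, -1 = no transition. */
enum { N_STATES = 2, N_CLASSES = 2 };
enum { CLASS_DIGIT = 0, CLASS_OTHER = 1 };

static const int transition[N_STATES][N_CLASSES] = {
    /* state 0 */ { 1, -1 },
    /* state 1 */ { 1, -1 },
};
static const int accepting[N_STATES] = { 0, 1 };

/* Returns the length of the longest prefix of s accepted by the DFA,
   or 0 if no prefix is accepted. */
size_t dfa_match(const char *s)
{
    assert(s != NULL);
    int state = 0;
    size_t last_accept = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        int cls = isdigit((unsigned char)s[i]) ? CLASS_DIGIT : CLASS_OTHER;
        state = transition[state][cls];   /* one table lookup per character */
        if (state < 0)
            break;                        /* no transition: stop scanning */
        if (accepting[state])
            last_accept = i + 1;          /* remember the longest match so far */
    }
    return last_accept;
}
```

The scanning loop never changes; only the tables depend on the regular expressions, which is what lets a generator produce a scanner from a specification.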
3
LEX CONVENTION (1)
Metacharacters
Quotes: actual characters
  For non-metacharacters: "if" is the same as if
  For metacharacters: "(" matches a literal left parenthesis
Backslash
  \(\* is the same as "(*"
  \n, \t
Example: (aa|bb)(a|b)*c? is the same as ("aa"|"bb")("a"|"b")*"c"?
4
LEX CONVENTION (2)
[...]: any one of the enclosed characters
  [abxz]: any one of the characters a, b, x, z
  (aa|bb)[ab]*c?
Hyphen: ranges of characters
  [0-9]
5
LEX CONVENTION (3)
. represents a set of characters: any character except a newline
^ at the start of a character class: complementary set
  [^0-9abc]: any character that is not a digit and is not one of the letters a, b, c
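The two sets above can be written as plain character tests. This is a sketch of what the patterns mean, not of how Lex implements them; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* '.': any character except a newline. */
int matches_dot(int ch)
{
    return ch != '\n';
}

/* [^0-9abc]: any character that is neither a digit nor one of a, b, c. */
int matches_not_digit_abc(int ch)
{
    int is_digit = (ch >= '0' && ch <= '9');              /* the range 0-9 */
    int is_abc = (ch == 'a' || ch == 'b' || ch == 'c');   /* the listed letters */
    return !(is_digit || is_abc);                         /* ^ complements the set */
}

/* Length of the prefix of s matched by [^0-9abc]+ (0 if the first
   character is excluded by the class). */
size_t span_not_digit_abc(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0' && matches_not_digit_abc((unsigned char)s[n]))
        n++;
    return n;
}
```

Note that a complemented class such as [^0-9abc] includes the newline unless \n is listed inside the brackets, whereas . never matches a newline.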
6
LEX CONVENTION (4)
Square brackets
  Most metacharacters lose their special status inside them
  [-+] is the same as ("+"|"-")
  [+-]: may be read as a range of characters starting from "+", so list the hyphen first as in [-+]
  [."?]: any of the three characters ., ", ?
  [\^\\]: ^ or \
7
LEX CONVENTION (5)
Curly brackets: names of regular expressions
Regular definitions:
  nat = [0-9]+
  signedNat = ("+"|"-")? nat
Lex definitions:
  nat        [0-9]+
  signedNat  ("+"|"-")?{nat}
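A named definition works much like a helper function that another definition calls. The sketch below mirrors nat and signedNat as hand-written C matchers; the function names are hypothetical and this is not what Lex emits.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* nat = [0-9]+ : length of the digit run at the start of s (0 if none). */
size_t match_nat(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* signedNat = ("+"|"-")?{nat} : reuses match_nat, just as the Lex
   definition reuses {nat}. Returns the match length, 0 if no match. */
size_t match_signed_nat(const char *s)
{
    assert(s != NULL);
    size_t sign = (s[0] == '+' || s[0] == '-') ? 1 : 0;   /* optional sign */
    size_t digits = match_nat(s + sign);
    return digits > 0 ? sign + digits : 0;                /* a sign alone does not match */
}
```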
8
FORMAT OF LEX INPUT (1)
Input file = regular expressions + C code
Definitions
  Any C code that must be inserted external to any function: %{ ... %}
  Names of regular expressions
Rules
  Regular expressions + C code (actions)
Auxiliary routines (optional)
  C code + main program (if needed)
9
FORMAT OF LEX INPUT (2)
Layout
  {definitions}
  %%
  {rules}
  %%
  {auxiliary routines}
10
EXAMPLE 1: SCANNER THAT ADDS LINE NUMBERS TO TEXT
%{
/* a Lex program that adds line numbers
   to lines of text, printing the new text
   to the standard output */
#include <stdio.h>
int lineno = 1;
%}
line .*\n
%%
{line} { printf("%5d %s", lineno++, yytext); }
%%
int main(void)
{ yylex(); return 0; }
11
%{
/* a Lex program that changes all numbers
   from decimal to hexadecimal notation,
   printing a summary statistic to stderr */
#include <stdlib.h>
#include <stdio.h>
int count = 0;
%}
digit [0-9]
number {digit}+
%%
{number} { int n = atoi(yytext);
           printf("%x", n);
           if (n > 9) count++; }
%%
12
int main(void)
{
  yylex();
  fprintf(stderr, "number of replacements = %d", count);
  return 0;
}
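For comparison, the same decimal-to-hexadecimal transformation can be sketched by hand in plain C, which shows the work the generated scanner takes over. The function dec_to_hex and its buffer-based interface are assumptions for illustration; unlike the Lex action, it uses strtoul rather than atoi and writes to a string instead of standard output.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: rewrites every run of decimal digits in `in` as
   hexadecimal and copies every other character unchanged, which is what
   Lex's default action does for unmatched input. Returns the number of
   values greater than 9 (the condition the Lex action counts), or -1 if
   `out` is too small. */
int dec_to_hex(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL && out_size > 0);
    int count = 0;
    size_t used = 0;
    out[0] = '\0';
    while (*in != '\0') {
        if (isdigit((unsigned char)*in)) {
            char *end;
            unsigned long n = strtoul(in, &end, 10);   /* consumes the whole digit run */
            int written = snprintf(out + used, out_size - used, "%lx", n);
            if (written < 0 || (size_t)written >= out_size - used)
                return -1;
            used += (size_t)written;
            if (n > 9)
                count++;
            in = end;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *in++;                       /* copy unmatched character */
            out[used] = '\0';
        }
    }
    return count;
}
```

The Lex version needs only the pattern {number} and a three-line action; the digit-run detection and the copying of all other characters come from the generated scanner.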
13
%{
/* Selects only lines that end or begin with the letter 'a'.
   Deletes everything else. */
#include <stdio.h>
%}
ends_with_a .*a\n
begins_with_a a.*\n
%%
{ends_with_a} ECHO;
{begins_with_a} ECHO;
.*\n ;
%%
int main(void)
{ yylex(); return 0; }
14
SUMMARY (1)
Ambiguity resolution
  Principle of longest substring: the rule that matches the longest string is chosen
  Substrings of equal length: the rule listed first in the file is chosen
  No match: the next character is copied to the output and scanning continues
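The first two resolution principles can be sketched as a small selection loop in C. The rule table, the token names TOK_IF and TOK_ID, and the function next_token are hypothetical; a real Lex scanner reaches the same decision through its DFA rather than by trying each rule in turn.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOK_NONE, TOK_IF, TOK_ID } Token;   /* hypothetical token names */

/* Each rule returns the length of the prefix of s it matches (0 = no match). */
static size_t rule_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

static size_t rule_id(const char *s)
{
    size_t n = 0;
    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules in the order they would be listed in the Lex file. */
static const struct { size_t (*match)(const char *); Token token; } rules[] = {
    { rule_if, TOK_IF },
    { rule_id, TOK_ID },
};

/* Picks the rule with the longest match; on a tie the earlier rule wins
   because a later rule must be strictly longer to replace it. */
Token next_token(const char *s, size_t *length)
{
    assert(s != NULL && length != NULL);
    Token best = TOK_NONE;
    *length = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i].match(s);
        if (n > *length) {
            *length = n;
            best = rules[i].token;
        }
    }
    return best;
}
```

With these rules, "if" yields TOK_IF (equal length, first rule wins) while "iffy" yields TOK_ID (longest match). A result of TOK_NONE corresponds to the no-match case, where the caller would copy one character to the output and continue.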
15
SUMMARY (2)
Insertion of C code
  %{ ... %}: copied exactly to the output file
  Auxiliary procedure section: copied exactly at the end of the output file
  Any code following a regular expression (action): inserted at the appropriate place in yylex
16
LEX INTERNAL NAMES
lex.yy.c or lexyy.c: Lex output file name
yylex: Lex scanning routine
yytext: string matched on current action
yyin: Lex input file (default: stdin)
yyout: Lex output file (default: stdout)
input: Lex buffered input routine
ECHO: Lex default action (print yytext to yyout)
17
LEX FOR TINY
%{
#include "globals.h"
#include "util.h"
#include "scan.h"
/* lexeme of identifier or reserved word */
char tokenString[MAXTOKENLEN+1];
%}
digit       [0-9]
number      {digit}+
letter      [a-zA-Z]
identifier  {letter}+
newline     \n
whitespace  [ \t]
%%
18
"if"      { return IF; }
"then"    { return THEN; }
"else"    { return ELSE; }
"end"     { return END; }
"repeat"  { return REPEAT; }
"until"   { return UNTIL; }
"read"    { return READ; }
"write"   { return WRITE; }
":="      { return ASSIGN; }
"="       { return EQ; }
"<"       { return LT; }
"+"       { return PLUS; }
"-"       { return MINUS; }
"*"       { return TIMES; }
"/"       { return OVER; }
"("       { return LPAREN; }
")"       { return RPAREN; }
";"       { return SEMI; }
19
{number}      { return NUM; }
{identifier}  { return ID; }
{newline}     { lineno++; }
{whitespace}  { /* skip whitespace */ }
"{"           { char c;
                do
                { c = input();
                  if (c == '\n') lineno++;
                } while (c != '}');
              }
.             { return ERROR; }
%%
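The comment-skipping action above loops until it sees '}' and has no check for end of input, so an unterminated comment would appear to keep reading indefinitely. The sketch below restates the loop over a string and adds that check; skip_comment and its parameters are hypothetical, and it reads from a string rather than through Lex's input().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-alone version of the "{" action. `s` points just
   past the opening brace. Counts newlines in *lineno and returns the
   number of characters consumed, including the closing '}'.
   *closed is set to 1 if '}' was found, 0 if the text ended first. */
size_t skip_comment(const char *s, int *lineno, int *closed)
{
    assert(s != NULL && lineno != NULL && closed != NULL);
    size_t i = 0;
    *closed = 0;
    while (s[i] != '\0') {          /* stop at end of input, not only at '}' */
        char c = s[i++];
        if (c == '\n') {
            (*lineno)++;            /* keep line numbers accurate inside comments */
        } else if (c == '}') {
            *closed = 1;
            break;
        }
    }
    return i;
}
```

In a Lex action the equivalent guard would test the end-of-input value returned by input(), which differs between implementations, so the exact comparison is left out here.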
20
TokenType getToken(void)
{ static int firstTime = TRUE;
  TokenType currentToken;
  if (firstTime)
  { firstTime = FALSE;
    lineno++;
    yyin = source;
    yyout = listing;
  }
  currentToken = yylex();
  strncpy(tokenString, yytext, MAXTOKENLEN);
  if (TraceScan)
  { fprintf(listing, "\t%d: ", lineno);
    printToken(currentToken, tokenString);
  }
  return currentToken;
}
21
REFERENCES
Textbook: Lex & Yacc, 2nd Edition, John R. Levine, Tony Mason, Doug Brown, O'Reilly, 1992
Examples: http://myweb.stedwards.edu/laurab/cosc4342/lex-examples.html
Descriptions of lex functions, etc.: http://docs.sun.com/app/docs/doc/801-6734/6i13drksb?l=ko&a=view
22
EXERCISES
Write a Lex input file for a program that counts the number of words in the input file.
Write a Lex input file for a program that counts the number of sentences in the input file.