Published by Bryce Underwood. Modified over 9 years ago.
1
Lexical Analysis
by Neng-Fa Zhou
• Why separate lexical and syntax analyses?
  – simpler design
  – efficiency
  – portability
2
Tokens, Patterns, Lexemes
  – Tokens: terminal symbols in the grammar
  – Patterns: descriptions of classes of tokens
  – Lexemes: words in the source program
3
Languages
  – Fixed and finite alphabet (vocabulary)
  – Finite-length sentences
  – Possibly infinite number of sentences
• Examples
  – Natural numbers {1,2,3,...,10,11,...}
  – Strings over {a,b}: a^n b a^n
• Terms on parts of a string
  – prefix, suffix, substring, proper ...
4
Operations on Languages
5
Examples
  L = {A,B,...,Z,a,b,...,z}
  D = {0,1,...,9}
  – L ∪ D: the set of letters and digits
  – LD: a letter followed by a digit
  – L^4: four-letter strings
  – L*: all strings of letters, including ε (the empty string)
  – L(L ∪ D)*: strings of letters and digits beginning with a letter
  – D+: strings of one or more digits
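The operations used in these examples can be tried out on small finite languages. The following is a minimal sketch, not part of the original slides: the helper names `concat`, `power`, and `closure_up_to` are made up for illustration, the two-element sets stand in for the full letter and digit sets, and L* is only approximated up to a fixed length because the real closure is infinite.

```python
from itertools import product

def concat(A, B):
    """Concatenation AB: every string of A followed by every string of B."""
    return {x + y for x, y in product(A, B)}

def power(A, n):
    """A^n: n-fold concatenation; A^0 contains only the empty string."""
    result = {""}
    for _ in range(n):
        result = concat(result, A)
    return result

def closure_up_to(A, k):
    """Finite approximation of A*: the union of A^0, A^1, ..., A^k."""
    out = set()
    for i in range(k + 1):
        out |= power(A, i)
    return out

# Tiny stand-ins for the letter set L and the digit set D (assumption).
L = {"a", "b"}
D = {"0", "1"}

print(sorted(L | D))               # union: letters and digits
print(sorted(concat(L, D)))        # LD: a letter followed by a digit
print(len(power(L, 4)))            # number of four-letter strings over L
print("" in closure_up_to(L, 3))   # L* includes the empty string
```

With a two-letter alphabet, L^4 has 2^4 = 16 strings, which shows how quickly these sets grow.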
6
Regular Expression (RE)
  – ε is a RE
  – a symbol in Σ is a RE
• Let r and s be REs.
  – (r) | (s): or
  – (r)(s): concatenation
  – (r)*: zero or more instances
  – (r)+: one or more instances
  – (r)?: zero or one instance
7
Precedence of Operators
  high   r*  r+  r?
         rs
  low    r|s
  all left associative
• Examples (Σ = {a,b})
  1. a|b
  2. (a|b)(a|b)
  3. a*
  4. (a|b)*
  5. a|a*b
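The precedence rules decide how an unparenthesized expression is grouped. As a quick check, and assuming Python's `re` module follows the same precedence for these three operators, the sketch below compares each expression with the grouping it does not mean. The helper `matches` is a name introduced here for illustration.

```python
import re

def matches(pattern, s):
    """True if the whole string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None

# * binds tighter than concatenation: ab* means a(b*), not (ab)*.
print(matches(r"ab*", "abb"), matches(r"(ab)*", "abb"))      # True False

# Concatenation binds tighter than |: a|a*b means a|((a*)b), not (a|a*)b.
print(matches(r"a|a*b", "a"), matches(r"(a|a*)b", "a"))      # True False
print(matches(r"a|a*b", "aab"), matches(r"(a|a*)b", "aab"))  # True True
```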
8
Algebraic Properties of RE
9
Regular Definitions
  d1 -> r1
  d2 -> r2
  ...
  dn -> rn
  – each ri is a RE over Σ ∪ {d1, d2, ..., d(i-1)}
  – not recursive
10
Examples
• Identifiers
    letter -> A | B | ... | Z | a | b | ... | z
    digit  -> 0 | 1 | ... | 9
    id     -> letter ( letter | digit )*
• Decimal integers in Java
    DecimalNumeral -> 0 | nonZeroDigit digit*
• Hexadecimal integers
    HexaNumeral -> (0x | 0X) hexadigit+
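Because regular definitions are not recursive, each name can be expanded by plain textual substitution. The sketch below mirrors that idea with Python string formatting; the variable names and the character classes chosen for `nonZeroDigit` and `hexadigit` are assumptions, and the hexadecimal pattern requires at least one digit after the prefix, as Java does.

```python
import re

# Each definition may only refer to names defined before it.
letter = r"[A-Za-z]"
digit = r"[0-9]"
ident = rf"{letter}(?:{letter}|{digit})*"

non_zero_digit = r"[1-9]"           # assumed meaning of nonZeroDigit
decimal_numeral = rf"(?:0|{non_zero_digit}{digit}*)"

hexa_digit = r"[0-9A-Fa-f]"         # assumed meaning of hexadigit
hexa_numeral = rf"(?:0x|0X){hexa_digit}+"

def is_in(pattern, s):
    """True if the whole string s matches the expanded definition."""
    return re.fullmatch(pattern, s) is not None

for text in ["x1", "1x", "0", "120", "012", "0x1F", "0x"]:
    print(text,
          is_in(ident, text),
          is_in(decimal_numeral, text),
          is_in(hexa_numeral, text))
```

Note that `012` is rejected as a decimal numeral: the definition allows a leading zero only for the number 0 itself.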
11
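Example-1
%{
#include <stdio.h>
int num_lines = 0, num_chars = 0;   /* counters updated by the rules */
%}
%%
\n      { ++num_lines; ++num_chars; }
.       { ++num_chars; }
%%
int main(void) {
    yylex();
    printf("# of lines = %d, # of chars = %d\n", num_lines, num_chars);
    return 0;
}
/* return 1 so that yylex() stops at end of input */
int yywrap(void) { return 1; }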
12
Example-2
D    [0-9]
INT  {D}{D}*
%%
{INT}("."{INT}((e|E)("+"|-)?{INT})?)?   { printf("valid %s\n", yytext); }
.                                       { printf("unrecognized %s\n", yytext); }
%%
int main(int argc, char *argv[]) {
    ++argv, --argc;   /* skip the program name */
    if (argc > 0) yyin = fopen(argv[0], "r");
    else yyin = stdin;
    yylex();
    return 0;
}
/* return 1 so that yylex() stops at end of input */
int yywrap(void) { return 1; }
13
java.util.regex
import java.util.regex.*;

class Number {
    public static void main(String[] args) {
        // digits, then an optional fraction that may carry an exponent
        String regExNum = "\\d+(\\.\\d+((e|E)(\\+|-)?\\d+)?)?";
        if (Pattern.matches(regExNum, args[0]))
            System.out.println("valid");
        else
            System.out.println("invalid");
    }
}
14
String Pattern Matching in Perl
print "Input a string :";
$_ = <STDIN>;
chomp($_);
if (/^[0-9]+(\.[0-9]+((e|E)(\+|-)?[0-9]+)?)?$/) {
    print "valid\n";
} else {
    print "invalid\n";
}
15
Finite Automata
• Nondeterministic finite automaton (NFA)
  NFA = (S, T, s0, F)
  – S: a set of states
  – T: a transition mapping
  – s0: the start state
  – F: final states or accepting states
16
Example
17
Deterministic Finite Automata (DFA)
  – T: a transition function
  – There is only one arc going out from each node on each symbol.
18
Simulating a DFA
  s = s0;
  c = nextchar;
  while (c != eof) {
      s = move(s,c);
      c = nextchar;
  }
  if (s is in F) return "yes";
  else return "no";
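The loop above translates almost line for line into a table-driven function. The sketch below is one possible rendering, under the assumption that the transition function is stored as a dictionary keyed by (state, character). The two-state DFA for (a|b)*a is a hand-built example, not one taken from the slides, and a missing table entry is treated as rejection.

```python
def simulate_dfa(move, start, accepting, text):
    """Run the DFA over text; return True if it ends in an accepting state."""
    s = start
    for c in text:
        s = move.get((s, c))
        if s is None:          # no transition defined: reject (assumption)
            return False
    return s in accepting

# Hand-built DFA for (a|b)*a: state 1 means "the last character read was a".
MOVE = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 0,
}

for text in ["a", "ba", "ab", "", "abc"]:
    print(repr(text), "yes" if simulate_dfa(MOVE, 0, {1}, text) else "no")
```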
19
From RE to NFA
  – ε
  – a in Σ
  – s|t
20
From RE to NFA (cont.)
  – st
  – s*
21
Example (a|b)*a
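The construction cases for a symbol, s|t, st, and s* can be sketched as a small builder and then applied to (a|b)*a. This is an illustrative sketch, not the slides' own code: the class `NFA` and its method names are made up here, the regular expression is assembled by hand rather than parsed, and concatenation links the two fragments with an ε-edge instead of merging the two states as the textbook construction does.

```python
from itertools import count

class NFA:
    """Transition table shared by all fragments; a fragment is (start, accept)."""
    def __init__(self):
        self.edges = {}            # state -> list of (label, target); None = epsilon
        self._ids = count()

    def new_state(self):
        s = next(self._ids)
        self.edges[s] = []
        return s

    def symbol(self, a):           # a in the alphabet
        s, f = self.new_state(), self.new_state()
        self.edges[s].append((a, f))
        return s, f

    def union(self, x, y):         # x|y
        s, f = self.new_state(), self.new_state()
        for start, accept in (x, y):
            self.edges[s].append((None, start))
            self.edges[accept].append((None, f))
        return s, f

    def concat(self, x, y):        # xy, joined by an epsilon edge (simplification)
        self.edges[x[1]].append((None, y[0]))
        return x[0], y[1]

    def star(self, x):             # x*
        s, f = self.new_state(), self.new_state()
        self.edges[s] += [(None, x[0]), (None, f)]
        self.edges[x[1]] += [(None, x[0]), (None, f)]
        return s, f

def eps_closure(nfa, states):
    """All states reachable from `states` using epsilon edges only."""
    stack, seen = list(states), set(states)
    while stack:
        s = stack.pop()
        for label, t in nfa.edges[s]:
            if label is None and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def accepts(nfa, frag, text):
    """Simulate the NFA by tracking the set of states it could be in."""
    start, accept = frag
    current = eps_closure(nfa, {start})
    for c in text:
        moved = {t for s in current for label, t in nfa.edges[s] if label == c}
        current = eps_closure(nfa, moved)
    return accept in current

n = NFA()
frag = n.concat(n.star(n.union(n.symbol("a"), n.symbol("b"))), n.symbol("a"))
for text in ["a", "ba", "abba", "", "ab"]:
    print(repr(text), accepts(n, frag, text))
```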
22
Building Lexical Analyzer
  RE -> NFA -> DFA -> Emulator
  – RE -> NFA: Algorithm 3.23 (Thompson's construction)
  – NFA -> DFA: Algorithm 3.32 (Subset construction)
23
Conversion of an NFA into a DFA
• Intuition
  – move(s,a) is a function in a DFA
  – move(s,a) is a mapping in an NFA
  – A state reachable from s0 in the DFA on an input string corresponds to the set of states in the NFA that are reachable on the same string.
24
Computation of ε-Closure
  ε-Closure(T): the set of NFA states reachable from some NFA state s in T by ε-transitions alone.
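The definition can be computed with a simple worklist. The sketch below assumes the ε-transitions are given as a dictionary from a state to the set of states one ε-step away; the sample table `EPS` is a hypothetical NFA fragment chosen for illustration, not the one drawn on the slide.

```python
def epsilon_closure(eps, T):
    """States reachable from any state in T using epsilon-transitions alone."""
    closure = set(T)          # every state reaches itself with zero steps
    stack = list(T)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

# Hypothetical epsilon-transition table: state -> states one epsilon-step away.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}

print(sorted(epsilon_closure(EPS, {0})))   # [0, 1, 2, 4, 7]
print(sorted(epsilon_closure(EPS, {3})))   # [1, 2, 3, 4, 6, 7]
```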
25
From an NFA to a DFA (The subset construction)
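A compact way to see the subset construction at work is to treat each DFA state as a frozen set of NFA states and explore from the ε-closure of the start state. The following sketch makes several assumptions: the function and parameter names are invented for illustration, the NFA is supplied as two dictionaries (`eps` for ε-moves, `delta` for symbol moves), empty state sets are dropped instead of being kept as a dead state, and the sample NFA is the usual textbook automaton for (a|b)*abb rather than the one pictured on the slide.

```python
from collections import deque

def eps_closure(eps, states):
    """States reachable from `states` by epsilon-transitions alone."""
    closure, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for t in eps.get(s, ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def subset_construction(eps, delta, start, accepting, alphabet):
    """Return (DFA start, DFA transitions, DFA accepting states, all DFA states)."""
    d_start = frozenset(eps_closure(eps, {start}))
    dtran, seen, work = {}, {d_start}, deque([d_start])
    while work:
        T = work.popleft()
        for a in alphabet:
            moved = {t for s in T for t in delta.get((s, a), ())}
            U = frozenset(eps_closure(eps, moved))
            if not U:
                continue               # no move on a: leave the entry undefined
            dtran[(T, a)] = U
            if U not in seen:
                seen.add(U)
                work.append(U)
    d_accepting = {T for T in seen if T & accepting}
    return d_start, dtran, d_accepting, seen

# Assumed sample NFA for (a|b)*abb with states 0..10 and accepting state 10.
EPS = {0: {1, 7}, 1: {2, 4}, 3: {6}, 5: {6}, 6: {1, 7}}
DELTA = {(2, "a"): {3}, (4, "b"): {5}, (7, "a"): {8}, (8, "b"): {9}, (9, "b"): {10}}

d_start, dtran, d_accepting, d_states = subset_construction(EPS, DELTA, 0, {10}, "ab")
print(len(d_states), "DFA states,", len(d_accepting), "accepting")
print(sorted(dtran[(d_start, "a")]))
```

Using `frozenset` lets a set of NFA states serve directly as a dictionary key, which is the whole point of the construction: a DFA state is a set of NFA states.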
26
Example NFA DFA
27
Algorithm 3.39
  Π = {F, S-F};
  do begin
      Πnew = Π;
      for each group G in Π do begin
          partition G into subgroups such that two states s and t of G
          are in the same subgroup iff for all input symbols a, s and t
          have transitions on a to states in the same group of Π;
          replace G in Πnew by the set of all subgroups formed;
      end
      if (Πnew = Π) return;
      Π = Πnew;
  end;
28
Example
  state   a    b
  AC      B    AC
  B       B    D
  D       B    E
  E       B    AC
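The partition-refinement idea behind state minimization can be sketched in a few lines. The input DFA below is an assumption: it is the usual five-state textbook DFA for (a|b)*abb with states A to E and accepting state E, which appears to be what the merged state AC in the table comes from. The function name `minimize` and the dictionary layout are likewise invented for illustration, and the sketch requires a transition for every state and symbol.

```python
def minimize(states, alphabet, move, accepting):
    """Refine {F, S-F} until no group can be split; return the final groups."""
    partition = [g for g in (set(accepting), set(states) - set(accepting)) if g]
    while True:
        group_of = {s: i for i, g in enumerate(partition) for s in g}
        new_partition = []
        for g in partition:
            # States stay together iff they go to the same groups on every symbol.
            buckets = {}
            for s in g:
                key = tuple(group_of[move[(s, a)]] for a in alphabet)
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
        if len(new_partition) == len(partition):   # nothing was split
            return new_partition
        partition = new_partition

# Assumed input: the textbook DFA for (a|b)*abb before minimization.
MOVE = {
    ("A", "a"): "B", ("A", "b"): "C",
    ("B", "a"): "B", ("B", "b"): "D",
    ("C", "a"): "B", ("C", "b"): "C",
    ("D", "a"): "B", ("D", "b"): "E",
    ("E", "a"): "B", ("E", "b"): "C",
}

groups = minimize("ABCDE", "ab", MOVE, {"E"})
print(sorted("".join(sorted(g)) for g in groups))   # ['AC', 'B', 'D', 'E']
```

Since refinement only ever splits groups, comparing the number of groups before and after a pass is enough to detect that the partition has stopped changing.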
29
Construct a DFA Directly from a Regular Expression
30
Implementation Issues
• Input buffering
  – Read in characters one by one
      Unable to look ahead
      Inefficient
  – Read in a whole string and store it in memory
      Requires a big buffer
  – Buffer pairs
31
Buffer Pairs
32
Use Sentinels
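The slide's figures are not available in this transcript, so the following is only a sketch of how a buffer pair with sentinels might work, under stated assumptions: two halves of n characters each, a sentinel character after each half, and a NUL character as the sentinel (assumed never to occur in the source text). The class name `BufferPair` and its methods are hypothetical. The benefit of the sentinel is that the common case needs a single comparison per character; the check for which boundary was reached happens only when the sentinel is seen.

```python
import io

EOF = "\0"   # sentinel; assumed never to appear in the source text

class BufferPair:
    """Layout: [half 0: n chars][EOF][half 1: n chars][EOF]."""
    def __init__(self, stream, n=4):
        self.stream = stream
        self.n = n
        self.buf = [EOF] * (2 * n + 2)
        self._load(0)
        self.forward = 0

    def _load(self, half):
        """Refill one half; unused slots are padded with the sentinel."""
        start = half * (self.n + 1)
        chunk = self.stream.read(self.n)
        for i in range(self.n):
            self.buf[start + i] = chunk[i] if i < len(chunk) else EOF

    def next_char(self):
        """Return the next character, or None at the end of the input."""
        c = self.buf[self.forward]
        if c != EOF:                          # common case: one test only
            self.forward += 1
            return c
        if self.forward == self.n:            # sentinel that ends half 0
            self._load(1)
            self.forward = self.n + 1
            return self.next_char()
        if self.forward == 2 * self.n + 1:    # sentinel that ends half 1
            self._load(0)
            self.forward = 0
            return self.next_char()
        return None                           # sentinel inside a half: real end

bp = BufferPair(io.StringIO("hello world"), n=4)
chars = []
while (c := bp.next_char()) is not None:
    chars.append(c)
print("".join(chars))   # hello world
```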
33
Lexical Analyzer
34
Lex
• A tool for automatically generating lexical analyzers
35
Lex Specifications
  declarations
  %%
  translation rules
  %%
  auxiliary procedures

  Translation rules:
  p1 {action1}
  p2 {action2}
  ...
  pn {actionn}
36
Lex Regular Expressions
37
yylex()
  yylex(){
      switch (pattern_match()){
          case 1: {action1}
          case 2: {action2}
          ...
          case n: {actionn}
      }
  }
38
Example
%{
#include <stdio.h>
#include <stdlib.h>   /* atoi, atof */
%}
DIGIT  [0-9]
ID     [a-z][a-z0-9]*
%%
{DIGIT}+              { printf("An integer:%s(%d)\n", yytext, atoi(yytext)); }
{DIGIT}+"."{DIGIT}*   { printf("A float: %s (%g)\n", yytext, atof(yytext)); }
if|then|begin|end|procedure|function  { printf("A keyword: %s\n", yytext); }
{ID}                  { printf("An identifier %s\n", yytext); }
"+"|"-"|"*"|"/"       { printf("An operator %s\n", yytext); }
"{"[^}\n]*"}"         { /* eat up one-line comments */ }
[ \t\n]+              { /* eat up white space */ }
.                     { printf("Unrecognized character: %s\n", yytext); }
%%
int main(int argc, char *argv[]) {
    ++argv, --argc;   /* skip the program name */
    if (argc > 0) yyin = fopen(argv[0], "r");
    else yyin = stdin;
    yylex();
    return 0;
}
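Lex resolves overlaps between rules by taking the longest match and, among matches of equal length, the rule listed first. That is why `if` is reported as a keyword while `ifx` is an identifier, and why `3.14` is a float even though the integer rule comes first. The sketch below imitates this behavior with Python's `re` module; it is an approximation for illustration, not how Lex is implemented, and the rule names and the `tokenize` function are invented here.

```python
import re

# Same rules and order as the specification above (names are illustrative).
RULES = [
    ("integer",    re.compile(r"[0-9]+")),
    ("float",      re.compile(r"[0-9]+\.[0-9]*")),
    ("keyword",    re.compile(r"if|then|begin|end|procedure|function")),
    ("identifier", re.compile(r"[a-z][a-z0-9]*")),
    ("operator",   re.compile(r"[+\-*/]")),
    ("comment",    re.compile(r"\{[^}\n]*\}")),
    ("space",      re.compile(r"[ \t\n]+")),
    ("unknown",    re.compile(r".")),
]

def tokenize(text):
    """Return (rule name, lexeme) pairs, skipping comments and white space."""
    pos, out = 0, []
    while pos < len(text):
        best_name, best_len = None, 0
        for name, rx in RULES:
            m = rx.match(text, pos)
            # Strictly longer wins, so the earliest rule wins any tie.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        lexeme = text[pos:pos + best_len]
        if best_name not in ("comment", "space"):
            out.append((best_name, lexeme))
        pos += best_len
    return out

for token in tokenize("if ifx 3.14 42 + {c} ?"):
    print(token)
```

Every character is covered by some rule (white space by the space rule, anything else at least by the final catch-all), so the scanner always advances.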