Lex
Fall 2006, Costas Busch - RPI
Lex: a lexical analyzer

A Lex program recognizes strings.
For each kind of string found, the Lex program takes an action.
Input to the Lex program:

    Var = 12 + 9;
    if (test > 20)
        temp = 0;
    else
        while (a < 20)
            temp++;

Output:

    Identifier: Var
    Operator: =
    Integer: 12
    Operator: +
    Integer: 9
    Semicolon: ;
    Keyword: if
    Parenthesis: (
    Identifier: test
    ....
In Lex, strings are described with regular expressions.

Lex program (regular expressions):

    "if"
    "then"      /* keywords */

    "+"
    "-"
    "="         /* operators */
Lex program (regular expressions):

    (0|1|2|3|4|5|6|7|8|9)+      /* integers */
    (a|b|..|z|A|B|...|Z)+       /* identifiers */
Integers: the character class [0-9]+ is shorthand for

    (0|1|2|3|4|5|6|7|8|9)+
Identifiers: the character class [a-zA-Z]+ is shorthand for

    (a|b|..|z|A|B|...|Z)+
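The equivalence between the long alternation and the character class can be checked outside Lex. The following is a minimal sketch in C that assumes a POSIX system with `<regex.h>`; the helper names `full_match` and `digit_forms_agree` are hypothetical and are not part of Lex. The `^` and `$` anchors are added so that the whole string must match.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if the whole string s matches the POSIX extended regular
   expression in pattern. Hypothetical helper, not a Lex function. */
bool full_match(const char *pattern, const char *s)
{
    regex_t re;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;                       /* pattern failed to compile */
    bool ok = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return ok;
}

/* Both spellings of "one or more digits" should give the same answer
   for any input string. */
bool digit_forms_agree(const char *s)
{
    return full_match("^(0|1|2|3|4|5|6|7|8|9)+$", s)
        == full_match("^[0-9]+$", s);
}
```

Calling `digit_forms_agree` on strings such as "2006", "12a", or the empty string should return true each time, since the two patterns describe the same language.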
Each regular expression has an associated action (in C code).

Examples:

    Regular expression      Action
    \n                      linenum++;
    [a-zA-Z]+               printf("identifier");
    [0-9]+                  printf("integer");
Default action: ECHO;

ECHO prints the matched string to the output.
A small Lex program

    %%
    [a-zA-Z]+    printf("Identifier\n");
    [0-9]+       printf("Integer\n");
    [ \t\n]      ;   /* skip spaces */
Input:

    test var

Output:

    Integer
    Identifier
    Integer
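To make the behavior of the small program concrete, here is a hand-written C sketch that imitates its three rules. It is not the code Lex generates; the names `TokenKind`, `next_token`, and `scan` are assumptions made for illustration, and `isalpha` stands in for [a-zA-Z] on the assumption that the default "C" locale is in use. Characters that match no rule are copied to the output, which mirrors the default ECHO action.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { TOK_IDENTIFIER, TOK_INTEGER, TOK_SPACE, TOK_OTHER, TOK_END } TokenKind;

/* Classify the token that starts at s and store its length in *len. */
TokenKind next_token(const char *s, size_t *len)
{
    size_t n = 0;
    if (s[0] == '\0') { *len = 0; return TOK_END; }
    if (isalpha((unsigned char)s[0])) {            /* [a-zA-Z]+ */
        while (isalpha((unsigned char)s[n])) n++;
        *len = n;
        return TOK_IDENTIFIER;
    }
    if (isdigit((unsigned char)s[0])) {            /* [0-9]+ */
        while (isdigit((unsigned char)s[n])) n++;
        *len = n;
        return TOK_INTEGER;
    }
    *len = 1;
    if (s[0] == ' ' || s[0] == '\t' || s[0] == '\n')
        return TOK_SPACE;                          /* [ \t\n] */
    return TOK_OTHER;                              /* no rule matches */
}

/* Run the three rules over input, writing to out.
   Returns how many identifiers and integers were reported. */
int scan(const char *input, FILE *out)
{
    int reported = 0;
    size_t len;
    TokenKind kind;
    while ((kind = next_token(input, &len)) != TOK_END) {
        if (kind == TOK_IDENTIFIER) { fputs("Identifier\n", out); reported++; }
        else if (kind == TOK_INTEGER) { fputs("Integer\n", out); reported++; }
        else if (kind == TOK_OTHER) fputc(input[0], out);   /* default ECHO */
        /* TOK_SPACE: empty action, nothing is printed */
        input += len;
    }
    return reported;
}
```

For instance, `scan("test 42", stdout)` prints Identifier and then Integer, each on its own line, and returns 2.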
Another program

    %{
    int linenum = 1;
    %}
    %%
    [a-zA-Z]+    printf("Identifier\n");
    [0-9]+       printf("Integer\n");
    [ \t]        ;   /* skip spaces */
    \n           linenum++;
    .            printf("Error in line: %d\n", linenum);
Input:

    test var temp

Output:

    Integer
    Identifier
    Integer
    Error in line: 3
    Identifier
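The interplay of the newline rule and the catch-all `.` rule can be sketched in plain C. This is an illustration under assumptions, not generated Lex code: the function name `first_error_line` is hypothetical, and it stops at the first offending character instead of printing a message for each one.

```c
#include <ctype.h>

/* Returns the line number (counting from 1) of the first character that is
   not a letter, digit, space, tab, or newline. Returns 0 if there is none. */
int first_error_line(const char *input)
{
    int linenum = 1;                        /* same role as linenum in the slide */
    for (const char *p = input; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (c == '\n')
            linenum++;                      /* rule: \n   linenum++; */
        else if (!isalpha(c) && !isdigit(c) && c != ' ' && c != '\t')
            return linenum;                 /* rule: .    report an error */
    }
    return 0;
}
```

Because `.` in Lex does not match a newline, the newline rule and the catch-all rule never compete for the same character, which is why the counter stays accurate.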
Lex matches the longest input string.

Example:

    Regular expressions:    "if"    "ifend"
    Input:                  ifend if
    Matches:                "ifend"    "if"
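The longest-match rule can be illustrated with a small C sketch restricted to literal patterns such as "if" and "ifend". The function name `longest_match` and its signature are assumptions for this example. When two patterns match the same length, the earlier one is kept, which is how Lex resolves ties between rules.

```c
#include <stddef.h>
#include <string.h>

/* Among literal patterns, find the one matching the longest prefix of input.
   Returns its index, or -1 if none matches. The matched length is stored in
   *matched_len. A later pattern must be strictly longer to replace an
   earlier one, so ties go to the earlier pattern. */
int longest_match(const char *input, const char *const patterns[],
                  size_t count, size_t *matched_len)
{
    int best = -1;
    size_t best_len = 0;
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(patterns[i]);
        if (len > best_len && strncmp(input, patterns[i], len) == 0) {
            best = (int)i;
            best_len = len;
        }
    }
    *matched_len = best_len;
    return best;
}
```

With the patterns {"if", "ifend"}, the input "ifend if" selects index 1 with length 5, and the remaining "if" (after the space is skipped) selects index 0 with length 2.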
Internal structure of Lex

    Regular expressions -> NFA -> DFA -> Minimal DFA

The final states of the DFA are associated with actions.
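To show what "final states associated with actions" can look like, here is a hand-built DFA in C for the two patterns [a-zA-Z]+ and [0-9]+. It is only a sketch: Lex generates its automaton and tables automatically, and the state names, the `Action` type, and the `run_dfa` function are assumptions made for this example. The loop remembers the last final state it passed through, so it also returns the longest match.

```c
#include <ctype.h>
#include <stddef.h>

enum { START = 0, IN_IDENT = 1, IN_INT = 2, DEAD = -1 };
typedef enum { ACT_NONE, ACT_IDENTIFIER, ACT_INTEGER } Action;

/* Action attached to each state; only the final states have one. */
static const Action action_of[] = { ACT_NONE, ACT_IDENTIFIER, ACT_INTEGER };

/* Transition function of the DFA. */
static int step(int state, unsigned char c)
{
    if (isalpha(c) && (state == START || state == IN_IDENT)) return IN_IDENT;
    if (isdigit(c) && (state == START || state == IN_INT))   return IN_INT;
    return DEAD;
}

/* Run the DFA from the start of s. Returns the action of the last final
   state reached and stores the length of the matched prefix in *len. */
Action run_dfa(const char *s, size_t *len)
{
    int state = START;
    Action last = ACT_NONE;
    size_t n = 0;
    *len = 0;
    while (s[n] != '\0') {
        state = step(state, (unsigned char)s[n]);
        if (state == DEAD)
            break;                          /* no further match is possible */
        n++;
        if (action_of[state] != ACT_NONE) { /* passed through a final state */
            last = action_of[state];
            *len = n;
        }
    }
    return last;
}
```

On the input "temp++" the automaton stays in IN_IDENT for four characters and then dies on '+', so `run_dfa` returns ACT_IDENTIFIER with a length of 4.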