Published by Sherman Price. Modified over 9 years ago.
CH3.1 CS 345 Dr. Mohamed Ramadan Saady
Algebraic Properties of Regular Expressions

AXIOM                          DESCRIPTION
r | s = s | r                  | is commutative
r | (s | t) = (r | s) | t      | is associative
(r s) t = r (s t)              concatenation is associative
r (s | t) = r s | r t          concatenation distributes over |
(s | t) r = s r | t r
ε r = r                        ε is the identity element for concatenation
r ε = r
r* = (r | ε)*                  relation between * and ε
r** = r*                       * is idempotent
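The axioms above can be spot-checked mechanically. The sketch below is not a proof: it only compares, for each axiom, the sets of strings of length at most 4 over the alphabet {a, b} that both sides match, using Python's `re` module. The sample expressions chosen for r, s, and t and the helper names `language` and `g` are assumptions made for illustration.

```python
import re
from itertools import product

def language(pattern, alphabet="ab", max_len=4):
    """All strings over `alphabet` of length <= max_len that the pattern fully matches."""
    compiled = re.compile(pattern)
    words = ("".join(p) for n in range(max_len + 1) for p in product(alphabet, repeat=n))
    return {w for w in words if compiled.fullmatch(w)}

def g(x):
    """Wrap a sub-expression in a non-capturing group to keep precedence explicit."""
    return f"(?:{x})"

# Hypothetical sample expressions; any others could be substituted.
r, s, t = g("ab"), g("a"), g("b*a")
eps = g("")  # the empty pattern plays the role of ε

AXIOMS = {
    "| is commutative": (f"{r}|{s}", f"{s}|{r}"),
    "| is associative": (f"{r}|{g(s + '|' + t)}", f"{g(r + '|' + s)}|{t}"),
    "concatenation is associative": (g(r + s) + t, r + g(s + t)),
    "concatenation distributes over | (left)": (r + g(s + "|" + t), f"{r}{s}|{r}{t}"),
    "concatenation distributes over | (right)": (g(s + "|" + t) + r, f"{s}{r}|{t}{r}"),
    "ε is the identity (left)": (eps + r, r),
    "ε is the identity (right)": (r + eps, r),
    "relation between * and ε": (r + "*", g(r + "|" + eps) + "*"),
    "* is idempotent": (g(r + "*") + "*", r + "*"),
}

# Compare both sides of every axiom on the bounded set of strings.
results = {name: language(lhs) == language(rhs) for name, (lhs, rhs) in AXIOMS.items()}
for name, ok in results.items():
    print(f"{name}: {'holds' if ok else 'FAILS'} on all strings up to length 4")
```

A bounded comparison like this can expose a mistyped identity quickly, but agreement on short strings does not by itself establish that two expressions are equivalent.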
CH3.2
Regular Expression Examples

All strings that start with “tab” or end with “bat”:
    tab{A,…,Z,a,…,z}* | {A,…,Z,a,…,z}*bat

All strings in which the digits 1, 2, 3 appear in ascending numerical order:
    {A,…,Z}*1{A,…,Z}*2{A,…,Z}*3{A,…,Z}*
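The two examples above translate almost directly into Python's `re` syntax, with the set {A,…,Z,a,…,z} written as the character class `[A-Za-z]`. This is a minimal sketch; the function names are hypothetical, and `fullmatch` is used so that the whole string must belong to the language.

```python
import re

# tab{A..Z,a..z}* | {A..Z,a..z}*bat
TAB_OR_BAT = re.compile(r"tab[A-Za-z]*|[A-Za-z]*bat")

# {A..Z}*1{A..Z}*2{A..Z}*3{A..Z}*
ASCENDING_123 = re.compile(r"[A-Z]*1[A-Z]*2[A-Z]*3[A-Z]*")

def starts_tab_or_ends_bat(s):
    """True if the whole string s is in the first example language."""
    return TAB_OR_BAT.fullmatch(s) is not None

def has_123_in_order(s):
    """True if the whole string s is in the second example language."""
    return ASCENDING_123.fullmatch(s) is not None

for word in ["table", "wombat", "batter"]:
    print(word, starts_tab_or_ends_bat(word))
for word in ["A1B2C3", "123", "A3B2C1"]:
    print(word, has_123_in_order(word))
```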
CH3.3
Towards Token Definition

Regular definitions: associate names with regular expressions.

For example, PASCAL IDs:
    letter → A | B | C | … | Z | a | b | … | z
    digit  → 0 | 1 | 2 | … | 9
    id     → letter ( letter | digit )*

Shorthand notation:
    “+” : one or more        r* = r+ | ε   and   r+ = r r*
    “?” : zero or one        r? = r | ε
    [range] : a range of characters (replaces “|”)        [A-Z] = A | B | C | … | Z

Example using shorthand, PASCAL IDs:
    id → [A-Za-z][A-Za-z0-9]*
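Regular definitions can be mimicked by building larger pattern strings out of named smaller ones. The sketch below, with hypothetical variable names, composes `id` from `letter` and `digit` and checks on a few samples that it agrees with the shorthand form `[A-Za-z][A-Za-z0-9]*`.

```python
import re

# Regular definitions: each name stands for a regular expression.
letter = "[A-Za-z]"                          # A | B | ... | Z | a | ... | z
digit = "[0-9]"                              # 0 | 1 | ... | 9
ident = f"{letter}(?:{letter}|{digit})*"     # letter ( letter | digit )*

ID_FROM_DEFINITIONS = re.compile(ident)
ID_SHORTHAND = re.compile(r"[A-Za-z][A-Za-z0-9]*")

def is_id(s):
    """True if s is an identifier according to the composed definition."""
    return ID_FROM_DEFINITIONS.fullmatch(s) is not None

samples = ["count1", "x", "1count", "", "total_sum"]
for s in samples:
    same = is_id(s) == (ID_SHORTHAND.fullmatch(s) is not None)
    print(repr(s), is_id(s), "agrees with shorthand" if same else "DISAGREES")
```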
CH3.4
Token Recognition

How can we use the concepts developed so far to assist in recognizing tokens of a source language?

Assume the following tokens: if, then, else, relop, id, num
What language construct are they used for?

Given tokens, what are the patterns?
    if    → if
    then  → then
    else  → else
    relop → < | <= | > | >= | = | <>
    id    → letter ( letter | digit )*
    num   → digit+ ( . digit+ )? ( E ( + | - )? digit+ )?

What does this represent? What is ε?

Grammar:
    stmt → if expr then stmt
         | if expr then stmt else stmt
         | ε
    expr → term relop term
         | term
    term → id
         | num
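The num pattern is the least obvious of the six, so it helps to try it on a few strings. The sketch below writes digit+ ( . digit+ )? ( E ( + | - )? digit+ )? in Python `re` syntax; the name `is_num` is hypothetical.

```python
import re

# digit+ ( . digit+ )? ( E ( + | - )? digit+ )?
NUM = re.compile(r"[0-9]+(?:\.[0-9]+)?(?:E[+-]?[0-9]+)?")

def is_num(s):
    """True if the whole string s matches the num pattern."""
    return NUM.fullmatch(s) is not None

# Integer, fraction, and exponent forms are accepted.
for s in ["42", "3.14", "6.02E23", "1E-5"]:
    print(s, is_num(s))

# A fraction or exponent part, once started, needs at least one digit,
# and the integer part is mandatory.
for s in ["3.", ".5", "E10", "1.5E"]:
    print(s, is_num(s))
```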
CH3.5
What Else Does the Lexical Analyzer Do?

Scan away blanks (b), newlines (nl), and tabs.
Can we define tokens for these?
    blank   → b
    tab     → ^T
    newline → ^M
    delim   → blank | tab | newline
    ws      → delim+
CH3.6
Overall

Regular Expression    Token    Attribute-Value
ws                    -        -
if                    if       -
then                  then     -
else                  else     -
id                    id       pointer to table entry
num                   num      pointer to table entry
<                     relop    LT
<=                    relop    LE
=                     relop    EQ
<>                    relop    NE
>                     relop    GT
>=                    relop    GE

Note: Each token has a unique token identifier that defines its category of lexemes.
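A minimal sketch of a scanner that produces the (token, attribute-value) pairs of the table above, assuming Python's `re` module as the matching engine. The names `tokenize`, `TOKEN_SPEC`, and `RELOP_ATTR` are hypothetical, the "pointer to table entry" is modeled as an index into a plain list, and `\b` is used so that a keyword is recognized only when it is not the prefix of a longer identifier.

```python
import re

# One named alternative per row group of the table, tried in this order.
TOKEN_SPEC = [
    ("WS",    r"[ \t\n]+"),
    ("IF",    r"if\b"),
    ("THEN",  r"then\b"),
    ("ELSE",  r"else\b"),
    ("ID",    r"[A-Za-z][A-Za-z0-9]*"),
    ("NUM",   r"[0-9]+(?:\.[0-9]+)?(?:E[+-]?[0-9]+)?"),
    ("RELOP", r"<=|<>|>=|<|>|="),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))
RELOP_ATTR = {"<": "LT", "<=": "LE", "=": "EQ", "<>": "NE", ">": "GT", ">=": "GE"}

def tokenize(source):
    """Return (tokens, table): tokens are (token, attribute) pairs, table holds lexemes."""
    table, tokens, pos = [], [], 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at position {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "WS":
            continue                          # whitespace is scanned away, no token
        if kind in ("ID", "NUM"):
            if lexeme not in table:
                table.append(lexeme)
            tokens.append((kind.lower(), table.index(lexeme)))   # "pointer" = index
        elif kind == "RELOP":
            tokens.append(("relop", RELOP_ATTR[lexeme]))
        else:
            tokens.append((kind.lower(), None))                  # keywords: no attribute
    return tokens, table

tokens, table = tokenize("if x1 <= 42 then y else x1")
print(tokens)
print(table)
```

Putting the two-character operators before the one-character ones in the relop alternative matters here, because Python's alternation takes the first alternative that matches rather than the longest one.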
CH3.7
Constructing Transition Diagrams for Tokens

Transition diagrams (TDs) are used to represent the tokens.
As characters are read, the relevant TDs are used to attempt to match a lexeme to a pattern.

Each TD has:
    States: represented by circles
    Actions: represented by arrows between states
    Start state: beginning of a pattern (arrowhead)
    Final state(s): end of a pattern (concentric circles)

Each TD is deterministic: there is no need to choose between two different actions!
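A deterministic TD can be stored as a table from (state, input class) to the next state. The sketch below does this for the id pattern letter ( letter | digit )*. The state numbers 0, 1, 2 and the names `classify`, `TRANSITIONS`, and `match_id` are assumptions made for illustration, and the final state unreads the one extra character it had to look at.

```python
def classify(ch):
    """Map a character (or None at end of input) to an input class."""
    if ch is not None and ch.isascii() and ch.isalpha():
        return "letter"
    if ch is not None and ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Deterministic: at most one next state per (state, input class) pair.
TRANSITIONS = {
    (0, "letter"): 1,   # start state: an id must begin with a letter
    (1, "letter"): 1,
    (1, "digit"): 1,
    (1, "other"): 2,    # anything else ends the id
}
FINAL_WITH_RETRACT = {2}    # final state reached by reading one character too many

def match_id(text, start=0):
    """Return (lexeme, next_position) if an id starts at `start`, else None."""
    state, pos = 0, start
    while True:
        ch = text[pos] if pos < len(text) else None
        nxt = TRANSITIONS.get((state, classify(ch)))
        if nxt is None:
            return None             # no arrow to follow: this TD fails
        pos += 1                    # follow the arrow, consuming the character
        state = nxt
        if state in FINAL_WITH_RETRACT:
            pos -= 1                # unread the lookahead character
            return text[start:pos], pos

print(match_id("count1 = 5"))
print(match_id("9lives"))
```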
CH3.8
Example TDs

Transition diagram for “>” and “>=”:
    start → state 0
    state 0 --  >  --> state 6
    state 6 --  =  --> state 7      RTN(GE)
    state 6 --other--> state 8 *    RTN(G)

* : We’ve accepted “>” and have read another character that must be unread.
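CH3.9
Example: All RELOPs

    start → state 0
    state 0 --  <  --> state 1
    state 1 --  =  --> state 2      return(relop, LE)
    state 1 --  >  --> state 3      return(relop, NE)
    state 1 --other--> state 4 *    return(relop, LT)
    state 0 --  =  --> state 5      return(relop, EQ)
    state 0 --  >  --> state 6
    state 6 --  =  --> state 7      return(relop, GE)
    state 6 --other--> state 8 *    return(relop, GT)

* : the extra character that was read must be unread.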
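The relop diagram can be sketched in code as a small table-driven recognizer. The state numbers follow the diagram; the names `TRANSITIONS`, `OTHER`, `ACCEPT`, `RETRACT`, and `next_relop` are hypothetical, and end of input is treated as an "other" character.

```python
# Arrows labeled with a specific character.
TRANSITIONS = {
    0: {"<": 1, "=": 5, ">": 6},
    1: {"=": 2, ">": 3},
    6: {"=": 7},
}
OTHER = {1: 4, 6: 8}        # arrows labeled "other"
ACCEPT = {2: "LE", 3: "NE", 4: "LT", 5: "EQ", 7: "GE", 8: "GT"}
RETRACT = {4, 8}            # starred states: one character too many was read

def next_relop(text, start=0):
    """Return (("relop", attribute), next_position), or None if no relop starts here."""
    state, pos = 0, start
    while state not in ACCEPT:
        ch = text[pos] if pos < len(text) else None   # None stands for end of input
        pos += 1                                      # read one character
        if ch in TRANSITIONS[state]:
            state = TRANSITIONS[state][ch]
        elif state in OTHER:
            state = OTHER[state]
        else:
            return None                               # state 0 with no matching arrow
    if state in RETRACT:
        pos -= 1                                      # unread the lookahead character
    return ("relop", ACCEPT[state]), pos

for s in ["<= b", "<>", "<5", "=", ">=", ">", "a"]:
    print(repr(s), next_relop(s))
```

The states 0, 6, 7, and 8 of this table are the same ones used by the smaller “>” and “>=” diagram, so that diagram is the bottom branch of this one.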