1
Lecture 4: RegExpr → NFA → DFA
Topics: Thompson Construction, Subset construction
Readings: 3.7, 3.6
January 23, 2006
CSCE 531 Compiler Construction
2
– 2 – CSCE 531 Spring 2006
Overview
Last Time
  Flex
  Symbol table - hash table from K&R
Today's Lecture
  DFA review
  Simulating DFA (figure 3.22)
  NFAs
  Thompson Construction: re → NFA
  Examples
  NFA → DFA, the subset construction
  ε-closure(s), ε-closure(T), move(T, a)
References
3
Hash Table
#define ENDSTR 0
#define MAXSTR 100
#include <stdlib.h>   /* assumed: header names are missing in the source; malloc */
#include <string.h>   /* assumed: strcmp, strdup */

struct nlist {                /* basic table entry */
    char *name;
    int val;
    struct nlist *next;       /* next entry in chain */
};

#define HASHSIZE 100
static struct nlist *hashtab[HASHSIZE];   /* pointer table */
4
Hashtable
[Figure: the hashtab pointer array with chained nlist entries. Recoverable labels: xbarfoo, boatcount, x, int, float, func, double, null]
5
The Hash Function
/* PURPOSE: Hash determines hash value based on the sum of the
            character values in the string.
   USAGE: n = hash(s);
   DESCRIPTION OF PARAMETERS: s (array of char) string to be hashed
   AUTHOR: Kernighan and Ritchie
   LAST REVISION: 12/11/83 */
int hash(char *s)   /* form hash value for string s */
{
    int hashval;

    for (hashval = 0; *s != '\0'; )
        hashval += (unsigned char) *s++;   /* cast keeps the sum non-negative */
    return (hashval % HASHSIZE);
}
6
The lookup Function
/* PURPOSE: Lookup searches for entry in symbol table and returns a pointer
   USAGE: np = lookup(s);
   DESCRIPTION OF PARAMETERS: s (array of char) string searched for
   AUTHOR: Kernighan and Ritchie
   LAST REVISION: 12/11/83 */
struct nlist *lookup(char *s)   /* look for s in hashtab */
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return(np);   /* found it */
    return(NULL);         /* not found */
}
7
The install Function
/* PURPOSE: Install checks hash table using lookup and, if entry not found,
            it "installs" the entry.
   USAGE: np = install(name);
   DESCRIPTION OF PARAMETERS: name (array of char) name to install in symbol table
   AUTHOR: Kernighan and Ritchie, modified by Ron Sobczak
   LAST REVISION: 12/11/83 */
8
struct nlist *install(char *name)   /* put (name) in hashtab */
{
    struct nlist *np;   /* malloc and strdup are declared in <stdlib.h> and <string.h> */
    int hashval;

    if ((np = lookup(name)) == NULL) {   /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL)
            return(NULL);
        if ((np->name = strdup(name)) == NULL)
            return(NULL);
        hashval = hash(np->name);
        np->next = hashtab[hashval];     /* push onto front of chain */
        hashtab[hashval] = np;
    }
    return(np);
}
9
NFAs (Non-deterministic Finite Automata)
Recall from last time
M = (Σ, S, δ, s_0, S_F)
  Σ: alphabet
  S: states
  δ: state transition function
  s_0: start state
  S_F: set of final or accepting states
L(M) = { x such that it is possible to follow a path in the transition diagram labeled x that ends in an accepting state }
10
NFA transition function
NFAs relax the functional nature of the transition function: δ(s, a), the next state for state s and input a, is a subset of states.
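One way to picture a set-valued δ in C is a table whose entries are bit sets. The sketch below is only an illustration: the StateSet type, the table layout, and the three-state example NFA for (a|b)*ab are assumptions, not taken from the slides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding: one bit per NFA state, so at most 32 states. */
typedef uint32_t StateSet;

#define NSTATES 3
#define NSYMS   2   /* symbol 0 stands for 'a', symbol 1 for 'b' */

/* delta[s][c] is the SET of possible next states for state s on symbol c.
   Assumed example NFA for (a|b)*ab: state 0 loops on a and b, and on 'a'
   it may also guess that the final "ab" has started. */
const StateSet delta[NSTATES][NSYMS] = {
    /* state 0 */ { (1u << 0) | (1u << 1), (1u << 0) },
    /* state 1 */ { 0,                     (1u << 2) },
    /* state 2 */ { 0,                     0         },
};

/* Return the set of next states; the empty set (0) means no move exists. */
StateSet nfa_delta(int s, int sym)
{
    assert(s >= 0 && s < NSTATES);
    assert(sym >= 0 && sym < NSYMS);
    return delta[s][sym];
}
```

In a DFA the same table would hold a single state number per entry; here state 0 on 'a' yields two candidates, which is exactly the relaxation the slide describes.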
11
Equivalence NFA, DFA, RE
  RegExpr → NFA: Thompson Construction
  NFA → DFA: Subset Construction
  DFA → DFA: DFA minimization
  DFA → tables for scanner
  DFA → RegExpr: Kleene Construction
12
Converting Regular Expressions to NFAs
Ken Thompson (1968) outlined a regular expression to NFA conversion algorithm for use in an editor.
Future fame?
How would we use regular expressions in an editor?
Unix regular expressions
  Grep family: Global Regular Expressions Print. Prints all lines in a file that contain a match to the regular expression.
  Variations
    Fgrep: fast fixed regular expression, just a string
    Egrep: goes through NFA → DFA and minimization
13
Restrictions on NFAs in Thompson Construction
Constructs an NFA from the regular expression with the following restrictions:
  The NFA has a single start state, s_0, and a single final state, s_f.
  There are no transitions coming into the start state and no transitions leaving the final state.
  A state has at most 2 exiting ε-transitions and at most 2 entering ε-transitions.
14
Base Cases of Thompson Construction
For a ∈ Σ, the NFA M_a = (Σ, {s_0, s_f}, δ, s_0, {s_f}) that accepts it has the single transition δ(s_0, a) = {s_f}.
For ε, the NFA M_ε = (Σ, {s_0, s_f}, δ, s_0, {s_f}) that accepts it has the single transition δ(s_0, ε) = {s_f}.
15
Recursive Cases of Thompson Construction
For regular expressions R and S with machines M_R and M_S
  M_R = (Σ, S_R, δ_R, r_0, {r_f})
  M_S = (Σ, S_S, δ_S, s_0, {s_f})
Then the NFA M_R|S = (Σ, S_R ∪ S_S ∪ {new_0, new_f}, δ_R|S, new_0, {new_f})
16
Recursive Cases of Thompson Construction: R|S
For regular expressions R and S with machines M_R and M_S
  M_R = (Σ, S_R, δ_R, r_0, {r_f})
  M_S = (Σ, S_S, δ_S, s_0, {s_f})
Then the NFA M_R|S = (Σ, S_R ∪ S_S ∪ {new_0, new_f}, δ_R|S, new_0, {new_f})
17
Recursive Cases of Thompson Construction: RS
For regular expressions R and S with machines M_R and M_S
  M_R = (Σ, S_R, δ_R, r_0, {r_f})
  M_S = (Σ, S_S, δ_S, s_0, {s_f})
Then the NFA M_RS = (Σ, S_R ∪ S_S ∪ {new_0, new_f}, δ_RS, new_0, {new_f})
18
Recursive Cases of Thompson Construction: R*
For regular expression R with machine M_R
  M_R = (Σ, S_R, δ_R, r_0, {r_f})
Then the NFA M_R* = (Σ, S_R ∪ {new_0, new_f}, δ_R*, new_0, {new_f})
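The base and recursive cases can be sketched as small C functions that each return a machine with one start and one final state. This is a hedged illustration: the names (struct frag, thompson_alt, and so on) and the array representation are hypothetical, and because the slide diagrams are not available, the ε wiring below follows the usual textbook layout, with the RS case using the new_0 and new_f states the slide lists.

```c
#include <assert.h>

#define MAXSTATES 256
#define NONE (-1)

/* In a Thompson NFA each state has one symbol edge or up to two ε edges. */
struct state {
    int sym;      /* label of the symbol edge, or NONE */
    int sym_to;   /* target of the symbol edge, or NONE */
    int eps[2];   /* targets of ε edges, NONE where unused */
};

struct frag { int start, final; };   /* hypothetical handle for one machine */

struct state nfa[MAXSTATES];
int nstates = 0;

int new_state(void)
{
    assert(nstates < MAXSTATES);
    nfa[nstates].sym = nfa[nstates].sym_to = NONE;
    nfa[nstates].eps[0] = nfa[nstates].eps[1] = NONE;
    return nstates++;
}

void add_eps(int from, int to)
{
    int slot = (nfa[from].eps[0] == NONE) ? 0 : 1;
    assert(nfa[from].eps[slot] == NONE);   /* at most 2 exiting ε edges */
    nfa[from].eps[slot] = to;
}

struct frag new_frag(void)
{
    struct frag f;
    f.start = new_state();   /* plays the role of new_0 */
    f.final = new_state();   /* plays the role of new_f */
    return f;
}

struct frag thompson_sym(int a)          /* base case: a in Σ */
{
    struct frag f = new_frag();
    nfa[f.start].sym = a;
    nfa[f.start].sym_to = f.final;
    return f;
}

struct frag thompson_eps(void)           /* base case: ε */
{
    struct frag f = new_frag();
    add_eps(f.start, f.final);
    return f;
}

struct frag thompson_alt(struct frag r, struct frag s)   /* R|S */
{
    struct frag f = new_frag();
    add_eps(f.start, r.start);
    add_eps(f.start, s.start);
    add_eps(r.final, f.final);
    add_eps(s.final, f.final);
    return f;
}

struct frag thompson_cat(struct frag r, struct frag s)   /* RS (assumed wiring) */
{
    struct frag f = new_frag();
    add_eps(f.start, r.start);
    add_eps(r.final, s.start);
    add_eps(s.final, f.final);
    return f;
}

struct frag thompson_star(struct frag r)                 /* R* */
{
    struct frag f = new_frag();
    add_eps(f.start, r.start);   /* enter R */
    add_eps(f.start, f.final);   /* or skip R entirely */
    add_eps(r.final, r.start);   /* repeat R */
    add_eps(r.final, f.final);   /* or leave */
    return f;
}
```

Each call adds exactly two states, so a regular expression with n symbols and operators yields at most 2n states under this sketch. A regular expression such as (a|b)* would be assembled bottom-up as thompson_star(thompson_alt(thompson_sym('a'), thompson_sym('b'))).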
19
Thompson example
Fig 3.16 has one; let's do another.
RegExpr = ab*b(a|b)*
20
NFA to DFA: the Subset Construction
In an NFA, given an input string, we make choices about which way to go. We can think of it as being in a subset of the states.
To convert to a DFA:
  The states of the DFA correspond to sets of states of the NFA.
  Transitions of the DFA are when you can move between the sets in the NFA.
21
Subset Construction Functions
We will use a collection of functions to facilitate seeing all of the states we can get to from one on a given input.
  ε-closure(s_i) is the set of states reachable from s_i by ε arcs alone
  ε-closure(T) is the set of states reachable from some state in T by ε arcs alone
  move(T, a) is the set of states reachable from some state in T by one transition on a
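The two set functions might look like this in C when state sets are stored as bit masks. Treat it as a sketch under assumptions: the StateSet type, the tables eps_edges and delta, and the four-state example NFA for a*b are all made up for illustration, and nfa_move stands in for move(T, a).

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t StateSet;   /* hypothetical: bit i set means state i is in the set */

#define NSTATES 4
#define NSYMS   2            /* symbol 0 stands for 'a', symbol 1 for 'b' */

/* Assumed example NFA for a*b:
   0 -ε-> 1,  1 -a-> 2,  2 -ε-> 1,  1 -b-> 3 (final). */
const StateSet eps_edges[NSTATES] = { 1u << 1, 0, 1u << 1, 0 };
const StateSet delta[NSTATES][NSYMS] = {
    { 0,       0       },
    { 1u << 2, 1u << 3 },
    { 0,       0       },
    { 0,       0       },
};

/* ε-closure(T): T plus everything reachable from T by ε arcs alone. */
StateSet eps_closure(StateSet T)
{
    StateSet closure = T, work = T;

    assert(T < (1u << NSTATES));
    while (work != 0) {
        int s = 0;
        while (((work >> s) & 1u) == 0)   /* pick the lowest pending state */
            s++;
        work &= ~(1u << s);
        StateSet fresh = eps_edges[s] & ~closure;   /* states not seen yet */
        closure |= fresh;
        work |= fresh;
    }
    return closure;
}

/* move(T, a): states reachable from some state in T by one edge on sym. */
StateSet nfa_move(StateSet T, int sym)
{
    StateSet out = 0;

    assert(sym >= 0 && sym < NSYMS);
    for (int s = 0; s < NSTATES; s++)
        if ((T >> s) & 1u)
            out |= delta[s][sym];
    return out;
}
```

ε-closure of a single state s_i is just the call eps_closure(1u << i), so one function covers both forms listed on the slide.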
22
The Subset Construction Algorithm
D_0 = ε-closure(s_0)    // s_0 the start state of the NFA
Add D_0 to Dstates as unmarked state
While there is an unmarked state T in Dstates
    mark T
    for each input symbol a do
        U := ε-closure(move(T, a))
        if U is not in Dstates then
            add U as unmarked state to Dstates
        end
        Dtrans[T, a] = U
    end
end
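A compact C rendering of this loop is sketched below. It is not the course's code: the bit-mask StateSet, the small example NFA for a*b, and the helper names are assumptions. Marking is implicit here, since every DFA state with an index below t has already been processed. The empty set is kept as an ordinary DFA state, which acts as the dead state.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t StateSet;   /* hypothetical: bit i set means NFA state i */

enum { NSTATES = 4, NSYMS = 2, MAXD = 16 };   /* 2^4 possible subsets */

/* Assumed example NFA for a*b:
   0 -ε-> 1,  1 -a-> 2,  2 -ε-> 1,  1 -b-> 3 (final). */
const StateSet eps_edges[NSTATES] = { 1u << 1, 0, 1u << 1, 0 };
const StateSet delta[NSTATES][NSYMS] = {
    { 0, 0 }, { 1u << 2, 1u << 3 }, { 0, 0 }, { 0, 0 },
};

StateSet Dstates[MAXD];    /* DFA state i is the NFA state set Dstates[i] */
int Dtrans[MAXD][NSYMS];   /* DFA transition table */
int ndstates;

StateSet eps_closure(StateSet T)
{
    StateSet closure = T, work = T;
    while (work != 0) {
        int s = 0;
        while (((work >> s) & 1u) == 0)
            s++;
        work &= ~(1u << s);
        StateSet fresh = eps_edges[s] & ~closure;
        closure |= fresh;
        work |= fresh;
    }
    return closure;
}

StateSet nfa_move(StateSet T, int sym)
{
    StateSet out = 0;
    for (int s = 0; s < NSTATES; s++)
        if ((T >> s) & 1u)
            out |= delta[s][sym];
    return out;
}

/* Return the index of U in Dstates, adding it (unmarked) if it is new. */
int find_or_add(StateSet U)
{
    for (int i = 0; i < ndstates; i++)
        if (Dstates[i] == U)
            return i;
    assert(ndstates < MAXD);
    Dstates[ndstates] = U;
    return ndstates++;
}

void subset_construction(int s0)
{
    ndstates = 0;
    find_or_add(eps_closure(1u << s0));      /* D_0 */
    for (int t = 0; t < ndstates; t++) {     /* t is the next unmarked state */
        for (int a = 0; a < NSYMS; a++) {
            StateSet U = eps_closure(nfa_move(Dstates[t], a));
            Dtrans[t][a] = find_or_add(U);
        }
    }
}
```

After subset_construction(0), a DFA state i is accepting when Dstates[i] contains an accepting NFA state, which for this example means bit 3 is set.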
23
Example of Subset Construction
Figure 3.35, fig 3.37 in text
Example 2
24
Lexical analyzer for subset of C
  int constants: int, octal, hex
  Float constants
  C identifiers
  Keywords: for, while, if, else
  Relational operators: >= <= != ==
  Arithmetic, Boolean and bit operators: + - * / && || ! ~ & |
  Other symbols: ; { } [ ] * ->
25
Write core.l Flex Specification
Due Monday Jan 30
Notes
  Install identifiers and constants into the symbol table.
  Return a separate token code for each relational operator. Not as in text!!
Homework 02
Due Thursday Jan 26 (now Saturday 28)
  Construct NFA for recognizing (a|b|ε)(ab)*
  Convert to DFA