Lexical Analysis: DFA Minimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice.

Slides:



Advertisements
Similar presentations
Lexical Analysis IV : NFA to DFA DFA Minimization
Advertisements

Lecture 24 MAS 714 Hartmut Klauck
1 1 CDT314 FABER Formal Languages, Automata and Models of Computation Lecture 3 School of Innovation, Design and Engineering Mälardalen University 2012.
Lexical Analysis — Introduction Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Regular Expressions Finite State Automaton. Programming Languages2 Regular expressions  Terminology on Formal languages: –alphabet : a finite set of.
1 CIS 461 Compiler Design and Construction Fall 2012 slides derived from Tevfik Bultan et al. Lecture-Module 5 More Lexical Analysis.
Lexical Analysis: DFA Minimization & Wrap Up Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp.
Scanner wrap-up and Introduction to Parser. Automating Scanner Construction RE  NFA (Thompson’s construction) Build an NFA for each term Combine them.
Compiler Construction
1 The scanning process Main goal: recognize words/tokens Snapshot: At any point in time, the scanner has read some input and is on the way to identifying.
From Cooper & Torczon1 Automating Scanner Construction RE  NFA ( Thompson’s construction )  Build an NFA for each term Combine them with  -moves NFA.
1 The scanning process Goal: automate the process Idea: –Start with an RE –Build a DFA How? –We can build a non-deterministic finite automaton (Thompson's.
CSC 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Regular.
2. Lexical Analysis Prof. O. Nierstrasz
1 CMPSC 160 Translation of Programming Languages Fall 2002 slides derived from Tevfik Bultan, Keith Cooper, and Linda Torczon Lecture-Module #5 Introduction.
CS5371 Theory of Computation Lecture 8: Automata Theory VI (PDA, PDA = CFG)
Instruction Scheduling II: Beyond Basic Blocks Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp.
Lexical Analysis — Part II From Regular Expression to Scanner Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled.
LR(1) Parsers Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit.
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
1 Compiler Construction Finite-state automata. 2 Today’s Goals More on lexical analysis: cycle of construction RE → NFA NFA → DFA DFA → Minimal DFA DFA.
1 Chapter 3 Scanning – Theory and Practice. 2 Overview Formal notations for specifying the precise structure of tokens are necessary –Quoted string in.
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions.
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
어휘분석 (Lexical Analysis). Overview Main task: to read input characters and group them into “ tokens. ” Secondary tasks: –Skip comments and whitespace;
Lexical Analysis - An Introduction Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at.
Introduction to Parsing Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
1 Regular Expressions. 2 Regular expressions describe regular languages Example: describes the language.
Lecture # 3 Chapter #3: Lexical Analysis. Role of Lexical Analyzer It is the first phase of compiler Its main task is to read the input characters and.
Lexical Analysis Constructing a Scanner from Regular Expressions.
4b 4b Lexical analysis Finite Automata. Finite Automata (FA) FA also called Finite State Machine (FSM) –Abstract model of a computing entity. –Decides.
COMP3190: Principle of Programming Languages DFA and its equivalent, scanner.
Lexical Analysis III : NFA to DFA DFA Minimization Lecture 5 CS 4318/5331 Spring 2010 Apan Qasem Texas State University *some slides adopted from Cooper.
Top Down Parsing - Part I Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University.
Regular Expressions Hopcroft, Motawi, Ullman, Chap 3.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 3 Mälardalen University 2010.
Overview of Previous Lesson(s) Over View  Symbol tables are data structures that are used by compilers to hold information about source-program constructs.
CMSC 330: Organization of Programming Languages Theory of Regular Expressions Finite Automata.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Parsing — Part II (Top-down parsing, left-recursion removal) Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved. Students.
Lexical Analysis: DFA Minimization & Wrap Up. Automating Scanner Construction PREVIOUSLY RE  NFA ( Thompson’s construction ) Build an NFA for each term.
Top-down Parsing Recursive Descent & LL(1) Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412.
CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong NFA to.
CS 203: Introduction to Formal Languages and Automata
Lexical Analysis – Part II EECS 483 – Lecture 3 University of Michigan Wednesday, September 13, 2006.
UNIT - I Formal Language and Regular Expressions: Languages Definition regular expressions Regular sets identity rules. Finite Automata: DFA NFA NFA with.
Cleaning up the CFG Eliminating useless nodes & edges This lecture describes the algorithm Clean, presented in Chapter 10 of EaC2e. The algorithm is due.
98 Nondeterministic Automata vs Deterministic Automata We learned that NFA is a convenient model for showing the relationships among regular grammars,
1 Chapter 3 Regular Languages.  2 3.1: Regular Expressions (1)   Regular Expression (RE):   E is a regular expression over  if E is one of:
LECTURE 5 Scanning. SYNTAX ANALYSIS We know from our previous lectures that the process of verifying the syntax of the program is performed in two stages:
CS412/413 Introduction to Compilers Radu Rugina Lecture 3: Finite Automata 25 Jan 02.
COMP3190: Principle of Programming Languages DFA and its equivalent, scanner.
Lecture 2 Compiler Design Lexical Analysis By lecturer Noor Dhia
Topic 3: Automata Theory 1. OutlineOutline Finite state machine, Regular expressions, DFA, NDFA, and their equivalence, Grammars and Chomsky hierarchy.
WELCOME TO A JOURNEY TO CS419 Dr. Hussien Sharaf Dr. Mohammad Nassef Department of Computer Science, Faculty of Computers and Information, Cairo University.
Parsing — Part II (Top-down parsing, left-recursion removal)
Two issues in lexical analysis
Lexical Analysis - An Introduction
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Lecture 5: Lexical Analysis III: The final bits
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Lecture 4: Lexical Analysis II: From REs to DFAs
DFA Equivalence & Minimization
Automating Scanner Construction
Lexical Analysis — Part II: Constructing a Scanner from Regular Expressions Copyright 2003, Keith D. Cooper, Ken Kennedy & Linda Torczon, all rights reserved.
Parsing — Part II (Top-down parsing, left-recursion removal)
Lexical Analysis: DFA Minimization & Wrap Up
Lexical Analysis - An Introduction
The Partitioning Algorithm for Detecting Congruent Expressions COMP 512 Rice University Houston, Texas Fall 2003 Copyright 2003, Keith D. Cooper.
Lecture 5 Scanning.
Presentation transcript:

Lexical Analysis: DFA Minimization Comp 412 Copyright 2010, Keith D. Cooper & Linda Torczon, all rights reserved. Students enrolled in Comp 412 at Rice University have explicit permission to make copies of these materials for their personal use. Faculty from other educational institutions may use these materials for nonprofit educational purposes, provided this copyright notice is preserved. COMP 412 FALL 2010

Comp 412, Fall Automating Scanner Construction RE  NFA ( Thompson’s construction ) Build an NFA for each term Combine them with  -moves NFA  DFA ( subset construction ) Build the simulation DFA  Minimal DFA Brzozowski’s Algorithm Hopcroft’s algorithm (today) DFA  RE (not really part of scanner construction) All pairs, all paths problem Union together paths from s 0 to a final state minimal DFA RENFADFA The Cycle of Constructions

Comp 412, Fall DFA Minimization The Big Picture Discover sets of equivalent states in the DFA Represent each such set with a single state

Comp 412, Fall DFA Minimization The Big Picture Discover sets of equivalent states in the DFA Represent each such set with a single state Two states are equivalent if and only if: The set of paths leading to them are equivalent, and    , transitions on  lead to equivalent states (DFA) ⇒ Must split a state that has  -transitions to distinct sets

Comp 412, Fall DFA Minimization The Big Picture Discover sets of equivalent states in the DFA Represent each such set with a single state Two states are equivalent if and only if: The set of paths leading to them are equivalent, and    , transitions on  lead to equivalent states (DFA) ⇒ Must split a state that has  -transitions to distinct sets A partition P of S A collection of sets P s.t. each s  S is in exactly one p i  P The algorithm iteratively constructs partitions of the DFA ’s states

Comp 412, Fall DFA Minimization Details of the algorithm Group states into maximally sized initial sets, optimistically Iteratively subdivide those sets, based on transition graph States that remain grouped together are equivalent Initial partition, P 0, has two sets: {F} & {S-F} D =(S, , ,s 0,F) Splitting a set (“partitioning a set by a”) Assume s a & s b  p i, and  (s a,a) = s x, &  (s b,a) = s y If s x & s y are not in the same set p j, then p i must be split —s a has transition on a, s b does not  a splits p i One state in the final DFA cannot have two transitions on a The algorithm works backward, from a pair (p,a) to the subset of the states in some other set q that reach p on a final statesothers Maximally sized sets  minimal number of sets

Comp 412, Fall Key Idea: Splitting S around  Q S  The algorithm partitions Q around  Original set Q   Q has transitions on  to R, S, & T RT

Comp 412, Fall Key Idea: Splitting Q around (S,  ) I S  This part must have an  -transition to one or more other states in one or more other partitions. Otherwise, it does not split! Find maximal partition I that has an  -transition into S Q Think of I as the image of S under the inverse of the transition function I     –1 ( s,  )

Comp 412, Fall Hopcroft's Algorithm W  {F, S-F}; P  {F, S-F}; // W is the worklist, P the current partition while ( W is not empty ) do begin select and remove s from W ; // s is a set of states for each  in  do begin let I     –1 ( s ); // I  is set of all states that can reach s on  for each p  P such that p  I  is not empty and p is not contained in I  do begin partition p into p 1 and p 2 such that p 1  p  I  ; p 2  p – p 1 ; P  (P – p)  p 1  p 2 ; if p   W then W  (W – p)  p 1  p 2 ; else if |p 1 | ≤ |p 2 | then W  W  p 1 ; else W  W  p 2 ; end

Q Comp 412, Fall Key Idea: Splitting p i around  S Original set p i pjpj pkpk How does the worklist algorithm ensure that it splits p k around R & T ? p k is everything in p i - p j R  T   Subtle point: either R or T (or both) must already be on the worklist. (R & T have split from {S-F}.) Thus, it can split p i around one state (S) & add either p j or p k to the worklist. p j is the image of S under the inverse of 

Comp 412, Fall A Detailed Example Remember ( a | b ) * abb ? (from last lecture) Applying the subset construction: Iteration 3 adds nothing to S, so the algorithm halts a | b q0q0 q1q1 q4q4 q2q2 q3q3  abb Our first NFA contains q 4 (final state) State  -closure(move(s i,  ) Iter.DFANFAab 0s0s0 q 0,q 1 q 1,q 2 q1q1 1s1s1 q 1,q 3 s2s2 q1q1 q 1,q 2 q1q1 2s3s3 q 1,q 3 q 1,q 2 q 1,q 4 3s4s4 q 1,q 2 q1q1

Comp 412, Fall A Detailed Example The DFA for ( a | b ) * abb Not much expansion from NFA (we feared exponential blowup) Deterministic transitions Use same code skeleton as before s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Character Stateab s0s0 s1s1 s2s2 s1s1 s1s1 s3s3 s2s2 s1s1 s2s2 s3s3 s1s1 s4s4 s4s4 s1s1 s2s2

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } For the record, example was right in 1999, broken in 2000

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 }

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 } none

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 }none{s 1 } {s 0,s 2 }

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 }none{s 1 } {s 0,s 2 } P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 }none{s 1 } {s 0,s 2 } P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }{s 1 }none

Comp 412, Fall A Detailed Example (DFA Minimization) s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 }none{s 1 } {s 0,s 2 } P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }{s 1 }none P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }{s 0,s 2 }none Empty worklist  done!

Comp 412, Fall A Detailed Example (DFA Minimization) Current PartitionWorklistsSplit on aSplit on b P0P0 {s 4 } {s 0,s 1,s 2,s 3 } {s 4 }none{s 3 } {s 0,s 1,s 2 } P1P1 {s 4 } {s 3 } {s 0,s 1,s 2 }{s 3 } {s 0,s 1,s 2 }{s 3 }none{s 1 } {s 0,s 2 } P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }{s 1 }none P2P2 {s 4 } {s 3 } {s 1 } {s 0,s 2 }{s 1 } {s 0,s 2 }{s 0,s 2 }none s0s0 a s1s1 b s3s3 b s4s4 s2s2 a b b a a a b s 0, s 2 a s1s1 b s3s3 b s4s4 b a a a b 20% reduction in number of states

Comp 412, Fall DFA Minimization What about a ( b | c ) * ? First, the subset construction: q0q0 q1q1 a  q4q4 q5q5 b q6q6 q7q7 c q3q3 q8q8 q2q2 q9q9      s3s3 s2s2 s0s0 s1s1 c b a b b c c States  -closure(Move(s,*)) DFANFA a bc s0s0 q0q0 s1s1 none s1s1 q 1, q 2, q 3, q 4, q 6, q 9 none s2s2 s3s3 s2s2 q 5, q 8, q 9, q 3, q 4, q 6 nones2s2 s3s3 s3s3 q 7, q 8, q 9, q 3, q 4, q 6 none s2s2 s3s3 From last lecture …

Comp 412, Fall DFA Minimization Then, apply the minimization algorithm It splits no states after the initial partition  The minimal DFA has two states  One for {s 0 }  One for {s 1,s 2,s 3 } s3s3 s2s2 s0s0 s1s1 c b a b b c c Split on Current Partitionabc P0P0 {s 1,s 2,s 3 } {s 0 }none

Comp 412, Fall DFA Minimization Then, apply the minimization algorithm It produces this DFA s3s3 s2s2 s0s0 s1s1 c b a b b c c s0s0 s1s1 a b | c In lecture 5, we observed that a human would design a simpler automaton than Thompson’s construction & the subset construction did. Minimizing that DFA produces the one that a human would design! Split on Current Partitionabc P0P0 {s 1,s 2,s 3 } {s 0 }none

Comp 412, Fall Abbreviated Register Specification Start with a regular expression r0 | r1 | r2 | r3 | r4 | r5 | r6 | r7 | r8 | r9 minimal DFA RENFADFA The Cycle of Constructions Register names from zero to nine

Comp 412, Fall Abbreviated Register Specification Thompson’s construction produces r 0 r 1 r 2 r 8 r 9 …… s0s0 sfsf                     … minimal DFA RENFADFA The Cycle of Constructions To make the example fit, we have eliminated some of the  -transitions, e.g., between r and 0

Comp 412, Fall Abbreviated Register Specification The subset construction builds This is a DFA, but it has a lot of states … r 0 sf0sf0 s0s0 sf1sf1 1 sf2sf2 2 sf9sf9 sf8sf8 … 9 8 minimal DFA RENFADFA The Cycle of Constructions

Comp 412, Fall Abbreviated Register Specification The DFA minimization algorithm builds This looks like what a skilled compiler writer would do! r s0s0 sfsf 0,1,2,3,4, 5,6,7,8,9 minimal DFA RENFADFA The Cycle of Constructions

Comp 412, Fall Automating Scanner Construction RE  NFA ( Thompson’s construction ) Build an NFA for each term Combine them with  -moves NFA  DFA ( subset construction ) Build the simulation DFA  Minimal DFA Brzozowski’s Algorithm Hopcroft’s algorithm DFA  RE (not really part of scanner construction) All pairs, all paths problem Union together paths from s 0 to a final state minimal DFA RENFADFA The Cycle of Constructions

Comp 412, Fall RE Back to DFA Kleene’s Construction for i  0 to |D| - 1;// label each immediate path for j  0 to |D| - 1; R 0 ij  { a |  (d i,a) = d j }; if (i = j) then R 0 ii = R 0 ii | {  }; for k  0 to |D| - 1;// label nontrivial paths for i  0 to |D| - 1; for j  0 to |D| - 1; R k ij  R k-1 ik (R k-1 kk )* R k-1 kj | R k-1 ij L  {}// union labels of paths from For each final state s i // s 0 to a final state s i L  L | R |D|-1 0i minimal DFA RENFADFA The Cycle of Constructions R k ij is the set of paths from i to j that include no state higher than k

Comp 412, Fall Limits of Regular Languages Not all languages are regular RL’s  CFL’s  CSL’s You cannot construct DFA’s to recognize these languages L = { p k q k } (parenthesis languages) L = { wcw r | w   * } Neither of these is a regular language (nor an RE) But, this is a little subtle. You can construct DFA ’s for Strings with alternating 0’s and 1’s (  | 1 ) ( 01 ) * (  | 0 ) Strings with and even number of 0’s and 1’s RE ’s can count bounded sets and bounded differences

Comp 412, Fall Limits of Regular Languages Advantages of Regular Expressions Simple & powerful notation for specifying patterns Automatic construction of fast recognizers Many kinds of syntax can be specified with RE s Example — an expression grammar Term  [a-zA-Z] ([a-zA-Z] | [0-9]) * Op  + | - |  | / Expr  ( Term Op ) * Term Of course, this would generate a DFA … If RE s are so useful … Why not use them for everything?

EXTRA SLIDES START HERE Comp 412, Fall

Comp 412, Fall DFA Minimization The algorithm T  { F, {S-F}} P  { } while ( P  T) P  T T  { } for each set p i  P T  T  Split(p i ) Split(S) for each c   if c splits S into s 1 & s 2 then return {s 1, s 2 } return S Why does this work? Partition P  2 S Start off with 2 subsets of S: {F} and {S-F} The while loop takes P i  P i+1 by splitting 1 or more sets P i+1 is at least one step closer to the partition with |S | sets Maximum of |S | splits Note that Partitions are never combined Initial partition ensures that final states remain final states This is a fixed-point algorithm! mild abuse of notation

Comp 412, Fall DFA Minimization Refining the algorithm As written, it examines every p i  P on each iteration —This strategy entails a lot of unnecessary work —Only need to examine p i if some T, reachable from p i, has split Reformulate the algorithm using a “worklist” —Start worklist with initial partition, F and {S-F} —When it splits p i into p 1 and p 2, place p 2 on worklist This version looks at each p i  P many fewer times Well-known, widely used algorithm due to John Hopcroft

Comp 412, Fall Alternative Approach to DFA Minimization The Intuition The subset construction merges prefixes in the NFA s0s0 s 10 s9s9 s8s8 s5s5 s7s7 s6s6 s3s3 s2s2 s1s1 s4s4    a b a b c d c abc | bc | ad Thompson’s construction would leave  -transitions between each single- character automaton s0s0 s6s6 s4s4 s5s5 s2s2 s1s1 s3s3 a b b c d c Subset construction eliminates  - transitions and merges the paths for a. It leaves duplicate tails, such as bc.

Comp 412, Fall Alternative Approach to DFA Minimization Idea: use the subset construction twice For an NFA N —Let reverse(N) be the NFA constructed by making initial states final (& vice-versa) and reversing the edges —Let subset(N) be the DFA that results from applying the subset construction to N —Let reachable(N) be N after removing all states that are not reachable from the initial state Then, reachable(subset(reverse[reachable(subset(reverse(N))])) is the minimal DFA that implements N [ Brzozowski, 1962 ] This result is not intuitive, but it is true. Neither algorithm dominates the other.

Comp 412, Fall Alternative Approach to DFA Minimization Step 1 The subset construction on reverse(NFA) merges suffixes in original NFA s 11    Reversed NFA s0s0 s 10 s9s9 s8s8 s5s5 s7s7 s6s6 s3s3 s2s2 s1s1 s4s4    a b a b c d c s 11 s9s9 s8s8 s3s3 s2s2 s1s1 a a b d c subset(reverse(NFA))

Comp 412, Fall Alternative Approach to DFA Minimization Step 2 Reverse it again & use subset to merge prefixes … Reverse it, again s 11 s9s9 s8s8 s3s3 s2s2 s1s1 a a b d c s0s0    And subset it, again s 11 s3s3 s2s2 a b d c s0s0 b Minimal DFA minimal DFA RENFADFA The Cycle of Constructions Brzozowski