Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)

Left over Topics of Chapter 1 What is Analysis /Synthesis Model of Compilation? Symbol Table Management Error Detection and Reporting What is meant by grouping of compilation phases into Front End and Back End? What is meant by Single / Multiple Passes? What are the Compiler Construction Tools available?

The Analysis-Synthesis Model of Compilation There are two parts to compilation: – Analysis determines the operations implied by the source program which are recorded in a tree structure – Synthesis takes the tree structure and translates the operations therein into the target program 3

Another way.. Analysis: breaks the source program into constituent pieces and creates intermediate representation Synthesis: constructs target program from the intermediate representation The first three phases namely: Lexical Analysis, Syntax Analysis and Semantic Analysis form the analysis part The last three phases form the Synthesis part

Symbol Table Management An essential function of a compiler is to record the identifiers used in the source program and to collect information about various attributes of each identifier A symbol table is a data structure containing an entry for each identifier with fields for the attributes of the identifier

Error Detection and Reporting Each phase of the compiler can encounter error. After detecting error the compiler must deal with that error so that compilation can proceed. A lexical analyzer will detect errors where characters do not form a token Errors where token violates the syntax are determined by syntax analysis If the compiler tries to add two variables one of which is the name of a function and another is an array then Symantic Analysis will throw error

Section 1.5: The Grouping of Phases Compiler phases are grouped into front and back ends: – Front end: analysis (machine independent) – Back end: synthesis (machine dependent) Front End focuses on understanding the source program and the backend focuses on mapping programs to the target machine. 7

Compiler Passes Compiler Passes: – A collection of phases is done only once (single pass) or multiple times (multi pass) Single pass: usually requires everything to be defined before being used in source program Multi pass: compiler may have to keep entire program representation in memory

Section 1.6: Compiler-Construction Tools Software development tools are available to implement one or more compiler phases – Scanner generators (Lex and Flex) – Parser generators (Yacc and Bison) – Syntax-directed translation engines – Automatic code generators – Data-flow engines For further details this webpage would be sufficient http://dinosaur.compilertools.net/ 9 COP5621 Fall 2009

ANTLR 3.x Project for Compiler Construction This is a project that is built using Eclipse and the source code along with all the class files are available in Java. This aids the students in creating compiler project on a fly. Its C# libraries are also available that can be used. I would try to take a lab and discuss it It tutorials and videos are available at the following address: http://www.vimeo.com/groups/29150/videos

Recap of the last lecture Difference: Preprocessor Compiler Assembler Linker Source Program Target Assembly Program Relocatable Object Code 11 Absolute Machine Code Skeletal Source Program Libraries and Relocatable Object Files

Recap We discussed: What are Regular Expressions ? How to write ? RE→NFA (Thompson’s construction) NFA →DFA (Subset construction)

13 The Subset Construction Algorithm Initially,  -closure(s 0 ) is the only state in Dstates and it is unmarked while there is an unmarked state T in Dstates do mark T for each input symbol a   do U :=  -closure(move(T,a)) if U is not in Dstates then add U as an unmarked state to Dstates end if Dtran[T,a] := U end do end do

14 Subset Construction Example 2 a 1 6 a 3 45 bb 8 b 7 a b 0 start    a1a1 a2a2 a3a3 Dstates A = {0,1,3,7} B = {2,4,7} C = {8} D = {7} E = {5,8} F = {6,8} A start a D b b b a b b B C EF a b a1a1 a3a3 a3a3 a 2 a 3

Today’s Lecture How can we minimize a DFA? (Hopcroft’s Algorithm) What are the important states of an NFA? How to convert from a Regular Expression to DFA directly?

Section 3.9: Minimization of DFA What do we want to achieve?

Hopcroft’s Algorithm Pg 142 Input: A DFA M with set of states S, set of inputs , transition function defined, start state So and set of accepting states F Output: A DFA M’ accepting the same language as M and having fewer states as possible

Algorithm 3.6 Method: Step1:Construct an initial partition P of the states with two groups : the accepting states (F) and the non accepting states (S-F) Step2:Apply the following procedure (Construction of Pnew) to construct a new partition (Pnew)

Procedure for Pnew construction For each group G of P do partition G into subgroups such that two states s and t are in the same subgroup if and only if for all input symbols a, states s and t have transitions on a to states in the same group of P Replace G in Pnew by the set of all subgroups formed

Algorithm 3.6(continue..) Step3: If Pnew = P and proceed to step 4. Otherwise repeat step 2 with P=Pnew Step4:Choose one state as the state representative and add these states in M’ Step5: If M’ has a dead state and unreachable state then remove those states (A dead state is a non accepting state that has transitions to itself on all inputs. An unreachable state is any state not reachable from the start state ) Step6: Complete

Example # 1 The DFA for (a|b) *abb

Example # 1 (Applying Minimization)

Example # 2 The DFA for a(b|c)*

Example #2: Applying Minimization

Example # 3 Minimize the following DFA:

Example 3: Minimized

Example # 4 Minimize the following DFA: AB C D E b b b b b a a a a a A start BD E bb a a b a a

28 Section 3.9: From Regular Expression to DFA Directly The “important states” of an NFA are those without an  -transition, that is if move({s},a)   for some a then s is an important state The subset construction algorithm uses only the important states when it determines  -closure(move(T,a))

29 From Regular Expression to DFA Directly (Algorithm) Augment the regular expression r with a special end symbol # to make accepting states important: the new expression is r# Construct a syntax tree for r# Traverse the tree to construct functions nullable, firstpos, lastpos, and followpos

30 From Regular Expression to DFA Directly: Syntax Tree of ( a | b )* abb# * | 1 a 2 b 3 a 4 b 5 b # 6 concatenation closure alternation position number (for leafs  )

31 From Regular Expression to DFA Directly: Annotating the Tree nullable(n): the sub tree at node n generates languages including the empty string firstpos(n): set of positions that can match the first symbol of a string generated by the sub tree at node n lastpos(n): the set of positions that can match the last symbol of a string generated be the sub tree at node n followpos(i): the set of positions that can follow position i in the tree

32 From Regular Expression to DFA Directly: Annotating the Tree Node nnullable(n)firstpos(n)lastpos(n) Leaf  true  Leaf ifalse{i}{i}{i}{i} | / \ c 1 c 2 nullable(c 1 ) or nullable(c 2 ) firstpos(c 1 )  firstpos(c 2 ) lastpos(c 1 )  lastpos(c 2 ) / \ c 1 c 2 nullable(c 1 ) and nullable(c 2 ) if nullable(c 1 ) then firstpos(c 1 )  firstpos(c 2 ) else firstpos(c 1 ) if nullable(c 2 ) then lastpos(c 1 )  lastpos(c 2 ) else lastpos(c 2 ) *|c1*|c1 truefirstpos(c 1 )lastpos(c 1 )

33 From Regular Expression to DFA Directly: Syntax Tree of ( a | b )* abb# {6}{1, 2, 3} {5}{1, 2, 3} {4}{1, 2, 3} {3}{1, 2, 3} {1, 2} * | {1} a {2} b {3} a {4} b {5} b {6} # nullable firstposlastpos 12 3 4 5 6

34 From Regular Expression to DFA Directly: followpos for each node n in the tree do if n is a cat-node with left child c 1 and right child c 2 then for each i in lastpos(c 1 ) do followpos(i) := followpos(i)  firstpos(c 2 ) end do else if n is a star-node for each i in lastpos(n) do followpos(i) := followpos(i)  firstpos(n) end do end if end do

35 From Regular Expression to DFA Directly: Algorithm s 0 := firstpos(root) where root is the root of the syntax tree Dstates := {s 0 } and is unmarked while there is an unmarked state T in Dstates do mark T for each input symbol a   do let U be the set of positions that are in followpos(p) for some position p in T, such that the symbol at position p is a if U is not empty and not in Dstates then add U as an unmarked state to Dstates end if Dtran[T,a] := U end do end do

36 From Regular Expression to DFA Directly: Example 1,2,3 start a 1,2, 3,4 1,2, 3,6 1,2, 3,5 bb bb a a a Nodefollowpos 1{1, 2, 3} 2 3{4} 4{5} 5{6} 6- 1 2 3456

37 Time-Space Tradeoffs Automaton Space (worst case) Time (worst case) NFA O(r)O(r)O(  r  x  ) DFAO(2 |r| ) O(x)O(x)

Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)

Similar presentations

Presentation on theme: "Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)

Similar presentations

Presentation on theme: "Lecture # 4 Chapter 1 (Left over Topics) Chapter 3 (continue)"— Presentation transcript:

Similar presentations

About project

Feedback