Context-free grammars. Roadmap Last time – Regex == DFA – JLex for generating Lexers This time – CFGs, the underlying abstraction for Parsers.


1 Context-free grammars

2 Roadmap
Last time
– Regex == DFA
– JLex for generating lexers
This time
– CFGs, the underlying abstraction for parsers

3 RegExs Are Great!
Perfect for tokenizing a language
They do have some limitations
– They describe a limited class of languages, which cannot express all the programming constructs we need
– No notion of structure
Let's explore both of these issues

4 Limitations of RegExs
Cannot handle "matching"
– E.g., the language of balanced parentheses $L_{()} = \{\, (^n\,)^n \mid n > 0 \,\}$ cannot be matched
– Intuition: an FSM has finitely many states, so it can only keep track of a bounded depth of nested parentheses
Let's see a diagram…

5 Theorem: No RegEx/DFA can describe the language $L_{()}$
By contradiction:
– Suppose there exists a DFA A for $L_{()}$, and A has N states
– A has to accept the string $(^N\,)^N$ along some path $q_0 q_1 \dots q_N \dots q_{2N}$
– By the pigeonhole principle, some state repeats among the N+1 states $q_0 \dots q_N$: $q_i = q_j$ for some $0 \le i < j \le N$
– Therefore the run $q_0 q_1 \dots q_i q_{j+1} \dots q_N \dots q_{2N}$ is also accepting
– So A accepts the string $(^{N-(j-i)}\,)^N$, which has fewer opening than closing parentheses and is not in $L_{()}$: contradiction!
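The pigeonhole step can be checked mechanically on a concrete machine. The sketch below is only an illustration: the three-state table `DELTA` is a made-up DFA (a depth counter that saturates at 2), not one taken from the slides. It finds the repeated state while reading N opening parentheses and confirms that the balanced string and the shortened string end in the same state, so no choice of accepting states can separate them.

```python
# Hypothetical 3-state DFA over '(' and ')': a depth counter that saturates at 2.
DELTA = {
    (0, "("): 1, (1, "("): 2, (2, "("): 2,
    (0, ")"): 0, (1, ")"): 0, (2, ")"): 1,
}
START = 0
N_STATES = 3


def run(text, state=START):
    """Return the state the DFA is in after reading text."""
    for ch in text:
        state = DELTA[(state, ch)]
    return state


def find_repeat():
    """Read '(' * N_STATES and return (i, j), i < j, with q_i == q_j."""
    first_seen = {START: 0}
    state = START
    for step in range(1, N_STATES + 1):
        state = DELTA[(state, "(")]
        if state in first_seen:
            return first_seen[state], step
        first_seen[state] = step
    raise AssertionError("unreachable: N + 1 visits to N states must repeat")


i, j = find_repeat()
balanced = "(" * N_STATES + ")" * N_STATES
shortened = "(" * (N_STATES - (j - i)) + ")" * N_STATES
# Both strings drive the DFA to the same state, so the DFA cannot accept
# one and reject the other.
print(i, j, run(balanced) == run(shortened))
```

The same check goes through for any transition table, because skipping the loop between positions i and j never changes the state reached afterwards.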

6 Limitations of RegEx: Structure
Our Enhanced-RegEx scanner can emit a stream of tokens:
X = Y + Z becomes ID ASSIGN ID PLUS ID
… but this doesn't really enforce any order of operations
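To make the "no structure" point concrete, here is a minimal regex-based scanner sketch. The token patterns are assumptions chosen to mirror the slide's ID / ASSIGN / PLUS names; this is not the JLex specification from the course. Note that `finditer` silently skips characters no pattern matches, which a real scanner would report as an error.

```python
import re

# Illustrative token patterns; the names mirror the slide's ID / ASSIGN / PLUS.
TOKEN_SPEC = [
    ("ID", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("PLUS", r"\+"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return the flat list of token names, dropping whitespace."""
    return [m.lastgroup for m in MASTER.finditer(text) if m.lastgroup != "SKIP"]


print(tokenize("X = Y + Z"))  # ['ID', 'ASSIGN', 'ID', 'PLUS', 'ID']
print(tokenize("X + = Y"))    # ['ID', 'PLUS', 'ASSIGN', 'ID']
```

The second call shows the limitation: the scanner happily tokenizes a sequence that is not a legal statement, because it only sees a flat stream.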

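7 The Chomsky Hierarchy
Language classes, from least to most powerful:
– Regular (recognized by an FSM)
– Context-Free
– Context-Sensitive
– Recursively enumerable (recognized by a Turing machine)
Power increases down the list; efficiency of recognition decreases
Happy medium?
Pictured: Noam Chomsky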

8 Context Free Grammars (CFGs) A set of (recursive) rewriting rules to generate patterns of strings Can envision a “parse tree” that keeps structure

9 CFG: Intuition
S → '(' S ')'
A rule that says that you can rewrite S to be an S surrounded by a single pair of parentheses
Before applying rule: S
After applying rule: ( S )

10–11 Context Free Grammars (CFGs)
A CFG is a 4-tuple (N, Σ, P, S)
– N is a set of non-terminals, e.g., A, B, S… (placeholders / interior nodes in the parse tree)
– Σ is the set of terminals (tokens from the scanner)
– P is a set of production rules (rules for deriving strings)
– S (in N) is the initial non-terminal symbol (if not otherwise specified, use the non-terminal that appears on the LHS of the first production as the start)
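One way to hold the 4-tuple in a program is sketched below. The names `CFG`, `is_well_formed` and `PARENS` are invented for this illustration; the check simply restates the definition (S is in N, N and Σ are disjoint, every production has a nonterminal on the left and only known symbols on the right).

```python
from typing import NamedTuple


class CFG(NamedTuple):
    nonterminals: frozenset  # N
    terminals: frozenset     # Sigma
    productions: tuple       # P: pairs (lhs, rhs), rhs is a tuple of symbols
    start: str               # S


def is_well_formed(grammar):
    """Check the side conditions in the definition of a CFG."""
    symbols = grammar.nonterminals | grammar.terminals
    return (
        grammar.start in grammar.nonterminals
        and not (grammar.nonterminals & grammar.terminals)
        and all(
            lhs in grammar.nonterminals and all(sym in symbols for sym in rhs)
            for lhs, rhs in grammar.productions
        )
    )


# The parenthesis grammar: S -> '(' S ')' | epsilon (the empty tuple is epsilon).
PARENS = CFG(
    nonterminals=frozenset({"S"}),
    terminals=frozenset({"(", ")"}),
    productions=(("S", ("(", "S", ")")), ("S", ())),
    start="S",
)
print(is_well_formed(PARENS))  # True
```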

12 Production Syntax
LHS → RHS
– LHS: a single nonterminal symbol
– RHS (expression): a sequence of terminals and nonterminals
Examples:
S → '(' S ')'
S → ε

13 Production Shorthand
Nonterm → expression
Nonterm → ε
equivalently:
Nonterm → expression
        | ε
equivalently:
Nonterm → expression | ε
Example:
S → '(' S ')'
S → ε
equivalently:
S → '(' S ')'
  | ε
equivalently:
S → '(' S ')' | ε
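The two alternatives of S translate almost directly into a recursive function. The following is a hand-written sketch (the function names are made up here), not a parser generated by any tool. Because of the ε alternative, this grammar also accepts the empty string, i.e. n ≥ 0, whereas the language on slide 4 requires n > 0.

```python
def parse_s(text, pos=0):
    """Match S starting at pos; return the position after it, or None."""
    if pos < len(text) and text[pos] == "(":
        # Alternative 1: S -> '(' S ')'
        after_inner = parse_s(text, pos + 1)
        if (
            after_inner is not None
            and after_inner < len(text)
            and text[after_inner] == ")"
        ):
            return after_inner + 1
        return None
    # Alternative 2: S -> epsilon (consume nothing)
    return pos


def in_language(text):
    """True if the whole text can be derived from S."""
    return parse_s(text) == len(text)


print(in_language("(())"), in_language("(()"), in_language(""))
```

The recursion depth grows with the nesting depth, which is exactly the unbounded memory a finite-state machine lacks.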

14 Derivations
To derive a string:
– Start by setting "Current Sequence" to the start symbol
– Repeat:
  – Find a nonterminal X in the Current Sequence
  – Find a production of the form X → α
  – "Apply" the production: create a new "Current Sequence" in which α replaces X
– Stop when there are no more nonterminals
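The derivation loop can be sketched in a few lines. In this illustration the grammar is the parenthesis grammar from the earlier slides, the rule numbers are invented labels, and the caller supplies which production to apply at each step; rewriting the leftmost matching nonterminal is an assumption, since the procedure above allows any nonterminal to be chosen.

```python
# Parenthesis grammar; the rule numbers 1 and 2 are labels made up for this sketch.
PRODUCTIONS = {
    1: ("S", ["(", "S", ")"]),  # S -> '(' S ')'
    2: ("S", []),               # S -> epsilon
}
NONTERMINALS = {"S"}


def apply_production(current, rule):
    """Replace the leftmost occurrence of the rule's LHS with its RHS."""
    lhs, rhs = PRODUCTIONS[rule]
    position = current.index(lhs)  # raises ValueError if the LHS is absent
    return current[:position] + rhs + current[position + 1:]


def derive(start, rules):
    """Apply the rules in order and return every intermediate sequence."""
    current = [start]
    history = [current]
    for rule in rules:
        current = apply_production(current, rule)
        history.append(current)
    if any(symbol in NONTERMINALS for symbol in current):
        raise ValueError("derivation is not finished: nonterminals remain")
    return history


for step in derive("S", [1, 1, 2]):
    print("".join(step))
```

Applying rules 1, 1, 2 rewrites S to (S), then ((S)), then (()), at which point no nonterminals remain and the derivation stops.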

15 Derivation Syntax

16 An Example Grammar

17–31 An Example Grammar
Terminals (for readability, bold and lowercase)
– begin, end: program boundary
– semicolon: represents ";", separates statements
– assign: represents "=" statement
– id: identifier / variable name
– plus: represents "+" expression
Nonterminals (for readability, italics and UpperCamelCase)
– Prog: root of the parse tree
– Stmts: list of statements
– Stmt: a single statement
– Expr: a mathematical expression
Productions (define the syntax of legal programs)
Prog → begin Stmts end
Stmts → Stmts semicolon Stmt | Stmt
Stmt → id assign Expr
Expr → id | Expr plus id

32 Productions
1. Prog → begin Stmts end
2. Stmts → Stmts semicolon Stmt
3.       | Stmt
4. Stmt → id assign Expr
5. Expr → id
6.      | Expr plus id

33–45 Derivation Sequence and Parse Tree
Key: terminal, nonterminal, rule used
The parse tree is built one production at a time, starting from Prog; the rules are used in the order 1, 2, 3, 4, 4, 5, 6, 5:
– Rule 1: Prog gets the children begin, Stmts, end
– Rule 2: that Stmts gets the children Stmts, semicolon, Stmt
– Rule 3: the inner Stmts gets the single child Stmt
– Rule 4 (twice): each Stmt gets the children id, assign, Expr
– Rule 5: one Expr gets the single child id
– Rule 6: the other Expr gets the children Expr, plus, id
– Rule 5: the innermost Expr gets the single child id
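The rule order shown in the "Rule used" column (1, 2, 3, 4, 4, 5, 6, 5) can be replayed as a derivation. This sketch assumes that each rule rewrites the leftmost occurrence of its left-hand side; the transcript does not record which occurrence the slides expanded, so the resulting string is one consistent reading rather than a confirmed reproduction of the figure.

```python
# Numbered productions from the slides: rule number -> (LHS, RHS symbols).
PRODUCTIONS = {
    1: ("Prog", ["begin", "Stmts", "end"]),
    2: ("Stmts", ["Stmts", "semicolon", "Stmt"]),
    3: ("Stmts", ["Stmt"]),
    4: ("Stmt", ["id", "assign", "Expr"]),
    5: ("Expr", ["id"]),
    6: ("Expr", ["Expr", "plus", "id"]),
}


def apply_rule(sequence, rule):
    """Rewrite the leftmost occurrence of the rule's LHS (an assumption)."""
    lhs, rhs = PRODUCTIONS[rule]
    position = sequence.index(lhs)  # raises ValueError if the LHS is absent
    return sequence[:position] + rhs + sequence[position + 1:]


sequence = ["Prog"]
for rule in [1, 2, 3, 4, 4, 5, 6, 5]:
    sequence = apply_rule(sequence, rule)
    print(rule, " ".join(sequence))
```

Under that assumption the final sequence contains only terminals: two assignment statements separated by a semicolon, the second with a plus expression on its right-hand side.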

46 MAKEFILE A five minute introduction

47 Makefiles: Motivation
Typing the series of commands to generate our code can be tedious
– Multiple steps that depend on each other
– Somewhat complicated commands
– May not need to rebuild everything
Makefiles solve these issues
– Record a series of commands in a script-like DSL
– Specify dependency rules, and Make generates the results

48–51 Makefiles: Basic Structure
target: dependencies
(tab) command
Example
Example.class: Example.java IO.class
(tab) javac Example.java
IO.class: IO.java
(tab) javac IO.java
– Example.class depends on Example.java and IO.class
– Example.class is generated by javac Example.java

52–54 Makefiles: Dependencies
Example
Example.class: Example.java IO.class
    javac Example.java
IO.class: IO.java
    javac IO.java
Internal dependency graph
– Example.class depends on Example.java and IO.class
– IO.class depends on IO.java
A file is rebuilt if one of its dependencies changes
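The rebuild decision can be modeled in a few lines of Python. This is a simplified sketch of the idea, not how Make is implemented: it compares made-up integer timestamps, treats a target as stale when it is missing or older than any dependency, and the function name `targets_to_rebuild` is invented for the illustration.

```python
# Rules from the slide: target -> list of dependencies.
RULES = {
    "Example.class": ["Example.java", "IO.class"],
    "IO.class": ["IO.java"],
}


def targets_to_rebuild(goal, mtimes):
    """Return the targets that need rebuilding, dependencies first.

    mtimes maps file name -> modification time; a target with no entry
    is treated as not existing yet.
    """
    now = max(mtimes.values()) + 1  # timestamp given to freshly built files
    rebuilt = []

    def visit(name):
        if name not in RULES:
            return mtimes[name]  # plain source file, nothing to build
        newest_dependency = max(visit(dep) for dep in RULES[name])
        own = mtimes.get(name)
        if own is None or own < newest_dependency:
            rebuilt.append(name)
            return now
        return own

    visit(goal)
    return rebuilt


# IO.java was edited after IO.class was built, so both class files are stale.
print(targets_to_rebuild(
    "Example.class",
    {"Example.java": 1, "IO.java": 5, "IO.class": 2, "Example.class": 3},
))
```

Rebuilding IO.class makes it newer than Example.class, which is why the change to IO.java propagates up the dependency graph.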

55–58 Makefiles: Variables
You can thread common configuration values through your makefile
Example
JC = /s/std/bin/javac
JFLAGS = -g   (build for debug)
Example.class: Example.java IO.class
    $(JC) $(JFLAGS) Example.java
IO.class: IO.java
    $(JC) $(JFLAGS) IO.java

59–61 Makefiles: Phony Targets
You can run commands through make.
– Write a target with no dependencies (called phony)
– Make will execute the command every time the target is requested, as long as no file with the target's name exists
Example
clean:
    rm -f *.class
test:
    java -cp . Test

62 Recap
We've defined context-free grammars
– More powerful than regular expressions
Learned a bit about Makefiles
Next time we'll look at grammars in more detail

