Presentation is loading. Please wait.

Presentation is loading. Please wait.

CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Context-free.

Similar presentations


Presentation on theme: "CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Context-free."— Presentation transcript:

1 CSCI 3130: Automata theory and formal languages Andrej Bogdanov http://www.cse.cuhk.edu.hk/~andrejb/csc3130 The Chinese University of Hong Kong Context-free languages Fall 2011

2 Precedence in arithmetic expressions bash-3.2$ python Python 2.6.5 (r265:79359, Mar 24 2010, 01:32:55) >>> 2+3*5 17 + 5 23 * 2 * 35 + or = 25 = 17

3 Grammars describe meaning EXPR → EXPR + TERM EXPR → TERM TERM → TERM * NUM TERM → NUM NUM → 0 - 9 rules for valid (simple) arithmetic expressions 23*+5 NUM TERM EXPR Rules always yield the correct meaning

4 The grammar of English SENTENCE → NOUN-PHRASE VERB-PHRASE a girl likes the boy NOUN-PHRASE VERB-PHRASE NOUN-PHRASE → A-NOUN or → A-NOUN PREP-PHRASE a girl A-NOUN a girl with a flower A-NOUNPREP-PHRASE

5 The grammar of English NOUN-PHRASE → A-NOUN or → A-NOUN PREP-PHRASE a girla girl with a flower A-NOUN PREP-PHRASE PREP-PHRASE → PREP NOUN-PHRASE with a flower PREPNOUN-PHRASE recursive structure

6 The grammar of (parts of) English SENTENCE → NOUN-PHRASE VERB-PHRASE NOUN-PHRASE → A-NOUN NOUN-PHRASE → A-NOUN PREP-PHRASE VERB-PHRASE → CMPLX-VERB VERB-PHRASE → CMPLX-VERB PREP-PHRASE PREP-PHRASE → PREP A-NOUN A-NOUN → ARTICLE NOUN CMPLX-VERB → VERB NOUN-PHRASE CMPLX-VERB → VERB ARTICLE → a ARTICLE → the NOUN → boy NOUN → girl NOUN → flower VERB → likes VERB → touches VERB → sees PREP → with

7 The meaning of sentences a girl with a flower likes the boy ARTICLENOUNPREPARTICLENOUNVERBARTICLENOUN SENTENCE VERB-PHRASE NOUN-PHRASE CMPLX-VERB PREP-PHRASE NOUN-PHRASE A-NOUN

8 Context-free grammar A → 0 A 1 A → B B → # A  0A1 0A1  00 A 11  000 A 111  000 B 111  000#111 derivation productions variables terminals start variable

9 Context-free grammar A context-free grammar is given by (V, , R, S) where –V is a finite set of variables or non-terminals –  is a finite set of terminals –R is a set of productions or substitution rules of the form A is a variable and  is a string of variables and terminals –S is a variable called the start variable A → 

10 Notation and conventions E  E + E E  ( E ) E  N E  E + E | ( E ) | N N  0 N | 1 N | 0 | 1 Variables: E, N Terminals: +, *, (, ), 0, 1 Start variable: E N  0 N N  1 N N  0 N  1 Variables in UPPERCASE Start variable comes first conventions : shorthand :

11 Derivation A derivation is a sequential application of productions: E derivation  E + E  ( E )+ E  ( E )+ N  ( E + E )+ 1  ( E + N )+ 1  ( N + N )+ 1  ( N + 1 N )+ 1  ( N + 10)+ 1  (1 + 10)+ 1  one production  * derivation E  E + E | ( E ) | N N  0 N | 1 N | 0 | 1 E  (1 + 10)+ 1 *

12 Context-free languages The language of a CFG is the set of all strings at the end of a derivation L(G) = {w : w   * and S  w } * Questions we will ask: I give you a CFG, what is the language? I give you a language, write a CFG for it

13 Analysis example 1 Can you derive: A → 0 A 1 | B B → # 00#11 00#111 00##11 # A  0A1 0A1  00 A 11  00 B 11  00#11 A  B B  # # No, there is an uneven number of 0 s and 1 s No, there are too many # L(G) = { 0 n #1 n : n ≥ 0}

14 Analysis example 2 Can you derive S  SS | ( S ) |  S  ( S ) (2)  () (3) S  ( S )  ( SS )  (( S ) S )  (( S )( S ))  (()( S ))  (()()) () (()())

15 Parse trees A parse tree gives a more compact representation: S  ( S )  ( SS )  (( S ) S )  (( S )( S ))  (()( S ))  (()()) (()()) S S  SS | ( S ) |  SS () S  () S  S ( )

16 Parse trees S  ( S )  ( SS )  (( S ) S )  (( S )( S ))  (()( S ))  (()()) S S S () S  S ( ) () S  S  ( S )  ( SS )  (( S ) S )  (() S )  (()( S ))  (()()) S  ( S )  ( SS )  ( S ( S ))  (( S )( S ))  (()( S ))  (()()) S  ( S )  ( SS )  ( S ( S ))  ( S ())  (( S )())  (()()) One parse tree can represent many derivations

17 Analysis example 2 Can you derive S  SS | ( S ) |  (()() No, because there is an uneven number of ( and ) ())(() No, because there is a prefix with an excess of )

18 Analysis example 2 S  SS | ( S ) |  L(G) = {w: w has the same number of ( and ) no prefix of w has more ) than ( } ( ( ) ( ) ) ( ) Parsing rules: Divide w up in blocks with same number of ( and ) Each block is in L(G) Parse each block recursively S S S S S S S   S S 

19 Design example 1 L = {0 n 1 n | n  0} S  These strings have recursive structure: 000000111111 0000011111 00001111 000111 0011 01  0S1| 

20 Design example 2 L = numbers without leading zeros 0, 109, 2, 23 , 01, 003 allowednot allowed L → 1|2|3|4|5|6|7|8|9 S → 0|LN D → 0|L N → DN|  1052870032 any number N leading digit L

21 Design examples L = {0 n 1 n 0 m 1 m | n  0, m  0} These strings have two parts: L 1 = {0 n 1 n | n  0} L 2 = {0 m 1 m | m  0} L = L 1 L 2 rules for L 1 :S 1  0S 1 1|  L 2 is the same as L 1 S  S 1 S 1 S 1  0S 1 1 |  010011 000111 00110011

22 Design examples L = {0 n 1 m 0 m 1 n | n  0, m  0} These strings have nested structure: inner part: 1 m 0 m outer part: 0 n 1 n S  0S1|I I  1I0 |  011001 1100 0011 00110011

23 Design examples L = {x: x has two 0-blocks with same number of 0s} 01011, 001011001, 10010101001 01001000, 01111 allowednot allowed 10010011010010110 initial part middle partfinal part ABC A : , or ends in 1 C : , or begins with 1

24 Design examples 10010011010010110 ABC A : , or ends in 1 C : , or begins with 1 A →  | U1 U → 0U | 1U |  C →  | 1U D → 1U1 | 1 S → ABC B has recursive structure: 00110100 D same number of 0 s at least one 0 B → 0D0 | 0B0 U : any string D : begins and ends in 1

25 Context-free versus regular Write a CFG for the language (0 + 1)*111 Can you do so for every regular language? S  U111 U  0U | 1U |  Every regular language is context-free regular expression DFANFA

26 From regular to context-free regular expression   a (alphabet symbol) E 1 + E 2 CFG E1E2E1E2 E1*E1* grammar with no rules S  →  S →  a S  → S 1 | S 2 S  → S 1 S 2 S  → SS 1 |  ( S becomes the new start symbol)

27 Context-free versus regular Is every context-free language regular? S → 0S1 |  L = {0 n 1 n : n ≥ 0} Is context-free but not regular regularcontext-free


Download ppt "CSCI 3130: Automata theory and formal languages Andrej Bogdanov The Chinese University of Hong Kong Context-free."

Similar presentations


Ads by Google