Download presentation
Presentation is loading. Please wait.
Published byGabriel Moris Lewis Modified over 8 years ago
1
LING/C SC/PSYC 438/538 Lecture 17 Sandiway Fong
2
Last Time Talked about: – 1. Declarative (logical) reading of grammar rules – 2. Prolog query: s(String,[]). Case 1. String is known: Is String ∈ L(G)? Case 2. String is unknown: Enumerate L(G) – 3. Different search strategies Prolog's (left-to-right) depth-first search Iterative deepening
3
Beyond Regular Languages Beyond regular languages – a n b n = {ab, aabb, aaabbb, aaaabbbb,... } n≥1 – is not a regular language That means no FSA, RE or RG can be built for this set 1.We only have a finite number of states to play with … 2.We’re only allowed simple free iteration (looping) 3.Pumping Lemma proof
4
Beyond Regular Languages Language – a n b n = {ab, aabb, aaabbb, aaaabbbb,... } n>=1 A regular grammar extended to allow both left and right recursive rules can accept/generate it: 1.a --> [a], b. 2.b --> [b]. 3.b --> a, [b]. Example: Set membership Set enumeration
5
Beyond Regular Languages Language – a n b n = {ab, aabb, aaabbb, aaaabbbb,... } n>=1 A regular grammar extended to allow both left and right recursive rules can accept/generate it: 1.a --> [a], b. 2.b --> [b]. 3.b --> a, [b]. Intuition: – grammar implements the stacking of partial trees balanced for a’s and b’s: B A b a A B A b a
6
Beyond Regular Languages Language – a n b n = {ab, aabb, aaabbb, aaaabbbb,... } n>=1 A regular grammar extended to allow both left and right recursive rules can accept/generate it: 1.a --> [a], b. 2.b --> [b]. 3.b --> a, [b]. A type-2 or context-free grammar (CFG) has no restrictions on what can go on the RHS of a grammar rule Note : – CFGs still have a single nonterminal limit for the LHS of a rule Example: 1.s --> [a], [b]. 2.s --> [a], s, [b].
7
Extra Argument: Parse Tree Recovering a parse tree – when want Prolog to return more than just true/false answers – in case of true, we can compute a syntax tree representation of the parse – by adding an extra argument to nonterminals – applies to all grammar rules (not just regular grammars) Example sheeptalk again DCG (non-regular, context-free): s --> [b], [a], a, [!]. a --> [a]. (base case) a --> [a], a. (recursive case) s a! a a a a b
8
Extra Argument: Parse Tree Tree: Prolog data structure: – term – hierarchical – allows sequencing of arguments – functor(arg 1,..,arg n ) – each arg i could be another term or simple atom s a! a a a a b s(b,a,a(a,a(a)),!)
9
Extra Arguments: Parse Tree DCG – s --> [b],[a], a, [!]. – a --> [a]. (base case) – a --> [a], a. (right recursive case) base case –a --> [a]. –a(subtree) --> [a]. –a(a(a)) --> [a]. recursive case –a --> [a], a. –a(subtree) --> [a], a(subtree). –a(a(a,A)) --> [a], a(A). s a! a a a a b s(b,a,a(a,a(a)),!) Idea: for each nonterminal, add an argument to store its subtree
10
Extra Arguments: Parse Tree Prolog grammar – s --> [b], [a], a, [!]. – a --> [a]. (base case) – a --> [a], a. (right recursive case) base and recursive cases –a(a(a)) --> [a]. –a(a(a,A)) --> [a], a(A). start symbol case –s --> [b], [a], a, [!]. –s(tree) --> [b], [a], a(subtree), [!]. –s(s(b,a,A,!) ) --> [b], [a], a(A), [!]. s a! a a a a b s(b,a,a(a,a(a)),!)
11
Extra Arguments: Parse Tree Prolog grammar – s --> [b], [a], a, [!]. – a --> [a]. (base case) – a --> [a], a. (right recursive case) Equivalent Prolog grammar computing a parse –s(s(b,a,A,!)) --> [b], [a], a(A), [!]. –a(a(a)) --> [a]. –a(a(a,A)) --> [a], a(A).
12
Extra Arguments Extra arguments are powerful – they allow us to impose (grammatical) constraints and change the expressive power of the system (if used as memory) Example: – a n b n c n n>0 is not context-free (context-sensitive)
13
Extra arguments A context-free grammar (CFG) + extra argument (EA) for the context-sensitive language { a n b n c n | n>0}: 1.s(s(A,A,A)) --> a(A), b(A), c(A). 2.a(a(a)) --> [a]. 3.a(a(a,X)) --> [a], a(X). 4.b(a(a)) --> [b]. 5.b(a(a,X)) --> [b], b(X). 6.c(a(a)) --> [c]. 7.c(a(a,X)) --> [c], c(X).
14
Extra arguments A CFG+EA for a n b n c n n>0: Set membership question
15
Extra arguments A CFG+EA grammar for a n b n c n n>0: Set enumeration
16
Another grammar for {a n b n c n |n>0} Use Prolog’s arithmetic predicates. { … } embeds Prolog code inside grammar rules These are not nonterminal or terminal symbols. Used in grammar rules, we must enclose these statements within curly braces These are not nonterminal or terminal symbols. Used in grammar rules, we must enclose these statements within curly braces 16
17
Another Grammar for {a n b n c n |n>0} Explicit computation of the number of a’s using arithmetic. { … } embeds Prolog code inside grammar rules
18
Another Grammar for {a n b n c n |n>0} Parsing the a’s
19
Another Grammar for {a n b n c n |n>0} Computing the b’s
20
Another Grammar for {a n b n c n |n>0} Computing the c’s
21
Another grammar for {a n b n c n |n>0} Grammar is “correct” but not so efficient… – consider string [a,a,b,b,b,b,b,b,b,c,c] s --> a(X), b(X), c(X). a(1) --> [a]. a(N) --> [a], a(M), {N is M+1}. b(1) --> [b]. b(N) --> [b], b(M), {N is M+1}. c(1) --> [c]. c(N) --> [c], c(M), {N is M+1}. counts upwards could change to count down could change to count down
22
A context-sensitive grammar for {a n b n c n |n>0} Context-sensitive grammar has rules of the form LHS RHS – such that both LHS and RHS can be arbitrary strings of terminals and non-terminals, and – |RHS| ≥ |LHS| (exception: S ε, S not in RHS) This is almost a normal Prolog DCG: – (but rules 5 & 6 contain more than one non-terminal on the LHS): 1.s --> [a,b,c]. 2.s --> [a],a,[b,c]. 3.a --> [a,b], c. 4.a --> [a],a,[b],c. 5.c,[b] --> [b], c. 6.c,[c] --> [c,c]. rules 5 and 6 are responsible for shuffling the c's to the end
23
A context-sensitive grammar for {a n b n c n |n>0} ?- listing([s,a,c]). 1.s([a, b, c|A], A). 2.s([a|A], C) :- a(A, B), B=[b, c|C]. 3.a([a, b|A], B) :- c(A, B). 4.a([a|A], D) :- a(A, B), B=[b|C], c(C, D). 5.c(A, C) :- A=[b|B], c(B, D), C=[b|D]. 6.c([c, c|A], [c|A]). 1.s --> [a,b,c]. 2.s --> [a],a,[b,c]. 3.a --> [a,b], c. 4.a --> [a],a,[b],c. 5.c,[b] --> [b], c. 6.c,[c] --> [c,c].
24
A context-sensitive grammar for {a n b n c n |n>0} [a,a,a,b,b,b,c,c,c] 1.s 2.[a],a,[b,c] 3.[a],[a],a,[b],c,[b,c] 4.[a],[a],[a,b],c,[b],c,[b,c] 5.[a],[a],[a,b],[b],c,c,[b,c] 6.[a],[a],[a,b],[b],c,[b],c,[c] 7.[a],[a],[a,b],[b],[b],c,c,[c] 8.[a],[a],[a,b],[b],[b],c,[c,c] 9.[a],[a],[a,b],[b],[b],[c,c,c] 10.[a,a,a,b,b,b,c,c,c] 1.s --> [a,b,c]. 2.s --> [a],a,[b,c]. 3.a --> [a,b], c. 4.a --> [a],a,[b],c. 5.c,[b] --> [b], c. 6.c,[c] --> [c,c].
25
A context-sensitive grammar for {a n b n c n |n>0} 1.s([a,a,a,b,b,b,c,c,c],[]) 1.a([a,a,b,b,b,c,c,c],B) 1.a([a,b,b,b,c,c,c],B) 1.c([b,b,c,c,c],B) 1.c([b,c,c,c],D) 1.c([c,c,c],D) 2.=> c([c,c,c],[c,c]) 3.C=[b|[c,c]] 2.=> c([b,c,c,c],[b,c,c]) 3.C=[b|[b,c,c]] 2.=> c([b,b,c,c,c],[b,b,c,c]) 2.=> a([a,b,b,b,c,c,c],[b,b,c,c]) 3.[b,b,c,c]=[b|C](C=[b,c,c]) 4.c([b,c,c],D) 1.c([c,c],D) 2.=> c([c,c],[c]) 3.C=[b|[c]] 5.=> c([b,c,c],[b,c]) 2.=> a([a,a,b,b,b,c,c,c],[b,c]) 3.[b,c]=[b,c|[]] 1.s([a, b, c|A], A). 2.s([a|A], C) :- a(A, B), B=[b, c|C]. 3.a([a, b|A], B) :- c(A, B). 4.a([a|A], D) :- a(A, B), B=[b|C], c(C, D). 5.c(A, C) :- A=[b|B], c(B, D), C=[b|D]. 6.c([c, c|A], [c|A]).
26
A context-sensitive grammar for {a n b n c n |n>0} 1.s([a,a,b,b,b,c,c,c],[]) 1.a([a,b,b,b,c,c,c],B) 1.c([b,b,c,c,c],B) 1.c([b,c,c,c],D) 1.c([c,c,c],D) 2.=> c([c,c,c],[c,c]) 3.C=[b|[c,c]] 2.=> c([b,c,c,c],[b,c,c]) 3.C=[b|[b,c,c]] 2.=> c([b,b,c,c,c],[b,b,c,c]) 2.=> a([a,b,b,b,c,c,c],[b,b,c,c]) 3.[b,b,c,c]=[b,c|[]] FAIL 1.s([a, b, c|A], A). 2.s([a|A], C) :- a(A, B), B=[b, c|C]. 3.a([a, b|A], B) :- c(A, B). 4.a([a|A], D) :- a(A, B), B=[b|C], c(C, D). 5.c(A, C) :- A=[b|B], c(B, D), C=[b|D]. 6.c([c, c|A], [c|A]).
27
A context-sensitive grammar for {a n b n c n |n>0} 1.s([a,a,a,b,b,c,c,c],[]) 1.a([a,a,b,b,c,c,c],B) 1.a([a,b,b,c,c,c],B) 1.c([b,c,c,c],B) 1.c([c,c,c],D) 2.=> c([c,c,c],[c,c]) 3.C=[b|[c,c]] 2.=> c([b,c,c,c],[b,c,c]) 2.=> a([a,b,b,c,c,c],[b,c,c]) 3.[b,c,c]=[b|C](C=[c,c]) 4.c([c,c],D) 5.=> c([c,c],[c]) 2.=> a([a,a,b,b,c,c,c],[c]) 3.[c]=[ b,c|[]] FAIL 1.s([a, b, c|A], A). 2.s([a|A], C) :- a(A, B), B=[b, c|C]. 3.a([a, b|A], B) :- c(A, B). 4.a([a|A], D) :- a(A, B), B=[b|C], c(C, D). 5.c(A, C) :- A=[b|B], c(B, D), C=[b|D]. 6.c([c, c|A], [c|A]).
28
A context-sensitive grammar for {a n b n c n |n>0} 1.s([a,a,a,b,b,b,c,c],[]) 1.a([a,a,b,b,b,c,c],B) 1.a([a,b,b,b,c,c],B) 1.c([b,b,c,c],B) 1.c([b,c,c],D) 1.c([c,c],D) 2.=> c([c,c],[c]) 3.C=[b|[c]] 2.=> c([b,c,c],[b,c]) 3.C=[b|[b,c]] 2.=> c([b,b,c,c],[b,b,c]) 2.=> a([a,b,b,b,c,c],[b,b,c]) 3.[b,b,c]=[b|C](C=[b,c]) 4.c([b,c],D) 1.c([c],D) FAIL 1.s([a, b, c|A], A). 2.s([a|A], C) :- a(A, B), B=[b, c|C]. 3.a([a, b|A], B) :- c(A, B). 4.a([a|A], D) :- a(A, B), B=[b|C], c(C, D). 5.c(A, C) :- A=[b|B], c(B, D), C=[b|D]. 6.c([c, c|A], [c|A]).
29
Natural Language Parsing Syntax trees are a big deal in NLP Stanford Parser – http://nlp.stanford.edu:8080/parser/index.jsp http://nlp.stanford.edu:8080/parser/index.jsp – Uses probabilistic rules learnt from a Treebank corpus We do a lot with Treebanks in the follow-on course to this one (LING 581, Spring) 29
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.