LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/18.

Slides:



Advertisements
Similar presentations
LING/C SC/PSYC 438/538 Lecture 12 Sandiway Fong. Administrivia We'll postpone Homework 4 review until next week …
Advertisements

Top-down Parsing By Georgi Boychev, Rafal Kala, Ildus Mukhametov.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/29.
LING 388: Language and Computers Sandiway Fong Lecture 9: 9/22.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 7: 9/12.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 10: 9/27.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 12: 10/4.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 9: 9/21.
LING 388: Language and Computers Sandiway Fong Lecture 21: 11/7.
LING 388: Language and Computers Sandiway Fong Lecture 12: 10/5.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 7: 9/11.
LING 388 Language and Computers Lecture 8 9/25/03 Sandiway FONG.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 9: 9/25.
LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 6: 9/6.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 11: 10/3.
LING 388: Language and Computers Sandiway Fong Lecture 11: 10/3.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 6: 9/7.
LING 388 Language and Computers Take-Home Final Examination 12/9/03 Sandiway FONG.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 12: 10/5.
LING 388: Language and Computers Sandiway Fong Lecture 17: 10/25.
LING 388 Language and Computers Lecture 11 10/7/03 Sandiway FONG.
Prof. Fateman CS 164 Lecture 91 Bottom-Up Parsing Lecture 9.
LING/C SC/PSYC 438/538 Midterm 10/11. Instructions It is recommended that you attempt all questions –Submit your answers in one file by to
LING 388: Language and Computers Sandiway Fong Lecture 10: 9/26.
LING 438/538 Computational Linguistics Sandiway Fong Lecture 5: 9/5.
LING 388 Language and Computers Lecture 9 9/30/03 Sandiway FONG.
LING 388: Language and Computers Sandiway Fong Lecture 13: 10/10.
LING 388 Language and Computers Lecture 6 9/18/03 Sandiway FONG.
LING/C SC/PSYC 438/538 Lecture 19 Sandiway Fong 1.
LING 388: Language and Computers Sandiway Fong Lecture 11.
Problem of the DAY Create a regular context-free grammar that generates L= {w  {a,b}* : the number of a’s in w is not divisible by 3} Hint: start by designing.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 7 Mälardalen University 2010.
CPSC 388 – Compiler Design and Construction Parsers – Context Free Grammars.
Computational Linguistics Yoad Winter *General overview *Examples: Transducers; Stanford Parser; Google Translate; Word-Sense Disambiguation * Finite State.
LING 388: Language and Computers Sandiway Fong Lecture 7.
LING/C SC/PSYC 438/538 Lecture 14 Sandiway Fong. Administrivia Midterm – This Wednesday – A bit like doing a homework in real time – Bring your laptop.
Top-Down Parsing - recursive descent - predictive parsing
Context Free Grammars CIS 361. Introduction Finite Automata accept all regular languages and only regular languages Many simple languages are non regular:
Context-Free Grammars and Parsing
Context-free Languages
Languages & Grammars. Grammars  A set of rules which govern the structure of a language Fritz Fritz The dog The dog ate ate left left.
LING/C SC/PSYC 438/538 Lecture 7 9/15 Sandiway Fong.
LING 388: Language and Computers Sandiway Fong Lecture 10.
Context Free Grammars. Context Free Languages (CFL) The pumping lemma showed there are languages that are not regular –There are many classes “larger”
LING 388: Language and Computers Sandiway Fong Lecture 13.
LING/C SC/PSYC 438/538 Lecture 12 10/4 Sandiway Fong.
Parsing Introduction Syntactic Analysis I. Parsing Introduction 2 The Role of the Parser The Syntactic Analyzer, or Parser, is the heart of the front.
2. Regular Expressions and Automata 2007 년 3 월 31 일 인공지능 연구실 이경택 Text: Speech and Language Processing Page.33 ~ 56.
LING 388: Language and Computers Sandiway Fong Lecture 12.
LING/C SC/PSYC 438/538 Lecture 13 Sandiway Fong. Administrivia Reading Homework – Chapter 3 of JM: Words and Transducers.
Enter Chomsky Grammars. 2 What has Chomsky* to do with computing? Linguistics and computing intersect at various places: Things that are used to create.
LING/C SC/PSYC 438/538 Lecture 14 Sandiway Fong. Administrivia Homework 6 graded.
More Parsing CPSC 388 Ellen Walker Hiram College.
LING/C SC/PSYC 438/538 Lecture 15 Sandiway Fong. Did you install SWI Prolog?
LING/C SC/PSYC 438/538 Lecture 20 Sandiway Fong 1.
LING/C SC/PSYC 438/538 Lecture 18 Sandiway Fong. Adminstrivia Homework 7 out today – due Saturday by midnight.
CSC312 Automata Theory Lecture # 26 Chapter # 12 by Cohen Context Free Grammars.
LING/C SC/PSYC 438/538 Lecture 16 Sandiway Fong. SWI Prolog Grammar rules are translated when the program is loaded into Prolog rules. Solves the mystery.
LING/C SC/PSYC 438/538 Lecture 19 Sandiway Fong 1.
1 CD5560 FABER Formal Languages, Automata and Models of Computation Lecture 12 Mälardalen University 2007.
Compiler Construction Lecture Five: Parsing - Part Two CSC 2103: Compiler Construction Lecture Five: Parsing - Part Two Joyce Nakatumba-Nabende 1.
CMSC 330: Organization of Programming Languages Pushdown Automata Parsing.
Syntax Analysis By Noor Dhia Syntax analysis:- Syntax analysis or parsing is the most important phase of a compiler. The syntax analyzer considers.
LING/C SC/PSYC 438/538 Lecture 17 Sandiway Fong. Last Time Talked about: – 1. Declarative (logical) reading of grammar rules – 2. Prolog query: s(String,[]).
LING/C SC/PSYC 438/538 Lecture 21 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 17 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 21 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 24 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 25 Sandiway Fong.
LING/C SC/PSYC 438/538 Lecture 26 Sandiway Fong.
Presentation transcript:

LING/C SC/PSYC 438/538 Computational Linguistics Sandiway Fong Lecture 8: 9/18

Administrivia Last Thursday: –class replaced by –Cognitive Science Master’s Seminar Series talk on Statistical Natural Language Parsing and Treebanks

Administrivia This Thursday –class replaced by this Friday’s Linguistics Colloquium

Administrivia This Friday: –Computational Linguistics Colloquium –Presenters: Sandiway Fong, Mike Hammond and Ying Lin –Time: 3:00-4:30 PM –Location: Speech and Hearing 205H –Abstract: –In this talk, we will demonstrate the use of various software to collect and analyze linguistic data of various sorts. –Fong will give a tutorial on using his freely-available treebankviewer software to explore syntactic structures in treebanks. –Hammond will demonstrate the use of his "Finite State Playground", a suite of command-line tools that can be used to construct and manipulate finite-state language models. –Lin will discuss the use of speech technology and statistical tools for phonetic data analysis.

Today’s Topics DCG topics –language enumeration –ambiguity: multiple parses –stepping outside regular grammars just a little bit Homework 2

Regular Grammars: Language Enumeration right recursive regular grammar: s --> [b], ays. ays --> [a], a. a --> [a], exclam. a --> [a], a. exclam --> [!]. language: Sheeptalk –baa! –baaa! –ba...a! this DCG always halts: ?- s([b,a,a,!],[]). Yes ?- s([b,a,!],[]). No as a decider But not in all modes of operation –Prolog’s top-down, left-to-right, depth-first strategy also allows the grammar to be used an enumerator ?- s(L,[]). –L is a variable, we’re not naming the list. Let Prolog figure it out.

Regular Grammars: Language Enumeration Try ?- s(L,[]). –Prolog returns one answer and waits for action (type ; to get the next, to finish) Output: ?- s(L,[]). L = [b,a,a,!] ? ; L = [b,a,a,a,!] ? ; L = [b,a,a,a,a,!] ? ; L = [b,a,a,a,a,a,!] ? ; L = [b,a,a,a,a,a,a,!] ? ; L = [b,a,a,a,a,a,a,a,!] ? ; L = [b,a,a,a,a,a,a,a,a,!] ? Prolog is enumerating the strings of the language

Regular Grammars: Language Enumeration Let’s see why this happens ?- s(L,[]). DCG : 1.s --> [b], ays. 2.ays --> [a], a. 3.a --> [a], exclam. 4.a --> [a], a. 5.exclam --> [!]. derivation tree: s bays aa a exclam ! bays aa a a s aexclam !

Regular Grammars: Language Enumeration left recursive regular grammar: s --> a, [!]. a --> firsta, [a]. a --> a, [a]. firsta --> b, [a]. b --> [b]. doesn’t halt when faced with a string not in the language Example: ?- s([b,a,!],[]). enters an infinite loop Question: –how well does it do when enumerating? –i.e. ?- s(L,[]). –does this query halt?

Regular Grammars: Language Enumeration left recursive regular grammar: 1.s --> a, [!]. 2.a --> firsta, [a]. 3.a --> a, [a]. 4.firsta --> b, [a]. 5.b --> [b]. doesn’t halt when faced with a string not in the language Perhaps surprisingly: ?- s(L,[]). enumerates just fine. derivation tree: s a firsta b b a a ! s a a b b a a a ! and so on

Regular Grammars: Language Enumeration However, this slightly re-ordered left recursive regular grammar: 1.s --> a, [!]. 2.a --> a, [a]. 3.a --> firsta, [a]. 4.firsta --> b, [a]. 5.b --> [b]. won’t halt when enumerating Why? s a a a a... descends infinitely using rule 2

DCG: Ambiguity A context-free grammar that returns a parse: avoiding some nasty recursion for infinite loops s(s(NP,VP)) --> np(NP), vp(VP). np(np(i)) --> [i]. np(np(D,N)) --> det(D), noun(N). np(np(D,N,PP)) --> det(D), noun(N), pp(PP). det(d(the)) --> [the]. det(d(a)) --> [a]. noun(n(boy)) --> [boy]. noun(n(telescope)) --> [telescope]. vp(vp(V,NP)) --> verb(V), np(NP). vp(vp(V,NP,PP)) --> verb(V), np(NP), pp(PP). verb(v(saw)) --> [saw]. pp(pp(P,NP)) --> p(P), np(NP). p(p(with)) --> [with].

DCG: Ambiguity Parsing : ?- s(X,[i,saw,the,boy,with,a,telescope],[]). X = s(np(i),vp(v(saw),np(d(the),n(boy),pp(p(with),np(d(a),n(telescope)))))) ? ; X = s(np(i),vp(v(saw),np(d(the),n(boy)),pp(p(with),np(d(a),n(telescope))))) ? ; no Prolog finds both parses by backtracking. But why does Prolog return the parses in this order? 5min exercise: re-arrange grammar to produce the opposite order of parses

Regular Grammars: Stepping Outside Beyond regular grammars –a n b n = {ab, aabb, aaabbb, aaaabbbb,... } n>=1 is not a regular language (accept fact for now) A regular grammar extended to allow both left and right recursive rules can accept/generate it a --> [a], b. b --> [b]. b --> a, [b]. B A b a A B A b a Example: this grammar accepts –aabb aaaabbbb and rejects –aab aaaabbbbb Intuition: –grammar implements the stacking of partial trees balanced for a’s and b’s:

Homework 2 Ground rules: –all in one single file please –no more multiple attachments

Homework 2 Question 1 –Consider a language L 3or5 = {111, 11111, , , , , ,...} –each member of L 3or5 is a string containing only 1s –the number of 1s in each string is divisible by 3 or 5 Give a regular grammar that generates language L 3or5 Show your Prolog code works on strings in the language and rejects strings outside the language

Homework Question 2: Language L = {a 2n b n+1 | n>=1} is also non-regular nearly twice as many a’s as in the b’s but there’s always one b too many for this to be true but can be generated by a regular grammar extended to allow left and right recursive rules Given a Prolog grammar satisfying these rules for L Show it accepts and rejects correctly Legit rules: X -> aY X -> a X -> Ya –where X, Y are non- terminals, a is some arbitirary terminal

Homework 2 Question 3 (Optional 438) A right recursive regular grammar that generates a (rightmost) parse for the language {a n | n>=2}: s(s(a,B)) --> [a], b(B). b(b(a,B)) --> [a], b(B). b(b(a))--> [a]. Example ?- s(Tree,[a,a,a],[]). Tree = s(a,b(a,b(a)))? A corresponding left recursive regular grammar will not halt in all cases when an input string is supplied Modify the right recursive grammar to produce a left recursive parse, e.g. ?- s(Tree,[a,a,a],[]). Tree = s(b(b(a),a),a) Can you do this for any right recursive regular grammar? e.g. the one for sheeptalk Explain your answer