LING 438/538 Computational Linguistics Sandiway Fong Lecture 23: 11/20.



Today’s Topics
Three things:
1. Continue with the context-free grammar example
   – deal with the left recursion problem...
2. Homework
   – your chance to write a context-free grammar
3. 538 Class Presentations
   – selecting a chapter
   – format etc.

Last Time
Let’s write a context-free grammar that returns parse trees for simple active/passive sentence pairs such as:
– John hit a ball/John ate a sandwich
– *John hit/John ate
– *hit a ball/*ate a sandwich
– the ball was hit/the sandwich was eaten
– the ball was hit by John/the sandwich was eaten by John
Let’s introduce traces in the case of passives:
– [S [NP the ball] [VP [aux was] [VP [V hit] [NP trace]]]]
– [S [NP the ball] [VP [VP [aux was] [VP [V hit] [NP trace]]] [PP [P by] [NP John]]]]

Grammar
Note:
– need to handle English passive morphology
– passive be selects for a V-en
Example:
– *was ate (simple past form)
– was eaten (-en past participle form)
Implementation:
– use an extra argument to indicate the verb form
    v(v(ate),past) --> [ate].
    v(v(eaten),pastparticiple) --> [eaten].
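To see the extra verb-form argument at work in isolation, here is a minimal sketch. The v//2 rules are the two from the slide; passive_be//1 and vp//1 are simplified, hypothetical rules used only for this illustration (the class grammar uses more arguments).

```prolog
% Verb entries carry their form as a second argument (from the slide).
v(v(ate),   past)           --> [ate].
v(v(eaten), pastparticiple) --> [eaten].

% Hypothetical, simplified passive "be" (no form argument here).
passive_be(v(was)) --> [was].

% Hypothetical VP rule: passive be only combines with a verb whose
% form argument is pastparticiple, so "was ate" cannot be parsed.
vp(vp(BE, V)) --> passive_be(BE), v(V, pastparticiple).

% ?- phrase(vp(T), [was, eaten]).
% T = vp(v(was), v(eaten)).
% ?- phrase(vp(T), [was, ate]).
% false.
```

The selection is done purely by unification: the head v(v(ate),past) cannot match a call that asks for pastparticiple, so no extra test is needed.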

Grammar [Developed in class]
    s(s(NP,VP)) --> np(NP,notrace), vp(VP,_,_,notrace).
    np(np(Det,N),notrace) --> det(Det), common_noun(N).
    np(np(N),notrace) --> proper_noun(N).
    np(np(trace),trace) --> [].
    proper_noun(john) --> [john].
    det(det(the)) --> [the].
    det(det(a)) --> [a].
    common_noun(n(ball)) --> [ball].
    common_noun(n(sandwich)) --> [sandwich].
    vp(vp(BE,VP),Form,selectsforvp,notrace) --> passive_be(BE,Form), vp(VP,pastparticiple,transitive,trace).
    vp(vp(V,NP),Form,transitive,EC) --> transitive(V,Form), np(NP,EC).
    vp(vp(V),Form,intransitive,notrace) --> intransitive(V,Form).
    vp(vp(VP,PP),Form,transitive,EC) --> vp(VP,Form,transitive,EC), pp(PP).

Grammar
    passive_be(v(be),root) --> [be].
    passive_be(v(is),thirdpersonpresent) --> [is].
    passive_be(v(was),past) --> [was].
    intransitive(v(eat),root) --> [eat].
    intransitive(v(eats),s) --> [eats].
    intransitive(v(ate),past) --> [ate].
    intransitive(v(eaten),pastparticiple) --> [eaten].
    transitive(v(eat),root) --> [eat].
    transitive(v(eats),s) --> [eats].
    transitive(v(ate),past) --> [ate].
    transitive(v(eaten),pastparticiple) --> [eaten].
    transitive(v(hit),root) --> [hit].
    transitive(v(hits),s) --> [hits].
    transitive(v(hit),past) --> [hit].
    transitive(v(hit),pastparticiple) --> [hit].
    pp(pp(P,NP)) --> p(P), np(NP,notrace).
    p(p(by)) --> [by].

Grammatical Sentences
– John hit a ball/John ate a sandwich
– John ate
– the ball was hit/the sandwich was eaten
– the ball was hit by John/the sandwich was eaten by John
    ?- s(X,[john,hit,the,ball],[]).
    X = s(np(john),vp(v(hit),np(det(the),n(ball))))
    ?- s(X,[john,ate,a,sandwich],[]).
    X = s(np(john),vp(v(ate),np(det(a),n(sandwich))))
    ?- s(X,[john,ate],[]).
    X = s(np(john),vp(v(ate)))
    ?- s(X,[the,sandwich,was,eaten],[]).
    X = s(np(det(the),n(sandwich)),vp(v(was),vp(v(eaten),np(trace))))
    ?- s(X,[the,sandwich,was,eaten,by,john],[]).
    X = s(np(det(the),n(sandwich)),vp(v(was),vp(vp(v(eaten),np(trace)),pp(p(by),np(john)))))

Infinite loop
Occurs with ungrammatical input
– *John hit
Also with grammatical input when we ask for more solutions
– i.e. invoke backtracking
– John ate a sandwich
Computational System
– involves recursion
– Prolog also selects the first matching rule
– but will try other rules on backtracking

Solving the Left Adjunction Problem
Rule (simplified):
    vp --> vp, pp.
– causes Prolog to go into an infinite loop
Why?
– Suppose there is no PP in the input
– what happens on backtracking?

Solving the Left Adjunction Problem
Idea:
– Look ahead into the input for a potential PP
– License Prolog to use the VP adjunction rule only when there is an appropriate (overt) preposition ahead in the input

Solving the Left Adjunction Problem
Implementation:
– requires access to the input list
– not available directly from the DCG rule
DCG rules are translated into underlying Prolog rules that contain input/output list pairs
Example: DCG rule
    vp(vp(VP,PP),Number) --> vp(VP,Number), pp(PP).
– gets translated into Prolog as
    vp(vp(A,B), C, D, E) :- vp(A, C, D, F), pp(B, F, E).
– D = part of the sentence to be analyzed by the VP rule
– E = part left over after the VP rule
– e.g. once the subject (the sandwich) has been parsed: D = [was,eaten,by,john]
– E = []
– F = ?
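One way to inspect the translation yourself is to ask the Prolog system for it. The sketch below assumes expand_term/2 performs DCG translation, as it does in SWI-Prolog; other systems may differ. translation/1 is a hypothetical helper name, and the DCG rule is written with the Number argument in its head so that it lines up with the four-argument Prolog clause.

```prolog
% translation(-Clause): hypothetical helper that returns the Prolog
% clause the simplified VP adjunction DCG rule is rewritten into.
% Assumption: expand_term/2 applies DCG translation (SWI-Prolog does).
translation(Clause) :-
    expand_term(( vp(vp(VP,PP), Number) --> vp(VP, Number), pp(PP) ),
                Clause).

% ?- translation(C).
% C has the shape (variable names will differ):
%   vp(vp(A,B), C, D, E) :- vp(A, C, D, F), pp(B, F, E).
% The last two arguments of each call are the input/output list pair:
% the output of the inner vp (F) is threaded in as the input of pp.
```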

Solving the Left Adjunction Problem
DCG rules are translated into underlying Prolog rules that contain input/output list pairs
Example: DCG rule
    vp(vp(VP,PP),Number) --> vp(VP,Number), pp(PP).
– gets translated into Prolog:
    vp(vp(A,B), C, D, E) :- vp(A, C, D, F), pp(B, F, E).
Solution:
– modify the underlying Prolog rule directly
– add a call to a Prolog predicate that checks the input list for a preposition
    vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).

Solving the Left Adjunction Problem
Prolog VP adjunction rule:
    vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).
Implementation of the supporting predicate checkforpp/1:
    % checkforpp(List) true if List contains a preposition (by)
    checkforpp([by|_]).
    checkforpp([_|L]) :- checkforpp(L).
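A quick way to check the lookahead predicate on its own is to load just its two clauses and query it with the remaining input after the subject. The sample lists are taken from the example sentences.

```prolog
% checkforpp(List): true if List contains the preposition by
% (definition as given on the slide).
checkforpp([by|_]).
checkforpp([_|L]) :- checkforpp(L).

% ?- checkforpp([was,eaten,by,john]).
% true.
% ?- checkforpp([was,eaten]).
% false.
% ?- checkforpp([hit]).
% false.
```

Because the check fails on [hit], the adjunction rule is never entered for *John hit, which is exactly the case that looped before.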

Solving the Left Adjunction Problem
Actually, this only partially solves the problem
Case 1: no PP in input
– VP adjunction won’t be triggered because checkforpp/1 fails
    vp(vp(A,B), C, D, E) :- checkforpp(D), vp(A, C, D, F), pp(B, F, E).
Case 2: there is a PP in input
– still get recursion on backtracking... and an infinite loop
– because each recursion is licensed by the same PP
Idea:
– need to say that we license VP adjunction one PP at a time
Prolog solution:
– each time checkforpp/1 succeeds it should “mark” the PP so that next time it is called it won’t select the same PP again

Solving the Left Adjunction Problem
Idea:
– need to say that we license VP adjunction one PP at a time
Prolog solution:
– each time checkforpp/1 succeeds it should “mark” the PP so that next time it is called it won’t select the same PP again
Implementation:
– checkforpp/2
    % checkforpp(List,NewList) true if List contains a preposition (by) and NewList is the marked List
    checkforpp([by|L],[by2|L]).
    checkforpp([X|L],[X|L2]) :- \+ X = by, checkforpp(L,L2).
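The marking behaviour can be checked directly: the first call rewrites by to by2, and a second call on the marked list fails because no unmarked by remains. The definition below is the one from the slide; only the sample queries are added.

```prolog
% checkforpp(List, NewList): true if List contains the preposition by;
% NewList is List with the first by replaced by the marker by2.
checkforpp([by|L], [by2|L]).
checkforpp([X|L], [X|L2]) :- \+ X = by, checkforpp(L, L2).

% ?- checkforpp([was,eaten,by,john], L).
% L = [was,eaten,by2,john].
% ?- checkforpp([was,eaten,by2,john], L).
% false.
```

The \+ X = by test in the second clause stops the predicate from skipping over a by on backtracking, so each call marks at most one preposition.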

Solving the Left Adjunction Problem
VP adjunction Prolog rule:
    vp(vp(A,B), C, D, E) :- checkforpp(D,G), vp(A, C, G, F), pp(B, F, E).
– must make sure PP rules still manage to pick up the marked preposition
Hence:
    p(p(by)) --> [by].
– must morph into:
    p(p(by)) --> [by2].
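Putting the pieces together, the following is a sketch of the class grammar with the marked-preposition fix applied. One assumption to flag: the slide shows the adjunction clause for a simplified vp, so here it is adapted by hand to the four-argument vp of the class grammar (six arguments once the input/output lists are added). The lexicon is trimmed to the words needed for the sample queries.

```prolog
s(s(NP,VP)) --> np(NP,notrace), vp(VP,_,_,notrace).

np(np(Det,N),notrace) --> det(Det), common_noun(N).
np(np(N),notrace)     --> proper_noun(N).
np(np(trace),trace)   --> [].

proper_noun(john) --> [john].
det(det(the)) --> [the].
det(det(a))   --> [a].
common_noun(n(ball))     --> [ball].
common_noun(n(sandwich)) --> [sandwich].

vp(vp(BE,VP),Form,selectsforvp,notrace) -->
    passive_be(BE,Form), vp(VP,pastparticiple,transitive,trace).
vp(vp(V,NP),Form,transitive,EC) --> transitive(V,Form), np(NP,EC).
vp(vp(V),Form,intransitive,notrace) --> intransitive(V,Form).
% VP adjunction, written as a plain Prolog clause (adapted, see above):
% it is licensed only by a not-yet-marked "by" in the input S0.
vp(vp(VP,PP),Form,transitive,EC,S0,S) :-
    checkforpp(S0,S1),
    vp(VP,Form,transitive,EC,S1,S2),
    pp(PP,S2,S).

passive_be(v(was),past) --> [was].
intransitive(v(ate),past) --> [ate].
transitive(v(ate),past)             --> [ate].
transitive(v(eaten),pastparticiple) --> [eaten].
transitive(v(hit),past)             --> [hit].
transitive(v(hit),pastparticiple)   --> [hit].

pp(pp(P,NP)) --> p(P), np(NP,notrace).
p(p(by)) --> [by2].          % picks up the marked preposition

checkforpp([by|L],[by2|L]).
checkforpp([X|L],[X|L2]) :- \+ X = by, checkforpp(L,L2).

% ?- s(X,[the,sandwich,was,eaten,by,john],[]).
% X = s(np(det(the),n(sandwich)),
%       vp(v(was),vp(vp(v(eaten),np(trace)),pp(p(by),np(john))))) ;
% false.
% ?- s(X,[john,hit],[]).
% false.
```

With this version, asking for more solutions terminates: once by has been marked, a nested attempt at adjunction finds no unmarked preposition and fails instead of recursing.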

Homework due Tuesday 27th

Why can’t computers use English?
from Lecture 1
– a linguist’s view: a list of examples that are hard for computers to do
– a computational linguist’s view (mine): these actually aren’t very hard at all... armed with some DCG technology, we can easily write a grammar that makes the distinctions outlined in the pamphlet
– your homework task: write a grammar for these examples

If computers are so smart, why can't they use simple English?
Consider, for instance, the four letters read; they can be pronounced as either reed or red. How does the machine know in each case which is the correct pronunciation? Suppose it comes across the following sentences:
(1) The girls will read the paper. (reed)
(2) The girls have read the paper. (red)
We might program the machine to pronounce read as reed if it comes right after will, and red if it comes right after have. But then sentences (3) through (5) would cause trouble.
(3) Will the girls read the paper? (reed)
(4) Have any men of good will read the paper? (red)
(5) Have the executors of the will read the paper? (red)
How can we program the machine to make this come out right?

If computers are so smart, why can't they use simple English?
(6) Have the girls who will be on vacation next week read the paper yet? (red)
(7) Please have the girls read the paper. (reed)
(8) Have the girls read the paper? (red)
Sentence (6) contains both have and will before read, and both of them are auxiliary verbs. But will modifies be, and have modifies read. In order to match up the verbs with their auxiliaries, the machine needs to know that the girls who will be on vacation next week is a separate phrase inside the sentence. In sentence (7), have is not an auxiliary verb at all, but a main verb that means something like 'cause' or 'bring about'. To get the pronunciation right, the machine would have to be able to recognize the difference between a command like (7) and the very similar question in (8), which requires the pronunciation red.

Homework Requirements
This is what you need to submit
Part 1
– write down (in English) the grammatical constraints you are going to use to make the distinctions in examples (1) – (8).
– e.g. what you are assuming about things like auxiliary/verb fronting and the constraints from perfective have
Part 2
– implement your constraints in the framework of a Definite Clause Grammar (DCG) that returns parse trees.
– submit both your grammar and the runs.
– to make the distinction between the forms of the verb read readily apparent in your parse trees, use something like:
    v(v(red),pastparticiple) --> [read].
    v(v(read),root) --> [read].
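As a starting point only, here is a sketch of how the two v//2 entries above could be put to use for sentences (1) and (2). Everything apart from those two entries is a hypothetical illustration (aux//2, the flat vp//1 tree shape, the tiny lexicon), and it deliberately does not handle fronting, relative clauses, or imperatives, which are the substance of the homework.

```prolog
% Hypothetical fragment for sentences (1) and (2) only.
s(s(NP,VP)) --> np(NP), vp(VP).

np(np(det(the),n(N))) --> [the], n(N).
n(girls) --> [girls].
n(paper) --> [paper].

% aux(Tree, Form): Form is the verb form the auxiliary selects for.
aux(aux(will), root)           --> [will].
aux(aux(have), pastparticiple) --> [have].

% The auxiliary and the verb must agree on Form.
vp(vp(Aux,V,NP)) --> aux(Aux,Form), v(V,Form), np(NP).

% Verb entries from the slide: the tree shows the pronunciation.
v(v(red),  pastparticiple) --> [read].
v(v(read), root)           --> [read].

% ?- phrase(s(T), [the,girls,will,read,the,paper]).
% T = s(np(det(the),n(girls)),
%       vp(aux(will),v(read),np(det(the),n(paper)))).
% ?- phrase(s(T), [the,girls,have,read,the,paper]).
% T = s(np(det(the),n(girls)),
%       vp(aux(have),v(red),np(det(the),n(paper)))).
```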

Homework Requirements
Note: the question mark is crucial in the following example
(5) Have the executors of the will read the paper? (red)
Note:
– you can either treat ? as an input word or have the parser return two possible parses (without ?)

538 class presentations
Your chance to get up and explain ideas in computational linguistics to the rest of the class
Textbook Chapters:
– from Chapter 11 onwards
– as long as it’s on material we haven’t covered (or will cover) in class
– so, e.g., the basic pumping lemma wouldn’t be acceptable
Remaining topics:
– parsing techniques (left-corner, chart, tabular)
– WordNet (ontologies, semantic networks)

Chapters
II: Syntax
– 11 Features and Unification
– 12 Lexicalized and Probabilistic Parsing
– 13 Language and Complexity
III: Semantics
– 14 Representing Meaning
– 15 Semantic Analysis
– 16 Lexical Semantics
– 17 Word Sense Disambiguation and Information Retrieval
IV: Pragmatics
– 18 Discourse
– 19 Dialog and Conversational Agents
– 20 Natural Language Generation
V: Multilingual Processing
– 21 Machine Translation

538 class presentations
Pick one chapter
– pick topic(s) within the chapter
– send me your choice (first-come, first-served)
– (same chapter, different topics possible)
– 10 minute presentation with slides
– (PowerPoint, PDF acceptable)
– explain and evaluate the central idea/technique/algorithm/trade-offs behind the topic you’ve chosen
– you’ll be graded on clarity of presentation and how well you explain or communicate the topic(s)