Natural Language Processing


1 Natural Language Processing
- Machine translation
- Predicate-argument structures
- Syntactic parses
- Lexical semantics
- Probabilistic parsing
- Ambiguities in sentence interpretation

2 Machine Translation
One of the first applications for computers:
- Bilingual dictionary → word-for-word translation
- Good translation requires understanding! (War and Peace, The Sound and the Fury?)
- What can we do? Sublanguages: technical domains with a static vocabulary.
- Meteo in Canada, Caterpillar tractor manuals, botanical descriptions, military messages.

3 Example translation

4 Translation Issues: Korean to English
- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs. morphology

5 Common Thread
Predicate-argument structure: the basic constituents of the sentence and how they are related to each other.
Constituents: John, Mary, the dog, pleasure, the store.
Relations: loves, feeds, go, to, bring.

6 Abstracting away from surface structure

7 Transfer lexicons

8 Machine Translation Lexical Choice: Word Sense Disambiguation
Iraq lost the battle.
  Ilakuka centwey ciessta.
  [Iraq] [battle] [lost]
John lost his computer.
  John-i computer-lul ilepelyessta.
  [John] [computer] [misplaced]

9 Natural Language Processing
Syntax: grammars, parsers, parse trees, dependency structures
Semantics: subcategorization frames, semantic classes, ontologies, formal semantics
Pragmatics: pronouns, reference resolution, discourse models

10 Syntactic Categories
- Nouns, pronouns, proper nouns
- Verbs: intransitive, transitive, ditransitive (subcategorization frames)
- Modifiers: adjectives, adverbs
- Prepositions
- Conjunctions

11 Syntactic Parsing
The cat sat on the mat.     Det Noun Verb Prep Det Noun
Time flies like an arrow.   Noun Verb Prep Det Noun
Fruit flies like a banana.  Noun Noun Verb Det Noun

12 Parses
The cat sat on the mat.
(S (NP (Det the) (N cat))
   (VP (V sat)
       (PP (Prep on) (NP (Det the) (N mat)))))

13 Parses
Time flies like an arrow.
(S (NP (N time))
   (VP (V flies)
       (PP (Prep like) (NP (Det an) (N arrow)))))

14 Parses
Time flies like an arrow.
(S (NP (N time) (N flies))
   (VP (V like)
       (NP (Det an) (N arrow))))
Also: "[you] time flies like an arrow," with "time" as a verb.

15 Recursive transition nets for CFGs
[RTN diagrams: an S network (states S1-S3, arcs NP and VP) and an NP network (states S4-S6, arcs det, adj, noun, pronoun, with a PP loop)]
s :- np, vp.
np :- pronoun ; noun ; det, adj, noun ; np, pp.

16 Lexicon
noun(cat). noun(mat). noun(flies). noun(time). noun(arrow).
det(the). det(a). det(an).
verb(sat). verb(flies). verb(time).
prep(on). prep(like).
Anything missing? Difference-list versions of the entries:
noun([fly|T],T). verb([fly|T],T). verb([sit|T],T).
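The difference-list entries are what make the lexicon usable by a Prolog parser: each entry consumes one word from the front of the input list and returns the remainder. A minimal runnable sketch in that style (the grammar predicates np/2, vp/2, etc. are our naming, not from the slides):

    % Lexicon: each entry consumes one word, returns the rest.
    noun([cat|T], T).    noun([mat|T], T).    noun([flies|T], T).
    noun([time|T], T).   noun([arrow|T], T).
    det([the|T], T).     det([a|T], T).       det([an|T], T).
    verb([sat|T], T).    verb([flies|T], T).  verb([time|T], T).
    prep([on|T], T).     prep([like|T], T).

    % Grammar in the same difference-list style.
    np(S0, S) :- det(S0, S1), noun(S1, S).
    np(S0, S) :- noun(S0, S).
    pp(S0, S) :- prep(S0, S1), np(S1, S).
    vp(S0, S) :- verb(S0, S1), pp(S1, S).
    vp(S0, S) :- verb(S0, S).
    s(S0, S)  :- np(S0, S1), vp(S1, S).

    % ?- s([the,cat,sat,on,the,mat], []).      % true
    % ?- s([time,flies,like,an,arrow], []).    % true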

17 Lexicon with Roots
noun(cat,cat). noun(mat,mat). noun(flies,fly). noun(time,time). noun(arrow,arrow).
det(the,the). det(a,a). det(an,an).
verb(sat,sit). verb(flies,fly). verb(time,time).
prep(on,on). prep(like,like).

18 Parses
The old can can hold the water.
(S (NP (Det the) (Adj old) (N can))
   (VP (Aux can) (V hold)
       (NP (Det the) (N water))))
(Do lexicon on board!)

19 Structural ambiguities
That factory can can tuna.
That factory cans cans of tuna and salmon.
Have the students in cse91 finish the exam in 212. (imperative)
Have the students in cse91 finished the exam in 212? (question)

20 Lexicon
The old can can hold the water.
noun(can,can). noun(cans,can). noun(water,water).
noun(hold,hold). noun(holds,hold).
verb(hold,hold). verb(holds,hold).
aux(can,can).
adj(old,old).
det(the,the).
Anything missing?

21 Simple Context Free Grammar in BNF notation
S  → NP VP
NP → Pronoun | Noun | Det Adj Noun | NP PP
PP → Prep NP
V  → Verb | Aux Verb
VP → V | V NP | V NP NP | V NP PP | VP PP
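This BNF grammar maps almost line-for-line onto Prolog's DCG notation. One caveat (our observation, not on the slide): NP → NP PP and VP → VP PP are left-recursive and would send a top-down Prolog parser into an infinite loop, so this sketch comments them out:

    s  --> np, vp.
    np --> pronoun.
    np --> noun.
    np --> det, adj, noun.
    % np --> np, pp.     % left-recursive: loops under top-down search
    pp --> prep, np.
    v  --> verb.
    v  --> aux, verb.
    vp --> v.
    vp --> v, np.
    vp --> v, np, np.
    vp --> v, np, pp.
    % vp --> vp, pp.     % left-recursive as well

A DCG lexicon for the worked example follows after slide 26.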

22 Top-down parse in progress [The, old, can, can, hold, the, water]
S → NP VP
NP → Pronoun?  Pronoun? fail
NP → Noun?  Noun? fail
NP → Det Adj Noun?  Det? the  Adj? old  Noun? can  Succeed.
VP?

23 Top-down parse in progress [can, hold, the, water]
VP → V?
V → Verb?  Verb? fail
V → Aux Verb?  Aux? can  Verb? hold  succeed
but [the, water] is left unconsumed: fail

24 Top-down parse in progress [can, hold, the, water]
VP → V NP?
V → Verb?  Verb? fail
V → Aux Verb?  Aux? can  Verb? hold
NP → Pronoun?  Pronoun? fail
NP → Noun?  Noun? fail
NP → Det Adj Noun?  Det? the  Adj? fail

25 Lexicon
noun(can,can). noun(cans,can). noun(water,water).
noun(hold,hold). noun(holds,hold).
verb(hold,hold). verb(holds,hold).
aux(can,can).
adj(old,old).
adj( , ).  (the empty adjective)
det(the,the).
Anything missing?

26 Top-down parse in progress [can, hold, the, water]
VP → V NP?
V → Verb?  Verb? fail
V → Aux Verb?  Aux? can  Verb? hold
NP → Pronoun?  Pronoun? fail
NP → Noun?  Noun? fail
NP → Det Adj Noun?  Det? the  Adj? (empty)  Noun? water  SUCCEED
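To run this trace as a program, the DCG sketched after slide 21 just needs the lexicon from these slides in DCG form; adj --> [] plays the role of the empty Adj( , ) entry (these entries are illustrative, not the slides' exact code):

    det  --> [the].
    adj  --> [old].
    adj  --> [].          % the "empty adjective" that lets Det Noun parse
    noun --> [can].  noun --> [water].
    aux  --> [can].
    verb --> [hold].
    pronoun --> [it].     % so every preterminal has at least one entry
    prep --> [on].

    % ?- phrase(s, [the,old,can,can,hold,the,water]).
    % true .

Prolog's backtracking reproduces the trace exactly: VP → V is tried and fails (words left over), then VP → V NP succeeds using the empty adjective.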

27 Lexicon
noun(can,can). noun(cans,can). noun(water,water).
noun(hold,hold). noun(holds,hold). noun(old,old).
verb(hold,hold). verb(holds,hold).
aux(can,can).
adj(old,old). adj( , ).
det(the,the).
Anything missing?

28 Top-down approach
Start with the goal of a sentence:
S → NP VP
S → Wh-word Aux NP VP
Will try to find an NP four different ways before trying a parse where the verb comes first.
What does this remind you of? Search.
What would be better?

29 Bottom-up approach
Start with the words in the sentence.
What structures do they correspond to?
Once a structure is built, it is kept on a CHART.

30 Bottom-up parse in progress
The old can can hold the water.
det  adj   noun      aux        verb  det  noun     (target analysis)
det  noun  aux/verb  noun/verb  noun  det  noun     (ambiguous candidates)

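One way to make the chart idea concrete (a sketch, assuming SWI-Prolog; the predicate names are ours): table the parse relation. Tabling memoizes each (category, span) pair, which is precisely what a chart does, so no structure is rebuilt.

    :- table parse/3.

    % parse(Cat, I, J): category Cat spans word positions I..J.
    parse(Cat, I, J) :- word(I, W), J is I + 1, lex(W, Cat).
    parse(Cat, I, J) :- rule(Cat, A, B), parse(A, I, K), parse(B, K, J).

    % Binary (Chomsky-normal-form flavored) version of the grammar.
    rule(s,  np,  vp).
    rule(np, det, n1).     rule(np, det, noun).
    rule(n1, adj, noun).
    rule(vp, v,   np).     rule(v,  aux, verb).

    lex(the, det).  lex(old, adj).   lex(can, noun).
    lex(can, aux).  lex(hold, verb). lex(water, noun).

    word(0, the).  word(1, old). word(2, can).   word(3, can).
    word(4, hold). word(5, the). word(6, water).

    % ?- parse(s, 0, 7).    % true; each span is computed only once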

33 Top-down vs. Bottom-up
Top-down:
- Helps with POS ambiguities: only considers the relevant POS
- Rebuilds the same structure repeatedly
- Spends a lot of time on impossible parses
Bottom-up:
- Has to consider every POS
- Builds each structure once
- Spends a lot of time on useless structures
What would be better?

34 Hybrid approach
Top-down with a chart:
- Use lookahead and heuristics to pick the most likely sentence type
- Use probabilities for POS tagging, PP attachments, etc.

35 Features
C for Case (subjective/objective): She visited her.
P for Person agreement (1st, 2nd, 3rd): I like him. You like him. He likes him.
N for Number agreement (subject/verb): He likes him. They like him.
G for Gender agreement: in English, reflexive pronouns (He washed himself.); in Romance languages, det/noun.
T for Tense: auxiliaries, sentential complements, etc. (*will finished is bad)

36 Example Lexicon Entries Using Features: Case, Number, Gender, Person
pronoun(subj, sing, fem, third, she, she).
pronoun(obj, sing, fem, third, her, her).
pronoun(obj, Num, Gender, second, you, you).
pronoun(subj, sing, Gender, first, I, I).
noun(Case, plural, Gender, third, flies, fly).
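In Prolog, unification gives this feature checking for free: make the features arguments of the nonterminals, and ill-formed combinations simply fail to unify. A sketch (note lowercase i, since Prolog atoms are lowercase):

    pronoun(subj, sing, fem, third)  --> [she].
    pronoun(obj,  sing, fem, third)  --> [her].
    pronoun(obj,  _,    _,   second) --> [you].
    pronoun(subj, sing, _,   first)  --> [i].

    verb(sing, third)  --> [likes].
    verb(sing, first)  --> [like].
    verb(_,    second) --> [like].
    verb(plur, _)      --> [like].

    % Subject must be subjective case and agree with the verb.
    s --> pronoun(subj, Num, _, Per), verb(Num, Per), pronoun(obj, _, _, _).

    % ?- phrase(s, [she, likes, her]).   % true
    % ?- phrase(s, [i, like, you]).      % true
    % ?- phrase(s, [her, likes, she]).   % false: the case filter rejects it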

37 Language to Logic
How do we get there?
John went to the book store.    ∃John ∃store1: go(John, store1)
John bought a book.             buy(John, book1)
John gave the book to Mary.     give(John, book1, Mary)
Mary put the book on the table. put(Mary, book1, table1)
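One classical answer to "how do we get there" is to compose the logical form during parsing, passing partial meanings through the grammar's arguments. A toy sketch (the coverage and constant names like store1 are illustrative only):

    s(Sem)               --> np(X), vp(X, Sem).
    np(john)             --> [john].
    np(mary)             --> [mary].
    np(book1)            --> [a, book].
    np(book1)            --> [the, book].
    np(store1)           --> [the, book, store].
    np(table1)           --> [the, table].
    vp(X, go(X, Y))      --> [went, to], np(Y).
    vp(X, buy(X, Y))     --> [bought], np(Y).
    vp(X, give(X, Y, Z)) --> [gave], np(Y), [to], np(Z).
    vp(X, put(X, Y, Z))  --> [put], np(Y), [on], np(Z).

    % ?- phrase(s(Sem), [john, went, to, the, book, store]).
    % Sem = go(john, store1).
    % ?- phrase(s(Sem), [john, gave, the, book, to, mary]).
    % Sem = give(john, book1, mary).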

38 Lexical Semantics
Same event, different sentences:
John broke the window with a hammer.
John broke the window with the crack.
The hammer broke the window.
The window broke.

39 Same event - different syntactic frames
John broke the window with a hammer.   SUBJ VERB OBJ MODIFIER
John broke the window with the crack.  SUBJ VERB OBJ MODIFIER
The hammer broke the window.           SUBJ VERB OBJ
The window broke.                      SUBJ VERB

40 Semantics: predicate arguments
break(AGENT, INSTRUMENT, PATIENT)
John broke the window with a hammer.   AGENT PATIENT INSTRUMENT
The hammer broke the window.           INSTRUMENT PATIENT
The window broke.                      PATIENT
Fillmore 68, "The Case for Case"

41
John broke the window with a hammer.
  SUBJ         OBJ          MODIFIER
  AGENT        PATIENT      INSTRUMENT
The hammer broke the window.
  SUBJ         OBJ
  INSTRUMENT   PATIENT
The window broke.
  SUBJ
  PATIENT

42 Constraint Satisfaction
break(Agent: animate, Instrument: tool, Patient: physical-object)
Agent <=> subj
Instrument <=> subj, with-pp
Patient <=> obj, subj
ACL81, ACL85, ACL86, MT90, CUP90, AIJ93

43 Syntax/semantics interaction
Parsers will produce syntactically valid parses for semantically anomalous sentences.
Lexical semantics can be used to rule them out.

44 Constraint Satisfaction
give(Agent: animate, Patient: physical-object, Recipient: animate)
Agent <=> subj
Patient <=> object
Recipient <=> indirect-object, to-pp
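These selectional constraints are directly executable: store the semantic classes as facts and check the fillers before accepting the predicate. A sketch (isa/2 and the underscore spelling physical_object are our conventions, not the slides'):

    isa(john,   animate).   isa(mary,   animate).
    isa(hammer, tool).
    isa(window, physical_object).
    isa(crack,  physical_object).   % a crack is an object, not a tool

    % break(Agent, Instrument, Patient) is well-formed only if the
    % fillers satisfy the class constraints.
    ok_break(Agent, Instrument, Patient) :-
        isa(Agent, animate),
        isa(Instrument, tool),
        isa(Patient, physical_object).

    % ?- ok_break(john, hammer, window).   % true
    % ?- ok_break(john, crack,  window).   % false: rules out
    %                                      % "with the crack" as instrument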

45 Subcategorization Frequencies
The women kept the dogs on the beach.
  Where keep? (keep on beach)   95%    NP XP  81%
  Which dogs? (dogs on beach)    5%    NP     19%
The women discussed the dogs on the beach.
  Where discuss? (discuss on beach)  10%    NP PP  24%
  Which dogs? (dogs on beach)        90%    NP     76%
Ford, Bresnan, Kaplan 82; Jurafsky 98; Roland, Jurafsky 99

46 Reading times: NP-bias (slower reading times at the bold word)
The waiter confirmed the reservation was made yesterday.
The defendant accepted the verdict would be decided soon.

47 Reading times: S-bias (no slower reading times at the bold word)
The waiter insisted the reservation was made yesterday.
The defendant wished the verdict would be decided soon.
Trueswell, Tanenhaus and Kello 93; Trueswell and Kim 98

48 Probabilistic Context Free Grammars
- Adding probabilities
- Lexicalizing the probabilities

49 Simple Context Free Grammar in BNF
S  → NP VP
NP → Pronoun | Noun | Det Adj Noun | NP PP
PP → Prep NP
V  → Verb | Aux Verb
VP → V | V NP | V NP NP | V NP PP | VP PP

50 Simple Probabilistic CFG
S  → NP VP
NP → Pronoun [0.10] | Noun [0.20] | Det Adj Noun [0.50] | NP PP [0.20]
PP → Prep NP [1.00]
V  → Verb [0.20] | Aux Verb [0.20]
VP → V [0.10] | V NP [0.40] | V NP NP [0.10] | V NP PP [0.20] | VP PP [0.20]

51 Simple Probabilistic Lexicalized CFG
S  → NP VP
NP → Pronoun [0.10] | Noun [0.20] | Det Adj Noun [0.50] | NP PP [0.20]
PP → Prep NP [1.00]
V  → Verb [0.20] | Aux Verb [0.20]
VP → V [0.87] {sleep, cry, laugh} | V NP [0.03] | V NP NP [0.00] | V NP PP [0.00] | VP PP [0.10]

52 Simple Probabilistic Lexicalized CFG
VP → V [0.30] | V NP [0.60] {break, split, crack, ...} | V NP NP [0.00] | V NP PP [0.00] | VP PP [0.10]
What about leave? leave1, leave2?
VP → V [0.10] | V NP [0.40] | V NP NP [0.10] | V NP PP [0.20] | VP PP [0.20]
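The probability of a derivation under a PCFG is the product of the probabilities of the rules it uses. A sketch with the slide's numbers (the S → NP VP probability of 1.00 is our assumption; the slide leaves it implicit):

    rule(s,  [np, vp],          1.00).   % assumed
    rule(np, [pronoun],         0.10).
    rule(np, [noun],            0.20).
    rule(np, [det, adj, noun],  0.50).
    rule(np, [np, pp],          0.20).
    rule(vp, [v],               0.10).
    rule(vp, [v, np],           0.40).
    rule(vp, [v, np, np],       0.10).
    rule(vp, [v, np, pp],       0.20).
    rule(vp, [vp, pp],          0.20).

    % Multiply the probabilities of the rules used in a derivation.
    deriv_prob([], 1.0).
    deriv_prob([LHS-RHS | Rest], P) :-
        rule(LHS, RHS, P1),
        deriv_prob(Rest, P2),
        P is P1 * P2.

    % ?- deriv_prob([s-[np,vp], np-[det,adj,noun], vp-[v,np]], P).
    % P = 0.2.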

53 A TreeBanked Sentence
Analysts have been expecting a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.

(S (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               (NP (NP a GM-Jaguar pact)
                   (SBAR (WHNP-1 that)
                         (S (NP-SBJ *T*-1)
                            (VP would
                                (VP give
                                    (NP the U.S. car maker)
                                    (NP (NP an eventual (ADJP 30 %) stake)
                                        (PP-LOC in (NP the British company))))))))))))

This is an example from the Wall Street Journal Treebank II: the sentence, its syntactic structure as a parse tree, and the actual bracketed annotation.

54 The same sentence, PropBanked
(S Arg0 (NP-SBJ Analysts)
   (VP have
       (VP been
           (VP expecting
               Arg1 (NP (NP a GM-Jaguar pact)
                    (SBAR (WHNP-1 that)
                          (S Arg0 (NP-SBJ *T*-1)
                             (VP would
                                 (VP give
                                     Arg2 (NP the U.S. car maker)
                                     Arg1 (NP (NP an eventual (ADJP 30 %) stake)
                                          (PP-LOC in (NP the British company))))))))))))

expect(Analysts, GM-Jaguar pact)
give(GM-Jaguar pact, US car maker, 30% stake)

Here the same annotation carries the argument labels, and the dependency structure gives rise to the predicates above. Notice that the trace *T*-1 links back to the GM-Jaguar pact. Notice also that the sentence could just as easily have said "a GM-Jaguar pact that would give an eventual ... stake to the US car maker," in which case it would be Arg0: a GM-Jaguar pact that would give Arg1: an eventual ... stake to Arg2: the US car maker. This works in exactly the same way for Chinese and Korean as it does for English (and presumably will for Arabic as well).

55 Headlines
Police Begin Campaign To Run Down Jaywalkers
Iraqi Head Seeks Arms
Teacher Strikes Idle Kids
Miners Refuse To Work After Death
Juvenile Court To Try Shooting Defendant

56 Events
From the KRR lecture.

57 Context Sensitivity
Programming languages are context free.
Natural languages are context sensitive?
- Movement
- Features
- "Respectively": John, Mary and Bill ate peaches, pears and apples, respectively.

58 The Chomsky Grammar Hierarchy
Regular grammars (aabbbb):                S → aS | bS | nil
Context-free grammars (aaabbb):           S → aSb | nil
Context-sensitive grammars (aaabbbccc):   rules of the form xSy → xby, rewriting S only in the context x...y
Turing machines
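The two lower levels are easy to demonstrate in Prolog. A plain DCG generates the context-free a^n b^n; adding an argument that counts n (power beyond context-free rules) handles a^n b^n c^n, which is not context-free:

    % Context-free: a^n b^n.
    ab --> [].
    ab --> [a], ab, [b].

    % a^n b^n c^n via a shared count argument (s(...) is successor
    % notation: 0, s(0), s(s(0)), ...).
    anbncn   --> as(N), bs(N), cs(N).
    as(0)    --> [].
    as(s(N)) --> [a], as(N).
    bs(0)    --> [].
    bs(s(N)) --> [b], bs(N).
    cs(0)    --> [].
    cs(s(N)) --> [c], cs(N).

    % ?- phrase(ab, [a,a,a,b,b,b]).       % true
    % ?- phrase(anbncn, [a,a,b,b,c,c]).   % true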

59 Recursive transition nets for CFGs
[RTN diagrams repeated from slide 15: the S network (states S1-S3, arcs NP and VP) and the NP network (states S4-S6, arcs det, adj, noun, pronoun, with a PP loop)]
s :- np, vp.
np :- pronoun ; noun ; det, adj, noun ; np, pp.

60 Most parsers are Turing Machines
- To give a more natural and comprehensible treatment of movement
- For a more efficient treatment of features
- Not because of "respectively": most parsers can't handle it

61 Nested Dependencies and Crossing Dependencies
CF: The dog chased the cat that bit the mouse that ran.
CF: The mouse the cat the dog chased bit ran.
CS: John, Mary and Bill ate peaches, pears and apples, respectively.

62 Movement
John gave cookies to Mary.
John gave <what> to Mary.
What did John give to Mary?
*Where did John give to Mary?

63 Handling Movement: Hold registers/Slash Categories
S :- Wh, S/NP
S/NP :- VP
S/NP :- NP, VP/NP
VP/NP :- Verb
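In a DCG the slash category becomes an argument that records whether an NP gap must appear. A much-simplified sketch of the rules above (the handling of "did" and the tiny lexicon are our assumptions):

    q --> [what], [did], np, vp(gap).   % S :- Wh, S/NP
    s --> np, vp(nogap).

    vp(nogap) --> v, np, pp.            % John gave cookies to Mary
    vp(gap)   --> v, pp.                % the object NP is the gap

    v  --> [give].  v  --> [gave].
    np --> [john].  np --> [mary].  np --> [cookies].
    pp --> [to], np.

    % ?- phrase(q, [what, did, john, give, to, mary]).   % true
    % ?- phrase(s, [john, gave, cookies, to, mary]).     % true

Since there is no rule introducing a locative gap, the grammar mirrors the contrast with the starred *"Where did John give to Mary?"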

