LING/C SC/PSYC 438/538 Lecture 23 Sandiway Fong

Today's Topics
- Natural language parsing: syntactic analysis
- Homeworks 11 and 12

Natural Language Parsing
- Syntax trees are a big deal in NLP
- Reminder, reading homework: JM chapter 5, sections 1 and 2; chapter 12
- Stanford Parser / Berkeley Parser (context-free grammars: Type-2)
  http://nlp.stanford.edu:8080/parser/index.jsp
  http://tomato.banatao.berkeley.edu:8080/parser/parser.html
  - Both use probabilistic rules learned from a Treebank corpus
  - Output: syntax tree diagrams (Stanford also produces a dependency graph)
- We do a lot with Treebanks in the follow-on course to this one (LING 581, Spring)

Natural Language Parsing
- A new generation of "deep learning" parsers (last two years):
  - Google Cloud Natural Language (aka SyntaxNet): https://cloud.google.com/natural-language/
  - UDPipe
- Output: dependency parses (only)

Training Data
- Penn Treebank: parsed by human annotators
- Example (WSJ section): Efforts by the Hong Kong Futures Exchange to introduce a new interest-rate futures contract continue to hit snags despite the support the proposed instrument enjoys in the colony’s financial community.

Natural Language Parsing

Natural Language Parsing
- Comparison between the human parse and the machine parse: empty categories are not recovered by the parser; otherwise a good match!

Natural Language Parsing

Part of Speech (POS)
- JM Chapter 5: parts of speech
- The classic eight parts of speech (e.g. englishclub.com) trace back to Latin scholars, and further back to the ancient Greeks (Thrax)
- Not everyone agrees on what they are. The textbook lists:
  - 4 open classes (nouns, verbs, adjectives, adverbs)
  - 7 closed classes (prepositions, determiners, pronouns, conjunctions, auxiliary verbs, particles, numerals)
- Nor does everyone agree on what the subclasses are, e.g. what counts as a Proper Noun? (Saturday? April?) Textbook answer below …

Part of Speech (POS)
- In computational linguistics, the Penn Treebank tagset is the most commonly used tagset (reprinted inside the front cover of your textbook): 45 tags listed in the textbook (36 POS tags + 9 punctuation tags)
- Getting POS information about a word:
  - dictionary
  - pronunciation: e.g. are you conTENT with the CONtent of the slide?
  - possible n-gram sequences: e.g. *pronoun << common noun, but the << common noun
  - structure of the sentence/phrase (syntax)
  - possible inflectional endings: e.g. V -s/-ed/-en/-ing, N -s
- Task: POS tagging
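
As a rough illustration of the n-gram evidence, here is a toy Prolog sketch (not from the lecture): a tiny hand-made lexicon plus a bigram filter that licenses dt << nn but not prp << nn. The lexicon and the ok_bigram/2 facts are invented for the example.

  % Toy lexicon (illustrative assumption)
  tag(the, dt).
  tag(content, nn).
  tag(we, prp).
  tag(eat, vbp).

  % Licensed tag bigrams (illustrative assumption)
  ok_bigram(dt, nn).     % the << content
  ok_bigram(prp, vbp).   % we << eat
  ok_bigram(vbp, nn).    % eat << content

  % tags(+Words, -Tags): succeeds only if every adjacent tag pair is licensed
  tags([W], [T]) :- tag(W, T).
  tags([W1,W2|Ws], [T1,T2|Ts]) :-
      tag(W1, T1),
      tags([W2|Ws], [T2|Ts]),
      ok_bigram(T1, T2).

For example, tags([the,content], Ts) yields Ts = [dt,nn], while tags([we,content], Ts) fails, mirroring the *pronoun << common noun pattern above.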

Part of Speech (POS)
- Penn tagset table: http://faculty.washington.edu/dillon/GramResources/penntable.html
- NNP, NNPS (proper noun tags)

Part of Speech (POS)
- PRP, PRP$ (personal and possessive pronoun tags)

Part of Speech (POS)

Part of Speech (POS)
- Stanford parser: walk can be a noun or a verb
- Disambiguation: syntax; bigram sequences: *PRP << NN, but DT << NN

Part of Speech (POS)
- Word sense disambiguation (WSD) is more than POS tagging: consider the different senses of the noun bank

Syntax
- Words combine recursively with one another into phrases (aka constituents)
- Usually when two words combine, one word heads, i.e. projects, the phrase:
  e.g. [VB/VBP eat] [NN chocolate]
  e.g. [VB/VBP eat] [DT some] [NN chocolate]
  (the verb projects; chocolate is its object)
- Warning: terminology and parses in computational linguistics are not necessarily the same as those used in theoretical linguistics
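
To make head projection concrete, here is a minimal DCG sketch (an illustration, not the lecture's grammar) in which the verb eat heads, i.e. projects, the VP; the rule shapes and tree labels are assumptions modeled on the bracketings above.

  % The verb projects a VP over its object
  vp(vp(V,NP))  --> vb(V), np(NP).      % eat chocolate / eat some chocolate
  np(np(N))     --> nn(N).              % chocolate
  np(np(D,N))   --> dt(D), nn(N).       % some chocolate

  vb(vb(eat))       --> [eat].
  dt(dt(some))      --> [some].
  nn(nn(chocolate)) --> [chocolate].

For example, vp(T, [eat,some,chocolate], []) binds T = vp(vb(eat), np(dt(some), nn(chocolate))).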

Syntax
- Words combine recursively with one another into phrases (aka constituents)
  e.g. [PRP we] [VB/VBP eat] [NN chocolate] (we is the subject)
  e.g. [TO to] [VB/VBP eat] [NN chocolate]

Syntax
- Words combine recursively with one another into phrases (aka constituents)
  e.g. [NNP John] [VBD noticed] [IN/DT/WDT that] [PRP we] [VB/VBP eat] [NN chocolate]
- noticed selects/subcategorizes for a CP; that is tagged IN (the preposition tag) but functions here as a complementizer (C); both the verb and the complementizer project
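
One way to picture the selection facts is a small DCG sketch (an assumption, not the slide's tree): noticed subcategorizes for an SBAR whose head is the complementizer that; s//1 and the rest of the lexicon are assumed to be defined elsewhere.

  vp(vp(V,SBAR))  --> vbd(V), sbar(SBAR).   % noticed [that we eat chocolate]
  sbar(sbar(C,S)) --> in(C), s(S).          % that [we eat chocolate]

  vbd(vbd(noticed)) --> [noticed].
  in(in(that))      --> [that].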

Syntax
- Words combine recursively with one another into phrases (aka constituents)
- How about an SBAR node?
- PRO (null subject); cf. John wanted me to eat chocolate

Syntax
- Words combine recursively with one another into phrases (aka constituents)
  John noticed that we eat chocolate
  John noticed we eat chocolate

Homework 11
- Question 1: write a Prolog CFG for the following sentences:
  John ate (sensibly) (intransitive eat)
  I fish (intransitive fish)
  I ate fish (transitive eat)
  Bill ate rice
  Harry ate roast beef
- Note: you can use lowercase names (or quotes, e.g. 'John')
- Note: use the Penn Treebank tagset for words (see inside the cover of your textbook, or the Stanford Parser), e.g.
  prp(prp(i)) --> [i].
  nnp(nnp(john)) --> [john].
  vbd(vbd(ate)) --> [ate].
- Your grammar should produce one parse tree per example
- Your grammar should not contain infinite loops
- Use ; (for more answers) to show your code obeys the aforementioned constraints
- Submit your grammar and examples of runs
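
For orientation only, one possible shape for such a grammar is sketched below (a minimal sketch, not the official solution); the rule names and the lexicon are assumptions in the style of the slide's examples, covering transitive and intransitive ate.

  s(s(NP,VP))  --> np(NP), vp(VP).
  np(np(N))    --> nnp(N).
  np(np(N))    --> prp(N).
  np(np(N))    --> nn(N).
  vp(vp(V))    --> vbd(V).             % intransitive: John ate
  vp(vp(V,NP)) --> vbd(V), np(NP).     % transitive: I ate fish

  prp(prp(i))    --> [i].
  nnp(nnp(john)) --> [john].
  nnp(nnp(bill)) --> [bill].
  vbd(vbd(ate))  --> [ate].
  nn(nn(fish))   --> [fish].
  nn(nn(rice))   --> [rice].

A query such as s(T, [i,ate,fish], []) then binds T = s(np(prp(i)), vp(vbd(ate), np(nn(fish)))), and pressing ; shows there is no second parse.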

Homework 11

Homework 11
- Question 2: expand your grammar to handle these sentences:
  I ate fish, and Bill ate rice
  *I ate fish, Bill ate rice
  I ate fish, Bill ate rice, and Harry ate roast beef
- Note: the comma can be a quoted terminal, e.g. [','], as in
  comma(comma(',')) --> [','].
  ','(','(',')) --> [','].
- Note: be careful of left recursion on S (Stanford Parser)
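
One way to keep coordination right-recursive (and so avoid left recursion on S) is sketched below; this is an assumption about a workable rule shape, with sbase//1 standing in for the simple-sentence rule from Question 1.

  s(s(S))      --> sbase(S).
  s(s(S,Rest)) --> sbase(S), conjrest(Rest).

  % one or more ", S" conjuncts, ending in ", and S"
  conjrest(conj(C,CC,S))   --> comma(C), cc(CC), sbase(S).          % , and S
  conjrest(conj(C,S,Rest)) --> comma(C), sbase(S), conjrest(Rest).  % , S ...

  comma(comma(',')) --> [','].
  cc(cc(and))       --> [and].

This accepts I ate fish, and Bill ate rice and the three-clause example, but rejects *I ate fish, Bill ate rice because the final conjunct must be introduced by and.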

Homework 12
- Mandatory for 538; extra credit for 438
- From Ross (1970), English exhibits (forward) gapping:
  I ate fish, Bill rice, and Harry roast beef
  cf. I ate fish, Bill ate rice, and Harry ate roast beef
- Forwards only (cf. Japanese: backwards):
  I ate fish, Bill ate rice, and Harry roast beef
  *I fish, Bill rice, and Harry ate roast beef
  *I fish, Bill ate rice, and Harry ate roast beef
  *I fish, Bill ate rice, and Harry roast beef
- Parallelism requirement:
  *I ate fish, Bill, and Harry roast beef
  *I ate fish, Bill rice, and Harry

Homework 12
- Gapping: I ate fish, Bill rice, and Harry roast beef
- Not gapping (you don't have to handle these two):
  I ate fish, Bill rice, and roast beef
  I ate fish, rice, and Harry roast beef
- Update your Homework 11 grammar to handle gapping
- Hint 1: use an extra argument to represent and spread the elided verb
- Hint 2: you can insert Prolog code into rules, e.g. {nonvar(V)}, {var(V)}, or {A=B}
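
A minimal sketch of the mechanism behind the two hints (an assumption, not the assigned solution): an extra argument V threads the verb from the full first clause into the verbless clauses, and {nonvar(V)} checks that the verb has already been supplied, so a clause cannot be gapped before the verb appears. np//1 and vbd//1 are assumed from the Homework 11 grammar.

  % A full clause binds V; a gapped clause reuses it (Hint 1) and checks
  % with embedded Prolog code that it is already bound (Hint 2).
  clause_full(s(NP,vp(V,Obj)), V)   --> np(NP), vbd(V), np(Obj).
  clause_gapped(s(NP,vp(V,Obj)), V) --> np(NP), {nonvar(V)}, np(Obj).

  s_gap(s(C1,C2,C3)) -->
      clause_full(C1, V), [','],
      clause_gapped(C2, V), [','], [and],
      clause_gapped(C3, V).

With a lexicon covering the words, s_gap parses I ate fish, Bill rice, and Harry roast beef, while a gapped first clause fails at {nonvar(V)}.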

Homework 12

Homework 12

Homeworks 11 and 12
- Homework 11 due next Monday
- Homework 12 due next Wednesday
- Submit two files with each homework:
  - a PDF writeup
  - your .pl file (code)