NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Dialogue & Narrative Structures: Advanced Research Seminar.

Slides:



Advertisements
Similar presentations
An Ontology Creation Methodology: A Phased Approach
Advertisements

School of something FACULTY OF OTHER School of Computing FACULTY OF ENGINEERING Chunking: Shallow Parsing Eric Atwell, Language Research Group.
CS460/IT632 Natural Language Processing/Language Technology for the Web Lecture 2 (06/01/06) Prof. Pushpak Bhattacharyya IIT Bombay Part of Speech (PoS)
A Machine Learning Approach to Coreference Resolution of Noun Phrases By W.M.Soon, H.T.Ng, D.C.Y.Lim Presented by Iman Sen.
LING 388: Language and Computers Sandiway Fong Lecture 2.
Statistical NLP: Lecture 3
Semantic Role Labeling Abdul-Lateef Yussiff
A Framework for Automated Corpus Generation for Semantic Sentiment Analysis Amna Asmi and Tanko Ishaya, Member, IAENG Proceedings of the World Congress.
Recognizing Implicit Discourse Relations in the Penn Discourse Treebank Ziheng Lin, Min-Yen Kan, and Hwee Tou Ng Department of Computer Science National.
LING 581: Advanced Computational Linguistics Lecture Notes January 19th.
An Introduction to Machine Learning and Natural Language Processing Tools Vivek Srikumar, Mark Sammons (Some slides from Nick Rizzolo)
1 Words and the Lexicon September 10th 2009 Lecture #3.
ANLE1 CC 437: Advanced Natural Language Engineering ASSIGNMENT 2: Implementing a query expansion component for a Web Search Engine.
1 SIMS 290-2: Applied Natural Language Processing Marti Hearst Sept 20, 2004.
Using Information Extraction for Question Answering Done by Rani Qumsiyeh.
An Introduction to Information Retrieval and Question Answering Jimmy Lin College of Information Studies University of Maryland Wednesday, December 8,
1 CSC 594 Topics in AI – Applied Natural Language Processing Fall 2009/2010 Overview of NLP tasks (text pre-processing)
Named Entity Recognition and the Stanford NER Software Jenny Rose Finkel Stanford University March 9, 2007.
Siemens Big Data Analysis GROUP 3: MARIO MASSAD, MATTHEW TOSCHI, TYLER TRUONG.
Course G Web Search Engines 3/9/2011 Wei Xu
THE PARTS OF SPEECH. PART OF SPEECH  All words serve a particular function in a sentence.  A word’s function is determined by what “part of speech”
LING/C SC/PSYC 438/538 Lecture 27 Sandiway Fong. Administrivia 2 nd Reminder – 538 Presentations – Send me your choices if you haven’t already.
Empirical Methods in Information Extraction Claire Cardie Appeared in AI Magazine, 18:4, Summarized by Seong-Bae Park.
A Survey of NLP Toolkits Jing Jiang Mar 8, /08/20072 Outline WordNet Statistics-based phrases POS taggers Parsers Chunkers (syntax-based phrases)
Dr. Monira Al-Mohizea MORPHOLOGY & SYNTAX WEEK 11.
Parts of Speech Review. A noun is “ a word that names a person, a place, a thing, an idea, a quality, or a characteristic” (Writer’s Choice: 818). A noun.
SYNTAX Lecture -1 SMRITI SINGH.
CS : Language Technology for the Web/Natural Language Processing Pushpak Bhattacharyya CSE Dept., IIT Bombay Constituent Parsing and Algorithms (with.
Methods for the Automatic Construction of Topic Maps Eric Freese, Senior Consultant ISOGEN International.
Day 14 Information Retrieval Question Answering 1.
NATURAL LANGUAGE UNDERSTANDING FOR SOFT INFORMATION FUSION Stuart C. Shapiro and Daniel R. Schlegel Department of Computer Science and Engineering Center.
Semiautomatic domain model building from text-data Petr Šaloun Petr Klimánek Zdenek Velart Petr Šaloun Petr Klimánek Zdenek Velart SMAP 2011, Vigo, Spain,
GrammaticalHierarchy in Information Flow Translation Grammatical Hierarchy in Information Flow Translation CAO Zhixi School of Foreign Studies, Lingnan.
©2003 Paula Matuszek Taken primarily from a presentation by Lin Lin. CSC 9010: Text Mining Applications.
Conversion of Penn Treebank Data to Text. Penn TreeBank Project “A Bank of Linguistic Trees” (as of 11/1992) University of Pennsylvania, LINC Laboratory.
CSA2050 Introduction to Computational Linguistics Parsing I.
Auckland 2012Kilgarriff: NLP and Corpus Processing1 The contribution of NLP: corpus processing.
CHAPTER 12 Adjective Clauses Part One Mr. Hani M. Al-Tahrawi Eng-112 December, 2012
MedKAT Medical Knowledge Analysis Tool December 2009.
CS : Speech, NLP and the Web/Topics in AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture-14: Probabilistic parsing; sequence labeling, PCFG.
Introduction to Syntactic Parsing Roxana Girju November 18, 2004 Some slides were provided by Michael Collins (MIT) and Dan Moldovan (UT Dallas)
NLP. Introduction to NLP Background –From the early ‘90s –Developed at the University of Pennsylvania –(Marcus, Santorini, and Marcinkiewicz 1993) Size.
For Friday Finish chapter 23 Homework –Chapter 23, exercise 15.
Part-of-speech tagging
Sight Words.
CPSC 422, Lecture 27Slide 1 Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 27 Nov, 16, 2015.
Summarizing Getting to the Point.
LING/C SC/PSYC 438/538 Lecture 18 Sandiway Fong. Adminstrivia Homework 7 out today – due Saturday by midnight.
Word classes and part of speech tagging. Slide 1 Outline Why part of speech tagging? Word classes Tag sets and problem definition Automatic approaches.
SUMMARIZING GETTING TO THE POINT. SUMMARY  A short account of the central ideas of a text  Summaries are NOT a place for… Opinions Background knowledge.
NATURAL LANGUAGE PROCESSING
NLP. Introduction to NLP #include int main() { int n, reverse = 0; printf("Enter a number to reverse\n"); scanf("%d",&n); while (n != 0) { reverse =
LING/C SC 581: Advanced Computational Linguistics Lecture Notes Feb 3 rd.
LING/C SC 581: Advanced Computational Linguistics Lecture Notes Feb 17 th.
Lecture 9: Part of Speech
Statistical NLP: Lecture 3
LING/C SC/PSYC 438/538 Lecture 20 Sandiway Fong.
Dept. of Computer Science University of Liverpool
LING/C SC 581: Advanced Computational Linguistics
Improving an Open Source Question Answering System
Stanford CoreNLP
BBI 3212 ENGLISH SYNTAX AND MORPHOLOGY
Chunk Parsing CS1573: AI Application Development, Spring 2003
English parts of speech
Extracting Recipes from Chemical Academic Papers
Natural Language Processing
CS224N Section 3: Corpora, etc.
Summarizing Getting to the Point.
CS224N Section 3: Project,Corpora
Progress report on Semantic Role Labeling
Presentation transcript:

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Dialogue & Narrative Structures: Advanced Research Seminar in NLP and Narrative Prof. Marilyn Walker Baskin School of Engineering University of California, Santa Cruz

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Announcements.  Reid is in charge this week (I am away giving a talk).  Today: class session on Stanford Toolkit. Will help with everything that follows that uses it (e.g. Reids thesis work, Riloff plot structures, Narrative Schema etc).  Homework 3: homework to familiarize yourself with Stanford toolkit. Due next Tuesday.  May be useful to look ahead and Riloff paper to see how she uses parse structures when you are playing around with the Stanford parser.  Will link to Aesop Fables corpus so we can play with it.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Syllabus & Course Structure  3/01/pages/computational-models 3/01/pages/computational-models  Is everyone signed up on Piazza now?

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Homework 2: Instruction sheet from Thorne

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Sample story utterance from Thorne

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ HW2: Who wants to talk about L&W on story blog?  Someone who hasn’t talked in front of class yet? Someone who hasn’t talked in front of class yet?  day-10-surfing-lessons.html  pretty-awesome.html pretty-awesome.html  hee-deputized.html hee-deputized.html  living-riding-horses.html living-riding-horses.html

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Stanford Toolkit: CoreNLP

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ A Pipeline of useful tools.  nlp.stanford.edu/software/corenlp.shtml nlp.stanford.edu/software/corenlp.shtml

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Tokenize, Clean, sentence split, POS, lemma

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Named entities, Parsing, Coreference

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Part–Of–Speech (POS) Tagger

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford POS Tagger  A POS Tagger is a piece of software that reads text in some language and assigns parts of speech to each word.  such as noun, verb, adjective, etc.  This software is a Java implementation of the log-linear part- of-speech tagger. The system requires Java 1.6+ to be installed.  The following examples are base on 64 bit Windows OS. POS Tagger input plain text XML format

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: Plain Text  Input: But then, slowly all nine planets of our Solar System move into frame and align. The last of them is the giant, burning sphere of the sun. Just as the sun enters frame, a solar storm of gigantic proportion unfolds. The eruptions shoot thousands of miles into the blackness of space.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: Plain Text  Command: java -mx300m -classpath stanford-postagger.jar edu.stanford.nlp.tagger.maxent.MaxentTagger -model models/bidirectional-wsj-0-18.tagger -textFile input.txt > output.txt Input file name Output file name More info:

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: Plain Text  POS Tagger output is a text file: But_CC then_RB,_, slowly_RB all_DT nine_CD planets_NNS of_IN our_PRP$ Solar_NNP System_NNP move_NN into_IN frame_NN and_CC align_NN._.The_DT last_JJ of_IN them_PRP is_VBZ the_DT giant_NN,_, burning_NN sphere_NN of_IN the_DT sun_NN._.Just_RB as_IN the_DT sun_NN enters_VBZ frame_NN,_, a_DT solar_JJ storm_NN of_IN gigantic_JJ proportion_NN unfolds_VBZ._.The_DT eruptions_NNS shoot_VBP thousands_NNS of_IN miles_NNS into_IN the_DT blackness_NN of_IN space_NN._.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: XML  Input: Last week, I rediscovered my yoga asana practice. After nearly five months with only a few attempts at practice I signed up for a month of classes at a studio near my apartment that offers Ashtanga, Iyengar and Hatha classes. Even a light Hatha class was a challenge after so long without practice. Coming out of our relaxation the end of class left me lingering on my mat for a few minutes, consumed by joy, sadness, and frustration. Last week, I rediscovered my yoga asana practice. …

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: XML  Command: java -mx300m -classpath stanford-postagger.jar edu.stanford.nlp.tagger.maxent.MaxentTagger -model models/bidirectional-wsj-0-18.tagger -xmlInput docText -textFile input.xml > output.xml XML tags whose content we want the POS tagger to tag

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input: XML  Output is a XML file: Last_JJ week_NN,_, I_PRP rediscovered_VBD my_PRP$ yoga_NN asana_NN practice_NN._. After_IN nearly_RB five_CD months_NNS with_IN only_RB a_DT few_JJ attempts_NNS at_IN practice_NN I_PRP signed_VBD up_RP for_IN a_DT month_NN of_IN classes_NNS at_IN a_DT studio_NN near_IN my_PRP$ apartment_NN that_WDT offers_VBZ Ashtanga_NNP,_, Iyengar_NNP and_CC Hatha_NNP classes_NNS._. Even_RB a_DT light_JJ Hatha_NNP class_NN was_VBD a_DT challenge_NN after_IN so_RB long_RB without_IN practice_NN._. Coming_VBG out_IN of_IN our_PRP$ relaxation_NN the_DT end_NN of_IN class_NN left_VBD me_PRP lingering_VBG on_IN my_PRP$ mat_NN for_IN a_DT few_JJ minutes_NNS,_, consumed_VBN by_IN joy_NN,_, sadness_NN,_, and_CC frustration_NN._. Last week, I rediscovered my yoga asana practice. … POS tagged unchanged

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ The English taggers use the Penn Treebank tag set: Penn Treebank Tag set: Verbs start with “VB” Adjectives start with “JJ” Nouns start with “NN”

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Named Entity Tagger

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Named Entity Recognizer (NER) Named Entity Recognition (NER) labels sequences of words in a text which are the names of things, such as person and company names. Recognizes named (PERSON, LOCATION, ORGANIZATION, MISC) and numerical entities (DATE, TIME, MONEY, NUMBER).

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Named Entity Recognizer (NER) Germany’s representative to the European Union’s veterinary committee Werner Zwingman said on Wednesday consumers should take action. From Stanford CoreNLP online demo

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Named Entity Recognizer (NER) Jim's income is dollars a year, which is 80 percent of his family income. From Stanford CoreNLP online demo

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Named Entity Recognizer (NER) Why NER? – Question Answering – Textual Entailment – Coreference Resolution – Computational Semantics – …

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Dependencies The Stanford dependencies provide a representation of grammatical relations between words in a sentence. Simple descriptions, easily understood. Triples of a relation between pairs of words, such as “the subject of hit is Jim.” Reference: Stanford typed dependencies manual

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Dependencies Input: Bell, based in Los Angeles, makes and distributes electronic, computer and building products.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Dependencies Input: Bell, based in Los Angeles, makes and distributes electronic, computer and building products. DependencyDefinitionDescription nsubj nominal subject A nominal subject is a noun phrase which is the syntactic subject of a clause. partmod participial modifier A participial modifier of an NP or VP or sentence is a participial verb form that serves to modify the meaning of a noun phrase or sentence. nn noun compound modifier A noun compound modifier of an NP is any noun that serves to modify the head noun. prep_in prepositional modifier A prepositional modifier of a verb, adjective, or noun is any prepositional phrase that serves to modify the meaning of the verb, adjective, noun, or even another prepositon. root The root grammatical relation points to the root of the sentence. conj_andconjunct A conjunct is the relation between two elements connected by a coordinating conjunction, such as “and”, “or”, etc. amod adjectival modifier An adjectival modifier of an NP is any adjectival phrase that serves to modify the meaning of the NP. dobj indirect object The indirect object of a VP is the noun phrase which is the (dative) object of the verb.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Example of how I used NER and Dependencies: 1. Annotate film scene files (tokenize, pos, lemma, ner, parse, dcoref) 2. Pick all the verbs (POS tags beginning with “VB”) 3. Figure out the subject and object of a verb (typed dependencies “nsubj”, “agent”, “dobj”, “iobj”, “nsubjpass”) 4. Generalize the subject and object (use NER results if they have, otherwise use lemma) 5. Integrate all the subjects and objects across all film scenes In one film scene file: “Jack unlocks the door.” person UNLOCK door In other film scene files: He unlocked the door. (person UNLOCK door) The door was unlocked by a strange force. (force UNLOCK door) Integrated: {person, force} UNLOCK (door}

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Parsing the Stanford Parser

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Text input Stanford Parser Constituency based parse tree Apply syntactic templates Extracted patterns

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Input  Plain text: At the time of the Constitution there weren't exactly vast suburbs that could be prowled by thieves looking for an open window.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Constituency-based parse tree  Difference from dependency based tree  Parser output: (ROOT (S (PP (IN At) (NP (NP (DT the) (NN time)) (PP (IN of) (NP (DT the) (NNP Constitution))))) (NP (EX there)) (VP (VBD were) (RB n't) (ADVP (RB exactly)) (NP (NP (JJ vast) (NNS suburbs)) (SBAR (WHNP (WDT that)) (S (VP (MD could) (VP (VB be) (VP (VBN prowled) (PP (IN by) (NP (NP (NNS thieves)) (VP (VBG looking) (PP (IN for) (NP (DT an) (JJ open) (NN window)))))))))))))

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Stanford Parser Code // initialize the pipeline Properties props = new Properties(); props.put("annotators", "tokenize, ssplit, pos, lemma, parse"); StanfordCoreNLP pipeline = new StanfordCoreNLP(props); // annotate document with given annotations Annotation document = new Annotation(text); pipeline.annotate(document); // these are all the sentences in this document List sentences = document.get(SentencesAnnotation.class); String parseTree = ""; for(CoreMap sentence: sentences) { // this is the Stanford dependency graph of the current sentence // e.g. (ROOT (S (PP (IN At)... parseTree += sentence.get(TreeAnnotation.class).toString(); }

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ (ROOT (S (PP (IN At) (NP (NP (DT the) (NN time)) (PP (IN of) (NP (DT the) (NNP Constitution))))) (NP (EX there)) (VP (VBD were) (RB n't) (ADVP (RB exactly)) (NP (NP (JJ vast) (NNS suburbs)) (SBAR (WHNP (WDT that)) (S (VP (MD could) (VP (VB be) (VP (VBN prowled) (PP IN by) (NP (NP (NNS thieves)) (VP (VBG looking) (PP (IN for) (NP (DT an) (JJ open) (NN window))))))))))))) Root S PP INNP AtNP DTNN PP thetime NP INNP ofDETNNP the Constitution thetimeofthetime the ofthetime Constitution the ofthetime

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ NP EX thereVBD were RB ADVP VP SBAR NP exactly S n’t JJ NP NNS suburbsvast (ROOT (S (PP (IN At) (NP (NP (DT the) (NN time)) (PP (IN of) (NP (DT the) (NNP Constitution))))) (NP (EX there)) (VP (VBD were) (RB n't) (ADVP (RB exactly)) (NP (NP (JJ vast) (NNS suburbs)) (SBAR (WHNP (WDT that)) (S (VP (MD could) (VP (VB be) (VP (VBN prowled) (PP IN by) (NP (NP (NNS thieves)) (VP (VBG looking) (PP (IN for) (NP (DT an) (JJ open) (NN window)))))))))))))

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN (ROOT (S (PP (IN At) (NP (NP (DT the) (NN time)) (PP (IN of) (NP (DT the) (NNP Constitution))))) (NP (EX there)) (VP (VBD were) (RB n't) (ADVP (RB exactly)) (NP (NP (JJ vast) (NNS suburbs)) (SBAR (WHNP (WDT that)) (S (VP (MD could) (VP (VB be) (VP (VBN prowled) (PP IN by) (NP (NP (NNS thieves)) (VP (VBG looking) (PP (IN for) (NP (DT an) (JJ open) (NN window)))))))))))))

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Syntactic Templates

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ S VP SBAR PPVBN IN VP WHNP MD VP NP VB WDT that would be prowled byNP NNS thieves VBGPP lookingINNP for anopenwindow DTJJNN active-verb prep Rules: 1.VP with VB_ and PP children 2.PP with IN and NP children  prowled by

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ (ROOT (S (VP (VB exert) (NP (PRP$ their) (JJ utmost) (NN skill))))) Root S VP VBNP exertPRPJJNN skilltheirutmost active-verb dobj Rules: 1.Node with NP and VP children 2.VP with VB and NP children  exert their utmost skill NP

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Coreference Resolution Larissa Munishkina UCSC 2013

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ What is reference? Words are used to represent things and experiences in the real or imagined world. Different words can be used to describe the same thing or experience. Referent is the concrete object or concept that is designated by a word or expression. A referent is an object, action, state, relationship, or attribute in the referential realm.referential realm Reference: 1.is the symbolic relationship that a linguistic expression has with the concrete object or abstraction it represents. 2.is the relationship of one linguistic expression to another, in which one provides the information necessary to interpret the other.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ What is coreference resolution? Coreference is the reference in one expression to the same referent in another expression.referent Example: in the following sentence, both you's have the same referent: You said you would come. Coreference resolution is a task of finding all references to the same entity. Coreference resolution includes:  Pronominal anaphora resolution  Nominal coreference resolution

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Example of Coreferences  [Douglas Quail] 1 and [his wife Kirsten] 2, are asleep in bed.  Gradually the room lights brighten.  The clock chimes and begins speaking in a soft, feminine voice.  [They] 1,2 don't budge.  Shortly, the clock chimes again.  [Quail's wife] 2 stirs.  Maddeningly, the clock chimes a third time.  [Quail] 1 reaches out and shuts the clock off.  Then [he] 1 sits up in bed.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Coreference and Event Chains Coreference Chain: (Douglas Quail ; his ; Quail 's ; he ; He ; his ; He ; his ; He ; a good-looking but conventional man in his early thirties ; his ; He ; his ; him ; ), Event Chain: [(be_asleep,nsubj),(sit,nsubj),(swing,nsubj),(sit,nsubj), (put,nsubj),(sit,nsubj),(lose,nsubj),(be_man,nsubj), (seem,nsubj)]

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ StanfordCoreNLP Tool Web site: Excerpt from the code: StanfordCoreNLP pipeline = new StanfordCoreNLP(props); Annotation document = new Annotation(text); pipeline.annotate(document); Map graph = document.get(CorefChainAnnotation.class); System.out.println(graph);

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ StanfordCoreNLP Example Input Raw Text: Jack and Jill went up the hill to fetch a pail of water. Jack fell down and broke his crown, and Jill came tumbling after. Up he got, and home did trot, as fast as he could caper; to old Dame Dob, who patched his nob with vinegar and brown paper. Output Coreference Chains: 1=CHAIN1-["Jack" in sentence 1, "Jack" in sentence 2, "his" in sentence 2, "he" in sentence 3, "he" in sentence 3, "his" in sentence 3], 2=CHAIN2-["Jill" in sentence 1, "Jill" in sentence 2], 12=CHAIN12-["old Dame Dob, who patched his nob with vinegar and brown paper" in sentence 3, "old Dame Dob" in sentence 3, "Dame Dob" in sentence 3], 15=CHAIN15-["trot, as fast as he could caper ;" in sentence 3]}

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Questions? Thank you!

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Natural Language and Dialogue Systems Lab Lots and Lots of other tools

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Other tools out there  FREEBASE ontology. Currently being used heavily for NLU   Semantic dictionaries of various kinds:  LIWC: Have a version in the lab tags at word level  MPQA, Sentiwordnet: polarity, sentiment  Wordnet  Verbnet  FrameNet  Other more detailed named entity frameworks.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ LIWC. We have used a lot.

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Have a version in NLDS, tags at word level

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Using WordNet: Online Thesaurus.   What is the service ceiling of a U-2?  Can access it FROM a program (not just this interface).

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Using WordNet: Online Thesaurus.  What is the service ceiling of a U-2  Can access it as a program. 

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Wikipedia: What knowledge can we get from Wikipedia?

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Extracting Named Entities Person: Mr. Hubert J. Smith, Adm. McInnes, Grace Chan Title: Chairman, Vice President of Technology, Secretary of State Country: USSR, France, Haiti, Haitian Republic City: New York, Rome, Paris, Birmingham, Seneca Falls Province: Kansas, Yorkshire, Uttar Pradesh Business: GTE Corporation, FreeMarkets Inc., Acme University: Bryn Mawr College, University of Iowa Organization: Red Cross, Boys and Girls Club

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ Answer Type Hierarchy: Sheffield TREC group This is where technology was when IBM started their project

NATURAL LANGUAGE AND DIALOGUE SYSTEMS LABUC SANTA CRUZ An Example But many foreign investors remain sceptical, and western governments are withholding aid because of the Slorc's dismal human rights record and the continued detention of Ms Aung San Suu Kyi, the opposition leader who won the Nobel Peace Prize in The military junta took power in 1988 as pro-democracy demonstrations were sweeping the country. It held elections in 1990, but has ignored their result. It has kept the 1991 Nobel peace prize winner, Aung San Suu Kyi - leader of the opposition party which won a landslide victory in the poll - under house arrest since July The regime, which is also engaged in a battle with insurgents near its eastern border with Thailand, ignored a 1990 election victory by an opposition party and is detaining its leader, Ms Aung San Suu Kyi, who was awarded the 1991 Nobel Peace Prize. According to the British Red Cross, 5,000 or more refugees, mainly the elderly and women and children, are crossing into Bangladesh each day. Who won the Nobel Peace Prize in 1991?