CS4705 Natural Language Processing.  Final: December 18 th 1:10-4, 1024 Mudd ◦ Closed book, notes, electronics  Don’t forget courseworks evaluation:

Slides:



Advertisements
Similar presentations
Toward Automatic Music Audio Summary Generation from Signal Analysis Seminar „Communications Engineering“ 11. December 2007 Patricia Signé.
Advertisements

Statistical NLP: Lecture 3
1 Discourse, coherence and anaphora resolution Lecture 16.
Discourse Martin Hassel KTH NADA Royal Institute of Technology Stockholm
Chapter 18: Discourse Tianjun Fu Ling538 Presentation Nov 30th, 2006.
INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING NLP-AI IIIT-Hyderabad CIIL, Mysore ICON DECEMBER, 2003.
For Friday No reading Homework –Chapter 23, exercises 1, 13, 14, 19 –Not as bad as it sounds –Do them IN ORDER – do not read ahead here.
For Monday Read Chapter 23, sections 3-4 Homework –Chapter 23, exercises 1, 6, 14, 19 –Do them in order. Do NOT read ahead.
1 Words and the Lexicon September 10th 2009 Lecture #3.
Shallow Processing: Summary Shallow Processing Techniques for NLP Ling570 December 7, 2011.
CS 4705 Semantic Analysis: Syntax-Driven Semantics.
CS 4705 Algorithms for Reference Resolution. Anaphora resolution Finding in a text all the referring expressions that have one and the same denotation.
CS 4705 Lecture 17 Semantic Analysis: Syntax-Driven Semantics.
Final Review CS4705 Natural Language Processing. Semantics Meaning Representations –Predicate/argument structure and FOPC Thematic roles and selectional.
CS 4705 Lecture 21 Algorithms for Reference Resolution.
Natural Language Generation Martin Hassel KTH CSC Royal Institute of Technology Stockholm
Introduction to CL Session 1: 7/08/2011. What is computational linguistics? Processing natural language text by computers  for practical applications.
تمرين شماره 1 درس NLP سيلابس درس NLP در دانشگاه هاي ديگر ___________________________ راحله مکي استاد درس: دکتر عبدالله زاده پاييز 85.
CS 4705 Semantic Analysis: Syntax-Driven Semantics.
CS 4705 Final Review CS4705 Julia Hirschberg. Format and Coverage Covers only material from thru (i.e. beginning with Probabilistic Parsing) Same format.
Some slides adapted from.  Linguistic Generation  Statistical Generation.
(C) 2000, The University of Michigan 1 Database Application Design Handout #11 March 24, 2000.
March 1, 2009 Dr. Muhammed Al-Mulhem 1 ICS 482 Natural Language Processing INTRODUCTION Muhammed Al-Mulhem March 1, 2009.
Artificial Intelligence Research Centre Program Systems Institute Russian Academy of Science Pereslavl-Zalessky Russia.
Natural Language Understanding
Lecture 1, 7/21/2005Natural Language Processing1 CS60057 Speech &Natural Language Processing Autumn 2005 Lecture 1 21 July 2005.
Introduction to Natural Language Generation
9/8/20151 Natural Language Processing Lecture Notes 1.
Lecture 12: 22/6/1435 Natural language processing Lecturer/ Kawther Abas 363CS – Artificial Intelligence.
Some Thoughts on HPC in Natural Language Engineering Steven Bird University of Melbourne & University of Pennsylvania.
For Friday Finish chapter 23 Homework: –Chapter 22, exercise 9.
1 Computational Linguistics Ling 200 Spring 2006.
CS 4705 Natural Language Processing Fall 2010 What is Natural Language Processing? Designing software to recognize, analyze and generate text and speech.
Natural Language Processing Introduction. 2 Natural Language Processing We’re going to study what goes into getting computers to perform useful and interesting.
Scott Duvall, Brett South, Stéphane Meystre A Hands-on Introduction to Natural Language Processing in Healthcare Annotation as a Central Task for Development.
© Copyright 2013 ABBYY NLP PLATFORM FOR EU-LINGUAL DIGITAL SINGLE MARKET Alexander Rylov LTi Summit 2013 Confidential.
Opinion Holders in Opinion Text from Online Newspapers Youngho Kim, Yuchul Jung and Sung-Hyon Myaeng Reporter: Chia-Ying Lee Advisor: Prof. Hsin-Hsi Chen.
1 CSI 5180: Topics in AI: Natural Language Processing, A Statistical Approach Instructor: Nathalie Japkowicz Objectives of.
Collocations and Information Management Applications Gregor Erbach Saarland University Saarbrücken.
October 2005CSA3180 NLP1 CSA3180 Natural Language Processing Introduction and Course Overview.
Linguistic Essentials
CSA2050 Introduction to Computational Linguistics Lecture 1 Overview.
What you have learned and how you can use it : Grammars and Lexicons Parts I-III.
Introduction to Computational Linguistics
Rules, Movement, Ambiguity
Artificial Intelligence: Natural Language
For Monday Read chapter 24, sections 1-3 Homework: –Chapter 23, exercise 8.
For Friday Finish chapter 24 No written homework.
For Monday Read chapter 26 Last Homework –Chapter 23, exercise 7.
Compiler Construction (CS-636)
For Friday Finish chapter 23 Homework –Chapter 23, exercise 15.
NLP. Introduction to NLP (U)nderstanding and (G)eneration Language Computer (U) Language (G)
SYNTAX 1 NOV 9, 2015 – DAY 31 Brain & Language LING NSCI Fall 2015.
Natural Language Processing (NLP)
For Monday Read chapter 26 Homework: –Chapter 23, exercises 8 and 9.
Overview of Statistical NLP IR Group Meeting March 7, 2006.
Writing Inspirations, 2017 Aalto University
CSC 594 Topics in AI – Natural Language Processing
Statistical NLP: Lecture 3
Chapter Eight Syntax.
Writing Inspirations, Spring 2016 Aalto University
Why Study Spoken Language?
Why Study Spoken Language?
Chapter Eight Syntax.
CSCI 5832 Natural Language Processing
Linguistic Essentials
CS4705 Natural Language Processing
Artificial Intelligence 2004 Speech & Natural Language Processing
Information Retrieval
Presentation transcript:

CS4705 Natural Language Processing

 Final: December 18th, 1:10-4:00, 1024 Mudd
◦ Closed book; no notes or electronics
 Don't forget the CourseWorks evaluation: only 25% so far have done it.
 Office hours as usual next week
 Slides should all be accessible in .ppt format. Send email if any problems.
 HW4: clarification in the pyramid question. Use either definition:
◦ "optimal summary with same number of SCUs" (p. 807, textbook)
◦ "the number of facts in a maximally informative 100-word summary" (HW problem)

 Midterm Curve
◦ Graduate students
 A+: > 80
 A: 71-79
 B: 62-65
 C: 49-57
 C-: 48
 D: < 48

 Undergrad Curve
 A+: > 80
 A: 72-80
 B: 55-66
 C: 42-46
 D: < 40

 Speech Recognition (Spring 09)  Search Engine Technology (Spring 09)  Spoken Language Processing (next year)

 Instructors: Stan Chen, Michael Picheny, and Bhuvana Ramabhadran (all from IBM T. J. Watson)
 Time: Monday 4:10-6:10
 Prerequisites: Knowledge of basic probability and statistics and proficiency in at least one programming language. Knowledge of digital signal processing (ELEN E4810) helpful but not required.
 The first portion of the course will cover fundamental topics in speech recognition: signal processing, Gaussian mixture distributions, hidden Markov models, pronunciation modeling, acoustic state tying, decision trees, finite-state transducers, search, and language modeling. In the remainder of the course, we survey advanced topics from the current state of the art, including acoustic adaptation, discriminative training, and audio-visual speech recognition.

 Instructor: Dragomir Radev
 Time: Friday 2-4
 Goal of the course: A significant portion of the information that surrounds us is in textual format. A number of techniques for accessing such information exist, ranging from databases to natural language processing. Some of the most prestigious companies these days spend large amounts of money to build intelligent search engines that allow casual users to find what they want anytime, from anywhere, and in any language. In this course, we will cover the theory and practice behind the implementation of search engines, focusing on a wide range of topics including methods for text storage and retrieval, the structure of the Web as a graph, evaluation of systems, and user interfaces.

 Speech phenomena ◦ Acoustics, intonation, disfluencies, laughter ◦ Tools for speech annotation and analysis  Speech technologies ◦ Text-to-Speech ◦ Automatic Speech Recognition ◦ Speaker Identification ◦ Dialogue Systems

 Challenges for speech technologies ◦ Pronunciation modeling ◦ Modeling accent, phrasing and contour ◦ Spoken cues to  Discourse segmentation  Information status  Topic detection  Speech acts  Turn-taking  Fun stuff: emotional speech, charismatic speech, deceptive speech….

 An experiment done by outgoing ACL President Bonnie Dorr

 Fill-in-the-blank/multiple choice  Short answer  Problem solving  Essay  Comprehensive (Will cover the full semester)

 Meaning Representations
◦ Predicate/argument structure and FOPC
 Thematic roles and selectional restrictions
◦ Agent/Patient: George hit Bill. / Bill was hit by George.
◦ Selectional restrictions: George assassinated the senator. / *The spider assassinated the fly.

 Compositional semantics
◦ Rule-to-rule hypothesis
◦ E.g. λx λy ∃e (Isa(e,Serving) ^ Server(e,y) ^ Served(e,x))
◦ Lambda notation λx P(x): λ + variable(s) + FOPC expression in those variables
 Non-compositional semantics
◦ Metaphor: You're the cream in my coffee.
◦ Idiom: The old man finally kicked the bucket.
◦ Deferred reference: The ham sandwich wants his check.
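To make the rule-to-rule idea concrete, here is a minimal Python sketch (my own illustration, using a textbook-style "AyCaramba serves meat" example; it just builds formula strings by function application rather than real logical forms):

# Each word contributes a lambda term; each syntactic rule says how to
# combine its children's terms by function application.
serve = lambda x: lambda y: f"Exists e (Isa(e,Serving) ^ Server(e,{y}) ^ Served(e,{x}))"

# VP -> Verb NP: apply the verb's semantics to the object's semantics
vp = serve("Meat")
# S -> NP VP: apply the VP's semantics to the subject's semantics
s = vp("AyCaramba")
print(s)  # Exists e (Isa(e,Serving) ^ Server(e,AyCaramba) ^ Served(e,Meat))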

 Give the FOPC meaning representation for: ◦ John showed each girl an apple. ◦ All students at Columbia University are tall.  Given a sentence and a syntactic grammar, give the semantic representation for each word and the semantic annotations for the grammar. Derive the meaning representation for the sentence.
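For instance, one plausible FOPC rendering of the second sentence (the predicate names here are my own choice, not necessarily the ones used in lecture) is ∀x ((Isa(x, Student) ^ At(x, ColumbiaUniversity)) → Tall(x)). The first sentence is trickier because "each girl" and "an apple" give rise to a quantifier scope ambiguity: one apple per girl versus a single apple shown to every girl.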

 Representing time: ◦ Reichenbach ’47  Utterance time (U): when the utterance occurs  Reference time (R): the temporal point-of-view of the utterance  Event time (E): when events described in the utterance occur George is eating a sandwich. -- E,R,U  George will eat a sandwich?  Verb aspect ◦ Statives, activities, accomplishments, achievements

 WordNet: pros and cons
 Types of word relations
◦ Homonymy: bank/bank
◦ Homophones: red/read
◦ Homographs: bass/bass
◦ Polysemy: Citibank / the bank on 59th Street
◦ Synonymy: big/large
◦ Hyponym/hypernym: poodle/dog
◦ Metonymy: waitress: "The man who ordered the ham sandwich wants dessert." / "The ham sandwich wants dessert."
◦ The White House announced the bailout plan.
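If you want to explore these relations yourself, NLTK's WordNet interface is a convenient way to do it. A small sketch (assuming nltk and its WordNet data are installed):

from nltk.corpus import wordnet as wn

# Homonymy/polysemy: "bank" maps to several distinct synsets (senses)
for s in wn.synsets("bank")[:3]:
    print(s.name(), "-", s.definition())

# Hyponym/hypernym: poodle is a kind of dog
print(wn.synset("poodle.n.01").hypernyms())   # [Synset('dog.n.01')]

# Synonymy: lemmas that share a synset, e.g. big/large
print(wn.synset("large.a.01").lemma_names())  # includes 'large' and 'big'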

 What were some problems with WordNet that required creating their own dictionary?  What considerations about objects have to be taken into account when generating a picture that depicts an "on" relation?

Time flies like an arrow.  Supervised methods ◦ Collocational ◦ Bag of words  What features are used?  Evaluation  Semi-supervised ◦ Use bootstrapping: how?  Baselines ◦ Lesk method ◦ Most frequent meaning
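As a concrete picture of the two supervised feature types, here is a hand-rolled sketch (my own illustration, using "bass" as the target word; a real system would extract these features from a sense-tagged corpus and feed them to a classifier):

def collocational_features(tokens, i, window=2):
    # words at specific positions immediately around the target word
    feats = {}
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"w{offset:+d}"] = tokens[j] if 0 <= j < len(tokens) else "<pad>"
    return feats

def bag_of_words_features(tokens, i, vocab):
    # unordered presence/absence of context words drawn from a fixed vocabulary
    context = set(tokens[:i] + tokens[i + 1:])
    return {w: (w in context) for w in vocab}

tokens = "he caught a striped bass near the dock".split()
target = tokens.index("bass")
print(collocational_features(tokens, target))
print(bag_of_words_features(tokens, target, vocab=["striped", "guitar", "dock", "play"]))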

 Information Extraction ◦ Three types of IE: NER, relation detection, QA ◦ Three approaches: statistical sequence labeling, supervised, semi-supervised ◦ Learning patterns:  Using Wikipedia  Using Google  Language modeling approach  Information Retrieval ◦ TF/IDF and vector-space model ◦ Precision, recall, F-measure
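A minimal sketch of the vector-space machinery and the evaluation measures (my own toy corpus and numbers; a real system would also normalize, stem, and tokenize more carefully):

import math
from collections import Counter

docs = ["the cat sat on the mat",
        "the dog chased the cat",
        "dogs and cats make good pets"]

def tfidf_vectors(docs):
    tokenized = [d.split() for d in docs]
    df = Counter(w for toks in tokenized for w in set(toks))   # document frequency
    N = len(docs)
    return [{w: tf * math.log(N / df[w]) for w, tf in Counter(toks).items()}
            for toks in tokenized]

def cosine(u, v):
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))

# Precision, recall, F-measure for a retrieved set vs. a relevant set
retrieved, relevant = {1, 2, 3}, {2, 3, 4, 5}
p = len(retrieved & relevant) / len(retrieved)
r = len(retrieved & relevant) / len(relevant)
f1 = 2 * p * r / (p + r)
print(p, r, f1)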

 What are the advantages and disadvantages of using exact pattern matching versus using flexible pattern matching for relation detection?  Given a Wikipedia page for a famous person, show how you would derive the patterns for place of birth.  If we wanted to use a language modeler to answer definition questions (e.g., “What is a quark?”), how would we do it?

 Referring expressions, anaphora, coreference, antecedents  Types of NPs, e.g. pronouns, one-anaphora, definite NPs, ….  Constraints on anaphoric reference ◦ Salience ◦ Recency of mention ◦ Discourse structure ◦ Agreement ◦ Grammatical function

◦ Repeated mention ◦ Parallel construction ◦ Verb semantics/thematic roles ◦ Pragmatics  Algorithms for reference resolution ◦ Hobbs – most recent mention ◦ Lappin and Leass ◦ Centering
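A baseline that combines recency of mention with agreement constraints might look like the toy sketch below (my own illustration; it is not the Hobbs, Lappin and Leass, or Centering algorithm, just the constraints applied in recency order):

PRONOUN_FEATURES = {                      # assumed feature inventory, illustration only
    "he":   {"gender": "masc", "number": "sg"},
    "she":  {"gender": "fem",  "number": "sg"},
    "they": {"gender": None,   "number": "pl"},
}

def resolve(pronoun, candidates):
    """candidates: (mention, features) pairs ordered by recency, most recent last."""
    feats = PRONOUN_FEATURES[pronoun]
    for mention, mfeats in reversed(candidates):          # recency of mention
        if feats["number"] != mfeats["number"]:
            continue                                       # number agreement
        if feats["gender"] and mfeats["gender"] and feats["gender"] != mfeats["gender"]:
            continue                                       # gender agreement
        return mention
    return None

candidates = [("John", {"gender": "masc", "number": "sg"}),
              ("Mary", {"gender": "fem",  "number": "sg"}),
              ("the twins", {"gender": None, "number": "pl"})]
print(resolve("she", candidates))   # -> Mary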

 Challenges for MT ◦ Orthographic ◦ Lexical ambiguity ◦ Morphological ◦ Translational divergences  MT Pyramid ◦ Surface, transfer, interlingua ◦ Statistical?  Word alignment  Phrase alignment  Evaluation strategies ◦ BLEU ◦ Human grading criteria

 How does lexical ambiguity affect MT?
 Compute the BLEU score for the following example, using unigrams and bigrams:
◦ Translation: One moment later Alice went down the hole.
◦ References:
◦ In another moment down went Alice after it,
◦ In another minute Alice went into the hole.
◦ In one moment Alice went down after it.
◦ [never once considering how in the world she was to get out again.]
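One way to check a hand computation programmatically is the sketch below (my own; it lowercases, drops punctuation, uses the reference length closest to the candidate for the brevity penalty, and leaves the bracketed continuation out of the references). With those choices the score for this example comes out around 0.71.

import math
from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, references, n):
    # clipped (modified) n-gram precision against all references
    cand = Counter(ngrams(candidate, n))
    max_ref = Counter()
    for ref in references:
        for g, c in Counter(ngrams(ref, n)).items():
            max_ref[g] = max(max_ref[g], c)
    clipped = sum(min(c, max_ref[g]) for g, c in cand.items())
    return clipped / max(1, sum(cand.values()))

def bleu(candidate, references, max_n=2):
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    ref_len = min((len(r) for r in references),
                  key=lambda rl: (abs(rl - len(candidate)), rl))
    bp = 1.0 if len(candidate) >= ref_len else math.exp(1 - ref_len / len(candidate))
    return bp * geo_mean

cand = "one moment later alice went down the hole".split()
refs = ["in another moment down went alice after it".split(),
        "in another minute alice went into the hole".split(),
        "in one moment alice went down after it".split()]
print(bleu(cand, refs))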

 Architecture  Why is generation different from interpretation?  What are some constraints on syntactic choice? Lexical choice?  Functional unification grammar  Statistical language generation ◦ Overgenerate and prune ◦ Input: abstract meaning representation ◦ How are meaning representations linked to English? ◦ What kinds of rules generate different forms?
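A toy sketch of overgenerate-and-prune (entirely my own illustration: hand-written templates stand in for the generator, and raw bigram counts stand in for a real language model):

meaning = {"pred": "advise", "agent": "the system", "patient": "John"}

def overgenerate(m):
    # naive candidate realizations of the same meaning (templates assumed here)
    yield f"{m['agent']} {m['pred']}s {m['patient']}"
    yield f"{m['patient']} is {m['pred']}d by {m['agent']}"
    yield f"{m['pred']}s {m['patient']} {m['agent']}"     # deliberately bad word order

def lm_score(sentence, bigram_counts):
    # stand-in for a language model: sum of observed bigram counts
    toks = ["<s>"] + sentence.split()
    return sum(bigram_counts.get((a, b), 0) for a, b in zip(toks, toks[1:]))

bigram_counts = {("<s>", "the"): 5, ("the", "system"): 3,
                 ("system", "advises"): 2, ("advises", "John"): 2}
best = max(overgenerate(meaning), key=lambda s: lm_score(s, bigram_counts))
print(best)   # the system advises John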

((alt GSIMPLE (
   ;; a grammar always has the same form: an alternative
   ;; with one branch for each constituent category.

   ;; First branch of the alternative
   ;; Describe the category clause.
   ((cat clause)
    (agent ((cat np)))
    (patient ((cat np)))
    (pred ((cat verb-group)
           (number {agent number})))
    (cset (pred agent patient))
    (pattern (agent pred patient)))

   ;; Second branch: NP
   ((cat np)
    (head ((cat noun) (lex {^ ^ lex})))
    (number ((alt np-number (singular plural))))
    (alt ( ;; Proper names don't need an article
          ((proper yes)
           (pattern (head)))
          ;; Common names do
          ((proper no)
           (pattern (det head))
           (det ((cat article) (lex "the")))))))

   ;; Third branch: Verb
   ((cat verb-group)
    (pattern (v))
    (aux none)
    (v ((cat verb) (lex {^ ^ lex}))))
   )))

 Input to generate: The system advises John.
 I1 = ((cat np)
       (head ((lex "cat")))
       (number plural))
 Show unification with grammar.
 What would be generated?
 Suppose we wanted to change the grammar so that we could generate "a cat" or "cats"?
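As a rough picture of what that unification involves, here is a toy Python sketch (my own simplification; real FUF handles alternations, paths like {^ ^ lex}, morphology, and linearization, none of which are modeled here):

FAIL = "FAIL"

def unify(a, b):
    # feature structures as nested dicts; clashing atomic values mean failure
    if a is None:
        return b
    if b is None:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            out[key] = unify(out.get(key), value)
            if out[key] == FAIL:
                return FAIL
        return out
    return a if a == b else FAIL

# I1 from the slide, and a hand-simplified stand-in for the grammar's NP branch
i1 = {"cat": "np", "head": {"lex": "cat"}, "number": "plural"}
np_branch = {"cat": "np", "head": {"cat": "noun"}, "proper": "no",
             "det": {"cat": "article", "lex": "the"}, "pattern": ("det", "head")}
print(unify(i1, np_branch))

Given the plural number and the common-noun branch, the realized string would presumably come out as "the cats"; generating "a cat" would mean extending the grammar's determiner choice (for example, an indefinite-article alternative conditioned on number), which is what the last question is probing.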

 Structure ◦ Topic segmentation ◦ Lexical Cues for topic shift  Lexical repetition  Introduction of new words  Lexical chains  Possible question: given a discourse, compute the lexical repetition score between each block of 2 sentences  Coherence ◦ Rhetorical Structure  Rhetorical relations  Nucleus and satellite
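For the possible question above, here is a simple sketch (my own, assuming the score is cosine similarity over word counts, in the spirit of TextTiling) of a lexical repetition score between adjacent two-sentence blocks:

import math
from collections import Counter

def block_score(block_a, block_b):
    # cosine similarity between the word-count vectors of the two blocks
    a = Counter(w.lower().strip(".,!?") for s in block_a for w in s.split())
    b = Counter(w.lower().strip(".,!?") for s in block_b for w in s.split())
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

sentences = ["The senator visited the factory.",
             "Workers at the factory greeted the senator.",
             "Later the weather turned cold.",
             "Snow fell across the city all night."]
print(block_score(sentences[0:2], sentences[2:4]))   # a low score suggests a topic shift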

 Thank you and good luck on the exam!