1 Natural Language Processing for the Web Prof. Kathleen McKeown 722 CEPSR, 939-7118 Office Hours: Wed, 1-2; Mon 3-4 TA: Fadi Biadsy 702 CEPSR, 939-7111.

Presentation transcript:

1 Natural Language Processing for the Web Prof. Kathleen McKeown 722 CEPSR, Office Hours: Wed, 1-2; Mon 3-4 TA: Fadi Biadsy 702 CEPSR, Office Hours: Thurs 6-8

2 Logistics
- Remaining classes: CS Conference Room
  - Except April 3rd, back in 223 Mudd
  - Invited speakers: 7th Floor Interschool Lab
- CS account: apply for one now
- Presentations, Discussants
  - Need two presenters for next week
  - If you haven't already signed up, sign up on the sheet going around

3 Today
- Overview
- Single doc summarization systems:
  - Trimmer (Zajic et al.), Kathy
  - Cut and Paste (Jing and McKeown), Sigfried Gold
  - Statistical Sentence Compression (Knight and Marcu), Kathy
- Tools
  - Parsers, POS taggers, Barry Schiffman
- Evaluation
  - Pyramids (Nenkova and Passonneau), Joshua Nankin
  - ROUGE (Lin and Hovy), Kathy

4 Sentence extraction
- Sparck Jones:
  - "What you see is what you get": some of what is on view in the source text is transferred to constitute the summary

5 Background
- Sentence extraction is the main approach
- Some more sophisticated features for extraction
  - Lexical chains, anaphoric reference
- Machine learning model for learning an extraction summarizer: Kupiec, SIGIR 95
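The sentence-extraction idea above can be illustrated with a minimal sketch. It is not the Kupiec model or any system covered in this class: it assumes a plain word-frequency score, a naive punctuation-based sentence splitter, and a tiny hypothetical stopword list (`STOPWORDS`). The function names are illustrative only.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to",
             "is", "are", "was", "it", "that", "for"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOPWORDS]


def extract_summary(text, k=2):
    """Return the k highest-scoring sentences, kept in source order.

    A sentence's score is the average document frequency of its content
    words, so nothing is rewritten: text is only selected and copied.
    """
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `extract_summary(document_text, k=3)` returns three sentences copied verbatim from the input. The output is limited to what is already "on view" in the source, which is the limitation the editing-based systems discussed today try to address.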

6 Today's systems
- How can we edit the selected text?

7 Karen Sparck Jones Automatic Summarizing: Factors and Directions

8 Sparck Jones claims
- Need more power than text extraction and more flexibility than fact extraction (p. 4)
- "In order to develop effective procedures it is necessary to identify and respond to the context factors, i.e. input, purpose and output factors, that bear on summarising and its evaluation." (p. 1)
- "It is important to recognize the role of context factors because the idea of a general-purpose summary is manifestly an ignis fatuus." (p. 5)
- "Similarly, the notion of a basic summary, i.e., one reflective of the source, makes hidden factor assumptions, for example that the subject knowledge of the output's readers will be on a par with that of the readers for whom the source was intended." (p. 5)
- "I believe that the right direction to follow should start with intermediate source processing, as exemplified by sentence parsing to logical form, with local anaphor resolutions

9 Questions (from Sparck Jones)
- Does the subject matter of the source influence summary style (e.g., chemical abstracts vs. sports reports)?
- Should we take the reader into account, and if so, how?
- Is the state of the art sufficiently mature to allow summarization from intermediate representations and still allow robust processing of domain-independent material?

10 For the next two classes
- Consider the papers we read in light of Sparck Jones' remarks on the influence of context:
  - Input
    - Source form, subject type, unit
  - Purpose
    - Situation, audience, use
  - Output
    - Material, format, style

11 Trimmer Algorithm

12 Headline Ambiguity
- Iraqi Head Seeks Arms
- Juvenile Court to Try Shooting Defendant
- Teacher Strikes Idle Kids
- Kids Make Nutritious Snacks
- British Left Waffles on Falkland Islands
- Red Tape Holds Up New Bridges
- Bush Wins on Budget, but More Lies Ahead
- Hospitals are Sued by 7 Foot Doctors
- Ban on nude dancing on Governor's desk
- Local high school dropouts cut in half