

CLAIR: Computational Linguistics And Information Retrieval

Text summarization, MEAD, NewsInEssence, Cross-document structure, Sentence compression, LexRank, Political science, Discourse dynamics, Centrality identification, Information retrieval, Blog databases, Question answering, Fact extraction, Machine learning, Graph-based learning, Semi-supervised learning, Harmonic functions, Monte Carlo methods, Information extraction, Language modeling, Modeling burstiness, Biomedical literature analysis, Citation network analysis, Recognizing protein interactions in text, Clustering, Machine translation, Syntax-based alignment, Text generation, Syntax-based features, Models of the Web, Lexical network models, Miscellaneous, Language reuse, Paraphrase identification, Lexical models of the Web, Dependency parsing

Write to if you have any questions

Courses
• Information Retrieval (SI 650), Fall 05
• Advanced NLP/IR (EECS 767/SI 767), Winter 06
• Natural Language Processing (EECS 595/SI 661), Fall 06
• Language and Information (EECS 597/SI 760), Fall 06
• Database Applications Design (SI 654), Fall 05

Faculty: Dragomir Radev

Students: Güneş Erkan, Arzucan Özgür, Xiaodong Shi, Zhuoran Chen, Mark Joseph, Konstantin Zak, Tony Fader, Joshua Gerrish

Main areas of interest
• Graph-based methods
• Machine learning
• Text summarization
• Question answering
• Text mining in political science, blogometrics, bioinformatics

List of current funded projects
• BlogoCenter: Infrastructure for Collecting, Mining and Accessing Blogs. NSF (joint with Junghoo Cho of UCLA)
• Probabilistic and Link-based Methods for Exploiting Very Large Textual Repositories. NSF
• Representing and Acquiring Knowledge of Genome Regulation. NIH (joint with Steve Abney, David States, and H.V. Jagadish)
• Collaborative Research: Semantic Entity and Relation Extraction from Web-scale Text Document Collections. NSF (joint with Michael Collins of MIT and Steve Abney)
• DHB: The Dynamics of Political Representation and Political Rhetoric. NSF (joint with Kevin Quinn of Harvard and Burt Monroe of PSU)
• NCIBI: National Center for Integrative Bioinformatics. NIH (joint with 20 other faculty)

Representative recent papers
• News to Go: Hierarchical Text Summarization for Mobile Devices (SIGIR 2006)
• Language Model Based Document Clustering Using Random Walks (HLT-NAACL 2006)
• An automated method of topic-coding legislative speech over time with application to the 105th-108th U.S. Senate (MPSA 2006, Gosnell Award)
• Summarizing online news topics (CACM 2005)
• Using random walks for question-focused sentence retrieval (HLT-EMNLP 2005)
• Context-based generic cross-lingual retrieval of documents and automated summaries (JASIST 2005)
• Probabilistic question answering on the web (JASIST 2005)
• Centroid-based summarization of multiple documents (IPM 2004)
• A smorgasbord of features for statistical machine translation (HLT-NAACL 2004)
• Graph-based centrality as salience in text summarization (JAIR 2004)

Papers in progress or under submission
• Summarization evaluation in a cross-lingual information retrieval context. Submitted to Information Processing and Management.
• Retrieval of context-specific, dynamic information: A survey of related work. Submitted to ACM Computing Surveys.
• Single-document and multi-document summary evaluation using relative utility. Submitted to Information Retrieval.
• Exploring Fact-Focused Relevance and Novelty Detection. Submitted to Information Processing and Management.
• Hierarchical Summarization for Delivering Information to Mobile Devices. Submitted to Decision Support Systems.
• Modeling Burstiness in Discourse Using a Stochastic Stack
• A topological analysis of semi-supervised graph-based learning with harmonic functions
• Protein-protein interaction with no external knowledge
• An empirical analysis of 100 lexical networks
• Hiring networks in information science and computer science
• Blind men and elephants: What do citation summaries tell us about a research article?
• Reinforcement classifiers
• Dependency parsing using random walks
• Modeling Document Dynamics: An Evolutionary Approach
• Cross-document relationship classification for text summarization

Software available
• MEAD: text summarization
• NSIR: question answering
• CLAIRLIB: generic NLP/IR