TDT 2002 Straw Man
TDT 2001 Workshop, November 12-13, 2001

Corpus
- TDT-4
  - English and Mandarin required
  - Arabic optional, but encouraged
  - Provided in a TREC-friendly format (see the reader sketch below)
  - Made substantially cheaper
- Topics
  - 60 new topics, as before
  - "Brief" redefined to be "more useful"
- Stand-off notation standard
  - So sites can provide useful annotations
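
As a concrete illustration of "TREC-friendly format", here is a minimal reading sketch. It assumes each story is wrapped in <DOC>...</DOC> with a <DOCNO> identifier and a <TEXT> body; the tag names and the function are illustrative only, since the actual TDT-4 release defines the exact markup.

    # Minimal sketch (assumed markup): pull (docno, text) pairs out of a
    # TREC-style SGML file where each story is a <DOC> element.
    import re

    DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
    DOCNO_RE = re.compile(r"<DOCNO>\s*(.*?)\s*</DOCNO>", re.DOTALL)
    TEXT_RE = re.compile(r"<TEXT>(.*?)</TEXT>", re.DOTALL)

    def read_trec_file(path):
        """Yield (docno, text) pairs from one TREC-style corpus file."""
        with open(path, encoding="utf-8") as f:
            data = f.read()
        for doc in DOC_RE.findall(data):
            docno = DOCNO_RE.search(doc)
            text = TEXT_RE.search(doc)
            if docno and text:
                yield docno.group(1), text.group(1).strip()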

Tasks
- Drop tracking
  - Transition to TREC filtering
  - Entry task if TREC does not pick it up
- Keep segmentation (sunset?)
- Focus on FSD and SLD
- Exploratory evaluations on clustering
- New event-based evaluation

Changes to "brief"
- Currently "brief" means "less than 10% of the story is on topic"
  - LDC applies this strictly
  - So 10.5% is "YES"!
- Prefer a notion of whether the topic is central to the story or not (both rules are sketched below)
  - If it is the central topic, then YES
  - If it is mentioned in passing, then BRIEF
  - Requires a SHARED-CENTRAL possibility?
- This idea requires rethinking
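
To make the contrast concrete, here is a small sketch of the two labeling rules. The 10% threshold is the current LDC rule from the slide; the centrality-based rule and the SHARED-CENTRAL label are the proposal, and the function names are only illustrative.

    # Sketch of the two labeling schemes discussed above.

    def label_current(on_topic_fraction):
        """Current rule: under 10% on-topic text means BRIEF, otherwise YES.
        Applied strictly, 10.5% already counts as YES."""
        return "BRIEF" if on_topic_fraction < 0.10 else "YES"

    def label_proposed(is_central, is_shared_central=False):
        """Proposed rule: label by whether the topic is central to the story.
        A SHARED-CENTRAL label might be needed when several topics are central."""
        if is_shared_central:
            return "SHARED-CENTRAL"
        return "YES" if is_central else "BRIEF"

    assert label_current(0.105) == "YES"   # the 10.5% example from the slide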

Standoff annotation
- TDT-2 and TDT-3 were distributed with:
  - ASR output as text
  - SYSTRAN translations into English
  - Named entity tagging
- Standardize a means for sites to provide other annotations (one possible record shape is sketched below):
  - POS tags or parses
  - Co-references for named entities
  - Time expressions with normalization
  - Alternate translations
  - Subject-like headings à la BBN's tags
  - …
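
Since the standard is still to be defined, the following is only a hedged sketch of what a stand-off record might look like: the source text is never modified, and each annotation points back into it by document id and character offsets. All field names are assumptions.

    # Hypothetical stand-off annotation record; field names are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class StandoffAnnotation:
        docno: str    # which story the annotation refers to
        start: int    # character offset where the annotated span begins
        end: int      # character offset where the span ends (exclusive)
        layer: str    # e.g. "named-entity", "time-expression", "coreference"
        value: str    # e.g. "PERSON", or a normalized date such as "2001-11-12"

    # Example: marking a time expression in an invented story.
    ann = StandoffAnnotation(docno="XYZ19981001.0001", start=42, end=57,
                             layer="time-expression", value="1998-10-01")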

Clustering evaluations
- Few people are happy with the clustering measures
- Many people are unhappy with the central idea of clustering (the constraint is illustrated below)
  - Partitioning of the corpus (single-topic stories)
  - No hierarchies permitted in results
- Allow exploration of new models for clustering
  - Perhaps inspired by IFE Bio and IFE Arabic?
  - Both systems have UMass detection running
- Or new problems based on clustering
  - Linking clusters, describing their substructures, …
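
The sketch below illustrates the constraint being questioned: current clustering output is a flat partition, exactly one cluster per story, so multi-topic stories and cluster hierarchies cannot be expressed. The alternative shapes are only illustrative, not a proposed output format.

    # Current requirement: a flat partition, one cluster id per story.
    flat_partition = {
        "story-001": "cluster-A",
        "story-002": "cluster-A",
        "story-003": "cluster-B",
    }

    # One possible relaxation: a story may belong to several clusters, and
    # clusters may nest, neither of which a partition-based measure can score.
    multi_membership = {
        "story-001": ["cluster-A"],
        "story-003": ["cluster-B", "cluster-C"],    # a multi-topic story
    }
    hierarchy = {
        "cluster-A": ["cluster-A1", "cluster-A2"],  # sub-events under a broader topic
    }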

Event-based evaluation
- Most (not all) TDT approaches would work just as well for IR filtering or event-oriented IR document retrieval
- Force exploration of TDT-specific needs
- A topic is made up of events
  - Seminal events and inevitable ones
- An event is something that happens somewhere at a particular time
  - Who, where, when, what
- Explicitly capture the components of events? (one possible representation is sketched below)
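
If event components were captured explicitly, a story's events might be represented along these lines. This is a hedged sketch; the field names and the Topic structure are assumptions, not anything specified in the straw man.

    # Sketch of an explicit event representation built from the
    # who/where/when/what components named above. All names are illustrative.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Event:
        what: str                    # the action or happening
        who: Optional[str] = None    # principal actor(s), e.g. a perpetrator
        where: Optional[str] = None  # location of the event
        when: Optional[str] = None   # date or normalized time expression

    @dataclass
    class Topic:
        seminal_event: Event         # the event that started the topic
        related_events: List[Event]  # the "inevitable" follow-on events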

Event-based straw man
- Based on link detection
- Given two stories:
  - Is the perpetrator ("who") the same?
  - Do they describe events that take place at the same location?
  - …at the same time?
- Idea: if two stories talk about events at the same time, they are more likely to be talking about the same event (obviously more evidence than time alone is needed; a scoring sketch follows below)
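
The sketch below turns this straw man into a crude same-event score for a story pair, comparing the who/where/when components from the earlier Event sketch. The weights and threshold are arbitrary assumptions; the slide only argues that agreement in time by itself is weak evidence.

    # Hedged sketch: combine component matches into a same-event score.
    # Inputs are Event-like objects with who/where/when attributes (see the earlier sketch).

    def same_event_score(a, b):
        score = 0.0
        if a.who and a.who == b.who:
            score += 0.4     # same perpetrator or actor
        if a.where and a.where == b.where:
            score += 0.3     # same location
        if a.when and a.when == b.when:
            score += 0.3     # same time; on its own this is weak evidence
        return score

    def is_same_event(a, b, threshold=0.5):
        """Declare a link when enough components agree; the threshold is arbitrary."""
        return same_event_score(a, b) >= threshold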