Published by Oswald Cuthbert Mitchell. Modified over 9 years ago.
1
TDT 2002 Straw Man
TDT 2001 Workshop, November 12-13, 2001
2
Corpus
  TDT-4
    English and Mandarin required
    Arabic optional, but encouraged
    Provided in TREC-friendly format
    Made substantially cheaper
  Topics
    60 new topics, as before
    “Brief” redefined to be “more useful”
  Stand-off notation standard
    So sites can provide useful annotations
3
Tasks
  Drop tracking
    Transition to TREC filtering
    Entry task if TREC does not pick it up
  Keep segmentation (sunset?)
  Focus on FSD and SLD
  Exploratory evaluations on clustering
  New event-based evaluation
4
Changes to “brief”
  Currently “brief” is “less than 10% is on topic”
    LDC does this strictly
    So 10.5% is “YES”!
  Prefer notion of whether topic is central to the story or not
    If central topic, then YES
    If mentioned in passing, then BRIEF
    Requires a SHARED-CENTRAL possibility?
  This idea requires rethinking
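The two labeling rules on this slide can be contrasted in a short sketch. This is only an illustration: the function names, the NO label for stories that never touch the topic, and the boolean inputs of the proposed rule are assumptions, not part of the LDC annotation guidelines.

```python
def label_current(on_topic_fraction: float, threshold: float = 0.10) -> str:
    """Current rule as stated on the slide: strictly less than 10% on topic is BRIEF."""
    if on_topic_fraction <= 0.0:
        return "NO"  # assumption: a story with no on-topic content is off topic
    return "BRIEF" if on_topic_fraction < threshold else "YES"


def label_proposed(is_central: bool, is_mentioned: bool) -> str:
    """Proposed rule: central topic is YES, mentioned in passing is BRIEF."""
    if is_central:
        return "YES"
    if is_mentioned:
        return "BRIEF"
    return "NO"  # assumption: not mentioned at all


print(label_current(0.105))        # just over the strict threshold
print(label_current(0.09))         # just under it
print(label_proposed(False, True)) # topic only mentioned in passing
```

Under the strict threshold a story that is 10.5% on topic gets YES even if the topic is peripheral; the proposed rule ignores the fraction and asks only about centrality. The open SHARED-CENTRAL question is deliberately left out of the sketch.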
5
Standoff annotation
  TDT-2 and TDT-3 distributed with:
    ASR into text
    SYSTRAN into English
    Named entity tagging
  Standardize a means for sites to provide other annotations:
    POS or parsings
    Co-references for named entities
    Time expressions with normalization
    Alternate translations
    Subject-like headings à la BBN’s tags
    …
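As a rough illustration of what standoff annotation means in practice, the sketch below keeps the story text untouched and stores annotations separately as character-offset records. The slides do not specify a format, so the `Annotation` record, the layer names, and the example sentence are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    start: int   # character offset where the span begins (inclusive)
    end: int     # character offset where the span ends (exclusive)
    layer: str   # hypothetical layer name, e.g. "named_entity" or "time"
    value: str   # label or normalized value attached to the span


def annotate(text: str, surface: str, layer: str, value: str) -> Annotation:
    """Build a standoff record for the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return Annotation(start, start + len(surface), layer, value)


def spans(text: str, annotations: list, layer: str) -> list:
    """Resolve one annotation layer back to (surface text, value) pairs."""
    return [(text[a.start:a.end], a.value) for a in annotations if a.layer == layer]


# Invented example sentence; the source text itself is never modified.
story = "The minister arrived in Geneva on 12 November 2001."
annotations = [
    annotate(story, "Geneva", "named_entity", "LOCATION"),
    annotate(story, "12 November 2001", "time", "2001-11-12"),
]

print(spans(story, annotations, "time"))
```

Because each layer only points into the text, different sites could contribute POS tags, co-references, or normalized time expressions independently without conflicting edits to the corpus.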
6
Clustering evaluations
  Few people happy with clustering measures
  Many people unhappy with central idea of clustering
    Partitioning of corpus (single topic stories)
    No hierarchies permitted in results
  Allow exploration of new models for clustering
    Perhaps inspired by IFE Bio and IFE Arabic?
    Both systems have UMass detection running
  Or new problems based on clustering
    Linking clusters, describing their substructures, …
7
Event-based evaluation
  Most (not all) TDT approaches would work just as well for IR filtering or even IR document retrieval
    Force exploration of TDT-specific needs
  Topic is made of events
    Seminal events and inevitable ones
  Event is something that happens somewhere at a particular time
    Who, where, when, what
  Explicitly capture components of events?
8
Event-based straw man
  Based on link detection
  Given two stories:
    Is the perpetrator (“who”) the same?
    Do they describe events that take place at the same location?
    …at the same time?
  Idea:
    If two stories talk about events at the same time, they’re more likely to be talking about the same event (obviously more than time needed)
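The component-level questions in the straw man could be sketched roughly as follows. This assumes the who, where, and when components have already been extracted from each story, which is the hard part and is not shown; the `EventComponents` record, the one-day tolerance, and the two-of-three decision rule are hypothetical choices, not part of the proposal.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class EventComponents:
    who: Optional[str]    # main actor, e.g. the perpetrator
    where: Optional[str]  # location of the event
    when: Optional[date]  # date on which the event took place


def compare_components(a: EventComponents, b: EventComponents,
                       max_days_apart: int = 1) -> dict:
    """Answer the three per-component questions for a pair of stories."""
    def same_text(x, y):
        # Naive string match; a missing component never counts as a match.
        return x is not None and y is not None and x.strip().lower() == y.strip().lower()

    same_time = (a.when is not None and b.when is not None
                 and abs((a.when - b.when).days) <= max_days_apart)
    return {
        "who": same_text(a.who, b.who),
        "where": same_text(a.where, b.where),
        "when": same_time,
    }


def likely_same_event(a: EventComponents, b: EventComponents,
                      min_agreeing: int = 2) -> bool:
    """Hypothetical rule: link the stories if enough components agree."""
    return sum(compare_components(a, b).values()) >= min_agreeing


# Invented placeholder stories.
s1 = EventComponents("Person A", "City X", date(2001, 11, 12))
s2 = EventComponents("person a", "City X", date(2001, 11, 13))
s3 = EventComponents("Person B", "City Y", date(2001, 11, 12))

print(compare_components(s1, s2))
print(likely_same_event(s1, s3))
```

Requiring more than one agreeing component reflects the slide's caveat that a time match alone makes a link more likely but is not sufficient.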