Published by Valentine McCarthy. Modified over 9 years ago.
1
CMU TDT Report
12-13 November 2001
The CMU TDT Team: Jaime Carbonell, Yiming Yang, Ralf Brown, Chun Jin, Jian Zhang
Language Technologies Institute, CMU
2
Time Line for TDT Activities
(Re)Start: Summer 2001
Baseline FSD, Link, Det: Sept 2001
Evaluation (of baseline): Oct 2001
New Techniques: Nov 2001 – Onwards
  Topic-conditional Novelty
  Situated NE’s (all tasks)
  Source-conditional interpolated training
3
Baseline FSD Method
(Unconditional) Dissimilarity with Past
  Decision threshold on most-similar story
  (Linear) temporal decay
  Length-filter (for teasers)
Cosine similarity with standard weights:
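The baseline on this slide can be sketched roughly as follows. This is a minimal illustration, not the CMU system: the weighting formula is not preserved in this transcript, so raw term frequency stands in for the "standard weights", and the `threshold`, `window`, and `min_len` values are hypothetical. Keeping filtered teasers in the history is also an assumption.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def detect_first_stories(stories, threshold=0.3, window=100, min_len=3):
    """Return one flag per story: True if it is judged a first story.

    stories: token lists in arrival order.
    threshold, window, min_len: hypothetical values, not from the slides.
    """
    past = []   # (arrival index, vector) of every story seen so far
    flags = []
    for i, tokens in enumerate(stories):
        vec = Counter(tokens)  # raw TF, an assumed stand-in for the real weights
        if len(tokens) < min_len:
            # Length filter: very short items (teasers) are never flagged.
            flags.append(False)
            past.append((i, vec))
            continue
        best = 0.0
        for j, old in past:
            # Linear temporal decay: weight falls from 1 to 0 over `window` stories.
            decay = max(0.0, 1.0 - (i - j) / window)
            best = max(best, decay * cosine(vec, old))
        # Decision threshold on the most-similar past story.
        flags.append(best < threshold)
        past.append((i, vec))
    return flags
```

The decision depends only on the single most-similar past story, so one mislabeled or badly segmented story in the history can flip it.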
4
FSD Results (story weighted / topic weighted)
P(miss): .6028
P(F/A): .0207 / .0186
Cost: .0141 / .0143
Norm Cost: .7043 / .7217
Opt N. Cost: .6807
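The Cost and Norm Cost figures can be related to P(miss) and P(F/A) with the usual TDT detection cost. The slide does not state the constants; the sketch below assumes C_miss = 1, C_fa = 0.1 and P_target = 0.02, and pairs the single P(miss) value with the first P(F/A) value, which is also an assumption.

```python
def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Return (cost, normalized cost) for a detection system.

    The default constants are assumed, not taken from the slides.
    """
    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of the better trivial system
    # (answer "yes" to everything or "no" to everything).
    norm = cost / min(c_miss * p_target, c_fa * (1.0 - p_target))
    return cost, norm

cost, norm = detection_cost(0.6028, 0.0207)
print(round(cost, 4), round(norm, 4))
```

Under these assumptions the result agrees with the reported .0141 and .7043 up to rounding of the inputs.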
5
Comparative FSD DET Curves
6
FSD Observations
Cross-site comparable baselines (cost = .7)
Data/labeling issues (from error analysis)
  “Events-vs-Topics” issue (e.g. Asia crisis)
  A few mislabeled stories wreak havoc for FSD
  Eager auto-segmentation a problem (misses)
Recommendations for TDT labeling
  FSD on true events, or events within topic(s)
  Change auto-segmentation optimality criterion ??
Recommendations for TDT researchers
  Keep working hard on FSD – not cracked yet
7
New FSD Directions
Topic-conditional models
  E.g. “airplane,” “investigation,” “FAA,” “FBI,” “casualties”: topic, not event
  “TWA 800,” “March 12, 1997”: event
  First categorize into topic, then use maximally-discriminative terms within topic
Rely on situated named entities
  E.g. “Arcan as victim,” “Sharon as peacemaker”
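One way to read "maximally-discriminative terms within topic" is to compute term weights over the stories of a single topic only, so that words shared by the whole topic ("airplane") count for little and event-specific words ("TWA 800") dominate. The sketch below assumes the categorization step has already been done by some classifier. The smoothed within-topic IDF and the `threshold` value are illustrative choices, not the method described in the slides.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def is_novel_in_topic(story, topic_history, threshold=0.2):
    """Judge novelty of `story` against earlier stories of the same topic.

    story: list of tokens; topic_history: list of token lists.
    threshold is a hypothetical value.
    """
    n = len(topic_history)
    df = Counter(t for s in topic_history for t in set(s))
    # Within-topic IDF: a term present in every story of the topic gets weight 0.
    idf = {t: math.log((n + 1) / (d + 1)) for t, d in df.items()}
    unseen = math.log(n + 1)  # weight for terms never seen in this topic

    def weigh(tokens):
        return {t: c * idf.get(t, unseen) for t, c in Counter(tokens).items()}

    vec = weigh(story)
    best = max((cosine(vec, weigh(old)) for old in topic_history), default=0.0)
    return best < threshold

history = [
    ["airplane", "crash", "faa", "twa800", "longisland"],
    ["airplane", "investigation", "fbi", "twa800"],
    ["airplane", "crash", "faa", "swissair111", "halifax"],
    ["airplane", "investigation", "faa", "swissair111"],
]
new_event = ["airplane", "crash", "faa", "valujet592", "everglades"]
follow_up = ["airplane", "twa800", "fbi", "investigation"]
print(is_novel_in_topic(new_event, history), is_novel_in_topic(follow_up, history))
```

In this toy data the new crash shares "airplane", "crash" and "faa" with earlier stories, which would give it a high unweighted cosine; the within-topic weights suppress that overlap.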
8
A New Approach to First Story Detection for TDT
9
Baseline Story-Link Detection
Use same term-weighting and cosine similarity as FSD and detection
Decision thresholds conditioned on language and source
  Lower threshold for cross-language
  Lower threshold for cross-ASR/newswire
Thresholds trained on development set
  15% improvement over universal threshold
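Conditioning the link decision on language and source can be sketched as a lookup keyed on the pair's condition. The threshold numbers and the `lang` / `source_type` field names below are hypothetical; the slides give no values, only that the cross-language and cross-ASR/newswire thresholds are lower.

```python
# Hypothetical thresholds for illustration; in practice they would be
# tuned per condition on a development set.
THRESHOLDS = {
    ("same_lang", "same_source"): 0.20,
    ("same_lang", "cross_source"): 0.15,
    ("cross_lang", "same_source"): 0.12,
    ("cross_lang", "cross_source"): 0.10,
}

def pair_condition(a, b):
    """Classify a story pair by whether language and source type match."""
    lang = "same_lang" if a["lang"] == b["lang"] else "cross_lang"
    src = "same_source" if a["source_type"] == b["source_type"] else "cross_source"
    return lang, src

def is_linked(similarity, a, b, thresholds=THRESHOLDS):
    """Declare a link when similarity clears the condition-specific threshold."""
    return similarity >= thresholds[pair_condition(a, b)]

english_newswire = {"lang": "en", "source_type": "newswire"}
mandarin_asr = {"lang": "zh", "source_type": "asr"}
print(is_linked(0.11, english_newswire, mandarin_asr))
print(is_linked(0.11, english_newswire, english_newswire))
```

The same similarity score can yield a link for a cross-language, cross-source pair and no link for a same-language newswire pair, which is the effect a single universal threshold cannot capture.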
10
Primary Link
11
CMU Link
12
CMU2 Link
13
CMU Detection
                 Auto-segmented boundaries   Pre-established boundaries
C det (basic)    .0076                       .0063
C det (norm)     .3786                       .3138
Incremental Retrospective Clustering
  Group-Average in Forward Deferral Window
  Same cosine similarity and term weights as FSD
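A rough sketch of incremental clustering with a deferral window is given below, under several assumptions: the window is counted in stories, group-average is read as the mean pairwise cosine between the members of two clusters, raw term frequency replaces the real term weights, and `window` and `threshold` are hypothetical values. It is not the CMU implementation.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def avg_link(c1, c2, vecs):
    """Mean pairwise cosine between the members of two clusters."""
    total = sum(cosine(vecs[i], vecs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))

def cluster_with_deferral(stories, window=3, threshold=0.3):
    """Cluster token lists in arrival order; returns lists of story indices."""
    vecs = [Counter(s) for s in stories]
    clusters = []
    for start in range(0, len(vecs), window):
        # Defer decisions: first add the whole window as singleton clusters.
        end = min(start + window, len(vecs))
        clusters += [[i] for i in range(start, end)]
        # Then merge the most similar pair of clusters until none clears the threshold.
        while True:
            best, pair = threshold, None
            for x in range(len(clusters)):
                for y in range(x + 1, len(clusters)):
                    s = avg_link(clusters[x], clusters[y], vecs)
                    if s >= best:
                        best, pair = s, (x, y)
            if pair is None:
                break
            x, y = pair
            merged = clusters.pop(y)  # y > x, so index x is unaffected
            clusters[x].extend(merged)
    return clusters
```

Deferring the decision lets a story be grouped using later stories in the same window, at the price of delayed output.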