TDT 2000 Workshop: Lessons Learned


TDT 2000 Workshop: Lessons Learned

These slides represent some of the ideas that were tried for TDT 2000, some conclusions that were reached about techniques on some tasks, and various other thoughts on the tasks. In general, the items here arose during presentations or during the discussions following each task. They represent the impressions of the group (though mostly of the person typing: me), and their accuracy may not be perfect. Please take them in that spirit.

James Allan, November 2000

Goals of the meeting
- Discuss the TDT 2000 evaluation
- Decide on any lessons learned
  - Potential for HLT conference?
- Relate TDT to TREC filtering
  - Including discussion of merging
- Decide on the TDT 2001 evaluation
  - Reality: little to no funding for new data
- Look ahead to TDT 2002 (?!?)

Corpus
- Impact/quality of search-guided annotation?
  - New TDT-3 topics substantially different in quality from the old 60?
- Different numbers of stories in English and Mandarin
- TDT-2 as training/dev data
  - May/June has only 34 topics; AMJ has 69

MEI (Hopkins WS'00)
- Strictly cross-language tracking (E → M)
- Point: varying the Nt stories is like the query track
- Phrase translations by dictionary inversion
- Results
  - Phrases beat words
  - Translation preferences (for effectiveness): phrases, then words, then lemmas, then syllables
  - Post-translation re-segmentation: character bigrams are best, syllable bigrams do poorly

Tracking: what people did
- Models
  - Vector space
    - Clusters, Okapi-esque weights
  - Statistical language model
    - Likelihood, story length, score normalization (all of them)
- Use a detection system: Nt seed cluster(s)
- Cluster on-topic stories (vs. 1-NN)
  - Advantage to merging new stories into the topic, but heavily weighting the Nt stories
  - Putting the Nt stories into variable clusters
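The vector-space variant above can be sketched roughly as follows: build a centroid from the Nt seed stories and score each incoming story by cosine similarity against it. This is an illustrative Python sketch, not any site's actual system; the tokenization, weighting (plain term frequency rather than Okapi-style weights), and threshold are all placeholder assumptions.

```python
import math
from collections import Counter

def vectorize(text):
    """Bag-of-words term-frequency vector for a story (placeholder tokenizer)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def track(seed_stories, incoming, threshold=0.2):
    """Flag incoming stories whose cosine to the seed centroid exceeds a threshold."""
    centroid = Counter()
    for story in seed_stories:  # the Nt on-topic seed stories
        centroid.update(vectorize(story))
    return [cosine(centroid, vectorize(s)) >= threshold for s in incoming]
```

A cluster-based variant would merge accepted stories back into the centroid (while, per the slide, keeping the Nt seed stories heavily weighted).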

Tracking
- Named entities as features
  - Helped when added to morphs + stemming (IBM)
  - High miss rate when only NEs were used: many stories have no NEs in common (Iowa)
  - Better for newswire in English
- Use of χ² for query term selection
  - Used negative exemplars to improve
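The χ² term selection mentioned above is, in its usual formulation, a 2x2 test of association between a term's occurrence and topic membership. A minimal sketch, assuming the standard 2x2 statistic (the slide does not say which variant the systems used):

```python
def chi_squared(a, b, c, d):
    """
    2x2 chi-squared statistic for one term:
      a = on-topic stories containing the term
      b = off-topic stories containing the term
      c = on-topic stories without the term
      d = off-topic stories without the term
    """
    n = a + b + c + d
    num = n * (a * d - b * c) ** 2
    den = (a + b) * (c + d) * (a + c) * (b + d)
    return num / den if den else 0.0

def select_terms(term_counts, k=10):
    """Rank candidate query terms by chi-squared association with the topic.
    term_counts maps term -> (a, b, c, d) as defined above."""
    ranked = sorted(term_counts, key=lambda t: chi_squared(*term_counts[t]), reverse=True)
    return ranked[:k]
```

Negative exemplars fit naturally here: they supply the off-topic counts (b and d) instead of estimating them from the whole collection.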

Tracking (cont.)
- Negative exemplars helped for English (UMd)
  - Not for Mandarin, but perhaps too much noise
- Character bigrams much better than words
- Improvements in translation help performance
  - Particularly at the lower miss rates

Tracking lessons
- Pretty much matched the TDT 1999 results
  - In the sense of getting into the 10% miss / 1% false alarm box
  - With Nt=1 (this year) vs. Nt=4 (then)
- Automatic story boundaries have a noticeable impact on effectiveness
  - Not huge, but reference boundaries dominate
  - Not as clear for English (a tuning issue only?)
- Variability of BBN's system with different Nt stories selected
  - Suggests variability based on the sample stories
  - Should we have various samples for running?
  - A way to get zillions of "other" topics to track

Tracking (cont.)
- Stemming helped (on TDT-2)
- The challenge condition (Nt=4, ASR, no boundaries) was sometimes better than, or no worse than, the primary condition (Nt=1, closed captions, reference boundaries)
- NEs contain useful information, but not enough
- Negative exemplars may help
- Translation matters (only at low miss rates?)
- Score normalization continues to be an issue

Tracking questions
- Impact of topic size on effectiveness
  - Evidence that small topics are easier
- "Value" of score normalization
  - Per-topic "dial" vs. per-system "dial"

First Story Detection
- UMass improved slightly
- ASR hurts slightly
- Automatic boundaries hurt slightly

Cluster detection: recap/summary
- Sentence boundaries (CUHK)
- Named entities (English and Mandarin)
  - Learned on training corpora (CUHK)
- Translation (M → E)
  - Dictionary, plus a parallel corpus (passage-aligned)
    - Used to adjust the weights of dictionary-translated words
  - Seems to help (though the baseline cost is high)
- Use of a deferral window (temporary clusters)
  - Seems reasonable, but the value is unclear

Cluster detection (cont.)
- Interpolation rather than backoff
  - Backoff = get missing terms' statistics from General English (GE)
  - Interpolation = all scores are a combination of the cluster and GE models
- "Targeting" (cf. "blind relevance feedback")
  - Smooth the incoming story with information from another corpus (15% from there is best)
- 20% degradation due to [these] automatic boundaries
- Stemming hurts with automatic boundaries
  - Stemming is a recall-enhancing device, so P(fa) is higher
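The backoff-versus-interpolation distinction can be made concrete. A minimal sketch, assuming simple unigram models stored as term-to-probability dictionaries; the mixing weights and the floor probability are illustrative, not values any site reported:

```python
def backoff_prob(term, cluster_lm, ge_lm, alpha=0.4):
    """Backoff: trust the cluster estimate when the term was seen there,
    and consult the General English (GE) model only for missing terms."""
    if term in cluster_lm:
        return cluster_lm[term]
    return alpha * ge_lm.get(term, 1e-6)

def interpolated_prob(term, cluster_lm, ge_lm, lam=0.85):
    """Interpolation: every term's score is a mixture of the cluster
    and GE models, whether or not the cluster has seen the term."""
    return lam * cluster_lm.get(term, 0.0) + (1 - lam) * ge_lm.get(term, 1e-6)
```

"Targeting" would be a further mixing step of the same shape, folding in a model estimated from stories retrieved from another corpus (the slide's 15% would be that component's weight).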

Cluster detection
- Cost increases when using native orthography
  - SYSTRAN makes a big difference
- Bigger topics tend to have higher costs
  - Easier to split a big topic? Huge cost of a miss?
- 1-NN non-agglomerative approaches are not stable
  - Hurt by automatic boundaries in particular

Cluster detection
- Hurt by including Mandarin docs with English
- Hard to compare clustering by subsets
  - I.e., cannot figure out effectiveness on X by extracting those results from X+Y results
  - Including Y in a cluster impacts the following X's

Cluster detection questions
- (For George) The real task is multi-lingual
  - SYSTRAN is just a method to get there
  - Despite Jon's breaking it out separately; really a contrastive run
- Measuring effectiveness
  - Cost seems "bouncy"; YDZ of unclear value
  - The minimum cost includes (say) 633 and 2204
  - Small changes in C_det lead to huge changes in the number of clusters
  - TREC filtering's utility measures are similarly unstable (Oasis experience, UMass)
  - Need a "better" application model?
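For reference, the C_det being discussed combines miss and false-alarm probabilities into one number. The sketch below uses the constants commonly associated with the TDT evaluations (C_miss = 1.0, C_fa = 0.1, P_target = 0.02); treat those defaults as an assumption here rather than something quoted from these slides:

```python
def detection_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """TDT-style detection cost: a weighted sum of the miss rate and
    the false-alarm rate, weighted by the prior probability of a target."""
    return c_miss * p_miss * p_target + c_fa * p_fa * (1 - p_target)

def normalized_cost(p_miss, p_fa, c_miss=1.0, c_fa=0.1, p_target=0.02):
    """Normalize by the better of the two trivial systems
    (reject everything vs. accept everything)."""
    trivial = min(c_miss * p_target, c_fa * (1 - p_target))
    return detection_cost(p_miss, p_fa, c_miss, c_fa, p_target) / trivial
```

The "bounciness" follows from the shape of this function: because misses and false alarms trade off steeply near the operating point, a small change in C_det can correspond to a very different clustering granularity.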

Segmentation
- Fine-grained HMM
  - Models position in the story: 250 states for the start, 0 for the end, 1 for the middle
  - End states become events occurring later (at the start)
  - Models where-in-story-we-are features
- Single coherent segmentation of the text
- Visualization tools
- No use of audio information (except X)

Link Detection
- Lack of interest: why?
- UMass
  - Much better on E-E than on M-M or M-E
  - Normalization as f(EE, MM, ME) is important
  - LCA smoothing ("targeting") is helpful
  - Issue: how to find smoothing stories vs. how to compare smoothed stories
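One way to realize "normalization as f(EE, MM, ME)" is to standardize link scores separately within each language pair before applying a single decision threshold. Z-normalization is an assumption here; the slide does not say which function UMass actually used.

```python
import statistics

def normalize_by_pair(scored_links):
    """Z-normalize link scores per language pair ('EE', 'MM', 'ME') so that
    one global threshold is meaningful across pairs.
    scored_links is a list of (pair, raw_score) tuples."""
    by_pair = {}
    for pair, score in scored_links:
        by_pair.setdefault(pair, []).append(score)
    # Per-pair mean and (population) standard deviation; guard against zero spread.
    stats = {p: (statistics.mean(v), statistics.pstdev(v) or 1.0)
             for p, v in by_pair.items()}
    return [(pair, (score - stats[pair][0]) / stats[pair][1])
            for pair, score in scored_links]
```

In practice the per-pair statistics would be estimated once on held-out data rather than on the links being judged, but the shape of the computation is the same.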

Event granularity
- Some events (e.g., Pinochet) seem to have several clear sub-topics over time
  - Is there a clear representation of topic evolution?
- Others are much more scattered (e.g., the Swissair crash)