Published by Sasha Oldroyd. Modified over 9 years ago.
1
TDT 2003 Evaluation Workshop, NIST, November 17-18, 2003
Creating the Annotated TDT-4 Y2003 Evaluation Corpus
Stephanie Strassel, Meghan Glenn
Linguistic Data Consortium - University of Pennsylvania
{strassel, mlglenn}@ldc.upenn.edu
2
Data Collection/Preparation
- Collection
  - Multiple sources, languages
  - October 2000 – July 2001
- TDT-4 Corpus V1.0
  - Arabic, Chinese, English only
  - October 2000 – January 2001
- Collection subsampled for annotation
  - Goal: reduce licensing, transcription and segmentation costs
  - Broadcast sources: select 4 of 7 or 3 of 5 days, stagger selection to maximize coverage by day
  - Newswire sources: sampling consistent with previous years
  - No down-sampling of Arabic NW
- Reference transcripts
  - Closed-caption text where available
  - Commercial transcription agencies otherwise
  - Spell-check names for English commercial transcripts
  - Provide initial story boundaries & timestamps
- ASR Output & Machine Translation
- TDT-4 Corpus V1.1
  - Incorporates patches to Mandarin ASR data to fix encoding; removes empty files
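The slide does not spell out how the broadcast-day selection was staggered. As a rough sketch only, one way to stagger a 4-of-7 (or 3-of-5) selection is to rotate the kept window by a per-source offset, so that different sources cover different days. The function name, the `offset` parameter, and the rotation rule are illustrative assumptions, not LDC's documented procedure.

```python
from typing import List


def staggered_days(num_days: int, period: int, keep: int, offset: int) -> List[int]:
    """Return the 0-based day indices kept for one broadcast source.

    Hypothetical scheme: within every block of `period` days, keep `keep`
    days, with the kept window rotated by `offset` so that sources with
    different offsets cover different days.
    """
    if not 0 < keep <= period:
        raise ValueError("keep must be between 1 and period")
    return [d for d in range(num_days) if (d + offset) % period < keep]


if __name__ == "__main__":
    # Two hypothetical sources sampled 4 of 7 days with different offsets.
    source_a = staggered_days(7, 7, 4, offset=0)
    source_b = staggered_days(7, 7, 4, offset=3)
    print(source_a, source_b)
    # Together the two sources cover every day of the week.
    print(sorted(set(source_a) | set(source_b)))
```

With offsets chosen this way, each source is sampled at the stated rate while the union of sources still covers every day.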
3
TDT-4 Corpus Overview
4
TDT Concepts
- STORY
  - In TDT2, story is “a section containing at least two independent declarative clauses on same topic”
  - In TDT3, definition modified to capture annotators’ intuitions about what constitutes story
    - Distinction between “preview/teaser” and complete news story
  - TDT4 preserves this content-based story definition
    - Greater emphasis on consistent application of story definition among annotation crew
- EVENT
  - A specific thing that happens at a specific time and place along with all necessary preconditions and unavoidable consequences
- TOPIC
  - An event or activity along with all directly related events and activities
5
Topics for 2003
- 40 new topics selected, defined, annotated for 2003 evaluation
  - 20 from Arabic seed stories
  - 10 each from Mandarin, English
- Topic selection strategy same as in 2002
- Arabic topics are somewhat different
  - Despite same selection strategy
  - First time we’ve had Arabic seed stories
- “Topic well” is running dry
  - 80 news topics with high likelihood of cross-language hits from 4-month span!
6
Selection Strategy
- Team leaders examine randomly-selected seed story
  - Potential seeds balanced across corpus (source/date/lang)
- Identify TDT-style seminal event within story
- Apply rule of interpretation to convert event to topic
  - 13 rules state, for each type of seminal event, what other types of events should be considered related
- No requirement that selected topics have cross-language hits
  - But team leaders use knowledge of corpus to select stories likely to produce hits in other language sources
- Handful of “easily confusable” topics
7
Rules of Interpretation
1. Elections, e.g., 30030: Taipei Mayoral Elections
   - Seminal events include: a specific political campaign, election day coverage, inauguration, voter turnouts, election results, protests, reaction.
   - Topic includes: the entire process, from announcements of a candidate's intention to run through the campaign, nominations, election process and through the inauguration and formation of a newly-elected official's cabinet or government.
2. Scandals/Hearings, e.g., 30038: Olympic Bribery Scandal
3. Legal/Criminal Cases, e.g., 30003: Pinochet Trial
4. Natural Disasters, e.g., 30002: Hurricane Mitch
5. Accidents, e.g., 30014: Nigerian Gas Line Fire
6. Acts of Violence or War, e.g., 30034: Indonesia/East Timor Conflict
7. Science and Discovery News, e.g., 31019: AIDS Vaccine Testing Begins
8. Financial News, e.g., 30033: Euro Introduced
9. New Laws, e.g., 30009: Anti-Doping Proposals
10. Sports News, e.g., 31016: ATP Tennis Tournament
11. Political and Diplomatic Meetings, e.g., 30018: Tony Blair Visits China
12. Celebrity/Human Interest News, e.g., 31036: Joe DiMaggio Illness
13. Miscellaneous News, e.g., 31024: South Africa to Buy $5 Billion in Weapons
8
Topic Research
- Provides context
- Annotators specialize in particular topics (of their choosing)
- Includes timelines, maps, keywords, named entities, links to online resources for each topic
- Feeds into annotation queries
9
Topic Definition
- Fixed format to enhance consistency
- Seminal event lists basic facts: who/what/when/where
- Topic explication spells out scope of topic and potential difficulties
- Rule of interpretation link
- Link to additional resources
- Feeds directly into topic annotation
10
Annotation Strategy Overview
- Search-guided complete annotation
- Work with one topic at a time
- Multiple stages for each topic; multiple iterations of each stage
- Two-way topic labeling decision
Topic Labels
- YES: story discusses the topic in a substantial way
- NO: story does not discuss the topic at all, or only mentions the topic in passing without giving any information about the topic
- No BRIEF in TDT-4
- “Not Easy” label for tricky decisions
  - Triggers additional QC
11
Annotation Search Stages
- Stage 1: Initial query
  - Submit seed story or keywords as query to search engine
  - Read through resulting relevance-ranked list
  - Label each story as YES/NO
  - Stop after finding 5-10 on-topic (OT) stories, or after reaching the “off-topic threshold”:
    - At least 2 off-topic stories for every 1 OT story read, AND
    - The last 10 consecutive stories are off-topic
- Stage 2: Improved query using OT stories from Stage 1
  - Issue new query using concatenation of all known OT stories
  - Read and annotate stories in resulting relevance-ranked list until reaching off-topic threshold
- Stage 3: Text-based queries
  - Issue new query drawn from topic research & topic definition documents plus any additional relevant text
  - Read and annotate stories in resulting relevance-ranked list until reaching off-topic threshold
- Stage 4: Creative searching
  - Annotators instructed to use specialized knowledge, think creatively to find novel ways to identify additional OT stories
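The off-topic threshold used as the stopping rule in these stages combines two conditions, which can be sketched as a small check over the labels assigned so far. This is a minimal illustration, assuming the labels are kept as a list of booleans in reading order (True for YES); the function and parameter names are hypothetical and not taken from LDC's annotation tools.

```python
from typing import Sequence


def reached_off_topic_threshold(
    labels: Sequence[bool], ratio: int = 2, tail: int = 10
) -> bool:
    """Check the off-topic stopping rule on the stories read so far.

    labels[i] is True if the i-th story read was labeled YES (on-topic)
    and False if it was labeled NO (off-topic).
    """
    on_topic = sum(1 for label in labels if label)
    off_topic = len(labels) - on_topic
    # Condition 1: at least `ratio` off-topic stories per on-topic story read.
    ratio_met = off_topic >= ratio * on_topic
    # Condition 2: the last `tail` consecutive stories are all off-topic.
    tail_met = len(labels) >= tail and not any(labels[-tail:])
    return ratio_met and tail_met


if __name__ == "__main__":
    # 3 on-topic stories followed by 10 off-topic ones: both conditions hold.
    print(reached_off_topic_threshold([True] * 3 + [False] * 10))
    # 6 on-topic stories followed by 10 off-topic ones: the ratio is not met.
    print(reached_off_topic_threshold([True] * 6 + [False] * 10))
```

Both conditions must hold at once, so a long off-topic tail alone does not end the search if many on-topic stories were found earlier in the list.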
12
Additional Annotation & QC
- Top-Ranked Off-Topic Stories (TROTS)
  - Define search epoch
    - First 4 on-topic stories chronologically sorted
  - Find two highly-ranked off-topic documents for each topic-language
- Precision
  - All on-topic (YES) stories reviewed by senior annotator to identify false alarms
  - All “not easy” off-topic stories reviewed
- Adjudication
  - Review pooled site results and adjudicate cases of disagreement with LDC annotators’ judgments
  - Pooled 3 sites’ tracking results
  - Reviewed all purported LDC FAs
  - For purported LDC Misses
    - English and Arabic: reviewed cases where all 3 sites disagreed with LDC
    - Mandarin: reviewed cases where 2 or more sites disagreed with LDC
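The adjudication criteria can be read as a per-story filter over the pooled site decisions. The sketch below is one hedged interpretation: the names are hypothetical, and treating a "purported LDC FA" as any story LDC labeled YES that at least one site rejected is an assumption, since the slide only says that all purported FAs were reviewed.

```python
from typing import Sequence

# Number of sites that must say YES before a purported LDC miss is reviewed.
MISS_THRESHOLD = {"English": 3, "Arabic": 3, "Mandarin": 2}


def needs_adjudication(ldc_yes: bool, site_yes: Sequence[bool], language: str) -> bool:
    """Decide whether one (topic, story) pair is sent for adjudication.

    ldc_yes:  LDC's label (True = YES / on-topic).
    site_yes: one decision per pooled site (True = on-topic).
    """
    if ldc_yes:
        # Purported LDC false alarm. Assumption: any site disagreement counts.
        return not all(site_yes)
    # Purported LDC miss: enough sites must disagree with LDC's NO.
    return sum(1 for s in site_yes if s) >= MISS_THRESHOLD[language]


if __name__ == "__main__":
    # Two of three sites call the story on-topic, LDC said NO.
    print(needs_adjudication(False, [True, True, False], "Mandarin"))
    print(needs_adjudication(False, [True, True, False], "English"))
```

The lower Mandarin threshold means the same pattern of site decisions can trigger review for a Mandarin story but not for an English or Arabic one.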