1
TDT PI Meeting - November 16-17, 2000
Annotation Overview
Background
  annotation strategy
    search-guided complete annotation
    work with one topic at a time
    multiple stages for each topic
  annotation resources
    definition/explication
    rules of interpretation
    topic research
      larger role than in 1999
      fed directly into topic annotation
2
Annotation Strategy
STAGE 1: Initial Query
  submit all known on-topic (OT) stories as query to search engine
    OT stories revealed during topic selection, definition & research
  read through resulting relevance-ranked list, annotating all stories as YES/BRIEF/NO
  stop after 5-10 additional on-topic stories identified; or
  after reaching “off-topic threshold”:
    at least 2 off-topic stories for every 1 on-topic read AND
    the last 10 consecutive stories off-topic
  This was possible for 51 English, 34 Mandarin topics
  If no pre-existing OT stories, go directly to text-based query (Stage 3)
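The Stage 1 stopping rule can be written as a small check over the labels assigned so far. This is a minimal sketch, not the tool the annotators used. The function names are hypothetical, and so is the handling of BRIEF: the slides do not say how BRIEF counts toward the threshold, so here it is assumed to count as neither on-topic nor off-topic and to break a trailing run of NO. The slides give a range of 5-10 additional on-topic stories, so the limit is left as a required parameter.

```python
from typing import Sequence


def off_topic_threshold_reached(
    labels: Sequence[str], ratio: int = 2, run_length: int = 10
) -> bool:
    """Return True when both parts of the off-topic threshold hold.

    labels: annotations in reading order, each "YES", "BRIEF" or "NO".
    Assumption: only "YES" counts as on-topic and only "NO" as off-topic.
    """
    on_topic = sum(1 for label in labels if label == "YES")
    off_topic = sum(1 for label in labels if label == "NO")

    # Part 1: at least `ratio` off-topic stories for every on-topic story read.
    ratio_met = off_topic >= ratio * on_topic

    # Part 2: the last `run_length` consecutive stories were all off-topic.
    tail = labels[-run_length:]
    run_met = len(labels) >= run_length and all(label == "NO" for label in tail)

    return ratio_met and run_met


def stage1_should_stop(labels: Sequence[str], new_on_topic_limit: int) -> bool:
    """Stop after enough additional on-topic stories, or at the threshold.

    labels covers only the stories read from the ranked list, so every
    "YES" in it is an additional on-topic story.
    """
    new_on_topic = sum(1 for label in labels if label == "YES")
    return new_on_topic >= new_on_topic_limit or off_topic_threshold_reached(labels)
```

With three YES stories followed by ten NO stories both conditions hold (10 >= 2 * 3, and the last ten are NO), so the check returns True. With six YES stories followed by ten NO stories the ratio fails (10 < 12) and reading continues.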
3
Annotation Strategy
STAGE 2: Improved Queries Based on Additional On-Topic Stories
  issue a new query using a concatenation of all known on-topic stories
  read and annotate stories in resulting relevance-ranked list until reaching “off-topic threshold”
  minimum of one docno search for all topics with 1+ hit
    English - maximum of 15
    Mandarin - maximum of 20
STAGE 3: Initial Text-based Queries
  issue a new query using the topic research document plus any additional relevant text (e.g., parts of the topic explication)
  read and annotate stories in resulting relevance-ranked list until reaching “off-topic threshold”
  minimum of one text search per topic
    English - maximum of 9
    Mandarin - maximum of 14
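The Stage 2 loop (re-query with the concatenation of all known on-topic stories, then read the ranked list until a stop rule fires) might look like the sketch below. Everything here is illustrative. The slides do not name the search engine, so a toy token-overlap ranker stands in for it. `judge` (the annotator's YES/BRIEF/NO decision) and `should_stop` (the off-topic threshold) are hypothetical callables supplied by the caller. Ending the rounds when a search yields no new on-topic story is also an assumption: the slides give only the minimum and maximum number of searches per topic.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Set, Tuple


def build_query(on_topic_texts: Iterable[str]) -> str:
    """Concatenate all known on-topic stories into a single query."""
    return "\n".join(on_topic_texts)


def rank_by_overlap(query: str, corpus: Dict[str, str]) -> List[str]:
    """Toy stand-in for a search engine: rank docnos by shared-token count."""
    query_counts = Counter(query.lower().split())

    def score(text: str) -> int:
        story_counts = Counter(text.lower().split())
        return sum(min(query_counts[tok], n) for tok, n in story_counts.items())

    # Highest score first; ties are broken by docno so the order is deterministic.
    return sorted(corpus, key=lambda docno: (-score(corpus[docno]), docno))


def run_docno_searches(
    corpus: Dict[str, str],
    seed_on_topic: Set[str],
    judge: Callable[[str], str],
    should_stop: Callable[[List[str]], bool],
    max_rounds: int,
) -> Tuple[Dict[str, str], Set[str]]:
    """Repeat query -> read -> annotate until a round finds nothing new."""
    known = set(seed_on_topic)
    labels: Dict[str, str] = {}
    for _ in range(max_rounds):
        query = build_query(corpus[docno] for docno in sorted(known))
        unread = [
            docno
            for docno in rank_by_overlap(query, corpus)
            if docno not in known and docno not in labels
        ]
        read_this_round: List[str] = []
        found_new = False
        for docno in unread:
            label = judge(docno)
            labels[docno] = label
            read_this_round.append(label)
            if label == "YES":
                known.add(docno)
                found_new = True
            if should_stop(read_this_round):
                break
        if not found_new:  # assumed termination rule, not stated on the slides
            break
    return labels, known
```

The point of the second and later rounds is that a story sharing no vocabulary with the seed stories can still rank highly once a newly found on-topic story is added to the query.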
4
Annotation Strategy
STAGE 4: Creative Searching
Instructions to Annotators:
  You are encouraged to use your specialized knowledge (drawn from topic research and the known on-topic stories) to conduct additional manual searches through the corpus. These additional searches will be based on keywords, names, particular on-topic stories, etc. Think creatively! If you come up with a novel way to search for additional on-topic stories, let us know. If you find additional information (names, places, dates, events) about your topic, you should revise the topic research page for that topic.
Examples of Creative Searching
  Topic 10: European Cold Wave
    Annotator comments: “In annotating this topic I had to go beyond the regular parameters. It was apparent that there were YES stories remaining beyond the “no threshold”. Many of the intervening NO stories were CNN weather reports that had nothing to do with the topic. So I did extra text searches and concentrated on stories within a particular timeframe to find additional hits.”
  Topic 42: New Paris Subway Line
    No pre-existing OT stories
    Annotator searched WWW for topic
    Used content of story not within TDT3 collection as query
5
Mandarin Hits vs. Stories Read
6
English Hits vs. Stories Read
7
English Hits vs. Stories Read
  Annotators were permitted to ignore part of the “off-topic threshold” for topics with 50+ hits...
8
English Hits vs. Stories Read
  Annotators were permitted to ignore part of the “off-topic threshold” for topics with 50+ hits...
  …but this one didn’t.
9
Annotation Statistics