The TERN Task EVALITA 2007 Valentina Bartalesi Lenzi & Rachele Sprugnoli www.celct.it.

Outline
Introduction to the Temporal Expression Recognition and Normalization Task
Participants
Evaluation and results:
– dataset
– metrics
– systems’ results
Conclusion
EVALITA 2007 Workshop Rome, September 10, 2007

Introduction to the TERN Task (1)
Goal: recognize and normalize Temporal Expressions (TEs) in Italian natural language texts
Two subtasks:
– Recognition only
– Recognition and Normalization
Recognition: detecting the TEs occurring in the source data by identifying their extent
Normalization: representing the meaning of each TE by assigning values to a pre-defined set of attributes

Introduction to the TERN Task (2)
Annotation specifications: TIMEX2 mark-up standard (see the Automatic Content Extraction program, ACE), with adaptations to Italian
Markable TEs:
– absolute expressions, e.g. 10 settembre 2007/September 10th 2007
– relative expressions, e.g. ieri/yesterday
– durations, e.g. due settimane/two weeks
– sets of times, e.g. ogni mese/every month
– underspecified TEs, e.g. per lungo tempo/for a long time
– culturally-determined expressions, e.g. anno scolastico/school year
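The two subtasks can be illustrated with a toy sketch. It is not any participant's system and covers only four of the example expressions listed above. The function name `recognize_and_normalize`, the regular expressions, and the small word lists are assumptions made for illustration. The attribute names `VAL` and `SET` follow the TIMEX2 standard, but the values shown are a simplified reading of it.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately tiny lexicons (illustration only).
MONTHS = {"gennaio": 1, "febbraio": 2, "marzo": 3, "aprile": 4,
          "maggio": 5, "giugno": 6, "luglio": 7, "agosto": 8,
          "settembre": 9, "ottobre": 10, "novembre": 11, "dicembre": 12}
NUMBERS = {"due": 2, "tre": 3, "quattro": 4}
UNITS = {"giorni": "D", "settimane": "W", "mesi": "M", "anni": "Y"}

ABSOLUTE = re.compile(r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b", re.I)
RELATIVE = re.compile(r"\bieri\b", re.I)
DURATION = re.compile(r"\b(" + "|".join(NUMBERS) + r") (" + "|".join(UNITS) + r")\b", re.I)
SET_EXPR = re.compile(r"\bogni mese\b", re.I)


def recognize_and_normalize(text, anchor):
    """Return (start, end, surface, attributes) tuples, ordered by position.

    Recognition is the (start, end) extent; normalization is the attribute
    dictionary. `anchor` is the document date used to resolve "ieri".
    """
    found = []
    for m in ABSOLUTE.finditer(text):
        try:
            value = date(int(m.group(3)), MONTHS[m.group(2).lower()],
                         int(m.group(1))).isoformat()
        except ValueError:
            continue  # impossible calendar date, e.g. "31 febbraio 2007"
        found.append((m.start(), m.end(), m.group(0), {"VAL": value}))
    for m in RELATIVE.finditer(text):
        value = (anchor - timedelta(days=1)).isoformat()
        found.append((m.start(), m.end(), m.group(0), {"VAL": value}))
    for m in DURATION.finditer(text):
        value = f"P{NUMBERS[m.group(1).lower()]}{UNITS[m.group(2).lower()]}"
        found.append((m.start(), m.end(), m.group(0), {"VAL": value}))
    for m in SET_EXPR.finditer(text):
        found.append((m.start(), m.end(), m.group(0),
                      {"VAL": "XXXX-XX", "SET": "YES"}))
    return sorted(found, key=lambda item: item[0])


sample = "Il workshop è il 10 settembre 2007; ieri pioveva per due settimane, ogni mese."
for start, end, surface, attrs in recognize_and_normalize(sample, date(2007, 9, 10)):
    print(start, end, surface, attrs)
```

A system for the Recognition only subtask would stop at the extents, while the Recognition + Normalization subtask also requires the attribute values.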

Participants
We had 4 participants:
– FBK-irst, Trento (FBKirst_Negri_TIME)
– University of Alicante (UniAli_Puchol_TIME)
– University of Alicante (UniAli_Saquete_TIME)
– University of Perugia (UniPg_Faina_TIME)
3 teams participated in the Recognition + Normalization subtask (FBKirst_Negri_TIME, UniAli_Saquete_TIME, UniPg_Faina_TIME)
1 team participated in the Recognition only subtask (UniAli_Puchol_TIME)

I-CAB (Italian Content Annotation Bank)
525 news stories from the Italian local newspaper “L’Adige”
4 days:
– 7-8 September 2004
– 7-8 October 2004
5 categories:
– News Stories
– Cultural News
– Economic News
– Sports News
– Local News
,564 words

I-CAB (2)
2 sections: training (335 news stories) and test (190 news stories)
# TEs = 4,603
# TEs – Training = 2,931
# TEs – Test = 1,672
Format:
– SGML files containing the source text
– APF (ACE Program Format) files containing the annotation

Evaluation Metrics (1)
Scoring used the TERN part of the ACE scorer, with some adaptations to the attribute weights for the Recognition + Normalization subtask
The final ranking is based on the TERN value score
We also provided the following measures:
– Precision
– Recall
– F-measure
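The three additional measures can be sketched as follows. This is a minimal illustration that assumes exact matching of TE extents, which is not necessarily how the ACE scorer aligns system and reference annotations, and it does not reproduce the TERN value score. The function name `precision_recall_f` and the span offsets are invented for the example.

```python
def precision_recall_f(gold_spans, system_spans):
    """Precision, recall and balanced F-measure under exact span matching.

    Both arguments are sets of (start, end) character offsets.
    """
    correct = len(gold_spans & system_spans)
    precision = correct / len(system_spans) if system_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Invented offsets: 4 reference TEs, 3 system TEs, 2 of them exact matches.
gold = {(0, 5), (10, 14), (20, 30), (40, 45)}
system = {(0, 5), (10, 14), (21, 30)}
p, r, f = precision_recall_f(gold, system)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")  # P=0.667 R=0.500 F=0.571
```

Here the system span (21, 30) counts as an error because its start offset differs from the reference; a scorer that credits partial overlap would give different figures.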

Systems’ results (1)
Results for the Recognition only subtask

Systems’ results (2)
Results for the Recognition + Normalization subtask

Conclusion
This is the first time I-CAB has been released and used as a benchmark for an Information Extraction task like TERN
Participation was in line with expectations for a new and relatively difficult task
We hope that the resources we developed and the results we obtained will encourage other teams to participate in future evaluation exercises
We also hope that this initiative will become a regular event, as happens with similar international evaluation campaigns (e.g. ACE)