Integration of Information Extraction with an Ontology
M. Vargas-Vera, J. Domingue, Y. Kalfoglou, E. Motta and S. Buckingham Shum


Introduction
Ontology -> Information Extractor
English text (NLP)
Group of tools in their IE system:
–KMi Ontology
–From UMass: Marmot, Crystal, Badger
–OCML preprocessor

Presentation Layout
Background on tool origins and area of work
Description of tool integration
Coping with ambiguity
Description of output
Population of Ontology
Future Work

UMass
University of Massachusetts Amherst
Marmot, Crystal, Badger
–Classify text by recognizing extraction patterns and semantic features associated with slots in predefined frames.

Testing Area: KMi Planet
Web-based news server
–Story Library: collections of news stories and postings
–Ontology Library: ontologies stored for use in extracting information from the story library. Uses OCML.
myPlanet uses cue-phrases defined as “research areas” to query KMi Planet through the ontology library and the information extraction tools we’re about to talk about.

The Ontology Library
40 different types of events or activities can be described by the ontology library.
Event type 3: demonstration-of-technology
technology-being-demonstrated (technology) (Info Extraction)
has-duration (duration) (30 min)
start-time (time-point) (3:30pm)
end-time (time-point) (4pm)
has-location (a place) (room 120 TMCB BYU campus)
other-agents-involved (list of person(s)) (Dr. Embley)
main-agent (list of person(s)) (Brian Goodrich)
location-at-start (a place) (room 120 TMCB BYU campus)
location-at-end (a place) (room 120 TMCB BYU campus)
medium-used (equipment) (multi-media projector, ppt)
subject-of-the-demo (title) (Integration of Information Extraction with an Ontology)
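
The frame above can be pictured as a simple record with one field per slot. The following is a minimal sketch only: the actual ontology is written in OCML, and the Python class name, field types and defaults here are illustrative assumptions. Slot names and example values are taken from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DemonstrationOfTechnology:
    """Hypothetical Python mirror of event type 3 (not the OCML definition)."""
    technology_being_demonstrated: str
    has_duration: str
    start_time: str
    end_time: str
    has_location: str
    main_agent: List[str] = field(default_factory=list)
    other_agents_involved: List[str] = field(default_factory=list)
    location_at_start: str = ""
    location_at_end: str = ""
    medium_used: List[str] = field(default_factory=list)
    subject_of_the_demo: str = ""


# Example values copied from the slide.
demo = DemonstrationOfTechnology(
    technology_being_demonstrated="Info Extraction",
    has_duration="30 min",
    start_time="3:30pm",
    end_time="4pm",
    has_location="room 120 TMCB BYU campus",
    main_agent=["Brian Goodrich"],
    other_agents_involved=["Dr. Embley"],
    location_at_start="room 120 TMCB BYU campus",
    location_at_end="room 120 TMCB BYU campus",
    medium_used=["multi-media projector", "ppt"],
    subject_of_the_demo="Integration of Information Extraction with an Ontology",
)
```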

Marmot Natural Language Processor
Noun, Verb, and Prepositional Phrases
“John Domingue Wed, 15 Oct David Brown, University for Industry visits the OU.”
2 1
SUBJ(1): DAVID BROWN %COMMA% UNIVERSITY
PP (2): FOR INDUSTRY
VB (3): VISITS
OBJ1(4): THE OU
PUNC(5): %PERIOD%
1 1
SUBJ(1): JOHN DOMINGUE
ADVP(2):
PUNC(3): %PERIOD%
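
To make the segment listing easier to read, here is a small sketch that splits lines of that shape into (tag, index, text) triples. It assumes the line layout is exactly as shown on the slide, one segment per line. The real Marmot output format may differ, and the function name `parse_marmot_lines` is hypothetical.

```python
import re
from typing import List, Tuple

# Assumed layout: TAG, optional space, "(n):", then the segment text.
SEGMENT = re.compile(r"^\s*([A-Z0-9]+)\s*\((\d+)\):\s*(.*)$")


def parse_marmot_lines(lines: List[str]) -> List[Tuple[str, int, str]]:
    """Return (tag, index, text) for every line that looks like a segment."""
    segments = []
    for line in lines:
        match = SEGMENT.match(line)
        if match:
            tag, index, text = match.groups()
            segments.append((tag, int(index), text.strip()))
    return segments


sample = [
    "SUBJ(1): DAVID BROWN %COMMA% UNIVERSITY",
    "PP (2): FOR INDUSTRY",
    "VB (3): VISITS",
    "OBJ1(4): THE OU",
    "PUNC(5): %PERIOD%",
]
parsed = parse_marmot_lines(sample)
```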

Crystal Dictionary Induction Tool
Uses keywords to annotate text with semantic tags.
Visitor ( David Brown ) Place ( the OU )
Specific-to-general, data-driven search
Relaxes constraints on initial definitions until it finds the most specific definition that covers all instances of the word in the text.
Retains results for future use
Tested on over 300 stories, 100% precision and recall
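
The relaxation idea can be illustrated with a toy example. This is not Crystal's algorithm. It is a sketch that assumes a definition is a set of constraints and an instance is the set of constraints it satisfies, so the most specific definition covering all instances is what remains after dropping every constraint some instance violates. All names and feature strings below are hypothetical.

```python
from typing import Iterable, Set


def relax_definition(initial: Set[str], instances: Iterable[Set[str]]) -> Set[str]:
    """Drop constraints until the definition covers every instance."""
    definition = set(initial)
    for features in instances:
        # Keep only the constraints this instance satisfies.
        definition &= features
    return definition


# Start from the most specific definition (taken from one instance).
initial = {"subj:person", "verb:visits", "obj:the-ou", "pp:for-industry"}
instances = [
    {"subj:person", "verb:visits", "obj:the-ou", "pp:for-industry"},
    {"subj:person", "verb:visits", "obj:kmi"},
]
relaxed = relax_definition(initial, instances)
```

The result keeps only the constraints shared by both instances, which is the sense of "specific-to-general" used on the slide.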

Badger
Matches sentences from text against concept nodes passed from Crystal.
Selects the best match by the maximum number of features matching the concept node.
Can remove irrelevant sentences from the problem set.
(Fairly certain whoever wrote this section did not speak English as a first language.)
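
The "best match by maximum number of matching features" rule can be sketched as follows. This is an assumption-laden illustration rather than Badger's implementation: features are modelled as plain strings, concept nodes as named feature sets, and a sentence with no overlap at all is treated as irrelevant. The function name and the concept node names are hypothetical.

```python
from typing import Dict, Optional, Set


def best_concept_node(
    sentence_features: Set[str],
    concept_nodes: Dict[str, Set[str]],
) -> Optional[str]:
    """Return the concept node sharing the most features, or None if none match."""
    best_name, best_score = None, 0
    for name, features in concept_nodes.items():
        score = len(sentence_features & features)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


concept_nodes = {
    "visit": {"subj:visitor", "verb:visits", "obj:place"},
    "demonstration-of-technology": {"verb:demonstrates", "obj:technology"},
}
sentence = {"subj:visitor", "verb:visits", "obj:place"}
chosen = best_concept_node(sentence, concept_nodes)
irrelevant = best_concept_node({"verb:rains"}, concept_nodes)
```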

Coping with Ambiguity
Query list of institutions
Query list of projects
Return list of institutions - no match
Return list of projects - match
No discussion of whether this was done automatically by the extractor or manually by the users.
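
The lookup order on this slide can be sketched as a pair of membership checks. The sketch assumes the ontology exposes its known institutions and projects as simple sets of names. The function name and the sample entries are placeholders, not data from the paper.

```python
from typing import Optional, Set


def resolve_type(name: str, institutions: Set[str], projects: Set[str]) -> Optional[str]:
    """Check the institution list first, then the project list."""
    if name in institutions:
        return "institution"
    if name in projects:
        return "project"
    return None  # Neither list knows this name.


# Placeholder data for illustration only.
institutions = {"Example University"}
projects = {"Example Project"}
kind = resolve_type("Example Project", institutions, projects)
```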

OCML Code Translator (Operational Conceptual Modeling Language)
Tokenise Badger output, find the corresponding CN (concept node) definitions, and extract all the objects found in the story.
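
As a rough picture of the translator's last step, the sketch below turns extracted slot values into an s-expression. It assumes a `def-instance` style form with a list of (slot value) pairs. The exact OCML syntax should be checked against the OCML documentation, and the class name, slot names and helper functions here are hypothetical.

```python
from typing import Dict


def to_symbol(text: str) -> str:
    """Lower-case a phrase and join its words with hyphens."""
    return "-".join(text.lower().split())


def emit_instance(name: str, cls: str, slots: Dict[str, str]) -> str:
    """Build an assumed OCML-like instance definition as a string."""
    slot_text = " ".join(
        f"({slot} {to_symbol(value)})" for slot, value in slots.items()
    )
    return f"(def-instance {to_symbol(name)} {cls} ({slot_text}))"


# Values taken from the Crystal slide; the class and slot names are made up.
code = emit_instance(
    "visit 1",
    "visit-event",
    {"has-visitor": "David Brown", "has-place": "the OU"},
)
```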

Ontology Maintenance
Use Badger (lexicon) and Crystal (concept) output to automatically update the Ontology library whenever a new story is added to the Story library.
Some cannot be automatically updated:
–There is not enough information in the story
–There is no current template to match with the sentence concepts.

Conclusion
IE system created using Marmot, Crystal, Badger and the OCML translator.
Obtained good results on KMi stories.
Assessment
Sporadic periods of quality technical writing, interspersed with nearly impenetrable English
A borrowing of tools, translated to OCML and ported for KMi

Future Work
Deriving the type of an object when it does not match a predefined template.
Automatic creation of new classes and subclasses.
Using this IE tool in other domains (need new training data?)
Trying out a new Machine Learning algorithm in Crystal and comparing performance.
Using the IE tool with hypertext.
Saving Badger’s output in XML.
Creating a more visual GUI for the ontologies.