Automating Generation of Textual Class Definitions from OWL to English Robert Stevens, James Malone, Sandra Williams, Richard Power.

Summary
- Motivation
- Use Case
- Methods and Description Generator
- Results
- Evaluation
- Open Questions (still)

Motivation
Textual definitions:
- are a cornerstone of good practice in ontology delivery
- are a requirement of the OBO process
- are hard work to produce
Logical definitions:
- make meaning explicit to the computer
- help with maintenance of the ontology’s structure, querying, and so on
- are also hard to produce, and more difficult to understand
The information in one form should reflect the information in the other, so textual and logical definitions need to be kept synchronised.
Aim: to produce fluent textual definitions from logical definitions/descriptions in OWL.

OWL Smackdown: Computer vs Human

Our Hypotheses
- Text = humans
- Logical = computers (and future human-computer hybrids)
- Textual definitions ≈ logical definitions
- Textual definitions tend to be more lossy than logical ones (cardinalities are often dropped, specific roles not mentioned, etc.)
- Logical definitions are often more explicit than natural language and should therefore contain sufficient content to produce a textual definition.

EFO Use Case
- The Experimental Factor Ontology (EFO) is an application ontology that consumes domain ontologies to satisfy specific application-focused use cases
- Primarily gene expression data from EBI

Gene Expression Atlas

Related Work
- Generating descriptions from ontologies is often called ‘ontology verbalisation’
- A number of approaches are concerned only with ABox verbalisation (Hielkema 2009; Galanis and Androutsopoulos, 2007)
- Others produce only separate sentences, one for each OWL axiom (Kaljurand, 2007)
- Our approach has much in common with these but differs in two ways:
  - only a subset of OWL is considered (the simple description logic EL++)
  - instead of realising axioms in isolation, we apply some rules for organisation and aggregation to give a more natural feel

Method Overview
- An OWL ontology is just a “pile of axioms”
- We can produce individual sentences based on a grammar that guides the transformation from OWL to English (or another natural language)
- Need to group sentences (group axioms with the same subject together)
- Need to aggregate axioms (collapse axioms with the same relationship together)
- Once grouped and aggregated, a paragraph of text can be produced sentence by sentence.
Example:
  hasPart some leg
  hasPart some body
  hasPart some head
  => Has parts leg, body and head
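The grouping and aggregation steps can be illustrated with a minimal Python sketch. This is not the authors' implementation (which is described as Prolog-based); the tuple encoding, the subject name `example_class`, the `TEMPLATES` table and all function names are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical in-memory form of existential restrictions:
# (subject class, property, filler class). The subject name is a
# placeholder; the slide does not say which class these axioms describe.
axioms = [
    ("example_class", "hasPart", "leg"),
    ("example_class", "hasPart", "body"),
    ("example_class", "hasPart", "head"),
]

def group_and_aggregate(axioms):
    """Group axioms by subject, then collapse those sharing a property."""
    grouped = defaultdict(lambda: defaultdict(list))
    for subject, prop, filler in axioms:
        grouped[subject][prop].append(filler)
    return {subject: dict(props) for subject, props in grouped.items()}

def join_list(items):
    """Join items as 'a, b and c' (no serial comma, as in the slide example)."""
    if len(items) == 1:
        return items[0]
    return ", ".join(items[:-1]) + " and " + items[-1]

# Hypothetical phrase templates, one per property.
TEMPLATES = {"hasPart": "Has parts {}"}

def realise(prop, fillers):
    """Produce one sentence fragment for an aggregated property."""
    return TEMPLATES[prop].format(join_list(fillers))

aggregated = group_and_aggregate(axioms)
sentence = realise("hasPart", aggregated["example_class"]["hasPart"])
print(sentence)  # Has parts leg, body and head
```

Grouping first and aggregating second means each subject yields one paragraph and each property within it yields one sentence, rather than one sentence per axiom.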

Processing stages
- Transcode OWL/XML to Prolog
- Construct a lexicon for atomic entities (next slide)
- Group axioms by atomic entity
- Aggregate axioms with similar structure
- Generate sentences from aggregated axioms.
Example:
  class(animal).
  subClassOf(class(cat), class(animal)).
  subClassOf(class(dog), class(animal)).
  =>
  class(animal).
  subClassOf([class(cat), class(dog)], class(animal)).
  =>
  ANIMAL. A cat and a dog are both kinds of animals.
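The subclass example can be traced end to end in a short Python sketch. It mirrors the Prolog facts as tuples, which is an assumption for illustration; the function names are hypothetical, the plural form is passed in by hand rather than looked up in a lexicon, and the wording for one child is a guess (the slides only show the two-child and many-child forms).

```python
from collections import defaultdict

# Hypothetical tuple encoding of the slide's Prolog facts:
# ("subClassOf", child, parent).
axioms = [
    ("subClassOf", "cat", "animal"),
    ("subClassOf", "dog", "animal"),
]

def aggregate_subclasses(axioms):
    """Collect all children that share the same parent class."""
    children_of = defaultdict(list)
    for kind, child, parent in axioms:
        if kind == "subClassOf":
            children_of[parent].append(child)
    return dict(children_of)

def indefinite(noun):
    """Prefix a noun with 'a' or 'an' (a naive vowel test)."""
    article = "an" if noun[0].lower() in "aeiou" else "a"
    return f"{article} {noun}"

def verbalise_subclasses(parent, children, parent_plural):
    """Realise one aggregated subClassOf axiom as an English sentence."""
    names = [indefinite(child) for child in children]
    if len(names) == 1:
        # Assumed wording; not shown on the slides.
        first = names[0]
        return f"{first[0].upper()}{first[1:]} is a kind of {parent}."
    if len(names) == 2:
        first = names[0]
        return (f"{first[0].upper()}{first[1:]} and {names[1]} "
                f"are both kinds of {parent_plural}.")
    listed = ", ".join(names[:-1]) + ", and " + names[-1]
    return f"The following are kinds of {parent_plural}: {listed}."

aggregated = aggregate_subclasses(axioms)
sentence = verbalise_subclasses("animal", aggregated["animal"], "animals")
print("ANIMAL.", sentence)  # ANIMAL. A cat and a dog are both kinds of animals.
```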

Description Generator
- Input: OWL/XML ontology
- Output: text describing atomic entities
- generation from label/URL
It is assumed that the syntax of each phrase will be severely constrained as follows:
- individuals are expressed by proper names
- classes by common nouns (with singular and plural forms)
- properties by transitive verbs (simple or compound) with slots for a subject and an object.
Example output:
  ANIMAL. The following are kinds of animals: a cat, a duck, a giraffe, a person, a sheep, and a tiger. An animal eats a thing. If X has as pet Y then necessarily Y is an animal.
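One way a noun entry might be derived from a class label is sketched below in Python. The normalisation rules are inferred from the generated text shown in the results (for example HeLa rendered as "he la") and are an assumption, not the authors' documented procedure; the function names, the naive "append s" plural, and the identifier `EX_0001` are hypothetical.

```python
import re

def label_to_noun(label):
    """Turn an ontology label into a lower-case common noun phrase."""
    # Insert a space at lower-to-upper case boundaries ("HeLa" -> "He La").
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", label)
    # Treat hyphens and underscores as word separators.
    spaced = re.sub(r"[-_]+", " ", spaced)
    return " ".join(spaced.lower().split())

def naive_plural(noun):
    """Append 's'; an assumption that gives odd forms such as '22rv1s'."""
    return noun + "s"

def lexicon_entry(class_id, label):
    """Build a (class id, category, singular, plural) entry for a class."""
    singular = label_to_noun(label)
    return (class_id, "noun", singular, naive_plural(singular))

# "EX_0001" is a made-up identifier used only for this example.
entry = lexicon_entry("EX_0001", "cell line")
print(entry)  # ('EX_0001', 'noun', 'cell line', 'cell lines')
```

Deriving both forms mechanically keeps the lexicon cheap to build, at the cost of awkward plurals for identifiers that are not really English nouns.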

Results

Class label: 22rv1
OWL axioms (Manchester syntax): bearer_of some 'prostate carcinoma'; derives_from some 'Homo sapiens'; derives_from some prostate
Natural language definition extracted: A 22rv1 is a cell line. A 22rv1 is all of the following: something that is bearer of a prostate carcinoma, something that derives from a homo sapiens, and something that derives from a prostate.

Class label: HeLa
OWL axioms (Manchester syntax): bearer_of some 'cervical carcinoma'; derives_from some 'Homo sapiens'; derives_from some cervix; derives_from some 'epithelial cell'
Natural language definition extracted: A he la is a cell line. A he la is all of the following: something that is bearer of a cervical carcinoma, something that derives from a homo sapiens, something that derives from an epithelial cell, and something that derives from a cervix.

Class label: Ara-C-resistant murine leukemia
OWL axioms (Manchester syntax): has subclass b117h*; has subclass b140h*
Natural language definition extracted: A ara c resistant murine leukemia is a cell line. A b117h, and a b140h are kinds of ara c resistant murine leukemias.

Class label: GM18507
OWL axioms (Manchester syntax): derives_from some 'Homo sapiens'; derives_from some lymphoblast; has_quality some male
Natural language definition extracted: A gm18507 is all of the following: something that has as quality a male, something that derives from a homo sapiens, and something that derives from a lymphoblast.

*axioms placed on subclasses
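The "all of the following" pattern in these definitions can be reproduced with a small Python sketch. The property phrasings are copied from the generated text above; everything else (the tuple encoding, function names, the wording for fewer than three restrictions, and keeping the input order of axioms) is an assumption for illustration, and the ordering used by the actual generator is not stated in the slides.

```python
# Property phrasings as they appear in the generated definitions.
PROPERTY_PHRASES = {
    "bearer_of": "is bearer of",
    "derives_from": "derives from",
    "has_quality": "has as quality",
}

def indefinite(noun):
    """Prefix a noun with 'a' or 'an' (a naive vowel test)."""
    article = "an" if noun[0].lower() in "aeiou" else "a"
    return f"{article} {noun}"

def verbalise_restrictions(subject, restrictions):
    """Realise 'P some C' restrictions on one class as a single sentence.

    restrictions is a list of (property, filler label) pairs, kept in
    input order.
    """
    parts = [
        f"something that {PROPERTY_PHRASES[prop]} {indefinite(filler.lower())}"
        for prop, filler in restrictions
    ]
    head = indefinite(subject)
    head = head[0].upper() + head[1:]
    if len(parts) == 1:
        # Assumed wording; the slides only show lists of three or more.
        return f"{head} is {parts[0]}."
    if len(parts) == 2:
        listed = " and ".join(parts)
    else:
        listed = ", ".join(parts[:-1]) + ", and " + parts[-1]
    return f"{head} is all of the following: {listed}."

definition = verbalise_restrictions(
    "22rv1",
    [
        ("bearer_of", "prostate carcinoma"),
        ("derives_from", "Homo sapiens"),
        ("derives_from", "prostate"),
    ],
)
print(definition)
```

For the 22rv1 axioms this yields the second sentence of the definition shown above; the first sentence ("A 22rv1 is a cell line.") would come from the class's subClassOf axiom.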

Results
- Online survey of ontology users at EBI
- 10 of the 50 verbalisations were evaluated, selected to cover the widest range of axioms
[Figure: Total Judgement]

Findings
- A dodgy class was found: the definition for Ara-C-resistant murine leukemia indicated that subclasses b117h and b140h are types of it, implying that they were diseases rather than cell lines
- This user group wanted simplicity of language: avoid ontological formality, e.g. bearer of
- This applies especially to property names for qualities, e.g. has as quality male
- Initial verbalisation making semantics clear was not liked
- Plural forms are occasionally an issue:
  lex(class(EFO_ ), noun, 'cell line', 'cell lines').
  lex(class(EFO_ ), noun, '22rv1', '22rv1s').

Conclusion
- Initial results were largely well received and considered useful in most cases
- Discovery of an incorrect class definition demonstrates potential as a tool for class validation
- Users preferred text definitions that are ‘clear and simple’ over ‘precise and complex’
- Dependent entities could become adjectival forms of the independent entities in which they inhere (cell has quality female becomes female cell)
- Formal relations/class labels reduce understanding and should be brought closer to domain language
- Many ontologies are not amenable to text mining; this is an important use case neglected by most
- Definitions are now being imported into EFO

Next Steps
- Systematic study of acceptable wordings
- Different wording styles for different users
- Adjectival forms for qualities etc.; the role of an upper level ontology
- Moving beyond EL++
- Parsing for OBO

Next Steps: Round Tripping

Open Questions
- Should textual descriptions ≡ logical descriptions?
- Are discrepancies acceptable?

Acknowledgements
Sandra Williams, Richard Power and Robert Stevens are funded by the SWAT project (EPSRC grants EP/G033579/1 and EP/G032459/1); James Malone is funded by EMBL and EMERALD (project number LSHG-CT ). We would like to thank the members of the EBI’s ontology interest group, functional genomics group and Dr Helen Parkinson for comments and survey participation.