Chapter 7A Semantic Web Primer 1 Chapter 7 Ontology Engineering Grigoris Antoniou Frank van Harmelen.

Presentation transcript:

Chapter 7A Semantic Web Primer 1 Chapter 7 Ontology Engineering Grigoris Antoniou Frank van Harmelen

Chapter 7A Semantic Web Primer 2 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Chapter 7A Semantic Web Primer 3 Methodological Questions – How can tools and techniques best be applied? – Which languages and tools should be used in which circumstances, and in which order? – What about issues of quality control and resource management? Many of these questions for the Semantic Web have been studied in other contexts – E.g. software engineering, object-oriented design, and knowledge engineering

Chapter 7A Semantic Web Primer 4 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Chapter 7A Semantic Web Primer 5 Main Stages in Ontology Development 1. Determine scope 2. Consider reuse 3. Enumerate terms 4. Define taxonomy 5. Define properties 6. Define facets 7. Define instances 8. Check for anomalies Not a linear process!

Chapter 7A Semantic Web Primer 6 Determine Scope There is no correct ontology of a specific domain – An ontology is an abstraction of a particular domain, and there are always viable alternatives What is included in this abstraction should be determined by – the use to which the ontology will be put – future extensions that are already anticipated

Chapter 7A Semantic Web Primer 7 Determine Scope (2) Basic questions to be answered at this stage are: – What is the domain that the ontology will cover? – For what are we going to use the ontology? – For what types of questions should the ontology provide answers? – Who will use and maintain the ontology?

Chapter 7A Semantic Web Primer 8 Consider Reuse With the spreading deployment of the Semantic Web, ontologies will become more widely available We rarely have to start from scratch when defining an ontology – There is almost always an ontology available from a third party that provides at least a useful starting point for our own ontology

Chapter 7A Semantic Web Primer 9 Enumerate Terms Write down in an unstructured list all the relevant terms that are expected to appear in the ontology – Nouns form the basis for class names – Verbs (or verb phrases) form the basis for property names Traditional knowledge engineering tools (e.g. laddering and grid analysis) can be used to obtain – the set of terms – an initial structure for these terms

Chapter 7A Semantic Web Primer 10 Define Taxonomy Relevant terms must be organized in a taxonomic hierarchy – Opinions differ on whether it is more efficient/reliable to do this in a top-down or a bottom-up fashion Ensure that the hierarchy is indeed a taxonomy: – If A is a subclass of B, then every instance of A must also be an instance of B (compatible with the semantics of rdfs:subClassOf)
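
The subclass condition above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the class names and the `SUBCLASS_OF` dictionary are hypothetical, and the closure only mimics what the rdfs:subClassOf semantics entail.

```python
# Hypothetical toy hierarchy: class -> set of direct superclasses.
SUBCLASS_OF = {
    "AssociateProfessor": {"AcademicStaff"},
    "AcademicStaff": {"StaffMember"},
    "StaffMember": set(),
}

def superclasses(cls, hierarchy):
    """Return every class reachable from cls via subclass links (transitive closure)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in hierarchy.get(current, set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def inferred_types(asserted_type, hierarchy):
    """An instance of A is also an instance of every superclass of A."""
    return {asserted_type} | superclasses(asserted_type, hierarchy)

types = inferred_types("AssociateProfessor", SUBCLASS_OF)
```

If a term placed under a parent class would make one of these inferred memberships false in the domain (for example, a part-of link modelled as a subclass link), the hierarchy is not a taxonomy.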

Chapter 7A Semantic Web Primer 11 Define Properties Often interleaved with the previous step The semantics of subClassOf demands that whenever A is a subclass of B, every property statement that holds for instances of B must also apply to instances of A – It makes sense to attach properties to the highest class in the hierarchy to which they apply

Chapter 7A Semantic Web Primer 12 Define Properties (2) While attaching properties to classes, it makes sense to immediately provide statements about the domain and range of these properties There is a methodological tension here between generality and specificity: – Flexibility (inheritance to subclasses) – Detection of inconsistencies and misconceptions
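
As a rough illustration of what domain and range statements buy, the following Python sketch applies RDFS-style domain/range inference to a list of triples. The property `teaches`, the class names, and the helper `infer_types` are hypothetical and chosen only for this example.

```python
# Hypothetical property declarations: property -> (domain class, range class).
PROPERTIES = {
    "teaches": ("AcademicStaff", "Course"),
}

def infer_types(triples, properties):
    """Apply RDFS-style domain/range inference to (subject, property, object) triples."""
    types = {}
    for subject, prop, obj in triples:
        if prop in properties:
            domain, range_ = properties[prop]
            # The subject must belong to the domain, the object to the range.
            types.setdefault(subject, set()).add(domain)
            types.setdefault(obj, set()).add(range_)
    return types

facts = [("alice", "teaches", "cs101")]
inferred = infer_types(facts, PROPERTIES)
```

A narrow domain such as `AcademicStaff` yields more inferred types, and so more chances to detect a misconception, while a broad domain keeps the property usable by more subclasses. That is the generality versus specificity tension in miniature.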

Chapter 7A Semantic Web Primer 13 Define Facets: From RDFS to OWL Cardinality restrictions Required values – owl:hasValue – owl:allValuesFrom – owl:someValuesFrom Relational characteristics – symmetry, transitivity, inverse properties, functional values
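
The intended meaning of the three value restrictions can be illustrated with a small Python sketch. It is a deliberate simplification: OWL uses open-world semantics, whereas these hypothetical helper functions check only the data that happens to be present (a closed-world reading).

```python
def some_values_from(values, cls, types):
    """owl:someValuesFrom-style check: at least one value is an instance of cls."""
    return any(cls in types.get(v, set()) for v in values)

def all_values_from(values, cls, types):
    """owl:allValuesFrom-style check: every value is an instance of cls."""
    return all(cls in types.get(v, set()) for v in values)

def has_value(values, required):
    """owl:hasValue-style check: the required individual is among the values."""
    return required in values

# Hypothetical data: the individuals teaching one course, with their known types.
TYPES = {"alice": {"Professor"}, "bob": {"Student"}}
taught_by = ["alice", "bob"]

at_least_one_professor = some_values_from(taught_by, "Professor", TYPES)
only_professors = all_values_from(taught_by, "Professor", TYPES)
alice_teaches = has_value(taught_by, "alice")
```

Note that the all-values check is trivially satisfied when a property has no values at all, which matches the meaning of owl:allValuesFrom.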

Chapter 7A Semantic Web Primer 14 Define Instances Filling the ontologies with such instances is a separate step Number of instances >> number of classes Thus populating an ontology with instances is not done manually – Retrieved from legacy data sources (DBs) – Extracted automatically from a text corpus
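
Retrieving instances from a legacy database might look roughly like the Python sketch below, which turns each row of a table into a type triple plus one triple per column. The table `staff`, its columns, the subject naming scheme, and the function `rows_to_triples` are all assumptions made for illustration.

```python
import sqlite3

def rows_to_triples(connection, table, key_column, class_name):
    """Turn each row into an rdf:type triple plus one triple per non-key column.

    The table name is interpolated directly, so it must come from trusted code.
    """
    cursor = connection.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{class_name.lower()}/{record[key_column]}"
        triples.append((subject, "rdf:type", class_name))
        for column, value in record.items():
            if column != key_column and value is not None:
                triples.append((subject, column, value))
    return triples

# Hypothetical legacy table held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (id INTEGER PRIMARY KEY, name TEXT, office TEXT)")
conn.execute("INSERT INTO staff VALUES (1, 'Alice', 'B12')")
triples = rows_to_triples(conn, "staff", "id", "StaffMember")
```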

Chapter 7A Semantic Web Primer 15 Check for Anomalies An important advantage of the use of OWL over RDF Schema is the possibility to detect inconsistencies – In ontology or ontology+instances Examples of common inconsistencies – incompatible domain and range definitions for transitive, symmetric, or inverse properties – cardinality properties – requirements on property values can conflict with domain and range restrictions
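
A real OWL reasoner performs these checks; the Python sketch below only hints at the idea with two hand-written tests. It assumes that class disjointness has been declared (an OWL feature not listed on the slide) and that distinct names denote distinct individuals, which OWL itself does not assume. All names in the example data are hypothetical.

```python
def find_anomalies(types, disjoint_pairs, values, max_cardinality):
    """Return descriptions of simple inconsistencies.

    types: individual -> set of classes
    disjoint_pairs: list of (class_a, class_b) declared disjoint
    values: (individual, property) -> list of values
    max_cardinality: property -> maximum number of distinct values
    """
    problems = []
    for individual, classes in sorted(types.items()):
        for a, b in disjoint_pairs:
            if a in classes and b in classes:
                problems.append(f"{individual} is in disjoint classes {a} and {b}")
    for (individual, prop), vals in sorted(values.items()):
        limit = max_cardinality.get(prop)
        count = len(set(vals))
        if limit is not None and count > limit:
            problems.append(f"{individual} has {count} values for {prop}, max {limit}")
    return problems

# Hypothetical data: alice ended up typed as a Course through a bad range.
types = {"cs101": {"Course"}, "alice": {"Course", "StaffMember"}}
disjoint = [("Course", "StaffMember")]
values = {("cs101", "taughtBy"): ["alice", "bob"]}
limits = {"taughtBy": 1}
problems = find_anomalies(types, disjoint, values, limits)
```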

Chapter 7A Semantic Web Primer 16 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Chapter 7A Semantic Web Primer 17 Existing Domain-Specific Ontologies Medical domain: Cancer ontology from the National Cancer Institute in the United States Cultural domain: – Art and Architecture Thesaurus (AAT) with 125,000 terms in the cultural domain – Union List of Artist Names (ULAN), with 220,000 entries on artists – Iconclass vocabulary of 28,000 terms for describing cultural images Geographical domain: Getty Thesaurus of Geographic Names (TGN), containing over 1 million entries

Chapter 7A Semantic Web Primer 18 Integrated Vocabularies Merge independently developed vocabularies into a single large resource E.g. Unified Medical Language System integrating 100 biomedical vocabularies – The UMLS metathesaurus contains 750,000 concepts, with over 10 million links between them The semantics of a resource that integrates many independently developed vocabularies is rather low – But very useful in many applications as starting point

Chapter 7A Semantic Web Primer 19 Upper-Level Ontologies Some attempts have been made to define very generally applicable ontologies – Not domain-specific Cyc, with 60,000 assertions on 6,000 concepts Standard Upperlevel Ontology (SUO)

Chapter 7A Semantic Web Primer 20 Topic Hierarchies Some “ontologies” do not deserve this name: – simply sets of terms, loosely organized in a hierarchy This hierarchy is typically not a strict taxonomy but rather mixes different specialization relations (e.g. is-a, part-of, contained-in) Such resources are often very useful as a starting point Example: Open Directory hierarchy, containing more than 400,000 hierarchically organized categories and available in RDF format

Chapter 7A Semantic Web Primer 21 Linguistic Resources Some resources were originally built not as abstractions of a particular domain, but rather as linguistic resources These have been shown to be useful as starting places for ontology development – E.g. WordNet, with over 90,000 word senses

Chapter 7A Semantic Web Primer 22 Ontology Libraries Attempts are currently underway to construct online libraries of online ontologies – Existing ontologies can rarely be reused without changes – Existing concepts and properties must be refined using rdfs:subClassOf and rdfs:subPropertyOf – Alternative names that are better suited to the particular domain must be introduced using owl:equivalentClass and owl:equivalentProperty – We can exploit the fact that RDF and OWL allow private refinements of classes defined in other ontologies

Chapter 7A Semantic Web Primer 23 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Chapter 7A Semantic Web Primer 24 The Knowledge Acquisition Bottleneck Manual ontology acquisition remains a time-consuming, expensive, highly skilled, and sometimes cumbersome task Machine Learning techniques may be used to alleviate – knowledge acquisition or extraction – knowledge revision or maintenance

Chapter 7A Semantic Web Primer 25 Tasks Supported by Machine Learning Extraction of ontologies from existing data on the Web Extraction of relational data and metadata from existing data on the Web Merging and mapping ontologies by analyzing extensions of concepts Maintaining ontologies by analyzing instance data Improving SW applications by observing users

Chapter 7A Semantic Web Primer 26 Useful Machine Learning Techniques for Ontology Engineering Clustering Incremental ontology updates Support for the knowledge engineer Improving large natural language ontologies Pure (domain) ontology learning

Chapter 7A Semantic Web Primer 27 Machine Learning Techniques for Natural Language Ontologies Natural language ontologies (NLOs) contain lexical relations between language concepts – They are large in size and do not require frequent updates The state of the art in NLO learning looks quite optimistic: – A stable general-purpose NLO exists – Techniques for automatically or semi-automatically constructing and enriching domain-specific NLOs exist

Chapter 7A Semantic Web Primer 28 Machine Learning Techniques for Domain Ontologies Domain ontologies provide detailed descriptions Usually they are constructed manually The acquisition of domain ontologies is still guided by a human knowledge engineer – Automated learning techniques play a minor role in knowledge acquisition – They have to find statistically valid dependencies in the domain texts and suggest them to the knowledge engineer

Chapter 7A Semantic Web Primer 29 Machine Learning Techniques for Ontology Instances Ontology instances can be generated automatically and frequently updated while the ontology remains unchanged Fits nicely into a machine learning framework Successful ML applications – Are strictly dependent on the domain ontology, or – Populate the markup without relating to any domain theory – General-purpose techniques not yet available

Chapter 7A Semantic Web Primer 30 Different Uses of Ontology Learning Ontology acquisition tasks in knowledge engineering – Ontology creation from scratch by the knowledge engineer – Ontology schema extraction from Web documents – Extraction of ontology instances from Web documents Ontology maintenance tasks – Ontology integration and navigation – Updating some parts of an ontology – Ontology enrichment or tuning

Chapter 7A Semantic Web Primer 31 Ontology Acquisition Tasks Ontology creation from scratch by the knowledge engineer – ML assists the knowledge engineer by suggesting the most important relations in the field or checking and verifying the constructed knowledge bases Ontology schema extraction from Web documents – ML takes the data and meta-knowledge (like a meta-ontology) as input and generates the ready-to-use ontology as output with the possible help of the knowledge engineer

Chapter 7A Semantic Web Primer 32 Ontology Acquisition Tasks (2) Extraction of ontology instances from Web documents – This task extracts the instances of the ontology presented in the Web documents and populates given ontology schemas – This task is similar to information extraction and page annotation, and can apply the techniques developed in these areas

Chapter 7A Semantic Web Primer 33 Ontology Maintenance Tasks Ontology integration and navigation – Deals with reconstructing and navigating in large and possibly machine-learned knowledge bases Updating some parts of an ontology that are designed to be updated Ontology enrichment or tuning – This does not change major concepts and structures but makes an ontology more precise

Chapter 7A Semantic Web Primer 34 Potentially Applicable Machine Learning Algorithms Propositional rule learning algorithms Bayesian learning – generates probabilistic attribute-value rules First-order logic rule learning Clustering algorithms – They group the instances together based on the similarity or distance measures between a pair of instances defined in terms of their attribute values
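
The clustering idea can be sketched in Python as follows. This is one possible variant (single-linkage grouping with a Euclidean distance threshold), picked for brevity rather than taken from the slides. The instance names and attribute vectors are hypothetical.

```python
import math

def cluster(instances, threshold):
    """Single-linkage clustering over attribute vectors.

    instances: name -> tuple of numeric attribute values
    Two clusters are merged while any pair of their members lies within threshold.
    """
    clusters = [[name] for name in instances]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                close = any(
                    math.dist(instances[a], instances[b]) <= threshold
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if close:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return [sorted(c) for c in clusters]

# Hypothetical instances described by two numeric attributes.
points = {"a": (0, 0), "b": (0, 1), "c": (5, 5), "d": (5, 6)}
groups = cluster(points, threshold=1.5)
```

Each resulting group is a candidate concept that a knowledge engineer could inspect and name.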

Chapter 7A Semantic Web Primer 35 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Ontology Mapping A single ontology will rarely fulfill the needs of a particular application; multiple ontologies will have to be combined This raises the problem of ontology integration (also called ontology alignment or ontology mapping) Current approaches deploy a whole host of different methods; we distinguish linguistic, statistical, structural and logical methods Chapter 7A Semantic Web Primer 36

Linguistic methods The most basic methods try to exploit the linguistic labels attached to the concepts in source and target ontology in order to discover potential matches This can be as simple as basic stemming techniques or calculating Hamming distances, or it can use specialized domain knowledge (e.g. the difference between Diabetes Mellitus type I and Diabetes Mellitus type II is not a negligible difference to be removed by a small Hamming distance) Chapter 7A Semantic Web Primer 37
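
A label-based matcher can be sketched in Python as below. The sketch swaps in Levenshtein edit distance for the Hamming distance named on the slide, because Hamming distance is only defined for strings of equal length and the diabetes labels differ in length. The function names and sample labels are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]

def candidate_matches(source_labels, target_labels, max_distance):
    """Pair up labels whose case-insensitive edit distance is small."""
    return [
        (s, t)
        for s in source_labels
        for t in target_labels
        if edit_distance(s.lower(), t.lower()) <= max_distance
    ]

matches = candidate_matches(["Author"], ["author", "Writer"], max_distance=1)
diabetes_gap = edit_distance("Diabetes Mellitus type I", "Diabetes Mellitus type II")
```

The two diabetes labels are a single edit apart, so a purely string-based matcher with any tolerance at all would propose them as a match. This is exactly the case where domain knowledge has to override the distance measure.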

Statistical Methods Some methods use instance data to determine correspondences between concepts A significant statistical correlation between the instances of a source concept and a target concept gives us reason to believe that these concepts are strongly related These approaches rely on the availability of a sufficiently large corpus of instances that are classified in both the source and the target ontologies Chapter 7A Semantic Web Primer 38
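
As a minimal sketch of the instance-based idea, the Python below scores concept pairs by the Jaccard overlap of their instance sets. Jaccard overlap is a simple stand-in chosen for this example and is not a significance test. The concept names and instance identifiers are hypothetical.

```python
def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def best_matches(source, target, min_score):
    """Score every source/target concept pair by the overlap of their instance sets."""
    results = []
    for source_name, source_instances in source.items():
        for target_name, target_instances in target.items():
            score = jaccard(source_instances, target_instances)
            if score >= min_score:
                results.append((source_name, target_name, score))
    return sorted(results, key=lambda result: -result[2])

# Hypothetical instances classified under both ontologies.
source = {"Car": {"i1", "i2", "i3"}}
target = {"Automobile": {"i1", "i2", "i3", "i4"}, "Boat": {"i5"}}
mapping = best_matches(source, target, min_score=0.5)
```

With only a handful of shared instances, a score like this is not reliable, which is why these approaches need a sufficiently large doubly classified corpus.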

Structural Methods Since ontologies have internal structure, it makes sense to exploit the graph structure of the source and the target ontologies and try to determine similarities, often in coordination with other methods – If a source concept and a target concept have similar linguistic labels, then the dissimilarity of their graph neighborhoods could be used to detect homonym problems where purely linguistic methods would falsely declare a potential mapping Chapter 7A Semantic Web Primer 39
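
The homonym check can be sketched in Python by comparing the neighbors of two identically labelled concepts. The sketch assumes, for simplicity, that neighbor labels in the two ontologies are directly comparable. The graphs and the function `neighborhood_similarity` are hypothetical.

```python
def neighborhood_similarity(concept_a, graph_a, concept_b, graph_b):
    """Overlap of the direct neighbors of two concepts (0.0 means nothing shared)."""
    neighbors_a = graph_a.get(concept_a, set())
    neighbors_b = graph_b.get(concept_b, set())
    union = neighbors_a | neighbors_b
    if not union:
        return 0.0
    return len(neighbors_a & neighbors_b) / len(union)

# Hypothetical ontologies: concept -> set of directly related concepts.
finance = {"Bank": {"Account", "Loan", "Money"}}
geography = {"Bank": {"River", "Shore", "Water"}}
retail_banking = {"Bank": {"Account", "Money", "Branch"}}

homonym_score = neighborhood_similarity("Bank", finance, "Bank", geography)
related_score = neighborhood_similarity("Bank", finance, "Bank", retail_banking)
```

Identical labels with a score of zero suggest a homonym, while a clearly positive score supports the mapping proposed by the linguistic method.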

Logical Methods These methods are the most specific to mapping ontologies, since they exploit the logical formalization of the source and target ontologies A serious limitation of this approach is that many practical ontologies are semantically rather lightweight and thus don’t carry much logical formalism with them Chapter 7A Semantic Web Primer 40

Ontology-Mapping Techniques Conclusion Although there is much potential, and indeed need, for these techniques to be deployed for Semantic Web engineering, this is far from a well-understood area No off-the-shelf techniques are currently available, and it is not clear that this is likely to change in the near future Chapter 7A Semantic Web Primer 41

Chapter 7A Semantic Web Primer 42 Lecture Outline 1. Introduction 2. Constructing Ontologies Manually 3. Reusing Existing Ontologies 4. Semiautomatic Ontology Acquisition 5. Ontology Mapping 6. On-To-Knowledge SW Architecture

Chapter 7A Semantic Web Primer 43 On-To-Knowledge Architecture Building the Semantic Web involves using – the new languages described in this course – a rather different style of engineering – a rather different approach to application integration We describe how a number of Semantic Web-related tools can be integrated in a single lightweight architecture using Semantic Web standards to achieve interoperability between tools

Chapter 7A Semantic Web Primer 44 Knowledge Acquisition Initially, tools must exist that use surface analysis techniques to obtain content from documents – Unstructured natural language documents: statistical techniques and shallow natural language technology – Structured and semi-structured documents: wrapper induction, pattern recognition

Chapter 7A Semantic Web Primer 45 Knowledge Storage The output of the analysis tools is sets of concepts, organized in a shallow concept hierarchy with at best very few cross-taxonomical relationships RDF/RDF Schema are sufficiently expressive to represent the extracted info – Store the knowledge produced by the extraction tools – Retrieve this knowledge, preferably using a structured query language (e.g. RQL)
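
The store-and-retrieve role of the repository can be illustrated with a tiny in-memory triple store in Python. This is a sketch only: it is not RQL and not the On-To-Knowledge repository, and the class `TripleStore` with its `add` and `match` methods is hypothetical.

```python
class TripleStore:
    """Minimal in-memory store for (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        return sorted(
            triple
            for triple in self._triples
            if (subject is None or triple[0] == subject)
            and (predicate is None or triple[1] == predicate)
            and (obj is None or triple[2] == obj)
        )

store = TripleStore()
store.add("cs101", "rdf:type", "Course")
store.add("alice", "teaches", "cs101")
store.add("bob", "teaches", "cs102")

teaching = store.match(predicate="teaches")
```

A structured query language adds joins over such patterns and awareness of the RDF Schema hierarchy, neither of which this sketch attempts.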

Chapter 7A Semantic Web Primer 46 Knowledge Maintenance and Use A practical Semantic Web repository must provide functionality for managing and maintaining the ontology: – change management – access and ownership rights – transaction management There must be support for both – Lightweight ontologies that are automatically generated from unstructured and semi-structured data – Human engineering of much more knowledge-intensive ontologies

Chapter 7A Semantic Web Primer 47 Knowledge Maintenance and Use (2) Sophisticated editing environments must be able to – Retrieve an ontology from the repository – Allow a knowledge engineer to manipulate it – Place it back in the repository The ontologies and data in the repository are to be used by applications that serve an end-user – We have already described a number of such applications

Chapter 7A Semantic Web Primer 48 Technical Interoperability Syntactic interoperability was achieved because all components communicated in RDF Semantic interoperability was achieved because all semantics was expressed using RDF Schema Physical interoperability was achieved because all communications between components were established using simple HTTP connections

Chapter 7A Semantic Web Primer 49 On-To-Knowledge System Architecture