Co-funded by the European Union. Semantic CMS Community: Semantifying Your CMS. Copyright IKS Consortium.
Page: Course overview. Lectures: Introduction of Content Management; Foundations of Semantic Web Technologies; Storing and Accessing Semantic Data; Knowledge Interaction and Presentation; Knowledge Representation and Reasoning; Semantic Lifting; Designing Interactive Ubiquitous IS; Requirements Engineering for Semantic CMS; Designing Semantic CMS; Semantifying your CMS. Parts: Part I: Foundations; Part II: Semantic Content Management; Part III: Methodologies.
Page: What is this Lecture about? We have introduced a requirements engineering (RE) approach for semantic CMS and a component-based reference architecture for the design of semantic CMS. What's next? A systematic method that developers can use to extend "traditional" CMS with semantic capabilities. Part III: Methodologies: (7) Designing Interactive Ubiquitous IS; (8) Requirements Engineering for Semantic CMS; (9) Designing Semantic CMS; (10) Semantifying your CMS.
Page: Outline. Content repository specifications; JCR; CMIS; Generic model; Extracting semantics from CMSs; Bridges; Content discovery; Using extracted semantics; Aligning external ontologies.
Page: Content Management Systems. Content management systems (CMS) are designed to support a content management cycle: the analysis of content; the creation and collection of content; the publication of content for access by users and/or other systems; and the management of that content.
Page: Standardized API. Each CMS provides an API for interacting with the repository, which content-oriented applications can use. To avoid every CMS vendor providing its own proprietary API, two main specifications are used in the community: JCR (Content Repository API for Java) and CMIS (Content Management Interoperability Services).
Page: What Is JCR? JCR stands for Content Repository API for Java. It is a specification of a Java platform API for accessing content repositories in a uniform manner. It is defined through Java Specification Requests (JSRs): JSR 283 is the Content Repository for Java™ Technology API, Version 2.0.
Page: What Is JCR? It provides a functional view of, and a common vocabulary for, the content repository. Developers do not need to learn dozens of proprietary APIs. It encourages code portability. It prevents content from being locked in isolated silos by providing a standardized repository model and standardized access.
Page: Repository Model In JCR
Page: Repository Model In JCR. Each node has a node type definition. Each node type can have: property definitions, which specify the properties that instances of the node type can use; and child definitions, which specify the node types of the child nodes that instances of the node type can have.
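The node type idea on this slide can be sketched in a few lines of Python. This is only an illustration of the concept, not the JCR API (which is a Java API); the classes `NodeType` and `Node` and the `article` and `folder` types are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeType:
    """Hypothetical stand-in for a node type definition."""
    name: str
    property_names: set = field(default_factory=set)    # properties instances may use
    child_type_names: set = field(default_factory=set)  # node types allowed as children


@dataclass
class Node:
    name: str
    node_type: NodeType
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def set_property(self, key, value):
        # Only properties declared by the node type may be set.
        if key not in self.node_type.property_names:
            raise ValueError(f"{key!r} is not defined for type {self.node_type.name!r}")
        self.properties[key] = value

    def add_child(self, child):
        # Only child node types declared by the node type may be added.
        if child.node_type.name not in self.node_type.child_type_names:
            raise ValueError(
                f"{child.node_type.name!r} is not allowed under {self.node_type.name!r}"
            )
        self.children.append(child)


article_type = NodeType("article", property_names={"title"})
folder_type = NodeType("folder", child_type_names={"article", "folder"})

root = Node("news", folder_type)
item = Node("article1", article_type)
item.set_property("title", "Hello")
root.add_child(item)
```

Setting an undeclared property, or adding a child of an undeclared type, raises `ValueError` in this sketch; that is the restriction the type definition expresses.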
Page: What Is CMIS? CMIS stands for Content Management Interoperability Services. It defines a domain model and bindings that are designed to be layered on top of existing content management systems and their existing programmatic interfaces.
Page: What Is CMIS? A standard repository model and binding interface allows: reducing the work of integrating multi-vendor, multi-repository content management environments; removing the need to maintain proprietary code; and developing independent business units without infrastructure considerations.
Page: Repository Model In CMIS. The entities managed by CMIS are modeled as typed objects. CMIS comes with four types of base objects: document object, folder object, relationship object, and policy object. Every CMIS object has a set of properties.
Page: Repository Model In CMIS. All CMIS objects are strongly typed. An object type defines a fixed and non-hierarchical set of properties that all objects of that type have. CMIS has four base object types, corresponding to the four base objects: cmis:document, cmis:folder, cmis:relationship, and cmis:policy. As in the JCR specification, each object type has its own set of property definitions.
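As a rough illustration of strongly typed objects, the sketch below checks an object's properties against the definition of its base type. It does not use a CMIS client library; the helper `make_object` is hypothetical, and the property sets are a small illustrative subset rather than the complete definitions from the specification.

```python
# A few properties shared by all base types (illustrative subset).
COMMON = {"cmis:objectId", "cmis:name", "cmis:objectTypeId"}

# The four base object types, each with its own property definitions (subset).
BASE_TYPE_PROPERTIES = {
    "cmis:document": COMMON | {"cmis:contentStreamLength"},
    "cmis:folder": COMMON | {"cmis:path"},
    "cmis:relationship": COMMON | {"cmis:sourceId", "cmis:targetId"},
    "cmis:policy": COMMON | {"cmis:policyText"},
}


def make_object(type_id, properties):
    """Hypothetical helper: build an object only if it fits its type definition."""
    if type_id not in BASE_TYPE_PROPERTIES:
        raise ValueError(f"unknown object type {type_id!r}")
    unknown = set(properties) - BASE_TYPE_PROPERTIES[type_id]
    if unknown:
        raise ValueError(f"{sorted(unknown)} not defined for {type_id!r}")
    return {**properties, "cmis:objectTypeId": type_id}


doc = make_object("cmis:document", {"cmis:name": "report.pdf"})
```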
Page: Repository Model In CMIS
Page: Comparison of JCR and CMIS. Both provide a high-level domain model to represent the content in the repository, and both remove the need to use the proprietary API of each content repository.
Page: Comparison of JCR and CMIS
Page: Comparison of JCR and CMIS. Both JCR and CMIS define a hierarchical repository model: JCR calls its building blocks nodes; CMIS calls its building blocks objects. Both JCR and CMIS specify type definitions, which restrict properties and restrict the hierarchical structure. Content items in both JCR and CMIS have the properties that are defined by their type definitions.
Page: Metadata Management In CMS. Metadata is managed: by organizing the content as hierarchies; through properties/parameters of nodes/objects/documents, whose values are either free format or selected from a constrained vocabulary (which can be a taxonomy) and which can be used as content categories; and by representing relationships between nodes/objects/documents. Taxonomies can be represented as tag hierarchies (that is, as a hierarchy of nodes).
Page: Generic Repository Model. To semantify a CMS regardless of whether it follows the JCR or the CMIS repository model, we need a generic repository model. The generic repository model should be able to represent CMS objects from both specifications.
Page: Generic Repository Model
Page: Generic Repository Model. In the generic repository model: the Object entity corresponds to the JCR node and the CMIS object; the Object Type entity corresponds to JCR node types and CMIS object types; the property and property definition notions are also represented.
Page: Generic Repository Model. The Classification Object and Content Object notions are introduced on top of the representation that covers the JCR and CMIS models. They differentiate data from metadata. Content Objects represent repository items that contain actual data. Classification Objects represent the hierarchical taxonomies of CMSs, which are used to classify Content Objects.
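One way to picture the generic model is a single record type that both a JCR node and a CMIS object can be mapped into. The sketch below is an assumption-laden illustration: the input dictionaries, the functions `from_jcr` and `from_cmis`, and the rule that decides whether something is a classification object (membership in `CLASSIFICATION_TYPES`, with a made-up `ex:category` type) are all hypothetical.

```python
from dataclasses import dataclass, field

# Assumption: object types listed here hold taxonomy terms rather than content.
CLASSIFICATION_TYPES = {"ex:category"}


@dataclass
class GenericObject:
    name: str
    object_type: str
    properties: dict = field(default_factory=dict)
    kind: str = "content"  # "content" or "classification"


def _kind(object_type):
    return "classification" if object_type in CLASSIFICATION_TYPES else "content"


def from_jcr(node):
    """Map a JCR-like node, given as a plain dict, to the generic model."""
    return GenericObject(
        node["name"], node["primaryType"], dict(node["properties"]), _kind(node["primaryType"])
    )


def from_cmis(obj):
    """Map a CMIS-like object, given as a plain dict of properties, to the generic model."""
    props = {k: v for k, v in obj.items() if k not in ("cmis:name", "cmis:objectTypeId")}
    return GenericObject(
        obj["cmis:name"], obj["cmis:objectTypeId"], props, _kind(obj["cmis:objectTypeId"])
    )


term = from_jcr({"name": "Health", "primaryType": "ex:category", "properties": {}})
article = from_cmis(
    {"cmis:name": "Article1", "cmis:objectTypeId": "cmis:document", "classifiedBy": "Health"}
)
```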
Page: Strength of Semantic Technologies. An ontology consists of the following artifacts: a vocabulary to describe a domain; a specification of the intended meaning of the vocabulary, including how concepts are classified; and constraints that provide additional knowledge about the domain. Thus, an ontology represents a formal, machine-manipulable model of a domain.
Page: Strength of Semantic Technologies. A machine-manipulable model of a domain enables reasoning on it. Reasoning makes it possible to recognise semantic similarity in spite of syntactic differences and to recognise implicit consequences of explicitly stated facts.
Page: Enhancing CMS With Semantic Technologies Benefits to CMSs Provided functionalities on domain ontology
Page: Extracting Semantics From CMSs as Ontologies. Content repositories already provide a certain amount of semantics for content items, through content hierarchies, properties, taxonomies, and node/object types. However, these semantics are not "machine understandable": they cannot be reasoned over.
Page: Need For A Methodology. There is a need for an "integrated semantic engineering method" that enables CMS developers to easily use the semantic functionalities provided by ontologies and reasoners without a major change in their systems.
Page: Extracting Semantics From CMSs as Ontologies. Node types, object types, and document types can be converted automatically into OWL classes, with their properties as object and datatype properties and with restrictions added when necessary. Nodes of these node types can then be created as instances.
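A minimal sketch of this conversion, emitting Turtle text with plain string formatting, might look as follows. The `ex:` namespace, the function `node_type_to_turtle`, and the convention that a property marked `"reference"` becomes an object property while everything else becomes a datatype property are assumptions made for the example.

```python
PREFIXES = (
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
    "@prefix ex: <http://example.org/cms#> .\n"  # hypothetical namespace
)


def node_type_to_turtle(type_name, properties):
    """Convert one node type to Turtle.

    `properties` maps a property name to "reference" (points at another
    node) or to a literal kind such as "string".
    """
    lines = [f"ex:{type_name} a owl:Class ."]
    for prop, kind in sorted(properties.items()):
        owl_kind = "owl:ObjectProperty" if kind == "reference" else "owl:DatatypeProperty"
        lines.append(f"ex:{prop} a {owl_kind} ; rdfs:domain ex:{type_name} .")
    return "\n".join(lines)


turtle = PREFIXES + node_type_to_turtle(
    "NewsArticle", {"title": "string", "classifiedBy": "reference"}
)
```

Restrictions and instances are left out of the sketch; they would be added as further statements in the same way.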
Page: Extracting Semantics From CMSs as Ontologies
Page: What About Resources Having Semantic Worth? How should other resources be treated, such as links between content items, taxonomies, and content hierarchies? There should be configurable bridges from the CMS to the ontology.
Page: Bridges. Bridges should provide: extraction of certain CMS objects as ontology classes; extraction of certain CMS objects as ontology individuals; extraction of the hierarchical structure through certain properties between CMS objects; extraction of certain properties of CMS objects that carry semantic value; and different treatment of extracted properties according to their annotations.
Page: Concept Bridge. Takes a query specifying the target CMS objects. Transforms the target objects into ontology classes, together with the possible hierarchical relations. Can include Subsumption Bridges to build the hierarchy through certain properties. Can include Property Bridges to extract certain properties of the target objects and set appropriate annotations in the ontology.
Page: Subsumption Bridge. Takes a query specifying the target CMS objects. Takes a predicate name. Forms subclass/superclass relations between the target CMS objects through the specified predicate.
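The behaviour described on this slide can be sketched as a small function. The slides do not fix a query language or the direction of the predicate, so the sketch assumes that the query is a path prefix and that the predicate (here a hypothetical `child` property) lists the names of the objects that become subclasses.

```python
def subsumption_bridge(objects, query, predicate):
    """Form subclass/superclass triples among the objects selected by `query`.

    Assumptions: `query` is a path prefix, and `predicate` names a property
    whose values are the names of the narrower (child) objects.
    """
    triples = []
    for obj in objects:
        if not obj["path"].startswith(query):
            continue  # not a target object
        for child in obj.get(predicate, []):
            triples.append((child, "rdfs:subClassOf", obj["name"]))
    return triples


cms_objects = [
    {"path": "/Workspace/NewsSubjectCodes/Health", "name": "Health",
     "child": ["Disease", "Illness"]},
    {"path": "/Workspace/NewsSubjectCodes/Health/Disease", "name": "Disease",
     "child": ["Cancer"]},
    {"path": "/Workspace/NewsArticles/Article1", "name": "Article1", "child": ["Ignored"]},
]

subclass_triples = subsumption_bridge(cms_objects, "/Workspace/NewsSubjectCodes", "child")
```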
Page: Instance Bridge. Takes a query to select the target CMS objects. Transforms the selected CMS objects into ontology individuals. Like the Concept Bridge, it can include Property Bridges to treat the properties of CMS objects differently based on their annotations.
Page: Property Bridge. Selectively lifts some of the properties of CMS objects into the ontological representation. This makes it possible to lift only the properties that have semantic value. It can be included in a Concept Bridge or an Instance Bridge.
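To make the relation between the Instance Bridge and the Property Bridge concrete, here is a minimal sketch in which the instance bridge selects objects by a path prefix and delegates to a property bridge that keeps only whitelisted properties. The function names, the dictionary layout of a CMS object, and the path-prefix query are assumptions for illustration.

```python
def property_bridge(obj, lifted_properties):
    """Keep only the properties that were configured as having semantic value."""
    return {k: v for k, v in obj["properties"].items() if k in lifted_properties}


def instance_bridge(objects, query, lifted_properties):
    """Turn the objects selected by `query` (a path prefix) into individuals."""
    individuals = {}
    for obj in objects:
        if obj["path"].startswith(query):
            individuals[obj["name"]] = property_bridge(obj, lifted_properties)
    return individuals


cms_objects = [
    {"path": "/Workspace/NewsArticles/Article1", "name": "Article1",
     "properties": {"classifiedBy": "Health", "lastEditedBy": "admin"}},
    {"path": "/Workspace/NewsSubjectCodes/Health", "name": "Health", "properties": {}},
]

individuals = instance_bridge(cms_objects, "/Workspace/NewsArticles", {"classifiedBy"})
```

In this sketch the housekeeping property `lastEditedBy` is dropped, and only `classifiedBy` reaches the ontological representation.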
Page: Backend Knowledge Base For CMSs. As a result of the semantic lifting mechanism, we have the ontological representation of the content repository semantics. The ontological representation should be kept in a backend knowledge base and kept synchronized with the changes in the repository. A reasoner should be used together with the knowledge base to recognize implicit facts from the explicit ones in the ontology.
Page: Backend Knowledge Base For CMSs. Existing triple stores such as Jena and Sesame provide built-in reasoners: Sesame supports only RDFS reasoning, while Jena provides RDFS, OWL, and rule-based reasoners. It is also possible to integrate an external reasoner with a triple store. Considering the pros and cons of the different triple stores, a generic interface for communicating with triple stores is preferable: the knowledge base can be hosted on different triple stores, and through the generic interface the semantic lifting mechanism can feed and query the hosted ontologies.
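The generic interface idea can be sketched as an abstract base class with interchangeable backends. The interface below (`TripleStore` with `add` and `match`) and the in-memory backend are hypothetical; a real deployment would wrap a store such as Jena or Sesame behind the same methods.

```python
from abc import ABC, abstractmethod


class TripleStore(ABC):
    """Hypothetical generic interface that the lifting mechanism talks to."""

    @abstractmethod
    def add(self, subject, predicate, obj):
        """Store one triple."""

    @abstractmethod
    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples matching the pattern; None acts as a wildcard."""


class InMemoryStore(TripleStore):
    """Toy backend, used here only to show that backends are swappable."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        pattern = (subject, predicate, obj)
        return sorted(
            t for t in self._triples
            if all(p is None or p == v for p, v in zip(pattern, t))
        )


def feed(store, triples):
    """The lifting mechanism depends only on the interface, not on the backend."""
    for triple in triples:
        store.add(*triple)


store = InMemoryStore()
feed(store, [("Disease", "rdfs:subClassOf", "Health"), ("Article1", "rdf:type", "Disease")])
```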
Page: Using the Extracted Semantics in Content Discovery. After the semantics of a CMS have been extracted into an ontology, the ontology can be used to provide semantic functionalities on top of the CMS. Semantic search is one such functionality. It can be further enhanced by aligning or merging external domain ontologies.
Page: Initial CMS Structure. [Figure: content management system structure. A Workspace node with two branches: NewsSubjectCodes (subject codes shown include Health, Economy Business Finance, Disaster/Accident, Education, Disease, Illness, HealthTreatment, Cancer, ViralDiseases, SwineFlu, Eating Disorder, Obesity, Neurological Disease) and NewsArticles (Article1, Article2, Article3). Articles are linked to subject codes through classifiedBy.]
Page: Ontological Representation Of CMS Represent the CMS structure in the previous slide ontologically Represent the “news subject codes” branch as an ontology class hierarchy Represent the “news articles” branch as a set of ontology individuals
Page: Ontological Representation Of CMS. [Figure: the News Subject Codes represented as hierarchical ontology classes (NewsSubjectCodes, ArtsCultureEntertainment, DisasterAccident, EconomyBusinessFinance, Education, EnvironmentalIssues, Health, HealthTreatment, Illness, Medicine, Disease, ViralDisease, Cancer, SwineFlu, SocialIssues) and the news articles (Article1, Article2, Article3) represented as ontology individuals linked by instanceOf. Individual types are set with the corresponding ontology classes.]
Page: Make a Search. Find me articles categorized by “Health” … The answer contains Article1, Article2 and Article3, because of the subsumption relations between the ontology classes.
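The search on this slide amounts to walking the subclass chain of each article's class. In the sketch below, the parent links and the class assigned to each article are assumptions: the slides list the classes and the articles but do not state which article is classified by which subject code.

```python
# Assumed parent links between the subject-code classes.
SUBCLASS_OF = {
    "Health": "NewsSubjectCodes",
    "HealthTreatment": "Health",
    "Disease": "Health",
    "ViralDisease": "Disease",
    "Cancer": "Disease",
    "SwineFlu": "ViralDisease",
}

# Assumed article classifications (not given in the slides).
ARTICLE_TYPES = {
    "Article1": "Cancer",
    "Article2": "HealthTreatment",
    "Article3": "ViralDisease",
}


def is_subclass_of(cls, ancestor):
    """Follow the subclass chain upwards until `ancestor` or the root is reached."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def articles_categorized_by(cls):
    return sorted(a for a, t in ARTICLE_TYPES.items() if is_subclass_of(t, cls))


health_articles = articles_categorized_by("Health")
```

No article is classified by "Health" directly in this data; all three are found only because their classes are subsumed by it.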
Page: Make a Rule Based Search. Rule: if a Disease isCausedBy a PathogenicAgent, then it is an InfectiousDisease. Facts: Virus is a PathogenicAgent; Fungi is a PathogenicAgent; ViralDisease isCausedBy Virus. Query: find me InfectiousDisease articles… The answer is: Article3.
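The rule and facts on this slide can be applied with a very small hand-written inference step. This is a sketch of the idea, not how a rule engine such as Jena's would be configured; the set of disease classes and the article classifications are assumptions, since the slides only state the rule, the three facts, and the answer.

```python
# Facts from the slide.
pathogenic_agents = {"Virus", "Fungi"}
is_caused_by = {"ViralDisease": "Virus"}

# Assumptions: which classes are diseases, and how the articles are classified.
diseases = {"ViralDisease", "Cancer"}
article_types = {
    "Article1": "Cancer",
    "Article2": "HealthTreatment",
    "Article3": "ViralDisease",
}


def infer_infectious(diseases, is_caused_by, agents):
    """Rule: a Disease that isCausedBy a PathogenicAgent is an InfectiousDisease."""
    return {d for d in diseases if is_caused_by.get(d) in agents}


infectious = infer_infectious(diseases, is_caused_by, pathogenic_agents)
infectious_articles = sorted(a for a, t in article_types.items() if t in infectious)
```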
Page: Go Back To Example. To represent “news subject codes” as a class hierarchy in the ontological representation, we need a Concept Bridge with a query that targets the CMS objects under “/Workspace/NewsSubjectCodes”.
Page: Go Back To Example. To represent “news articles” as individuals in the ontological representation, we need an Instance Bridge with a query that targets the CMS objects under “/Workspace/NewsArticles” and with an inner Property Bridge that has “classifiedBy” as its predicate name. This sets the type of each individual to the ontology class that corresponds to the value of its “classifiedBy” property.
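The two bridges for this example could be written down as a small configuration and applied to the repository content. The configuration layout, the function `lift`, and the sample objects are hypothetical; the sketch also assumes that the class hierarchy follows the path nesting under the concept bridge's query.

```python
bridge_config = {
    "concept": {"query": "/Workspace/NewsSubjectCodes"},
    "instance": {
        "query": "/Workspace/NewsArticles",
        "property_bridges": [{"predicate": "classifiedBy"}],
    },
}

cms_objects = [
    {"path": "/Workspace/NewsSubjectCodes/Health", "properties": {}},
    {"path": "/Workspace/NewsSubjectCodes/Health/Disease", "properties": {}},
    {"path": "/Workspace/NewsArticles/Article1",
     "properties": {"classifiedBy": "Disease", "title": "Some title"}},
]


def lift(objects, config):
    """Apply the concept and instance bridges and return the resulting triples."""
    concept_root = config["concept"]["query"] + "/"
    instance_root = config["instance"]["query"] + "/"
    type_predicates = {pb["predicate"] for pb in config["instance"]["property_bridges"]}
    triples = []
    for obj in objects:
        parts = obj["path"].split("/")
        name, parent = parts[-1], parts[-2]
        if obj["path"].startswith(concept_root):
            # Concept Bridge: each object becomes a class under its parent.
            triples.append((name, "rdf:type", "owl:Class"))
            triples.append((name, "rdfs:subClassOf", parent))
        elif obj["path"].startswith(instance_root):
            # Instance Bridge with inner Property Bridge: the property value
            # names the class that becomes the individual's type.
            for key, value in obj["properties"].items():
                if key in type_predicates:
                    triples.append((name, "rdf:type", value))
    return triples


lifted = lift(cms_objects, bridge_config)
```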
Page: Aligning External Ontologies It is possible to align external domain ontologies with the ontology representing the structure of CMS to be able to use semantics in the external ontology
Page: Go Over An Example. Initially, assume that we have the following ontology representation of the CMS. [Figure: the News Subject Codes represented as hierarchical ontology classes (NewsSubjectCodes, ArtsCultureEntertainment, EnvironmentalIssues, Health, HealthTreatment, Illness, Medicine, SocialIssues, Disease, Obesity, EatingDisorder, NeurologicalDisease) and two of the news articles represented as individuals linked by instanceOf: MotorNeuroneDiseaseGeneClue (“… Professor Christopher Shaw, from the Institute of Psychiatry at Kings College London, said …”) and GeneticCluesToEatingDisorders (“… Doctors studying the causes of the eating disorders anorexia and bulimia believe it has less to do with media images of slim-figured models and more to do with biological and genetic factors …”).]
Page: Align CMS Representation With External Ontology. [Figure: the News Subject Codes represented as hierarchical ontology classes (NewsSubjectCodes, ArtsCultureEntertainment, DisasterAccident, EconomyBusinessFinance, Education, EnvironmentalIssues, Health, HealthTreatment, Illness, Medicine, SocialIssues, Disease, Obesity, EatingDisorder) shown next to the MeSH biomedical ontology (MeSH, Anatomy, Diseases, Organisms, BehaviorMechanisms, Psychiatry, BehaviorDisciplines, MentalDisorders, AnxietyDisorders, EatingDisorders, SleepingDisorders, SomatoformDisorders), with an equivalentTo link connecting the two hierarchies.]
Page: Align CMS Representation With External Ontology. [Figure: detail of the aligned hierarchies. CMS classes (Education, EnvironmentalIssues, Health, HealthTreatment, Illness, Medicine, SocialIssues, Disease, Obesity, EatingDisorder) and MeSH classes (Organisms, BehaviorMechanisms, Psychiatry, BehaviorDisciplines, MentalDisorders, AnxietyDisorders, EatingDisorders, SleepingDisorders) connected by equivalentTo, with the article GeneticCluesToEatingDisorders (“… Doctors studying the causes of the eating disorders anorexia and bulimia believe it has less to do with media images of slim-figured models and more to do with biological and genetic factors …”) attached by instanceOf.]
Page: Make A Search. Find me articles related to “psychiatry”. The search results will include not only the article “MotorNeuroneDiseaseGeneClue” but also the article “GeneticCluesToEatingDisorders”. The keyword “psychiatry” is matched with the ontology class “Psychiatry”. Through reasoning, it is inferred that “GeneticCluesToEatingDisorders” is an indirect instance of the “Psychiatry” class.
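The inference behind this result can be sketched by following subclass and equivalence links together. Several details below are assumptions rather than statements from the slides: that EatingDisorders sits under MentalDisorders and MentalDisorders under Psychiatry, that the equivalence links EatingDisorder with EatingDisorders, that MotorNeuroneDiseaseGeneClue is an instance of NeurologicalDisease, and that this article is found through a plain keyword match on its text.

```python
sub_class_of = {
    "EatingDisorder": "Disease",
    "NeurologicalDisease": "Disease",
    "Disease": "Health",
    "EatingDisorders": "MentalDisorders",  # assumed MeSH nesting
    "MentalDisorders": "Psychiatry",       # assumed MeSH nesting
}
# Alignment between the CMS ontology and MeSH, stored in both directions.
equivalent_to = {"EatingDisorder": "EatingDisorders", "EatingDisorders": "EatingDisorder"}

articles = {
    "GeneticCluesToEatingDisorders": {
        "type": "EatingDisorder",
        "text": "Doctors studying the causes of the eating disorders anorexia and bulimia",
    },
    "MotorNeuroneDiseaseGeneClue": {
        "type": "NeurologicalDisease",
        "text": "Professor Christopher Shaw, from the Institute of Psychiatry",
    },
}


def reachable_classes(cls):
    """All classes reachable through subclass and equivalence links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        for nxt in (sub_class_of.get(current), equivalent_to.get(current)):
            if nxt is not None:
                stack.append(nxt)
    return seen


def search(keyword):
    key = keyword.lower()
    hits = []
    for name, article in sorted(articles.items()):
        by_class = any(c.lower() == key for c in reachable_classes(article["type"]))
        by_text = key in article["text"].lower()
        if by_class or by_text:
            hits.append(name)
    return hits


results = search("psychiatry")
```

Without the `equivalent_to` link, the eating disorders article would not be found in this sketch, which is the point of aligning the external ontology.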