Towards KOS-based semantic services Doug Tudhope Hypermedia Research Unit University of Glamorgan Helsinki, November 30, 2007.

Slides:



Advertisements
Similar presentations
Dr. Leo Obrst MITRE Information Semantics Information Discovery & Understanding Command & Control Center February 6, 2014February 6, 2014February 6, 2014.
Advertisements

Ontology Assessment – Proposed Framework and Methodology.
DC2001, Tokyo DCMI Registry : Background and demonstration DC2001 Tokyo October 2001 Rachel Heery, UKOLN, University of Bath Harry Wagner, OCLC
Applications of NKOS: some examples and questions Doug Tudhope Hypermedia Research Unit University of Glamorgan DC-2005 NKOS Special Session.
Delivering HILT as a shared service Rachel Heery UKOLN, University of Bath
Alexandria Digital Library Project Integration of Knowledge Organization Systems into Digital Library Architectures Linda Hill, Olha Buchel, Greg Janée.
STELLAR Introduction Ceri Binding, Douglas Tudhope Hypermedia Research Unit, University of Glamorgan.
Semantic Annotations in the Archaeological Domain Andreas Vlachidis, Ceri Binding, Keith May, Douglas TudhopeSTAR STAR Semantic Technologies for Archaeological.
Scoping study of KOS registries Doug Tudhope Hypermedia Research Unit University of Glamorgan NKOS workshop 2007.
Vocabulary registries and services Doug Tudhope Hypermedia Research Unit University of Glamorgan Ecoterm, FAO, Rome, Oct 2009.
STELLAR Introduction Douglas Tudhope Hypermedia Research Unit, University of Glamorgan.
1 Introduction to XML. XML eXtensible implies that users define tag content Markup implies it is a coded document Language implies it is a metalanguage.
SKOS and Other W3C Vocabulary Related Activities Gail Hodge Information International Assoc. NKOS Workshop Denver, CO June 10, 2005.
COMP 6703 eScience Project Semantic Web for Museums Student : Lei Junran Client/Technical Supervisor : Tom Worthington Academic Supervisor : Peter Strazdins.
ReQuest (Validating Semantic Searches) Norman Piedade de Noronha 16 th July, 2004.
A Registry for controlled vocabularies at the Library of Congress
Creating Knowledge V, 2008 A search thesaurus for the domain of linguistics Creating a domain specific search tool on the basis of user behaviour study.
Ontology-based Access Ontology-based Access to Digital Libraries Sonia Bergamaschi University of Modena and Reggio Emilia Modena Italy Fausto Rabitti.
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Digging Up Data: The Archaeotools project, Faceted Classification and Natural Language Processing in an archaeological context. Stuart Jeffrey, Julian.
Stuart Jeffrey, Julian Richards, Fabio Ciravegna Stewart Waller, Sam Chapman, Ziqi ZhangTony Austin. STAR/Archaeotools Workshop, York, 9 th May Stuart.
MDC Open Information Model West Virginia University CS486 Presentation Feb 18, 2000 Lijian Liu (OIM:
PREMIS Tools and Services Rebecca Guenther Network Development & MARC Standards Office, Library of Congress NDIIPP Partners Meeting July 21,
Data Exchange Tools (DExT) DExT PROJECTAN OPEN EXCHANGE FORMAT FOR DATA enables long-term preservation and re-use of metadata,
KOS-based tools for archaeological dataset interoperability: NKOS Workshop, ECDL 2010 C. Binding, K. May 1, D. Tudhope, A. Vlachidis Hypermedia Research.
Terminology services and the DDC: the High-Level Thesaurus and beyond Presented to the symposium Dewey goes Europe: on the use and development of the Dewey.
Practical RDF Chapter 1. RDF: An Introduction
Rutherford Appleton Laboratory SKOS Ecoterm 2006 Alistair Miles CCLRC Rutherford Appleton Laboratory Semantic Web Best Practices and Deployment.
The Semantic Web Service Shuying Wang Outline Semantic Web vision Core technologies XML, RDF, Ontology, Agent… Web services DAML-S.
A centre of expertise in digital information management The MEG Metadata Schemas Registry Pete Johnston, Research Officer (Interoperability),
1 Technologies for distributed systems Andrew Jones School of Computer Science Cardiff University.
1 Issues in Reusing and Sharing the Content of Thesauri and Taxonomies in OOR Marcia Zeng NKOS (Networked Knowledge Organization Systems/Services) My participating.
Ontology Summit2007 Survey Response Analysis -- Issues Ken Baclawski Northeastern University.
1 Ontology-based Semantic Annotatoin of Process Template for Reuse Yun Lin, Darijus Strasunskas Depart. Of Computer and Information Science Norwegian Univ.
The Agricultural Ontology Service (AOS) A Tool for Facilitating Access to Knowledge AGRIS/CARIS and Documentation Group Library and Documentation Systems.
Metadata. Generally speaking, metadata are data and information that describe and model data and information For example, a database schema is the metadata.
Ontology Summit2007 Survey Response Analysis Ken Baclawski Northeastern University.
EnTaG Enhanced (social) Tagging for Discovery Doug Tudhope Hypermedia Research Unit, University of Glamorgan Exeter.
Semantic Annotation of Grey Literature from an Archaeological Digital Library Andreas Vlachidis, Doug Tudhope Hypermedia Research Unit University of Glamorgan.
Enhancing social tagging with a knowledge organization system Brian Matthews STFC.
10/24/09CK The Open Ontology Repository Initiative: Requirements and Research Challenges Ken Baclawski Todd Schneider.
It’s all semantics! The premises and promises of the semantic web. Tony Ross Centre for Digital Library Research, University of Strathclyde
SKOS. Ontologies Metadata –Resources marked-up with descriptions of their content. No good unless everyone speaks the same language; Terminologies –Provide.
The Archaeotools project, faceted classification and natural language processing in an archaeological context. University of York, April 2008.
Oreste Signore- Quality/1 Amman, December 2006 Standards for quality of cultural websites Ministerial NEtwoRk for Valorising Activities in digitisation.
Introduction to the Semantic Web and Linked Data Module 1 - Unit 2 The Semantic Web and Linked Data Concepts 1-1 Library of Congress BIBFRAME Pilot Training.
Introduction to the Semantic Web and Linked Data
Metadata Common Vocabulary a journey from a glossary to an ontology of statistical metadata, and back Sérgio Bacelar
User Profiling using Semantic Web Group members: Ashwin Somaiah Asha Stephen Charlie Sudharshan Reddy.
Working with Ontologies Introduction to DOGMA and related research.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Strategies for subject navigation of linked Web sites using RDF topic maps Carol Jean Godby Devon Smith OCLC Online Computer Library Center Knowledge Technologies.
Issues in Ontology-based Information integration By Zhan Cui, Dean Jones and Paul O’Brien.
Digital Libraries1 David Rashty. Digital Libraries2 “A library is an arsenal of liberty” Anonymous.
6 th ECDL NKOS Workshop Organisers: Doug Tudhope Traugott Koch Marianne Lykke Nielsen NKOS Workshop, Budapest, 2007.
A Resource Discovery Service for the Library of Texas Requirements, Architecture, and Interoperability Testing William E. Moen, Ph.D. Principal Investigator.
1 Open Ontology Repository initiative - Planning Meeting - Thu Co-conveners: PeterYim, LeoObrst & MikeDean ref.:
DANIELA KOLAROVA INSTITUTE OF INFORMATION TECHNOLOGIES, BAS Multimedia Semantics and the Semantic Web.
A Portrait of the Semantic Web in Action Jeff Heflin and James Hendler IEEE Intelligent Systems December 6, 2010 Hyewon Lim.
STAR, STELLAR and SKOS Ceri Binding, Phil Carlisle, Keith May, Doug Tudhope, Andreas Vlachidis University of Glamorgan and English Heritage.
Metadata Schema Registries: background and context MEG Registry Workshop, Bath, 21 January 2003 Rachel Heery UKOLN, University of Bath Bath, BA2 7AY UKOLN.
Geospatial metadata Prof. Wenwen Li School of Geographical Sciences and Urban Planning 5644 Coor Hall
When ontology and reality collide:
TRSS Terminology Registry Scoping Study
knowledge organization for a food secure world
EnTag Enhanced Tagging for Discovery Koraljka Golub, Jim Moon,
PREMIS Tools and Services
LOD reference architecture
BUILDING A DIGITAL REPOSITORY FOR LEARNING RESOURCES
C. Binding, K. May1, R. Souza, D. Tudhope, A. Vlachidis
Presentation transcript:

Towards KOS-based semantic services Doug Tudhope Hypermedia Research Unit University of Glamorgan Helsinki, November 30, 2007

Presentation Possible approaches to interoperability (from Nov 29) –Standards –Combination of KOS STAR – combining KOS and core Ontology Semantic Technologies for Archaeological Resources EnTag – combining KOS and social tagging ‘folksonomies’ SKOS (standards) based services

STAR Semantic Technologies for Archaeological Resources Acknowledgement Ceri Binding for some of the STAR material

Project Outline  3 year AHRC funded project  Started January 2007, finish December 2010  Collaborators  English Heritage  RSLIS Denmark  Aim – “To investigate the potential of semantic terminology tools for widening access to digital archaeology resources, including disparate datasets and associated grey literature”

Background Current EH situation one of fragmented datasets and applications, with different terminology systems Interpretation may not consist of same terms as context Searchers from different scientific perspectives may not use same terminology Need for integrative metadata framework EH have designed an upper ontology based on CRM standard Work to date focused on modelling

Databases not meaningfully connected Even simply expressed queries currently difficult to answer, due to lack of tools for cross database searching "Specialists could only talk to [field] archaeologists and not talk to each other". (from discussion with a palaeoenvironmental archaeologist) Wider questions arising from science analysis by finds specialists often referred back to field archaeologist since databases documenting different scientific aspects not meaningfully connected

General architecture Database Common Ontology (CRM) API (web services?) Grey Literature documents Grey Literature documents Thesauri Application

Resource Description Framework (RDF)   XML / URI based format  Modelling of graph structures  RDF triples: subjectobject predicate “Cardiff”“Wales” is_in

Simple Knowledge Organisation Systems (SKOS)  STAR thesauri represented in SKOS   Formal representation of thesauri, taxonomies, classification schemes etc. in RDF/XML with looser semantics than OWL cost/benefit advantages for SKOS for STAR purposes  Semantics at a suitable level for retrieval purposes  Less overhead to set up  Can apply concept-based semantic expansion from FACET (see Nov 29)

CIDOC Conceptual Reference Model (CRM)  “A reference ontology for the interchange of cultural heritage information” [  International standard ISO 21127:2006ISO 21127:2006  Extensions by English Heritage Modelling workflow of excavation process and analysis  CRM as overarching framework  We use RDFS representation

ADS grey literature online Some controlled vocabulary indexing

EH extension to CRM Currently in pdf file Need to represent in machine readable format

EH_E0007.Context E62.String EH_E0061.ContextUID EH_E0005.Group EH_E0022.ContextDepiction EH_E0008.ContextStuff E54.Dimension E60.Number E55.Type E58.MeasurementUnit P3.has_note P3.1.has_type E55.Type P87.is_identified_by (identifies) P87.is_identified_by (identifies) P89.falls_within (contains) P89.falls_within (contains) P87.is_identified_by (identifies) P87.is_identified_by (identifies) EH_P3.occupied P43.has_dimension (is_dimension_of) P43.has_dimension (is_dimension_of) P90.has_value P91.has_unit (is_unit_of) P91.has_unit (is_unit_of) P2.has_type (is_type_of) P2.has_type (is_type_of) Description, Interpretive comments, Post-ex comments Length, width, height, diameter etc. A closer look…

Representing Thesauri in SKOS RDF

English Heritage Thesauri  Monument types thesaurus  Classification of monument type records  Evidence thesaurus  Archaeological evidence  MDA object types thesaurus  Archaeological objects  Building materials thesaurus  Construction materials  Archaeological sciences thesaurus  Sampling and processing methods and materials

CSV Access DB XML SKOS RDF thesauri EH2SKOS (C# app) XSLT RDF validation SKOS validation Validation reports KEA indexing Indexing Results Grey literature CSV Validation results SKOS browser English Heritage Thesauri – conversion to SKOS RDF format CSV

Environmental Archaeology Thesaurus Scope Notes Extract (i) Altered by Animals SN:Modification or damage by an animal RT Worked (use where modification is by humans in ASPECT)where Anoxic SN:Material preserved by exclusion of oxygen usually due to saturation with water which inhibits decay by micro-organisms Non Preferred Term: Waterlogged Burnt SN:Use for material that has been burnt Calcined SN:Material burnt at a high temperature (above 700 degrees centigrade) leaving only the mineral component. Non-preferred term: cremated BT: Burnt RT Cremation Charred SN:Material that has been burnt and at least in part reduced to carbon as a result of burning in a reducing atmosphere below 500 degrees C. Non-preferred term: Carbonised BT: Burnt Silicified SNUse for material that has been burnt at high temperatures in a good air supply such that only silica component remains BT: Burnt …… Mineral Replaced SN:Replacement of organic material by minerals, including calcium carbonate and calcium phosphate Non Preferred Term: Mineralised, Fossilised Mineral Preserved SN: Preservation of material by the toxic effect of corrosion products in the immediate vicinity, or within, the material Non Preferred Term: Mineralised Plant damage SN:Material that has been penetrated or disrupted by the roots or rhizomes of plants.

Example of CRM - Thesaurus connection FlotationSampleResidueType – EH_E0067 CRM entityE55: Type Classification of flot and/or residue contents Mapping (by EH collaborators): Use Arch Science Thesaurus Terms: Object type, Material type, Modification state, Aspect

Example CRM - Thesaurus connection 2 ContextSampleType – EHE0053 CRM entityE55: Type Derived from the Environmental guidelines list Samples taken will be of a particular type depending upon the technique that will be used to analyse them. For Specialist Scientific Sampling it would be appropriate to use Archaeological Science Thesaurus terms for “Investigative Techniques”, but for samples taken by non-specialists the investigative technique may not be known at the point of sampling.

CRM - Thesaurus connection ? How to model connection between CRM-EH ontology and the STAR thesauri? Initial analysis suggests does not necessarily fit neatly under CRM-EH entities as a direct relationship Currently considering applying BSI Part 4 mapping relationships (subset SKOS mapping relationships)

Data modelling The data extraction process involved selected data from the following archaeological datasets: Raunds Roman Analytical Database (RRAD) Raunds Prehistoric Database (RPRE) York Archaeological Trust Integrated Archaeological Database (IADB) Approach was to extract modular parts of the larger data model from the RRAD, RPRE and IADB databases via SQL queries, and store the data retrieved in a series of RDF files. This allows data instances to be later selectively combined as required.

Example – data mapping to CRM-EH

Example – data modelling These relationships between entities were extracted and modelled in RDF, for each dataset:

Potential mapping problems in mapping from different datasets to a general high level ontology “The first issue is the abstractness of the concepts (e.g. Time Appellation, Man-Made Object) defined by the global ontology, which makes them ambiguous to any human user. Even expert users have produced ambiguous mappings and have required several iterations to produce consistent mapping definitions. If several experts specify mappings independently from each other, it is very likely that they will produce incompatible mappings and fail the goal of enabling interoperability. Another point directly connected to the abstractness of the concepts, is the presentation to the user. Basically a graphical user interface is required which hides the complexity of the global ontology and allows the user to formulate queries over more concrete concepts.” From BRICKS FP6 IP CIDOC CRM Poster at ECDL07, Budapest

Potential mapping problems in mapping from different datasets to high level ontology STAR approach BRICKS worked with high level basic CIDOC CRM schema CRM-EH has detailed extension of high level CIDOC CRM entities modelling workflow of excavation and analysis Our approach is that this will be easier to map to STAR confined to UK archaeology domain With mapping done by project team and main collaborators Consider generalisation issues as part of project evaluation

Investigate cost/benefit issues … Granularity of modelling in CRM-EH ontology Granularity of mapping to data Extent to which ontology of excavation and analysis process useful for retrieval purposes Extent to which ontology can inform user interface Scalability of exporting data to RDF Generalisability beyond immediate UK context

Presentation Possible approaches to interoperability –Standards –Combination of KOS STAR – combining KOS and core Ontology EnTag – combining KOS and social tagging ‘folksonomies’ SKOS (standards) based services

EnTag project Enhanced Tagging for Discovery Ongoing JISC funded project to investigate the combination and comparison of controlled and folksonomy approaches to semantic interoperability in the context of repositories and digital collections Partners UKOLN, Glamorgan, CCLRC, Intute, with unfunded support from OCLC Research and Royal School of Library and Information Science, Denmark Glamorgan role Develop and evaluate a Demonstrator combining Dewey and social tagging

EnTag project Enhanced Tagging for Discovery Social tagging applications hold promise of reducing indexing costs by drawing end-users into contributing this resource. However existing social tagging applications have not been designed with information discovery and retrieval in mind. The resulting folksonomies are completely uncontrolled, lacking even basic control of word forms, spelling, synonyms and disambiguation of homonyms. EnTag aims to compare a ‘vanilla’ social tagging application for Intute Social Sciences collection with an advanced hybrid application, where Dewey terminology resources are employed to structure and ‘improve’ the indexing, while retaining social tagging nature

2007Marianne Lykke Nielsen Evaluation of EnTag: considerations of test design Marianne Lykke Nielsen Information Interaction and Architecture Royal School of Library and Information Science

2007Marianne Lykke Nielsen Evaluation – focus and objective Context : tagging as part of information searching and relevance assessment, tagging for recommendation and sharing Hybrid system : investigate whether tagging can be improved by a combination of traditional tag clouds and clouds of controlled descriptors Improve tagging Relevance of tags (perspective, aspects, specificity, exhaustivity, terminology (linguistic level, semantic level, contextual level) Consistency Efficiency (time used) Use (tags selected, clouds consulted, order of consultation) Improve retrieval Effectiveness (degree of match between user and system terminology)

2007Marianne Lykke Nielsen Evaluation – test setting Comparison test : comparison between 1) control, ”vanilla” system and 2) experimental, hybrid system. Open test - taggers know that they use and evaluate two systems Number of taggers : 50 taggers, post graduate students using the Intute Social Science as part of common information seeking strategy Number of documents : 100 documents, covering up to four topics of relevance for the group of taggers Initial Draft Test design : Each tagger tags all documents: System 1: 50 documents System 2: 50 documents Each document is tagged by all 50 taggers: System 1: 25 taggers System 2: 25 taggers 5000 tagging sessions System 1: 2500 System 2: 2500

2007Marianne Lykke Nielsen

Presentation Possible approaches to interoperability –Standards –Combination of KOS STAR – combining KOS and core Ontology EnTag – combining KOS and social tagging ‘folksonomies’ SKOS (standards) based services

SKOS API SKOS API a deliverable of SWAD-Europe Thesaurus Activity SKOS API designed to provide programmatic access to thesauri and related KOS in SKOS Core Example SKOS API calls –getConcept (uri) –getConceptsMatchingKeyword/Regex (string) –getAllConceptRelatives (concept) –getSupportedSemanticRelations –getAllConceptRelatives (concept, relation) –getAllConceptsByPath (concept, relation, distance)

Pilot KOS Browser Client Web Service Developed C# client application for SKOS thesaurus server A 'rich client' browser displays details for SKOS concepts via web service calls Uses subset of SKOS API with extension for semantic expansion

Also ongoing Javascript (AJAX) interface widgets for desktop access to SKOS-based services HTML with JavaScript and CSS styling, using Javascript AJAX calls to communicate with the same (SOAP) web service that the C# demonstrator uses Contact if interested in trying out early prototypes

Future issues More complex services as API protocol elements: more advanced natural language functionality cross-mapping provision data-dependent filters (such as number of postings) semantic expansion as a service –different configurations KOS interface displays by single call –novel interfaces, such as navigation via semantic expansion –Query expansion for various ranked result query services –Term suggestion to assist indexing/annotation –More details: KOS at your Service: Programmatic Access to Knowledge Organisation Systems

SKOS based lightweight semantic services Arguably, information retrieval KOS provide a semantic structure at a suitable granularity for the general problem of search and retrieval where fuzzy aboutness relationship connects concepts and information resources SKOS standard representation, combined with other developments in standard identifiers and service protocols, offers a lightweight approach for a wide variety of annotation, search and browsing oriented applications that don’t require first order logic.

Contact Information Doug Tudhope School of Computing University of Glamorgan Pontypridd CF37 1DL Wales, UK

References Binding C, Tudhope D 2004 KOS at your Service: Programmatic Access to Knowledge Organisation Systems, Journal of Digital Information, 4(4) CIDOC CRM SKOS Simple Knowledge Organisation Systems STAR Project. Tudhope D., Binding C Towards Terminology Services: experiences with a pilot web service thesaurus browser. ASIST Bulletin, 32(5), 6-9, June/July. Available online at Tudhope D, Binding C, Blocks D, Cunliffe D 2006 Query expansion via conceptual distance in thesaurus indexed collections. Journal of Documentation, 62 (4):