Interoperability in the Cultural Heritage Domain. Lourens van der Meij, VU Amsterdam - KB (part of the slides by A. Isaac), October 3rd, 2008.



Background. CATCH (NWO): Continuous Access To Cultural Heritage, a set of computer science research projects applied to cultural heritage (libraries, museums). STITCH: SemanTic Interoperability To access Cultural Heritage. Interoperability: exchanging metadata (standardization) and integrating metadata (translating, linking).

Intention: show, through example applications, that the following are important in the cultural heritage domain. Integration of data, collections, and services. Interoperability: data standardized so that it can be used across different applications, and functionality made reusable via services. Creating mappings, that is, semantic links between data from different sources.

First: illustrate integrated access to collections in the cultural heritage (CH) domain by looking at a use case. Introduction of the use case: about vocabularies; the collections that will be integrated; faceted browsing; what we want (-> demo); requirements and details.

(Integrated) access to collections. Collections: records of books, pieces of art, and so on, with electronic access through a web portal. STITCH focuses on semantics: structured access using the available knowledge sources, not full-text search. Records: metadata, that is, information about the object (author, date, subject). CH institutes often maintain knowledge structures (KOS), or vocabularies, to facilitate storage, access, and maintenance. Subject metadata and access through KOS are the focus of STITCH.

Vocabularies (knowledge structures, KOS): thesauri and classification systems for structuring collections and describing the content, form, and other aspects of collection elements. There are many vocabularies, also within the KB. STITCH is a cooperation between VU Amsterdam (KRR group), the National Library (KB), and MPI Nijmegen. The KB maintains on the order of 10 vocabularies internally, and 20 or more external vocabularies play a role. Why so many? History, specialized collections, and particular views on the collection and theories about how access should be provided. Examples of vocabularies appear in the demos.

Vocabularies. There are many different kinds of vocabularies, with many different representations, data formats, and methods of access. Integrated access requires: a standardized representation of vocabularies and collections; standardized access (=> services); links between elements of vocabularies (alignment of vocabularies). Next: an example of integration.

Illustration, STITCH use case: integrated access to two collections. KB: geïllumineerde manuscripten (illuminated manuscripts). BnF: Mandragore, manuscrits enluminés (illuminated manuscripts). STITCH focus: integration (alignment, techniques and standards) and interoperability (RDF, SKOS). These aspects are discussed after the first demo.

KB Illuminated Manuscripts

KB Illuminated Manuscripts: Iconclass

Mandragore

Faceted browsing: access the collection using the structure of the vocabularies. Different dimensions: subject, author, and so on. Where a vocabulary has a hierarchy, use it to group objects together, for example lions, giraffes, zebras -> animals, so that they can be distinguished as a group.
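The grouping idea behind faceted browsing can be sketched in a few lines of Python. This is only an illustration under assumptions: the `BROADER` hierarchy, the record identifiers, and the function names are hypothetical and are not taken from the STITCH demo.

```python
from collections import defaultdict

# Hypothetical toy hierarchy: each concept points to its broader concept.
BROADER = {
    "lions": "animals",
    "giraffes": "animals",
    "zebras": "animals",
    "wheat": "plants",
}

# Hypothetical records: (object id, subject concept).
RECORDS = [
    ("ms-001", "lions"),
    ("ms-002", "zebras"),
    ("ms-003", "wheat"),
    ("ms-004", "giraffes"),
]


def ancestors(concept, broader):
    """Return the concept followed by all of its broader concepts."""
    chain = [concept]
    while chain[-1] in broader:
        chain.append(broader[chain[-1]])
    return chain


def build_facet(records, broader):
    """Map every concept, and each of its ancestors, to the objects filed under it."""
    facet = defaultdict(set)
    for object_id, subject in records:
        for concept in ancestors(subject, broader):
            facet[concept].add(object_id)
    return facet


facet = build_facet(RECORDS, BROADER)
print(sorted(facet["animals"]))  # objects indexed with any kind of animal
```

Selecting the facet value "animals" then returns every object indexed with a narrower concept, which is the grouping the slide describes. The sketch assumes the hierarchy has no cycles.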

What we have

What we want

Demo: KB Illuminated Manuscripts, BnF Mandragore manuscripts. mandraNewNONE, amphibians http://galjas.cs.vu.nl:33333/MANDRA-SV-ICE- mandraNewNONE Wheat

Integrated access. Integrated semantic access requires: a standardized representation of vocabularies and collections; standardized access (=> services); links between elements of vocabularies.

Standardized representation: use of Semantic Web techniques. “Things” are represented as “resources” identified by URIs, across any application and data set. Values are simple strings and numbers (literals) or URIs. Properties are typed, named links between URIs, and between URIs and literals. Theory and reasoning methods are available. => Interoperability, some standardization. => Standardization is still needed on how to represent CH objects (XML: Dublin Core), vocabularies (SKOS), and links between elements of vocabularies.
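As a rough illustration of this data model, the following Python sketch represents statements as subject, property, value triples and prints them in an N-Triples-like form. The `example.org` URIs and all class and function names are hypothetical; only the Dublin Core element namespace is a real, published namespace. A production system would use an RDF library rather than this hand-rolled serializer.

```python
from typing import NamedTuple, Union


class URI(str):
    """A resource identifier; a plain string subclass for illustration."""


class Literal(NamedTuple):
    value: str
    lang: str = ""  # optional language tag, e.g. "en"


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]


def term(t):
    """Render a URI or literal in N-Triples-like syntax."""
    if isinstance(t, Literal):
        escaped = t.value.replace("\\", "\\\\").replace('"', '\\"')
        suffix = f"@{t.lang}" if t.lang else ""
        return f'"{escaped}"{suffix}'
    return f"<{t}>"


def to_ntriples(triple):
    """Render one triple as a single statement line."""
    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.obj)} ."


DC = "http://purl.org/dc/elements/1.1/"
manuscript = URI("http://example.org/object/ms-001")  # hypothetical object URI

triples = [
    # A property linking a URI to a literal.
    Triple(manuscript, URI(DC + "title"), Literal("Book of Hours", "en")),
    # A property linking a URI to another URI (a vocabulary concept).
    Triple(manuscript, URI(DC + "subject"), URI("http://example.org/concept/virgin")),
]

for t in triples:
    print(to_ntriples(t))
```

Because both the object and its subject are identified by URIs, a second application can refer to the same resources without sharing a database with the first.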

SKOS: example. [Figure: a resource with rdf:type skos:Concept, skos:prefLabel “the Virgin” and skos:prefLabel “la Vierge”, a skos:broader link to another concept, and skos:inScheme pointing to a resource with rdf:type skos:ConceptScheme]

SKOS (Simple Knowledge Organization System): SKOS offers building blocks to represent KOSs in RDF. Objects: Concept and ConceptScheme. Lexical properties (multilingual): prefLabel, altLabel. Semantic relations: broader, narrower, related. Notes: scopeNote, definition, …
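To make these building blocks concrete, here is a small Python sketch that writes one concept in Turtle syntax. The two labels come from the SKOS example slide; the concept and scheme URIs and the function name `concept_to_turtle` are hypothetical. The sketch assumes at least one label and that labels contain no quotation marks.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"


def concept_to_turtle(uri, scheme, pref_labels, broader=None):
    """Serialize one SKOS concept as Turtle.

    pref_labels maps a language tag to the preferred label in that language.
    """
    lines = [f"<{uri}> a skos:Concept ;", f"    skos:inScheme <{scheme}> ;"]
    if broader:
        lines.append(f"    skos:broader <{broader}> ;")
    labels = [
        f'    skos:prefLabel "{text}"@{lang}'
        for lang, text in sorted(pref_labels.items())
    ]
    # Labels are separated by ';' and the last one closes the statement with '.'.
    lines.append(" ;\n".join(labels) + " .")
    return f"@prefix skos: <{SKOS_NS}> .\n\n" + "\n".join(lines)


turtle = concept_to_turtle(
    uri="http://example.org/concept/virgin",          # hypothetical
    scheme="http://example.org/scheme/iconography",   # hypothetical
    pref_labels={"en": "the Virgin", "fr": "la Vierge"},
    broader="http://example.org/concept/saints",      # hypothetical
)
print(turtle)
```

One preferred label per language is what makes the concept usable in a multilingual portal: the French and English interfaces show different strings for the same URI.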

Vocabulary alignment. Aim: finding semantic correspondences between vocabulary elements, for example “klassieke ruïnes” (classical ruins) ≈ “landschap met ruïnes” (landscape with ruins), and “maagd Maria” (Virgin Mary) = “Heilige Moeder” (Holy Mother). This should be done (semi-)automatically, because vocabularies are big (tens of thousands of concepts) and they change.

Automatic alignment techniques. Lexical: labels of entities and textual definitions. Structural: structure of the vocabularies. Background knowledge: using a shared conceptual reference to find links. Extensional: object information (e.g. book indexing). Example labels: “céréale, grain, blé” and “blé”.
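The lexical technique is the simplest of the four to sketch. The Python example below pairs concepts from two vocabularies whenever they share a label after normalization (case folding and accent stripping). It is a minimal illustration, not the matcher used in STITCH; the concept identifiers and function names are hypothetical, and the labels reuse the “céréale, grain, blé” example.

```python
import unicodedata


def normalise(label):
    """Case-fold a label and strip accents so that 'Blé' and 'ble' compare equal."""
    decomposed = unicodedata.normalize("NFKD", label.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch)).strip()


def lexical_matches(vocab_a, vocab_b):
    """Return sorted (concept_a, concept_b) pairs that share a normalized label."""
    index = {}
    for concept, labels in vocab_b.items():
        for label in labels:
            index.setdefault(normalise(label), set()).add(concept)
    matches = set()
    for concept, labels in vocab_a.items():
        for label in labels:
            for other in index.get(normalise(label), ()):
                matches.add((concept, other))
    return sorted(matches)


# Hypothetical concept ids; each maps to its list of labels.
vocab_a = {"a:cereal": ["céréale", "grain", "blé"]}
vocab_b = {"b:wheat": ["Blé"], "b:rice": ["riz"]}

print(lexical_matches(vocab_a, vocab_b))
```

Exact label matching finds the easy correspondences only; it misses pairs such as “maagd Maria” and “Heilige Moeder”, which is why the other techniques are needed.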


Extensional statistical alignment: object information (e.g. book indexing). [Figure: Thesaurus 1 and Thesaurus 2 linked through a shared collection of books; example concepts “Dutch Literature” and “Dutch”]
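One way to turn shared book indexing into a similarity score is the Jaccard overlap between the sets of books annotated with each concept. The slide does not name a measure, so Jaccard is an assumption here, as are the book identifiers, the extra concepts, and the threshold value.

```python
def jaccard(a, b):
    """Size of the intersection divided by the size of the union of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def extensional_alignment(index_1, index_2, threshold=0.5):
    """Score every concept pair by the overlap of the books indexed with them."""
    candidates = []
    for c1, books_1 in index_1.items():
        for c2, books_2 in index_2.items():
            score = jaccard(books_1, books_2)
            if score >= threshold:
                candidates.append((score, c1, c2))
    return sorted(candidates, reverse=True)  # best-scoring pairs first


# Hypothetical data: concept -> set of books indexed with that concept.
thesaurus_1 = {"Dutch Literature": {"b1", "b2", "b3"}, "Physics": {"b4"}}
thesaurus_2 = {"Dutch": {"b1", "b2", "b3", "b5"}, "Science": {"b4", "b6"}}

for score, c1, c2 in extensional_alignment(thesaurus_1, thesaurus_2):
    print(f"{score:.2f}  {c1} ~ {c2}")
```

This approach only works for books indexed with both vocabularies, and concepts with very few books give unreliable scores, so a real implementation would likely correct for small sets.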

Results. 1: ( ) Schilderijen - schilderkunst; 2: ( ) Kwaliteitszorg - kwaliteitsmanagement; 3: ( ) Personeelsmanagement - personeelsbeleid; 4: ( ) Beeldende kunsten - beeldende kunst; 5: ( ) Nederlands - Nederlandse taalkunde; 17: ( ) Diabetes mellitus - suikerziekte

Alignment: no trivial solution. Current techniques are not reliable as the only source of knowledge. What is a good alignment? What are the evaluation criteria? => What will it be used for? Usage scenarios: integrated search; reindexing; thesaurus merging; navigation (=> faceted browsing).

What next. Outline: evaluation and lessons learned; what next -> second use case: reindexing; (vocabulary service); conclusion.

Why usage scenarios: the evaluation of an alignment depends on its use. Real-world applications provide a test of the quality of alignments. Requirements on alignments also depend on their use: what kinds of links should be distinguished? Optional demo: evaluation. Next: reindexing, the scenario nearest to a real-world application.

Situation at Dutch libraries and the National Library (KB). The KB has two large collections: the Deposit collection (DEPOT: all Dutch-language publications) and its own Scientific collection. Subject indexing uses two completely different indexing systems: Brinkman and GOO. There is a common automation system for the Netherlands and Europe (OCLC-Pica). The metadata of a book contains many fields. A book or publication is given metadata by different libraries, using many different vocabularies.

Reindexing. The KB has about 20 people indexing books daily; about 20,000 books are indexed per year. Even internally, indexing follows different vocabularies. Indexing means adding keywords and classification information to books. Some books arrive with indexing already done by other libraries (public libraries, Biblion). If Biblion indices, or combinations of them, could be translated to KB indices (Brinkman), that would mean less work for the KB.

WinIBW. OCLC (PICA): automation system for libraries in the Netherlands, also used elsewhere in Europe. Online Public Access Catalogue (OPAC). WinIBW: internet access to the Pica system (local and central) for adding records, adding metadata, and searching records. Demo, closest to a real-world application.

Reindexing Biblion -> Brinkman. Fietstochten (cycling tours), Kapellen (chapels), Beesel (a place name), Heiligenbeelden (statues of saints), … -> Brinkman? Use the alignment: Bibl:Fietstochten -> Brinkman? Bibl:Kapellen -> Brinkman? DEMO (example: z sel gd? 79)
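The reindexing step can be pictured as a lookup through the alignment: each Biblion subject on a record is replaced by the Brinkman concepts it is mapped to, ranked by confidence. The Python sketch below is hypothetical throughout. The slides do not give the Brinkman targets, so the `brinkman:EXAMPLE-n` identifiers and the confidence values are placeholders, and `suggest_subjects` is an illustrative name.

```python
# Hypothetical alignment: Biblion subject -> list of (Brinkman concept, confidence).
ALIGNMENT = {
    "bibl:Fietstochten": [("brinkman:EXAMPLE-1", 0.97)],
    "bibl:Kapellen": [("brinkman:EXAMPLE-2", 0.88), ("brinkman:EXAMPLE-3", 0.40)],
}


def suggest_subjects(source_subjects, alignment, min_confidence=0.0):
    """Translate source subjects into ranked target-vocabulary suggestions."""
    best = {}
    for subject in source_subjects:
        # Subjects without a mapping contribute no suggestion.
        for target, confidence in alignment.get(subject, []):
            if confidence >= min_confidence:
                best[target] = max(confidence, best.get(target, 0.0))
    # Highest confidence first; ties are broken alphabetically.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


record_subjects = ["bibl:Fietstochten", "bibl:Kapellen", "bibl:Beesel"]
for concept, confidence in suggest_subjects(record_subjects, ALIGNMENT):
    print(f"{confidence:.2f}  {concept}")
```

In this sketch "bibl:Beesel" has no mapping and is silently skipped; an indexer would still have to handle such subjects by hand.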


Result

Reindexing: under evaluation. Improvements: use other metadata; adapt the scenario (pass 95% confidence records). Many other uses.
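One possible reading of "pass 95% confidence records" is a triage step: suggestions at or above the threshold are accepted automatically, and the rest go to a human indexer. That reading is an assumption, and the record identifiers, subject identifiers, and the `triage` function in this Python sketch are hypothetical.

```python
def triage(suggestions, threshold=0.95):
    """Split suggestions into automatically accepted ones and ones needing review.

    Each suggestion is a (record id, suggested subject, confidence) tuple.
    """
    accepted, review = [], []
    for record_id, subject, confidence in suggestions:
        target = accepted if confidence >= threshold else review
        target.append((record_id, subject))
    return accepted, review


# Hypothetical suggestions produced by an alignment-based reindexing step.
suggestions = [
    ("rec-1", "brinkman:EXAMPLE-1", 0.97),
    ("rec-2", "brinkman:EXAMPLE-2", 0.88),
    ("rec-3", "brinkman:EXAMPLE-4", 0.95),
]

accepted, review = triage(suggestions)
print("accepted:", accepted)
print("review:  ", review)
```

Raising the threshold trades less manual work for fewer automatically indexed records, which is the kind of choice the evaluation would need to inform.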

Sketch of the vocabularies relevant to the KB

Integrated access: services through the internet. Protocols: SOAP, REST, … Collection access? Vocabulary access, alignment access.

Lessons. Semantic Web techniques can make interoperability and integration of collections easier. Aligning vocabularies is useful in different situations. Alignment methods need to be fine-tuned to the application they are meant for. When introducing new techniques, interaction between the CH field and scientific institutes is very valuable. Standardization of access to collections and vocabularies still needs to be dealt with (a prototype has been developed).

Concepts. An ontology, in both computer science and information science, is a formal representation of a set of concepts within a domain and the relationships between those concepts. It is used to reason about the properties of that domain, and may be used to define the domain. Metadata (meta data, or sometimes metainformation) is "data about data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, for example a database schema.

Concepts. A library classification is a system of coding and organizing library materials (books, serials, audiovisual materials, computer files, maps, manuscripts, realia) according to their subject and allocating a call number to that information resource. Similar to classification systems used in biology, bibliographic classification systems group similar entities together, typically arranged in a hierarchical tree structure. In information technology, a thesaurus represents a database or list of semantically orthogonal topical search keys. In the field of Artificial Intelligence, a thesaurus may sometimes be referred to as an ontology.