Use of semantic technologies for publishing and re-using cultural and scientific heritage data. Antoine Isaac, SEMIC conference, June 18 2012, Brussels
Europeana facts. “Single, direct and multilingual access point to the European cultural heritage.” (European Parliament, 27 September 2007). 23.5 million objects; more than 2,200 institutions; 33 countries.
Who submits data to Europeana? GLAMs (galleries, libraries, archives, museums) submit through aggregators. Aggregator types: horizontal, vertical, national, regional and dark aggregators. Names shown on the slide: Culture Grid, APEnet (archives), The European Library (libraries), ATHENA (museums), European Film Gateway (film archives), ELocal, Flanders museums.
What is submitted to Europeana? 1. Thumbnails 2. Metadata 3. Links to digital objects online
TEXT IMAGE VIDEO AUDIO
Making metadata work for Europeana. Building a search engine on top of metadata is difficult. Traditional metadata quality problems (correctness, coverage) apply, especially when data is so heterogeneous: 100s of formats, multilingual data. We currently use a simple flat interoperability format, ESE (Europeana Semantic Elements).
More semantics-enabled services. Enhance access by semantics: query expansion, clustering of results. Exploiting various relations: "located in", "more specific concept"… Goal: to make richer data and services available to us and others. Semantics are already there in the original metadata (thesauri, classifications…), but ESE loses that information.
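Query expansion over a "more specific concept" relation can be illustrated with a minimal Python sketch. The thesaurus below is invented for the example and is not Europeana data; the function name `expand_query` is likewise hypothetical. It only shows the general idea of following narrower-concept links from a search term.

```python
# Toy thesaurus: each concept maps to its more specific (narrower) concepts.
# The concepts are made up for illustration; they are not Europeana data.
NARROWER = {
    "painting": ["portrait", "landscape painting"],
    "portrait": ["self-portrait"],
}


def expand_query(term, narrower=NARROWER):
    """Return the term plus all transitively narrower concepts."""
    expanded = []
    stack = [term]
    while stack:
        concept = stack.pop()
        if concept in expanded:
            continue  # guards against cycles in the thesaurus
        expanded.append(concept)
        stack.extend(narrower.get(concept, []))
    return expanded


# A search for "painting" would then also match objects indexed
# under "portrait", "self-portrait" or "landscape painting".
print(expand_query("painting"))
```

A real service would probably read such relations from the providers' thesauri rather than from a hard-coded dictionary.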
Building a "semantic layer": context
Matches the interest in linked data in libraries, archives and museums (LOD-LAM)
Available Library Linked Data http://www.w3.org/2005/Incubator/lld/XGR-lld-vocabdataset/
Available Library Linked Data. Element sets/schemas/ontologies: SKOS, Dublin Core, OAI-ORE… Value vocabularies/thesauri/authority lists: LCSH, VIAF… Datasets: British Library, Chronicling America…
Europeana and Linked Data. Provide trusted, reference data for cultural objects. Promote the use of the technology. Promote the exchange of data in the community and with third parties: open (meta)data!
Europeana and Linked Data http://vimeo.com/36752317
Some steps in production services
Re-use and linking. Currently: GeoNames, GEMET… Data re-use can be serendipitous: sources come from our domain (VIAF, UDC) or from others (Eurovoc). Multilingual resources are key for us.
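Why multilingual resources matter for linking can be shown with a small Python sketch. The gazetteer, its identifiers and the function names are hypothetical, and the exact-match strategy is only an assumption made for the example; it is not a description of how Europeana links to GeoNames or GEMET.

```python
# Hypothetical multilingual place gazetteer, keyed by made-up identifiers.
GAZETTEER = {
    "place:1": {"en": "Brussels", "fr": "Bruxelles", "nl": "Brussel"},
    "place:2": {"en": "Vienna", "de": "Wien", "fr": "Vienne"},
}


def build_label_index(gazetteer):
    """Map every label, in every language, to the places that carry it."""
    index = {}
    for place_id, labels in gazetteer.items():
        for label in labels.values():
            index.setdefault(label.casefold(), set()).add(place_id)
    return index


def link_place(value, index):
    """Return ids of places whose label equals the value, ignoring case."""
    return sorted(index.get(value.strip().casefold(), set()))


index = build_label_index(GAZETTEER)
# A French metadata value and a German one resolve to the same kind of
# identifier only because the gazetteer carries labels in several languages.
print(link_place("Bruxelles", index))
print(link_place("Wien", index))
```

With English-only labels, neither of the two lookups above would find a match.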
Europeana Data Model. Representing objects and other entities: persons, places... Linking to internal or external data sources. Separating original data from enrichments. Enabling domain-specific data profiles. The model re-uses existing vocabularies. http://pro.europeana.eu/edm-documentation
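The idea of separating original data from enrichments can be mimicked in a few lines of Python. This is only a sketch of the principle, not the EDM itself: the `Record` class, the example identifiers and the choice of properties are all assumptions made for illustration, and the EDM documentation linked above remains the reference for the actual model.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """One cultural object with two separate sets of statements."""
    object_id: str
    provider: dict = field(default_factory=dict)    # statements as delivered
    enrichment: dict = field(default_factory=dict)  # statements added later

    def merged(self):
        """Combine both views, tagging every value with its source."""
        view = {}
        for source, statements in (("provider", self.provider),
                                   ("enrichment", self.enrichment)):
            for prop, values in statements.items():
                for value in values:
                    view.setdefault(prop, []).append((value, source))
        return view


record = Record(
    object_id="http://example.org/object/1",  # hypothetical identifier
    provider={"dc:title": ["View of Brussels"], "dc:coverage": ["Bruxelles"]},
)
# The enrichment adds a link to a (hypothetical) place resource without
# overwriting the string the provider originally supplied.
record.enrichment["dc:coverage"] = ["http://example.org/place/brussels"]
print(record.merged()["dc:coverage"])
```

Keeping the two sets apart means an enrichment can be corrected or withdrawn later without touching what the provider delivered.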
data.europeana.eu: Europeana Linked Open Data Pilot. Fully open metadata; 2.4 M objects; 200 individual providers; 15 countries. We created a Linked Open Data pilot. Largely by word of mouth, we invited some of our partners to allow us to publish their data as LOD. This data (about 3m objects) is now online. To learn more, see data.europeana.eu
Challenges of semantic technology. Really big impact on processes, if you wish so. Requires a lot of education/evangelisation. More complex data modeling is an art: finding the right balance and linking to requirements. Linking datasets remains difficult: it needs tooling and the involvement of stakeholders.
Ongoing work. EDM implementation in: data harvesting; search, browse etc.; data publishing interfaces (Search API, Linked Open Data, data dumps, OAI-PMH…). Metadata enrichment.
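Of the publishing interfaces listed, OAI-PMH is a plain HTTP protocol, so a harvesting request is just a URL with a few standard parameters. The Python sketch below builds a `ListRecords` request; the base URL and the set name are placeholders, not actual Europeana endpoints, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec    # optional selective harvesting by set
    if from_date:
        params["from"] = from_date  # optional lower bound on the datestamp
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, for illustration only.
print(list_records_url("http://example.org/oai", set_spec="museums"))
```

A complete harvester would also have to follow the resumption tokens that a repository returns when a result list is split over several responses.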
Summary: benefits of semantic technologies for Europeana. Vocabularies and datasets to re-use. Flexible approach to building and re-using standards. More flexible approach to interoperability: custom vocabularies co-existing with standard ones. No constraints on the granularity of the data model. Technical ease of connecting and publishing data. Vision relates to open data strategies.
ISA? Contribution to data modeling and exchange: core vocabularies give good hints on what is needed. Source of data for re-use. Helping our data to be re-used. ADMS.
Thank you Antoine Isaac aisaac@few.vu.nl