Semantic Web and Linked Data for cultural heritage materials Approaches in Europeana Antoine Isaac Vrije Universiteit Amsterdam Europeana DANS Linked Data.

Presentation transcript:

Semantic Web and Linked Data for cultural heritage materials: Approaches in Europeana
Antoine Isaac, Vrije Universiteit Amsterdam / Europeana
DANS Linked Data and RDF workshop, Den Haag, July 28th, 2010

A web of cultural heritage data?

The current portal

Towards semantic search: facets

Building a search engine on top of metadata is difficult
Intrinsic quality problems: correctness, coverage
Especially when the data is so heterogeneous
– Hundreds of formats
– From flat 5-field records to 100-node XML trees
– Language issues!
We currently use a simple interoperability format
– A quick win that is quickly showing its limits

We can make better use of institutions’ original metadata
Accommodate their different practices
– Data structures and semantics
Access objects via a semantic layer of vocabularies for subjects, persons, places…
Semantic ThoughtLab: experimenting with solutions

Towards semantics-enabled search
Building a "semantic layer" to help access content

Towards semantics-enabled search
Enhance access to Europeana content through semantics
– Query expansion, clustering of results
Exploiting various types of relations
– "located in", "lived in", "is more specific concept"…
Semantics are already there, in metadata and in the "controlled vocabularies" used in metadata
– Thesauri, classifications…
This requires making them properly machine-accessible

Prototype: Europeana Thought Lab

Semantic auto-completion

Clustering of results

Baseline: matching concepts' labels
Controlled place name from a vocabulary at the Rijksmuseum
Metadata for the object

A "more specific Egypte"?

Metadata for the object

A place more specific than the Egypt one
Semantic information on the Giza place in the Rijksmuseum vocabulary

Following other relations

Following other relations – creator
Metadata for the object
Controlled person name from a vocabulary at the Rijksmuseum

Following other relations – match
Information on Gustave Le Gray from the Rijksmuseum vocabulary
Matched to a "Gustave Le Gray" from another vocabulary

Following other relations – death place
Information on Gustave Le Gray from the Union List of Artist Names (Getty)

Following other relations – death place
Information on Cairo from the Thesaurus of Geographic Names (Getty)
Matched to "Cairo" from another vocabulary…

A hell of relations?
Well, they were in the original data; we just had to make them explicit!
Cultural heritage institutions often have a wealth of metadata to share and exploit

Enabling bits & pieces
Exploiting semantic links in CH vocabularies
– Rijksmuseum thesaurus: concept “Giza” is narrower than concept “Egypte”
Mapping/alignment between CH vocabularies
– Louvre’s “Égypte” is equivalent to Rijksmuseum’s “Egypte”
Enrichment of existing metadata
– The string “Egypt” in a metadata record indicates the concept of Egypt defined in the Rijksmuseum thesaurus
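
The three ingredients on this slide (hierarchical links, cross-vocabulary mappings, enrichment of plain strings) can be combined in a small search sketch. This is only an illustration in Python. The concept identifiers (rma:Egypte, louvre:Egypte, rma:Giza), the label table and the three records are assumptions made up for the example, not Europeana data or code.

```python
# Toy data only: identifiers, labels and records are illustrative assumptions.
NARROWER = {"rma:Egypte": {"rma:Giza"}}          # broader concept -> narrower concepts
EQUIVALENT = {"louvre:Egypte": {"rma:Egypte"}}   # mapping between two vocabularies
LABELS = {"egypt": "rma:Egypte", "egypte": "rma:Egypte", "giza": "rma:Giza"}

RECORDS = [
    {"id": "obj1", "place": "Giza"},
    {"id": "obj2", "place": "Egypt"},
    {"id": "obj3", "place": "Paris"},
]


def enrich(value):
    """Map a plain metadata string to a vocabulary concept, if a label matches."""
    return LABELS.get(value.strip().lower())


def expand(concept):
    """Return the concept, its equivalents, and all transitively narrower concepts."""
    seen = set()
    todo = [concept]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(NARROWER.get(current, ()))
        todo.extend(EQUIVALENT.get(current, ()))
        # Equivalence is symmetric: also follow mappings stored the other way round.
        todo.extend(src for src, targets in EQUIVALENT.items() if current in targets)
    return seen


def search(concept, records=RECORDS):
    """Return ids of records whose enriched place falls under the expanded query."""
    wanted = expand(concept)
    return [r["id"] for r in records if enrich(r["place"]) in wanted]


print(search("louvre:Egypte"))
```

A query phrased with the Louvre concept reaches Rijksmuseum records about Giza because the mapping is followed first and the narrower link second. The record about Paris is left out because its string matches no concept under the expanded query.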

SKOS, Knowledge Organization Systems and Linked Data
SKOS allows representing (simple) KOS data as RDF

animals
  NT cats
cats
  UF domestic cats
  RT wildcats
  BT animals
  SN used only for domestic cats
domestic cats
  USE cats
wildcats
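
As a rough illustration of what "representing KOS data as RDF" involves, the sketch below turns the thesaurus entries from this slide into SKOS-style triples. It is plain Python with no RDF library. The `ex:` prefix, the `@en` language tag and the helper names are assumptions for the example. The mapping of NT/BT/RT/UF/SN to skos:narrower, skos:broader, skos:related, skos:altLabel and skos:scopeNote follows the usual reading of these thesaurus codes.

```python
# The thesaurus entries from the slide, keyed by term.
THESAURUS = {
    "animals": {"NT": ["cats"]},
    "cats": {
        "UF": ["domestic cats"],
        "RT": ["wildcats"],
        "BT": ["animals"],
        "SN": ["used only for domestic cats"],
    },
    "domestic cats": {"USE": ["cats"]},
    "wildcats": {},
}

LINKS = {"NT": "skos:narrower", "BT": "skos:broader", "RT": "skos:related"}
NOTES = {"UF": "skos:altLabel", "SN": "skos:scopeNote"}


def uri(term):
    """Mint a prefixed name for a term (the ex: namespace is hypothetical)."""
    return "ex:" + term.replace(" ", "_")


def to_skos(thesaurus):
    """Convert thesaurus entries into (subject, predicate, object) triples."""
    triples = []
    for term, entry in thesaurus.items():
        if "USE" in entry:
            # Non-preferred term: in this toy data its label is already carried
            # by the UF entry of the preferred term, so no concept is created.
            continue
        triples.append((uri(term), "rdf:type", "skos:Concept"))
        triples.append((uri(term), "skos:prefLabel", f'"{term}"@en'))
        for code, values in entry.items():
            for value in values:
                if code in LINKS:
                    triples.append((uri(term), LINKS[code], uri(value)))
                elif code in NOTES:
                    triples.append((uri(term), NOTES[code], f'"{value}"@en'))
    return triples


for s, p, o in to_skos(THESAURUS):
    print(f"{s} {p} {o} .")
```

The printed lines are Turtle-like statements. A real conversion would also declare prefixes and a concept scheme.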

SKOS, KOSs and LD SKOS allows bridging across KOSs from different contexts

SKOS is used!
Many libraries – not a surprise!
– Swedish National Library’s Libris catalogue and thesaurus
– Library of Congress’ vocabularies, including LCSH
– DNB’s Gemeinsame Normdatei (incl. SWD subject headings)
– Documentation at BnF’s RAMEAU subject headings
– OCLC’s DDC classification and VIAF
– STW economy thesaurus
– National Library of Hungary’s catalogue and thesauri (example)
Other fields
– Wikipedia categories through DBpedia
– New York Times subject headings
– IVOA astronomy vocabularies
– GEMET environmental thesaurus
– UMTHES
– Agrovoc
– Linked Life Data
– TaxonConcept
– UK Public sector vocabularies (e.g., )

KOS alignments?
Many of them are linked to some other resource
– LCSH, SWD and RAMEAU interlinked through MACS mappings
– GND linked to DBpedia and VIAF
– Libris linked to LCSH
– Agrovoc to CAT, NAL, SWD, GEMET
– NYT to Freebase, DBpedia, GeoNames
DBpedia links are overwhelming
– Hungary, STW, TaxonConcept, GND…

Enabling bits & pieces (cont'd)
Appropriate data model for objects
– Generic constructs for creation, title, subject, etc. that are useful for querying
Flexible data model
– Semantic Web ontology-linking features allow staying close to the original data while keeping the generic notions above

Formal semantics, metadata schemas and querying
The query:
  ?x vra:subject ?y
  ?y skos:broader rma:Egypt
The existing description:
  rma:gezicht_in_cairo rma:depicts rma:Cairo
  rma:Cairo skos:broader rma:Egypt
Why is there a match? In the Europeana ontology, every rma:depicts statement implies a vra:subject statement
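
The reasoning step behind this match can be spelled out in a few lines. The sketch below is a hand-rolled illustration in Python, not the Europeana implementation. It assumes the implication on the slide is stated as an rdfs:subPropertyOf axiom, applies that one RDFS rule until nothing new is derived, and then evaluates the query pattern over the result.

```python
TRIPLES = {
    ("rma:gezicht_in_cairo", "rma:depicts", "rma:Cairo"),
    ("rma:Cairo", "skos:broader", "rma:Egypt"),
    # Assumed schema-level axiom expressing "rma:depicts implies vra:subject".
    ("rma:depicts", "rdfs:subPropertyOf", "vra:subject"),
}


def infer_subproperties(triples):
    """Apply the RDFS subPropertyOf rule until no new triple appears."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        sub_of = {(s, o) for s, p, o in inferred if p == "rdfs:subPropertyOf"}
        for s, p, o in list(inferred):
            for sub, sup in sub_of:
                if p == sub and (s, sup, o) not in inferred:
                    inferred.add((s, sup, o))
                    changed = True
    return inferred


def objects_about_places_in(region, triples):
    """Evaluate the pattern: ?x vra:subject ?y . ?y skos:broader <region>"""
    return sorted(
        x
        for x, p, y in triples
        if p == "vra:subject" and (y, "skos:broader", region) in triples
    )


print(objects_about_places_in("rma:Egypt", TRIPLES))                       # no inference
print(objects_about_places_in("rma:Egypt", infer_subproperties(TRIPLES)))  # with inference
```

Without the inference step the query finds nothing, because the original record only uses rma:depicts. After inference the derived vra:subject statement makes the generic query match, and the original statement is still there unchanged.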

Where are the challenges?
Semantic conversion of data
– Using appropriate data models
– Enriching legacy metadata
Semantic alignments
– Between description ontologies: vra:depicts rdfs:subPropertyOf dc:subject
– Between concepts in controlled vocabularies: iconclass:bird skos:closeMatch ddc:bird

Alignment of semantic references

Where are the challenges? (cont'd)
Semantic alignment (cont'd)
– Find correspondences between large vocabularies
– In a multilingual context
Scalability
– Plugging the semantic features into the Europeana production environment

The Europeana Data Model (EDM)
With input from Carlo Meghini, Guus Schreiber, Stefan Gradmann, Maxx Dekkers, Steffen Hennicke, Victor de Boer et al. from Europeana V1

Rationale of EDM
Precursor: ESE (Europeana Semantic Elements)
– represents the lowest common denominator for object metadata
– converts datasets to a Dublin Core-like standard
– forces interoperability
– major drawback: original metadata is lost
– most values are simple strings
EDM goals
– preserve original data while still allowing for interoperability
– Semantic Web representation
A community-driven effort
– Core experts, validation by representatives of various CH domains

EDM requirements & principles
1. Distinction between “provided object” (painting, book, program) and digital representation
2. Distinction between object and metadata record describing an object
3. Allow for multiple records for the same object, containing potentially contradictory statements about an object
4. Support for objects that are composed of other objects
5. Standard metadata format that can be specialized
6. Standard vocabulary format that can be specialized
7. EDM should be based on existing standards

EDM basics
OAI ORE for organization of metadata about an object
Dublin Core for metadata representation
SKOS for vocabulary representation
+ Links to CIDOC-CRM and other shared ontologies

Dublin Core
EDM uses the latest version of DCMI Metadata Terms for a core of semantically interoperable properties
– And for backward compatibility, cf. ESE
Specified with an RDF model
Specialization of the 15 original DC elements
Can be specialized itself
– see requirement -> this is a crucial distinction with ESE
Used in the richest way possible
– Pointers to resources

SKOS: vocabulary publication on the Web
Already seen…

OAI ORE
Specification:
Specified with an RDF model
Four key notions (RDF classes)
– Object: the book/painting/program being described
– Aggregation: organizes object information from a particular provider (museum, archive, library)
– Proxy: the object as viewed in a metadata record
– Digital representation: some digital form of the object with a Web address

The Example

Aggregation organizes data of a provider
(diagram: aggregation, digital representation, object, provenance metadata)

Proxy: metadata record for an object
(diagram: proxy, object metadata)

Multiple aggregations = multiple providers
(diagram: aggregation of DMF, aggregation of Louvre)

Multiple aggregations = multiple providers
(diagram: DMF proxy, Louvre proxy, Louvre title, DMF title, the “real” painting)

Europeana is “just” a special provider with processed/enriched metadata
(diagram: Europeana aggregation, enriched metadata, Europeana landing page)
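
To make the aggregation and proxy pattern from the preceding slides concrete, here is a minimal Python sketch. It models one painting described by two providers, each with its own aggregation and proxy, and then gathers every provider's title for that object. ore:Aggregation, ore:Proxy, ore:proxyFor and ore:proxyIn are terms of the OAI-ORE vocabulary. All `ex:` identifiers and both title strings are hypothetical placeholders, since the transcript does not give the actual values.

```python
# One "real" object, described independently by two providers.
OBJECT = "ex:object/painting"

TRIPLES = [
    # Louvre's view of the object (placeholder identifiers and title).
    ("ex:louvre/aggregation", "rdf:type", "ore:Aggregation"),
    ("ex:louvre/proxy", "rdf:type", "ore:Proxy"),
    ("ex:louvre/proxy", "ore:proxyFor", OBJECT),
    ("ex:louvre/proxy", "ore:proxyIn", "ex:louvre/aggregation"),
    ("ex:louvre/proxy", "dc:title", "Title as recorded by the Louvre"),
    # DMF's view of the same object (placeholder identifiers and title).
    ("ex:dmf/aggregation", "rdf:type", "ore:Aggregation"),
    ("ex:dmf/proxy", "rdf:type", "ore:Proxy"),
    ("ex:dmf/proxy", "ore:proxyFor", OBJECT),
    ("ex:dmf/proxy", "ore:proxyIn", "ex:dmf/aggregation"),
    ("ex:dmf/proxy", "dc:title", "Title as recorded by DMF"),
]


def titles_by_aggregation(obj, triples):
    """Collect each provider's titles for one object, keyed by aggregation."""
    proxies = [s for s, p, o in triples if p == "ore:proxyFor" and o == obj]
    result = {}
    for proxy in proxies:
        aggregation = next(
            o for s, p, o in triples if s == proxy and p == "ore:proxyIn"
        )
        result[aggregation] = [
            o for s, p, o in triples if s == proxy and p == "dc:title"
        ]
    return result


for aggregation, titles in titles_by_aggregation(OBJECT, TRIPLES).items():
    print(aggregation, titles)
```

Because the titles hang off the proxies and not off the object itself, the two providers can disagree without either statement overwriting the other. A Europeana aggregation with enriched metadata would simply be a third proxy in the same structure.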

A flexible model: different semantic grains
Cf. goal: “preserve original data while still allowing for interoperability”
Keep data expressed as close as possible to the original model
Use mappings to a more interoperable level

A flexible model: objects, events and the rest
Preserving and exploiting original data also means being compatible with descriptions beyond the simple object level
Also crucial for semantic enrichment

A flexible model: object and events (2)
Classes and properties for event-, agent-, place-centric modeling
Instances of (local) vocabularies using skos:Concept
Using RDF, EDM allows any kind of network to be attached to a provided object.

A flexible model: object and events (3)

Advanced modeling in EDM
Relations between provided objects
– Part-whole links for complex (hierarchical) objects
– Derivation and versioning relations
– Relations between provided objects, for instance artistic derivation between works
(diagram: ens:isRepresentationOf, ens:isNextInSequence)
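
As one possible use of the sequence relation named on this slide, the sketch below rebuilds the reading order of the parts of a complex object from ens:isNextInSequence statements. It assumes that the subject of such a statement comes immediately after its object, which is how the property is usually read. The page identifiers are hypothetical.

```python
SEQUENCE = "ens:isNextInSequence"

# Each triple reads: <later part> isNextInSequence <earlier part>.
TRIPLES = [
    ("ex:page3", SEQUENCE, "ex:page2"),
    ("ex:page2", SEQUENCE, "ex:page1"),
]


def order_parts(triples):
    """Return the parts in reading order, following the sequence links."""
    previous = {s: o for s, p, o in triples if p == SEQUENCE}
    following = {earlier: later for later, earlier in previous.items()}
    parts = set(previous) | set(previous.values())
    # The first part is the only one that does not follow anything.
    starts = [part for part in parts if part not in previous]
    if len(starts) != 1:
        raise ValueError("expected exactly one first part")
    ordered = [starts[0]]
    while ordered[-1] in following:
        ordered.append(following[ordered[-1]])
    return ordered


print(order_parts(TRIPLES))
```

The sketch handles a single linear chain only. Branching sequences or several independent chains would need a more careful traversal.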

Linked data and cultural heritage?

The case for linked data in cultural heritage
Not just a more sophisticated way to represent data!
Ease of getting data from external sources
– Just go to the URI and fetch the RDF there
Ease of publishing data
– Linked data as a dissemination channel for Europeana data
Ease of linking across datasets
– Linked data as a dissemination channel for Europeana data
Object identification as cornerstone
– Records are just a side feature!

From a movement supported by researchers
To much wider awareness
– Open government initiatives, libraries…
Continuing effort: show the benefits of collaborating on a cultural heritage data web
– Library Linked Data W3C incubator
– Encouraging open linked data adoption
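
"Going to the URI and fetching the RDF there" usually relies on HTTP content negotiation: the client asks for an RDF serialization in the Accept header. The sketch below shows this with Python's standard library only. The media-type preference list and the example URI are assumptions. Whether a given server honours the header depends on that server.

```python
import urllib.request

# Preferred RDF serializations, most wanted first (an assumed preference order).
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_rdf_request(uri):
    """Prepare an HTTP GET that asks the server for RDF instead of HTML."""
    return urllib.request.Request(uri, headers={"Accept": RDF_ACCEPT})


def fetch_rdf(uri, timeout=10):
    """Dereference the URI and return (content type, body text)."""
    request = build_rdf_request(uri)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        content_type = response.headers.get("Content-Type", "")
        body = response.read().decode("utf-8", errors="replace")
    return content_type, body


if __name__ == "__main__":
    # Hypothetical URI; replace it with a real linked data resource to try it.
    request = build_rdf_request("http://example.org/resource/1")
    print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch_rdf` on a URI that serves linked data should return the RDF description of the resource, which can then be handed to an RDF parser.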

Linked Library Cloud beginning 2008 [Ross Singer, Code4Lib2010]

Linked Library Cloud mid-2010
Plus: Germany NL, Hungary NL, STW, GEMET, NYT, Agrovoc
[Ross Singer, Code4Lib2010]

Is that a surprise?
Not really, let’s have a look at a real-world case…

Johan Stapel, Koninklijke Bibliotheek
KOS & collection

A broad range of datasets
That describe the same objects
Or related objects
Which are about similar subjects
Which were made by the same persons
Or related persons
In the same places
Etc…

Thanks!
Europeana.eu team
Web and Media, Vrije Universiteit Amsterdam
EuropeanaConnect project