Published by Morris Roberts. Modified over 9 years ago.
1
Europeana: Update on Metadata Mapping and Normalisation, Content Ingestion and Aggregation Activities
Robina Clayphan, Interoperability Manager, EDLF
ECDL Workshop – Harvesting Metadata: Practices and Challenges
September 30 2009
2
Introduction
- A look at the metadata schema we use and the elements that must be in a standard form
- The whole ingestion process
- Summary of the aspects of and approach to aggregation
3
Europeana
Europeana brings together and makes available digital content from:
- Four cultural heritage sectors: museums, archives, libraries, audio-visual archives
- Twenty-nine countries: EU plus Norway and Switzerland
- Twenty-six languages
- Four types of material: image, sound, video, text
... need for a metadata lingua franca ...
4
ESE V3.2
Europeana Semantic Elements (ESE) V3.2, developed for the prototype
- A Dublin Core-based application profile
- Cross-domain schema for heterogeneous data
- Not intended to capture the full semantics of providers' data
- 37 Dublin Core terms, used principally to describe the objects
- 12 Europeana-coined terms, used to support portal functionality
- Consistent data is needed for the portal to work
5
The Dublin Core elements
6
Europeana elements
Element | Who is responsible | Function
europeana:isShownAt or europeana:isShownBy | Provider must provide at least one of these elements, both if applicable | URL; links to object
europeana:object | Provider, if appropriate to the data | URL; source of thumbnail
europeana:provider | Provider must provide this element | Controlled list; facet
europeana:type | Provider must provide this element | Controlled list; facet
europeana:unstored | Provider, only if appropriate to your data | Text string; container element
europeana:country | Europeana is responsible for providing this and all the elements below | Facet
europeana:hasObject | Europeana | System use
europeana:language | Europeana | Facet
europeana:uri | Europeana | System identifier
europeana:usertag | Europeana | User-provided tags (future)
europeana:year | Europeana | Facet, timeline
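The provider-side rules on this slide can be sketched as a small check. This is a minimal illustration, not the Content Checker's own logic: the function names, the `<record>` wrapper and the sample data are hypothetical, and the `europeana` namespace URI is an assumption that should be verified against the ESE V3.2 XML schema.

```python
import xml.etree.ElementTree as ET

# The Dublin Core namespace is standard. The Europeana namespace URI is an
# assumption for this sketch; check it against the ESE V3.2 XML schema.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "europeana": "http://www.europeana.eu/schemas/ese/",
}

# The four controlled type values named in the slides, compared case-insensitively.
ALLOWED_TYPES = {"text", "image", "sound", "video"}


def _text(record, tag):
    """Return the stripped text of the first matching child element, or ''."""
    element = record.find(tag, NS)
    if element is None or element.text is None:
        return ""
    return element.text.strip()


def check_provider_elements(record):
    """Return a list of problems with the provider-supplied Europeana elements."""
    problems = []
    # At least one of the two link elements must be present.
    if not (_text(record, "europeana:isShownAt") or _text(record, "europeana:isShownBy")):
        problems.append("need europeana:isShownAt or europeana:isShownBy")
    if not _text(record, "europeana:provider"):
        problems.append("missing europeana:provider")
    record_type = _text(record, "europeana:type")
    if not record_type:
        problems.append("missing europeana:type")
    elif record_type.lower() not in ALLOWED_TYPES:
        problems.append("europeana:type not in controlled list: " + record_type)
    return problems


# Hypothetical record, invented for illustration only.
SAMPLE = """
<record xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:europeana="http://www.europeana.eu/schemas/ese/">
  <dc:title>Example object</dc:title>
  <europeana:provider>Example Library</europeana:provider>
  <europeana:type>IMAGE</europeana:type>
  <europeana:isShownAt>http://example.org/object/1</europeana:isShownAt>
</record>
"""

print(check_provider_elements(ET.fromstring(SAMPLE.strip())))  # prints []
```

The check returns a list rather than raising an error, so a provider could see every problem with a record in one pass.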
10
Normalised elements
- Language: ISO 639-1 standard two-character code
- Country: ISO 3166 standard
- Year: four-digit year from the Gregorian calendar (YYYY), generated where possible from the supplied date
- Provider: controlled list of names, in the language of the provider
- Type: controlled list (in English) of four types (Text, Image, Sound, Video), mapped from the diverse types used in the source data (by the provider)
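As a rough illustration of what this normalisation involves, the sketch below derives a four-digit year from a free-text date, maps source types onto the four controlled values, and maps language names onto two-character codes. The mapping tables and function names are hypothetical and deliberately tiny; a real mapping would follow the Normalisation Guidelines and each provider's own vocabulary.

```python
import re

# Hypothetical mapping from provider-specific type terms to the four
# controlled values; a real table would be built per provider.
TYPE_MAP = {
    "photograph": "Image",
    "painting": "Image",
    "book": "Text",
    "manuscript": "Text",
    "recording": "Sound",
    "film": "Video",
}

# Small illustrative subset of ISO 639-1 two-character codes.
LANGUAGE_CODES = {"english": "en", "french": "fr", "german": "de", "dutch": "nl"}

# Exactly four digits that are not part of a longer run of digits.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def normalise_year(date_text):
    """Return the first standalone four-digit year in date_text, or None."""
    match = YEAR_PATTERN.search(date_text)
    return match.group(1) if match else None


def normalise_type(source_type):
    """Map a provider's type term to a controlled value, or None if unknown."""
    return TYPE_MAP.get(source_type.strip().lower())


def normalise_language(name_or_code):
    """Return a two-character code for a known language name or code, else None."""
    value = name_or_code.strip().lower()
    if value in LANGUAGE_CODES.values():
        return value
    return LANGUAGE_CODES.get(value)


print(normalise_year("12 March 1887"))   # 1887
print(normalise_type(" Photograph "))    # Image
print(normalise_language("Dutch"))       # nl
```

Returning `None` for unrecognised input keeps unmapped values visible for manual review rather than guessing at a value.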
11
Mapping and Normalisation
Three key reference documents for providers:
- ESE Specification V3.2
- Normalisation Guidelines V1.2
- ESE V3.2 XML schema + explanatory text
All available from the “Provide Content” section of the Europeana Group pages: http://group.europeana.eu/web/guest/provide_content
13
Content Ingestion
... starting right from the beginning
14
Global Europeana ingestion workflow
15
Activity diagram: Steps I5 to I8
16
Content Ingestion
Europeana has provided a Content Checker tool, which has two parts:
- The Content Ingestor
  - Allows uploading of a data set
  - Validation against the ESE V3.2 XML schema
  - Importing the data into the database
  - Indexing of data
  - Caching of thumbnails
- The Test Portal
  - Separate from the operational portal
  - Allows the provider to search for uploaded data
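The order of the ingestor steps (validate, import, index, cache) followed by a test search can be pictured with a toy in-memory pipeline. Everything here is hypothetical: the stage functions, the record fields and the stop-at-first-failure behaviour are assumptions made for illustration and do not describe how the Content Checker is implemented.

```python
def validate(dataset):
    """Stand-in for schema validation: every record needs an id and a title."""
    for record in dataset["records"]:
        if not record.get("id") or not record.get("title"):
            raise ValueError(f"invalid record: {record!r}")


def import_records(dataset):
    """Load the validated records into a dictionary acting as the database."""
    dataset["database"] = {r["id"]: r for r in dataset["records"]}


def index(dataset):
    """Build a simple inverted index from title words to record ids."""
    inverted = {}
    for record_id, record in dataset["database"].items():
        for word in record["title"].lower().split():
            inverted.setdefault(word, set()).add(record_id)
    dataset["index"] = inverted


def cache_thumbnails(dataset):
    """Remember the thumbnail source URL for each record that has one."""
    dataset["thumbnails"] = {
        record_id: record["object"]
        for record_id, record in dataset["database"].items()
        if record.get("object")
    }


# Stages run in the order listed on the slide.
STAGES = [
    ("validate", validate),
    ("import", import_records),
    ("index", index),
    ("cache", cache_thumbnails),
]


def ingest(records):
    """Run all stages in order; return (dataset, error message or None)."""
    dataset = {"records": records}
    for name, stage in STAGES:
        try:
            stage(dataset)
        except ValueError as error:
            return dataset, f"{name} failed: {error}"
    return dataset, None


def search(dataset, word):
    """Test-portal style lookup: ids of records whose title contains word."""
    return sorted(dataset.get("index", {}).get(word.lower(), set()))


records = [
    {"id": "a1", "title": "Old Map", "object": "http://example.org/a1.jpg"},
    {"id": "a2", "title": "Map of Delft"},
]
dataset, error = ingest(records)
print(error, search(dataset, "map"))
```

Running validation first means a data set that fails is never imported or indexed, which mirrors the idea of checking results before anything reaches the operational system.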
17
Content Ingestor
Select “new data set”: the ingestor automatically creates a new ID (“null05” in this example)
18
Content Ingestor - upload
19
Content Ingestor - validate
20
Import
21
Index
22
Cache
23
Test Portal - search
24
Aggregation and the Content Strategy
Moving on to look at various aspects of aggregation in Europeana: the need for it and the approach to it.
25
Aggregation - terminology
A Content Provider
- an organisation that provides metadata that enables access to its digital objects
An Aggregator
- collects metadata from a group of content providers
- transmits them to Europeana
- helps content providers with guidance on conformance with Europeana norms
- transforms metadata if necessary
- supports the content providers with administration, operations and training
26
Roles and benefits
Content providers
- Know their content and data best: fewer mapping errors
- Look at the results before they are ingested into the operational system
Aggregators
- Know the needs of the providers (domain, level)
- Play a bridging role between providers and Europeana: single point of contact, conduit for information in both directions
Europeana
- Supporting role for consultation, co-ordination, standardisation
- Management of the 10 million objects
- Offer the cross-domain and multi-lingual service
27
Organisational Model
28
Types of aggregator
Matrix of aggregators:
- domain: cross-domain, single domain, thematic
- level of operation: regional, national, European, global

Domain / Geographic coverage | Regional | National | European | Worldwide
Cross-domain (horizontal) | Thuis in Brabant | CulturaItalia | Europeana |
Single-domain (vertical) | MovE (museums in East Flanders) | Direcção-Geral de Arquivos (Portuguese archives) | Dismarc (music), TEL (books), EFG (movies) | World Digital Library, WorldCat
Thematic, cross-domain: Judaica, ArXiv.org
Thematic, single domain: Great War Archive
29
Why aggregation?
- November 2008: 5 million items in Europeana
- July 2009: content from over 1000 providers
- July 2010: target of 10 million items
- Many individual organisations asking to contribute
- Currently six projects aggregate content for Europeana (amongst other objectives), with another three projects starting later this year
Europeana Group site at: http://group.europeana.eu/web/guest/home
31
Why aggregation?
- Labour-intensive administration and ingestion processes
  - Not due to the amount of data, but to the number of organisations
- Aggregation provides economies of scale, allowing the Europeana Office to remain relatively small
- Promoting aggregation and providing services and expertise to aggregators will be key to Europeana’s Content Strategy
Europeana is a small organisation!
32
Aggregation activities
- Aggregators survey
  - Establish shared issues and need for support
- Formation of Aggregators group
  - Council of Content Providers and Aggregators is now part of the Europeana governance structure
- Training for aggregators
  - Generic and bespoke training days as the need arises
- Identifying potential aggregators
- “EuropeanaLabs” for Aggregators
  - Test environment for content delivery and/or software development
33
Aggregation activities
Handbook for aggregators. Content to be decided as part of the survey, but likely to cover:
- Europeana source code, APIs, content checker etc.
- Technical documentation for participating in Europeana
- Templates and documentation for budget planning, fundraising, revenue generation, sustainability
- Templates and documentation for administrative and organisational aspects of running an aggregator
- Templates and documentation on IPR and European Licensing framework
- Documentation for establishing political and network support
- Templates and documentation for dissemination activities
Wiki for aggregator issues
34
Thank you! robinaclayphan@kb.nl
36
isShownBy 1
37
isShownAt 2