CESSDA Workplan: Metadata Harvesting Tool


1 CESSDA Workplan: Metadata Harvesting Tool
Ørnulf Risnes (NSD) John Shepherdson (UKDS) EDDI 15, Copenhagen 3 December 2015

2 CESSDA
Consortium of European Social Science Data Archives
Bring together social science data archives across Europe
Developing a pan-European Research Infrastructure (RI)
Facilitate researcher access to important resources of relevance to the European social science research agenda regardless of the location of either researcher or data

Abstract
The UK Data Service is providing access to its data and metadata holdings by making public some of its web service APIs. These REST APIs facilitate a self-service approach for our data producers and researchers alike, whilst also enabling 3rd party developers to write applications that consume our APIs and present novel and exciting ways of accessing and viewing some of the data collections that we hold.
We have put new infrastructure in place to enable the provision of these APIs and have already run an App Challenge (for external developers to build mobile applications against our APIs) and added a data collection usage ‘leader board’ as initial tests of the functionality, capacity, account management, developer documentation and performance aspects of our public APIs.
The main infrastructure elements are an API management service, HTTP caching and routing and various API endpoints. The other major consideration was a set of design principles for the APIs so that developers have a consistent and predictable experience. This presentation will elaborate on the key components of the infrastructure and the API design guidelines.

3 Commissioned by CESSDA
2015 work plan launched RI development
Metadata Harvester task is part of the work plan
  “The objective of this Task is to select and/or develop and implement into CESSDA service a metadata harvesting tool that will enable the efficient compilation and operation of the CESSDA Product and Service Catalogue, the CESSDA Secure Access Portal, and other data management tasks and data supply services.”
Open Source bundle due Q2 2016

4 Task Objectives
Produce an easy to use metadata harvesting service
Extensible design - use plugin architecture for inputs and outputs
  wide range of metadata sources must be harvested
  data to be emitted in a variety of metadata standards (more to life than DDI!)
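
The plugin architecture for inputs and outputs could look roughly like the sketch below. It is an illustration only: the two registries, the decorator names, the "static-demo" and "plain-text" labels and the record fields are all hypothetical and are not taken from the CESSDA harvester.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical registries: repository type -> reader, output format -> writer.
Record = Dict[str, str]
INPUT_PLUGINS: Dict[str, Callable[[str], Iterable[Record]]] = {}
OUTPUT_PLUGINS: Dict[str, Callable[[Record], str]] = {}


def input_plugin(repo_type: str):
    """Register a function that reads records from one kind of source."""
    def register(func):
        INPUT_PLUGINS[repo_type] = func
        return func
    return register


def output_plugin(fmt: str):
    """Register a function that serialises a record to one output format."""
    def register(func):
        OUTPUT_PLUGINS[fmt] = func
        return func
    return register


@input_plugin("static-demo")
def read_static(uri: str) -> Iterable[Record]:
    # Stand-in for a real metadata source; it only yields fixed data.
    yield {"id": uri + "/1", "title": "Example study"}


@output_plugin("plain-text")
def write_plain(record: Record) -> str:
    return f"{record['id']}: {record['title']}"


def harvest(uri: str, repo_type: str, fmt: str) -> List[str]:
    """Read with the plugin for repo_type, emit with the plugin for fmt."""
    reader = INPUT_PLUGINS[repo_type]
    writer = OUTPUT_PLUGINS[fmt]
    return [writer(record) for record in reader(uri)]
```

Under these assumptions, supporting a new source type or a new output standard means registering one more function; `harvest` itself does not change.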

5 Delivery Partners
NSD and UKDS
  Design, implement, quality assure and document the metadata harvester to produce an Open Source bundle
  NSD lead the Task
FSD, SND, DDA
  Test the OS project - build additional input/output plugins and harvest a variety of metadata source types and languages

6 Foundations
Build on outputs of Data without Boundaries WP12
  Prototype Resource Discovery Portal (DwB-RDP)
  Harvests DDI 1.2 from CESSDA SPs’ Nesstar servers and converts to DwB-Disco format
  See harvesting ingest system report and DwB Resource Discovery Portal description for more details
Needs to interoperate with the CESSDA metadata model, which is yet to be defined

7 Harvester as complex system
[Diagram: Metadata sources, Consumers]

8 Harvester as complex stateful system
[Diagram: Metadata sources, Data store, Consumers. Images sourced via Google Images.]

9 Metadata Model
Based on RDF-Disco
How is it different?
  will perform gap analysis
Harvesting challenges and normalisation
Simplifying assumption
  normalisation handled by Consumer
  resumption and completeness are responsibility of Consumer
  => Harvester as stateless system

10 Functional description

11 Harvester as simple adaptor
[Diagram: Metadata sources, Harvester plugins, Consumers. Images sourced via Google Images.]

12 Harvester Extension Mechanism
Webhooks
  Harvester calls a provided URI
  The service at that URI handles the call
Images sourced via Google Images from pusher.com
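
A minimal sketch of the webhook idea, using only the Python standard library. The slides do not specify the HTTP method, payload or response format, so the POST request, the plain-text body, the JSON reply and every name here (`TypeHandler`, `call_webhook`, `demo`) are assumptions made for illustration. The handler runs on a local loopback port so the example is self-contained.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class TypeHandler(BaseHTTPRequestHandler):
    """Hypothetical handler behind the provided URI: turns raw text into a record."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        raw = self.rfile.read(length).decode("utf-8")
        body = json.dumps({"title": raw.strip().upper()}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def call_webhook(handler_uri: str, raw_metadata: str) -> dict:
    """Harvester side: POST the raw metadata to the provided URI."""
    request = urllib.request.Request(
        handler_uri,
        data=raw_metadata.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )
    # Bypass any proxy configured in the environment for this local call.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def demo() -> dict:
    server = HTTPServer(("127.0.0.1", 0), TypeHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        uri = f"http://127.0.0.1:{server.server_address[1]}/handle"
        return call_webhook(uri, "example study")
    finally:
        server.shutdown()
        server.server_close()
```

The point of the pattern is that the harvester needs to know only the URI; whoever operates the handler decides how an unfamiliar metadata type is processed.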

13 Functional description
Queries:
  ListRecordsForRepository
  GetRecordFromRepository
Arguments:
  Repository/Record URI
  Repository type
  Type handler URI (optional)
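
The two queries and their arguments might map onto a stateless interface along these lines. Only the query and argument names come from the slide; the Python function names, the `HarvestRequest` type, its field names and the in-memory `FAKE_REPOSITORY` are hypothetical stand-ins for a real remote source.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical in-memory stand-in for a remote repository, keyed by record URI.
FAKE_REPOSITORY: Dict[str, Dict[str, str]] = {
    "http://example.org/repo/1": {"title": "Study one"},
    "http://example.org/repo/2": {"title": "Study two"},
}


@dataclass(frozen=True)
class HarvestRequest:
    """The arguments named on the slide; the field names are assumptions."""
    uri: str                                # repository URI or record URI
    repository_type: str                    # label selecting an input plugin
    type_handler_uri: Optional[str] = None  # optional webhook for other types


def list_records_for_repository(request: HarvestRequest) -> List[str]:
    """ListRecordsForRepository: record URIs found under the repository URI."""
    prefix = request.uri.rstrip("/") + "/"
    return sorted(u for u in FAKE_REPOSITORY if u.startswith(prefix))


def get_record_from_repository(request: HarvestRequest) -> Dict[str, str]:
    """GetRecordFromRepository: one record; nothing is kept between calls."""
    return dict(FAKE_REPOSITORY[request.uri])
```

Because every call carries all of its arguments, a consumer that wants to resume or verify completeness can simply repeat the listing and fetch whichever records it is missing.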

14 Success Criteria
Impact Analysis
  How will it affect CESSDA Service Providers, CESSDA ERIC, EU Researchers?
Quality/Usability of OS bundle
  Establish maturity rating using NASA Reuse Readiness Levels, prior to testing
  Testing undertaken by FSD, SND, DDA against System Usability Scale
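
For reference, a System Usability Scale questionnaire is conventionally scored as in the sketch below: ten answers on a 1 to 5 scale, with odd-numbered items contributing (answer - 1) and even-numbered items contributing (5 - answer), and the sum multiplied by 2.5 to give a value from 0 to 100. The slides do not say how the testing partners will collect or aggregate responses, so the function name and input shape are assumptions.

```python
from typing import Sequence


def sus_score(responses: Sequence[int]) -> float:
    """Score one completed SUS questionnaire (ten answers, each 1 to 5)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r < 1 or r > 5 for r in responses):
        raise ValueError("each answer must be between 1 and 5")
    total = 0
    for index, answer in enumerate(responses):
        if index % 2 == 0:   # items 1, 3, 5, 7, 9 are positively worded
            total += answer - 1
        else:                # items 2, 4, 6, 8, 10 are negatively worded
            total += 5 - answer
    return total * 2.5
```

A respondent who answers 3 to every item scores 50.0; the best possible pattern (5 on odd items, 1 on even items) scores 100.0.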

15 NASA Reuse Readiness criteria
Ten levels for each of the following:
  Documentation
  Extensibility
  Intellectual Property
  Modularity
  Packaging
  Portability
  Standards Compliance
  Support
  Verification and Testing
  Security
  Internationalisation and Localization

16 Deliverables
Metadata harvester as a service
  Provides an API for clients to consume it
Administration tool
  Used to monitor and manage the harvester service
  May be readable to many, but will be writable by few
Publicly available Open Source Bundle
  Code base and documentation
  Facilitates creation of new harvesters and output formatters

17 How will it be developed?
Where will it run?

18 Development Environment
Common, cloud-based tool chain
  lower barriers to entry for all Service Providers
  no need to install and configure locally
  code repositories
  automated build and test
  documentation area
Ensure CESSDA has access to the source code, configuration files and technical documentation that underpin its products and services

19 Production Environment
Short-term cloud-based hosting
Experience will feed in to CESSDA’s requirements for compute and storage in order to host and run the components of the Research Infrastructure

20 Thanks for your attention

