Data catalogues and the data repository ADMIRe JISC MRD

Presentation transcript:

Data catalogues and the data repository ADMIRe JISC MRD Dr Tom Parsons March 2013 Sunday, November 11, 2018 ADMIRe

A world-class university
One of the world’s top 100 universities, Nottingham is recognised globally for ground-breaking research and teaching excellence.
40,000 students from more than 150 countries, two overseas campuses and strong links with universities around the world.
Heavily focused on research: Medical & Health Sciences, Sciences, Engineering, Social Sciences and Arts.
Large research income (£100m), primarily from RCUK, UK/EU government, commercial sources and charities.

Key priorities for ADMIRe: RDM policy
“1.5. The University will provide mechanisms and services for storage, backup, registration, deposit, retention and preservation of research data assets in support of current and future access, during and after completion of research projects.”
Key priorities for ADMIRe:
  Is the current provision good enough?
  Where are the gaps?
  What do we need to provide?

Understanding requirements
Approaches:
  Survey (summer 2012)
  Focus groups (November 2012)
  Interviews (May 2012 onwards)
  Mixture of ADMIRe, in-house, JISC MRD & Sero
Outputs: service model, detailed requirements catalogue, logical models & prototype
Institutional requirements: “Enterprise Architecture compliant”, use and integrate with existing systems

Survey results: Types of data

Survey results: Data storage

Survey results: Metadata…

Sharing data?

Survey results: Total research data estimates
From the survey’s 366 responses: 75 GB average (mean/frequency)

Total research data estimates
75 GB average × approx. number of PIs & post-grads (4,000) = 300 TB (±90%)
Large number of unknowns
A large amount of data, a large number of files and a good case for managing it
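The estimate on this slide can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes decimal units (1 TB = 1000 GB) and treats the ±90% figure as a simple symmetric range around the central estimate; the function name is made up for this example.

```python
def estimate_total_tb(mean_gb_per_researcher: float, researchers: int, uncertainty: float):
    """Scale a per-researcher average up to an institution-wide total.

    Assumes decimal units (1 TB = 1000 GB) and a symmetric uncertainty band.
    """
    total_tb = mean_gb_per_researcher * researchers / 1000
    low_tb = total_tb * (1 - uncertainty)
    high_tb = total_tb * (1 + uncertainty)
    return total_tb, low_tb, high_tb


# Figures from the slide: 75 GB average, roughly 4,000 PIs and post-grads, +-90%.
total, low, high = estimate_total_tb(75, 4000, 0.90)
print(f"Central estimate: {total:.0f} TB, range {low:.0f} to {high:.0f} TB")
```

With an uncertainty this wide, the range spans more than an order of magnitude, which is consistent with the slide's note about the large number of unknowns.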

Focus groups to understand more
Five Faculty based focus groups (30 people in total)
Based upon California Digital Library model

Active data

Archive data

Preservation activities
Columns: Function, Actors, Req., Freq, R, S, A
1 – Tag: Enter metadata describing a bag of research data assets M
2 – Bag: Zip the data files up in a bag C
3 – Transfer +: Transfer a bag to archival storage
4 – Ingest: Ingest a bag into storage
5 – Update: Update (enhance, correct) metadata for a stored bag O L
6 – GetDOI: Get (public, private) DOIs for designated assets
7 – Publish: Publish assets appropriately on landing pages
8 – Relocate: Relocate assets and update locators
9 – Search: Search for assets by keyword or field H
10 – Access: Access metadata and data according to permissions
11 – Notify: Notify actors automatically about data events P
12 – Annotate: Create notes about a bag or its contents
13 – Check: Check (verify) that the contents of a bag are in order
14 – Report: Run reports on aspects of the system (DOI, bag, user)
15 – Administer: Administer permissions and system parameters
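The Tag, Bag and Check functions can be sketched as follows. This is a minimal illustration, assuming that "bag" refers to the BagIt packaging convention (a data/ directory, a bagit.txt declaration, a checksum manifest and a bag-info.txt metadata file). It uses only the Python standard library rather than any tool named in the slides, skips the zip step, and the function names, file names and metadata values are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(source_files: dict[str, bytes], bag_dir: Path, metadata: dict[str, str]) -> None:
    """Tag + Bag: write payload files, a checksum manifest and descriptive metadata."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True)
    manifest_lines = []
    for name, content in sorted(source_files.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")
    (bag_dir / "manifest-sha256.txt").write_text("\n".join(manifest_lines) + "\n", encoding="utf-8")
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n", encoding="utf-8"
    )
    info = "".join(f"{key}: {value}\n" for key, value in metadata.items())
    (bag_dir / "bag-info.txt").write_text(info, encoding="utf-8")


def check_bag(bag_dir: Path) -> list[str]:
    """Check: return payload paths that are missing or no longer match the manifest."""
    problems = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file() or hashlib.sha256(target.read_bytes()).hexdigest() != expected:
            problems.append(rel_path)
    return problems


with tempfile.TemporaryDirectory() as tmp:
    bag = Path(tmp) / "example-bag"
    make_bag(
        {"readings.csv": b"t,value\n0,1.5\n"},  # hypothetical payload
        bag,
        {"External-Description": "Example dataset", "Contact-Name": "A. Researcher"},
    )
    intact = check_bag(bag)   # no problems expected straight after bagging
    (bag / "data" / "readings.csv").write_bytes(b"corrupted")
    damaged = check_bag(bag)  # the altered file should now be reported
```

Storing checksums at bagging time is what makes the later Check step possible: verification needs nothing beyond the bag itself.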

Mapping requirements

Where are we now?

Columns: Solution, Description, Scope, Interfaces/Integrations, Direct Users

Solution: Data Retention Platform
Description: A storage platform that enables storage of “unstructured” data files. BPM Metastorm frontend.
Scope: Storage of files and very basic metadata (file type, size, retention period, user)
Interfaces/Integrations: AD to support access. (Note that Open Access will be supported by providing a persistent account, used by the Research data web site server, that has read-only access to all “Open” data sets.)
Direct Users: Researchers

Solution: Research data search and retrieve web site
Description: Web Site. Expected to be CMS or possibly SharePoint
Scope: Web site with relevant information and screens to search and return results
Interfaces/Integrations:
  1. Data Retention Platform via REST to enable http(s) data transfer.
  2. FAST (embedded function) to allow search from a web page.
  3. Equella (API) to expose metadata onto search results.
  4. Active Directory/LDAP to authenticate file access
Direct Users: Those searching for data sets

Solution: Equella
Description: Metadata Database
Scope: Stores metadata
Interfaces/Integrations: See Metastorm, FAST and Research Web Site
Direct Users: N/A

Solution: FAST
Description: Search Engine
Scope: Provides search results and rich search functionality on the metadata
Interfaces/Integrations:
  1. Potential federation to Primo
  2. Crawl of Equella
Direct Users: Anyone

Solution: Baggit
Description: File collection tool
Scope: Tool to assist researchers in selecting and bringing files into a collection
Interfaces/Integrations: Linked to from Metastorm
Direct Users: PI
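The interaction between search and access control described for the search web site can be sketched in a few lines. This is a toy in-memory model, not the FAST or Equella API: the record fields, the "open"/"restricted" labels and the function name are all assumptions made for illustration. It shows only the idea that unauthenticated searches see "Open" data sets while authenticated users may see more.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetRecord:
    identifier: str
    title: str
    keywords: tuple[str, ...]
    access: str  # assumed labels: "open" or "restricted"


def search(records: list[DatasetRecord], term: str, authenticated: bool = False) -> list[DatasetRecord]:
    """Return records matching the term that the caller is allowed to see."""
    term = term.lower()
    hits = []
    for record in records:
        visible = record.access == "open" or authenticated
        matches = term in record.title.lower() or any(term == k.lower() for k in record.keywords)
        if visible and matches:
            hits.append(record)
    return hits


# Hypothetical catalogue entries.
records = [
    DatasetRecord("ds-001", "Soil moisture readings", ("soil", "hydrology"), "open"),
    DatasetRecord("ds-002", "Soil survey interviews", ("soil", "interviews"), "restricted"),
]

public_hits = search(records, "soil")                      # anonymous visitor
staff_hits = search(records, "soil", authenticated=True)   # authenticated user
```

In the architecture on the slide the authentication decision would come from Active Directory/LDAP; here it is reduced to a boolean flag.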

Columns: Solution, Description, Scope, Interfaces/Integrations, Direct Users

Solution: DMP Online
Description: Online tool providing support for creating a Data Management Plan that is managed to ensure Research Council requirements are met
Scope: Used to create Data Management Plan
Interfaces/Integrations:
  1. Metastorm will link this within curation workflow
  2. Metastorm will take the XML output of this and read key fields directly to automate some metadata creation in Equella
  3. Metastorm will save the output file of this tool
Direct Users: PI

Solution: DOI
Description: Online tool for creating a unique digital object identifier
Scope: Workflow to fork out to this system to allow researcher to create a persistent object identifier.
Interfaces/Integrations: See Metastorm

Solution: Active File Services
Description: File services primarily for storage of active (i.e. not curated) files
Scope: The source of files for curation (“Bagging”).
Interfaces/Integrations: Selectable by browsing using Baggit tool.

Solution: “Other Repository”
Description: Sometimes selectable by browsing using Baggit tool as the source of files for curation (“Bagging”). However, these may be databases or alternative repositories that are used instead. If used, and where possible, the DOI will point to these.
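The second DMP Online integration, reading key fields from the plan's XML output to pre-fill metadata, might look roughly like the sketch below. The XML element names and the target field names are invented for illustration; the slides do not describe the real DMP Online export schema or the Equella metadata schema, and the actual workflow was to be built in Metastorm rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical export; the real DMP Online XML layout is not given in the slides.
SAMPLE_PLAN_XML = """\
<plan>
  <title>Example project</title>
  <principal_investigator>A. Researcher</principal_investigator>
  <funder>Example Research Council</funder>
  <retention_years>10</retention_years>
</plan>
"""

# Assumed mapping from plan elements to metadata field names.
FIELD_MAP = {
    "title": "project_title",
    "principal_investigator": "pi_name",
    "funder": "funder",
    "retention_years": "retention_period_years",
}


def extract_metadata(plan_xml: str) -> dict[str, str]:
    """Copy the mapped elements that are present and non-empty into a metadata record."""
    root = ET.fromstring(plan_xml)
    record = {}
    for element_name, field_name in FIELD_MAP.items():
        element = root.find(element_name)
        if element is not None and element.text:
            record[field_name] = element.text.strip()
    return record


record = extract_metadata(SAMPLE_PLAN_XML)
print(record)
```

Keeping the mapping in a single table means that a change to either schema touches one place, which fits the "Input Once" aim stated for the later phase.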

ADMIRe Phasing: Drop 1 (to June 2013)
Objective: Deliver Key Functions but without over integration
Deliverables:
1. Instructions and links on web site on how and why to use DMP Online
2. Instructions and links on web site on how and why to use DOI
3. Implementation (but not integration) of Baggit for Research users
4. Delivery of Metadata in Equella
   Including instructions and links on web site on how and why to use
5. Creation of Research Data Search Page
   Implementation of FAST search crawl
   Embed of FAST in web page
   Delivery of Results page to include relevant information
6. Metastorm development that:
   Creates User (PI Researcher) interface to Equella
   Provides fields to add all metadata into Equella, including Research Project Information, Subject Specific Information, Technical Metadata
   Allows Researcher to choose when a page is searchable

ADMIRe Phasing: Drop 2 (to Dec 2013)
Deliverables:
1. Delivery of Retention platform
   Delivered outside of ADMIRe project
2. Delivery of Open Access Platform (Subset of Retention platform)
3. Definition and Delivery of End to end workflow automation and integration for data management process with a vision of “Input Once”
   Integrations of Baggit, Agresso Awards Management, DMP Online, DOI
4. Definition and Delivery of a report for Research Councils that:
   Confirms project adherence (at Project close) to funding requirements for data management and access
   Enables non-conformance to be addressed

Reusable outputs
Focus groups/interview formats
Requirements catalogue
Use cases
Survey: questions, write-up etc.
Software? No…

Questions?
tom.parsons@nottingham.ac.uk
ADMIRe Project Manager