Active Data Curation in Libraries: Issues and Challenges ASEE ELD Presentation June 27, 2011 William H. Mischo & Mary C. Schlembach.

Active Data Curation
Curation is the active use of data. It is a lifecycle process.
Curation requires discipline-specific knowledge and experience.
Domain-dependent curation rules and preservation actions must be merged into the scientific workflow processes.
Need to automate data ingest, descriptive metadata creation, preservation, and digital object relationships.

The Grainger Library Active Data Curation Lifecycle Elements
[Diagram. Labeled components: Scientific Workflow; Knowledge Creation Tools; Fedora/Hydra Trusted Digital Repository (OAIS compliant); Metadata Management (METS, PREMIS, MODS, DC, XSLT); Preservation Actions; Curation Rule Engine, operating on metadata and content objects (AIPs, OAI-ORE); SIP packages; DIP packages; Ingest scripts (fixity, integrity, authentication, transformation); Appraisal and Selection; Migration and Emulation Tools; Use, Reuse, Repurposing Tools; Access Mechanisms and E-Scholarship Services, GrIPs; CI-3, CI-5 Responses.]
Curation Rule Engine:
– Domain dependent
– Can be invoked explicitly
– But also automated based on system trigger events
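A minimal sketch of how a rule engine of this kind might behave, assuming rules are plain functions that operate on an object's content and metadata. It is not the Grainger Library implementation: the class names, the rule names, and the "ingest" and "audit" event names are all hypothetical. It shows the two invocation paths named on the slide (explicit calls and system trigger events), using a fixity check as the example rule.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence


@dataclass
class DigitalObject:
    """Hypothetical stand-in for a repository object: content plus metadata."""
    identifier: str
    content: bytes
    metadata: Dict[str, str] = field(default_factory=dict)


# A rule takes an object, may update its metadata, and returns a status string.
Rule = Callable[[DigitalObject], str]


class CurationRuleEngine:
    """Illustrative engine: rules run on explicit request or on trigger events."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}
        self._triggers: Dict[str, List[str]] = defaultdict(list)

    def register(self, name: str, rule: Rule, events: Sequence[str] = ()) -> None:
        """Register a rule and, optionally, the events that trigger it."""
        self._rules[name] = rule
        for event in events:
            self._triggers[event].append(name)

    def invoke(self, name: str, obj: DigitalObject) -> str:
        """Explicit invocation of a single named rule."""
        return self._rules[name](obj)

    def fire(self, event: str, obj: DigitalObject) -> List[str]:
        """Automated invocation: run every rule bound to the event."""
        return [self._rules[name](obj) for name in self._triggers.get(event, [])]


def record_fixity(obj: DigitalObject) -> str:
    """Store a SHA-256 digest of the content in the object's metadata."""
    obj.metadata["sha256"] = hashlib.sha256(obj.content).hexdigest()
    return "fixity recorded"


def verify_fixity(obj: DigitalObject) -> str:
    """Compare the stored digest with a freshly computed one."""
    expected = obj.metadata.get("sha256")
    if expected is None:
        return "no fixity value recorded"
    actual = hashlib.sha256(obj.content).hexdigest()
    return "fixity ok" if actual == expected else "fixity mismatch"


if __name__ == "__main__":
    engine = CurationRuleEngine()
    engine.register("record_fixity", record_fixity, events=["ingest"])
    engine.register("verify_fixity", verify_fixity, events=["audit"])

    obj = DigitalObject("demo-1", b"calibrated readings")
    print(engine.fire("ingest", obj))            # automated, on a trigger event
    print(engine.invoke("verify_fixity", obj))   # explicit invocation
```

In a real repository the domain-dependent rules would be supplied by the discipline, and the events would come from the repository system itself rather than from direct calls to `fire`.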

Say What?
What is the role of the library? The engineering librarian? The campus? The subject discipline?
Libraries are creating content asset preservation systems: Trusted Digital Repositories.
Fedora/Hydra/Archivematica at UIUC Library.
Role for the science/engineering library: connecting data to literature.
Knowledge creation process and libraries.
GrIPs (Group Information Profiles).
NSF Data Management Plans.

What Data Should Be Curated?
Defining data curation:
– DataNet projects: Data Conservancy (Hopkins), DataONE (New Mexico).
– Purdue profiles.
– Raw data and processed data.
We surveyed several groups in specific disciplines:
– Atmospheric Sciences (experimental)
– Biophysics (simulation data)

Atmospheric Science: Experimental Data
Five levels and two data streams:
– Level 1: raw voltages from an instrument
– Level 2: calibrated data derived from raw voltages
– Level 3: image products displaying the data
– Level 4: derived parameters, statistics, etc. from calibrated data
– Level 5: analysis of Level 4 data that winds up in papers, publications, etc.
Two other necessary data streams: ancillary instrument information and metadata.
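The five levels above form a derivation chain, which a curation system could record as provenance. The following is a minimal sketch under stated assumptions: the `Level` and `DataProduct` names and the file names in the demo are hypothetical, and the parent of each product is whatever the depositor records (the slide states that Level 2 derives from Level 1, Level 4 from Level 2, and Level 5 from Level 4, but does not say which level the image products are drawn from).

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Level(IntEnum):
    """The five atmospheric-science data levels listed on the slide."""
    RAW_VOLTAGES = 1
    CALIBRATED = 2
    IMAGE_PRODUCTS = 3
    DERIVED_PARAMETERS = 4
    PUBLICATION_ANALYSIS = 5


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical record for one data product and the product it came from."""
    name: str
    level: Level
    derived_from: Optional["DataProduct"] = None


def lineage(product: DataProduct) -> List[DataProduct]:
    """Return the derivation chain, from the earliest source to the product."""
    chain = []
    current: Optional[DataProduct] = product
    while current is not None:
        chain.append(current)
        current = current.derived_from
    return list(reversed(chain))


if __name__ == "__main__":
    # File names are invented for illustration only.
    raw = DataProduct("run42_voltages.dat", Level.RAW_VOLTAGES)
    calibrated = DataProduct("run42_calibrated.dat", Level.CALIBRATED, raw)
    derived = DataProduct("run42_stats.csv", Level.DERIVED_PARAMETERS, calibrated)
    analysis = DataProduct("figure3_data.csv", Level.PUBLICATION_ANALYSIS, derived)

    for item in lineage(analysis):
        print(int(item.level), item.name)
```

Keeping the chain explicit lets a published Level 5 result be traced back to the calibrated and raw data it depends on, alongside the ancillary instrument information and metadata streams.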

Biophysics: Simulation Data
Modeling of atomic-level molecular interactions.
Three levels:
– Level 1: raw data from a simulation run: positions and velocities of particles; the software is widely used.
– Level 2: various extracts of the raw run data for subsets of particles.
– Level 3: visualization files (movies, images); analysis products generated from the visualization data for publication.
Also necessary are input parameters (starting coordinates, etc.) and other metadata.

Data Management Plan
The Data Management Plan (DMP) is a new mandatory NSF supplementary document for all research proposals.
– Each directorate, including the Engineering Directorate (ENG), is providing specific directions and required elements.
The ENG document:

Data Management Plan
The digital data to be archived includes analyzed data (typically data that will go into articles and papers) and the metadata that defines the data that was generated.
For Engineering Directorate grants, raw data from sensors or other instruments is not required to be archived.

Data Management Plan
Maximum of two pages; does not count against the 15-page limit for proposals.
UIUC Grainger Library has prepared an overview document and template for DMPs. Working on a Wizard.
As part of the NSF Ethics CORE Digital Library, working on an RCR Requirement database and Wizard.