Research Data Lifecycle Management Workshop Report
Curt Hillegas
9/8/2011
The workshop
- NSF-funded
- Joint initiative between CASC and the EDUCAUSE ACTI CCI Working Group
- 70 to 75 attendees
- 1.5 days
- 4 speakers
- 7 break-out sessions
- 2 panels
Secure Research Data
- Create a national working group to guide compliance with federal standards for research computing data
- Catalog solutions for remote access to restricted data
- Find data solutions for Clinical and Translational Science Awards
Policy
- Create a catalog of issues (and approaches to solutions) with data ownership and responsibility
- Workshop for campus leaders (VPR and Provost)
- Workshop for community/discipline leaders
- Develop a discipline-blind framework for data policies and standards
- Researchers, librarians and IT professionals approach Provost and VPR together
Assessment and Selection of Research Data
- Develop a framework for creating and implementing workflows that allow researchers to be a partner in the process
- Educate key audiences about the need for curatorial practice and key concepts
  - Researchers
  - Graduate students
- Encourage policy makers to rethink roles in key units of the institution
Funding and Operation
- Repository builders must collaborate with others from the start
- Make data movable: able to move from one caretaker to another
- Funding will change throughout the lifecycle of the data
- Prepare repositories for handing off data
- Perform a study of existing models and create a report
Partnering Researchers, IT Staff, Librarians and Archivists
- Communication of what’s out there
- Institute more training for grad students
- Substantial workshop report
- Hold a workshop to define best institutional practices in communicating between researchers and librarians
- Survey our campuses on data management practices
Standards for Provenance, Metadata and Discoverability
- Common framework for data: some emerging, like Metadata Encoding and Transmission Standard (METS)
- Role of ontologies: domains recognizing standardized terminologies
- Instrumented data: if numeric data is off, then data is useless
- Metadata needs to be captured at point of data creation
- Need standards of provenance: what’s the purpose of creating this data?
- Relationships between datasets are critical
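As an illustration of the points on capturing metadata at creation, recording purpose, and tracking relationships between datasets, here is a minimal Python sketch. It is not METS and not a method proposed at the workshop; the record layout, the field names (`purpose`, `derived_from`, and so on), and the sample identifiers are all assumptions made for this example.

```python
import hashlib
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata record; field names are illustrative only."""
    identifier: str
    purpose: str        # provenance: why the data was created
    creator: str
    sha256: str         # fixity value, so later corruption can be detected
    created_utc: str
    derived_from: list = field(default_factory=list)  # links to parent datasets


def describe_at_creation(identifier, payload, purpose, creator, derived_from=None):
    """Build the metadata record at the moment the data bytes are produced."""
    return DatasetRecord(
        identifier=identifier,
        purpose=purpose,
        creator=creator,
        sha256=hashlib.sha256(payload).hexdigest(),
        created_utc=datetime.now(timezone.utc).isoformat(),
        derived_from=list(derived_from or []),
    )


# A raw dataset and a cleaned dataset derived from it (sample values are made up).
raw = describe_at_creation(
    "run-001-raw", b"1.0,2.0,3.0\n", "calibration run", "hypothetical-lab"
)
clean = describe_at_creation(
    "run-001-clean", b"1.0,2.0\n", "outliers removed", "hypothetical-lab",
    derived_from=[raw.identifier],
)

print(json.dumps(asdict(clean), indent=2))
```

Storing the checksum and the `derived_from` link alongside the data when it is written means the record can travel with the dataset if it is later handed to another caretaker.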
Partnering Funding Agencies, Research Institutions and Communities, and Industrial and Corporate Partnerships
- Joint study of the feasibility of the “digital sheepskin”
- Conduct an aggregated study of total cost of ownership (TCO) models using a trusted party (academia) for storage in perpetuity or for ten years
- Identify the missing pieces of the research data software stack, and encourage collaborations between academia and industry
- A study on criteria for throwing data away, by discipline
- Continue to emphasize that data volume is growing much faster than our ability to move data around
- Think about where we need to site data
- What are the possible models for joint activity with industrial partners?
Summary
- Researchers, Librarians/Archivists, IT Professionals, Funding Agencies, and Vendors must work together
- Create frameworks of best practices that allow for discipline-specific implementation
- Involve Provosts and Chief Research Officers
- Start educating early in researchers’ careers