Data Sharing, Small Science, and Institutional Repositories Melissa H. Cragin & Carole L. Palmer Center For Informatics Research in Science and Scholarship.

Presentation transcript:

Data Sharing, Small Science, and Institutional Repositories
Melissa H. Cragin & Carole L. Palmer, Center for Informatics Research in Science and Scholarship, Graduate School of Library and Information Science, University of Illinois
Jacob R. Carlson & Michael Witt, Purdue University Libraries

A view from the Institutional Repository
- Advancing university-based cyberinfrastructure is dependent on our understanding of how to support data practices and needs.
- Sharing is at the heart of success: collecting, storing, and making use of data can only come after the means for sharing are in place.
- We cannot collect and curate all data, particularly in a way that facilitates effective re-use.
- We will need to work with researchers to develop selection and appraisal guidelines, and data services.

Data Curation Profiles Project
Project focus: which data are researchers willing to share, when, and with whom?
Objectives:
- derive requirements for managing data sets in IRs
- develop policies for archiving and access
- identify librarian roles & skill sets for supporting data management, sharing & curation
Disciplines:
- Biochemistry, Biology, Civil Engineering, Electrical Engineering, Food Sciences, Earth and Atmospheric Sciences, Soil Science
- Anthropology, Geology, Plant Sciences, Kinesiology, Speech and Hearing, Earth and Atmospheric Sciences, Soil Science

Methods
- Institutional Review Board for approval of Human Subjects Research
- increasingly focused, materials-based interviews
- Pre-interview Worksheet
- Requirements Worksheet
- “data set” samples
- Data Curation Profiles

“Faculty of the Environment” Data Needs Project Collaborators: Bryan Heidorn, Michelle Wander, U of I Environmental Council

Smallish Science
- single PI (often)
- often dependent on graduate students
- ad hoc data management systems
- idiosyncratic sharing practices
- “success” dependent on using one’s own data
But…
- may be working at community level
- may be producing all digital data
- may be conducting “data-driven” science
- may be producing very large data sets

Profiling complexities & differences
Data characteristics compared for Crystallography and Geology.

Type
- Crystallography: 1. “Raw data”: most information rich, long-term value for re-use … 4. “CIF file” – crystallography exchange; most commonly shared data type
- Geology: 1. “Reduced spreadsheet” – table with average values for multiple observations; most often requested by others

Format
- Crystallography: 1. Binary data – image; 4. Crystallographic Information File – text (field-wide standard for numerical data)
- Geology: 1. Excel spreadsheet

Size
- Crystallography: 1. Each image or “frame” ¼ to 1 Mb; a set is approx. 2,400 frames = approx. 1 Gb; 4. > 500 Kb
- Geology: 1. Spreadsheet size – under 1 Mb

Intellectual Property/Data Owners
- Crystallography: Service model: provide a service to chemists by solving crystal structures. Ownership of the data is ambiguous and requires negotiation before data “hand-off”.
- Geology: Depends on source of funding: governmental and private grants, gov. institutions, industry. Ownership of and right to the data range from full to very limited, with some long-term “embargoes”.

Accessibility
- Crystallography: Field-wide repositories. Many journals require deposit of CIF files. OAI-PMH tools becoming available for CIF files.
- Geology: Difficult and ad hoc. Well-known researchers receive direct requests for data, often based on publications.
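The crystallography size figures above can be checked with a little arithmetic. The sketch below is only an illustration: it reads "Mb" as megabytes, which is an assumption, and the function and constant names are hypothetical.

```python
# Rough size of one crystallography raw-data set, using the slide's figures.
FRAMES_PER_SET = 2400          # "approx. 2,400 frames"
FRAME_SIZE_MB = (0.25, 1.0)    # "¼ to 1 Mb" per frame, read here as megabytes (assumption)


def set_size_range_mb(frames: int, frame_size_mb: tuple[float, float]) -> tuple[float, float]:
    """Return the (smallest, largest) total size in MB for a set of frames."""
    low, high = frame_size_mb
    return frames * low, frames * high


low_mb, high_mb = set_size_range_mb(FRAMES_PER_SET, FRAME_SIZE_MB)
print(f"{low_mb:.0f} MB to {high_mb:.0f} MB per set")  # prints "600 MB to 2400 MB per set"
```

Under these assumptions the per-set total falls between 600 and 2,400 MB, so the "approx. 1 Gb" figure sits inside the range implied by the per-frame sizes.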

Findings
Distinguishing exchange from open sharing
- exchange: sharing amongst collaborators is a primary concern, often with significant barriers
- (more) open access: limited by need for control and reward system, but also
Sharing with wider “publics” is conditioned by both data management pressures and personal experience
- the “known person – cost” algorithm
- incidents of misuse
What is most easily or willingly shared is not always the data that has the most re-use value

Examples of what, and when

Atmospheric science
- Specific research area: severe weather modeling
- Form to be shared: compressed output of the model
- Formats: Vis5D
- Type of data set: 1 file / dataset
- Size: Mb
- Shared when: 4-6 month embargo

Agronomy
- Specific research area: water quality, drainage, and plant growth
- Form to be shared: cleaned and reviewed sensor and hand-collected sample data
- Formats: .xls
- Type of data set: approx. 100 files
- Size: ~1MB each, up to 20 Mb
- Shared when: After publication

Geology
- Specific research area: rock, water and microbes
- Form to be shared: averaged sensor and hand-collected sample data; photographs
- Formats: .xls; jpg
- Type of data set: 1 file; images
- Size: < 1 Mb
- Shared when: After publication

Civil Engineering
- Specific research area: traffic movement
- Form to be shared: cleaned and normalized sensor data
- Formats: MySQL (postgresql)
- Type of data set: 1 database
- Size: approx K/day
- Shared when: 1 month to 1 year embargo
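One way a repository might encode these four examples is as small structured records, so that sharing conditions can be queried. This is a minimal sketch, not the project's actual data model: the class name, field names, and the choice to express embargoes as a (min, max) range in months are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class SharingExample:
    """Hypothetical record for one row of the 'what, and when' examples."""
    field: str
    form_shared: str
    formats: Tuple[str, ...]
    # (min, max) embargo length in months; None stands for "after publication".
    embargo_months: Optional[Tuple[int, int]]


EXAMPLES = [
    SharingExample("Atmospheric science", "compressed output of the model",
                   ("Vis5D",), (4, 6)),
    SharingExample("Agronomy", "cleaned and reviewed sensor and hand-collected sample data",
                   (".xls",), None),
    SharingExample("Geology", "averaged sensor and hand-collected sample data; photographs",
                   (".xls", "jpg"), None),
    SharingExample("Civil Engineering", "cleaned and normalized sensor data",
                   ("MySQL (postgresql)",), (1, 12)),  # "1 month to 1 year"
]

# Split the examples by sharing condition.
embargoed = [e.field for e in EXAMPLES if e.embargo_months is not None]
after_publication = [e.field for e in EXAMPLES if e.embargo_months is None]

print("Embargo requested:", embargoed)
print("Shared after publication:", after_publication)
```

Keeping the embargo as a range rather than a single number preserves the spread the researchers reported.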

Implications for Institutional Repositories
- embargo services are a *must* (~66%, 14/20)
- clear, explicit data citation information in IR records
- disconnect: application of metadata standards highly important, but many unaware of existing standards
- preservation services are needed to support re-use: 11/19 participants said their data would be useful for more than 10 years
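The first two implications, embargo services and explicit citation information in IR records, could be sketched as a single record type. Everything here is hypothetical: the class, its fields, the sample values, and the citation layout are illustrative assumptions, not a repository's real schema or a citation standard.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DatasetRecord:
    """Hypothetical IR record carrying citation fields and an optional embargo."""
    creators: str
    year: int
    title: str
    repository: str
    identifier: str
    embargo_until: Optional[date] = None  # None means no embargo

    def is_accessible(self, today: date) -> bool:
        """The data set is open once the embargo date has been reached."""
        return self.embargo_until is None or today >= self.embargo_until

    def citation(self) -> str:
        """Build a citation string (illustrative layout, not a formal style)."""
        return (f"{self.creators} ({self.year}). {self.title} [Data set]. "
                f"{self.repository}. {self.identifier}")


# All values below are made up for the example.
record = DatasetRecord(
    creators="Example, A.",
    year=2009,
    title="Hypothetical traffic sensor data",
    repository="Example University Repository",
    identifier="hdl:0000/example",
    embargo_until=date(2010, 1, 1),
)

print(record.citation())
print(record.is_accessible(date(2009, 6, 1)))   # still under embargo
print(record.is_accessible(date(2010, 1, 1)))   # embargo has lifted
```

Storing the embargo as a date on the record lets the citation stay visible while access to the files is withheld.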

Leadership Opportunities for Libraries
- Supporting the science process
- data exchange infrastructure
- support for data management planning
- data literacy instruction: integral to scientific information work
- Broader implications for academic institutions

Thank you
This research is supported by the Institute of Museum and Library Services (IMLS), grant # LG
D. Scott Brandt, PI
Co-PIs: M. Witt & J. Carlson (Purdue) and C. Palmer & S. Shreeves (UIUC)
RAs: D. Leiter (Purdue) and M. Kogan (UIUC)