Data Sharing Practices: Implications for Curation and Re-use Carole L. Palmer Center for Informatics Research in Science & Scholarship Graduate School.

Presentation transcript:

Data Sharing Practices: Implications for Curation and Re-use
Carole L. Palmer
Center for Informatics Research in Science & Scholarship
Graduate School of Library & Information Science
University of Illinois at Urbana-Champaign
Sharing Data: Practices, Barriers, and Incentives
ASIST annual meeting, 10 October 2011

Data Practices research group - CIRSS
Team members: Tiffany Chao, Melissa Cragin, Nic Weber, Karen Baker, Andrea Thomer
big science data
long tail of “dark” small science data
small science:
- complex, heterogeneous data
- implications for data curation
- value for re-use across disciplines

Data Curation Profiles Project
Scientists’ data workflows & curation requirements across disciplines
IR applications
Scott Brandt, PI; M. Witt & J. Carlson (Purdue)
Palmer, Cragin, Heidorn, & Shreeves (Illinois)
Biochemistry, Biology, Civil Engineering, Electrical Engineering, Food Sciences, Earth and Atmospheric Sciences, Soil Science
Anthropology, Geology, Plant Sciences, Kinesiology, Speech and Hearing, Earth and Atmospheric Sciences, Soil Science
Data Curation Profiles Toolkit at Purdue:

Worksheets and interviews
- Data kinds and stages - sharing targets, provenance, context
- Intellectual property - owners, terms of use, attribution
- Organization/description - formal/local standards, documentation
- Access - embargo, access control, mirror site
- Preservation - duration, migration
- Tools - analytical, visualization, integration
- Interoperability - needs, APIs, 3rd party data, etc.
- Storage, integrity, security - audits, version control
- Discovery - browse, search, external
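The worksheet categories above can be sketched as a small checklist structure. This is only an illustration, not part of the Data Curation Profiles Toolkit. The section names and sub-topics come from the slide. The names `PROFILE_SECTIONS` and `missing_sections`, and the sample answers, are hypothetical.

```python
# Hypothetical sketch: section names and sub-topics are taken from the slide;
# the dictionary layout and the function name are illustrative assumptions.
PROFILE_SECTIONS = {
    "Data kinds and stages": ["sharing targets", "provenance", "context"],
    "Intellectual property": ["owners", "terms of use", "attribution"],
    "Organization/description": ["formal/local standards", "documentation"],
    "Access": ["embargo", "access control", "mirror site"],
    "Preservation": ["duration", "migration"],
    "Tools": ["analytical", "visualization", "integration"],
    "Interoperability": ["needs", "APIs", "3rd party data"],
    "Storage, integrity, security": ["audits", "version control"],
    "Discovery": ["browse", "search", "external"],
}


def missing_sections(answers):
    """Return the sections with no recorded answer, in worksheet order."""
    return [name for name in PROFILE_SECTIONS if not answers.get(name)]


# Made-up answers for one researcher (not data from the study).
example_answers = {
    "Access": "embargo until publication",
    "Preservation": "keep for 10 years",
}

for section in missing_sections(example_answers):
    # Show the sub-topics an interviewer would still need to cover.
    print(section, "->", ", ".join(PROFILE_SECTIONS[section]))
```

A structure like this would let an interviewer see which parts of a profile still need follow-up before the semi-structured interview.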

| Field | Specific research area | Form to be shared | Type of data set | Formats | Size | Shared when? |
|---|---|---|---|---|---|---|
| Agronomy | water quality, drainage, and plant growth | cleaned, reviewed | sensor; hand-collected samples | .xls | approx. 100 files, ~1 MB each, up to 20 MB | After publication |
| Geology | rock, water and microbes | averaged | sensor; hand-collected samples; photographs | .xls; jpg | 1 file; images < 1 MB | After publication |
| Civil Engineering | traffic movement | cleaned, normalized | sensor | MySQL; PostgreSQL | 1 database; approx. … K/day | 1 month to 1 year embargo |

Which can be shared when?

Private vs. public data sharing
– Supplying data – limited and controlled distribution by request
– Exposing data – public access conditioned by data management pressures and experience
Complex mis-use concerns:
– misinterpretation – presumed problems
– misappropriation – actual premature re-use
– disregard of good faith practices – how used, what referenced
Cragin, Palmer, Carlson, & Witt (2010). Data sharing, small science, and institutional repositories. Philosophical Transactions of the Royal Society A, 368,

Interpreting practices
- long-term use by others, especially in other fields
- collective value in aggregate with other data
How do we identify and represent potential for reuse?
Forms most easily or willingly shared may not have the most re-use value.
“My data will never be of use to anyone else.”
“There are no standards in my field.”
“Of course I'm willing to share my data publicly.”

Data Conservancy (PI: Sayeed Choudhury), Illinois research team
Data practices group (Palmer, Cragin, Chao, Weber, Thomer, Baker)
- comparative analysis: earth and life sciences
- long-term re-use value of data
Data concepts group (Renear, Dubin, Sacchi, Wickett)
- formal terminology, identity conditions for data sets, versions, etc.
- representation levels (data, encoding, format)
A blueprint for data infrastructure and curation services for research libraries and other organizations.

Data practices - progressive data collection
Talking shop about data - efficient exchange with right scientists about right things
Lead scientists - research context, IP, access, discovery, re-use
1) Pre-interview worksheets
2) Semi-structured interviews
3) Follow-up sessions with selected participants
Researchers managing data - stages, versions, standards, tools
4) Data deposit & sharing worksheet
5) Data samples, related documentation

SHARING
Communities: Geoscience | Soil Ecology | Oceanographic / Climate Modelers
What: physical rock samples; images, stratigraphy; data table from sample; methodologies; species taxonomies; scripts, code; model output (netCDF)
When: by request | by request; funding policy (i.e. LTER) | by request within personal networks; methodological conditions
How: , phone, site visit
With whom: “experts” in related fields, readers; collaborators, colleagues, readers
Practices: Provider - no standard for attribution; Receiver - may offer co-authorship | Provider - possible acknowledgement; Receiver - needs methods training & programming | Provider - supply publication with data for citation, may request co-authorship or acknowledgement

Data products within communities
Communities: Geobiology | Volcanology | Soil ecology | Sensor science
Data unit: Time series: (site specific) spreadsheets, microscopy images, annotated digital “field photos” | Rock profile: physical rock, thin section, chemical analysis, photographs, field notes | Database: multiple abiotic soil measures, associated metadata | Database: soil data, sensor data
User communities: Geobiology, Geology, Chemistry, Microbiology; U.S. Park Service; Geology – igneous petrology; Geophysics; Geochemistry; Biochemistry; Earthworm ecology; Network Science; Computer Science
Sharing conventions: by request; no repository; mostly post-pub, some unpublished; by request; no repository; public resource collection; Reference data; industry
Limits – customization; “vertical” dev.

Data & practices related to curation

Curation tasks

Thank you Center for Informatics Research in Science and Scholarship