Increase discovery of your institution’s research through SHARE

Increase discovery of your institution’s research through SHARE Kelly Thompson Metadata Analyst Librarian, University of Minnesota SHARE Curation Associate, 2016-2017 Library Technology Conference, March 16, 2017 Hello and welcome to this session, entitled “Increase discovery of your institution’s research through SHARE”. I’m Kelly Thompson. I’m a Metadata Analyst Librarian at the University of Minnesota Libraries, Twin Cities campus, where I was lucky to join a department called Data Management and Access this past September. I’m here today in a slightly different role, however. I’m also a member of the pilot cohort of SHARE Curation Associates, and it’s in that capacity that I’ll be speaking to you today about SHARE and how it might fit into your work.

Outline Conceptual foundations Practice-based: What is SHARE Current features of SHARE Developing features of SHARE How can you participate? Question time To start off, I’d like to give you a brief roadmap of what I’ll be presenting today. This presentation will have two main parts: the first being more formal, where I will lay some of the conceptual groundwork for this project, and the second more informal with some discussions of the practical aspects of SHARE. In the first part, I’ll give you some general background on what SHARE is and where it fits in to the current scholarly communication landscape. In the second part I’ll discuss current features of SHARE, and also a little bit about some developing features. In the practice-based portion, I’ll show you some examples of how other librarians are leveraging SHARE, and give you some ideas on how your institution can participate. We’ll have some time at the end for questions, so if something pops up, make sure to jot it down for that time.

SHARE SHARE is a free, open data set of research and scholarly activities across the research life-cycle. So, what is SHARE? SHARE is a free, open data set of research and scholarly activities across the research life-cycle. Now, that’s a lot, so let me break that out piece by piece.

SHARE SHARE is a free, open data set of research and scholarly activities across the research life-cycle. SHARE is a data set composed of metadata from about 150 different contributors, and that number is growing. It’s metadata only, so there are no digital objects stored or captured, simply information about the research or scholarly activity. And, because it’s an aggregated data set, it allows for collocation in searching across a variety of repositories that would otherwise be siloed.
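To make the idea of collocation a little more concrete, here is a minimal sketch of grouping metadata records from different sources under one shared identifier. It is not how SHARE itself does this; the record layout, the source names, and the DOIs are all hypothetical, and the only outside fact it relies on is that DOIs compare case-insensitively.

```python
from collections import defaultdict

# Resolver and scheme prefixes that commonly appear in front of a DOI.
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)

def normalize_doi(doi):
    """Lower-case a DOI and strip one leading prefix so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi

def collocate(records):
    """Group the sources of each record under its normalized DOI."""
    grouped = defaultdict(list)
    for record in records:
        grouped[normalize_doi(record["doi"])].append(record["source"])
    return dict(grouped)

# Hypothetical records, as three separate repositories might describe them.
SAMPLE_RECORDS = [
    {"source": "institutional-repo", "doi": "https://doi.org/10.1234/ABC"},
    {"source": "registry", "doi": "10.1234/abc"},
    {"source": "data-repo", "doi": "doi:10.5678/xyz"},
]

if __name__ == "__main__":
    for doi, sources in collocate(SAMPLE_RECORDS).items():
        print(doi, sources)
```

The first two sample records end up under the same key even though each repository wrote the DOI differently, which is the kind of bringing-together that is hard to do when every repository is searched on its own.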

SHARE SHARE is a free, open data set of research and scholarly activities across the research life-cycle. SHARE is free: it costs nothing, and you don’t need to buy a subscription. And it’s open: you are free to use the data without having to license it. All of the software and infrastructure are built with code that is freely distributed on GitHub. This is important because it means you don’t have to pay to access or use the data, and you can use it for activities that might otherwise be restricted under the license agreements we as libraries typically sign with vendors, such as agreements that you won’t use a vendor’s data for data mining or similar purposes. SHARE doesn’t have any of those restrictions, so if you think of an incredible use for it, you are free to make your geeky data dreams come true.

Who is SHARE Making SHARE free and open was a very intentional choice. SHARE is a partnership between the Association of Research Libraries, or ARL, and the Center for Open Science, or COS. The SHARE initiative was founded in 2013 by the Association of American Universities (AAU) and the Association of Public and Land-grant Universities (APLU), both of which continue to have ex officio representation on the project’s Advisory Board. The project is also underwritten in part by generous funding from the Institute of Museum and Library Services (IMLS grants) and the Alfred P. Sloan Foundation. All of these organizations explicitly value developments in the library, information, and science fields that have broad benefits to as many stakeholders as possible, and making SHARE free and open makes it better able to support that value.

SHARE SHARE is a free, open data set of research and scholarly activities across the research life-cycle. And, finally, SHARE is a free, open data set of research and scholarly activities across the research life-cycle.

Research Lifecycle The research lifecycle can be abstracted in many ways; this is the model that the Open Science Framework uses. It starts with some kind of intellectual input, a problem, or a documented gap in understanding. A researcher synthesizes this information, hypothesizes about possible answers to the research question, and designs some type of study to test these hypotheses. The researcher goes through the steps of carrying out the study (acquiring materials, which could be data or physical materials; collecting data; storing that data somewhere; and eventually analyzing that data). The researcher interprets the findings from their study and, as is our current custom, writes them up in a report of some sort, which is published or shared in some way. This research output is consumed by other researchers, and ignites further inquiry through the questions it generates or leaves unanswered, and the cycle starts again. Now, in the traditional academic culture, this stage at 11:00 there, the part where you publish the report, has historically been given the most weight in the evaluation of whether or not a researcher has been successful in their trip around the cycle. This is what has driven tenure, metrics like impact factors, etc. all these years. But in the modern research landscape, this is not the case anymore.

Collecting the Research Lifecycle There are outputs at every stage of the lifecycle: presentations, study pre-registrations, grant proposals, data management plans, research protocols, databases and spreadsheets, code that was used to process and analyze data, figures, graphs, and charts, author manuscripts, pre-prints, peer-reviewer comments, articles, book chapters, posters, conference presentations, class lectures, et cetera. SHARE seeks to improve the discoverability of research across the lifecycle, not just when a final version of an article is published. This is tricky, because in our current scholarly communication landscape, each type of output tends to live in a different repository from the others. By aggregating data from all of these disparate sources, SHARE seeks to put these important pieces back together. So, why is this important? Why isn’t the journal article good enough anymore? Well, we know that there has been a lot of buzz about reproducibility in science. Can I take the same data you used to come to your conclusions, and analyze it, and get the same outcome? In some cases, papers have been retracted because the researchers couldn’t produce the supporting data to back up their claims. Because most disciplines do not publish “negative results”, or results of studies that do not confirm a hypothesis, there is pressure at the end of the research lifecycle to extrapolate something from the data, so that the expensive investment (in time and resources) was not a “waste”. And, in an environment where most journals are closed, toll-access publications, many people cannot access the published final article, leaving much of the research corpus inaccessible and slowing down the pace of research and new breakthroughs. So, these extra materials help to give context to the research. They support the interest in replicating studies, and in performing new studies and analyses using existing data sets.
To collect these materials and make them accessible is to act on the notion that failed or negative experimental results might save someone time, or save funders resources, if an experiment doesn’t have to be repeated just because no one published a journal article about it. This is what is fueling the adoption of open science practices, publications, and tools within the research community. It’s also helping to build the understanding that the published report is both a summary of the science and the end of the research process, and that there is a fundamentally unnecessary inefficiency in waiting for that published report to engage potential collaborators or advance a finding.

Everett Opie The New Yorker Collection/The Cartoon Bank, 1976 This is a cartoon that Jeff Spies, from the Center for Open Science, likes to share. The caption reads, “I see by the current issue of ‘Lab News,’ Ridgeway, that you’ve been working for the last twenty years on the same problem I’ve been working for the last twenty years.” Now this is from 1976, so, before Al Gore invented the internet as we know it [joke]. How much better are we doing now? I worked in scientific research before I went into libraries, so I understand the fear that your research will get “scooped”. But I think that, by re-designing the way the research lifecycle is documented, this whole anxiety loses a bit of its teeth, since you can document and establish your ideas and your intellectual pursuit at the earliest stages of the research cycle. An emerging practice is study registration. This is where you register your research question, methods, and the statistical analyses you plan to perform before collecting any data. This also allows for peer review of your research methods before you go to the trouble and expense of conducting any experiments. ClinicalTrials.gov is an example of a system collecting these kinds of outputs. When this kind of openness about workflows is integrated throughout the research process, it is sometimes called “Open Notebook Science”, where the entire research process is completely transparent (or as transparent as it can be in cases where there is sensitive data involved in the research). So, what tools and what infrastructure do researchers have to overcome this, frankly, inefficient situation?

Open science and open scholarship One tool that SHARE developers and the SHARE curation associates have been using heavily is the Open Science Framework, which I’ll talk a little bit about later in this presentation, but for now will just mention that it is a system to support the entire research lifecycle, from pre-registering your study idea and methodology, to providing one interface that embeds and links to all of your project documents whether they’re in Google Docs, on GitHub, uploaded to the OSF, or in your project page’s wiki. Data repositories are proliferating and seeing broad adoption as researchers work to comply with federal funder mandates, and discipline-specific pre-print servers such as MLA Commons CORE now supplement institutional repositories as places where communities of scholars can deposit their working papers, and versions of their manuscripts they have publisher permission to share, even if the final publisher version is not open access. There has been a lot of discussion of open peer review, which I don’t think anyone can currently agree on a definition of, but which has been used to mean everything from peer review where the identities of the reviewers are not secret, to having peer reviews published alongside the manuscript versions themselves, to post-publication peer review where the article is published first and reviewed after.

The Lifecycle Approach “Rather than focusing on acquiring the products of scholarship, the library is now an engaged agent supporting and embedded within the processes of scholarship.” --21st Century Library Collections: Calibration of Investment and Collective Action (ARL 2012) So, why is SHARE, and why are libraries for that matter, interested in the full research lifecycle, not just data and publications, which are typically the focus of mandates and policy? And, truthfully, when the data and the publications themselves are hard enough to tackle? For transparency, reproducibility, and reusability. To tell a full story about scholarship, and about the range of potential contributions to scholarship, since a project might include people who create different aspects of the work. The article is a late indicator of a person’s work.

Mission and Values SHARE’s mission is to maximize research impact by making research widely accessible, discoverable, and reusable. SHARE is developing services to gather and freely share information about research and scholarly activities across their life cycle. Making research and scholarship freely and openly available encourages innovation and increases the diversity of innovators. Okay. So now that we’ve talked about the landscape, and the conceptual foundations of open science, we can talk about SHARE specifically as a tool.

How does it work?

SHARE dataset 150+ data sources Registries (e.g. CrossRef, DataCite) Disciplinary repositories and preprint services Data repositories Institutional repositories Agency repositories (e.g. DOE SciTech Connect) https://share.osf.io/sources As of Monday, there were 150 data sources which had contributed to SHARE. (This number is continuously growing, and it is updated next to the search bar on whatever version of SHARE you are searching.) This is about 17.8 million metadata records. These contributors cover a wide swath of research lifecycle outputs. There are persistent identifier registries, such as CrossRef and DataCite, which mint and manage DOIs. There are disciplinary repositories and preprint services (such as arXiv; RePEc, or Research Papers in Economics; and PubMed Central), data repositories (such as Dryad), and institutional repositories from dozens of universities and research organizations (public, governmental, and private). Everyone from Harvard to the Department of Energy to https://share.osf.io/discover

Metadata records by type Data Set Patent Poster Presentation Publication (Article, Book, Conference Paper, Dissertation, Preprint, Project, Registration, Report, Thesis, Working Paper) Repository Retraction Software https://share.osf.io/discover
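As a small illustration of what a breakdown by record type looks like in practice, here is a minimal sketch that tallies a handful of records by type. The records and the `type` field name are hypothetical stand-ins, not the actual SHARE schema.

```python
from collections import Counter

# Hypothetical records; the "type" key is an assumed field name.
SAMPLE_RECORDS = [
    {"title": "Survey responses, 2016", "type": "Data Set"},
    {"title": "Draft findings", "type": "Preprint"},
    {"title": "Conference poster", "type": "Poster"},
    {"title": "Revised findings", "type": "Preprint"},
    {"title": "Untyped item"},
]

def count_by_type(records):
    """Tally records per type, filing records without a type under 'Unknown'."""
    return Counter(record.get("type", "Unknown") for record in records)

if __name__ == "__main__":
    for record_type, count in count_by_type(SAMPLE_RECORDS).most_common():
        print(f"{record_type}: {count}")
```

Counting the "Unknown" bucket separately is a deliberate choice here: records that arrive without a type are themselves a useful signal when assessing a source.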

Institutional Dashboard API Feeds Application Components ... Integrations, Applications, Widgets, Other Solutions... Institutional Dashboard Discovery ... SHARE Notify

SHARE API powers new discovery services This is a prototype. There will be additional polish to the interface. UCSD and SHARE UI developer Manual clean-up & enhancement of the data was necessary. Hired grad students to manually collect bib/lab/people data.

Developing features of SHARE Enhanced data model Institutional dashboard Aggregating more [varied] sources Curation Associates work (including interface for curation)

SHARE Curation Associates Program Goal: to support associates to leverage curation expertise to enhance the SHARE dataset, and to lead projects that provide benefits locally Assessment Clean-up Alignment with SHARE dataset as a whole

34 professional librarians in inaugural pilot Information extraction through application programming interfaces (API) and OAI-PMH feeds Building content harvesters Basic programming in Python Methods and tools to automate data cleaning and metadata enhancements (using programming scripts and/or OpenRefine) All tools used will be open source. I was working at Iowa State University when I was selected to participate in the SHARE curation associates program, so I am represented on this map by the purple dot in the middle of Iowa, instead of a purple dot in the middle of Minnesota.
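For a flavor of the kind of scripting the program covers, here is a minimal sketch that parses an OAI-PMH `ListRecords` response carrying simple Dublin Core and tidies up the whitespace in titles. The sample XML and its identifier are made up for illustration; the namespaces are the standard OAI-PMH and Dublin Core ones, and only the Python standard library is used. A real harvester would also need to handle resumption tokens, deleted records, and errors, none of which is shown.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up response body, trimmed to the parts this sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>  A  Study of   Things </dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:creator>Roe, Richard</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

def parse_records(xml_text):
    """Return one dict per record with identifier, cleaned title, and creators."""
    root = ET.fromstring(xml_text)
    parsed = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        raw_title = record.findtext(".//dc:title", default="", namespaces=NS)
        creators = [
            creator.text.strip()
            for creator in record.iterfind(".//dc:creator", NS)
            if creator.text
        ]
        parsed.append(
            {
                "identifier": identifier,
                # Collapse runs of whitespace, a very common clean-up step.
                "title": " ".join(raw_title.split()),
                "creators": creators,
            }
        )
    return parsed

if __name__ == "__main__":
    for item in parse_records(SAMPLE_RESPONSE):
        print(item)
```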

Curation program Local data curation Project teams Training, education, and outreach Dataset curation (forthcoming) Components Local data curation OR Project teams

Open Science Framework https://osf.io Increasingly, we are looking at this opportunity for integration using the OSF, which can work with both funders and institutions to provide a platform that can integrate with and provide computational services, archival services, curation opportunities and preservation. There is very exciting work going on at Notre Dame.

Local enhancement activities First 6 months: Metadata review Gap analysis Digital preservation review Draft 3-3-3 plan Upcoming: Implement 3-3-3 plan

Metadata quality Title Title Author Author 1; Author 2 Author 1 Volume Pages Explanation of why the bepress OAI-PMH endpoint isn’t a good data source for SHARE even though your repository metadata might be very rich. [Crosswalking 101] Coverage
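One way to read this slide is that a rich local record can lose detail when it is crosswalked into a simpler export format. The sketch below illustrates that general idea only; the field names, the mapping, and the sample record are all hypothetical, and it does not describe the behavior of any particular repository platform.

```python
# Hypothetical rich record as it might exist inside a repository.
RICH_RECORD = {
    "title": "Title",
    "authors": ["Author 1", "Author 2"],
    "volume": "12",
    "pages": "1-10",
}

# Hypothetical crosswalk: source field -> target field. Anything not
# listed here has nowhere to go in the target format.
MAPPING = {
    "title": "title",
    "authors": "creator",
}

def crosswalk(record, mapping):
    """Copy mapped fields into the target shape; unmapped fields are dropped."""
    return {
        target: record[source]
        for source, target in mapping.items()
        if source in record
    }

def lost_fields(record, mapping):
    """List the source fields that the crosswalk silently discards."""
    return sorted(set(record) - set(mapping))

if __name__ == "__main__":
    print(crosswalk(RICH_RECORD, MAPPING))
    print("lost:", lost_fields(RICH_RECORD, MAPPING))
```

Running a report like `lost_fields` over a sample of records before registering a feed is one inexpensive way to do the kind of gap analysis described above.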

Project groups Potential data sources from re3data.org repositories Populating an open access institutional repository with SHARE data Graduate student researcher profiles etc.

How can you participate in SHARE?

Register your repository! https://share.osf.io 1. Scroll down, click on “Source Registration” 2. Create an OSF account 3. Fill out the form 4. a. If your repository has an OAI-PMH endpoint, the SHARE team will take care of everything from there! b. Otherwise they will work with you on what you’ll need to submit.
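If you are not sure whether your repository exposes an OAI-PMH endpoint, one quick check is to build an `Identify` request and open it in a browser. The sketch below only constructs the URLs; the base URL is a hypothetical placeholder, while `Identify`, `ListRecords`, and `metadataPrefix=oai_dc` are standard parts of the OAI-PMH protocol.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the base URL of your own repository.
BASE_URL = "https://repository.example.edu/oai"

def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"

# Identify describes the repository itself.
IDENTIFY_URL = oai_request_url(BASE_URL, "Identify")

# ListRecords with oai_dc asks for records as simple Dublin Core.
LIST_RECORDS_URL = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")

if __name__ == "__main__":
    print(IDENTIFY_URL)
    print(LIST_RECORDS_URL)
```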

Use the API! https://osf.io/bygau Getting started documentation, sample scripts! More documentation on the SHARE website.
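To give a sense of what a first script might look like, here is a minimal sketch that builds a search request and, optionally, posts it. Everything specific in it is an assumption to be checked against the getting-started documentation: the Elasticsearch-style body shape, the `type` field name, and the search URL, which is deliberately left as a parameter rather than guessed.

```python
import json
import urllib.request

def build_search_body(text, record_type=None, size=10):
    """Build an Elasticsearch-style search body (assumed shape, not confirmed)."""
    body = {
        "size": size,
        "query": {"bool": {"must": [{"query_string": {"query": text}}]}},
    }
    if record_type:
        # "type" is an assumed field name; check the documentation.
        body["query"]["bool"]["filter"] = [{"term": {"type": record_type}}]
    return body

def search(search_url, body):
    """POST the body as JSON to search_url and return the decoded response.

    search_url must come from the SHARE API documentation; no endpoint
    is hard-coded here.
    """
    request = urllib.request.Request(
        search_url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

if __name__ == "__main__":
    # Only prints the request body; nothing is sent over the network.
    print(json.dumps(build_search_body("open science", "preprint", 5), indent=2))
```

Keeping the body-building step separate from the network call makes it easy to inspect and adjust the query before sending anything.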

Get more information Get involved! www.share-research.org/updates info@share-research.org

Question time

Thanks to Cynthia Hudson-Vitale for creating some of the slides used in this presentation! And to all the SHARE folks including Jeff Spies, Judy Ruttenberg, Rick Johnson, Erin Braswell, and Amy Eshgh.