Finding and Acquiring Data: Discovering and Obtaining Data from Library & Non-library Sources Linking Open Data cloud diagram, by Richard Cyganiak and.

Presentation transcript:

Finding and Acquiring Data: Discovering and Obtaining Data from Library & Non-library Sources. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. © 2014 by the Rector and Visitors of the University of Virginia. This work is made available under the terms of the Creative Commons Attribution-ShareAlike 4.0 International license. Bill Corey and Summer Durrant, Research Data Services, University of Virginia Library.

3 Workshop Goals
You’re looking for data for your project: where do you find it and how do you acquire it? You’ll learn how to:
- identify potential sources of data
- locate the data you need
- gain access to it
We’ll explore the library’s collections and subscriptions, as well as data residing in repositories that can be identified through data citations and article references.

4 Data Collections

5 Planning Phase
Gather information about the data set you hope to find.
Known Dataset:
- Name of study or data set
- Principal investigator(s)
- Data producer or vendor
- Edition or version
Variable Search:
- Variable(s)
- Geographic coverage
- Time period
- Frequency
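The planning checklist above can be captured as a small record before searching begins. The following is a minimal sketch in Python and is not part of the original workshop: the class name `DataSearchPlan`, its field names, and the sample values are all hypothetical, chosen only to mirror the items on the slide.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class DataSearchPlan:
    """Hypothetical record mirroring the planning checklist on the slide."""
    # Known-dataset details
    study_name: Optional[str] = None
    principal_investigators: List[str] = field(default_factory=list)
    producer_or_vendor: Optional[str] = None
    edition_or_version: Optional[str] = None
    # Variable-search details
    variables: List[str] = field(default_factory=list)
    geographic_coverage: Optional[str] = None
    time_period: Optional[str] = None
    frequency: Optional[str] = None

    def missing(self) -> List[str]:
        """Return the names of checklist items that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def search_terms(self) -> str:
        """Join the filled-in items into a keyword string for a search box."""
        parts = []
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, list):
                parts.extend(value)
            elif value:
                parts.append(value)
        return " ".join(parts)


# Illustrative values only; they do not describe a real dataset.
plan = DataSearchPlan(
    variables=["household income"],
    geographic_coverage="Virginia",
    time_period="2000-2010",
    frequency="annual",
)
print(plan.missing())
print(plan.search_terms())
```

Listing the empty fields first shows which details are still worth tracking down (for example, from an article's methodology section) before approaching a librarian or a repository search form.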

6 Questions
Who is likely to collect data on this topic?
- government agencies
- academic researchers
- IGOs and NGOs
- non-profit organizations
- private sector
How are data likely to be collected?
- survey (census, sampling)
- administrative records

7 Search Strategy #1: Start with a Statistical Publication

8 Search Strategy #2: Use the Methodology/Data Section of Articles

9 Search Strategy #3: Search in Data Repositories

10 Discovering & obtaining data from non-library sources
What do you do if you have researched your potential data source and it is not available through one of the databases that the library licenses? If your citation research turned up an article that mentions a dataset, you may want to contact the researcher directly. You can also widen your search to resources available on the internet, using Google or Bing. One of the easiest ways to do this is to search a directory of data repositories.

11 Uncovering data on the internet It is second nature to search the internet for everything and anything. But how do we know if the resources we find are ‘good’ resources? How do we know if we can use them for our own research? Materials on the web were put there by somebody. It is always important to remember that anything you find online belongs to someone. To use it appropriately, you need to identify the owner and obtain permission.

12 Data Registries
Finding the source of your ideal dataset can be intimidating, confusing, and very time-consuming. Two international registries of data repositories can speed up that search: Databib and re3data (the two merged in 2015). There is also a list of data repositories by subject at Simmons College.

13 Advantages of Data Registries
They provide information to narrow your search:
- Persistent identifiers: unique, citable, and searchable
- Access controls
- Terms of use & licenses
- Data preservation: migrating to new formats or emulating old formats
- Professional backup & documentation
- Repository standards, which ensure commitment and quality

14 Finding data using a data registry
Databib and re3data can be searched at multiple levels. The registries are similar in their content: re3data has more European and Asian repositories, while Databib has more from the Americas. Both organizations have recently agreed to merge their resources into a single registry in 2015, under the DataCite umbrella. DataCite provides DOIs (Digital Object Identifiers) for datasets, which allow a dataset to be cited just like an article. A dataset with a DOI can be found more easily by other researchers.
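Because a DOI resolves through a central resolver, a dataset DOI can be turned into a link or a formatted citation programmatically. The sketch below is an illustration rather than anything shown in the workshop. It assumes the public `https://doi.org/` resolver and the `text/x-bibliography` content-negotiation media type that the DOI registration agencies document; whether a given DOI returns a citation depends on the agency that registered it. The function names are hypothetical, and `10.1234/example` is a placeholder, not a registered DOI.

```python
import urllib.parse
import urllib.request

DOI_RESOLVER = "https://doi.org/"


def doi_url(doi: str) -> str:
    """Turn a bare DOI such as '10.1234/example' into a resolvable URL."""
    doi = doi.strip()
    # Accept DOIs pasted with a 'doi:' prefix, as they often appear in citations.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    return DOI_RESOLVER + urllib.parse.quote(doi, safe="/")


def citation_request(doi: str, style: str = "apa") -> urllib.request.Request:
    """Build (but do not send) a request asking for a formatted citation."""
    headers = {"Accept": f"text/x-bibliography; style={style}"}
    return urllib.request.Request(doi_url(doi), headers=headers)


def fetch_citation(doi: str, style: str = "apa") -> str:
    """Send the request; needs network access and a registered DOI."""
    with urllib.request.urlopen(citation_request(doi, style), timeout=30) as resp:
        return resp.read().decode("utf-8").strip()


# Placeholder DOI: this only builds the request, it does not contact the resolver.
req = citation_request("doi:10.1234/example")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch_citation` with the DOI printed on a repository's dataset page would return a citation string if the registering agency supports this media type; the request is built separately from the network call so it can be inspected first.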

15 Finding data in a data repository Search for History. Databib has 2 offerings, while re3data has 17. Look at the information in a particular repository record. You’ve probably noticed the icons to the right of the repository name in re3data’s listings. Those icons provide a lot of useful info up front, so you don’t have to dig for it. These icons will be the standard in the combined registry. Let’s look at those icons, and then look at Sandrart.net.

16 re3data.org icons What do the icons mean?

17 The re3data record for Sandrart.net Both registries show the same entry text for the Sandrart Repository. Both provide a URL to the repository, both tell you it is a data provider (not accepting deposits), and that it is a disciplinary repository.

18 The Databib record for Sandrart.net Both registries show the same entry text for the Sandrart Repository. Both provide a URL to the repository, both tell you it is a data provider (not accepting deposits), and that it is a disciplinary repository.

19 The re3data record for Sandrart.net But re3data provides additional information on the other tabs: Institutions responsible for the repository, Terms of use and Standards. You might also notice that Databib says it is a “restricted” repository, while re3data says it is open. In reality, it is both.

20 Sandrart.net The Sandrart.net site itself also provides additional citation information, as well as navigation tools, a French translation, a PURL for every page and image, and TEI, REST-Service and PND-Beacon feeds for the basic files.

21 openICPSR
ICPSR is creating a new repository called openICPSR. It will be separate from the institutionally funded ICPSR data repository, will charge a fee to deposit data, and will preserve the data for 10 years. In the Self-Deposit Package, the data depositor is responsible for documenting and anonymizing the data.

22 Data Archives & Repositories Examples

23 What have we learned about locating data?
There is a LOT of data out there that you can reuse.
- re3data.org currently lists 586 repositories.
- Databib lists 971 repositories.
There are many sources of data that aren’t in a repository.
- Disciplines may have an awareness of and access to data repositories that they don’t share with others.
- Colleagues may be working with datasets that are still being created. Ask them!
- Authors may cite data in an article, but provide no information about it. Contact them!

24 Best Practices for identifying and obtaining datasets
When identifying and obtaining data from non-library sources, it is important to ask a few questions:
- Someone created the data, so who owns it?
- Do I have permission to do what I want with the data?
- Are there restrictions on reuse?
- Does the dataset come with documentation that makes it understandable and useful?
- Is the data freely accessible, or is access restricted?
- Do I need to purchase the data?
- Do I need to register with the repository to obtain the data?

25 Data sharing on DVN
- DVN hosts multiple, individually branded Dataverses.
- Researchers control the design, content, and dissemination of their Dataverse, and can embed it in their own webpage.
- DVN assigns handles (persistent IDs) and a Universal Numerical Fingerprint (data fixity/verification).
- It extracts metadata for discovery and imports/exports metadata in multiple XML formats (DDI, Dublin Core, FGDC); data across DVNs are searchable within one DVN.
- It accepts data in multiple formats (Stata, SPSS, CSV) and converts them to a preservation format.
- Data can be subset, recoded, and analyzed online.
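The slide above mentions the Universal Numerical Fingerprint as a fixity and verification aid. The sketch below illustrates the general idea of a fixity check with an ordinary SHA-256 file checksum. It is not the UNF algorithm, which fingerprints the data content independently of file format, whereas a file checksum changes whenever the bytes change. The function names are hypothetical, and the expected checksum is assumed to come from the repository's download page.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True when the file's checksum matches the published value."""
    return file_checksum(path, algorithm) == expected_hex.strip().lower()
```

A typical use would be `verify_fixity("survey.csv", published_value)` straight after download, where both the file name and `published_value` are placeholders; a `False` result suggests the copy was altered or truncated and should be fetched again.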

26 Code & Multi-discipline Data Archives & Repositories
- Open Science Framework: Scientists can use OSF for free to archive, share, find, and register research materials and data.
- figshare: “figshare is a repository where users can make all of their research outputs available in a citable, shareable and discoverable manner.”
- GitHub: Share and archive code and collaborate with anyone.
- runmycode: Online repository allowing people to share and download computer code and data associated with scientific publications.
- ResearchGate: Online research social networking. Collaborate, share publications and data.

27 Data Sources JISC

28 Questions? Summer Durrant Data & Geographical Information Librarian Research Data Services University of Virginia Library Bill Corey Data Consultant Research Data Services University of Virginia Library Thanks for attending today's workshop: Finding and Acquiring Data: Discovering and Obtaining Data from Library & Non-library Sources. Please contact us if you have any questions, or need assistance in locating and obtaining data for your research activities.