Published by Christopher Watson. Modified over 9 years ago.
1
Data Citation: Key to Discovery, Reuse, and Tracking Impact
Curating and Managing Research Data for Reuse
ICPSR Summer Program, August 2, 2013
Elizabeth Moss, MSLIS
eammoss@umich.edu
2
Today’s talk
1. A tour of the ICPSR Bibliography of Data-related Literature
2. The challenges of tracking data reuse (you have to be able to discern data use before you can track data reuse)
3. Efforts to improve citing standards and practices, leading to sharing and impact
3
Top 10 Data Downloads (first half of the year)
Title | Archive | Number of downloads
National Longitudinal Study of Adolescent Health (Add Health), 1994-2008 | DSDR | 2,062
National Survey on Drug Use and Health, 2011 | SAMHDA | 1,216
Chinese Household Income Project, 2002 | DSDR | 720
Health Behavior in School-Aged Children (HBSC), 2005-2006 | SAMHDA | 555
National Survey on Drug Use and Health, 2010 | SAMHDA | 541
General Social Survey, 1972-2010 [Cumulative File] | ICPSR | 524
American National Election Study, 2008: Pre- and Post-Election Survey | ICPSR | 480
Collaborative Psychiatric Epidemiology Surveys (CPES), 2001-2003 [United States] | CPES | 472
India Human Development Survey (IHDS), 2005 | DSDR | 453
Historical, Demographic, Economic, and Social Data: The United States, 1790-2002 | ICPSR | 359
4
Top 10 Series Data Downloads (January through July 2013)
Title | Archive | Number of downloads
National Survey on Drug Use and Health (NSDUH) Series | SAMHDA | 4,012
National Longitudinal Study of Adolescent Health (Add Health), Restricted Data Series | DSDR | 3,701
Uniform Crime Reporting Program Data Series | NACJD | 2,034
Midlife Development in the United States (MIDUS) Series | NACDA | 1,762
ABC News/Washington Post Poll Series | ICPSR | 1,744
National Crime Victimization Survey (NCVS) Series | NACJD | 1,603
American National Election Study (ANES) Series | ICPSR | 1,563
Chinese Household Income Project Series | DSDR | 1,500
Current Population Survey Series | ICPSR | 1,382
National Health Interview Series | NACDA | 1,316
5
Who uses these shared data? How are they used? With what impact?
6
The ICPSR Bibliography of Data-related Literature
- Link research data to scholarly literature about it
- Increase likelihood of discovery and reuse
- Aid students, instructors, researchers, and funders
7
It’s really a searchable database...
... containing over 65,000 citations of known published and unpublished works resulting from analyses of data archived at ICPSR
... that resides in Oracle, with an internal UI for database management
... that can generate study bibliographies linking each study with the literature about it, and out to the full text
17
It’s useful to all stakeholders
- Instructors direct students to begin data-related research projects by reading some of the major works based on the data
- Advanced researchers also use it to conduct a focused literature review before deciding to use a dataset
- Reporters and policymakers looking for processed statistics look for reports explaining studies
- Principal investigators and funding agencies want to track how data are used after they are deposited
18
But challenging to provide
19
Provide PIs and data users with citations (since 1990) and DOIs (since 2008) for all study-level data
20
Explicit citation, in the references, with the DOI
doi:10.3886/ICPSR21240
“The use of DOI names for the citing of data sets would make their provenance trackable and citable and therefore allow interoperability with existing reference services like Thomson Reuters ‘Web of Science’...”
From: http://www.codata.org/taskgroups/TGdatacitation/index.html
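As a minimal sketch of why the DOI form matters for machines, the snippet below turns the `doi:` string shown on this slide into a bare DOI and a resolvable link, then assembles a reference-list entry. The function names and the citation layout are assumptions made for illustration, not an ICPSR or CODATA standard, and the author, title, and distributor values are placeholders.

```python
import re

# Loose structural check for a DOI: "10.", a registrant code, "/", a suffix.
DOI_PATTERN = re.compile(r"^10\.\d{4,9}/\S+$")

# Prefixes that commonly appear in front of a DOI in reference lists.
PREFIXES = (
    "doi:",
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
)


def normalize_doi(raw: str) -> str:
    """Return the bare DOI, with any 'doi:' or resolver prefix removed."""
    doi = raw.strip()
    for prefix in PREFIXES:
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    if not DOI_PATTERN.match(doi):
        raise ValueError(f"not a recognizable DOI: {raw!r}")
    return doi


def doi_url(raw: str) -> str:
    """Return a resolvable https://doi.org/ link for the DOI."""
    return "https://doi.org/" + normalize_doi(raw)


def format_data_citation(author, year, title, distributor, raw_doi):
    """Assemble a reference entry (hypothetical layout, for illustration)."""
    return f"{author}. {year}. {title}. {distributor}. {doi_url(raw_doi)}"


if __name__ == "__main__":
    # Placeholder metadata; only the DOI comes from the slide.
    print(format_data_citation(
        "Example Author", 2013, "Example Study Title",
        "Example Distributor", "doi:10.3886/ICPSR21240",
    ))
```

Whatever prefix an author or publisher uses, the same bare identifier comes out, which is what lets an indexing service match the citation to the study.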
21
The state of data citation in the social science literature
22
Data “Sighting” (implicit) vs. Data Citing (explicit)
Abstract? Acknowledgements? Charts and Tables? Appendices? Discussion? Footnotes? Sample? Methods?
References!
23
Typical “sightings”
- Sample described, not named, no author information, no access information, only a publication cited
- Data named in text, with some attribution, but no access information
- Cited in reference section, but with no permanent, unique identifier, so difficult for indexing scripts to find to automate tracking
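The contrast between a "sighting" and an explicit citation can be sketched with a short script. This is only an illustration of the idea, assuming the ICPSR DOI shape shown earlier in the talk (prefix `10.3886`, suffix `ICPSR` plus a study number, optionally followed by a version such as `.v1`); the function name and sample sentences are made up for the example.

```python
import re

# Assumed shape of an ICPSR study DOI, with an optional version suffix.
ICPSR_DOI = re.compile(r"10\.3886/ICPSR(\d+)(?:\.v\d+)?", re.IGNORECASE)


def find_icpsr_dois(text: str) -> list:
    """Return the study-level ICPSR DOIs found in text, in order, deduplicated."""
    found = []
    for match in ICPSR_DOI.finditer(text):
        doi = f"10.3886/ICPSR{match.group(1)}"  # drop version, normalize case
        if doi not in found:
            found.append(doi)
    return found


if __name__ == "__main__":
    # Explicit citation: a script can pick out the identifier directly.
    explicit = "Example Author. Example Study Title. doi:10.3886/ICPSR21240."
    # Implicit "sighting": the data are described but carry no identifier.
    sighting = "We analyzed a nationally representative survey of adolescents."
    print(find_icpsr_dois(explicit))
    print(find_icpsr_dois(sighting))
```

The second sentence yields nothing, which is the tracking problem in miniature: a human reader might recognize the dataset, but a script has nothing to match on.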
24
Challenges in database search infrastructure
- Journal databases fielded for journal article discovery are not ideal for finding data “sightation”
- No field searching on methods sections
- Full-text search brings back too many bad hits
- Limiting to abstract misses too many good hits
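The last two points describe the familiar trade-off between precision (how many hits are good) and recall (how many good items are hit). A small sketch, using made-up article identifiers rather than real search results, shows how the two search strategies pull in opposite directions:

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for a set of search hits."""
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Toy data, invented for illustration: four articles truly use the dataset.
    uses_data = {"a1", "a2", "a3", "a4"}

    # Full-text search: finds every true use, plus passing mentions (bad hits).
    full_text_hits = {"a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"}

    # Abstract-only search: every hit is good, but most true uses are missed.
    abstract_hits = {"a1"}

    print(precision_recall(full_text_hits, uses_data))
    print(precision_recall(abstract_hits, uses_data))
```

In this toy setup the full-text search has perfect recall but only half its hits are good, while the abstract-only search has perfect precision but finds a quarter of the true uses.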
25
Challenges in tracking many studies
- Tension between highly curating a manageable collection and minimally maintaining a broad collection
- Too many publications for efficient collection by humans, so we must make it easy for scripts to do it reliably
26
Challenges of completeness
- Data use that is too difficult/costly to find cannot be counted
- The result is a selective sample, from which it is difficult to draw accurate conclusions in broad analyses of reuse
27
Challenges in lack of data management planning
- Publishing sequence prevents citation creation before publication
- Potential for change by educating the PI/mentor; graduate directors; liaison librarians
- Consciousness raising starting to occur due to funders’ requirements
33
Poorly described and cited data + Excessive human search effort = Too costly, too questionable for confident measure of impact
37
Citing data with a DOI + Minimal human search effort = High hit accuracy for the cost, and better confidence of impact measures
38
Building a culture of viable data citation to improve measures of impact
39
From: CODATA Data Citation Standards and Practices Task Group. 2012. Task Group Data Citation and Attribution Bibliography http://www.codata.org/taskgroups/TGdatacitation/docs/CODATA_DDCTG_BestPracticesBib_FINAL_17June2012.pdf
40
http://www.datacite.org/
42
http://odin-project.eu/
The tool enables users to search the DataCite Metadata Store for their works, and subsequently to add (or claim) those research outputs – including datasets, software, and other types – to their ORCID profile. This should increase the visibility of these research outputs, and will make it easier to use these data citations in applications that connect to the ORCID Registry – ImpactStory is one of several services already doing this.
http://odin-project.eu/2013/05/13/new-orcid-integrated-data-citation-tool/
43
Integration with Web of Knowledge All Databases: research data is equal to research literature
Finding data with simple search fields
44
Converting journal search infrastructure to meet the needs of data, but synching metadata still a work in progress.
- Articles linked to underlying data.
- Increased data discovery.
- Reward for data citation.
- Potential for automated tracking.
What audience does this have? Anecdotally, no large group of adopters yet.
45
http://iassistdata.org/
46
http://www.codata.org/taskgroups/TGdatacitation/index.html
“CODATA, the Committee on Data for Science and Technology, is an interdisciplinary Scientific Committee of the International Council for Science (ICSU), was established 40 years ago. CODATA works to improve the quality, reliability, management and accessibility of data of importance to all fields of science and technology.”
From: http://www.codata.org/about/who.html
47
“The move to encourage wider access to the results of publicly-funded research will have limited impact without the associated tools, networks and standards that are needed for sharing and mining of data. The Research Data Alliance aims to provide them.” https://rd-alliance.org/
48
Data-PASS partners work to change publishing practice
50
Altmetrics are an attempt to augment or replace the inadequate ways we now use to determine relevant and significant sources of knowledge:
1. peer review
2. citation counting
3. journal impact factors
Altmetrics.org/manifesto
In-text links, blogs, tweets, bookmarks, likes, data downloads...
51
ImpactStory
Example user: http://impactstory.org/CarlBoettiger
Needs: more aggregator and repository data exposed for harvesting metrics
54
ICPSR will create APIs for others to query for usage statistics.
55
Other altmetrics resources
- ASIS&T Bulletin Special Section: Altmetrics: What, Why and Where? April/May 2013. http://www.asis.org/Bulletin/Apr-13/
- Piwowar, Heather. “Data Citation and Altmetrics Panel: Tools that Work Today to Reveal Dataset Use,” April 5, 2013. RDAP 13. Baltimore, MD. http://www.slideshare.net/asist_org/rdap13-piwowar-tools-that-work-today-to-reveal-dataset-use
56
Thank you. Elizabeth Moss eammoss@umich.edu