Data Reference In Depth: Citation
Hailey Mooney, Data Services and Reference Librarian, Michigan State University Libraries
IASSIST 2010 Conference – June 4, 2010
A Question How do current data citation practices affect the ability of librarians to perform the task of bibliographic verification for datasets?
Study of an Assumption “Many researchers…are not aware that published data deserves citation just like published articles, perhaps in part because so many articles presently use data without citation.” Freese, J. (2007). Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research, 36(2).
Research Question How often do faculty at Michigan State University cite (or not cite) research data?
–MSU
  Major research university
  Over 200 programs of study
  17 degree-granting colleges
  47,278 total students: 36,489 undergraduate and 10,789 graduate and professional (Fall 2009)
  Approximately 4,985 faculty and academic staff
  Sponsored research totaled nearly $405 million in 2008–09
Review of previous research Is there any evidence that supports the common knowledge that data is not consistently cited? “…Researchers’ behavior, attitudes and knowledge concerning the citation of data sets fall short of the ideal that would foster openness, fairness and economy in the pursuit of scientific knowledge.” Sieber, J. E. & Trumbo, B. E. (1995). (Not) giving credit where credit is due: Citation of data sets. Science and Engineering Ethics, 1(1).
Call for Standards
1979: Dodd, S. A. (1979). Bibliographic references for numeric social science data files: Suggested guidelines. Journal of the American Society for Information Science, 30(2).
1990: Dodd, S. A. (1990). Bibliographic references for computer files in the social sciences: A discussion paper. Chapel Hill, NC: Institute for Research in Social Science.
2006: Schneider, J. (2006, Spring). Why we need a data citation standard: Lessons learned from compiling ICPSR’s Bibliography of Data-Related Literature. ICPSR Bulletin. Retrieved from Q1.pdf.
2007: Altman, M. & King, G. (2007). A proposed standard for the scholarly citation of quantitative data. D-Lib Magazine, 13(3/4).
2008: Kelly, M. C. (2008). NISO thought leader meeting on research data.
2009: Green, T. (2009). We need publishing standards for datasets and data tables. OECD Publishing White Paper, OECD Publishing.
2009: Brase, et al. (2009). Approach for a joint global registration agency for research data. Information Services & Use, 29(1). (i.e., DataCite)
Style Manuals
Style Manual: Acknowledgment of Datasets
–American Sociological Association Style Guide, 2nd ed. (1997) & 3rd ed. (2007): YES. Appendix provides an example for the citation of “Machine-Readable Data Files.”
–Chicago Manual of Style, 15th ed. (2003): NO, but related sections exist. Source notes: acknowledgment of data. [For a table: the example is for a related publication, not an actual dataset.] Personal communications, unpublished data, and such.
–Publication Manual of the American Psychological Association, 5th ed. (2001): YES. Examples, plus instructions on retaining raw data after an article is published. Unpublished raw data from study, untitled work. & 95: Raw data. Data file, available from government survey. 95. Data file, available from NTIS Web site.
–Publication Manual of the American Psychological Association, 6th ed. (2010): YES. Examples, plus instructions on retaining raw data after an article is published. 7.08 Data Sets, Software, Measurement Instruments, and Apparatus. Data set.
–Style Manual for Political Science, Revised Ed. (2001): YES. Example provided for Data Archived and Available at the Inter-university Consortium for Political and Social Research.
The Context for Data Citations
[Diagram: Data Citations at the center, surrounded by the following areas]
–Citations: Style Manuals, Information Standards
–Library Systems: Organization, Search, Retrieval
–Publishing: Journals, Books, etc.
–Scholarly Communication: Creation, Evaluation, Dissemination, Preservation
–Research Data: Use & Reuse, Data Sharing
–Citation Analysis: Behavior, Motivations
–Reference Desk: Bibliographic verification
Data & Methodology
Universe: All empirical research articles based on secondary analysis of publicly available datasets and published in scholarly journals.
Sample: Journal articles authored or co-authored by MSU faculty indexed in the ICPSR Bibliography of Data-Related Literature.
–Time Period:
–Author not involved in original data collection
–Faculty from social science departments: Political Science, Sociology, Psychology, Criminal Justice
The Sample 49 articles, 25 authors, 39 journals, 33 dataset series
[Figure: Sample Distribution. Number of Articles in Sample by Year]
Variables
–Dataset citation
–Location of dataset citation
–Related publication citation
–Type of related publication
–Citation style format
Frequency of Citations Data is not consistently cited
[Figure: Citation Provided. Categories: Dataset (any), Dataset (reference list), Related publication. Axis: Percentage of Articles]
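A tally of this kind can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used in the study: the `CodedArticle` record, the `percent` helper, and the four sample records are hypothetical, and the field names simply mirror the five variables listed on the Variables slide.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class CodedArticle:
    """One article coded on the five variables (hypothetical record layout)."""
    dataset_cited: bool                        # Dataset citation
    dataset_citation_location: Optional[str]   # Location of dataset citation
    related_publication_cited: bool            # Related publication citation
    related_publication_type: Optional[str]    # Type of related publication
    citation_style: str                        # Citation style format


def percent(articles: List[CodedArticle],
            predicate: Callable[[CodedArticle], bool]) -> float:
    """Percentage of articles for which the predicate holds."""
    if not articles:
        return 0.0
    return 100.0 * sum(1 for a in articles if predicate(a)) / len(articles)


# Made-up records for illustration; these are NOT the study's data.
sample = [
    CodedArticle(True, "reference list", True, "journal article", "APA"),
    CodedArticle(True, "footnote", False, None, "ASA"),
    CodedArticle(False, None, True, "codebook", "APA"),
    CodedArticle(False, None, False, None, "Chicago"),
]

# The three categories match the labels on the Frequency of Citations slide.
summary = {
    "Dataset (any)": percent(sample, lambda a: a.dataset_cited),
    "Dataset (reference list)": percent(
        sample, lambda a: a.dataset_citation_location == "reference list"
    ),
    "Related publication": percent(
        sample, lambda a: a.related_publication_cited
    ),
}

for label, value in summary.items():
    print(f"{label}: {value:.1f}% of articles")
```

Keeping "Dataset (any)" separate from "Dataset (reference list)" matters for bibliographic verification: a dataset mentioned only in the text or a footnote counts in the first category but not the second.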
Journal Styles
Journal Styles and Data Citation Practices Editors fail to enforce standards
Academic Departments and Citation Practices All departments fail to provide consistent data citations, regardless of ICPSR involvement
[Figure: Data Citation Practices by Academic Department. Axes: Academic Department, Articles]
Conclusions Citing data is not a norm –Recent development in the history of scholarly communication Existing standards are not enforced
Further Research Data sharing cultures –More disciplines: social sciences vs. other sciences –Journal data sharing policies –Secondary analysis vs. primary
Back to the Reference Desk Raise awareness Promote good scholarship
Bibliography
Altman, M., & King, G. (2007). A proposed standard for the scholarly citation of quantitative data. D-Lib Magazine, 13(3/4).
Brase, J., Farquhar, A., Gastl, A., Gruttemeier, H., Heijne, M., Heller, A., et al. (2009). Approach for a joint global registration agency for research data. Information Services & Use, 29(1).
Brooks, T. (2010). Citer motivations. In Encyclopedia of Library and Information Sciences (3rd ed., Vol. 2) [Electronic version]. Boca Raton, FL: CRC Press.
Camacho-Miñano, M., & Núñez-Nickel, M. (2009). The multilayered nature of reference selection. Journal of the American Society for Information Science and Technology, 60(4).
Dodd, S. A. (1979). Bibliographic references for numeric social science data files: Suggested guidelines. Journal of the American Society for Information Science, 30(2).
Dodd, S. A. (1990). Bibliographic references for computer files in the social sciences: A discussion paper. Chapel Hill, NC: Institute for Research in Social Science.
Freese, J. (2007). Replication standards for quantitative social science: Why not sociology? Sociological Methods & Research, 36(2).
Green, T. (2009). We need publishing standards for datasets and data tables. OECD Publishing White Paper. Paris: OECD Publishing.
Griffiths, A. (2009). The publication of research data: Researcher attitudes and behaviors. The International Journal of Digital Curation, 4(1).
Kelly, M. C. (2008). NISO Thought Leader meeting on research data.
King, G. (2007). An introduction to the Dataverse Network as an infrastructure for data sharing. Sociological Methods & Research, 36(2).
Nelson, B. (2009). Empty archives. Nature, 461(7261).
Nicolaisen, J. (2007). Citation analysis. Annual Review of Information Science and Technology, 41.
Schneider, J. (2006, Spring). Why we need a data citation standard: Lessons learned from compiling ICPSR's Bibliography of Data-Related Literature. ICPSR Bulletin. Retrieved from Q1.pdf
Sieber, J. E., & Trumbo, B. E. (1995). (Not) giving credit where credit is due: Citation of data sets. Science and Engineering Ethics, 1(1).
Turner, S. (2007). Scientific norms/counternorms. In Blackwell Encyclopedia of Sociology. Blackwell Reference Online.
White, H. D. (1982). Citation analysis of data file use. Library Trends, 30(3).