Mining Citation Data Using the Web of Science API


Mining Citation Data Using the Web of Science API A Data Gold Rush Phil White Earth Sciences & Environment Librarian philip.white@colorado.edu

The Question: How would I conduct a citation analysis? How could it be done more efficiently?

New Methods: Web of Science and the Web of Science API
API = Application Programming Interface.
The Web of Science API is a SOAP API: it runs on XML.
The API has a URL. Send the API URL an XML message, and it will send an XML message in return.
How do I do this?
One at a time, using API tools (Postman, Hurl.it).
Programmatically, using a programming language such as Ruby, Python, or R.
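To make the "send an XML message, get an XML message back" idea concrete, here is a minimal Python sketch of a SOAP 1.1 round trip using only the standard library. It is an illustration, not the author's script: the endpoint URL is a placeholder, the operation and parameter names passed to build_envelope are hypothetical, and authentication is left out entirely. The real element names and login steps should be taken from the Web of Science API documentation.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # standard SOAP 1.1 namespace
# Placeholder endpoint (assumption): replace with the URL from the API documentation.
API_URL = "https://example.org/soap/search"


def build_envelope(operation: str, params: dict) -> bytes:
    """Wrap one operation and its parameters in a SOAP 1.1 envelope."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def send(envelope: bytes, url: str = API_URL) -> ET.Element:
    """POST the XML message to the API URL and parse the XML reply."""
    request = urllib.request.Request(
        url,
        data=envelope,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return ET.fromstring(response.read())


if __name__ == "__main__":
    # Hypothetical operation and parameter names, for illustration only.
    message = build_envelope("citedReferences", {"uid": "WOS:000000000000001"})
    print(message.decode("utf-8"))
```

Tools such as Postman do the same thing by hand: paste the envelope into the request body, set the content type to text/xml, and post it to the API URL.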

Test Case: Geological Sciences @ CU
Downloaded a bibliography of Geoscience faculty publications at CU for the past 5 years from Symplectic Elements (CSV).
Each faculty publication comes with a Web of Science accession number; 421 publications are indexed by WOS.
Developed a Python script: https://github.com/outpw/WOKapiscripts
The script opens the CSV containing each WOS accession number, sends an XML message requesting all cited references for each accession number, and compiles each response into one XML document (24,448 citations).
Run time: about 9 minutes (bye bye student workers).
Cleaned the data in OpenRefine: standardized journal names using the OpenRefine clustering tools, and matched citation data to local holdings data using the OpenRefine reconciliation tool.
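The loop described in this test case (read accession numbers from a CSV, request cited references for each, compile everything into one XML document) could look roughly like the sketch below. It is not the code in the linked repository. The column name accession_number, the wrapper element names, and the fetch callable are all assumptions made for illustration; fetch stands in for whatever function actually sends the XML request and returns the parsed reply.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List


def read_accession_numbers(csv_text: str, column: str = "accession_number") -> List[str]:
    """Return the non-empty accession numbers from one CSV column.

    The column name is hypothetical; use the header from your own export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(row.get(column) or "").strip()
            for row in reader
            if (row.get(column) or "").strip()]


def compile_responses(uids: Iterable[str],
                      fetch: Callable[[str], ET.Element]) -> ET.Element:
    """Call fetch once per accession number and gather the replies
    under a single root element, one <publication> wrapper per request."""
    combined = ET.Element("citedReferences")
    for uid in uids:
        wrapper = ET.SubElement(combined, "publication", uid=uid)
        wrapper.append(fetch(uid))
    return combined


if __name__ == "__main__":
    sample_csv = "title,accession_number\nPaper A,WOS:0001\nPaper B,\nPaper C,WOS:0003\n"

    def fake_fetch(uid: str) -> ET.Element:
        # Stand-in for a real API call, so the sketch runs offline.
        reply = ET.Element("response")
        ET.SubElement(reply, "reference").text = f"cited by {uid}"
        return reply

    document = compile_responses(read_accession_numbers(sample_csv), fake_fetch)
    print(ET.tostring(document, encoding="unicode"))
```

Passing the request function in as an argument keeps the loop testable without network access and makes it easy to add rate limiting or retries around the real call.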

Test Case: Geological Sciences @ CU
Results:
CU provides access to 92% of items cited 5 times or more.
80% of all citations go to just 10% of all items cited (50% go to just 1%).
Discovered gaps in the library collection.
Identified a core collection of Geoscience serials (and the opposite).
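The two headline figures above are simple to compute once each cited item has a citation count. The sketch below shows one way to do it in Python, assuming the counts are keyed by an item identifier (a journal title, for example) and the local holdings are a set of the same identifiers. The function names and the sample data are invented for illustration and do not reproduce the test-case numbers.

```python
import math
from collections import Counter
from typing import Set


def citation_share_of_top(counts: Counter, fraction: float) -> float:
    """Share of all citations that go to the most-cited `fraction` of items."""
    ranked = sorted(counts.values(), reverse=True)
    top_n = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


def held_share(counts: Counter, holdings: Set[str], min_citations: int = 5) -> float:
    """Share of items cited at least `min_citations` times that the library holds."""
    frequent = [item for item, n in counts.items() if n >= min_citations]
    if not frequent:
        return 0.0
    return sum(item in holdings for item in frequent) / len(frequent)


if __name__ == "__main__":
    # Made-up counts: two heavily cited items and eight cited once each.
    counts = Counter({"A": 11, "B": 5, **{f"item{i}": 1 for i in range(8)}})
    holdings = {"A", "item0"}
    print(f"top 10% of items receive {citation_share_of_top(counts, 0.10):.0%} of citations")
    print(f"library holds {held_share(counts, holdings):.0%} of items cited 5+ times")
```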

Next Steps
I'm not done!
Refine methods: the test case matched data sets on serial titles. Very close now to matching on ISSNs, which will speed up the process dramatically.
Integrate other APIs into the workflow: OCLC, Crossref.
Total time for the test case was about 40–50 hours. It could be as fast as 1 day.
Current work: new science faculty at CU; evaluate all sciences at CU.
Future work: cross-institution comparison …?
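Matching on ISSNs avoids most of the title clean-up, because an ISSN can be normalized and validated mechanically. The sketch below is one possible approach, not the author's planned implementation: it strips formatting, verifies the standard ISSN check digit (weights 8 down to 2 over the first seven digits, modulo 11, with X standing for 10), and then sorts cited ISSNs into held, not held, and invalid. The function names and the shape of the result are assumptions.

```python
import re
from typing import Dict, Iterable, List, Optional


def normalize_issn(raw: str) -> Optional[str]:
    """Return the ISSN as 'NNNN-NNNC', or None if it fails the check digit."""
    compact = re.sub(r"[^0-9Xx]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return None
    total = sum(int(digit) * weight
                for digit, weight in zip(compact[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if compact[7] != expected:
        return None
    return f"{compact[:4]}-{compact[4:]}"


def match_on_issn(cited: Iterable[str], holdings: Iterable[str]) -> Dict[str, List[str]]:
    """Sort cited ISSNs into held, not held, and invalid."""
    held_set = {issn for issn in map(normalize_issn, holdings) if issn}
    result: Dict[str, List[str]] = {"held": [], "not_held": [], "invalid": []}
    for raw in cited:
        issn = normalize_issn(raw)
        if issn is None:
            result["invalid"].append(raw)
        elif issn in held_set:
            result["held"].append(issn)
        else:
            result["not_held"].append(issn)
    return result


if __name__ == "__main__":
    print(match_on_issn(["0378 5955", "2049-3630", "1234-5678"], ["03785955"]))
```

Because both sides are reduced to the same canonical string, the match itself is a set lookup, which is where the expected speed-up over title clustering would come from.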

Implications A revolution for citation analysis and collection assessment? Speed Scale

Thank You! Want to collaborate?
Scripts: https://github.com/outpw/WOKapiscripts
More: http://slides.com/philipwhite/datamining
Phil White, Earth Sciences & Environment Librarian, philip.white@colorado.edu