Published by Irmela Böhler. Modified over 6 years ago.
1
Mining Citation Data Using the Web of Science API
A Data Gold Rush
Phil White, Earth Sciences & Environment Librarian
2
The Question: How would I conduct a citation analysis? How could it be done more efficiently?
3
New Methods
Web of Science
Web of Science API
  API = Application Programming Interface
  SOAP API: runs on XML
  The API has a URL
  Send the API URL an XML message, and it will send an XML message in return
How do I do this?
  One at a time, using API tools (Postman, Hurl.it)
  Programmatically, using a programming language such as Ruby, Python, or R
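The request/response exchange described on this slide can be sketched in Python as below. This is a minimal illustration, not the actual Web of Science schema: the endpoint `API_URL` and the element names `citedReferences` and `uid` are placeholders assumed for the example, and a real service would also require authentication.

```python
import urllib.request
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical placeholder; substitute the endpoint given in the API documentation.
API_URL = "https://example.org/soap-endpoint"


def build_envelope(accession_number: str) -> bytes:
    """Wrap a request for one record's cited references in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    # "citedReferences" and "uid" are illustrative element names (assumption).
    request = ET.SubElement(body, "citedReferences")
    ET.SubElement(request, "uid").text = accession_number
    return ET.tostring(envelope, encoding="utf-8")


def send(envelope: bytes, url: str = API_URL) -> bytes:
    """POST the XML message to the API URL and return the raw XML reply.

    Requires network access and a real endpoint; it is not called here.
    """
    req = urllib.request.Request(
        url,
        data=envelope,
        headers={"Content-Type": "text/xml; charset=utf-8", "SOAPAction": '""'},
    )
    with urllib.request.urlopen(req) as response:
        return response.read()


if __name__ == "__main__":
    # Print the message that would be sent for a made-up accession number.
    print(build_envelope("WOS:000000000000001").decode("utf-8"))
```

Tools such as Postman send the same kind of message by hand; the script form matters once there are hundreds of records to request.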
4
Test Case: Geological Sciences @ CU
Downloaded bibliography of Geoscience faculty pubs at CU for past 5 years
  Symplectic Elements (CSV)
  Each faculty publication comes with a Web of Science accession number
  421 publications indexed by WOS
Developed Python script:
  Opens CSV containing each WOS accession number
  Sends XML message requesting all cited references for each accession number
  Compiles each response into one XML document (24,448 citations)
  About 9 minutes (bye bye student workers)
Cleaned data in OpenRefine
  Standardized journal names using OpenRefine clustering tools
  Matched citation data to local holdings data using OpenRefine reconciliation tool
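The three script steps listed on this slide (open the CSV, request cited references per accession number, compile one XML document) could look roughly like the sketch below. It is not the presenter's script: the column name `wos_id`, the element names `record` and `reference`, and the `fetch` callable are all assumptions made for illustration. The network call is passed in as a function so the compile step can be tried with canned responses.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List


def read_accession_numbers(csv_file: Iterable[str], column: str) -> List[str]:
    """Return the non-empty accession numbers found in one CSV column."""
    reader = csv.DictReader(csv_file)
    return [
        (row.get(column) or "").strip()
        for row in reader
        if (row.get(column) or "").strip()
    ]


def compile_responses(
    accession_numbers: Iterable[str], fetch: Callable[[str], str]
) -> ET.Element:
    """Call fetch() once per accession number and merge the replies.

    fetch is any function that takes an accession number and returns the
    XML reply as text. "record" and "reference" are hypothetical element
    names, not the real response schema.
    """
    combined = ET.Element("citedReferences")
    for number in accession_numbers:
        response = ET.fromstring(fetch(number))
        record = ET.SubElement(combined, "record", {"uid": number})
        record.extend(list(response.iter("reference")))
    return combined


if __name__ == "__main__":
    # Canned data standing in for the exported bibliography and the API.
    sample = io.StringIO("title,wos_id\nPaper A,WOS:1\nPaper B,WOS:2\n")

    def fake_fetch(uid: str) -> str:
        return f"<response><reference>{uid} cites X</reference></response>"

    ids = read_accession_numbers(sample, "wos_id")
    document = compile_responses(ids, fake_fetch)
    print(ET.tostring(document, encoding="unicode"))
```

Keeping the source accession number on each `record` element preserves which publication cited which reference, which later counting and matching steps would need.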
5
Test Case: Geological Sciences @ CU
Results:
  CU provides access to 92% of items cited 5 times or more
  80% of all citations go to just 10% of all items cited (50% to just 1%)
  Discovered gaps in library collection
  Identified core collection of Geoscience serials (and the opposite)
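Figures like the two percentages above can be derived from a simple tally of citations per item. The sketch below shows one possible calculation, assuming the cleaned data has been reduced to a mapping from item title to citation count; the function names, titles, and numbers are made up for illustration and are not the study's data.

```python
import math
from collections import Counter
from typing import Mapping, Set


def share_of_top_items(citation_counts: Mapping[str, int], top_fraction: float) -> float:
    """Fraction of all citations that go to the most-cited top_fraction of items."""
    counts = sorted(citation_counts.values(), reverse=True)
    if not counts:
        return 0.0
    n_top = max(1, math.ceil(len(counts) * top_fraction))
    return sum(counts[:n_top]) / sum(counts)


def access_rate(
    citation_counts: Mapping[str, int], held: Set[str], min_citations: int = 5
) -> float:
    """Fraction of items cited at least min_citations times that the library holds."""
    frequent = [item for item, n in citation_counts.items() if n >= min_citations]
    if not frequent:
        return 0.0
    return sum(1 for item in frequent if item in held) / len(frequent)


if __name__ == "__main__":
    # Toy tally: one heavily cited title and nine rarely cited ones.
    counts = Counter({"Journal A": 91})
    counts.update({f"Journal {i}": 1 for i in range(1, 10)})
    print(share_of_top_items(counts, 0.10))
    print(access_rate(counts, held={"Journal A"}))
```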
6
Next Steps
I’m not done!
Refine methods: the test case matched data sets on serial titles. Very close now to matching on ISSNs. This will speed up the process dramatically.
Integrate other APIs into workflow: OCLC, Crossref
Total time for test case about 40–50 hours. Could be as fast as 1 day.
Current work:
  New science faculty at CU
  Evaluate all sciences at CU
Future work:
  Cross-institution comparison
  …?
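Matching on ISSNs instead of serial titles avoids most of the name-cleaning work, because an ISSN can be normalized and validated mechanically. The sketch below is an illustration of that idea, not the presenter's workflow: `normalize_issn` applies the standard ISSN check-digit rule, while `split_by_holdings` and its output keys are hypothetical names chosen for the example.

```python
import re
from typing import Dict, Iterable, List, Optional


def normalize_issn(raw: str) -> Optional[str]:
    """Return the ISSN as NNNN-NNNC, or None if it is malformed.

    Check digit rule: weight the first seven digits 8 down to 2, sum them,
    and take (11 - sum mod 11) mod 11; a result of 10 is written as X.
    """
    compact = re.sub(r"[\s-]", "", raw).upper()
    if not re.fullmatch(r"[0-9]{7}[0-9X]", compact):
        return None
    total = sum(int(d) * w for d, w in zip(compact[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if compact[7] != expected:
        return None
    return f"{compact[:4]}-{compact[4:]}"


def split_by_holdings(
    cited_issns: Iterable[str], held_issns: Iterable[str]
) -> Dict[str, List[str]]:
    """Sort cited ISSNs into held, not held, and invalid (hypothetical helper)."""
    held = {issn for issn in map(normalize_issn, held_issns) if issn}
    result: Dict[str, List[str]] = {"held": [], "not_held": [], "invalid": []}
    for raw in cited_issns:
        issn = normalize_issn(raw)
        if issn is None:
            result["invalid"].append(raw)
        elif issn in held:
            result["held"].append(issn)
        else:
            result["not_held"].append(issn)
    return result


if __name__ == "__main__":
    cited = ["0028-0836", "00368075", "1234"]
    holdings = ["0028 0836"]
    print(split_by_holdings(cited, holdings))
```

Because both sides are reduced to the same canonical string, the comparison is a set lookup rather than a fuzzy title match.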
7
Implications
A revolution for citation analysis and collection assessment?
  Speed
  Scale
8
Thank You!
Scripts:
More:
Want to collaborate?
Phil White, Earth Sciences & Environment Librarian