Mining Citation Data Using the Web of Science API


1 Mining Citation Data Using the Web of Science API
A Data Gold Rush
Phil White, Earth Sciences & Environment Librarian

2 The Question: How would I conduct a citation analysis? How could it be done more efficiently?

3 New Methods
Web of Science
Web of Science API
  API = Application Programming Interface
  SOAP API: runs on XML
  The API has a URL
  Send the API URL an XML message, and it sends an XML message in return
How do I do this?
  One at a time, using API tools (Postman, Hurl.it)
  Programmatically, using a programming language such as Ruby, Python, or R
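The "send an XML message, get an XML message back" pattern on this slide can be sketched in Python with the standard library alone. This is only an illustration: the URL is a placeholder rather than the real Web of Science endpoint, and the operation and parameter names (`citedReferences`, `uid`) are assumptions, not taken from the slides. A real session would also need authentication, which is omitted here.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
# Placeholder URL (assumption): substitute the endpoint from the API documentation.
API_URL = "https://api.example.org/soap/search"


def build_envelope(operation: str, params: dict) -> bytes:
    """Wrap an operation and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    op = ET.SubElement(body, operation)
    for name, value in params.items():
        ET.SubElement(op, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


def build_request(xml_message: bytes, url: str = API_URL) -> urllib.request.Request:
    """Prepare (but do not send) an HTTP POST carrying the XML message."""
    return urllib.request.Request(
        url,
        data=xml_message,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would send it; the response body is itself XML and can be parsed with `ET.fromstring`.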

4 Test Case: Geological Sciences @ CU
Downloaded bibliography of Geoscience faculty pubs at CU for past 5 years
  Symplectic Elements (CSV)
  Each faculty publication comes with a Web of Science accession number
  421 publications indexed by WOS
Developed Python script:
  Opens CSV containing each WOS accession number
  Sends XML message requesting all cited references for each accession number
  Compiles each response into one XML document (24,448 citations)
  About 9 minutes (bye bye student workers)
Cleaned data in OpenRefine
  Standardized journal names using OpenRefine clustering tools
  Matched citation data to local holdings data using OpenRefine reconciliation tool
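The three script steps on this slide (open the CSV, request cited references per accession number, compile the responses) might be sketched as follows. This is not the author's script. The CSV column name `wos_id`, the `<reference>` element name, and the `fetch` callable that stands in for the actual API call are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List, TextIO


def read_accession_numbers(csv_file: TextIO, column: str = "wos_id") -> List[str]:
    """Return the non-empty accession numbers from one CSV column."""
    reader = csv.DictReader(csv_file)
    return [(row.get(column) or "").strip()
            for row in reader
            if (row.get(column) or "").strip()]


def compile_responses(accession_numbers: Iterable[str],
                      fetch: Callable[[str], str]) -> ET.Element:
    """Call fetch(uid) for each publication and merge the results into one tree.

    fetch is a hypothetical function that sends the XML request for one
    accession number and returns the XML response as a string.
    """
    combined = ET.Element("citedReferences")
    for uid in accession_numbers:
        response = ET.fromstring(fetch(uid))
        publication = ET.SubElement(combined, "publication", {"uid": uid})
        # Assumed response layout: each cited item is a <reference> element.
        publication.extend(list(response.iter("reference")))
    return combined


if __name__ == "__main__":
    # Offline demo with a stubbed fetch; no network access is needed.
    demo_csv = io.StringIO("title,wos_id\nPaper A,WOS:1\nPaper B,WOS:2\n")

    def fake_fetch(uid: str) -> str:
        return f"<response><reference>{uid} cites X</reference></response>"

    tree = compile_responses(read_accession_numbers(demo_csv), fake_fetch)
    print(ET.tostring(tree, encoding="unicode"))
```

Keeping the network call behind a `fetch` parameter means the CSV and merging logic can be tested without touching the API.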

5 Test Case: Geological Sciences @ CU
Results:
  CU provides access to 92% of items cited 5 times or more
  80% of all citations go to just 10% of all items cited (50% to just 1%)
  Discovered gaps in library collection
  Identified core collection of Geoscience serials (and the opposite)
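Figures like the ones on this slide come down to two small calculations over a table of citation counts per item. A minimal sketch follows, assuming the counts are already in a `Counter` keyed by item and the local holdings are a set of the same keys. Both structures and the toy numbers in the usage note are illustrative, not the study's data.

```python
import math
from collections import Counter
from typing import Set


def top_share(counts: Counter, fraction: float) -> float:
    """Share of all citations that go to the most-cited `fraction` of items."""
    ranked = sorted(counts.values(), reverse=True)
    total = sum(ranked)
    if total == 0:
        return 0.0
    k = max(1, math.ceil(len(ranked) * fraction))
    return sum(ranked[:k]) / total


def coverage(counts: Counter, holdings: Set[str], min_citations: int = 5) -> float:
    """Share of items cited at least `min_citations` times that are held locally."""
    frequent = [item for item, n in counts.items() if n >= min_citations]
    if not frequent:
        return 0.0
    return sum(item in holdings for item in frequent) / len(frequent)
```

With toy counts `Counter({"A": 10, "B": 6, "C": 2, "D": 1, "E": 1})`, `top_share(counts, 0.2)` looks at the single most-cited item out of five, and `coverage(counts, {"A", "C"})` checks the two items cited five or more times against the holdings set.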

6 Next Steps
I’m not done!
Refine methods: the test case matched data sets on serial titles. Very close now to matching on ISSNs, which will speed up the process dramatically.
Integrate other APIs into the workflow: OCLC, Crossref
Total time for the test case was about 40–50 hours. Could be as fast as 1 day.
Current work:
  New science faculty at CU
  Evaluate all sciences at CU
Future work:
  Cross-institution comparison
  …?
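Matching on ISSNs rather than serial titles avoids most of the name-standardization work, provided the ISSNs are normalized first. The sketch below shows one way this could look and is not the method used in the test case: it strips formatting, verifies the standard ISSN check digit, and then matches cited ISSNs against holdings. The function names and the input shapes are assumptions.

```python
import re
from typing import Dict, Iterable, Optional


def normalize_issn(raw: str) -> Optional[str]:
    """Return the ISSN as 'NNNN-NNNC', or None if it is malformed."""
    compact = re.sub(r"[^0-9Xx]", "", raw).upper()
    if not re.fullmatch(r"\d{7}[\dX]", compact):
        return None
    # Check digit: weight the first seven digits 8 down to 2, then mod 11.
    total = sum(int(d) * w for d, w in zip(compact[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    if compact[7] != expected:
        return None
    return f"{compact[:4]}-{compact[4:]}"


def match_holdings(cited: Iterable[str], holdings: Iterable[str]) -> Dict[str, bool]:
    """Map each cited ISSN (as given) to whether it appears in the holdings."""
    held = {issn for issn in map(normalize_issn, holdings) if issn}
    return {raw: normalize_issn(raw) in held for raw in cited}
```

For example, `match_holdings(["00280836", "0036-8075"], ["0028-0836"])` reports the first ISSN as held and the second as not held, regardless of hyphenation.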

7 Implications
A revolution for citation analysis and collection assessment?
  Speed
  Scale

8 Thank You!
Scripts:
More:
Want to collaborate?
Phil White, Earth Sciences & Environment Librarian

