Introducing scientometrics in the CORE Repositories Dashboard: a proposal Petr Knoth & Nancy Pontika CORE The Open University @oacore
What is CORE
Facts > 192 API users
Facts > 125 Repositories Dashboard users
Facts > 1034 Repositories
Facts > 8,900 Journals
Facts > 53 Languages
Facts > 36,207,179 Metadata
Facts > 3,800,995 Full-text
Aiming for the moon!
Cambridge vs Oxford Research Impact Contest. Universities are increasingly choosing to pay for commercial tools that help them evidence the research impact of their academics. The popular tools of choice, such as Elsevier's SciVal, Thomson Reuters' Web of Science and, more recently, Altmetric, cost universities substantial amounts. However, many performance indicators, including bibliometric and usage-based data, can now be collected for free from datasets available on the Web and via APIs. This allows us to acquire both article-level and higher-level performance indicators to evidence impact for a given university based on the papers in its repository. It also makes it possible to compare the research performance of universities using these metrics. In this demonstration we show, using the example of the traditional Oxford University vs Cambridge University contest, how to gather and compare the research performance of universities at no cost. Using the popular IPython Notebook environment, we show some code snippets and graphs demonstrating the practicality of our approach. Image source: The JeanRichard Aquascope Boat Race http://www.horbiter.com/en/jeanrichard-aquascope-boat-race/
Workflow. CORE harvests the research repositories of both institutions. Publication records for a given institution are accessible through the CORE API; all records were extracted for a given repository this way. Microsoft Academic Graph (MAG), the world's largest free open citation dataset, is used to enrich our data. Mendeley is one of the most popular research reference managers and networks; we use it as a free source of "altmetric" information: Mendeley readership. The data was integrated with the Mendeley Catalog API and the Microsoft Academic Graph. We analyse! The integrated data was analysed and conclusions of the results are... (tbc)
Step 1. Get publications for a given institution. We export only the fields we are interested in (title, DOI, etc.) to CSV for all records in both repositories.
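The export in Step 1 can be sketched as follows. This is a minimal illustration, not the code shown in the demonstration: it assumes the records have already been fetched from the CORE API as a list of dictionaries, and the field names (`title`, `doi`, `year`) and the helper `export_records` are hypothetical.

```python
import csv
import io

# Field names are assumptions for illustration; real CORE records may differ.
FIELDS = ["title", "doi", "year"]

def export_records(records, out, fields=FIELDS):
    """Write only the selected fields of each record to CSV."""
    writer = csv.DictWriter(out, fieldnames=fields)
    writer.writeheader()
    for record in records:
        # Fields missing from a record are written as empty cells.
        writer.writerow({f: record.get(f, "") for f in fields})
    return out

# Hypothetical records standing in for an API response.
sample = [
    {"title": "Paper A", "doi": "10.1000/a", "year": 2010, "abstract": "..."},
    {"title": "Paper B", "year": 2012},
]
buffer = export_records(sample, io.StringIO())
csv_text = buffer.getvalue()
```

Writing to a file instead only requires passing an open file handle (opened with `newline=""`) in place of the `StringIO` buffer.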
Step 2. Enrich the dataset with Mendeley readership
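A rough sketch of the enrichment in Step 2, under stated assumptions: the reader counts are taken to have been retrieved from Mendeley already (the API call itself is omitted), and records are matched on a normalised DOI. Matching on DOI here, the helper names and the `readers` field are all illustrative choices, not details confirmed by the slides.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip common prefixes so that keys match."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/",
                   "http://dx.doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi

def add_readership(records, reader_counts):
    """Return copies of records with a 'readers' field (None when unknown)."""
    lookup = {normalise_doi(d): n for d, n in reader_counts.items()}
    enriched = []
    for record in records:
        doi = record.get("doi")
        readers = lookup.get(normalise_doi(doi)) if doi else None
        # Copy the record so the input list is left unchanged.
        enriched.append({**record, "readers": readers})
    return enriched

# Hypothetical data: one paper with a DOI, one without.
papers = [{"title": "Paper A", "doi": "10.1000/A"},
          {"title": "Paper B", "doi": None}]
counts = {"https://doi.org/10.1000/a": 12}
enriched = add_readership(papers, counts)
```

Keeping unknown readership as `None` rather than 0 avoids confusing "no data" with "no readers" in later averages.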
Average readers per year from 2000 to 2016
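The per-year averages behind a graph like this could be computed along the following lines. This is a sketch only: the `year` and `readers` field names are assumptions, and papers with unknown readership are skipped rather than counted as zero.

```python
from collections import defaultdict

def average_readers_per_year(records, first_year=2000, last_year=2016):
    """Mean reader count per publication year within the given range."""
    totals = defaultdict(lambda: [0, 0])  # year -> [sum of readers, papers]
    for record in records:
        year, readers = record.get("year"), record.get("readers")
        if year is None or readers is None:
            continue  # no usable data for this paper
        if first_year <= year <= last_year:
            totals[year][0] += readers
            totals[year][1] += 1
    return {year: total / count
            for year, (total, count) in sorted(totals.items())}

# Hypothetical enriched records.
sample = [
    {"year": 2010, "readers": 10},
    {"year": 2010, "readers": 20},
    {"year": 2015, "readers": 4},
    {"year": 1999, "readers": 100},   # outside the 2000-2016 window
    {"year": 2012, "readers": None},  # readership unknown
]
averages = average_readers_per_year(sample)
```

Running the same function over each institution's records gives two series that can be plotted side by side.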
Step 3. Enrich with citations from MAG. We then queried the metric for all papers and sorted by citation counts.
Post-process citation data: remove papers with 0 citations, merge by DOI, aggregate by year. Citation counts have a very long tail in general, as expected. A count of 0 could also mean no data, so we did not take papers with 0 citations into account when producing averages for the Cambridge/Oxford comparison. This removed the long tail, and we merged the results into a single table. To make the graph clearer, we divided by the log of the number of years, to reduce the advantage of older papers, which have had more time to collect citations.
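The post-processing described above can be sketched as below. The slides do not give the exact normalisation, so this assumes one plausible reading: the yearly mean is divided by log(1 + age), where age is the number of years since publication. Papers published in the reference year are left out because the divisor would be zero. The function and field names are hypothetical.

```python
import math
from collections import defaultdict

def citation_averages(papers, citations_by_doi, current_year):
    """Age-normalised mean citations per paper, keyed by publication year."""
    sums = defaultdict(lambda: [0, 0])  # year -> [total citations, papers]
    for paper in papers:
        # Merge by DOI; a DOI absent from the citation data counts as 0.
        cites = citations_by_doi.get(paper["doi"], 0)
        if cites == 0:
            continue  # 0 may mean "no data", so it is excluded
        sums[paper["year"]][0] += cites
        sums[paper["year"]][1] += 1
    result = {}
    for year, (total, count) in sorted(sums.items()):
        age = current_year - year
        if age < 1:
            continue  # log(1 + 0) == 0, cannot normalise
        # Assumed normalisation: mean citations divided by log(1 + age).
        result[year] = (total / count) / math.log(1 + age)
    return result

# Hypothetical papers and citation counts.
papers = [
    {"doi": "10.1000/a", "year": 2010},
    {"doi": "10.1000/b", "year": 2010},
    {"doi": "10.1000/c", "year": 2010},  # 0 citations, dropped
    {"doi": "10.1000/d", "year": 2014},
    {"doi": "10.1000/e", "year": 2014},  # not in the citation data, dropped
]
citations = {"10.1000/a": 10, "10.1000/b": 30, "10.1000/c": 0, "10.1000/d": 6}
normalised = citation_averages(papers, citations, current_year=2016)
```

A different divisor (for example the log of the age alone) would change the absolute values but serves the same purpose of damping the advantage of older papers.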
Average citations per paper by year
And the winner is… Oxford University papers have a higher readership: Oxford vs Cambridge 1 – 0. University of Cambridge papers are cited more often: 0 – 1.
CORE Metrics - I
CORE Metrics - II
Thank you! Petr Knoth, Research Fellow, petr.knoth@open.ac.uk Nancy Pontika, Open Access Aggregation Officer, nancy.pontika@open.ac.uk Website: http://core.ac.uk Email: theteam@core.ac.uk Twitter: @oacore