Digitometric Services for Open Archives Environments


1 Digitometric Services for Open Archives Environments
Tim Brody, Simon Kampa, Stevan Harnad, Les Carr, Steve Hitchcock
University of Southampton, Intelligence, Agents, Multimedia Group
08 December 2018 ECDL 2003, Trondheim, Norway

2 Open Archives Initiative
The protocol is openly documented, and metadata is “exposed” to at least some peer group (note: rights management can still apply!)
Archive is defined as a “collection of stuff”, not the archivist’s definition of “archive”. “Repository” is used in most OAI documents.
Promoting interoperability

3 OAI Data Model: Resources/Items/Records
Item = OAI identifier: all available (meta)data about the resource
Records (XML) for an item: Dublin Core metadata, MARC metadata, ???
Record = metadata + identifier + datestamp
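The item/record relationship on this slide can be sketched as a small data structure. This is an illustrative sketch only: the class names, field names, metadata prefixes and the example identifier are assumptions made for the example, not part of the talk or of any OAI library.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Record:
    """One metadata rendering of an item: metadata + identifier + datestamp."""
    identifier: str       # OAI identifier of the owning item
    datestamp: str        # date of last change, e.g. "2003-08-17"
    metadata_prefix: str  # names the format, e.g. "oai_dc" for Dublin Core
    metadata: str         # the XML payload, kept as plain text here


@dataclass
class Item:
    """Everything a repository holds about one resource, keyed by format."""
    identifier: str
    records: Dict[str, Record] = field(default_factory=dict)

    def add(self, record: Record) -> None:
        if record.identifier != self.identifier:
            raise ValueError("record belongs to a different item")
        self.records[record.metadata_prefix] = record


item = Item("oai:example.org:1234")  # hypothetical identifier
item.add(Record(item.identifier, "2003-08-17", "oai_dc", "<dc>...</dc>"))
item.add(Record(item.identifier, "2003-08-17", "marc21", "<record>...</record>"))
print(sorted(item.records))
```

One item carries one identifier, while each metadata format contributes its own record under that identifier.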

4 Protocol Responses

5 Protocol
The Service Provider sends HTTP URL requests; the Data Provider returns XML responses
1 Identify: collection-level description
2 ListRecords?metadataPrefix=xyz: all repository xyz records
3 ListRecords?from= &…: all repository xyz records since the given date
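The three numbered requests can be sketched as plain URL construction. In OAI-PMH 2.0 the command is passed as a `verb` query argument, so the slide's shorthand `ListRecords?metadataPrefix=xyz` becomes `?verb=ListRecords&metadataPrefix=xyz` on the wire. The base URL, the helper name `oai_request_url` and the example date below are assumptions made for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://archive.example.org/oai"  # hypothetical data provider


def oai_request_url(base_url: str, verb: str, **arguments: str) -> str:
    """Build an OAI-PMH request URL: the verb plus its arguments as a query string."""
    query = {"verb": verb}
    # "from" is a Python keyword, so callers pass from_ and it is renamed here.
    for name, value in arguments.items():
        query[name.rstrip("_")] = value
    return base_url + "?" + urlencode(query)


# 1. Collection-level description
identify_url = oai_request_url(BASE_URL, "Identify")
# 2. All Dublin Core records in the repository
full_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
# 3. Only the records changed since a given date
incremental_url = oai_request_url(
    BASE_URL, "ListRecords", metadataPrefix="oai_dc", from_="2003-01-01"
)
print(identify_url, full_url, incremental_url, sep="\n")
```

A real harvester would also need to follow resumption tokens for large result lists; that is left out of this sketch.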

6 Other Commands
ListIdentifiers: return only the identifier/datestamp/set membership
ListMetadataFormats: return the available data formats
ListSets: return the set structure (if there is one)
GetRecord: return a record given by OAI identifier
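As a rough illustration of what ListIdentifiers returns, the sketch below parses a hand-written response fragment into identifier/datestamp/set-membership entries using Python's standard XML parser. The sample XML is an abbreviated, made-up response (a real one also carries `responseDate` and `request` elements), and `parse_headers` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Abbreviated, hand-written example response (assumption, not real data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2003-08-01</datestamp>
      <setSpec>physics</setSpec>
    </header>
    <header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2003-08-02</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def parse_headers(xml_text: str) -> list:
    """Extract identifier, datestamp and set membership from each header."""
    root = ET.fromstring(xml_text)
    headers = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        headers.append({
            "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
        })
    return headers


headers = parse_headers(SAMPLE_RESPONSE)
for entry in headers:
    print(entry)
```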

7 Interest in OAI
111 registered OAI repositories
Many unregistered (e.g. all GNU EPrints.org and DSpace archives)
4,500,000 public records
NSDL project, UK’s JISC Information Environment
OLAC (language community built on OAI)

8 Why OAI?
1. Mandated Dublin Core allows the quick establishment of basic services and tools
2. Simple and metadata-neutral protocol allows more interesting possibilities (without breaking 1.) and extensions …

9 Adding Caching to OAI-PMH

10 Celestial (OAI Cache)
Developed to maintain a local metadata copy
Avoids repeated, large harvests during development
Provides an abstraction over multiple OAI versions (hence acts as a gateway to older implementations)
Useful for testing OAI implementations and improving performance
Provides a Web interface to OAI using XSLT
Provides redundancy
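The idea of keeping a local metadata copy and harvesting only what has changed can be sketched as follows. This is not Celestial's actual implementation: `MetadataCache`, the injected `fetch` function and the toy repository are all assumptions chosen to keep the example self-contained and free of network access.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# A harvested record reduced to (identifier, datestamp, metadata XML).
Harvested = Tuple[str, str, str]


class MetadataCache:
    """Local copy of a repository's records, refreshed incrementally."""

    def __init__(self, fetch: Callable[[Optional[str]], Iterable[Harvested]]):
        self._fetch = fetch  # called with the "from" date, or None for a full harvest
        self._records: Dict[str, Tuple[str, str]] = {}
        self._last_datestamp: Optional[str] = None

    def refresh(self) -> int:
        """Harvest records changed since the last refresh; return how many arrived."""
        count = 0
        for identifier, datestamp, metadata in self._fetch(self._last_datestamp):
            self._records[identifier] = (datestamp, metadata)
            # ISO dates compare correctly as strings.
            if self._last_datestamp is None or datestamp > self._last_datestamp:
                self._last_datestamp = datestamp
            count += 1
        return count

    def __len__(self) -> int:
        return len(self._records)


# Toy stand-in for a remote data provider (hypothetical data).
REPOSITORY = [
    ("oai:example.org:1", "2003-01-10", "<dc>one</dc>"),
    ("oai:example.org:2", "2003-02-20", "<dc>two</dc>"),
]


def fake_fetch(since: Optional[str]) -> Iterable[Harvested]:
    # "from" is inclusive, so a record dated exactly `since` is returned again.
    return [r for r in REPOSITORY if since is None or r[1] >= since]


cache = MetadataCache(fake_fetch)
first = cache.refresh()
REPOSITORY.append(("oai:example.org:3", "2003-03-05", "<dc>three</dc>"))
second = cache.refresh()
print(first, second, len(cache))
```

Because the cache sits between the harvester and the repository, downstream services can re-read the local copy as often as they like without putting load on the original archive.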

12 Citebase Search – Data Model
e-Services

13 Content
250,000 full-text resources
240,000 of which from arXiv.org
6 million references
29 mean refs/paper (therefore failed to extract references for 18% of papers) (n.b. modal refs is 19)
1 million references linked internally to the full-text (15%)
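The percentages on this slide can be roughly reproduced from its rounded counts. The sketch assumes that the mean of 29 is taken over papers whose references were successfully extracted; that reading is an interpretation, not something the slide states. With the rounded inputs both shares come out near 17%, close to but not exactly the quoted 18% and 15%, presumably because the slide's percentages were computed from unrounded counts.

```python
papers = 250_000          # full-text resources
references = 6_000_000    # extracted references (rounded on the slide)
mean_refs = 29            # mean references per paper with extracted references
linked = 1_000_000        # references linked internally to a full text

# Papers that must have yielded references, if each contributes 29 on average.
papers_with_refs = references / mean_refs
failed_share = 1 - papers_with_refs / papers
linked_share = linked / references

print(f"papers with extracted references: {papers_with_refs:,.0f}")
print(f"share of papers with no references extracted: {failed_share:.1%}")
print(f"share of references linked internally: {linked_share:.1%}")
```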

15 Citebase Search
The abstract page shows the usual title/authors/abstract and some analysis of the current article. The graph shows when, over time, the paper has been cited and downloaded.

16 Citebase Search: Navigation by Citation Links
[Diagram labels: Article with reference list, Reference link, Past, Current Article, Future, Related, Co-cited]
Following the abstract are links to related pages by citations. These links can go backwards in time using the reference list, forwards in time through the articles that have cited the current one, and sideways to either related or co-cited papers. Related papers are papers that have a similar reference list (often where an author has used the same references more than once). Co-cited papers are papers that have been cited together by the same article, analogous to author co-citation. However, co-cited papers can only be found for articles that have been cited, hence can’t be used for new articles.
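The four navigation directions can be sketched over a toy citation graph. The data and the function names below are hypothetical, and the counting scheme (number of shared citing papers, number of shared references) is one simple choice rather than a description of how Citebase ranks its links.

```python
from collections import Counter
from typing import Dict, Set

# Hypothetical toy data: paper -> the papers in its reference list.
REFERENCES: Dict[str, Set[str]] = {
    "A": {"X", "Y"},
    "B": {"X", "Y", "Z"},
    "C": {"A", "B"},
    "D": {"A", "Z"},
}


def cited_by(paper: str) -> Set[str]:
    """Forwards in time: papers whose reference list contains this paper."""
    return {p for p, refs in REFERENCES.items() if paper in refs}


def co_cited(paper: str) -> Counter:
    """Sideways: papers that appear alongside this one in a citing paper's references."""
    counts: Counter = Counter()
    for citing in cited_by(paper):
        for other in REFERENCES[citing]:
            if other != paper:
                counts[other] += 1
    return counts


def related(paper: str) -> Counter:
    """Sideways: papers sharing references with this one (similar reference list)."""
    own = REFERENCES.get(paper, set())
    counts: Counter = Counter()
    for other, refs in REFERENCES.items():
        shared = len(own & refs)
        if other != paper and shared:
            counts[other] = shared
    return counts


print("past:", sorted(REFERENCES["A"]))
print("future:", sorted(cited_by("A")))
print("co-cited:", dict(co_cited("A")))
print("related:", dict(related("A")))
print("co-cited for uncited paper C:", dict(co_cited("C")))
```

Paper C has not been cited by anything in the toy data, so its co-cited list is empty, while its related list would still work because that relies only on C's own references.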

17 Citebase Search: cites
This is the reference list, as parsed from the full-text. “eprint” takes the user to the Citebase abstract page of the cited article; “journal” links are bespoke links for the American Physical Society journals.

18 Citebase Search: cites
Articles that have cited the current article; following these links will take the user towards newer papers.

19 Citebase Search: “Co-cited”
And co-cited articles. The development version of Citebase also includes Related articles.

20 Read/Cite Cycle

21 Digitometric Services for OAI
Tools for visualising research metadata
Builds an analysis service on Citebase
Knowledge mapping (co-authors, co-citation, etc.)

22 Co-Citation Network
A co-citation map embedded within the Digitometric user interface. The nodes on the map represent individual publications. By hovering with the mouse pointer over a node, the user can bring up details (title, author, abstract) in the information box. The arcs between the nodes represent a co-citation relationship. A cluster of related publications is evident in the centre of the map. Four distinct paths emanate from it, indicating the possibility of specialty fields arising out of the main cluster.

23 Full Co-Citation Map
A full-sized co-citation map with a lower co-citation threshold, resulting in more nodes being included. Several clusters (research fronts) are evident, in particular the large cluster towards the bottom right of the map. Researchers may get a better understanding of their research landscape by exploring these clusters and the relationships between them. Different colours are also used to indicate which nodes have recently been highly cited, paving the way for up-and-coming (or dying) research fronts to be identified. There are also several occurrences of 5 or 6 nodes emanating sequentially out of a single node, indicating a sequence of published papers that address a common problem or theme.
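The effect of the co-citation threshold on map size can be sketched with a few toy reference lists. Each pair of papers appearing in the same reference list counts as one co-citation, and only pairs at or above the threshold become arcs. The data and the function name `co_citation_edges` are hypothetical; how the Digitometric service actually computes or lays out its maps is not described in the slides.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, List, Tuple

# Hypothetical toy data: one reference list per citing paper.
REFERENCE_LISTS: List[List[str]] = [
    ["P1", "P2", "P3"],
    ["P1", "P2"],
    ["P2", "P3", "P4"],
    ["P1", "P2", "P4"],
]


def co_citation_edges(
    reference_lists: List[List[str]], threshold: int
) -> Dict[Tuple[str, str], int]:
    """Count how often each pair is cited together; keep pairs meeting the threshold."""
    counts: Counter = Counter()
    for refs in reference_lists:
        # sorted(set(...)) gives each unordered pair one canonical form.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= threshold}


for threshold in (3, 2, 1):
    edges = co_citation_edges(REFERENCE_LISTS, threshold)
    nodes = {paper for pair in edges for paper in pair}
    print(f"threshold {threshold}: {len(nodes)} nodes, {len(edges)} arcs")
```

Lowering the threshold admits weaker co-citation links, which is why the full map shows more nodes and more clusters than the embedded one.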

24 Digitometric Services for Open Archives Environments
AKT Project (knowledge)
Thank you for listening!
Tim Brody

