Programmatic interaction with the Invenio-based NADRE Repository

Mr. Mario Torrisi (PI4 – Italy – mario.torrisi@ct.infn.it)
16 October 2018 – Third NADRE Training Workshop – Jimma (Ethiopia)

Outline
Overview
  GitHub repository
Search engine API
  XML API
  JSON API
  Search engine API hands-on
Upload records
  MARCXML file
  Upload records hands-on (curl, php)

Overview
For this tutorial you can refer to the GitHub project repository, which collects all the examples you will see: https://github.com/nadre-project/nadre-tutorial
Clone or download this repository to your system, as shown in this video.

Search engine API
The search engine API lets you search digital assets on the NADRE Repository by sending HTTP requests.
Invenio offers three different kinds of APIs:
XML API - returns output in MARCXML
JSON API - internally, Invenio records are represented in JSON, so you can ask for JSON output format
Python API - the Invenio search engine can be called from within your Python programs (this API is not covered in this tutorial)

XML API

Using the XML API
Invenio replies with an XML document containing the records found.
Syntax: GET /search?p=...&of=...&ot=...&jrec=...&rg=...
Example: get the first 10 records in XML format
http://nadre.ethernet.edu.et/search?jrec=1&rg=10&of=xm
Parameters
jrec - jump to record (e.g. 1 for the first hit)
rg - records in group (e.g. 10 hits per page)
of - output format (e.g. xm for XML format)
Full list of parameters: link
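The request syntax on this slide can also be assembled from a program. The following is a minimal sketch in Python using only the standard library; `build_search_url` is a hypothetical helper name introduced for illustration, not part of Invenio.

```python
from urllib.parse import urlencode

# Search endpoint taken from the examples on the slide.
SEARCH_URL = "http://nadre.ethernet.edu.et/search"


def build_search_url(base_url=SEARCH_URL, **params):
    """Return base_url with the given search parameters appended.

    Parameters set to None are left out of the query string.
    """
    query = {name: value for name, value in params.items() if value is not None}
    return base_url + "?" + urlencode(query)


# First 10 records in XML format, as in the slide example.
url = build_search_url(jrec=1, rg=10, of="xm")
print(url)
```

The helper only builds the URL; fetching it is left to whichever HTTP client you prefer.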

Paginate results (XML API)
Set jrec and rg properly to paginate the output.
Example
http://nadre.ethernet.edu.et/search?of=xm&jrec=1&rg=10
http://nadre.ethernet.edu.et/search?of=xm&jrec=11&rg=10
http://nadre.ethernet.edu.et/search?of=xm&jrec=21&rg=10
Do not set rg too high: there is a server-wide safety limit for it.
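The pagination rule is that each page starts at the previous jrec plus rg. A short sketch of that arithmetic follows, assuming the caller already knows the total number of hits; `page_urls` and `total_hits` are hypothetical names used only for this example.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://nadre.ethernet.edu.et/search"


def page_urls(total_hits, page_size=10, output_format="xm"):
    """Yield one search URL per page of results.

    jrec starts at 1 and advances by page_size, so pages never overlap
    and no record is skipped.
    """
    for jrec in range(1, total_hits + 1, page_size):
        query = {"of": output_format, "jrec": jrec, "rg": page_size}
        yield SEARCH_URL + "?" + urlencode(query)


# 25 hits with 10 records per page need three requests.
urls = list(page_urls(25))
for url in urls:
    print(url)
```

Keeping `page_size` small respects the server-wide limit on rg mentioned on the slide.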

Look for patterns in fields (XML API)
Get the first 10 records that contain the string "Hackfest" in the title:
http://nadre.ethernet.edu.et/search?p=Hackfest&f=title&jrec=0&rg=10&of=xm
Parameters
p - pattern (e.g. your query)
f - field to search within (e.g. "title", "authors", etc.)
Get the first 10 records in the 'PRESENTATIONSNADRE' collection that contain 'NADRE' in the keyword field:
http://nadre.ethernet.edu.et/search?p1=collection:PRESENTATIONSNADRE+keyword:NADRE&of=xm&jrec=1&rg=10
Parameters
p1 - first pattern to search for
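A pattern that contains spaces or colons has to be URL-encoded before it is sent. The sketch below, using only the Python standard library, rebuilds the second example URL; `pattern_search_url` is a hypothetical helper name, and keeping `:` unescaped is a cosmetic choice made here so the result matches the slide.

```python
from urllib.parse import urlencode

SEARCH_URL = "http://nadre.ethernet.edu.et/search"


def pattern_search_url(terms, output_format="xm", jrec=1, rg=10):
    """Build a search URL whose p1 pattern combines field:value terms.

    terms is a list of (field, value) pairs. Spaces between terms are
    encoded as '+', while ':' is kept readable through safe=":".
    """
    pattern = " ".join(f"{field}:{value}" for field, value in terms)
    query = {"p1": pattern, "of": output_format, "jrec": jrec, "rg": rg}
    return SEARCH_URL + "?" + urlencode(query, safe=":")


url = pattern_search_url([("collection", "PRESENTATIONSNADRE"), ("keyword", "NADRE")])
print(url)
```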

Filter records and outputs in NADRE Repository (XML API)
Get all records uploaded from a given date (e.g. 2018-01-01) to another given date (e.g. 2018-02-22):
http://nadre.ethernet.edu.et/search?of=xm&d1=2018-01-01&d2=2018-02-22
Parameters
d1 - the first date, in `YYYY-mm-dd` format
d2 - the second date, in `YYYY-mm-dd` format
Get only the abstract, title and authors of the resources:
http://nadre.ethernet.edu.et/search?of=xm&ot=abstract,title,authors
Parameters
ot - output tags: a comma-separated list of the tags that should be shown (e.g. '' to get all fields, 'title' to get titles only)
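The date and output-tag filters can be combined in one request, since they are all parameters of the same search syntax. The sketch below assumes that combination behaves as the two separate slide examples suggest; `filtered_search_url` is a hypothetical helper name.

```python
from datetime import date
from urllib.parse import urlencode

SEARCH_URL = "http://nadre.ethernet.edu.et/search"


def filtered_search_url(first_day, last_day, output_tags, output_format="xm"):
    """Build a URL limited to a date range and to selected output tags."""
    if first_day > last_day:
        raise ValueError("d1 must not be later than d2")
    query = {
        "of": output_format,
        "d1": first_day.isoformat(),   # date.isoformat() gives YYYY-mm-dd
        "d2": last_day.isoformat(),
        "ot": ",".join(output_tags),   # comma-separated list of tags
    }
    # safe="," keeps the commas of the ot list readable in the URL.
    return SEARCH_URL + "?" + urlencode(query, safe=",")


url = filtered_search_url(date(2018, 1, 1), date(2018, 2, 22),
                          ["abstract", "title", "authors"])
print(url)
```

Using `datetime.date` objects rather than strings means a malformed date fails early, before any request is sent.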

JSON API

Using the JSON API
Internally, Invenio records are represented in JSON. You can ask for JSON output format (`of=recjson`).
Syntax: GET /search?p=...&of=...&ot=...&jrec=...&rg=...
Example: get the first 10 records in JSON format
http://nadre.ethernet.edu.et/search?jrec=1&rg=10&of=recjson
Parameters
jrec - jump to record (e.g. 1 for the first hit)
rg - records in group (e.g. 10 hits per page)
of - output format (e.g. recjson for JSON format)
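A JSON reply can be decoded directly into Python objects. The sketch below uses only the standard library; `fetch_records` is a hypothetical helper that is defined but not called here, and `sample_body` is an invented response whose field names (`recid`, `title`) are assumptions, so check a real reply from the repository before relying on them.

```python
import json
import urllib.request


def fetch_records(url, timeout=30):
    """Download a recjson search result and decode it into Python objects."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Hypothetical response body, assumed here to be a JSON list of records.
sample_body = (
    '[{"recid": 1, "title": {"title": "First record"}},'
    ' {"recid": 2, "title": {"title": "Second record"}}]'
)

records = json.loads(sample_body)
recids = [record["recid"] for record in records]
print(recids)
```

To query the live repository, pass one of the `of=recjson` URLs from the slide to `fetch_records`.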

Paginate results (JSON API)
Set jrec and rg properly to paginate the output.
Example
http://nadre.ethernet.edu.et/search?of=recjson&jrec=1&rg=10
http://nadre.ethernet.edu.et/search?of=recjson&jrec=11&rg=10
http://nadre.ethernet.edu.et/search?of=recjson&jrec=21&rg=10
Do not set rg too high: there is a server-wide safety limit for it.

Look for patterns in fields (JSON API)
Get the first 10 records that contain the string "Hackfest" in the title:
http://nadre.ethernet.edu.et/search?p=Hackfest&f=title&jrec=0&rg=10&of=recjson
Parameters
p - pattern (e.g. your query)
f - field to search within (e.g. "title", "authors", etc.)
Get the first 10 records in the 'PRESENTATIONSNADRE' collection that contain 'NADRE' in the keyword field:
http://nadre.ethernet.edu.et/search?p1=collection:PRESENTATIONSNADRE+keyword:NADRE&of=recjson&jrec=1&rg=10
Parameters
p1 - first pattern to search for

Filter records and outputs in NADRE Repository (JSON API)
Get all records uploaded from a given date (e.g. 2018-01-01) to another given date (e.g. 2018-02-22):
http://nadre.ethernet.edu.et/search?of=recjson&d1=2018-01-01&d2=2018-02-22
Parameters
d1 - the first date, in `YYYY-mm-dd` format
d2 - the second date, in `YYYY-mm-dd` format
Get only the abstract, title and authors of the resources:
http://nadre.ethernet.edu.et/search?of=recjson&ot=abstract,title,authors
Parameters
ot - output tags: a comma-separated list of the tags that should be shown (e.g. '' to get all fields, 'title' to get titles only)

Search engine API hands-on https://github.com/nadre-project/nadre-tutorial/tree/master/search

Search engine references
To learn more about the XML, JSON and Python APIs of an Invenio-based OAR, visit this guide: http://nadre.ethernet.edu.et/help/hacking/search-engine-api

Upload records

Send an IP address authorization request.
Create a MARCXML file as input (e.g. your_file.xml) that describes the resources you are going to upload to the NADRE Repository.
Submit this XML file to the Repository:
curl -T your_file.xml http://nadre.ethernet.edu.et/batchuploader/robotupload/insert -A invenio_webupload -H "Content-Type: application/marcxml+xml"
A generic file you can use as a template for your submission can be found at: https://github.com/nadre-project/nadre-tutorial/blob/master/submit/xml/0-generic-submission-to-OAR.xml
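The curl command on this slide can be mirrored from Python. `curl -T` sends the file with an HTTP PUT, `-A` sets the User-Agent and `-H` adds a header, so the sketch below builds an equivalent request with the standard library. `build_upload_request` is a hypothetical helper name, and the request is only constructed here, not sent, because sending requires an authorized IP address.

```python
import urllib.request

# Upload endpoint taken from the curl command on the slide.
UPLOAD_URL = "http://nadre.ethernet.edu.et/batchuploader/robotupload/insert"


def build_upload_request(marcxml_bytes, url=UPLOAD_URL):
    """Build a PUT request equivalent to the curl -T command."""
    return urllib.request.Request(
        url,
        data=marcxml_bytes,
        method="PUT",  # curl -T uploads with PUT over HTTP
        headers={
            "User-Agent": "invenio_webupload",          # curl -A
            "Content-Type": "application/marcxml+xml",  # curl -H
        },
    )


# Placeholder payload; in practice read the bytes of your_file.xml.
request = build_upload_request(b"<collection></collection>")
print(request.get_method(), request.full_url)
```

To submit for real, pass the request to `urllib.request.urlopen(request)` and inspect the response body for the server's reply.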

your_file.xml (1/3)
Must be compliant with the MARCXML standard
Must have only one <collection…> tag
<collection…> can have one or more <record…> tags, each representing a resource

your_file.xml (2/3)
Each record has many <datafield…> tags
The tag attribute value refers to the corresponding MARC metadata field
Each <datafield…> can have many <subfield…> tags, which hold the metadata values identified by the code attribute value
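The nesting of collection, record, datafield and subfield can be walked with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on an invented sample; `iter_subfields` is a hypothetical helper name, and the sample values are made up for illustration.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace such as '{http://www.loc.gov/MARC21/slim}'."""
    return tag.rsplit("}", 1)[-1]


def iter_subfields(marcxml_text):
    """Yield (datafield tag, subfield code, value) for every subfield."""
    root = ET.fromstring(marcxml_text)
    for element in root.iter():
        if local_name(element.tag) != "datafield":
            continue
        for child in element:
            if local_name(child.tag) == "subfield":
                yield element.get("tag"), child.get("code"), (child.text or "").strip()


# Invented sample: one collection holding one record with two datafields.
sample = """<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <datafield tag="245" ind1=" " ind2=" ">
      <subfield code="a">Example title</subfield>
    </datafield>
    <datafield tag="653" ind1="1" ind2=" ">
      <subfield code="a">NADRE</subfield>
    </datafield>
  </record>
</collection>"""

fields = list(iter_subfields(sample))
print(fields)
```

Matching on the local name lets the same code read files with or without the MARC21 namespace declaration.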

your_file.xml (3/3)
Digital Object Identifier (MAN) (NR) tag="024"
Main author (MAN) (NR) tag="100"
Other authors (R) tag="700"
Keyword (R) tag="653"
Collection (MAN) (NR) tag="980"
(MAN) mandatory tag, (NR) non-repeatable, (R) repeatable
https://nadre.ethernet.edu.et/help/admin/howto-marc
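The tags listed on this slide can be assembled into a submission file programmatically. The sketch below builds one collection with one record using Python's standard library. The subfield codes (`a`, `2`) and the indicator on tag 024 follow common MARC 21 usage and are assumptions here, as are the helper names and all sample values (including the placeholder DOI); compare the output with the generic template in the tutorial repository before uploading.

```python
import xml.etree.ElementTree as ET


def add_datafield(record, tag, subfields, ind1=" ", ind2=" "):
    """Append a datafield with (code, value) subfields to a record."""
    datafield = ET.SubElement(record, "datafield",
                              {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        ET.SubElement(datafield, "subfield", {"code": code}).text = value
    return datafield


def build_collection(doi, main_author, other_authors, keywords, collection_name):
    """Return MARCXML text with a single collection and a single record."""
    collection = ET.Element("collection")
    record = ET.SubElement(collection, "record")
    # Mandatory, non-repeatable fields appear exactly once.
    add_datafield(record, "024", [("a", doi), ("2", "DOI")], ind1="7")
    add_datafield(record, "100", [("a", main_author)])
    # Repeatable fields get one datafield per value.
    for author in other_authors:
        add_datafield(record, "700", [("a", author)])
    for keyword in keywords:
        add_datafield(record, "653", [("a", keyword)])
    add_datafield(record, "980", [("a", collection_name)])
    return ET.tostring(collection, encoding="unicode")


xml_text = build_collection(
    doi="10.0000/example",            # placeholder DOI, not a real one
    main_author="Doe, Jane",
    other_authors=["Roe, Richard", "Poe, Pat"],
    keywords=["NADRE"],
    collection_name="PRESENTATIONSNADRE",
)
print(xml_text)
```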

Upload records hands-on https://github.com/nadre-project/nadre-tutorial/tree/master/submit

Upload records references
BibUpload admin guide: http://nadre.ethernet.edu.et/help/admin/bibupload-admin-guide#2
MARCXML: http://nadre.ethernet.edu.et/help/admin/howto-marc

Thank you! አመሰግናለሁ!