Web-Services and RESTful APIs BCHB697
Outline
- Web-services
- RESTful web-services
- Use of RESTful APIs from Python
  - CouchDB, UniProt
BCHB697 - Edwards

Web-services
- Provide computers with information or computed values over the network
  - Internet as network (global, IP)
  - HyperText Transfer Protocol (HTTP, HTTPS)
  - Machine-readable responses (XML, JSON)
- Remote "function call": send data (parameters), get data back
- Use the same infrastructure that supports web browsers, Facebook, Gmail, etc.

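A machine-readable response is one a program can decode directly, without scraping a web page. The sketch below decodes the same record from JSON and from XML using only the Python 3 standard library. Both response bodies and their field names (taxid, name, rank) are invented for illustration; they are not the actual output of any service named in these slides.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies for one taxonomy record (illustrative only).
json_body = '{"taxid": 9606, "name": "Homo sapiens", "rank": "species"}'
xml_body = ('<taxon taxid="9606"><name>Homo sapiens</name>'
            '<rank>species</rank></taxon>')

def parse_json(body):
    """Decode a JSON response body into a plain dict."""
    return json.loads(body)

def parse_xml(body):
    """Decode the XML form into the same dict shape as parse_json."""
    root = ET.fromstring(body)
    return {
        "taxid": int(root.get("taxid")),   # attribute on the root element
        "name": root.findtext("name"),     # text of the <name> child
        "rank": root.findtext("rank"),     # text of the <rank> child
    }

record_from_json = parse_json(json_body)
record_from_xml = parse_xml(xml_body)
print(record_from_json == record_from_xml)
```

JSON maps straight onto Python dicts and lists, while XML needs an explicit walk over elements and attributes, which is one reason many newer services default to JSON.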
Web-services
http://hoyataxa.georgetown.edu:8080/taxa?taxid=9606&format=XML
- protocol (http, https): http
- computer name: hoyataxa.georgetown.edu
- port # (80 if omitted): 8080
- resource: /taxa
- resource/request parameters: taxid=9606&format=XML
http://hoyataxa.georgetown.edu/taxa/9606.xml
- protocol (http, https): http
- computer name: hoyataxa.georgetown.edu
- resource: /taxa
- resource/request parameters: 9606.xml

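The URL anatomy above can be checked programmatically. This is a minimal Python 3 sketch using urllib.parse from the standard library; the helper name describe_url and the keys of the dictionary it returns are illustrative choices, not part of any course module.

```python
from urllib.parse import urlsplit, parse_qs

def describe_url(url):
    """Split a URL into the pieces labelled on the slide."""
    parts = urlsplit(url)
    default_ports = {"http": 80, "https": 443}
    return {
        "protocol": parts.scheme,
        "computer": parts.hostname,
        # parts.port is None when the URL omits the port
        "port": parts.port or default_ports.get(parts.scheme),
        "resource": parts.path,
        "parameters": parse_qs(parts.query),
    }

with_query = describe_url(
    "http://hoyataxa.georgetown.edu:8080/taxa?taxid=9606&format=XML")
in_path = describe_url("http://hoyataxa.georgetown.edu/taxa/9606.xml")

for key, value in with_query.items():
    print(key, value)
```

In the second URL the identifier and format are part of the path, so the parsed parameters are empty and the port falls back to the protocol's default.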
HyperText Transfer Protocol (HTTP)
- Uniform Resource Locator (URL): protocol; machine; port; resource; parameters
- For the interactive web, two main request types (methods):
  - GET – everything is encoded in the URL
  - POST – URL plus extra data
- GET requests (URLs) can be bookmarked
- POST requests are used for web-form submissions

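To make the GET/POST distinction concrete, the Python 3 sketch below builds both kinds of request for the same parameters without sending either one. Whether the hoyataxa service accepts POST at all is an assumption made only for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

base = "http://hoyataxa.georgetown.edu/taxa"
params = {"taxid": 9606, "format": "XML"}

# GET: the parameters travel inside the URL, so the URL alone
# describes the whole request (and can be bookmarked).
get_request = Request(base + "?" + urlencode(params))

# POST: the URL names only the resource; the parameters travel as
# extra data in the request body, as a web form would send them.
post_request = Request(base, data=urlencode(params).encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url, post_request.data)
```

Nothing is transmitted until a request object is passed to urllib.request.urlopen, so this is safe to run offline.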
RESTful web-services
- Web-service APIs are often used to provide the instances of a logical data-model to client applications
- REST: Representational State Transfer
- RESTful web-services associate URIs (Uniform Resource Identifiers) with data-model entities:
    http://hoyataxa.georgetown.edu/taxa/9606
      protocol (http, https): http
      authority / data-provider: hoyataxa.georgetown.edu
      entity: taxa
      identifier: 9606

RESTful web-services
- RESTful web-services associate URIs with data-model entities:
    http://hoyataxa.georgetown.edu/taxa/9606
      protocol (http, https): http
      authority / data-provider: hoyataxa.georgetown.edu
      entity: taxa
      identifier: 9606
- Notice how well this maps to database (authority / data-provider), table (entity), and row (identifier)
- This web-service returns everything about taxonomy id 9606 in XML or JSON format.

Examples of RESTful resource URIs
- UniProt: https://www.uniprot.org/uniprot/P25911
- GlyTouCan: https://glytoucan.org/Structures/Glycans/G00028MO
- NCBI: https://www.ncbi.nlm.nih.gov/gene/54923
- These URIs are easily constructed and quite transparent
- Access using HTTP GET; safe and read-only

RESTful Web-Services
- Unique URI to reference every resource in your API
    http://hoyataxa.georgetown.edu/taxa/9606
- Omit identifier for summary or list
    http://hoyataxa.georgetown.edu/taxa
- Use parameters for collections or subsets
    http://hoyataxa.georgetown.edu/taxa?rank=species
- GET requests for URIs and simple parameters

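The three URI shapes above follow one rule, which a small helper can capture. This is a Python 3 sketch; resource_uri is a hypothetical name, and it only constructs strings without contacting the server.

```python
from urllib.parse import quote, urlencode

BASE = "http://hoyataxa.georgetown.edu"

def resource_uri(collection, identifier=None, **filters):
    """Build an instance, list, or filtered-collection URI."""
    uri = BASE + "/" + quote(collection)
    if identifier is not None:
        # Instance URI: one specific resource
        uri += "/" + quote(str(identifier), safe="")
    if filters:
        # Subset of the collection, selected by GET-style parameters
        uri += "?" + urlencode(filters)
    return uri

instance_uri = resource_uri("taxa", 9606)
list_uri = resource_uri("taxa")
subset_uri = resource_uri("taxa", rank="species")

print(instance_uri)
print(list_uri)
print(subset_uri)
```

Quoting the identifier and encoding the filters keeps the URI valid even when a value contains spaces or slashes.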
RESTful Web-Services
- HTTP request types:
  - GET – retrieve a specific resource (idempotent)
  - PUT – update a specific resource (idempotent)
  - DELETE – delete a specific resource (idempotent)
  - POST – create/retrieve/update/delete (CRUD) one or more resources
- CRUD operations:
  - Create: POST to entity URI
  - Retrieve: GET instance URI
  - Update: POST or PUT instance URI
  - Delete: POST or DELETE instance URI

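"Idempotent" means that repeating a request leaves the server in the same state as making it once. The toy in-memory store below (Python 3) illustrates the difference between PUT or DELETE and a POST that creates resources. The class TaxaStore and its methods are invented for this sketch and do not model any real service.

```python
import itertools

class TaxaStore:
    """Toy in-memory resource collection; not a real web-service."""

    def __init__(self):
        self.rows = {}
        self._next_id = itertools.count(1)

    def post(self, doc):
        """Create: every call assigns a fresh identifier."""
        new_id = next(self._next_id)
        self.rows[new_id] = dict(doc)
        return new_id

    def get(self, ident):
        """Retrieve: reads never change the stored state."""
        return self.rows.get(ident)

    def put(self, ident, doc):
        """Update: the client names the identifier, so repeats overwrite."""
        self.rows[ident] = dict(doc)

    def delete(self, ident):
        """Delete: removing an already-missing resource changes nothing."""
        self.rows.pop(ident, None)

store = TaxaStore()

store.put(9606, {"name": "Homo sapiens"})
store.put(9606, {"name": "Homo sapiens"})   # repeated PUT, same state
count_after_put = len(store.rows)

store.post({"name": "Mus musculus"})
store.post({"name": "Mus musculus"})        # repeated POST, second resource
count_after_post = len(store.rows)

store.delete(9606)
store.delete(9606)                          # repeated DELETE, same state
count_after_delete = len(store.rows)

print(count_after_put, count_after_post, count_after_delete)
```

This is why a client can safely retry a failed GET, PUT, or DELETE, but should be careful about retrying a POST.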
RESTful Web-Services
- HTTP request body (for PUT, DELETE, POST):
  - May be empty or JSON/XML
  - GET style parameters can also be used
- HTTP return codes:
  - 400 Bad Request
  - 401 Unauthorized
  - 403 Forbidden
  - 404 Not Found
  - 500 Internal Server Error

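Client code usually branches on the class of the return code rather than on each individual value. A minimal Python 3 sketch using the standard http.HTTPStatus enumeration follows; the function name classify_status and its category labels are illustrative.

```python
from http import HTTPStatus

def classify_status(code):
    """Return (category, standard phrase) for an HTTP return code."""
    status = HTTPStatus(code)   # raises ValueError for unknown codes
    if status.value >= 500:
        kind = "server error"   # the service failed; retrying may help
    elif status.value >= 400:
        kind = "client error"   # the request itself must be fixed
    elif status.value >= 300:
        kind = "redirect"
    elif status.value >= 200:
        kind = "success"
    else:
        kind = "informational"
    return kind, status.phrase

for code in (400, 401, 403, 404, 500):
    print(code, classify_status(code))
```

With urllib in Python 3, a 4xx or 5xx response raises urllib.error.HTTPError, whose code attribute can be passed to a function like this one.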
RESTful Web-Services
- Takes advantage of existing web-based infrastructure technologies
  - Caching
  - Load balancing
  - Proxying
  - Authentication

Example Web-Services
- UniProt: https://www.ebi.ac.uk/proteins/api
- GlyTouCan: https://api.glytoucan.org
- GlyGen: http://api.glygen.org/test/glycan.html
- CouchDB:
    http://docs.couchdb.org/en/latest/api/index.html
    http://docs.couchdb.org/en/latest/api/database/find.html

CouchDB
RESTful GET requests: /<db>/_all_docs, /<db>/<id>

    # Python 2
    from urllib import urlopen
    from json import loads

    baseurl = 'https://edwardslab.bmcb.georgetown.edu/couchdb/'

    # List every document in the uniprot database
    request = 'uniprot/_all_docs'
    response = loads(urlopen(baseurl+request).read())

    # Fetch each document by its id and print its accession
    for r in response['rows']:
        eid = r['id']
        entry = loads(urlopen(baseurl+'uniprot/'+eid).read())
        print entry['accession']

CouchDB
RESTful GET requests: /<db>/_all_docs, /<db>/<id>

    from couchws import *

    request = 'uniprot/_all_docs'
    response = couch_webservice_request(request)
    for r in response['rows']:
        eid = r['id']
        entry = couch_webservice_request('uniprot/'+eid)
        print entry['accession']

CouchDB
For GET requests, urllib is fine, but otherwise…

    import sys, json
    from couchws import *

    print "Get uniprot entry by accession"
    payload = {"selector": { "accession": "A0A001" }}
    print json.dumps(payload,indent=2)
    request = '/uniprot/_find'
    response = couch_webservice_request(request,payload,method='POST')
    print json.dumps(response,indent=2)

    print "Get uniprot entry by _id"
    id = response["docs"][0]["_id"]
    request = '/uniprot/'+id
    response = couch_webservice_request(request)

CouchDB
- Use urllib2 (or another modern library) to handle PUT and DELETE requests and/or authentication
- Use a module to hide the ugly details!

    def couch_webservice_request(request, payload={}, method='GET',
                                 username=None, password=None)

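The slides do not show the body of couch_webservice_request, so the following is only one way such a wrapper might look, written for Python 3 with urllib.request (the successor to urllib2). Everything other than the signature and the base URL shown earlier is an assumption: the helper build_couch_request, the use of HTTP Basic authentication, and the JSON headers are illustrative choices, not the course's couchws module.

```python
import base64
import json
from urllib.request import Request, urlopen

BASEURL = 'https://edwardslab.bmcb.georgetown.edu/couchdb/'

def build_couch_request(request, payload=None, method='GET',
                        username=None, password=None):
    """Assemble (but do not send) the HTTP request. Hypothetical helper."""
    url = BASEURL + request.lstrip('/')
    headers = {'Accept': 'application/json'}
    data = None
    if payload:
        # JSON-encode the payload and send it as the request body
        data = json.dumps(payload).encode('utf-8')
        headers['Content-Type'] = 'application/json'
    if username is not None and password is not None:
        # Assumption: the server accepts HTTP Basic authentication
        credentials = ('%s:%s' % (username, password)).encode('utf-8')
        token = base64.b64encode(credentials).decode('ascii')
        headers['Authorization'] = 'Basic ' + token
    return Request(url, data=data, headers=headers, method=method)

def couch_webservice_request(request, payload=None, method='GET',
                             username=None, password=None):
    """Send the request and decode the JSON response."""
    req = build_couch_request(request, payload, method, username, password)
    with urlopen(req) as response:
        return json.loads(response.read().decode('utf-8'))

# Inspect a request without touching the network
req = build_couch_request('/uniprot/_find',
                          {"selector": {"accession": "A0A001"}},
                          method='POST')
print(req.get_method(), req.full_url)
```

Separating request construction from sending makes the wrapper easy to inspect offline. The sketch also uses payload=None rather than payload={} to avoid sharing one mutable default dictionary between calls.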
CouchDB
For GET requests, urllib is fine, but otherwise…

    import sys, json
    from couchws import *

    db = sys.argv[1]
    request = db

    print "Create database",db
    response = couch_webservice_request(request,method='PUT',
                                        username='admin', password='admin')
    print json.dumps(response,indent=2)

    print "Get database status"
    response = couch_webservice_request(request)

    print "Delete database",db
    response = couch_webservice_request(request, method='DELETE', …

Load CouchDB from UniProt

    import sys, json
    from couchws import *

    # Create the target database (name given on the command line)
    db = sys.argv[1]
    response = couch_webservice_request(db, method='PUT',
                                        username='admin', password='admin')
    print json.dumps(response,indent=2)

    # Retrieve up to 1000 UniProt entries for the given taxonomy id
    payload = dict(offset=0, size=1000, taxid=int(sys.argv[2]))
    response = uniprot_webservice_request("/proteins",payload)
    print len(response), "documents from uniprot"

    # Insert all entries with a single bulk request
    all_docs = {'docs': response}
    request = "/"+db+"/"+"_bulk_docs"
    response = couch_webservice_request(request, all_docs, method='POST')

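The load script above asks for a single page (offset=0, size=1000), so a taxon with more entries would be only partly loaded. The Python 3 sketch below shows one way to walk through every page by advancing offset. It assumes the service returns a short or empty list once the data runs out, which these slides do not confirm. The names fetch_all and fake_fetch, and the made-up accessions, are purely illustrative; fake_fetch stands in for a real call such as uniprot_webservice_request("/proteins", payload).

```python
def fetch_all(fetch_page, size=1000, **params):
    """Collect all documents by requesting successive offset/size pages."""
    docs = []
    offset = 0
    while True:
        page = fetch_page(dict(params, offset=offset, size=size))
        docs.extend(page)
        if len(page) < size:
            break               # a short (or empty) page means no more data
        offset += size
    return docs

# Stand-in for the web-service: 25 made-up entries served page by page.
fake_entries = [{"accession": "X%04d" % i} for i in range(25)]
requested_offsets = []

def fake_fetch(payload):
    requested_offsets.append(payload["offset"])
    start = payload["offset"]
    return fake_entries[start:start + payload["size"]]

all_entries = fetch_all(fake_fetch, size=10, taxid=9606)
print(len(all_entries), requested_offsets)
```

Passing the fetch function in as an argument keeps the paging logic testable without network access.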
Exercise
- Explore the uniprot, glygen, glytoucan, and couchdb web-services
- Implement some novel interactions with couchdb:
  - Make sure your database names are unique (use your net-id?)
- Implement some novel interactions with uniprot, glygen, glytoucan.
  - Can you successfully retrieve data?