First Indico Workshop HTTP API Adrian Mönnich May 2013 CERN
API Uhh… what’s an API and why do we need one? An application programming interface (API) is a protocol intended to be used as an interface by software components to communicate with each other. Wikipedia
Before the API: export.py. Not powerful at all: only events inside categories; XML, RSS, iCal or HTML only; accessing it requires an XML parser. Protection? Not really: must be restricted to trusted machines by IP. Reusable code? None. /export.py?fid=2l12&date=today&days=1000&of=xml
And the developers said… …let there be a proper API… …and they created one
Design ideas How to make the API awesome? Handle authentication in a clean and secure way (API Keys, OAuth2) Various security levels available (API keys, HMAC signatures, …) Let developers write code for the actual task Serialize Python data structures as JSON, XML, … Provide utilities for common things (relative dates, pagination)
Using the API Before we go into details, look at how it works Indico Administrators: API Modes: Control how strict the API key requirements are. At CERN: We always require an API key, but a signature is only needed to access non-public information. Persistence: Lowers security a little bit (one leaked URL is valid forever), but allows people to use e.g. an RSS reader (which cannot sign requests).
Indico Users: Creating an API key is easy: just one click! Keys can be reset at any time, e.g. in case of a (suspected) leak. The last request, including IP address and full path, is displayed. Persistence can be toggled (if enabled globally).
The same data is available as XML, JSON and sometimes additional formats (iCal, RSS, HTML): /export/categ/4751.[json|xml|html]
What if we need a signed request? /export/categ/123.json?limit=10
1. Add the API key to the params: …&ak=api-key
2. Add the current timestamp to the params: …&timestamp=
3. Sort the query string params alphabetically: ?ak=api-key&limit=10&timestamp=123…
4. Merge path and the sorted query string: /export/categ/123.json?ak=api-key&limit=10&timestamp=
5. Create an HMAC-SHA1 signature of this string using the secret key as the key.
6. Append the hex-encoded signature to the query string: …&signature=xxx
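The signing steps can be sketched in a few lines of client-side Python 3. This is a minimal sketch, not Indico's own client code: the parameter names ak, timestamp and signature come from the slide, while the function name sign_request, the optional timestamp argument (there only to make the result reproducible) and the use of urlencode for escaping are assumptions made for illustration.

```python
import hashlib
import hmac
import time
from urllib.parse import urlencode


def sign_request(path, params, api_key, secret_key, timestamp=None):
    """Return path + query string with an HMAC-SHA1 signature appended.

    Hypothetical helper: the name and signature are illustrative only.
    """
    items = dict(params)
    # Steps 1 and 2: add the API key and the current timestamp.
    items["ak"] = api_key
    items["timestamp"] = str(int(time.time()) if timestamp is None else timestamp)
    # Step 3: sort the query string params alphabetically.
    query = urlencode(sorted(items.items()))
    # Step 4: merge path and sorted query string.
    unsigned = "%s?%s" % (path, query)
    # Step 5: HMAC-SHA1 of that string, keyed with the secret key.
    signature = hmac.new(
        secret_key.encode("utf-8"), unsigned.encode("utf-8"), hashlib.sha1
    ).hexdigest()
    # Step 6: append the hex-encoded signature.
    return "%s&signature=%s" % (unsigned, signature)


if __name__ == "__main__":
    print(sign_request("/export/categ/123.json", {"limit": 10}, "api-key", "secret"))
```

Because the timestamp is part of the signed string, a signed URL presumably stops being accepted after some time window unless persistence is enabled; the exact window is not stated in the slides.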
EXTENDING THE API Creating a custom API endpoint Let’s keep it simple: We want an API that returns a range of numbers or characters. This does not access any Indico-specific data but demonstrates some of the utilities available when using the API as a developer. /export/num/1-10.json /export/char/a-z.json
Relevant classes
HTTPAPIHook: Hooks into the /export/ URL subtree, handles arguments
IteratedDataFetcher: Provides pagination, sorting, relative dates, etc.
IFossil: Describes how an object is converted to primitives (dict, list, int, str, bool) that can be serialized (JSON, XML, …) later
The hook, Pt. 1:

    class RangeHook(HTTPAPIHook):
        TYPES = ('num', 'char')
        # The named groups end up in self._pathParams
        RE = r'(?P<start>[0-9]+|[a-z])-(?P<end>[0-9]+|[a-z])'
        DEFAULT_DETAIL = 'simple'
        MAX_RECORDS = {
            'simple': 10,
            'palindrome': 5
        }
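To see what the path regex accepts, it can be tried on its own in Python 3. This sketch assumes the two groups are named start and end (the names later read from self._pathParams) and uses fullmatch on the range segment only; how HTTPAPIHook actually anchors the pattern inside the full URL is not shown in the slides. The helper parse_range is hypothetical.

```python
import re

# Same pattern as RangeHook.RE, assuming groups named "start" and "end".
RANGE_RE = re.compile(r'(?P<start>[0-9]+|[a-z])-(?P<end>[0-9]+|[a-z])')


def parse_range(segment):
    """Return the named groups as a dict, or None if the segment does not match."""
    match = RANGE_RE.fullmatch(segment)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for segment in ("1-10", "a-z", "1-z", "ab-z"):
        print(segment, parse_range(segment))
```

Note that a mixed range such as 1-z still matches, because each group accepts either digits or a single letter. The regex alone therefore cannot tell a valid num request from a valid char request.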
The hook, Pt. 2: Arguments. Perform additional validation if necessary. Use self._pathParams for arguments from the path regex. Use self._queryParams to access arguments in the query string.

    def _getParams(self):
        super(RangeHook, self)._getParams()
        self._start = self._pathParams['start']
        self._end = self._pathParams['end']
The hook, Pt. 3: The export methods. Perform actions/validations specific to the export type. Must return an iterator, usually provided by an IteratedDataFetcher.

    def export_num(self, aw):
        try:
            start = int(self._start)
            end = int(self._end)
        except ValueError:
            raise HTTPAPIError('Invalid value', 400)
        return RangeFetcher(aw, self).numbers(start, end)

    def export_char(self, aw):
        if len(self._start) != 1 or len(self._end) != 1:
            raise HTTPAPIError('Invalid character', 400)
        return RangeFetcher(aw, self).chars(self._start, self._end)
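The type-specific checks matter because the path regex accepts both digits and letters for either export type, so /export/num/a-z reaches export_num and must be rejected there. The validation can be exercised outside Indico with a stand-in exception. In this Python 3 sketch, HTTPAPIError is a hypothetical replacement for the real class (only a message and a status code are assumed), and the two parse_* functions are illustrative names.

```python
class HTTPAPIError(Exception):
    """Hypothetical stand-in for Indico's HTTPAPIError: a message plus an HTTP status."""

    def __init__(self, message, code):
        super().__init__(message)
        self.code = code


def parse_num_range(start, end):
    """Mirror of the check in export_num: both ends must be integers."""
    try:
        return int(start), int(end)
    except ValueError:
        raise HTTPAPIError('Invalid value', 400)


def parse_char_range(start, end):
    """Mirror of the check in export_char: both ends must be single characters."""
    if len(start) != 1 or len(end) != 1:
        raise HTTPAPIError('Invalid character', 400)
    return start, end


if __name__ == "__main__":
    print(parse_num_range('1', '10'))
    try:
        parse_num_range('a', 'z')
    except HTTPAPIError as exc:
        print(exc.code, exc)
```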
The fetcher, Pt. 1: Metadata. Remember DEFAULT_DETAIL and MAX_RECORDS? Here we specify how objects are fossilized in those detail levels. More about fossilization later, let’s get some data first!

    class RangeFetcher(IteratedDataFetcher):
        DETAIL_INTERFACES = {
            'simple': IDummyFossil,
            'palindrome': IDummyPalindromeFossil
        }
The fetcher, Pt. 2: Python iterator magic. Remember: We call these methods from our export methods. In this case we just iterate over numbers/characters in the given range. DummyObject? You will see soon…

    def numbers(self, start, end):
        iterable = xrange(int(start), int(end) + 1)
        iterable = itertools.imap(str, iterable)
        iterable = itertools.imap(DummyObject, iterable)
        return self._process(iterable)

    def chars(self, start, end):
        iterable = itertools.imap(chr, xrange(ord(start), ord(end) + 1))
        # Apart from that first line, everything is the same as in numbers()
        iterable = itertools.imap(DummyObject, iterable)
        return self._process(iterable)
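The same lazy pipeline can be tried in Python 3, where range and map are already lazy and replace xrange and itertools.imap. This is a sketch under assumptions: the slides do not show what _process does internally, so paginate below is a hypothetical stand-in that only applies an offset and a limit, and the DummyObject wrapper is repeated so the block runs on its own.

```python
from itertools import islice


class DummyObject:
    """Thin wrapper exposing a getter, as the fossilization system expects."""

    def __init__(self, value):
        self.value = value

    def getValue(self):
        return self.value


def numbers(start, end):
    # Nothing is materialized here; each step wraps the previous iterator.
    iterable = range(int(start), int(end) + 1)
    iterable = map(str, iterable)
    return map(DummyObject, iterable)


def chars(start, end):
    iterable = map(chr, range(ord(start), ord(end) + 1))
    return map(DummyObject, iterable)


def paginate(iterable, offset=0, limit=None):
    """Hypothetical stand-in for the pagination part of _process."""
    stop = None if limit is None else offset + limit
    return islice(iterable, offset, stop)


if __name__ == "__main__":
    page = paginate(numbers(1, 1000000), offset=0, limit=5)
    print([obj.getValue() for obj in page])
```

Because every stage is an iterator, asking for a page of five records out of a range of a million only creates five DummyObject instances.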
The Dummy Object. Usually we deal with custom objects in Indico. For historical reasons it is common to have getter methods (yes, we know it is ugly and unpythonic). The fossilization system is built around this convention, so we use a thin wrapper that provides a getter for our primitive string values.

    class DummyObject(object):
        def __init__(self, value):
            self.value = value

        def getValue(self):
            return self.value
Fossils. IDummyFossil is really simple: It fossilizes the getValue() return value of the actual object to a field named value. For the palindrome detail level we specify a custom callable to generate the value on the fly. The field name is determined automatically again.

    class IDummyFossil(IFossil):
        def getValue(self):
            pass

    class IDummyPalindromeFossil(IDummyFossil):
        def getPalindrome(self):
            pass
        getPalindrome.produce = lambda x: x.value + x.value[::-1]
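To make the idea concrete, here is a toy fossilizer in Python 3. It is not Indico's fossilization code: the fossilize function is a hypothetical, heavily simplified illustration of the two behaviours the slide describes (the field name is derived from the getter name, and a produce callable overrides the getter), and the interface classes are plain classes rather than IFossil subclasses so that the block runs standalone.

```python
class DummyObject:
    def __init__(self, value):
        self.value = value

    def getValue(self):
        return self.value


class IDummyFossil:
    def getValue(self):
        pass


class IDummyPalindromeFossil(IDummyFossil):
    def getPalindrome(self):
        pass
    getPalindrome.produce = lambda x: x.value + x.value[::-1]


def fossilize(obj, interface):
    """Toy fossilizer (hypothetical): turn obj into a dict as described by interface."""
    result = {}
    for name in dir(interface):
        if not name.startswith('get') or len(name) <= 3:
            continue
        declared = getattr(interface, name)
        # getValue -> "value", getPalindrome -> "palindrome"
        field = name[3].lower() + name[4:]
        produce = getattr(declared, 'produce', None)
        if produce is not None:
            # A custom callable generates the value on the fly.
            result[field] = produce(obj)
        else:
            # Otherwise call the getter of the same name on the real object.
            result[field] = getattr(obj, name)()
    return result


if __name__ == "__main__":
    print(fossilize(DummyObject('12'), IDummyFossil))
    print(fossilize(DummyObject('12'), IDummyPalindromeFossil))
```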
Does it work? /export/num/ json?detail=palindrome /export/num/ json /export/num/1-1.xml
Yes! Did you notice the pagination?
What else The API is even more powerful! Caching (unless disabled by the user per-request) POST (to modify things, e.g. booking a room) Additional URL prefixes besides /export/ (e.g. /api/ ) Many things (like the export_* method name) can be changed easily when subclassing HTTPAPIHook Reading the existing code is the best way to get used to it!
Another example:

    class AnnouncementHook(HTTPAPIHook):
        PREFIX = 'api'
        TYPES = ('announcement',)
        RE = r'set'
        GUEST_ALLOWED = False
        VALID_FORMATS = ('json', 'xml')
        COMMIT = True
        HTTP_POST = True

        def _getParams(self):
            super(AnnouncementHook, self)._getParams()
            self._message = get_query_parameter(self._queryParams, ['message'], '')

        def api_announcement(self, aw):
            am = getAnnoucementMgrInstance()
            am.setText(self._message)
            return {'message': self._message}
Adrian Mönnich Questions? View the example code on GitHub: [Diff] [Gist]