How the VIAF Magic Happens
So Much From So Little
Ralph LeVan, Sr. Research Scientist
7/14/2016, Code4Lib Midwest
VIAF Architecture Diagram
[Diagram: Apache Tomcat, Tomcat Filters, SRWServlet, SRWDatabase, Database]
Apache Tomcat
- Handles all requests not destined for the servlet
- Passes requests through appropriate filters
- Passes requests to appropriate servlets
- Runs out of memory
Tomcat Filters
- Look at incoming requests
- Modify the requests or perform some other action
- Modify the servlet response
Tomcat Filters Used
- ExpiresFilter
  - Sets expiration dates for outgoing responses
- ipUseThrottleFilter
  - Rejects (error 429) simultaneous requests above N from a site, depending on current load
- URLRewriteFilter
  - Modifies an incoming request
  - http://viaf.org/viaf/244788149 becomes http://viaf.org/viaf/search/VIAFURI?query=local.viafID+exact+244788149&maximumRecords=1&recordSchema=VIAF&service=APP
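The throttling idea can be sketched as a per-client counter of in-flight requests. This is only an illustration, not the ipUseThrottleFilter source: the class name `InFlightThrottle`, its `limit` parameter and its method names are hypothetical, and the load-dependent adjustment of N mentioned on the slide is left out.

```python
import threading


class InFlightThrottle:
    """Caps simultaneous requests per client address (hypothetical sketch)."""

    def __init__(self, limit):
        self.limit = limit            # the slide's "N"; fixed here, load-dependent in VIAF
        self._in_flight = {}          # client address -> requests currently running
        self._lock = threading.Lock()

    def try_enter(self, client):
        """Return 200 if the request may proceed, 429 if the client is at its limit."""
        with self._lock:
            current = self._in_flight.get(client, 0)
            if current >= self.limit:
                return 429
            self._in_flight[client] = current + 1
            return 200

    def leave(self, client):
        """Call when an admitted request finishes."""
        with self._lock:
            current = self._in_flight.get(client, 0)
            if current <= 1:
                self._in_flight.pop(client, None)   # forget idle clients
            else:
                self._in_flight[client] = current - 1
```

A filter built this way would call `try_enter` before passing the request down the chain and `leave` in a finally block afterwards.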
SRWServlet
- Assembles the SRU Request
- Handles Content Negotiation
  - Knows how to transform a database response based on the negotiated mime type
- Handles Language Negotiation
  - Looks to see if files (mostly stylesheets) have language specific versions
- Generates redirects
  - http://viaf.org/viaf/sourceID/LC%7Cno2012049330 redirects to http://viaf.org/viaf/244788149/
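Content negotiation of the kind described here can be sketched as picking the best supported mime type from an Accept header. The function names, the list of supported types and the fallback behaviour are assumptions for illustration; the SRWServlet logic itself is not shown on the slides, and this sketch ignores the specificity ordering a full implementation would apply.

```python
def parse_accept(header):
    """Return (mime, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        mime = fields[0].lower()
        if not mime:
            continue
        q = 1.0                                  # default weight when no q parameter is given
        for f in fields[1:]:
            name, _, value = f.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0                      # treat an unparsable weight as "not acceptable"
        prefs.append((mime, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header, supported, default):
    """Pick a supported mime type; fall back to `default` (an assumption, not a 406)."""
    for mime, q in parse_accept(header):
        if q <= 0:
            continue
        if mime == "*/*":
            return default
        if mime in supported:
            return mime
        if mime.endswith("/*"):
            prefix = mime[:-1]                   # "text/*" -> "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
    return default
```

For example, `negotiate("application/json;q=0.9, text/html;q=0.5", ["text/html", "application/json"], "text/html")` picks `application/json`.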
HTTP Methods
- Handles APP Requests (GET, PUT, POST and DELETE)
- Handles CORS Requests (OPTIONS)
- Handles HEAD Requests
  - Just does a GET and then throws away the content (Ugh)
SRWDatabase
- Runs the request
  - Sends the search or browse to the database
- Transforms records to the desired schema
  - Transformers can be XSLT or bespoke code, e.g. JenaTransformer
Database
- Performs the search or browse (if supported)
- Returns records with an indication of schema (data semantics, e.g. native VIAF or MARC21) and record packing (formatting, e.g. XML or JSON)
Supported Databases
- Pears
- Lucene
- DSpace
- Elasticsearch
- OpenSearch
- ParallelSearching (Federated Searching)
- Filesystem
What kind of magic can we perform with these ingredients?
Real World Objects
- http://viaf.org/viaf/244788149 redirects with a 303 status code to http://viaf.org/viaf/244788149/
- This tells the Linked Data world that the URL is the identifier for a real thing and not just a pointer to a page.
- It also tells Linked Data clients that they can try for a linked data version of the record (we support RDF and JSON-LD)
- URLRewriteFilter:
  <from>^/([0-9]+)$</from>
  <to type="seeother-redirect">$1/</to>
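The effect of this rule can be imitated in a few lines. The regular expression is the one from the slide; the function name is hypothetical, and two things are assumed rather than stated: that rule paths are relative to the /viaf web application, and that the relative target "$1/" resolves against that root.

```python
import re

# Pattern taken from the slide's <from> element.
BARE_ID = re.compile(r"^/([0-9]+)$")


def real_world_object_redirect(path):
    """Return (status, headers) for a bare VIAF ID path, or None if the rule does not apply."""
    match = BARE_ID.match(path)
    if match is None:
        return None
    # 303 See Other: the bare URL names the thing, the slashed URL describes it.
    return 303, {"Location": "/" + match.group(1) + "/"}
```

A path that already ends in a slash does not match, so the redirect cannot loop.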
APP GET Requests
- http://viaf.org/viaf/244788149/
- URLRewriteFilter:
  <from>^/([0-9][0-9]+)/$</from>
  <to>/search/VIAFURI?query=local.viafID+exact+$1&maximumRecords=1&recordSchema=VIAF&service=APP</to>
- APP logic extracts record from SRU response
- Content Negotiation renders the record appropriately
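The "extracts record from SRU response" step might look roughly like the sketch below. The namespace is the standard SRU 1.1/1.2 one; the sample response is invented for illustration, its `<placeholder>` payload stands in for a real VIAF record, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

SRU_NS = {"srw": "http://www.loc.gov/zing/srw/"}

# Invented single-record response; the payload element is a placeholder, not VIAF's format.
SAMPLE = """<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordSchema>VIAF</srw:recordSchema>
      <srw:recordPacking>xml</srw:recordPacking>
      <srw:recordData><placeholder id="244788149"/></srw:recordData>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>"""


def extract_single_record(sru_xml):
    """Return the record element inside the only recordData, or None if there is not exactly one."""
    root = ET.fromstring(sru_xml)
    data = root.findall("srw:records/srw:record/srw:recordData", SRU_NS)
    if len(data) != 1:
        return None
    children = list(data[0])
    return children[0] if children else None
```

The unwrapped element is what content negotiation would then render in the requested format.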
APP GET With Explicit Mime Type
- http://viaf.org/viaf/244788149/viaf.html
- URLRewriteFilter:
  <from>^/([0-9][0-9]+)/viaf.html$</from>
  <to>/search/VIAFURI?query=local.viafID+exact+$1&maximumRecords=1&recordSchema=VIAF&service=APP&httpAccept=text/html</to>
- APP logic extracts record from SRU response
- Content Negotiation renders the record appropriately
- We also support: viaf.xml, justlinks.json, rdf.xml, viaf.json, viaf.jsonld, rss.xml, marc21.html, marc21.xml, unimarc.html, unimarc.xml
sourceID Redirects
- http://viaf.org/viaf/sourceID/LC|no2012049330 redirects to http://viaf.org/viaf/244788149/
- URLRewriteFilter:
  <from>^/sourceID/([^/]+)(/?|/.+)$</from>
  <to>/search?query=local.source+exact+%22$1%22&httpAccept=application/redirect%2bxml&service=APP</to>
- APP logic extracts record from SRU response
- SRWServlet looks for redirect response and generates appropriate response headers
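To see what this rule produces, the rewrite can be reproduced with the same regular expression and target. The function name is hypothetical, and the sketch assumes the filter sees the path already URL-decoded (so the pipe appears literally). As on the slide, the second capture group is matched but not used in the target.

```python
import re

# Pattern and target taken from the slide's <from> and <to> elements.
SOURCE_ID = re.compile(r"^/sourceID/([^/]+)(/?|/.+)$")
TARGET = ("/search?query=local.source+exact+%22{source_id}%22"
          "&httpAccept=application/redirect%2bxml&service=APP")


def rewrite_source_id(path):
    """Return the internal SRU search URL for a sourceID path, or None if the rule does not apply."""
    match = SOURCE_ID.match(path)
    if match is None:
        return None
    # group(2) holds an optional trailing "/" or "/something"; the rule discards it
    return TARGET.format(source_id=match.group(1))
```

The `httpAccept=application/redirect%2bxml` parameter is what asks SRWServlet for a redirect rather than a rendered record.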
SRU Responses Rendered as HTML
- SRU Response returned by SRWDatabase
- Stylesheets defined for each database for either client-side or server-side rendering
  explainStyleSheet=/viaf/xsl/explainResponse.xsl
  scanStyleSheet=/viaf/xsl/scanResponse.xsl
  searchStyleSheet=/viaf/xsl/results.xsl
  multipleRecordsStyleSheet=/viaf/xsl/results.xsl
- Content Negotiation for HTML results in the appropriate stylesheet being applied
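One way to picture the stylesheet lookup: read the per-database properties and choose an entry by response type when HTML was negotiated. The property lines are from the slide; the function names and the "kind + StyleSheet" naming rule are assumptions made for this sketch, not the SRWServlet implementation.

```python
# Property lines copied from the slide.
CONFIG = """\
explainStyleSheet=/viaf/xsl/explainResponse.xsl
scanStyleSheet=/viaf/xsl/scanResponse.xsl
searchStyleSheet=/viaf/xsl/results.xsl
multipleRecordsStyleSheet=/viaf/xsl/results.xsl
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def stylesheet_for(props, response_kind, negotiated_type):
    """Return the stylesheet path for an HTML response, or None when no stylesheet applies."""
    if negotiated_type != "text/html":
        return None                       # other mime types are not rendered through XSLT here
    return props.get(response_kind + "StyleSheet")
```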
APP PUT, POST and DELETE
- SRWServlet passes through to an add(), update() or delete() method of SRWDatabase
- SRWDatabase has a configuration parameter that allows it to extract the record key from the URL and pass that to the add(), update() or delete() method of the underlying database
- We use the POST and DELETE methods for immediate takedowns or data redactions
- We have another database (xA) that contains records that manually bring together VIAF IDs
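The pass-through can be sketched as: pull the record key out of the URL, then call the matching database method. Everything named here is hypothetical (`InMemoryDatabase`, `KEY_PATTERN`, `dispatch`, the status codes). The mapping of POST to add() and PUT to update() follows the usual Atom Publishing Protocol convention and is an assumption; the slides do not spell it out.

```python
import re


class InMemoryDatabase:
    """Hypothetical stand-in for an underlying database with add/update/delete."""

    def __init__(self):
        self.records = {}

    def add(self, key, record):
        self.records[key] = record
        return True

    def update(self, key, record):
        if key not in self.records:
            return False
        self.records[key] = record
        return True

    def delete(self, key):
        return self.records.pop(key, None) is not None


# Stand-in for the configuration parameter that says where the key sits in the URL.
KEY_PATTERN = re.compile(r"^/([0-9]+)/?$")


def dispatch(db, method, path, body=None):
    """Route a write request to the database and return an illustrative HTTP status."""
    match = KEY_PATTERN.match(path)
    if match is None:
        return 404
    key = match.group(1)
    if method == "POST":                       # assumed: POST creates
        db.add(key, body)
        return 201
    if method == "PUT":                        # assumed: PUT replaces an existing record
        return 200 if db.update(key, body) else 404
    if method == "DELETE":
        return 204 if db.delete(key) else 404
    return 405
```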
All This Code is Open Source
https://github.com/OCLC-Research
Ralph LeVan levan@oclc.org