Core Integration Web Services
Dean Krafft, Cornell University
2 CI Infrastructure: Past
- Massive, monolithic infrastructure
- User access through NSDL.org
- MR input/output: OAI-PMH – heavy
- Search: SDLIP – Java package – heavy
- Archive: only through NSDL.org search
- NSDL.org portal framework: uPortal – large, complex Java system
3 CI Infrastructure: Future
- Open, service-friendly infrastructure
- User access: multiple portals, browser extensions, standard web search
- MR I/O: SOAP/WSDL, REST, RSS
- Search: SOAP/WSDL, REST
- NSDL.org: PHP reimplementation – flexible, indexable, reusable
4 CI Philosophy
- Open, lightweight mechanisms for access and contribution
- Play well with the Internet – don’t be a silo
- Synergize with existing web tools and infrastructure, don’t compete
- Enable many forms of access and contribution – including ones we haven’t thought of yet
5 Accessing the MR
- Given the OAI ID of a record, REST access is available now:
  ecord&identifier='xxx'&metadataPrefix=oai_dc
- What other queries should we support?
  - Search engine style – but MR is structured, not full text
  - SQL query – exposes database structure
  - XQuery – dependent on full XML schema, expensive to implement
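As a rough illustration of the REST access above, the sketch below builds a request URL from an OAI ID. It assumes the truncated URL on the slide is a standard OAI-PMH GetRecord request; the base URL `http://example.org/oai` and the function name `get_record_url` are placeholders, not the actual NSDL endpoint or API.

```python
from urllib.parse import urlencode


def get_record_url(base_url, oai_id, metadata_prefix="oai_dc"):
    """Build a GetRecord-style request URL for one record.

    Assumption: the endpoint accepts the standard OAI-PMH query
    parameters (verb, identifier, metadataPrefix).
    """
    query = urlencode(
        {
            "verb": "GetRecord",
            "identifier": oai_id,
            "metadataPrefix": metadata_prefix,
        }
    )
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder base URL; the real endpoint is not given in the slides.
    print(get_record_url("http://example.org/oai", "oai:nsdl.org:pri:00010"))
```

The colons in the OAI ID are percent-encoded by `urlencode`; a conforming server should decode them back to the original identifier.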
6 Strawman MR Access Proposal
- SOAP/WSDL and REST access
- FetchElementsLike(“dc:title”, “frog”) – returns IDs where title contains “frog”
- FetchElementsStarting(“dc:author”, “bill”) – returns sorted list of IDs
- FetchElements(“oai:nsdl.org:pri:00010”, “dc:title”, “dc:author”) – returns list of elements where OAI ID matches
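To make the strawman operations concrete, here is a minimal in-memory sketch of their intended semantics. It is not the proposed SOAP/REST implementation: the sample records are made up (only the first OAI ID appears in the slides), matching is assumed to be case-insensitive, and the "sorted list" is assumed to be ordered by the matched element value.

```python
# Made-up sample records; only the first OAI ID comes from the slides.
RECORDS = {
    "oai:nsdl.org:pri:00010": {"dc:title": "Frog Life Cycles", "dc:author": "Bill Adams"},
    "oai:example:0002": {"dc:title": "Pond Ecology", "dc:author": "Billie Zhang"},
    "oai:example:0003": {"dc:title": "Tree Frogs of Ohio", "dc:author": "Ann Lee"},
}


def fetch_elements_like(records, element, text):
    """Return IDs (sorted by ID) whose element contains text, ignoring case."""
    needle = text.lower()
    return sorted(
        oai_id
        for oai_id, fields in records.items()
        if needle in fields.get(element, "").lower()
    )


def fetch_elements_starting(records, element, prefix):
    """Return IDs whose element starts with prefix, ordered by element value."""
    start = prefix.lower()
    matches = [
        (fields[element].lower(), oai_id)
        for oai_id, fields in records.items()
        if fields.get(element, "").lower().startswith(start)
    ]
    return [oai_id for _, oai_id in sorted(matches)]


def fetch_elements(records, oai_id, *elements):
    """Return (element, value) pairs for the record with the given OAI ID."""
    fields = records.get(oai_id, {})
    return [(name, fields[name]) for name in elements if name in fields]


if __name__ == "__main__":
    print(fetch_elements_like(RECORDS, "dc:title", "frog"))
    print(fetch_elements_starting(RECORDS, "dc:author", "bill"))
    print(fetch_elements(RECORDS, "oai:nsdl.org:pri:00010", "dc:title", "dc:author"))
```

Each function takes an element name rather than a query string, which keeps the interface structured without exposing the underlying database schema.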
7 MR Access for Relationships
- Committed to adding Annotations, Relationships (e.g. Equivalence), Organizational Structures
- Can expose as links – slow, expensive traversal
- What are the alternatives?
  - Dump it as a (large) XML file?
  - Support extended relationship queries?
8 MR Ingest
- Need a lightweight alternative to OAI-PMH
- RSS (Rich Site Summary, Really Simple Syndication, RDF Site Summary): v0.9x, v1.0, v2.0
- RSS supports Dublin Core (and some variants support arbitrary metadata)
- Idea: create RSS/OAI gateway (in development)
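One way to picture the first step of an RSS/OAI gateway is a function that reads RSS 2.0 items and maps them to Dublin Core fields. The sketch below is an illustration only, not the gateway under development: the mapping table `RSS_TO_DC` and the function name are assumptions, and it handles only un-namespaced RSS 2.0 `item` elements.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from RSS 2.0 item children to Dublin Core element names.
RSS_TO_DC = {
    "title": "dc:title",
    "link": "dc:identifier",
    "description": "dc:description",
}


def rss_items_to_dc(rss_text):
    """Parse an RSS 2.0 document and return one Dublin Core dict per item."""
    root = ET.fromstring(rss_text)
    records = []
    for item in root.iter("item"):
        record = {}
        for rss_tag, dc_name in RSS_TO_DC.items():
            value = item.findtext(rss_tag)  # direct children of <item> only
            if value is not None:
                record[dc_name] = value.strip()
        records.append(record)
    return records


if __name__ == "__main__":
    sample = (
        '<rss version="2.0"><channel><title>Feed</title>'
        "<item><title>Frog Lab</title><link>http://example.org/frog</link></item>"
        "</channel></rss>"
    )
    print(rss_items_to_dc(sample))
```

A full gateway would also need to assign OAI identifiers and datestamps to the harvested items, which this sketch leaves out.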
9 Search Access
- Currently, WebDAV access available (underlying SDLIP protocol)
- Need to use SDS query language
- Full text search design collapses multiple fields (author, identifier)
- SOAP interface will be forthcoming
10 Archive Access
- Primary current access through NSDL.org search results
- Available as HTML page through:
  verb=GetArchive&identifier=oai:nsdl.org:pri:00109
- SOAP interface almost complete
11 Playing well with the Web
- Expose the MR as a crawlable, indexable tree – enable Google search
- Expose MR relationships as web link structure
- Support lightweight contribution, annotation, and NSDL membership check for resources
- Enable new user services
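The first two points could be sketched as follows: each record becomes a static HTML page at a path derived from its OAI ID, and relationships become ordinary links that a crawler can follow. The path scheme and both function names are hypothetical and are not part of the NSDL.org design.

```python
from html import escape


def record_path(oai_id):
    """Map an OAI ID to a page path (hypothetical scheme).

    "oai:nsdl.org:pri:00010" -> "oai/nsdl.org/pri/00010.html"
    """
    return "/".join(oai_id.split(":")) + ".html"


def render_record_page(title, related_ids):
    """Render a minimal HTML page whose links expose a record's relationships.

    Assumes IDs contain only URL-safe characters, so hrefs are not quoted.
    """
    links = "\n".join(
        f'<li><a href="/{record_path(r)}">{escape(r)}</a></li>'
        for r in related_ids
    )
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body><h1>{escape(title)}</h1>\n<ul>\n{links}\n</ul></body></html>"
    )


if __name__ == "__main__":
    print(record_path("oai:nsdl.org:pri:00010"))
    print(render_record_page("Frogs & Toads", ["oai:nsdl.org:pri:00109"]))
```

Because the pages are plain HTML with stable paths, a general web search engine can index them without any knowledge of OAI-PMH.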
12 What do you need?
- How should we expose MR data and Core Integration services?
- How should we support authenticated contributions (annotations, etc.)?
- Is SOAP/WSDL the best? REST for query/data access?
- How can we enable exciting new services?