IVOA Interop, Beijing, China, May
IVOA Data Access Layer Working Group Sessions
Doug Tody (NRAO/NVO)
Markus Dolensky (ESO/EuroVO)
Data Access Layer Working Group
INTERNATIONAL VIRTUAL OBSERVATORY ALLIANCE
Table Access Protocol
Interface Analysis/Options (from the DAL perspective)
Protocol Consistency
Motivation
  – TAP and SSAP, SLAP, SIA-V2, etc. (the second-generation protocols) should be consistent where possible
  – Promotes code sharing at all levels
  – Simplifies service frameworks as well as client apps
  – Sharing reduces overall system complexity
  – Future ADQL integration into other DAL services will be easier
Guideline
  – Take the basic DAL service profile and semantics as a starting point; deviate for TAP only where there is a need to do so.
DAL Service Profile
Standard Profile
  – Intended as starting point for any 2ndGen DAL service
  – Joint GWS/DAL/Registry standards effort
Standard Operations
  – queryData          find available datasets (may be virtual)
  – (getData)          get a single dataset (synchronous)
  – stageData          start async job to generate/stage datasets
  – getCapabilities    get service metadata (capabilities)
  – getAvailability    get service availability, status
Table Access Protocol
Assumptions
  – Scope is a single service at a single site
  – Distributed queries are handled at a higher level
Requirements
  – Simple usage for a simple query is a requirement
  – Large (asynchronous) queries must, however, be supported
  – A service may manage a "table set" (multiple tables)
  – ADQL-based queries may query multiple tables in one operation
Basic Query
Purpose
  – Execute a single query and return immediate results
Goals
  – Basic usage very simple: essentially a generalized cone search
  – Synchronous GET (POST also possible)
  – Returns a VOTable directly (or other table format)
  – Both ADQL and data model-based queries provided
Basic Query
ADQL query
  – queryData(query="SELECT * FROM a WHERE snr>2.5", format="csv")
Example GET translation
  – &FORMAT=csv
  – Details such as operation name may vary
Equivalent DM-based query
  – queryData(table="a", snr="2.5/", format="csv")
Classic Cone Search
  – queryData(table="a", POS="180.0,1.0", SIZE="0.1")
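The GET translation of the three query forms can be sketched in a few lines of Python. This is a minimal illustration, not the protocol's defined encoding: the base URL `http://example.org/tap` and the `REQUEST=queryData` parameter are assumptions (the slide notes that details such as the operation name may vary), and the example URL on the slide is truncated, so nothing here should be read as a reconstruction of it.

```python
from urllib.parse import urlencode

# Hypothetical service endpoint; the slides do not give a real URL.
BASE_URL = "http://example.org/tap"


def query_data_url(base_url=BASE_URL, **params):
    """Translate a queryData call into a GET URL.

    Parameter names are upper-cased, as in the FORMAT=csv fragment on the
    slide. The REQUEST=queryData convention is an assumption.
    """
    pairs = [("REQUEST", "queryData")]
    pairs += [(name.upper(), str(value)) for name, value in params.items()]
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return base_url + "?" + urlencode(pairs)


# The three query forms from the slide.
adql_url = query_data_url(query="SELECT * FROM a WHERE snr>2.5", format="csv")
dm_url = query_data_url(table="a", snr="2.5/", format="csv")
cone_url = query_data_url(table="a", pos="180.0,1.0", size="0.1")

for url in (adql_url, dm_url, cone_url):
    print(url)
```

All three forms go through the same function, which is the point of the slide: the ADQL query, the data model-based query, and the classic cone search differ only in which parameters are supplied.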
Async Query
Purpose
  – Execute a single query asynchronously and stage the results
Approach
  – Variation on stageData as planned for SIAP etc.
  – Differs only in the content of the stageData request
  – VOSpace used to stage output (or input)
Async Query
Canonical StageData
  – Normally (for SIA etc.) preceded by a queryData
      – queryData response defines virtual datasets (as at present)
      – these are computational tasks which can be performed
  – Provides a declarative approach to job specification
  – queryData response can include cost estimation
stageData request
  – Is a POST; returns a UWS JobID
  – Lists one or more tasks to be performed
  – Includes data staging (e.g., VOSpace) information
  – UWS mechanisms (polling, messaging) used to monitor job progress
Async Query
TAP StageData Variant
  – Skips (normally) the cost estimation step
      – not needed except for very large queries
      – can be hard to estimate query costs with SQL
  – Request is parameter-based rather than declarative
stageData request
  – Request includes the ADQL query directly as the task specifier
  – Data staging functionality is common
  – Job control and monitoring facilities are common
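The TAP stageData flow described above (submit the ADQL query, receive a job ID, poll until done) can be sketched against an in-memory stand-in for the service. Everything named here is hypothetical: `StubTapService`, its method names, the `vos://` destination, and the job ID format are illustrative only, and the phase names `EXECUTING` and `COMPLETED` follow UWS usage as an assumption about how job status would be reported.

```python
import itertools


class StubTapService:
    """In-memory stand-in for an asynchronous TAP service (hypothetical)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._jobs = {}

    def stage_data(self, query, destination, fmt="votable"):
        """Accept a job (the POST on the slide) and return its job ID."""
        job_id = f"job-{next(self._ids)}"
        self._jobs[job_id] = {
            "query": query,              # the ADQL query is the task specifier
            "destination": destination,  # where results are to be staged
            "format": fmt,
            "polls": 0,
        }
        return job_id

    def get_phase(self, job_id):
        """Report job status; this stub completes on the third poll."""
        job = self._jobs[job_id]
        job["polls"] += 1
        return "COMPLETED" if job["polls"] >= 3 else "EXECUTING"


def wait_for_job(service, job_id, max_polls=10):
    """Poll until the job completes; a real client would sleep between polls."""
    for _ in range(max_polls):
        phase = service.get_phase(job_id)
        if phase == "COMPLETED":
            return phase
    raise TimeoutError(f"{job_id} did not complete after {max_polls} polls")


service = StubTapService()
job_id = service.stage_data(
    "SELECT * FROM a WHERE snr>2.5",
    destination="vos://example/results/a.csv",  # hypothetical VOSpace location
    fmt="csv",
)
final_phase = wait_for_job(service, job_id)
print(job_id, final_phase)
```

Note that no queryData call precedes the request: the cost estimation step of the canonical flow is skipped, and the query itself is passed as a parameter.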
Metadata Queries
Purpose
  – Mechanism used by client to query “database” metadata
  – At issue here is "dataset content" metadata (tables, columns, etc.)
Possible Approaches
  – Uniform interface to query both table data and metadata
  – Dedicated operation for each query
Registry Integration
  – In either case table/column metadata can be generated to provide for registry-based caching and discovery
Metadata Queries
Uniform Query Interface
  – Represents database metadata as tables which can be queried
  – Same interface used to query both table data and metadata
  – This is the same approach used for the SQL INFORMATION_SCHEMA
Examples
  – queryData(table="SCHEMA.tables", format="xml")
  – queryData(table="SCHEMA.columns", tableName="a")
Advantages
  – Re-use of existing table query mechanism
      – all related software at all levels can be reused
      – features such as FORMAT options come for free
  – Adding more metadata does not require an interface change
      – tables, columns, views, functions, indexes, etc.
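A toy Python model may help show why the uniform interface needs no extra operations. It is a sketch under stated assumptions: the table `a`, its columns, and the `add_schema_tables` helper are hypothetical, and constraints are limited to simple equality rather than the full query syntax.

```python
# Toy table set; table "a" and its columns are illustrative, not from a real service.
TABLES = {
    "a": [
        {"id": 1, "ra": 180.0, "dec": 1.0, "snr": 3.1},
        {"id": 2, "ra": 180.2, "dec": 0.9, "snr": 1.7},
    ],
}


def add_schema_tables(tables):
    """Expose table and column metadata as two ordinary tables."""
    data_tables = dict(tables)  # snapshot, so the schema tables do not describe themselves
    tables["SCHEMA.tables"] = [{"tableName": name} for name in data_tables]
    tables["SCHEMA.columns"] = [
        {"tableName": name, "columnName": column}
        for name, rows in data_tables.items()
        for column in rows[0]  # column names taken from the first row (toy data is non-empty)
    ]


def query_data(table, **constraints):
    """Return the rows of `table` whose fields equal every given constraint."""
    rows = TABLES[table]
    return [
        row for row in rows
        if all(row.get(key) == value for key, value in constraints.items())
    ]


add_schema_tables(TABLES)

table_list = query_data("SCHEMA.tables")                    # metadata query
column_list = query_data("SCHEMA.columns", tableName="a")   # metadata query with a constraint
data_rows = query_data("a", id=2)                           # ordinary data query

print(table_list)
print([c["columnName"] for c in column_list])
print(data_rows)
```

The same `query_data` function serves all three calls. Exposing further metadata (views, functions, indexes) would mean adding another entry to the table set, not another operation.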
Metadata Queries
Dedicated Operations
  – Defines a custom operation to query each form of table metadata
      – e.g., getTables, getColumns
  – Could potentially still re-use query response mechanisms
      – e.g., FORMAT options and parameter
Advantages
  – Slightly simpler query, e.g., to get list of tables
Disadvantages
  – Standard data query interface cannot be used to access metadata
  – Less easily extended, as the service interface itself must be extended
Interface Summary
Operations
  – simple synchronous query (queryData variant)
  – table metadata query (probably also queryData)
  – asynchronous query (stageData variant)
  – getCapabilities (standard)
  – getAvailability (standard)
Protocol
  – HTTP details consistent with other services (multiple requests, error handling, file formats, etc.)
Some Key TAP Issues
Parameterized queries (POS, SIZE, etc.)
  – Do we go only the ADQL route, or do we also provide a parameter-based query for the simplest use cases?
Unified table data/metadata query
  – Do we provide a uniform interface for table data and table metadata queries?
Form of interface
  – Can we agree to standardize the form of an interface at the level of the HTTP protocol, e.g., the details of how multiple operations, parameters, error responses, etc., are handled, and use this for all the IVOA data services?
Grid capabilities
  – Do we wish to have a phased development for TAP which does not fully specify the grid capabilities in the initial version?
  – Will this be sufficiently useful?