API Aspect of the Science Platform


1 API Aspect of the Science Platform
Fritz Mueller, DAX T/CAM, August 16, 2017

2 APIs supporting the Science Platform
Web APIs: VO standard protocols where possible; custom extensions where necessary, proposed back to IVOA when applicable
  Catalog & other tabular data (dbserv)
  Image data (imgserv)
  Metadata (metaserv)
Python objects (Data Butler)
SQL database (Qserv)
LSST2017 - Tucson, AZ - August 14 - 18, 2017

3 Web APIs: Catalog & Other Tabular Data (dbserv)
TAP (TAP 1.1, once it reaches Recommendation status)
ADQL (ADQL 2.1, once it reaches Recommendation status)
Return data formats: VOTable (1.3, per TAP 1.1); additionally JSON and SQLite
gzip compression through HTTP
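As a rough illustration of the TAP/ADQL interface listed above, the following sketch builds (but does not send) a synchronous TAP request. The service URL, table name, and column names are hypothetical placeholders and do not describe the actual dbserv deployment. The parameter names (REQUEST, LANG, QUERY, RESPONSEFORMAT) come from the IVOA TAP standard, and which output formats a given service accepts is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_tap_sync_request(base_url, adql,
                           response_format="application/x-votable+xml"):
    """Build a POST request for the /sync endpoint of a TAP service."""
    params = {
        "REQUEST": "doQuery",   # required by TAP 1.0, optional in TAP 1.1
        "LANG": "ADQL",
        "QUERY": adql,
        "RESPONSEFORMAT": response_format,  # TAP 1.1 name for the output format
    }
    return Request(
        base_url.rstrip("/") + "/sync",
        data=urlencode(params).encode("ascii"),
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept-Encoding": "gzip",  # ask for gzip compression over HTTP
        },
        method="POST",
    )


# Hypothetical table and column names, for illustration only.
EXAMPLE_ADQL = (
    "SELECT TOP 10 ra, decl FROM Object "
    "WHERE CONTAINS(POINT('ICRS', ra, decl), "
    "CIRCLE('ICRS', 9.5, -1.1, 0.1)) = 1"
)

request = build_tap_sync_request("https://example.org/tap", EXAMPLE_ADQL)
```

Passing `request` to `urllib.request.urlopen` would submit the query. A client that sends `Accept-Encoding: gzip` must be prepared to decompress the response body itself.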

4 Web APIs: Image Data (imgserv)
Discovery:
  SIA (SIA V2)
  Also TAP to database per IVOA ObsCore Data Model and CAOM2
Retrieval: direct via SIA-returned URL when possible, or SODA
Format: at least FITS; others could be added
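A minimal sketch of the discovery step, assuming an SIA V2 query endpoint at a hypothetical URL. POS and MAXREC are parameters defined by the SIA V2 standard; the endpoint address and the helper name are placeholders, not part of imgserv.

```python
from urllib.parse import urlencode


def build_sia2_query_url(query_endpoint, ra_deg, dec_deg, radius_deg,
                         max_records=None):
    """Build an SIA V2 discovery URL for a circular region (ICRS degrees)."""
    params = [("POS", f"CIRCLE {ra_deg} {dec_deg} {radius_deg}")]
    if max_records is not None:
        params.append(("MAXREC", str(max_records)))
    return query_endpoint + "?" + urlencode(params)


url = build_sia2_query_url("https://example.org/sia/query", 9.5, -1.1, 0.1,
                           max_records=5)
```

The response to such a query is a VOTable of ObsCore-style records. A client would read the `access_url` column of each record to retrieve the image directly, or hand the dataset identifier to a SODA service for cutouts.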

5 Web APIs: Metadata (metaserv)
Catalog and image metadata
IVOA ObsCore DM, VOResource/RegTAP, CAOM2 where possible
Otherwise, TAP access to "native" metadata tables
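To make the ObsCore route concrete, here is a sketch of an ADQL query that a client could send over TAP to find images near a position. The table name `ivoa.ObsCore` and the columns used are defined by the IVOA ObsCore standard. Whether metaserv exposes exactly this table is an assumption, and the helper function is hypothetical.

```python
def obscore_image_query(ra_deg, dec_deg, radius_deg, top=None):
    """Compose an ADQL query for image records within a circle on the sky."""
    top_clause = f"TOP {int(top)} " if top is not None else ""
    return (
        f"SELECT {top_clause}obs_id, access_url, access_format "
        "FROM ivoa.ObsCore "
        "WHERE dataproduct_type = 'image' "
        "AND CONTAINS(POINT('ICRS', s_ra, s_dec), "
        f"CIRCLE('ICRS', {float(ra_deg)}, {float(dec_deg)}, "
        f"{float(radius_deg)})) = 1"
    )


query = obscore_image_query(9.5, -1.1, 0.1, top=5)
```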

6 Python Object API (Data Butler)
Abstracts storage technology, layout, and data encoding via a generic key/value data identification API and a plug-in architecture
Rich, in-memory Python objects in and out
How pipeline payloads deal with LSST data products
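The idea of key/value data identification with storage plug-ins can be sketched with a toy class. This is not the LSST Data Butler API: every class and method name below (`ToyButler`, `JsonFormatter`, `PickleFormatter`, `put`, `get`) is hypothetical and only illustrates the pattern the slide describes, in which the caller names a dataset by type and key/value pairs and never sees paths or file formats.

```python
import json
import pickle
import tempfile
from pathlib import Path


class JsonFormatter:
    """Hypothetical plug-in that encodes objects as JSON text."""
    extension = ".json"

    def write(self, obj, path):
        path.write_text(json.dumps(obj))

    def read(self, path):
        return json.loads(path.read_text())


class PickleFormatter:
    """Hypothetical plug-in that encodes objects with pickle."""
    extension = ".pickle"

    def write(self, obj, path):
        path.write_bytes(pickle.dumps(obj))

    def read(self, path):
        return pickle.loads(path.read_bytes())


class ToyButler:
    """Maps (dataset type, key/value data ID) to a stored Python object."""

    def __init__(self, root, formatters):
        self.root = Path(root)
        self.formatters = formatters  # dataset type -> formatter plug-in

    def _locate(self, dataset_type, data_id):
        # Layout and encoding are decided here, hidden from the caller.
        formatter = self.formatters[dataset_type]
        name = "_".join(f"{key}-{data_id[key]}" for key in sorted(data_id))
        path = self.root / dataset_type / (name + formatter.extension)
        return formatter, path

    def put(self, obj, dataset_type, **data_id):
        formatter, path = self._locate(dataset_type, data_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        formatter.write(obj, path)

    def get(self, dataset_type, **data_id):
        formatter, path = self._locate(dataset_type, data_id)
        return formatter.read(path)
```

With this sketch, `butler.put(obj, "summary", visit=1, ccd=2)` followed by `butler.get("summary", visit=1, ccd=2)` round-trips an object. Swapping the formatter registered for `"summary"` changes the on-disk encoding without touching the calling code.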

7 SQL Database (Qserv)
A custom MPP (massively parallel processing) RDBMS front-end, developed at SLAC, to meet the scaling and performance requirements of LSST

8 SQL Database (Qserv) cont.
Leverages existing OSS technologies on the back-end:
  Data shards hosted in MariaDB database instances
  XRootD for scalable data-addressed message and data transport
Spherical geometry with overlap
Advanced shared-scan architecture to support a high level of concurrency
Optimized for the immutable-data use case
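The "overlap" idea can be illustrated with a deliberately simplified sketch. It partitions the sky into declination stripes only and is not Qserv's actual partitioning code, which also subdivides in right ascension. The function names and the stripe count are assumptions. A row belongs to one home stripe, and is additionally copied into the overlap of a neighbouring stripe when it lies within the overlap distance of the shared edge, so that near-neighbour joins up to that distance can run on one shard without contacting another.

```python
def stripe_index(dec_deg, num_stripes):
    """Home stripe for a declination in [-90, 90] degrees."""
    height = 180.0 / num_stripes
    # Clamp so that dec = +90 falls in the last stripe.
    return min(int((dec_deg + 90.0) / height), num_stripes - 1)


def stripes_with_overlap(dec_deg, num_stripes, overlap_deg):
    """Return (home stripe, neighbour stripes that hold an overlap copy)."""
    height = 180.0 / num_stripes
    home = stripe_index(dec_deg, num_stripes)
    lower_edge = -90.0 + home * height
    upper_edge = lower_edge + height
    neighbours = []
    if home > 0 and dec_deg - lower_edge < overlap_deg:
        neighbours.append(home - 1)
    if home < num_stripes - 1 and upper_edge - dec_deg < overlap_deg:
        neighbours.append(home + 1)
    return home, neighbours
```

For example, with 18 stripes of 10 degrees each and a 0.01 degree overlap, a source at declination 0.005 has home stripe 9 and is also copied into the overlap of stripe 8, while a source at declination 5.0 is stored in stripe 9 only.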

9 Verification and Validation
PDAC (Prototype Data Access Center):
  Integration environment with the rest of the Science Platform
  At non-trivial scale (~100 TB currently), with scientifically valid data (SDSS Stripe 82 and IRSA AllWISE currently; NEOWISE & HSC upcoming)
  Feedback from SUIT team on functionality of service endpoints
  Feedback from a small group of science users on performance, accuracy, usability, and appropriateness of design
  First PDAC user report: DMTR-22

10 Verification and Validation, cont.
Qserv Scaling and Performance Tests (LDM-552):
  Graduated series of data-challenge style tests, on a glide-path toward project requirements for DR1
  Most recent test report: DMTR-16

11 Status: Web APIs & Data Butler
Web APIs:
  Preliminary implementations running in the PDAC, mostly custom interfaces, receiving feedback from SUIT team
  Current development focused on adding/extending VO protocol support
Data Butler:
  Basic implementation has seen heavy use by Science Pipelines group
  Current development focused on architectural cleanup and implementation of additional backend storage/format plugins

12 Status: Qserv
Three running instances (~30 physical nodes each) in continuous operation: a dedicated hardware cluster at NCSA, and early adoption (2 clusters) at CC-IN2P3
Scale testing with databases up to ~70 TB, expected to cross 100 TB this year
On track to meet or exceed performance requirements
Work in progress on data distribution/replication for data durability, improvement of auxiliary tooling (e.g. deployment, data ingest), and improved query management (async/status)

13 Selected Milestones
Capability | L2 Milestones | Cycle
VO web APIs; Ingest infrastructure revamp for HSC reprocessing | LDM-503-1, LDM-503-2 | Fall 2017
Transformed EFD Schema Design; Butler retrieves images from Data Backbone | LSST-1220, LDM-503-3, LDM-503-4 | Spring 2018
Services integrated with NCSA auth system; Provenance system; Qserv single-master fault-tolerance | LDM-503-7, LDM-503-9 | Fall 2018
User table import/export; Calibration database; Qserv multi-master fault-tolerance | LDM | Fall 2019
Image regeneration; Next-to-database processing; Security audit | LDM | Fall 2020

