API Aspect of the Science Platform


API Aspect of the Science Platform
Fritz Mueller, DAX T/CAM
August 16, 2017

APIs supporting the Science Platform
  - Web APIs: VO standard protocols where possible; custom extensions where necessary, proposed back to IVOA when applicable
      - Catalog & other tabular data (dbserv)
      - Image data (imgserv)
      - Metadata (metaserv)
  - Python objects (Data Butler)
  - SQL database (Qserv)
LSST2017 - Tucson, AZ - August 14 - 18, 2017

Web APIs: Catalog & Other Tabular Data (dbserv)
  - TAP (TAP 1.1, once it reaches recommended status)
  - ADQL (ADQL 2.1, once it reaches recommended status)
  - Return data formats: VOTable (1.3, per TAP 1.1); additionally JSON and SQLite
  - gzip compression through HTTP
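As a rough illustration of what a client call to a TAP service looks like, the sketch below builds (but does not send) a synchronous TAP query request carrying an ADQL query. The base URL, table name, and column names are hypothetical placeholders, not dbserv's actual endpoint or schema. The parameter names (REQUEST, LANG, QUERY, RESPONSEFORMAT) are those of the IVOA TAP protocol; RESPONSEFORMAT is the TAP 1.1 spelling, and TAP 1.0 services expect FORMAT instead.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Hypothetical endpoint, used only for illustration.
TAP_BASE_URL = "https://example.org/tap"


def build_tap_sync_request(adql, response_format="votable", base_url=TAP_BASE_URL):
    """Build (without sending) a POST request to a TAP /sync endpoint."""
    params = {
        "REQUEST": "doQuery",            # required by TAP 1.0, tolerated by TAP 1.1
        "LANG": "ADQL",
        "QUERY": adql,
        "RESPONSEFORMAT": response_format,
    }
    body = urlencode(params).encode("ascii")
    return Request(
        base_url.rstrip("/") + "/sync",
        data=body,
        headers={"Accept-Encoding": "gzip"},  # ask for gzip-compressed transfer
        method="POST",
    )


# Cone search on a hypothetical table with hypothetical column names.
EXAMPLE_ADQL = (
    "SELECT TOP 10 ra, decl FROM hypothetical_catalog.object "
    "WHERE CONTAINS(POINT('ICRS', ra, decl), "
    "CIRCLE('ICRS', 10.0, -1.0, 0.1)) = 1"
)

request = build_tap_sync_request(EXAMPLE_ADQL)
```

Passing `request` to `urllib.request.urlopen` would submit the query; the response body would then need gzip decompression if the server honoured the Accept-Encoding header.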

Web APIs: Image Data (imgserv)
  - Discovery: SIA (SIA V2); also TAP to database per IVOA ObsCore Data Model and CAOM2
  - Retrieval: direct via SIA-returned URL when possible, or SODA
  - Format: at least FITS; others could be added
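A minimal sketch of the discovery step, assuming an SIA V2 service: the client sends a positional constraint to the service's query resource and receives a table of matching images, each with an access URL for retrieval. The base URL below is a hypothetical placeholder, not imgserv's actual endpoint. POS and MAXREC are standard SIA V2 / DALI parameter names.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint, used only for illustration.
SIA_BASE_URL = "https://example.org/sia"


def build_sia2_query_url(ra_deg, dec_deg, radius_deg,
                         base_url=SIA_BASE_URL, maxrec=None):
    """Build an SIA V2 discovery URL for images overlapping a circle on the sky."""
    # SIA V2 encodes a circle as "CIRCLE <ra> <dec> <radius>", all in degrees.
    params = [("POS", f"CIRCLE {ra_deg} {dec_deg} {radius_deg}")]
    if maxrec is not None:
        params.append(("MAXREC", str(maxrec)))  # cap the number of returned records
    return f"{base_url.rstrip('/')}/query?{urlencode(params)}"


url = build_sia2_query_url(10.0, -1.0, 0.1, maxrec=5)
```

The response to such a query is a VOTable following the ObsCore data model; its access URL column gives either a direct download link or a link to a further service such as SODA.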

Web APIs: Metadata (metaserv)
  - Catalog and image metadata
  - IVOA ObsCore DM, VOResource/RegTAP, CAOM2 where possible
  - Otherwise, TAP access to "native" metadata tables

Python Object API (Data Butler)
  - Abstracts storage technology, layout, and data encoding via a generic key/value data identification API and plug-in architecture
  - Rich, in-memory Python objects in and out
  - How pipeline payloads deal with LSST data products
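The idea of key/value data identification with pluggable storage can be sketched with a toy class. This is not the LSST Data Butler API: ToyButler, InMemoryStorage, JsonFileStorage, and all method names below are hypothetical and exist only to show how callers can name a dataset by type and key/value pairs while the storage back-end stays hidden.

```python
import json
import os
import tempfile


class InMemoryStorage:
    """Hypothetical plug-in that keeps datasets in a dictionary."""

    def __init__(self):
        self._data = {}

    def write(self, obj, dataset_type, data_id):
        self._data[(dataset_type, tuple(sorted(data_id.items())))] = obj

    def read(self, dataset_type, data_id):
        return self._data[(dataset_type, tuple(sorted(data_id.items())))]


class JsonFileStorage:
    """Hypothetical plug-in that stores each dataset as a JSON file."""

    def __init__(self, root):
        self.root = root

    def _path(self, dataset_type, data_id):
        # The file layout is derived from the data ID and hidden from callers.
        key = "_".join(f"{k}-{data_id[k]}" for k in sorted(data_id))
        return os.path.join(self.root, dataset_type, key + ".json")

    def write(self, obj, dataset_type, data_id):
        path = self._path(dataset_type, data_id)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            json.dump(obj, handle)

    def read(self, dataset_type, data_id):
        with open(self._path(dataset_type, data_id)) as handle:
            return json.load(handle)


class ToyButler:
    """Toy mediator: callers name data by type and key/value pairs only."""

    def __init__(self, default_storage):
        self._default = default_storage
        self._by_type = {}

    def register(self, dataset_type, storage):
        """Route one dataset type to a specific storage plug-in."""
        self._by_type[dataset_type] = storage

    def put(self, obj, dataset_type, **data_id):
        self._by_type.get(dataset_type, self._default).write(obj, dataset_type, data_id)

    def get(self, dataset_type, **data_id):
        return self._by_type.get(dataset_type, self._default).read(dataset_type, data_id)
```

A caller would write `butler.put(obj, "src", visit=1, ccd=2)` and later `butler.get("src", visit=1, ccd=2)`, never touching paths or file formats; swapping the plug-in changes where and how the data is stored without changing the calling code.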

SQL Database (Qserv)
  - A custom MPP RDBMS front-end, developed at SLAC, to meet the scaling and performance requirements of LSST

SQL Database (Qserv), cont.
  - Leverages existing OSS technologies on the back-end
      - Data shards hosted in MariaDB database instances
      - XRootD for scalable data-addressed message and data transport
  - Spherical geometry with overlap
  - Advanced shared-scan architecture to support a high level of concurrency
  - Optimized for the immutable-data use case
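The phrase "spherical geometry with overlap" can be illustrated with a simplified partitioning sketch. Everything here is an assumption made for illustration and is not Qserv's actual partitioning code: the sky is cut into equal-height declination stripes, each stripe into a fixed number of equal-width RA chunks, and a point is additionally copied into any neighbouring chunk whose border lies within the overlap margin, so that near-neighbour joins can run on one shard without fetching rows from another. The margin test samples the corners of a box around the point, which is adequate only away from the poles and for margins smaller than one chunk.

```python
import math

# Hypothetical grid parameters, chosen only for illustration.
NUM_STRIPES = 18                      # declination stripes, 10 degrees each
CHUNKS_PER_STRIPE = 36                # RA chunks per stripe, 10 degrees each
STRIPE_HEIGHT = 180.0 / NUM_STRIPES
CHUNK_WIDTH = 360.0 / CHUNKS_PER_STRIPE


def chunk_id(ra_deg, dec_deg):
    """Return the id of the chunk that owns the point (ra, dec)."""
    stripe = min(int((dec_deg + 90.0) / STRIPE_HEIGHT), NUM_STRIPES - 1)
    chunk = int((ra_deg % 360.0) / CHUNK_WIDTH) % CHUNKS_PER_STRIPE
    return stripe * CHUNKS_PER_STRIPE + chunk


def chunks_with_overlap(ra_deg, dec_deg, overlap_deg):
    """Return ids of all chunks that should store the point.

    The result contains the owning chunk plus every neighbouring chunk
    whose overlap margin contains the point (box approximation).
    """
    # A fixed angular distance spans more RA at higher declination.
    cos_dec = max(math.cos(math.radians(dec_deg)), 1e-9)
    ra_margin = min(overlap_deg / cos_dec, 180.0)
    ids = set()
    for dec in (dec_deg - overlap_deg, dec_deg, dec_deg + overlap_deg):
        dec = max(-90.0, min(90.0, dec))
        for ra in (ra_deg - ra_margin, ra_deg, ra_deg + ra_margin):
            ids.add(chunk_id(ra, dec))
    return ids
```

With these parameters a point in the middle of a chunk is stored once, a point within the margin of an RA border is stored in two chunks (including across the 0/360 degree wrap), and a point near a chunk corner is stored in four.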

Verification and Validation
  - PDAC (Prototype Data Access Center):
      - Integration environment with rest of Science Platform
      - At non-trivial scale (~100 TB currently), with scientifically valid data (SDSS Stripe82, IRSA AllWISE currently; NEOWISE & HSC upcoming)
      - Feedback from SUIT team on functionality of service endpoints
      - Feedback from small group of science users on performance, accuracy, usability, and appropriateness of design
      - First PDAC user report: DMTR-22

Verification and Validation, cont.
  - Qserv Scaling and Performance Tests (LDM-552):
      - Graduated series of data-challenge style tests, on a glide-path toward project requirements for DR1
      - Most recent test report: DMTR-16

Status: Web APIs & Data Butler
  - Web APIs:
      - Preliminary implementations running in the PDAC, mostly custom interfaces, receiving feedback from SUIT team
      - Current development focused on adding/extending VO protocol support
  - Data Butler:
      - Basic implementation has seen heavy use by Science Pipelines group
      - Current development focused on architectural cleanup and implementation of additional backend storage/format plugins

Status: Qserv
  - Three running instances (~30 physical nodes each) in continuous operation: dedicated hardware cluster at NCSA; early adoption also (2 clusters) at CC-IN2P3
  - Scale testing with databases up to ~70 TB, expected to cross 100 TB this year
  - On track to meet or exceed performance requirements
  - Work in progress on data distribution/replication for data durability, improvement of auxiliary tooling (e.g. deployment, data ingest), and improved query management (async/status)

Selected Milestones

Capability | L2 Milestones | Cycle
VO web APIs; Ingest infrastructure revamp for HSC reprocessing | LDM-503-1, LDM-503-2 | Fall 2017
Transformed EFD Schema Design; Butler retrieves images from Data Backbone | LSST-1220, LDM-503-3, LDM-503-4 | Spring 2018
Services integrated with NCSA auth system; Provenance system; Qserv single-master fault-tolerance | LDM-503-7, LDM-503-9 | Fall 2018
User table import/export; Calibration database; Qserv multi-master fault-tolerance | LDM-503-11 | Fall 2019
Image regeneration; Next-to-database processing; Security audit | LDM-503-13 | Fall 2020