ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney.


1 ASCR Data Science Centers Infrastructure Demonstration S. Canon, N. Desai, M. Ernst, K. Kleese-Van Dam, G. Shipman, B. Tierney

2 Data Science Centers - Conceptual View
[Diagram labels: Data Science Center, Core Integrated Services, Federated Services Catalog]
Complementary data science services
Staff with expertise in many areas of data science
– Partnering with domain scientists
Core integrated services
– Federated identity
– Data replication and inter-site data & metadata access
– Data publication and digital curation services
– Inter-site workflows
– Other critical replicated services
Federated services catalog
– Core common services
– Site-specific services
– Common service provisioning API
Providing the ability to construct complex multi-site data analysis environments from composable and customizable services
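The federated services catalog on this slide can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the `Service` and `FederatedCatalog` classes, the site names, and the service names are hypothetical and do not come from the presentation. It only shows how core common services (offered at every site) and site-specific services could sit side by side in one catalog.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Service:
    """One catalog entry (hypothetical structure, not from the slides)."""
    name: str
    site: str
    core: bool  # True for core common services, False for site-specific ones


@dataclass
class FederatedCatalog:
    """Hypothetical in-memory catalog merging service listings from several sites."""
    services: list = field(default_factory=list)

    def register(self, name, site, core=False):
        svc = Service(name=name, site=site, core=core)
        self.services.append(svc)
        return svc

    def sites_offering(self, name):
        # Every site that lists a service with this name.
        return sorted({s.site for s in self.services if s.name == name})

    def core_services(self):
        # Names of the core common services, deduplicated across sites.
        return sorted({s.name for s in self.services if s.core})

    def site_specific(self, site):
        # Services that only this site advertises as its own.
        return sorted(s.name for s in self.services if s.site == site and not s.core)


catalog = FederatedCatalog()
for site in ("site-a", "site-b"):  # placeholder site names
    catalog.register("federated-identity", site, core=True)
    catalog.register("data-replication", site, core=True)
catalog.register("visual-analytics", "site-b")  # a site-specific service

print(catalog.core_services())
print(catalog.site_specific("site-b"))
```

A real catalog would be a network service with its own provisioning API; the in-memory version only illustrates the split between core and site-specific entries.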

3 Our Infrastructure is Services
A rich environment of common services that can be flexibly composed to meet specific requirements of science domains across DOE SC
[Diagram labels, in slide order, duplicates removed]
MPI; ADIOS; Metadata Harvesting & Management; Indexing, Discovery & Dissemination; Semantic Analysis; Platform Instantiation Interface; Workflow Composition & Execution Manager; Data Mining; Data Services; Simulation Services; Simulation Frameworks; Scalable Debuggers; Scientific Libraries; Analytic Services; Data Fusion; System Software & Middleware Services; Map Reduce; HIVE; Key Value Stores; Graph Databases; SQL Databases; Human Computer Interaction; Infrastructure Services; HPC Compute; Utility Compute; Parallel File Systems; Archival Storage; Object Storage; Workflow Composition; Security; Message Queues; Network Storage; Visualization Environments; Visual Analytics Interface; Data Transfer Tools; Advanced Networking SDN

4 Demonstration Overview
[Diagram labels: Data Science Center, Pilot Integrated Services]
Multi-lab collaboration to demonstrate an integrated data science capability based on existing infrastructure
– Federated identity & consistent security/cyber policies
– Data replication and ease of data access across sites
– Advanced analysis systems
– Persistent services & data publication services
Pilot Integrated Services
– Federated identity
– Data replication
– Data publication
– Inter-site workflows
Phase 1
– Site specific service workflows
– Deployment of integrated services
Phase 2
– Inter-site composition of services
– Prototype federated services catalog

5 Guiding Principles
Infrastructure services should be API-driven to a high degree, to allow composition of services
Aim for commonality and consistency but allow for uniqueness (federated versus tightly integrated)
Allow domains to create and customize data analysis environments from services at various levels of the infrastructure, based on their level of sophistication and existing services
Provide a higher level of availability and redundancy than exists today
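The first principle, API-driven services that can be composed, can be illustrated with a small sketch. It assumes, purely for illustration, that each service exposes the same call shape (a dictionary in, a dictionary out); the `compose` helper and the `transfer`, `analyze`, and `publish` steps are hypothetical stand-ins, not services named in the presentation.

```python
from functools import reduce
from typing import Callable, Dict

# Assumed uniform interface: every service step maps a payload dict to a new dict.
Step = Callable[[Dict], Dict]


def compose(*steps: Step) -> Step:
    """Chain service steps left to right into a single callable."""
    def pipeline(payload: Dict) -> Dict:
        return reduce(lambda acc, step: step(acc), steps, payload)
    return pipeline


def transfer(payload: Dict) -> Dict:
    # Stand-in for a data transfer service: records the new location.
    return {**payload, "location": "site-b"}


def analyze(payload: Dict) -> Dict:
    # Stand-in for an analysis service: computes a simple mean.
    values = payload["values"]
    return {**payload, "mean": sum(values) / len(values)}


def publish(payload: Dict) -> Dict:
    # Stand-in for a data publication service: marks the result as published.
    return {**payload, "published": True}


environment = compose(transfer, analyze, publish)
result = environment({"values": [1.0, 2.0, 3.0], "location": "site-a"})
print(result)
```

Because every step shares one interface, a domain could swap, reorder, or drop steps without touching the others, which is the kind of customization the third principle describes.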

6 Which Services to Demo?
We will focus on core services that we envision are useful across a broad set of science domains / use cases.
We will sample from a couple of the domain demos and identify high-level services that we could demonstrate in a coordinated fashion across the 5 sites.

7 Potential Services of Relevance
Single sign-on
Replicated data storage
Data publishing / curation
Data capture service from facilities
Tiered storage (near line and archival)
Provenance (data and/or workflow provenance)
Message queues (supporting distributed workflow systems)
Anycast networking
Network traffic isolation for performance and security
Application as a service?
Admin and/or user-controlled service provisioning:
– Anycast-aware load balancer (perhaps with web services behind it)
– Highly scalable MongoDB or MySQL
Ultimately the services we demo will be guided by the science domain demonstrations
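The role of message queues in supporting distributed workflow systems can be sketched with Python's standard library. This is a single-process stand-in, not a deployment: the message fields (`step`, `dataset`) and the dataset names are hypothetical, and a production setup would use a networked message broker rather than an in-memory `queue.Queue`.

```python
import queue
import threading

tasks = queue.Queue()  # in-memory stand-in for a networked message queue
results = []


def worker():
    """Consume workflow messages until a None sentinel arrives."""
    while True:
        msg = tasks.get()
        if msg is None:
            tasks.task_done()
            break
        # Placeholder "processing": upper-case the dataset name.
        results.append((msg["step"], msg["dataset"].upper()))
        tasks.task_done()


consumer = threading.Thread(target=worker)
consumer.start()

# A producer (for example, a workflow manager at another site) enqueues work.
for step, dataset in enumerate(["run-001", "run-002"]):
    tasks.put({"step": step, "dataset": dataset})
tasks.put(None)  # sentinel: no more work

consumer.join()
print(results)
```

The queue decouples the producer from the consumer: neither needs to know where the other runs, only the message format they agree on.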

8 Core Services Stack for Demo
A rich environment of common services that can be flexibly composed to meet specific requirements of science domains across DOE SC
[Diagram labels, in slide order, duplicates removed]
Provenance & Publishing; Analysis Execution; Data Services; Simulation Services; Simulation Execution; Analytic Services; Data Integration; System Software & Middleware Services; Key Value Stores; SQL Databases; Human Computer Interaction; Infrastructure Services; HPC Compute; Utility Compute; File Systems; Archival Storage; Message Queues; Web/Visual Interface; Data Capture; Advanced Networking SDN; Data Transfer; Single Sign-on/Security; Workflow Composition; User Provision-able Services

