Software infrastructure for a National Research Platform

1 Software infrastructure for a National Research Platform
Ian Foster, The University of Chicago and Argonne National Laboratory
Talk at 1st National Research Platform Workshop, Aug 7-8, Bozeman, Montana

2 Congratulations, you have a Science DMZ!
Credit: Eli Dart

3 What you really want is a science accelerator
Software infrastructure:
High-speed data ingest
Secure data sharing
Data publication
Smart instruments
Ultra-scale collaboration
Software transmutes silicon into discoveries

4 A strong software infrastructure is…
Accessible: trivially usable by all
Ubiquitous: it goes where you need it
Performant: fast end to end
Secure: all resources are protected
Reliable: you can count on it
Programmable: you can build on it
Manageable: it supports sys admins, too
Sustainable: it will be there tomorrow

5 Accessible means trivially usable by all
Transfer
1. Researcher initiates transfer request; or requested automatically by script, science gateway
2. Globus transfers files reliably, securely
Share
3. Researcher selects files to share, selects user or group, and sets access permissions
4. Globus controls access to shared files on existing storage; no need to move files to cloud storage!
5. Collaborator logs in to Globus and accesses shared files; no local account required; download via Globus
Publish
6. Researcher assembles dataset; describes it with Dublin Core & domain-specific metadata
7. Curator reviews and approves; dataset published on campus or other system
Discover
8. Peers, collaborators search and discover datasets; transfer and share using Globus
Access via web browser, command line, or REST API
Use any storage
Use existing identity
Diagram labels: Compute Facility, Instrument, Personal Computer, Publication Repository

6 Ubiquitous means it goes where you need it
10,000+ active endpoints
Native packages
Installs in seconds
Linux, Windows, MacOS
GPFS, Lustre, OrangeFS, …
AWS S3, Ceph RadosGW
Spectra Logic BlackPearl
Google Drive, HPSS
Amazon Glacier

7 Performant means fast end to end
Specialized protocols
Auto-configuration
Parallel DTNs
File system optimizations
Tape system optimizations
1 PB in days, Argonne to NCSA (R. Kettimuthu et al.)
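To put the "1 PB in days" figure in perspective, the sketch below computes the sustained end-to-end rate needed to move a given volume in a given number of days. It assumes decimal units (1 PB = 10^15 bytes). The function name and the example durations are illustrative only, since the slide does not state the exact number of days.

```python
def required_gbps(petabytes: float, days: float) -> float:
    """Sustained rate in gigabits per second needed to move `petabytes` in `days`."""
    bits = petabytes * 1e15 * 8          # assumption: 1 PB = 10**15 bytes
    seconds = days * 24 * 60 * 60
    return bits / seconds / 1e9


if __name__ == "__main__":
    for days in (1, 3, 7):               # illustrative durations, not from the slide
        print(f"1 PB in {days} day(s): {required_gbps(1, days):.1f} Gb/s sustained")
```

Even a week-long transfer needs more than 10 Gb/s sustained around the clock, which is why the slide stresses end-to-end performance rather than link speed alone.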

8 Secure means all resources are protected
Globus service is itself highly secure
  Best-practice cloud security
  Third-party security reviews
Globus platform ensures your services are secure
  Accept credentials from 300+ identity providers
  Control proxy credential lifetimes
  Industry-standard OAuth-2 and OIDC protocols
  Data encryption
  Build secure services with controlled delegation

9 Reliable means you can count on it
Each transfer is monitored, retried upon failure
Protocols support restart
Fail over on multiple DTNs
Service is cloud hosted, with replication, dynamic failover, monitoring
99.5% uptime over past three years
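The "retried upon failure" and "protocols support restart" points can be illustrated with a minimal, self-contained sketch. This is not Globus code: `copy_with_restart` and `FlakySource` are hypothetical names, and the in-memory source only simulates network faults. The idea shown is that a failed read is retried from the last good offset, so bytes already received are never re-sent.

```python
from typing import Callable, Tuple


def copy_with_restart(
    read_at: Callable[[int, int], bytes],
    total_size: int,
    chunk_size: int = 4,
    max_retries: int = 5,
) -> Tuple[bytes, int]:
    """Copy `total_size` bytes, resuming at the last good offset after each failure.

    Returns the received bytes and the number of retries that were needed.
    """
    received = bytearray()
    retries = 0
    while len(received) < total_size:
        offset = len(received)
        want = min(chunk_size, total_size - offset)
        try:
            chunk = read_at(offset, want)
        except OSError:
            retries += 1
            if retries > max_retries:
                raise                      # give up after too many failures
            continue                       # restart from the same offset
        if not chunk:
            raise EOFError("source ended before total_size bytes were read")
        received.extend(chunk)
    return bytes(received), retries


class FlakySource:
    """Hypothetical in-memory source that fails on every third read."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.calls = 0

    def read_at(self, offset: int, size: int) -> bytes:
        self.calls += 1
        if self.calls % 3 == 0:
            raise OSError("simulated network fault")
        return self.data[offset:offset + size]


if __name__ == "__main__":
    payload = b"national research platform"
    copied, retries = copy_with_restart(FlakySource(payload).read_at, len(payload))
    print(copied == payload, retries)
```

A production service would also add backoff between attempts and verify checksums; those are left out to keep the restart logic visible.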

10 Programmable means you can build on it
Integrate file transfer and sharing capabilities into scientific web apps, portals, gateways, etc.
Use institutional ID systems in external web applications
Interfaces: Web, Command line, REST API
Services: Data Publication & Discovery, File Sharing, File Transfer & Replication
Platform: Globus Auth API, Globus Transfer API, Globus Connect
REST API example:
  GET /endpoint/go%23ep1
  PUT /endpoint/vas#my_endpt
  200 OK
  X-Transfer-API-Version: 0.10
  Content-Type: application/json
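In the slide's GET line, the endpoint name `go#ep1` appears as `go%23ep1`: a literal `#` in a URL starts the fragment, which is not sent to the server, so it has to be percent-encoded when used in a path. A minimal sketch of building such a path with the Python standard library follows. The helper name `endpoint_path` is hypothetical, and only the `/endpoint/<name>` shape is taken from the slide.

```python
from urllib.parse import quote


def endpoint_path(endpoint_name: str) -> str:
    """Build the /endpoint/<name> path with the name percent-encoded."""
    # safe="" encodes every reserved character, including "#" and "/".
    return "/endpoint/" + quote(endpoint_name, safe="")


if __name__ == "__main__":
    print(endpoint_path("go#ep1"))
```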

11 Programmable means you can build on it
Integrate file transfer and sharing capabilities into scientific web apps, portals, gateways, etc.
Use institutional ID systems in external web applications
Interfaces: Jupyter Notebooks, Python SDK
Services: Data Publication & Discovery, File Sharing, File Transfer & Replication
Platform: Globus Auth API, Globus Transfer API, Globus Connect

12 Programmable means automation
Recurring transfers with sync option: Copy /ingest 3:30am
Data distribution: .../my_share with --/cohort045, --/cohort096, --/cohort127 (Shared Endpoint); .../distribute
Staging area cleanup (Shared Endpoint):
1. Check if successful transfer
2. Delete data from staging area
globus.org
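The two automation patterns on this slide, a sync-style recurring copy and a staging cleanup that deletes only after a confirmed transfer, can be sketched with local directories standing in for endpoints. This is an illustration of the logic only, not the Globus API: the function names, directory names, and file name are hypothetical, and a scheduler such as cron would be what triggers the 3:30am run.

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def sync_copy(source: Path, dest: Path) -> list:
    """Copy files that are missing or different at dest; return the names copied."""
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for src_file in sorted(p for p in source.iterdir() if p.is_file()):
        dst_file = dest / src_file.name
        # shallow=False compares file contents, not just size and mtime.
        if not dst_file.exists() or not filecmp.cmp(src_file, dst_file, shallow=False):
            shutil.copy2(src_file, dst_file)
            copied.append(src_file.name)
    return copied


def cleanup_staging(staging: Path, dest: Path) -> list:
    """Delete a staged file only after an identical copy is confirmed at dest."""
    removed = []
    for staged in sorted(p for p in staging.iterdir() if p.is_file()):
        target = dest / staged.name
        if target.exists() and filecmp.cmp(staged, target, shallow=False):
            staged.unlink()
            removed.append(staged.name)
    return removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        staging, ingest = Path(tmp) / "staging", Path(tmp) / "ingest"
        staging.mkdir()
        (staging / "run1.dat").write_text("data")
        print(sync_copy(staging, ingest))      # first run copies the file
        print(sync_copy(staging, ingest))      # second run finds nothing to do
        print(cleanup_staging(staging, ingest))
```

Because the sync step skips files that already match, running it on a schedule is idempotent, and the cleanup step never removes data that has not reached the destination.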

13 Programmable means automation
ARM Climate Research Facility

14 Manageable means it helps sys admins, too
Low admin costs
Priority support
Usage reporting
Management console
Alternative identity provider
Training materials
Constant innovation

15 Sustainable means it will be there tomorrow
Operated by professionals at the University of Chicago
Supported by subscriptions from >65 institutions
[Picture of team]

16 Raising the bar on research software quality
5 major services
13 national labs use Globus
290 PB transferred
10,000 active endpoints
50 Bn files processed
70,000 registered users
99.5% uptime
65+ institutional subscribers
1 PB largest single transfer to date
3 months longest continuously managed transfer
300+ federated campus identities
12,000 active users/year

17 Software infrastructure for a national research platform
Get more data to more people faster
Easier: Authentication, Transfer, Sharing, Publication, Administration
More: Users, Time, Data, Storage
Better: Collaboration, Ideas, Innovation
Software transmutes hardware into discoveries

18 Thank you to our sponsors!
globus.org Our subscribers U.S. DEPARTMENT OF ENERGY

