Collaboration on Large Datasets using Globus
Rachana Ananthakrishnan, University of Chicago
Data sharing in collaborations
[Diagram: Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, Mirror]
Data Management User Stories
– “I need a good place to store / backup / archive my (big) research data”
– “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”
– “I need a way to easily and securely share my data with my colleagues at other institutions.”
– “I want to publish my data.”
– “I want to discover published data.”
– …
Exemplar: ISI-MIP
Inter-Sectoral Impact Model Intercomparison Project
– Framework to collate climate impact data across scales and sectors
– World-wide collaboration with data assets managed by the collaboration
– Inputs come from various climate models; the output forms the basis for model evaluation and improvement
Credits: Dr. Joshua Elliot, University of Chicago
ISI-MIP Use Cases
Share data with researchers across institutions world-wide
– Restricted sharing
– Multiple institutions
Accept data submissions
– Restricted writing to archive
Publish results
– Move selected results to other locations
– Track metadata
– Discover data
What is Globus?
Big data publish*, transfer and sharing…
…with Dropbox-like simplicity…
…directly from your own storage systems
* In pilot phase
Publish walk-through
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
[Diagram: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist, Curator]
Login with Campus Identity
New submission
Assemble the Dataset
Move data to publish archive
Grant Submission License
Submission Complete
Curator Logs in
Curation Workflow Options
Verify Metadata & Files
Approve the Submission
Submission is now Published with DOI
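The publish walk-through above can be read as a small state machine: a submission is described, assembled, submitted under a license, and then approved by a curator, at which point it receives an identifier. The sketch below is only an illustration of that ordering. The class, state names, file names, and the placeholder identifier are all hypothetical and are not taken from the Globus publish service.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    DRAFT = "draft"            # scientist is describing the submission
    ASSEMBLED = "assembled"    # files have been moved to the publish archive
    SUBMITTED = "submitted"    # license granted, waiting for a curator
    PUBLISHED = "published"    # curator approved, identifier assigned


@dataclass
class Submission:
    """Hypothetical model of one dataset moving through the walk-through."""
    title: str
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)
    state: State = State.DRAFT
    identifier: Optional[str] = None

    def assemble(self, files):
        # Step 3: assemble the dataset (transfer data).
        if self.state is not State.DRAFT:
            raise ValueError("files can only be added to a draft")
        if not files:
            raise ValueError("a dataset needs at least one file")
        self.files = list(files)
        self.state = State.ASSEMBLED

    def submit(self, license_granted):
        # The submission license must be granted before curation starts.
        if self.state is not State.ASSEMBLED:
            raise ValueError("assemble the dataset before submitting")
        if not license_granted:
            raise ValueError("the submission license must be granted")
        self.state = State.SUBMITTED

    def approve(self, identifier):
        # Step 4: the curator verifies metadata and files, then approves.
        if self.state is not State.SUBMITTED:
            raise ValueError("only submitted datasets can be approved")
        self.identifier = identifier
        self.state = State.PUBLISHED


if __name__ == "__main__":
    s = Submission("Example impact model runs", metadata={"sector": "example"})
    s.assemble(["run1.nc", "run2.nc"])
    s.submit(license_granted=True)
    s.approve(identifier="doi:PLACEHOLDER")  # placeholder, not a real DOI
    print(s.state.value, s.identifier)
```

Making each transition check the current state keeps the order of the slides explicit: a curator cannot approve something that was never submitted.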
Discover walk-through
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
[Diagram: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist, Curator]
Search Published Datasets
Discovering a Published Dataset
Download the Published Dataset
Select Download Destination
Globus Under the Covers
[Diagram: Globus APIs; Sharing Service; Transfer Service; Identity, Group, Profile Management Services; …; Globus Toolkit; Globus Connect]
Reliable, secure, high-performance file transfer and synchronization
– “Fire-and-forget” transfers
– Automatic fault recovery
– Seamless security integration
– Powerful GUI and APIs
1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
[Diagram: Data Source, Data Destination]
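To make "fire-and-forget" with automatic fault recovery concrete, here is a minimal local sketch of the idea: skip files that are already in sync, retry failed copies, verify each copy by checksum, and return a summary that could be sent to the user. It copies between two local directories with the Python standard library only. It is not how the Globus transfer service is implemented, and the function names `checksum` and `sync` and the `max_retries` parameter are assumptions made for this illustration.

```python
import hashlib
import shutil
from pathlib import Path


def checksum(path):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def sync(src_dir, dst_dir, copy=shutil.copyfile, max_retries=3):
    """Copy files from src_dir to dst_dir.

    Files whose checksums already match are skipped; failed copies are
    retried up to max_retries times. The returned report lists relative
    paths under 'copied', 'skipped' and 'failed'.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    report = {"copied": [], "skipped": [], "failed": []}
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        if dst.exists() and checksum(dst) == checksum(src):
            report["skipped"].append(rel.as_posix())
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        for attempt in range(1, max_retries + 1):
            try:
                copy(src, dst)
                # Verify the copy; a mismatch is treated like any other fault.
                if checksum(dst) != checksum(src):
                    raise OSError("checksum mismatch after copy")
                report["copied"].append(rel.as_posix())
                break
            except OSError:
                if attempt == max_retries:
                    report["failed"].append(rel.as_posix())
    return report
```

Calling `sync` a second time on the same pair of directories reports every file as skipped, which is the "sync" half of the slide: only missing or changed files are moved.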
Simple, secure sharing off existing storage systems
– Easily share large data with any user or group
– No cloud storage required
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file
[Diagram: Data Source]
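The three sharing steps above amount to recording permission rules next to data that stays where it is, then checking those rules when another user asks for access. The sketch below models that with path-prefix rules for users and groups. The `SharedEndpoint` class, its methods, the `r`/`w` permission letters, and the user, group, and path names are hypothetical; they are not the Globus sharing API.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEndpoint:
    """Hypothetical model of a folder shared in place on its owner's storage."""
    owner: str
    groups: dict = field(default_factory=dict)  # group name -> set of users
    rules: list = field(default_factory=list)   # (path prefix, principal, permissions)

    def grant(self, path, principal, permissions):
        """Step 1: the owner picks a path, a user or group, and permissions."""
        if not set(permissions) <= {"r", "w"}:
            raise ValueError("permissions must be 'r', 'w' or 'rw'")
        # Store the prefix with a trailing slash so /data does not match /database.
        self.rules.append((path.rstrip("/") + "/", principal, set(permissions)))

    def allowed(self, user, path, permission):
        """Step 3: check a request against the recorded rules; no data is moved."""
        if user == self.owner:
            return True
        target = path.rstrip("/") + "/"
        for prefix, principal, perms in self.rules:
            member = principal == user or user in self.groups.get(principal, set())
            if member and target.startswith(prefix) and permission in perms:
                return True
        return False


if __name__ == "__main__":
    ep = SharedEndpoint(owner="userA", groups={"collab": {"userB"}})
    ep.grant("/results/selected", "collab", "r")
    print(ep.allowed("userB", "/results/selected/run1.nc", "r"))
    print(ep.allowed("userB", "/results/selected/run1.nc", "w"))
```

Granting `r` without `w` corresponds to restricted sharing, while a separate `w` rule on an ingest path would correspond to restricted writing to an archive.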
Thank you
– Sign up and use Globus to transfer and share
– Sign up as an early adopter of publish
– Support
Thank you to our sponsors! U.S. DEPARTMENT OF ENERGY