Collaboration on Large Datasets using Globus Rachana Ananthakrishnan University of Chicago.

Presentation transcript:

Collaboration on Large Datasets using Globus
Rachana Ananthakrishnan, University of Chicago

Data sharing in collaborations
[Diagram labels: Registry, Staging Store, Ingest Store, Analysis Store, Community Store, Archive, Mirror]

Data Management User Stories
- “I need a good place to store / backup / archive my (big) research data”
- “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”
- “I need a way to easily and securely share my data with my colleagues at other institutions.”
- “I want to publish my data.”
- “I want to discover published data.”
- …

Exemplar: ISI-MIP
Inter-Sectoral Impact Model Intercomparison Project
- Framework to collate climate impact data across scales and sectors
- World-wide collaboration with data assets managed by the collaboration
- Inputs from various climate models & output forms basis for model evaluation and improvement
Credits: Dr. Joshua Elliot, University of Chicago

ISI-MIP Use Cases
- Share data with researchers across institutions world-wide
  - Restricted sharing
  - Multiple institutions
- Accept data submissions
  - Restricted writing to archive
- Publish results
  - Move selected results to other locations
  - Track metadata
  - Discover data

What is Globus?
Big data publish*, transfer and sharing…
…with Dropbox-like simplicity…
…directly from your own storage systems
* In pilot phase

Publish walk-through
[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
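The four numbered steps of the publish walk-through can be read as a small state machine. The sketch below is a toy model only, not the Globus publish API: the `Submission` class, the `State` names, and the `doi:PLACEHOLDER` string are all hypothetical, and the rule that steps must happen strictly in order is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class State(Enum):
    # State names are hypothetical; they mirror the four numbered steps.
    STARTED = "1. publish requested"
    DESCRIBED = "2. submission described"
    ASSEMBLED = "3. dataset assembled"
    PUBLISHED = "4. dataset curated and published"


@dataclass
class Submission:
    """Hypothetical record of one dataset moving through the walk-through."""
    state: State = State.STARTED
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)
    identifier: Optional[str] = None

    def _require(self, expected: State) -> None:
        # Assumption: each step is only valid directly after the previous one.
        if self.state is not expected:
            raise RuntimeError(
                f"expected {expected.name}, but state is {self.state.name}"
            )

    def describe(self, **metadata) -> None:
        self._require(State.STARTED)
        self.metadata.update(metadata)
        self.state = State.DESCRIBED

    def assemble(self, files) -> None:
        # In the talk this step is a data transfer into the archive.
        self._require(State.DESCRIBED)
        self.files.extend(files)
        self.state = State.ASSEMBLED

    def curate(self, approve: bool) -> None:
        # Assumption: a rejected submission simply stays in ASSEMBLED.
        self._require(State.ASSEMBLED)
        if approve:
            self.identifier = "doi:PLACEHOLDER"  # not a real identifier
            self.state = State.PUBLISHED


submission = Submission()
submission.describe(title="Example impact model output")
submission.assemble(["results/run1.nc"])
submission.curate(approve=True)
print(submission.state.name, submission.identifier)
```

Calling a step out of order, for example `assemble` before `describe`, raises `RuntimeError` in this model.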

Login with Campus Identity

New submission

Assemble the Dataset

Move data to publish archive

Grant Submission License

Submission Complete

Curator Logs in

Curation Workflow Options

Verify Metadata & Files

Approve the Submission

Submission is now Published with DOI

Discover walk-through
[Diagram labels: Collaboration Archive; Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
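Steps 5 and 6 of the discover walk-through amount to a metadata search followed by a copy to a destination the user picks. The sketch below is a toy illustration under assumed data shapes, not the Globus search interface: the record layout (`metadata`, `files`, `published`) and the `search` and `download` function names are hypothetical.

```python
def search(catalog, query):
    """Return published records whose metadata mentions every query term."""
    terms = query.lower().split()
    hits = []
    for record in catalog:
        if not record.get("published"):
            continue  # unpublished submissions are not discoverable
        text = " ".join(str(v) for v in record["metadata"].values()).lower()
        if all(term in text for term in terms):
            hits.append(record)
    return hits


def download(record, destination):
    """Copy a record's files into a destination mapping chosen by the user."""
    destination.update(record["files"])
    return sorted(record["files"])


# Hypothetical catalog with one published and one unpublished record.
catalog = [
    {
        "metadata": {"title": "Crop yield projections", "project": "ISI-MIP"},
        "files": {"yield/run1.nc": b"..."},
        "published": True,
    },
    {
        "metadata": {"title": "Crop yield draft", "project": "ISI-MIP"},
        "files": {"draft/run0.nc": b"..."},
        "published": False,
    },
]

my_laptop = {}
for hit in search(catalog, "crop isi-mip"):
    download(hit, my_laptop)
```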

Search Published Datasets

Discovering a Published Dataset

Download the Published Dataset

Select Download Destination

Globus Under the Covers
[Diagram labels: Identity, Group, Profile Management Services; Sharing Service; Transfer Service; Globus Toolkit; Globus APIs; Globus Connect]

Reliable, secure, high-performance file transfer and synchronization
- “Fire-and-forget” transfers
- Automatic fault recovery
- Seamless security integration
- Powerful GUI and APIs
[Diagram: Data Source to Data Destination]
1. User initiates transfer request
2. Globus moves and syncs files
3. Globus notifies user
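The three transfer steps, together with "automatic fault recovery" and synchronization, can be sketched as a loop that skips unchanged files, retries failed copies, and reports once at the end. This is a toy in-memory model, not how the Globus transfer service is implemented: `sync`, `make_flaky_copy`, the `max_attempts` parameter, and the use of SHA-256 checksums to detect unchanged files are all assumptions for illustration.

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def sync(source, dest, copy, max_attempts=3, notify=print):
    """Mirror source into dest (both map path -> bytes).

    copy(path, data, dest) performs one copy attempt and may raise IOError.
    """
    copied, skipped, failed = [], [], []
    for path, data in sorted(source.items()):
        # Sync: files already present with identical content are not re-sent.
        if path in dest and checksum(dest[path]) == checksum(data):
            skipped.append(path)
            continue
        # Fault recovery: retry each file up to max_attempts times.
        for _ in range(max_attempts):
            try:
                copy(path, data, dest)
                copied.append(path)
                break
            except IOError:
                continue
        else:
            failed.append(path)  # every attempt raised
    # Step 3: a single notification when the whole request is done.
    notify(
        f"transfer finished: {len(copied)} copied, "
        f"{len(skipped)} unchanged, {len(failed)} failed"
    )
    return copied, skipped, failed


def make_flaky_copy(fail_first_n):
    """Build a copy function whose first fail_first_n calls simulate a fault."""
    remaining = {"n": fail_first_n}

    def copy(path, data, dest):
        if remaining["n"] > 0:
            remaining["n"] -= 1
            raise IOError("simulated network fault")
        dest[path] = data

    return copy


source = {"a.nc": b"alpha", "b.nc": b"beta"}
destination = {"a.nc": b"alpha"}  # a.nc is already in sync
result = sync(source, destination, make_flaky_copy(2))
```

The caller submits the request once and only hears back at the end, which is the "fire-and-forget" behaviour the slide describes; in the example, the two simulated faults are absorbed by the retries.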

Simple, secure sharing off existing storage systems
- Easily share large data with any user or group
- No cloud storage required
[Diagram: Data Source]
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file
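The sharing steps describe access rules attached to data that stays on the owner's storage. The sketch below models that idea with a list of (principal, path prefix, permissions) rules checked on each access. It is a hypothetical illustration, not the Globus sharing service or its API: the `SharedStorage` class, the `"r"`/`"w"` permission letters, and prefix matching on paths are assumptions.

```python
class SharedStorage:
    """Toy model: files stay in place; only access rules are recorded."""

    def __init__(self, files, groups=None):
        self.files = files          # path -> bytes, never copied elsewhere
        self.groups = groups or {}  # group name -> set of user names
        self.rules = []             # (principal, path prefix, permissions)

    def share(self, principal, prefix, permissions="r"):
        # Step 1: the owner picks a path, a user or group, and permissions.
        self.rules.append((principal, prefix, permissions))

    def _allowed(self, user, path, needed):
        # Step 2: access is decided from the tracked rules alone.
        for principal, prefix, permissions in self.rules:
            matches_user = principal == user
            matches_group = user in self.groups.get(principal, set())
            if (matches_user or matches_group) and path.startswith(prefix):
                if needed in permissions:
                    return True
        return False

    def read(self, user, path):
        # Step 3: another user accesses the shared file in place.
        if not self._allowed(user, path, "r"):
            raise PermissionError(f"{user} may not read {path}")
        return self.files[path]

    def write(self, user, path, data):
        if not self._allowed(user, path, "w"):
            raise PermissionError(f"{user} may not write {path}")
        self.files[path] = data


# Hypothetical users and group, for illustration only.
storage = SharedStorage(
    {"/results/run1.nc": b"model output"},
    groups={"collaboration": {"user_b"}},
)
storage.share("collaboration", "/results/", "r")     # read-only for the group
storage.share("user_a", "/results/", "rw")           # one user may also write
data = storage.read("user_b", "/results/run1.nc")
```

Read-only rules for a group alongside write rules for selected users correspond to the "restricted sharing" and "restricted writing to archive" use cases listed earlier in the talk.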

Thank you
- Sign up and use Globus to transfer and share: globus.org/signup
- Sign up as an early adopter of publish: globus.org/data-publication
- Support

Thank you to our sponsors! U.S. DEPARTMENT OF ENERGY