Globus online Software-as-a-Service for Research Data Management Steve Tuecke Deputy Director, Computation Institute University of Chicago & Argonne National.

Slides:



Advertisements
Similar presentations
Globus Online Prepared by Bekir Güler. What is Globus Online  Globus online transfer is a hosted service (deployed on Amazon’s cloud infrastructure)
Advertisements

The Anatomy of the Grid: An Integrated View of Grid Architecture Carl Kesselman USC/Information Sciences Institute Ian Foster, Steve Tuecke Argonne National.
High Performance Computing Course Notes Grid Computing.
Collaboration on Large Datasets using Globus Rachana Ananthakrishnan University of Chicago.
Data Grids Darshan R. Kapadia Gregor von Laszewski
Experiences In Building Globus Genomics Using Galaxy, Globus Online and AWS Ravi K Madduri University of Chicago and ANL.
GridFTP Introduction – Page 1Grid Forum 5 GridFTP Steve Tuecke Argonne National Laboratory.
MTA SZTAKI Hungarian Academy of Sciences Grid Computing Course Porto, January Introduction to Grid portals Gergely Sipos
Ian Foster Computation Institute Argonne National Lab & University of Chicago Education in the Science 2.0 Era.
Office of Science U.S. Department of Energy Grids and Portals at NERSC Presented by Steve Chan.
Milos Kobliha Alejandro Cimadevilla Luis de Alba Parallel Computing Seminar GROUP 12.
Grid Services at NERSC Shreyas Cholia Open Software and Programming Group, NERSC NERSC User Group Meeting September 17, 2007.
SESSION 9 THE INTERNET AND THE NEW INFORMATION NEW INFORMATIONTECHNOLOGYINFRASTRUCTURE.
Accelerating data-intensive science by outsourcing the mundane Ian Foster.
Web-based Portal for Discovery, Retrieval and Visualization of Earth Science Datasets in Grid Environment Zhenping (Jane) Liu.
Internet GIS. A vast network connecting computers throughout the world Computers on the Internet are physically connected Computers on the Internet use.
Assessment of Core Services provided to USLHC by OSG.
Grid Toolkits Globus, Condor, BOINC, Xgrid Young Suk Moon.
Lessons from industry for science cyberinfrastructure Simplicity, scale, and sustainability via SaaS/PaaS.
CoG Kit Overview Gregor von Laszewski Keith Jackson.
GT Components. Globus Toolkit A “toolkit” of services and packages for creating the basic grid computing infrastructure Higher level tools added to this.
Cloud Computing in NASA Missions Dan Whorton CTO, Stinger Ghaffarian Technologies June 25, 2010 All material in RED will be updated.
BIRN Update Carl Kesselman Professor of Industrial and Systems Engineering Information Sciences Institute Fellow Viterbi School of Engineering University.
KM Technology Assessment “Knowledge and team collaboration servers” DSC8030/CIS8260 Dr. Samaddar Summer 2004 Jon A. Preston.
Topaz : A GridFTP extension to Firefox M. Taufer, R. Zamudio, D. Catarino, K. Bhatia, B. Stearn University of Texas at El Paso San Diego Supercomputer.
The Earth System Grid (ESG) Goals, Objectives and Strategies DOE SciDAC ESG Project Review Argonne National Laboratory, Illinois May 8-9, 2003.
Globus online Reliable, high-performance file transfer… made easy. XSEDE ECSS Symposium, Dec.12, 2011 Presenter: Steve Tuecke, Deputy Director Computation.
Reliable Data Movement using Globus GridFTP and RFT: New Developments in 2008 John Bresnahan Michael Link Raj Kettimuthu Argonne National Laboratory and.
Globus GridFTP and RFT: An Overview and New Features Raj Kettimuthu Argonne National Laboratory and The University of Chicago.
CYBERINFRASTRUCTURE FOR THE GEOSCIENCES Data Replication Service Sandeep Chandra GEON Systems Group San Diego Supercomputer Center.
Virtual Data Grid Architecture Ewa Deelman, Ian Foster, Carl Kesselman, Miron Livny.
Service - Oriented Middleware for Distributed Data Mining on the Grid ,劉妘鑏 Antonio C., Domenico T., and Paolo T. Journal of Parallel and Distributed.
Grid Architecture William E. Johnston Lawrence Berkeley National Lab and NASA Ames Research Center (These slides are available at grid.lbl.gov/~wej/Grids)
Holding slide prior to starting show. A Portlet Interface for Computational Electromagnetics on the Grid Maria Lin and David Walker Cardiff University.
The Earth System Grid (ESG) Computer Science and Technologies DOE SciDAC ESG Project Review Argonne National Laboratory, Illinois May 8-9, 2003.
LEGS: A WSRF Service to Estimate Latency between Arbitrary Hosts on the Internet R.Vijayprasanth 1, R. Kavithaa 2,3 and Raj Kettimuthu 2,3 1 Coimbatore.
1 GRID Based Federated Digital Library K. Maly, M. Zubair, V. Chilukamarri, and P. Kothari Department of Computer Science Old Dominion University February,
Cyberinfrastructure What is it? Russ Hobby Internet2 Joint Techs, 18 July 2007.
GridFTP GUI: An Easy and Efficient Way to Transfer Data in Grid
Data Transfers in the ALCF Robert Scott Technical Support Analyst Argonne Leadership Computing Facility.
CEOS Working Group on Information Systems and Services - 1 Data Services Task Team Discussions on GRID and GRIDftp Stuart Doescher, USGS WGISS-15 May 2003.
Cooperative experiments in VL-e: from scientific workflows to knowledge sharing Z.Zhao (1) V. Guevara( 1) A. Wibisono(1) A. Belloum(1) M. Bubak(1,2) B.
Cole David Ronnie Julio. Introduction Globus is A community of users and developers who collaborate on the use and development of open source software,
Introduction to Grids By: Fetahi Z. Wuhib [CSD2004-Team19]
Cyberinfrastructure Overview Russ Hobby, Internet2 ECSU CI Days 4 January 2008.
Cyberinfrastructure: Many Things to Many People Russ Hobby Program Manager Internet2.
Securing the Grid & other Middleware Challenges Ian Foster Mathematics and Computer Science Division Argonne National Laboratory and Department of Computer.
GraDS MacroGrid Carl Kesselman USC/Information Sciences Institute.
ALCF Argonne Leadership Computing Facility GridFTP Roadmap Bill Allcock (on behalf of the GridFTP team) Argonne National Laboratory.
Super Computing 2000 DOE SCIENCE ON THE GRID Storage Resource Management For the Earth Science Grid Scientific Data Management Research Group NERSC, LBNL.
Research data management using Globus ESIP Summer Meeting 2015 Rachana Ananthakrishnan University of Chicago
Data Manipulation with Globus Toolkit Ivan Ivanovski TU München,
Globus and ESGF Rachana Ananthakrishnan University of Chicago
Nanbor Wang, Balamurali Ananthan Tech-X Corporation Gerald Gieraltowski, Edward May, Alexandre Vaniachine Argonne National Laboratory 2. ARCHITECTURE GSIMF:
Globus.org/genomics Globus Galaxies Science Gateways as a Service Ravi K Madduri, University of Chicago and Argonne National Laboratory
An Architectural Approach to Managing Data in Transit Micah Beck Director & Associate Professor Logistical Computing and Internetworking Lab Computer Science.
Trusted by over 500 companies worldwide Founded in 1993 “Many of our quality managers say that since using a handheld, they simply wouldn’t consider doing.
Built on the Powerful Microsoft Azure Platform, Forensic Advantage Helps Public Safety and National Security Agencies Collect, Analyze, Report, and Distribute.
Globus online Delivering a scalable service Steve Tuecke Computation Institute University of Chicago and Argonne National Laboratory.
PARALLEL AND DISTRIBUTED PROGRAMMING MODELS U. Jhashuva 1 Asst. Prof Dept. of CSE om.
Simplifying Large-Scale Data Movement with Globus Steve Tuecke Deputy Director, Computation Institute University of Chicago & Argonne National Laboratory.
Globus online Cloud Based Services for Science Steve Tuecke Computation Institute University of Chicago and Argonne National Laboratory.
TOWARDS AN ARCHITECTURE FOR NATIONAL DATA SERVICES Ian Foster Director, Computation Institute Argonne National Laboratory & The University of
Parallel Computing Globus Toolkit – Grid Ayaka Ohira.
Computing Clusters, Grids and Clouds Globus data service
RDA US Science workshop Arlington VA, Aug 2014 Cees de Laat with many slides from Ed Seidel/Rob Pennington.
University of Chicago and ANL
Golubev Alexandr, MAGIC project 2016
Wonderware Online Cost-Effective SaaS Solution Powered by the Microsoft Azure Cloud Platform Delivers Industrial Insights to Users and OEMs MICROSOFT AZURE.
Study course: “Computing clusters, grids and clouds” Andrey Y. Shevel
Presentation transcript:

globus online Software-as-a-Service for Research Data Management Steve Tuecke Deputy Director, Computation Institute University of Chicago & Argonne National Laboratory

Big Science built on Globus Toolkit Earth System Grid Cancer Biologiy Informatics Grid LIGO data grid LHC

Thinking about “small and medium labs” Big projects like LHC, LIGO, ESG, etc., can run resource-level services reliably—and build and operate effective collective services Small labs and collaborations have problems with both They need solutions, not toolkits—ideally outsourced solutions Can we harness the power of the cloud to scale access to the grid? 3

Time-consuming tasks in science Run experiments Collect data Manage data Move data Acquire computers Analyze data Run simulations Compare experiment with simulation Search the literature Communicate with colleagues Publish papers Find, configure, install relevant software Find, access, analyze relevant data Order supplies Write proposals Write reports … 4

Most research performed in small laboratories Researchers are trained in their field, not in IT – They are not experts in collecting, moving, storing, indexing, analyzing, mining, sharing, updating, publishing, and archiving massive amounts of data Only limited capital is available for them to spend on data and IT support Investment is spent on traditional research tools (e.g., microscopes)—but the world is changing – Now need substantial and sophisticated IT to perform research, data manipulation, data mining, collaboration 5 Researchers lack advanced IT infrastructure

Globus ToolkitGlobus Online Use the Grid Reliable file transfer Software-as-a-Service globusonline.org Build the Grid Components for building custom grid solutions globustoolkit.org 6

Time-consuming tasks in science Run experiments Collect data Manage data Move data Acquire computers Analyze data Run simulations Compare experiment with simulation Search the literature Communicate with colleagues Publish papers Find, configure, install relevant software Find, access, analyze relevant data Order supplies Write proposals Write reports … 7

A A B B Discover endpoints, determine available protocols, negotiate firewalls, configure software, manage space, determine required credentials, configure protocols, detect and respond to failures, determine expected performance, determine actual performance, identify diagnose and correct network misconfigurations, integrate with file systems, … Starting with data movement 8

Globus Online In Action Terabytes 31,000 files 56h 44m No human involvement Astrophysics simulation data generated in Tennessee, moved to Illinois for visualization (Enzo, UCSD; Futures Lab, Argonne)

Globus Online highlights 10 Fire-and-forget data movement Many files and lots of data Third-party transfers Performance optimization Across multiple security domains Expert operations and support Web interface Command line interface ls alcf#dtn:~ scp alcf#dtn:~/myfile \ nersc#dtn:~/myfile HTTP REST interface POST globusonline.org/ v0.10/ transfer GridFTP servers FTP servers High-performance data transfer nodes Globus Connect on local computers

Globus Online architecture 11

Globus Connect to/from your laptop 12

What’s next? Run experiments Collect data Manage data Move data Acquire computers Analyze data Run simulations Compare experiment with simulation Search the literature Communicate with colleagues Publish papers Find, configure, install relevant software Find, access, analyze relevant data Order supplies Write proposals Write reports … 13

Our goal: To leverage software-as-a-service (SaaS) to accelerate the pace of discovery and innovation worldwide, by providing millions of researchers with unprecedented access to powerful research tools “Civilization advances by extending the number of important operations which we can perform without thinking of them” Alfred North Whitehead, 1911 Looking to the future 14

15