Grid Computing
Anda Iamnitchi
Federated Distributed Systems, Fall ‘06
Including slides adapted from presentations by Ian Foster, Lee Liming, Paul Jeffreys.


Front page FT, 7th March 2000

But…

What is the Grid?
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations” (“The Anatomy of the Grid”, Foster, Kesselman, Tuecke, 2001)
“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances” (George Gilder)

Motivation (1): Revolution in Science
Pre-Internet
  –Theorize &/or experiment, alone or in small teams; publish paper
Post-Internet
  –Construct and mine large databases of observational or simulation data
  –Develop simulations & analyses
  –Access specialized devices remotely
  –Exchange information within distributed multidisciplinary teams

Motivation (2): Revolution in Business
Pre-Internet
  –Central data processing facility
Post-Internet
  –Enterprise computing is highly distributed, heterogeneous, inter-enterprise (B2B)
  –Business processes increasingly computing- & data-rich
  –Outsourcing becomes feasible => service providers of various sorts

The (Power) Grid: On-Demand Access to Electricity
[Figure; axis labels: Time; Quality, economies of scale]

By Analogy, A Computing Grid
Decouple production and consumption
  –Enable on-demand access
  –Achieve economies of scale
  –Enhance consumer flexibility
  –Enable new devices

Not Exactly a New Idea …
“The time-sharing computer system can unite a group of investigators …. one can conceive of such a facility as an … intellectual public utility.”
  –Fernando Corbato and Robert Fano, 1966
“We will perhaps see the spread of ‘computer utilities’, which, like present electric and telephone utilities, will service individual homes and offices across the country.”
  –Len Kleinrock, 1967

But Things are Different Now …

Computing isn’t Really Like Electricity
I import electricity but must export data
“Computing” is not interchangeable but highly heterogeneous: data, sensors, services, …
This complicates things; but also means that the sum can be greater than the parts
  –Real opportunity: Construct new capabilities dynamically from distributed services
Raises fundamental questions
  –Achieving economies of scale
  –Quality of service across distributed services
  –Applications that exploit synergies

How Can We Tell Hype from Facts?
Everyday problem, isn’t it?
Learn/verify the facts
Know the context
  –Multi-institutional (== federated). Thus, a cluster? (Sun Grid Engine!!!)
  –Dynamic (somewhat)
Look at results
  –Research innovation (in computer and computational science)
  –Scientific discovery
  –Existing/deployed grids

P2P and Grids: Resource Sharing Across Administrative Domains
“We must address scale & failure”
“We need infrastructure”
Source: “On Death, Taxes and the Convergence of P2P and Grids”, Foster, Iamnitchi 2003

Compare & Contrast (1): Definitions
Grid:
  –“Infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities” (1998)
  –“A system that coordinates resources not subject to centralized control, using open, general-purpose protocols to deliver nontrivial QoS” (2002)
P2P:
  –“Applications that takes advantage of resources at the edges of the Internet” (2000)
  –“Decentralized, self-organizing distributed systems, in which all or most communication is symmetric” (2002)

Compare and Contrast (2): Details of Deployed Systems
  –Target communities and incentives
  –Resources engaged
  –Applications
  –Scale and failure
  –Services and infrastructure

Target Communities & Incentives
Grid
Established communities
  –Science, some industry
  –Homogeneous
  –Restricted participation
Good behavior:
  –Implicit incentives
  –Means to enforce it
Consequences:
  –Trust
  –Well-defined “tax base”
  –Less flexibility?
P2P
Anonymous individuals
No implicit incentives for good behavior
Consequences:
  –No trust
  –Free riding
  –Implicit incentives for cheating: music sharing

Resources
Grid
More diverse (in type):
  –Files, storage, computing power, network, instruments
More powerful
Good availability
Well connected
Technical support
Consequence: Costly resource integration
P2P
Computing cycles XOR files
Less powerful
Intermittent participation
  –Gnutella: avg. lifetime 1h (‘01)
  –MojoNation: 1/6 users always on
  –Overnet: 50% nodes available 70% of time over a week (‘02)
Variably connected
Some technical support as community effort
Consequence: Ease of integration of new resources an early priority

Applications
Grid
Often complex & involving various combinations of
  –Data manipulation
  –Computation
  –Tele-instrumentation
Wide range of computational models, e.g.
  –Embarrassingly parallel
  –Tightly coupled
  –Workflow
Consequences:
  –Complexity often inherent in the application itself
  –(Inevitably?) Complex infrastructure to support applications
P2P
Some
  –File sharing
  –Number crunching
  –Content distribution
  –Measurements
“Toy” applications only?
  –Albeit very popular “toys”!
Consequence:
  –Complexity often derives from scale

Scale and Failure
Grid
Moderate number of entities
  –100s institutions, 1000s users
Large amounts of activity
  –4.5 TB/day (D0 experiment)
Approaches to failure reflect assumptions
  –E.g., centralized components
P2P
Large numbers of entities:
  –Millions of users
Moderate activity
  –E.g., 1-2 TB in Gnutella (’01)
Diverse approaches to failure
  –Some centralized (SETI, …)
  –Some highly self-configuring
Counts by network (January 25, 2004):
  FastTrack      3,488,719
  eDonkey        1,661,132
  iMesh          1,211,965
  Overnet        1,146,880
  MP2P             250,927
  Gnutella         219,009
  DirectConnect    204,237
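To give a feel for the "4.5 TB/day" figure quoted for the D0 experiment, the daily volume can be converted to an average sustained network rate. This is a minimal sketch, assuming decimal terabytes (10^12 bytes) and a load spread evenly over 24 hours; neither assumption comes from the slides, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mbit_per_s(terabytes_per_day, bytes_per_tb=10**12):
    """Average rate in megabits per second needed to move a daily data volume.

    Assumes decimal terabytes by default and a perfectly even load over the day.
    """
    bits_per_day = terabytes_per_day * bytes_per_tb * 8
    return bits_per_day / SECONDS_PER_DAY / 10**6


if __name__ == "__main__":
    # 4.5 TB/day is the D0 figure quoted on the slide.
    print(round(sustained_rate_mbit_per_s(4.5), 1))  # prints 416.7
```

Under these assumptions the average works out to a little over 400 Mbit/s; real traffic is bursty, so peak rates would be higher.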

Grids for Physics: LHC Computing Grid

Services and Infrastructure
Grid
Standard protocols (Global Grid Forum, etc.)
De facto standard software (open source Globus Toolkit)
Shared infrastructure (authentication, discovery, resource access, etc.)
Consequences:
  –Reusable services
  –Large developer & user communities
  –Interoperability & code reuse
P2P
Each application defines & deploys completely independent “infrastructure”
JXTA, BOINC, XtremWeb?
Efforts started to define common APIs, albeit with limited scope to date
Consequences:
  –New (albeit simple) install per application
  –Interoperability & code reuse not achieved

Convergent Environment: Large, Dynamic, Self-Configuring Grids
[Figure; axes: Scale & volatility; Functionality & infrastructure. Regions: Grids; P2P]
  –Large scale
  –Weaker trust assumptions
  –Ease of integration
  –No centralized authority
  –Intermittent resource/user participation
  –Diversity in: shared resources, sharing characteristics
  –Variable technical support
  –Infrastructure (sharable services)
  –Support for diverse applications

Existing Technologies are Helpful, but Not Complete Solutions
Peer-to-peer technologies
  –Limited scope and mechanisms
Enterprise-level distributed computing
  –Limited cross-organizational support
Databases
  –Vertically integrated solutions
Web services
  –Not dynamic
Semantic web
  –Limited focus

What’s Missing is Support for …
Sharing & integration of resources, via
  –Discovery
  –Provisioning
  –Access (computation, data, …)
  –Security
  –Policy
  –Fault tolerance
  –Management
In dynamic, scalable, multi-organizational settings

Building the Grid
Open source software
  –Globus Toolkit ®, UK OGSA DAI, Condor, …
Open standards
  –OGSA, other GGF, IETF, W3C standards, …
Open communities
  –Global Grid Forum, Globus International, collaborative projects, …
Open infrastructure
  –UK eScience, NSF Cyberinfrastructure, StarLight, AP-Grid, …

Globus Toolkit ® History
[Timeline figure; annotated events in order:]
  –DARPA, NSF, and DOE begin funding Grid work
  –NASA begins funding Grid work, DOE adds support
  –The Grid: Blueprint for a New Computing Infrastructure published
  –GT Released
  –Early Application Successes Reported
  –NSF & European Commission Initiate Many New Grid Projects
  –Anatomy of the Grid Paper Released
  –Significant Commercial Interest in Grids
  –Physiology of the Grid Paper Released
  –GT 2.0 Released
Does not include downloads from: NMI, UK eScience, EU Datagrid, IBM, Platform, etc.

How It Started
While helping to build and integrate a diverse range of distributed applications, we kept running into the same problems over and over again.
  –Too hard to keep track of authentication data (ID/password) across institutions
  –Too hard to monitor system and application status across institutions
  –Too many ways to submit jobs
  –Too many ways to store & access files and data
  –Too many ways to keep track of data
  –Too easy to leave “dangling” resources lying around (robustness)

Forget Homogeneity!
Trying to force homogeneity on users is futile. Everyone has their own preferences, sometimes even dogma.
The Internet provides the model…

What Does the Globus Toolkit Cover?
[Figure; labels: Goal; Today]

Theory -> Practice

building a grid (in practice)

Methodology
Building a Grid system or application is currently an exercise in software integration.
  –Define user requirements
  –Derive system requirements or features
  –Survey existing components
  –Identify useful components
  –Develop components to fit into the gaps
  –Integrate the system
  –Deploy and test the system
  –Maintain the system during its operation
This should be done iteratively, with many loops and eddies in the flow.

How it Really Happens
[Architecture diagram]
Layers:
  –Users work with client applications
  –Application services organize VOs & enable access to other services
  –Collective services aggregate &/or virtualize resources
  –Resources implement standard access & management interfaces
Components shown: Web Browser, Compute Server (x2), Data Catalog, Data Viewer Tool, Certificate authority, Chat Tool, Credential Repository, Web Portal, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Registration Service

How it Really Happens (without Globus)
[Architecture diagram]
Layers:
  –Users work with client applications
  –Application services organize VOs & enable access to other services
  –Collective services aggregate &/or virtualize resources
  –Resources implement standard access & management interfaces
Components shown: Web Browser, Compute Server (x2), Data Catalog, Data Viewer Tool, Certificate authority, Chat Tool, Credential Repository, Web Portal, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Registration Service
Component sources:
  Application Developer  10
  Off the Shelf          12
  Globus Toolkit          0
  Grid Community          0

How it Really Happens (with Globus)
[Architecture diagram]
Layers:
  –Users work with client applications
  –Application services organize VOs & enable access to other services
  –Collective services aggregate &/or virtualize resources
  –Resources implement standard access & management interfaces
Components shown: Web Browser, Compute Server (x2), Globus MCS/RLS, Data Viewer Tool, Certificate Authority, CHEF Chat Teamlet, MyProxy, CHEF, Database service (x3), Simulation Tool, Camera, Telepresence Monitor, Globus Index Service, Globus GRAM, Globus DAI
Component sources:
  Application Developer   2
  Off the Shelf           9
  Globus Toolkit          4
  Grid Community          4
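The component counts on the "without Globus" and "with Globus" slides can be compared with a little arithmetic. The following is a minimal sketch, assuming each figure counts the components of the example system by who supplies them; that reading, and the dictionary and function names, are illustrative and not stated on the slides.

```python
# Counts copied from the two slides. Reading them as "components supplied
# by each source" is an assumption made for this illustration.
WITHOUT_GLOBUS = {
    "Application Developer": 10,
    "Off the Shelf": 12,
    "Globus Toolkit": 0,
    "Grid Community": 0,
}
WITH_GLOBUS = {
    "Application Developer": 2,
    "Off the Shelf": 9,
    "Globus Toolkit": 4,
    "Grid Community": 4,
}


def developer_share(counts):
    """Fraction of all components that the application developer must build."""
    total = sum(counts.values())
    return counts["Application Developer"] / total


if __name__ == "__main__":
    for label, counts in (("without Globus", WITHOUT_GLOBUS),
                          ("with Globus", WITH_GLOBUS)):
        total = sum(counts.values())
        share = developer_share(counts)
        print(f"{label}: {total} components, {share:.1%} developer-built")
```

Under this reading, the developer-built portion drops from 10 of 22 components to 2 of 19, with the difference taken up by Globus Toolkit and Grid community components.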

What Is the Globus Toolkit?
The Globus Toolkit is a collection of solutions to problems that frequently come up when trying to build collaborative distributed applications.
Not turnkey solutions, but building blocks and tools for application developers and system integrators.
  –Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance.
To date (v1.0 - v4.0), the Toolkit has focused on simplifying heterogeneity for application developers.
The goal has been to capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF).
  –The Toolkit also includes reference implementations of new/proposed standards in these organizations.

How To Use the Globus Toolkit
By itself, the Toolkit has surprisingly limited end user value.
  –There’s very little user interface material there.
  –You can’t just give it to end users (scientists, engineers, marketing specialists) and tell them to do something useful!
The Globus Toolkit is useful to application developers and system integrators.
  –You’ll need to have a specific application or system in mind.
  –You’ll need to have the right expertise.
  –You’ll need to set up prerequisite hardware/software.
  –You’ll need to have a plan.

Globus Toolkit Components
[Component diagram. The figure distinguishes Web Services components from non-WS components and labels components by toolkit generation (GT2, GT3, GT4).]
Security
  –Pre-WS Authentication Authorization
  –WS Authentication Authorization
  –Community Authorization Service
  –Credential Management
  –Delegation Service
Data Management
  –GridFTP
  –Reliable File Transfer
  –OGSA-DAI [Tech Preview]
  –Replica Location Service
Execution Management
  –Grid Resource Allocation Mgmt (Pre-WS GRAM)
  –Grid Resource Allocation Mgmt (WS GRAM)
  –Community Scheduler Framework [contribution]
Information Services
  –Monitoring & Discovery System (MDS2)
  –Monitoring & Discovery System (MDS4)
Common Runtime
  –C Common Libraries
  –XIO
  –Java WS Core
  –Python WS Core [contribution]
  –C WS Core

The Emergence of Open Grid Standards
[Figure; axis label: Increased functionality, standardization; timeline ending at 2010. Labels in the figure:]
  –Custom solutions
  –Internet standards
  –Globus Toolkit: de facto standard, single implementation
  –Web services, etc.
  –Open Grid Services Arch: real standards, multiple implementations
  –Managed shared virtual systems: computer science research

Grid Communities
Global Grid Forum
  –Standards, information exchange, advocacy
  –1000+ participants in tri-annual meetings
Application communities
  –E.g., physics, earthquake engineering, biomedical, etc.
Software development and support
  –NSF Middleware Initiative, UK eScience, Globus Toolkit, EGEE, …

Grid Communities & Technologies
Yesterday
  –Small, static communities, primarily in science
  –Focus on sharing of computing resources
  –Globus Toolkit as technology base
Today
  –Larger communities in science; early industry
  –Focused on sharing of data and computing
  –Open Grid Services Architecture
Tomorrow
  –Large, dynamic, diverse communities that share a wide variety of services, resources, data
  –Challenging computer science research issues

Grid Dynamics: Vision vs. Reality
Vision: On-demand access to computing
  –New communities form easily
  –On-demand resources from providers
  –Adapt easily to new missions, requirements
Reality: Much manual configuration, e.g.:
  –Manually deployed services on dedicated hardware
  –Manually maintained access control lists
  –Sysadmin-maintained allocation policies
  –Human-mediated resource reservation

Reading Sources
  –fp.mcs.anl.gov/~foster/talks.htm
  –The Grid Book
  –(other links on the course page)