The Grid: Globus and the Open Grid Services Architecture Dr. Carl Kesselman Director Center for Grid Technologies Information Sciences Institute University of Southern California

Outline
- Why Grids
- Grid Technology
- Applications of Grids in Physics
- Summary

Grid Computing

How do we solve problems?
- Communities committed to common goals
  - Virtual organizations
- Teams with heterogeneous members & capabilities
- Distributed geographically and politically
  - No single location or organization possesses all required skills and resources
- Adapt as a function of the situation
  - Adjust membership, reallocate responsibilities, renegotiate resources

The Grid Vision
“Resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations”
- On-demand, ubiquitous access to computing, data, and services
- New capabilities constructed dynamically and transparently from distributed services
“When the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special-purpose appliances.” (George Gilder)

The Grid Opportunity: eScience and eBusiness
- Physicists worldwide pool resources for peta-op analyses of petabytes of data
- Civil engineers collaborate to design, execute, & analyze shake-table experiments
- An insurance company mines data from partner hospitals for fraud detection
- An application service provider offloads excess load to a compute cycle provider
- An enterprise configures internal & external resources to support eBusiness workloads

Grid Communities & Applications: Data Grids for High Energy Physics
[Diagram: LHC data-grid hierarchy. The online system feeds a ~20 TIPS offline processor farm at the CERN computer centre (Tier 0); Tier 1 regional centres include FermiLab (~4 TIPS) and centres in France, Italy, and Germany; Tier 2 centres (~1 TIPS each, e.g. Caltech) feed institute servers (~0.25 TIPS) and physicist workstations. Links range from ~PBytes/sec out of the detector through ~622 Mbits/sec (or air freight, deprecated) down to ~100 MBytes/sec and ~1 MByte/sec. 1 TIPS is approximately 25,000 SpecInt95 equivalents.]
- There is a “bunch crossing” every 25 nsecs
- There are 100 “triggers” per second
- Each triggered event is ~1 MByte in size
- Physicists work on analysis “channels”; each institute will have ~10 physicists working on one or more channels, and data for these channels should be cached by the institute server
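As a quick sanity check on these figures (a worked example added here, not part of the original slide), the archived data rate follows directly from the trigger rate and event size:

```python
# Back-of-the-envelope check of the LHC figures quoted above.
BUNCH_CROSSING_S = 25e-9   # one bunch crossing every 25 nsecs
TRIGGERS_PER_S = 100       # events passing the trigger each second
EVENT_SIZE_MB = 1.0        # ~1 MByte per triggered event

crossings_per_s = 1 / BUNCH_CROSSING_S          # 4e7 crossings/sec
archive_rate = TRIGGERS_PER_S * EVENT_SIZE_MB   # ~100 MBytes/sec to Tier 0

print(f"{crossings_per_s:.0e} crossings/s, {archive_rate:.0f} MB/s archived")
# At ~100 MB/s sustained, a year of running is on the order of petabytes.
```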

Grid Communities and Applications: Network for Earthquake Engineering Simulation
- NEESgrid: US national infrastructure to couple earthquake engineers with experimental facilities, databases, computers, & each other
- On-demand access to experiments, data streams, computing, archives, collaboration
NEESgrid partners: Argonne, Michigan, NCSA, UIUC, USC

Living in an Exponential World (1): Computing & Sensors
- Moore’s Law: transistor count doubles every 18 months
[Figure: magnetohydrodynamics simulation of star formation]

Living in an Exponential World (2): Storage
- Storage density doubles every 12 months
- Dramatic growth in online data (1 petabyte = 1,000 terabytes = 1,000,000 gigabytes)
  - 2000: ~0.5 petabyte
  - 2005: ~10 petabytes
  - 2010: ~100 petabytes
  - 2015: ~1,000 petabytes?
- Transforming entire disciplines in the physical and, increasingly, biological sciences; humanities next?

Living in an Exponential World (3): Networks (Or, Coefficients Matter …)
- Network vs. computer performance
  - Computer speed doubles every 18 months
  - Network speed doubles every 9 months
  - Difference = order of magnitude per 5 years
- 1986 to 2000:
  - Computers: x 500
  - Networks: x 340,000
- 2001 to 2010:
  - Computers: x 60
  - Networks: x 4,000
[Figure: Moore’s Law vs. storage improvements vs. optical improvements. Graph from Scientific American (Jan 2001) by Cleo Vilett, source Vinod Khosla, Kleiner Perkins Caufield & Byers.]
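The quoted factors follow from the doubling times; a few lines of Python (added for illustration) reproduce them to within the slide's rounding:

```python
def growth(years: float, doubling_months: float) -> float:
    """Growth factor after `years`, given a fixed doubling period."""
    return 2 ** (years * 12 / doubling_months)

# 1986 to 2000 (14 years): computers double every 18 months, networks every 9
print(f"computers x{growth(14, 18):,.0f}")  # x 645     (slide: x 500)
print(f"networks  x{growth(14, 9):,.0f}")   # x 416,128 (slide: x 340,000)

# 2001 to 2010 (9 years)
print(f"computers x{growth(9, 18):,.0f}")   # x 64      (slide: x 60)
print(f"networks  x{growth(9, 9):,.0f}")    # x 4,096   (slide: x 4,000)
```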

Requirements Include …
- Dynamic formation and management of virtual organizations
- Online negotiation of access to services: who, what, why, when, how
- Establishment of applications and systems able to deliver multiple qualities of service
- Autonomic management of infrastructure elements
- Open, extensible, evolvable infrastructure

The Grid World: Current Status
- Dozens of major Grid projects in scientific & technical computing, research, and education
- Considerable consensus on key concepts and technologies
  - Open source Globus Toolkit™ a de facto standard for major protocols & services
  - Far from complete or perfect, but out there, evolving rapidly, with a large tool and user base
- Industrial interest emerging rapidly
- Opportunity: convergence of eScience and eBusiness requirements & technologies

Globus Toolkit
- Globus Toolkit is the source of many of the protocols described in “Grid architecture”
- Adopted by almost all major Grid projects worldwide as a source of infrastructure
- Open source, open architecture framework encourages community development
- Active R&D program continues to move technology forward
- Developers at ANL, USC/ISI, NCSA, LBNL, and other institutions

The Globus Toolkit in One Slide
- Grid protocols (GSI, GRAM, …) enable resource sharing within virtual organizations; the toolkit provides the reference implementation of these protocols
- Protocols (and APIs) enable other tools and services for membership, discovery, data management, workflow, …
[Diagram: the user authenticates via GSI (Grid Security Infrastructure) and creates a proxy credential; the proxy makes reliable remote invocations to a GRAM (Grid Resource Allocation & Management) gatekeeper, a factory that creates user processes and registers them with a reporter (registry + discovery); created processes may hold their own proxies and issue further GSI-authenticated requests to other services (e.g. GridFTP); MDS-2 (Meta Directory Service) provides soft-state registration and enquiry via a GIIS (Grid Information Index Server).]
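The interaction in the diagram can be paraphrased in code. The following Python sketch is purely illustrative: the class and method names are invented here and are not the Globus Toolkit API.

```python
class Registry:
    """MDS/GIIS-style discovery service with soft-state registration."""
    def __init__(self):
        self.entries = set()
    def register(self, name: str):
        self.entries.add(name)   # entries expire unless refreshed (soft state)

class ProxyCredential:
    """Short-lived credential delegated from the user's identity (GSI)."""
    def __init__(self, user_dn: str, lifetime_h: int = 12):
        self.user_dn, self.lifetime_h = user_dn, lifetime_h

class Gatekeeper:
    """GRAM 'factory': authenticates a proxy, then creates user processes."""
    def __init__(self, registry: Registry):
        self.registry = registry

    def submit(self, proxy: ProxyCredential, job: dict) -> str:
        # GSI mutual authentication of the proxy would happen here
        handle = f"{proxy.user_dn}/{job['executable']}"
        self.registry.register(handle)   # reporter: registry + discovery
        return handle                    # handle for reliable remote invocation

# Usage: authenticate once, create a proxy, then services act on your behalf
proxy = ProxyCredential("/O=Grid/CN=Some User")
print(Gatekeeper(Registry()).submit(proxy, {"executable": "analysis"}))
```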

Globus Toolkit: Evaluation (+)
- Good technical solutions for key problems, e.g.:
  - Authentication and authorization
  - Resource discovery and monitoring
  - Reliable remote service invocation
  - High-performance remote data access
- This, plus good engineering, is enabling progress:
  - Good quality reference implementation, multi-language support, interfaces to many systems, large user base, industrial support
  - Growing community code base built on the tools

Globus Toolkit: Evaluation (-)
- Protocol deficiencies, e.g.:
  - Heterogeneous basis: HTTP, LDAP, FTP
  - No standard means of invocation, notification, error propagation, authorization, termination, …
- Significant missing functionality, e.g.:
  - Databases, sensors, instruments, workflow, …
  - Virtualization of end systems (hosting environments)
- Little work on total system properties, e.g.:
  - Dependability, end-to-end QoS, …
  - Reasoning about system properties

“Web Services”
- Increasingly popular standards-based framework for accessing network applications
  - W3C standardization; Microsoft, IBM, Sun, others
- WSDL: Web Services Description Language
  - Interface definition language for Web services
- SOAP: Simple Object Access Protocol
  - XML-based RPC protocol; common WSDL target
- WS-Inspection
  - Conventions for locating service descriptions
- UDDI: Universal Description, Discovery, & Integration
  - Directory for Web services

Web Services Example: Database Service
- WSDL definition for a “DBaccess” portType defines operations and bindings, e.g.:
  - Query(QueryLanguage, Query, Result)
  - SOAP protocol binding
- C, Java, Python, etc. client APIs can then be generated
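To make the “generated client” point concrete, here is a hedged sketch using the real Python SOAP library `zeep`; the DBaccess WSDL URL and the exact operation signature are hypothetical, following the slide:

```python
from zeep import Client  # pip install zeep; reads WSDL, generates operations

# Hypothetical endpoint exposing the DBaccess portType described above
client = Client("http://example.org/DBaccess?wsdl")

# Invoke the Query operation over the SOAP binding; parameter names
# follow the slide's Query(QueryLanguage, Query, Result) signature
result = client.service.Query(QueryLanguage="SQL",
                              Query="SELECT * FROM events WHERE run = 42")
print(result)
```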

Transient Service Instances
- “Web services” address discovery & invocation of persistent services
  - Interface to the persistent state of an entire enterprise
- In Grids, we must also support transient service instances, created and destroyed dynamically (see the factory sketch below)
  - Interfaces to the states of distributed activities
  - E.g. workflow, video conferencing, distributed data analysis
- Significant implications for how services are managed, named, discovered, and used
  - In fact, much of our work is concerned with the management of service instances
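A minimal sketch of the factory pattern this implies, with invented names (this is not an OGSA interface, just the shape of the idea): a persistent factory creates transient instances whose lifetimes are soft state.

```python
import time, uuid

class TransientInstance:
    def __init__(self, termination: float):
        self.handle = str(uuid.uuid4())   # unique name, usable for discovery
        self.termination = termination    # soft-state lifetime

class FactoryService:
    """Persistent service that creates and reaps transient instances."""
    def __init__(self):
        self.instances: dict[str, TransientInstance] = {}

    def create(self, lifetime_s: float) -> str:
        inst = TransientInstance(time.time() + lifetime_s)
        self.instances[inst.handle] = inst
        return inst.handle                # client must refresh before expiry

    def reap(self) -> None:
        """Destroy instances whose clients stopped refreshing them."""
        now = time.time()
        self.instances = {h: i for h, i in self.instances.items()
                          if i.termination > now}
```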

OGSA Design Principles
- Service orientation to virtualize resources
  - Everything is a service
- From Web services:
  - Standard interface definition mechanisms: multiple protocol bindings, local/remote transparency
- From Grids:
  - Service semantics, reliability and security models
  - Lifecycle management, discovery, other services
- Multiple “hosting environments”
  - C, J2EE, .NET, …

OGSA Service Model
- A system comprises a (typically few) persistent services and (potentially many) transient services
  - Everything is a service
- OGSA defines the basic behaviors of services: fundamental semantics, life-cycle, etc.
  - More than defining WSDL wrappers

Open Grid Services Architecture: Fundamental Structure
- WSDL conventions and extensions for describing and structuring services
  - Useful independent of “Grid” computing
- Standard WSDL interfaces & behaviors for core service activities
  - portTypes and operations => protocols

The Grid Service = Interfaces + Service Data
[Diagram: a service implementation in a hosting environment/runtime (“C”, J2EE, .NET, …) exposes the GridService portType and other interfaces, plus a set of named service data elements. Capabilities shown: service data access, explicit destruction, soft-state lifetime, notification, authorization, service creation, service registry, manageability, concurrency, reliable invocation, authentication.]
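Rendered as code, the portType might look like the sketch below. Method names mirror the capability list on the slide (service data access, soft-state lifetime, explicit destruction); this is an illustration, not a concrete OGSA binding.

```python
import time

class GridService:
    """Illustrative Grid service: interfaces + named service data elements."""
    def __init__(self, lifetime_s: float):
        self._service_data = {}                     # service data elements
        self._termination = time.time() + lifetime_s

    # --- service data access ---
    def find_service_data(self, name: str):
        return self._service_data.get(name)        # queryable by registries

    def set_service_data(self, name: str, value) -> None:
        self._service_data[name] = value

    # --- soft-state lifetime & explicit destruction ---
    def set_termination_time(self, extend_s: float) -> None:
        self._termination = time.time() + extend_s  # client keepalive

    def destroy(self) -> None:
        self._service_data.clear()                  # explicit destruction
```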

The GriPhyN Project
- Amplify science productivity through the Grid
  - Provide powerful abstractions for scientists: datasets and transformations, not files and programs
  - Using a grid is currently harder than using a workstation; GriPhyN seeks to reverse this situation!
- These goals challenge the boundaries of computer science in knowledge representation and distributed computing
- Apply these advances to major experiments
  - Not just developing solutions, but proving them through deployment

GriPhyN Approach
- Virtual data
  - Tracking the derivation of experiment data with high fidelity
  - Transparency with respect to location and materialization
- Automated grid request planning
  - Advanced, policy-driven scheduling
- Achieve this at peta-scale magnitude
- We present here a vision that is still three years away, but the foundation is starting to come together

Virtual Data
- Track all data assets
- Accurately record how they were derived
- Encapsulate the transformations that produce new data objects
- Interact with the grid in terms of requests for data derivations
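A toy rendering of these four bullets (names invented for illustration): a catalog of transformation and derivation records, from which a request for a dataset becomes a plan to rematerialize it.

```python
from dataclasses import dataclass, field

@dataclass
class Transformation:            # a registered program or recipe
    name: str
    executable: str

@dataclass
class Derivation:                # how a data product was (or can be) made
    output: str
    transformation: Transformation
    inputs: list = field(default_factory=list)

catalog: dict[str, Derivation] = {}   # provenance, keyed by product name

def plan(dataset: str, steps: list) -> None:
    """Depth-first: (re)derive inputs before the requested dataset."""
    d = catalog[dataset]
    for i in d.inputs:
        if i in catalog:              # raw inputs have no derivation record
            plan(i, steps)
    steps.append(f"{d.transformation.executable} -> {d.output}")

catalog["reco.db"] = Derivation(
    "reco.db", Transformation("reconstruct", "cmsReco"), inputs=["raw.dat"])
steps: list = []
plan("reco.db", steps)                # ['cmsReco -> reco.db']
```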

GriPhyN/PPDG Data Grid Architecture
[Diagram: an application submits an abstract DAG to a planner, which consults catalog services (MCAT; GriPhyN catalogs), information services (MDS), monitoring (MDS), replica management (GDMP), and policy/security (GSI, CAS) to produce a concrete DAG; an executor (DAGMan, Kangaroo) then drives reliable transfer services (GridFTP; SRM) and compute/storage resources via GRAM. Globus components underpin the stack.]
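The abstract-to-concrete DAG step in this diagram can be sketched in a few lines; the site names and binding rule below are invented for illustration, standing in for the planner's policy-driven decisions.

```python
abstract_dag = {                 # logical step -> logical dependencies
    "simulate":    [],
    "reconstruct": ["simulate"],
    "analyze":     ["reconstruct"],
}

def make_concrete(dag: dict, site_for: dict) -> list:
    """Topologically order the steps and bind each one to a site."""
    done, order = set(), []
    def visit(step):
        for dep in dag[step]:
            if dep not in done:
                visit(dep)
        done.add(step)
        order.append((step, site_for[step]))
    for step in dag:
        if step not in done:
            visit(step)
    return order                 # something a DAGMan-style executor can run

print(make_concrete(abstract_dag, {"simulate": "wisconsin-condor",
                                   "reconstruct": "ncsa-cluster",
                                   "analyze": "caltech-workstation"}))
```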

Virtual Data in CMS
[Figure: the long-term virtual-data vision of CMS; see CMS Note 2001/047, GriPhyN.]

GriPhyN Challenge Problem: CMS Event Reconstruction
1) Master Condor job running at Caltech (Caltech workstation)
2) Launch secondary job on the Wisconsin (WI) pool; input files via Globus GASS
3) 100 Monte Carlo jobs run on the Wisconsin Condor pool
4) 100 data files, ~1 GB each, transferred via GridFTP
5) Secondary job reports complete to master
6) Master starts reconstruction jobs via the Globus jobmanager on the NCSA Linux cluster
7) GridFTP fetches data from NCSA UniTree (a GridFTP-enabled FTP server)
8) Processed Objectivity database stored to UniTree
9) Reconstruction job reports complete to master
Work of: Scott Koranda, Miron Livny, Vladimir Litvin, & others

GriPhyN-LIGO SC2001 Demo
Work of: Ewa Deelman, Gaurang Mehta, Scott Koranda, & others

iVDGL: A Global Grid Laboratory
“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.” (From the NSF proposal, 2001)
- International Virtual-Data Grid Laboratory
  - A global Grid laboratory (US, Europe, Asia, South America, …)
  - A place to conduct Data Grid tests “at scale”
  - A mechanism to create common Grid infrastructure
  - A laboratory for other disciplines to perform Data Grid tests
  - A focus of outreach efforts to small institutions
- U.S. part funded by NSF: $13.7M (NSF) + $2M (matching)

iVDGL Components
- Computing resources
  - 2 Tier1 laboratory sites (funded elsewhere)
  - 7 Tier2 university sites (software integration)
  - 3 Tier3 university sites (outreach effort)
- Networks
  - USA (TeraGrid, Internet2, ESnet), Europe (Géant, …)
  - Transatlantic (DataTAG), Transpacific, AMPATH?, …
- Grid Operations Center (GOC)
  - Joint work with TeraGrid on GOC development
- Computer science support teams
  - Support, test, upgrade the GriPhyN Virtual Data Toolkit
- Education and outreach
- Coordination, management

iVDGL Components (cont.)
- High level of coordination with DataTAG
  - Transatlantic research network (2.5 Gb/s) connecting EU & US
- Current partners
  - TeraGrid, EU DataGrid, EU projects, Japan, Australia
- Experiments/labs requesting participation
  - ALICE, CMS-HI, D0, BaBar, BTeV, PDC (Sweden)

Initial US iVDGL Participants

Tier2 / Software:
- U Florida (CMS)
- Caltech (CMS, LIGO)
- UC San Diego (CMS, CS)
- Indiana U (ATLAS, GOC)
- Boston U (ATLAS)
- U Wisconsin, Milwaukee (LIGO)
- Penn State (LIGO)
- Johns Hopkins (SDSS, NVO)

CS support:
- U Chicago/Argonne (CS)
- U Southern California (CS)
- U Wisconsin, Madison (CS)

Tier3 / Outreach:
- Salish Kootenai (Outreach, LIGO)
- Hampton U (Outreach, ATLAS)
- U Texas, Brownsville (Outreach, LIGO)

Tier1 / Labs (funded elsewhere):
- Fermilab (CMS, SDSS, NVO)
- Brookhaven (ATLAS)
- Argonne Lab (ATLAS, CS)

Summary
- Technology exponentials are changing the shape of scientific investigation & knowledge
  - More computing, even more data, yet more networking
- The Grid: resource sharing & coordinated problem solving in dynamic, multi-institutional virtual organizations
- Current Grid technology: the Globus Toolkit and the emerging Open Grid Services Architecture

Partial Acknowledgements
- Open Grid Services Architecture design
  - Karl Czajkowski, USC/ISI
  - Ian Foster, Steve Tuecke, ANL
  - Jeff Nick, Steve Graham, Jeff Frey, IBM
- Globus Toolkit R&D also involves many fine scientists & engineers at ANL, USC/ISI, and elsewhere
- Strong links with many EU, UK, and US Grid projects
- Support from DOE, NASA, NSF, and Microsoft

For More Information
- The Grid book
- The Globus Project™
- OGSA
- Global Grid Forum