
The DOE Science Grid: Computing and Data Infrastructure for Large-Scale Science
William Johnston, Lawrence Berkeley National Lab; Ray Bair, Pacific Northwest National Lab; Ian Foster, Argonne National Lab; Al Geist, Oak Ridge National Lab; Bill Kramer, National Energy Research Scientific Computing Center; and the DOE Science Grid Engineering Team

The Need for Science Grids
- The nature of how large-scale science is done is changing:
  - distributed data, computing, people, and instruments
  - instruments integrated with large-scale computing
- "Grids" are middleware designed to facilitate the routine interactions of all of these resources in order to support widely distributed, multi-institutional science and engineering.

Distributed Science Example: Supernova Cosmology
- "Supernova cosmology" is cosmology based on finding and observing special types of supernova during the few weeks of their observable life.
- It has led to some remarkable science (Science magazine's 1998 "Breakthrough of the Year": supernova cosmology indicates the universe expands forever); however, it is rapidly becoming limited by the researchers' ability to manage the complex data-computing-instrument interactions.

Supernova Cosmology Requires Complex, Widely Distributed Workflow Management

Supernova Cosmology
- This is one of the class of problems that Grids are focused on. It involves:
  - management of complex workflow (a toy sketch follows below)
  - reliable, wide-area, high-volume data management
  - inclusion of supercomputers in time-constrained scenarios
  - easily accessible pools of computing resources
  - eventual inclusion of instruments that will be semi-automatically retargeted based on data analysis and simulation
  - a next generation that will generate vastly more data (from SNAP, satellite-based observation)
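To make the workflow idea concrete, here is a minimal, purely illustrative Python sketch of the kind of dependency-driven pipeline described above. The stage names (image acquisition, reference subtraction, candidate detection, simulation, follow-up scheduling) are hypothetical stand-ins rather than the project's actual pipeline, and the stage bodies are placeholders for what would be remote Grid job submissions.

```python
# Toy model of a supernova-search pipeline expressed as a dependency graph.
# Stage names and bodies are hypothetical placeholders; in a real Grid
# deployment each stage would be submitted to a remote resource
# (e.g. via Globus GRAM or Condor-G) rather than run locally.
from graphlib import TopologicalSorter   # Python 3.9+

def acquire_images():      print("fetch tonight's wide-field images")
def subtract_reference():  print("subtract reference images on a cluster")
def detect_candidates():   print("flag candidate supernovae")
def run_simulation():      print("fit light curves on a supercomputer")
def schedule_followup():   print("retarget telescopes for follow-up")

STAGES = {
    "acquire":  (acquire_images,     []),
    "subtract": (subtract_reference, ["acquire"]),
    "detect":   (detect_candidates,  ["subtract"]),
    "simulate": (run_simulation,     ["detect"]),
    "followup": (schedule_followup,  ["detect", "simulate"]),
}

# Run the stages in an order that respects the dependencies.
graph = {name: deps for name, (_, deps) in STAGES.items()}
for name in TopologicalSorter(graph).static_order():
    STAGES[name][0]()
```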

What are Grids?
- Middleware for uniform, secure, and highly capable access to large- and small-scale computing, data, and instrument systems, all of which are distributed across organizations (a brief job-submission sketch follows below)
- Services supporting construction of application frameworks and science portals
- Persistent infrastructure for distributed applications (e.g. security services and resource discovery)
- 200 people working on standards at the IETF-like Global Grid Forum
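To give a feel for "uniform access," the following sketch runs a program on a remote Grid resource through the Globus GRAM command-line client. It assumes Globus Toolkit 2-era client tools are installed and that a valid proxy credential already exists (see the single sign-on sketch later); the gatekeeper hostname is hypothetical.

```python
# Minimal sketch of uniform access to a remote compute resource using the
# globus-job-run client, wrapped in Python. Assumes the Globus client tools
# are on PATH and a valid proxy credential exists; the hostname is invented.
import subprocess

GATEKEEPER = "grid-node.example.gov"   # hypothetical Grid gatekeeper

def run_remote(executable, *args):
    """Run an executable on a remote Grid resource via globus-job-run."""
    cmd = ["globus-job-run", GATEKEEPER, executable, *args]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout

if __name__ == "__main__":
    # Trivial smoke test: ask the remote resource for its hostname.
    print(run_remote("/bin/hostname"))
```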

Grids
- There are several different types of users of Grid services:
  - discipline scientists
  - problem-solving system / framework / science portal developers
  - computational tool / application writers
  - Grid system managers
  - Grid service builders
- Each of these user communities has somewhat different requirements for Grids, and the Grid services available or under development are trying to address all of these groups.

[Figure: Architecture of a Grid; the legend marks the operational services (Globus, SRB)]
- Science Portals and Scientific Workflow Management Systems
- Applications (simulations, data analysis, etc.)
- Application Toolkits (visualization, data publication/subscription, etc.)
- Web Services and Portal Toolkits
- Execution support and Frameworks (Globus MPI, Condor-G, CORBA-G)
- Grid Common Services: standardized services and resource interfaces (Grid Information Service, Uniform Resource Access, Brokering, Global Queuing, Global Event Services, Co-Scheduling, Data Management, Uniform Data Access, Collaboration and Remote Instrument Services, Network Cache, Communication Services, Authentication, Authorization, Security Services, Auditing, Fault Management, Monitoring)
- High-Speed Communication Services
- Distributed Resources: Condor pools of workstations, network caches, tertiary storage, scientific instruments, clusters, national supercomputer facilities

State of Grids
- Grids are real, and they are useful now.
- Basic Grid services are being deployed to support uniform and secure access to computing, data, and instrument systems that are distributed across organizations.
- Grid execution management tools (e.g. Condor-G) are being deployed.
- Data services, such as uniform access to tertiary storage systems and global metadata catalogues, are being deployed (e.g. GridFTP and the Storage Resource Broker); a data-transfer sketch follows below.
- Web services supporting application frameworks and science portals are being prototyped.
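As a concrete illustration of the data-service layer, the sketch below stages a file between two GridFTP servers with the standard globus-url-copy client. The endpoints and paths are hypothetical, and a valid proxy credential plus the Globus client tools are assumed to be in place.

```python
# Sketch of a GridFTP transfer using the globus-url-copy client.
# Source and destination URLs are hypothetical; a valid proxy credential
# and the Globus client tools are assumed to be available on this machine.
import subprocess

SRC = "gsiftp://hpss.example-lab.gov/home/alice/run42/images.tar"
DST = "gsiftp://cluster.example-lab.gov/scratch/alice/images.tar"

def gridftp_copy(src, dst):
    """Copy a file between GridFTP endpoints."""
    subprocess.run(["globus-url-copy", src, dst], check=True)

if __name__ == "__main__":
    gridftp_copy(SRC, DST)
```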

State of Grids
- Persistent infrastructure is being built:
  - Grid services are being maintained on the compute and data systems of interest (Grid sysadmin)
  - cryptographic authentication supporting single sign-on is provided through Public Key Infrastructure (a sketch follows below)
  - resource discovery services are being maintained (Grid Information Service, a distributed directory service)
  - This is happening, e.g., in the DOE Science Grid, EU Data Grid, UK eScience Grid, NASA's IPG, etc.
  - For DOE science projects, ESnet is running a PKI Certification Authority and assisting with policy issues among DOE Labs and their collaborators.
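To show what single sign-on and the Grid Information Service look like from a user's desk, here is a hedged sketch built on the Globus Toolkit 2-era clients: grid-proxy-init creates a short-lived proxy credential from the user's identity certificate, and grid-info-search queries an MDS directory server. The MDS hostname is hypothetical; the port, search base, object class, and attribute name are common MDS 2.x conventions recalled from memory and should be checked against the deployed schema and the tool's help output.

```python
# Sketch: single sign-on plus resource discovery with Globus Toolkit 2 clients.
# Assumes the user's certificate and key live in ~/.globus and the Globus
# client tools are installed. The MDS host is hypothetical; port 2135, the
# search base, and the MdsHost/Mds-Host-hn names follow typical MDS 2.x
# deployments and may differ at a given site.
import subprocess

MDS_HOST = "mds.doesciencegrid.example.gov"   # hypothetical GIIS server
MDS_PORT = "2135"
MDS_BASE = "mds-vo-name=local, o=grid"

def single_sign_on(hours="12"):
    """Create a short-lived proxy credential; prompts for the key passphrase."""
    subprocess.run(["grid-proxy-init", "-hours", hours], check=True)

def find_compute_resources():
    """Ask the Grid Information Service which hosts advertise themselves."""
    cmd = ["grid-info-search", "-h", MDS_HOST, "-p", MDS_PORT,
           "-b", MDS_BASE, "(objectclass=MdsHost)", "Mds-Host-hn"]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return out.stdout

if __name__ == "__main__":
    single_sign_on()
    print(find_compute_resources())
```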

DOE Science Grid
- SciDAC project to explore the issues of providing persistent, operational Grid support in the DOE environment: LBNL, NERSC, PNNL, ANL, and ORNL
  - Initial computing resources: ~10 small, medium, and large clusters
  - High-bandwidth, end-to-end connectivity (high-speed links from site systems to ESnet gateways)
  - Storage resources: four tertiary storage systems (NERSC, PNNL, ANL, and ORNL)
  - Globus providing the Grid Common Services
  - Collaboration with ESnet for security and directory services

[Figure: Initial Science Grid Configuration — Grid-managed resources at NERSC (supercomputing and large-scale storage), LBNL, PNNL, ANL, and ORNL, interconnected by ESnet with peerings to Europe and Asia-Pacific; ESnet hosts the MDS directory service and the Certification Authority. Applications, application frameworks, and user interfaces reach these resources through the Grid Services layer (uniform access to distributed resources). Funded by the U.S. Dept. of Energy, Office of Science, Office of Advanced Scientific Computing Research, Mathematical, Information, and Computational Sciences Division.]

[Figure: DOE Science Grid and the DOE Science Environment — science portals and application frameworks issue compute and data-management requests through the Grid Services layer (uniform access to distributed resources) to Grid-managed resources at NERSC (supercomputing, large-scale storage), LBNL, PNNL, ANL, and ORNL, and to partner efforts such as SNAP and PPDG, all interconnected by ESnet with peerings to Europe and Asia-Pacific.]

The DOE Science Grid Program: Three Strongly Linked Challenges
- How do we reliably and effectively deploy and operate a DOE Science Grid?
  - Requires a coordinated effort by multiple labs
  - ESnet for directory and certificate services
  - Manage basic software plus import other R&D work
  - What else? We will see.
- Application partnerships linking computer scientists and application groups
  - How do we exploit Grid infrastructure to facilitate DOE applications?
- Enabling R&D
  - Extending the technology base for Grids
  - Packaging Grid software for deployment
  - Developing application toolkits
  - Web services for science portals

Roadmap for the Science Grid
- Grid compute and data resources federated across partner sites using Globus
- Pilot simulation and collaboratory users
- Production Grid Information Services and Certificate Authority
- Auditing and monitoring services
- Grid system administration
- Help desk, tutorials, and application integration support
- Integration of R&D from other projects
- Scalable Science Grid

SciDAC Applications and the DOE Science Grid
- SciGrid has some computing and storage resources that can be made available to other SciDAC projects
  - By "some" we mean that usage authorization models do not change when a system is incorporated into the Grid
  - To compute on individual SciGrid systems you have to negotiate with the owners of those systems
  - However, all of the SciGrid systems have committed to provide some computing and data resources to SciGrid users

SciDAC Applications and the DOE Science Grid
- There are several ways to "join" the SciGrid:
  - as a user
  - as a new SciGrid site (incorporating your resources)
- There are different issues for users and for new SciGrid sites.
- Users:
  - users will get instructions on how to access Grid services
  - users must obtain a SciGrid PKI identity certificate (a sketch of the certificate workflow follows below)
  - there is some client software that must run on the user's system
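As an illustration of the user-side certificate workflow mentioned above, here is a hedged sketch using the Globus Toolkit's grid-cert-request and grid-proxy-info tools. The slides do not spell out the ESnet/DOE Science Grid CA enrollment procedure, so this is the generic Globus-style flow, with file locations at the Globus defaults.

```python
# Generic sketch of obtaining a Grid identity certificate with the Globus
# command-line tools. This is NOT the official ESnet CA enrollment procedure
# (the slides do not describe it); it is the usual Globus-style flow, with
# file locations at the Globus defaults under ~/.globus.
import os
import shutil
import subprocess

GLOBUS_DIR = os.path.expanduser("~/.globus")

def generate_request():
    """Interactively create userkey.pem and usercert_request.pem in ~/.globus."""
    subprocess.run(["grid-cert-request"], check=True)
    print("Send", os.path.join(GLOBUS_DIR, "usercert_request.pem"),
          "to the Certification Authority for signing.")

def install_signed_cert(signed_cert_path):
    """After the CA returns the signed certificate, install it as usercert.pem."""
    shutil.copyfile(signed_cert_path, os.path.join(GLOBUS_DIR, "usercert.pem"))

def check_proxy():
    """Report on the current proxy credential (created with grid-proxy-init)."""
    subprocess.run(["grid-proxy-info"], check=True)
```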

SciDAC Applications and the DOE Science Grid
- New SciGrid sites:
  - New SciGrid sites (where you wish to incorporate your resources into the SciGrid) need to join the Engineering Working Group
    - This is where the joint system administration issues are worked out
    - This is where Grid software issues are worked out
    - Keith Jackson chairs the WG

SciDAC Applications and the DOE Science Grid
- New SciGrid sites may use the Grid Information Services (resource directory) of an existing site, or may set up their own.
- New SciGrid sites may also use their own PKI Certification Authorities; however, the issuing CAs must have a published policy compatible with the ESnet CA.
  - Entrust CAs will work in principle; however, there is very little practical experience with this, and a little additional software may be necessary.

Science Grid: A New Type of Infrastructure
- Grid services providing standardized and highly capable distributed access to resources used by a community
- Persistent services for distributed applications
- Support for building science portals
- Vision: the DOE Science Grid will lay the groundwork to support DOE science applications that require, e.g., distributed collaborations, very large data volumes, unique instruments, and the incorporation of supercomputing resources into these environments.

DOE Science Grid Level of Effort
- LBNL/DSD, 2 FTE; NERSC, 1 FTE; ANL, 2 FTE; ORNL, 1.5 FTE; PNNL, 1.0 FTE
  - Project Leadership (0.35 FTE): Johnston/LBNL, Foster/ANL, Geist/ORNL, Bair/PNNL, Kramer/NERSC
  - Directory and Certificate Infrastructure (1.0 FTE): ANL and LBNL/ESnet
  - Grid Deployment, Evaluation, and Enhancement (2.7 FTE): ANL, LBNL, ORNL, PNNL, NERSC
  - Application and User Support (1.5 FTE): ANL, ORNL, PNNL
  - Monitoring and Auditing Infrastructure and Security Model (2.0 FTE): ANL, LBNL, ORNL