Presentation is loading. Please wait.

Presentation is loading. Please wait.

Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Computer Science Group Head Scientific Computing Division National.

Similar presentations


Presentation on theme: "Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Computer Science Group Head Scientific Computing Division National."— Presentation transcript:

1 Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Computer Science Group Head Scientific Computing Division National Center for Atmospheric Research Associate Professor Department of Computer Science University of Colorado at Boulder ESMF on the GRID Workshop July 20, 2005 Henry Tufo

2 Department of Compujter Science University of Colorado at Boulder 2 Motivation: NCAR as an Integrator qIt is our position that NCAR must provide integrated solutions to the community. qScientific workflows are becoming too complicated for manual (or semi-manual) implementation. qNot reasonable to expect a scientist to: q Design simulation solutions by chaining together application software packages q Manage the data lifecycle (check out, analysis, publishing, and check in) q Do this in an evolving computational and information environment qNCAR must provide the software infrastructure to allow scientist to seamlessly (and painlessly) implement their workflows, thereby allowing them to concentrate on what they’re good at: SCIENCE! qGoal is increased scientific productivity and requires an unprecedented level of integration of both systems and software. qLong term investment: return to the organization won’t show up in the bottom line immediately.

3 Department of Compujter Science University of Colorado at Boulder 3 Motivation: Robust Modeling Environments qOur goal is to develop a simple, production quality modeling environment for NCAR and the geoscience community that insulates scientists from the technical details of the execution environment q Cyberinfrastructure q System and software integration q Data archiving qGrid-BGC is an example of such an environment and is the first of these environments developed for NCAR q Learning as we develop and deploy q Tasked by the geoscience community, but developed services are applicable to other collaborative research projects

4 Department of Compujter Science University of Colorado at Boulder 4 Outline qIntroduction qCarbon Cycle Modeling qGrid-BGC Prototype Architecture qExperiences with the Grid qGrid-BGC Production Architecture qFuture Work

5 Department of Compujter Science University of Colorado at Boulder 5 Introduction: Participants qThis is a collaborative project between the National Center for Atmospheric Research (NCAR) and the University of Colorado at Boulder (CU) qNASA has provided funding for three years via the Advanced Information Systems Technology (AIST) program qResearchers: q Peter Thornton (PI), NCAR q Henry Tufo (co-PI), CU q Luca Cinquini, NCAR q Jason Cope, CU q Craig Hartsough, NCAR q Rich Loft, NCAR q Sean McCreary, CU q Don Middleton, NCAR q Nate Wilhelmi, NCAR q Matthew Woitaszek, CU

6 Department of Compujter Science University of Colorado at Boulder 6 Carbon Cycle Modeling: Workflow qScientific models q Daymet q Biome-BGC qDaymet interpolates a high resolution grid of weather observations for a region qBiome BGC calculates carbon cycle parameters at the individual grid points for each region qModels originally intended for analysis of small geographic regions. qAnalysis of larger regions is accomplished by simulating its composite regions

7 Department of Compujter Science University of Colorado at Boulder 7 Carbon Cycle Modeling: Grid-BGC Motivation qRequires the management of multiple simultaneously executing workflows q Execution management q Data management q Task automation qDistributed resources across multiple organizations q Data archive and front-end portal are located at NCAR q Execution resources are located at CU Goal: Create an easy to use computational environment for scientists running large scale carbon cycle simulations.

8 Department of Compujter Science University of Colorado at Boulder 8 Grid-BGC Prototype Architecture (June, 2004): Overview qMain components q Web portal GUI q JobManager q DataMover qPrototype goals q Create a computational grid from GT 3.2 q Create a reliable execution environment q Simplify usage and management

9 Department of Compujter Science University of Colorado at Boulder 9 Experiences qSecurity q Security breach during Spring 2004 impacted Grid-BGC’s design q Continually evolving requirement and component q One time password authentication q Short lived certificates q Unique requirements may make implementation difficult qReliability and Fault Tolerance q Grid-BGC is self healing q Detect and correct failures without scientist intervention q Hold uncorrectable errors for administrative action

10 Department of Compujter Science University of Colorado at Boulder 10 Experiences: Grid Development qGrid-BGC is the first production quality computational grid developed by NCAR and CU. qPrototype development began January 2004 and ended September 2004 qMore difficult than we anticipated q Pleasant persistence in the face of frustration is an essential quality for developers of a grid computing environment q We found that the middleware and development tools for GT 3.2 were not “production grade” for our environment, but were rapidly improving q Distributed, heterogeneous, and asynchronous system development and debugging are difficult tasks q Developers’ mileage with grid middleware will vary q Globus components may or may not be useful q While critical components are available, more specific components will certainly need development

11 Department of Compujter Science University of Colorado at Boulder 11 Experiences: Analysis of the Prototype qWhat we did well q Proof of concept grid environment was a success q Portal, grid services, and automation tools create a simplified user environment q Fault tolerant qWhat we needed to improve q Modularize the monolithic architecture q Break out functionality q Move towards a service oriented architecture q Re-evaluate data management policies and tools q GridFTP / Replica Location Service (RLS) vs. DataMover q Misuse of NCAR MSS as temporary storage platform q Use the appropriate Globus compliant tools q Many Globus components can be used in place third-party or in-house tools q More recent release of GT have improved the quality and usability of the components and documentation

12 Department of Compujter Science University of Colorado at Boulder 12 Grid-BGC Production Architecture (June, 2005): Overview qProduction architecture address prototypes goals q Ease of use q Efficient and productive qAdditionally, production architecture addresses q Modular design q GT4 Compliance q Restructuring execution and data management

13 Department of Compujter Science University of Colorado at Boulder 13 Grid-BGC Production Architecture (June, 2005): Overview

14 Department of Compujter Science University of Colorado at Boulder 14 Grid-BGC Production Architecture

15 Department of Compujter Science University of Colorado at Boulder 15 Grid-BGC Production Architecture: Improvements qUsage of more Globus Toolkit components and services q WS-GRAM q GridFTP and Reliable File Transfer (RFT) service qRestructuring data management q Remove of DataMover, replace with RFT q Use Replica Location Service (RLS) as needed qRestructuring execution management q Removal of JobManager, replace with WS GRAM and a custom client q Develop a simple workflow manager qModularize components q Break out individual components from the monolithic kernel q Create a service oriented architecture q Separate functional components q Generic and reusable

16 Department of Compujter Science University of Colorado at Boulder 16 Grid-BGC Production Architecture: GT4 Experience qImprovements in Grid middleware immediately useful in our environment q MyProxy officially supported q WS GRAM meets our expectations q RFT and GridFTP qPorting Grid-BGC prototype was easier than expected q Modular design limited the amount of changes needed q Improved documentation of GT components

17 Department of Compujter Science University of Colorado at Boulder 17 Future Work: Expansion of the Grid-BGC Environment qIntegrate NASA’s Columbia Supercomputer into the Grid-BGC environment q Late Summer 2005 q Deploy the computational framework and components q End-to-end testing between distributed Grid-BGC components qIntegrate resources provided by the system’s users

18 Department of Compujter Science University of Colorado at Boulder 18 Future Work: CU / NCAR HTWM

19 This research was supported in part by the National Aeronautics and Space Administration (NASA) under AIST Grant AIST-02-0036, the National Science Foundation (NSF) under ARI Grant #CDA- 9601817, and NSF sponsorship of the National Center for Atmospheric Research. Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Questions? Ideas? Comments? Suggestions? http://www.gridbgc.ucar.edu tufo@cs.colorado.edu http://csc.cs.colorado.edu


Download ppt "Experiences from Simulating the Global Carbon Cycle in a Grid Computing Environment Computer Science Group Head Scientific Computing Division National."

Similar presentations


Ads by Google