Open Science Grid for CI-Days, NYSGrid Meeting

Presentation transcript:

1 Open Science Grid for CI-Days, NYSGrid Meeting
Sebastien Goasguen (sebgoa@clemson.edu), School of Computing, Clemson University, Clemson, SC
John McGee (mcgee@renci.org), OSG Engagement Manager, Renaissance Computing Institute, University of North Carolina, Chapel Hill, NC

2 21st Century Discovery
The threefold way:
– theory
– experiment
– computational analysis
Supported by:
– multimodal collaboration systems
– distributed, multi-petabyte data archives
– leading-edge computing systems
– distributed experimental facilities
– distributed multidisciplinary teams
Socialization and community:
– multidisciplinary groups
– geographic distribution
– new enabling technologies
– creation of 21st-century IT infrastructure for sustainable, multidisciplinary communities
[Diagram: Theory, Experiment, Simulation]

3 Shift from Single User, Single Resource to Multiple Users, Multiple Resources
Any combination of users and resources forms a Virtual Organization (VO). Grid computing solves the problem of sharing resources among VOs.

4 Cyberinfrastructure: "Information technology infrastructure to support a Virtual Organization"
Therefore there are many cyberinfrastructures, not a single one. The IT infrastructure is not only about HPC, but also software and applications. The CI is put together to meet the needs of the VO members. There are many reusable components, and leveraging existing assets is encouraged. CI follows basic principles of service orientation and grid architecture.
The Open Science Grid aims at supporting VOs to enable science; it can be a component of the CI you build for a particular VO.
Disclaimer: This slide is the view of the author.

5 The Open Science Grid
OSG is a consortium of software, service, and resource providers and researchers from universities, national laboratories, and computing centers across the U.S., who together build and operate the OSG project. The project is funded by the NSF and DOE, and provides staff for managing various aspects of the OSG.
– Brings petascale computing and storage resources into a uniform grid computing environment
– Integrates computing and storage resources from over 50 sites in the U.S. and beyond
– A framework for large-scale distributed resource sharing, addressing the technology, policy, and social requirements of sharing

6 Principal Science Drivers
High energy and nuclear physics:
– 100s of petabytes (LHC), 2007
– several petabytes, 2005
LIGO (gravity wave search):
– 0.5 to several petabytes, 2002
Digital astronomy:
– 10s of petabytes, 2009
– 10s of terabytes, 2001
Other sciences emerging:
– bioinformatics (10s of petabytes), nanoscience, environmental science, chemistry, applied mathematics, materials science

7 Virtual Organizations (VOs)
The OSG infrastructure trades in groups, not individuals.
– VO management services allow registration, administration, and control of members of the group.
– Facilities trust and authorize VOs.
– Storage and compute services prioritize according to VO group.
[Diagram: campus grids, each with VO management services and applications, share a set of available resources over the OSG and WAN. Image courtesy: UNM]
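For illustration, a minimal sketch of how a member exercises VO membership with the VOMS client tools; the VO name "myvo" is a placeholder, not something named on this slide.

```
# Hypothetical sketch: obtain short-lived proxy credentials signed by the
# VO's VOMS server ("myvo" is a placeholder VO name).
voms-proxy-init -voms myvo

# Inspect the proxy, including the VO and role attributes that sites
# use to authorize and prioritize the request.
voms-proxy-info -all
```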

8 Current OSG Resources
– OSG has more than 50 participating institutions, including self-operated research VOs, campus grids, regional grids, and OSG-operated VOs
– Provides about 10,000 CPU-days per day of processing
– Provides 10 terabytes per day of data transport
– CPU usage averages about 75%
– OSG is starting to offer support for MPI

9 What the OSG Offers That You May Need to Support Your VO(s)
– Low-threshold access to many distributed computing and storage resources
– A combination of dedicated, scheduled, and opportunistic computing
– The Virtual Data Toolkit software packaging and distributions
– Grid operations, including facility-wide monitoring, validation, information services, and system integration testing
– Operational security
– Troubleshooting of end-to-end problems
– Education and training

10 [Usage chart; date range: 2007-04-29 00:00:00 GMT to 2007-05-07 23:59:59 GMT]

11 OSG Bottom Line
A framework to support VOs:
– a VO of users only
– a VO of resources
– a VO of users and resources
OSG can help you with:
– supporting your VO
– making your resources available inside and outside your campus
– enabling science through user engagement

12 Campus Grids to the Rescue

13 Why should my University facilitate (or drive) resource sharing?
Because it's the right thing to do:
– Enables new modalities of collaboration
– Enables new levels of scale
– Democratizes large-scale computing
– Sharing locally leads to sharing globally
– Better overall resource utilization
– Funding agencies encourage it
"At the heart of the cyberinfrastructure vision is the development of a cultural community that supports peer-to-peer collaboration and new modes of education based upon broad and open access to leadership computing; data and information resources; online instruments and observatories; and visualization and collaboration services." – Arden Bement, CI Vision for 21st Century, introduction

14 Campus Grids
Campus grids are a fundamental building block of the OSG.
– The multi-institutional, multi-disciplinary nature of the OSG is a macrocosm of many campus IT infrastructure coordination issues.
OSG currently has three operational campus grids on board: Fermilab, Purdue, and Wisconsin. It is working to add Clemson, Harvard, and Lehigh.
Elevation of jobs from the campus CI to the OSG is transparent (see the sketch after this list).
Campus scale brings value through:
– richness of a common software stack with common interfaces
– a higher common denominator that makes sharing easier
– greater collective buying power with vendors
– synergy through common goals and achievements
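As a rough illustration of how transparent that elevation can be at the Condor level, here is a hedged sketch of the flocking configuration that lets a departmental pool overflow into a campus-wide pool; the hostnames and config path are placeholders, not taken from the slides.

```
# Sketch of Condor flocking between pools (hostnames are placeholders).
# On the departmental submit host, name the campus central manager as a
# flock target; idle jobs will overflow there automatically:
echo 'FLOCK_TO = cm.campus.example.edu' >> /etc/condor/condor_config.local

# On the campus pool, allow that departmental schedd to flock in:
echo 'FLOCK_FROM = submit.dept.example.edu' >> /etc/condor/condor_config.local

# Apply the new configuration without restarting the daemons:
condor_reconfig
```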

15 Simplified View

16 [Diagram]

17 Submitting Jobs Through OSG to the UW Campus Grid (Dan Bradley, UW Madison)
[Diagram: a user runs condor_submit to a local schedd (job caretaker); the Condor gridmanager forwards the job across the Open Science Grid through a Globus gatekeeper (authorization via GUMS) to a campus schedd, which flocks among the HEP, CS, and GLOW matchmakers to reach a startd (job executor).]
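To make the flow above concrete, a minimal hypothetical Condor-G submission follows; the gatekeeper hostname and file names are placeholders, not taken from the slide.

```
# Hypothetical Condor-G job description (placeholder hostname and files).
cat > myjob.sub <<'EOF'
universe      = grid
grid_resource = gt2 gatekeeper.example.edu/jobmanager-condor
executable    = analyze.sh
output        = analyze.out
error         = analyze.err
log           = analyze.log
queue
EOF

# condor_submit hands the job to the local schedd; the Condor gridmanager
# then forwards it through the remote Globus gatekeeper, as in the diagram.
condor_submit myjob.sub
```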

18 FermiGrid: Current Architecture (Keith Chadwick)
Step 1: The user issues voms-proxy-init and receives VOMS-signed credentials.
Step 2: The user submits their grid job via globus-job-run, globus-job-submit, or condor-g.
Step 3: The gateway checks against the Site Authorization (SAZ) service.
Step 4: The gateway requests a GUMS mapping based on VO and role.
Step 5: The grid job is forwarded to the target cluster.
Clusters send ClassAds via CEMon to the site-wide gateway.
[Diagram: VOMS, SAZ, and GUMS servers sit at the site-wide gateway between exterior and interior; target clusters include CMS WC1, CDF OSG1, CDF OSG2, D0 CAB1, D0 CAB2, and GP Farm, with BlueArc storage and periodic synchronization.]
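Step 1 is the voms-proxy-init call sketched earlier; from the user's side, step 2 then looks roughly like the following (the gateway hostname is a placeholder, and the later steps run server-side).

```
# Step 2: run a test command through the site-wide gateway (placeholder
# hostname). Steps 3-5 (SAZ check, GUMS mapping, forwarding to the
# target cluster) then happen on the server side.
globus-job-run gateway.example.gov/jobmanager-condor /bin/hostname
```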

19 Clemson Campus Condor Pool
– Machines in 27 different locations on campus
– ~1,700 job slots
– >1.8 million hours served in 6 months
– Users from industrial engineering, chemical engineering, and economics
– Fast ramp-up of usage
– Accessible to the OSG through a gatekeeper

20 Campuses and Regional Grids
– A campus Condor pool backfills idle nodes in PBS clusters; it provided 5.5 million CPU-hours in 2006, all from idle nodes in clusters.
– Use on TeraGrid: 2.4 million hours spent in 2006 building a database of hypothetical zeolite structures; 5.5 million hours allocated to TeraGrid for 2007.
http://www.cs.wisc.edu/condor/PCW2007/presentations/cheeseman_Purdue_Condor_Week_2007.ppt

21 Engaging Users (more this afternoon)
"What impressed me most was how quickly we were able to access the grid and start using it. We learned about it at RENCI, and we were running jobs about two weeks later," says Kuhlman. "For each protein we design, we consume about 3,000 CPU hours across 10,000 jobs. Adding in the structure and atom design process, we've consumed about 100,000 CPU hours in total so far."

22 What can we do together?
– Clemson's OSG team is looking for a few partners to help deploy campus-wide grid infrastructure that integrates with local enterprise infrastructure and the national CI.
– RENCI's OSG team is available to help scientists get their applications running on OSG: a low-impact starting point that helps your researchers gain significant compute cycles while exploring OSG as a framework for your own campus CI.
Contact: osg@renci.org

23 End. Questions?
Sebastien Goasguen, sebgoa@clemson.edu; John McGee, mcgee@renci.org

