Defining the Technical Roadmap for the NWICG – OSG Ruth Pordes Fermilab
2 Open Science Grid in a nutshell Set of Collaborating Computing Farms - Compute Elements. Commodity Linux Farms; MSS & Disk. From ~20 CPUs in Department Computers To 10,000 CPU SuperComputer Any NWICG University Local Batch System OSG CE gateway OSG & the Wide Area Network
3 Set of Collaborating Storage Sites - Storage Elements. Mass Storage Systems And Disk Caches From 20 GBytes Disk Cache To 4 Petabyte Robotic Tape Systems Any NWICG Shared Storage OSG SE gateway OSG & the Wide Area Network
4 Supported Software Stacks Integrated Supported Reference Software Services — Most Services on Linux PC “Gateways” -- minimal impact on compute nodes. — Loose coupling between services, heterogeneity in releases and functionality. Independent Collections for Client, Server, Administrator
5 Grid - of - Grids Inter-Operating and Co-Operating Grids: Campus, Regional, Community, National, International Open Consortium of Virtual Organizations doing Research & Education
6 What is OSG? A Consortium of people working together to Interface Farms and Storage to a Grid and Researchers using these resources by adapting their applications to run on the Grid and Software developers providing middleware and A project that provides the Operations, Support, Training and Help to make it effective.
7 NWICG - in OSG terminology 4 Compute Elements (CEs) and/or Storage Elements (SEs) A Regional Grid -- a shared common cyberinfrastructure A Virtual Organization with which to Partner.
8 Who is OSG? Large global physics collaborations: US ATLAS, US CMS, LIGO, CDF, D0, STAR. Education projects e.g. Mariachi, I2U2. Grid technology groups: Condor, Globus, SRM, NMI. Many DOE Labs and DOE/NSF sponsored University IT facilities. Partnerships e.g. TeraGrid, European Grids, Regional/Campus Grids e.g. Texas, Wisconsin…
10 The OSG Map Aug-2006 Some OSG sites are also on TeraGrid or EGEE. 10 SEs & 50 CEs (about 30 very active)
11 Use - Daily Monitoring 04/23/2006 Genome analysis (GADU) “bridged” GLOW jobs 2000 running jobs 500 waiting jobs
12 Network Connectivity Use commodity networks - ESNet, Campus LANs. Sites range from well-provisioned (e.g. connected to Starlight) to low-bandwidth connections (e.g. Taiwan). Connectivity ranges from full-duplex, through outgoing only, to fully behind firewalls.
13 Bridging Campus Grid Jobs - GLOW Dispatch jobs from local security, job, storage infrastructure and “uploading” to wide-area infrastructure.
14 FermiGrid? Interfacing All Fermilab Resources to common Campus Infrastructure Gateway to Open Science Grid Unified and reliable common interface and services through one FermiGrid gateway - security, job scheduling, user management, and storage. Sharing Resources Policies and Agreements enable fast response to changes in resource needs by Fermilab users. More information is available at
15 Access to FermiGrid OSG General Users Fermilab Users CDF Farm FermiGrid Gateway OSG “agreed” Users D0 Farm Common Farms CMS Farms
16 OSG History and Goals Grown out of a grass-roots collaboration of GriPhyN, iVDGL and PPDG participants in years of funding starting ~9/2006 from DOE SciDAC-II and NSF MPS and OCI Core Goal to Deliver to US LHC & LIGO scale in next 2 years: - Need to routinely distribute data at 1-5 Gbps over sites. - Need to routinely exceed 10,000 running jobs per client. - Need to reach 99% success rate for 10,000 jobs per day submission under heavy load 1 GigaByte/sec
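The scale goals on this slide can be turned into rough daily numbers. The sketch below is only back-of-the-envelope arithmetic on the figures quoted above (1-5 Gbps sustained, 99% success for 10,000 jobs per day). The function names are illustrative and not part of any OSG tool, and decimal units (1 TB = 10^12 bytes) are an assumption.

```python
# Back-of-the-envelope arithmetic for the stated OSG scale goals.
# Function names are hypothetical; decimal units are assumed.

SECONDS_PER_DAY = 24 * 60 * 60


def terabytes_per_day(gbps: float) -> float:
    """Convert a sustained rate in gigabits/s to decimal terabytes per day."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * SECONDS_PER_DAY / 1e12


def allowed_failures(jobs_per_day: int, success_rate: float) -> int:
    """Number of jobs per day that may fail at the given success rate."""
    return round(jobs_per_day * (1 - success_rate))


if __name__ == "__main__":
    for rate in (1, 5):
        print(f"{rate} Gbps sustained ~ {terabytes_per_day(rate):.1f} TB/day")
    print("failures allowed per day:", allowed_failures(10_000, 0.99))
```

Under these assumptions, the data goal corresponds to moving on the order of ten to several tens of terabytes per day, and the reliability goal leaves room for about one failed job in every hundred submitted.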
17 OSG Core Competencies Integration: software and systems. Operations: common support and procedures. Inter-Operation: across administrative and technical boundaries. Each open to technical work with NWICG
18 Integration Testing of the System A Core Part of OSG. Multi-site Integration Grid tests new OSG Releases and Configurations as a Community activity. Software Readiness and Validations occur before deployment on the Integration Grid. Integration Grid Sites - a parallel grid to the Production System
20 Operations Model Real support organizations often play multiple roles. Lines represent communication paths and, in our model, agreements. We have not progressed very far with agreements yet. Gray shading indicates that OSG Operations is composed of effort from all the support centers. Core Support Group + Community Contributions
23 Training - e.g. Grid Summer Workshop Year 4 Hands on. Technical trainers. Nice Setting (Padre Island). Students got their own applications to run on OSG!
24 What is the VDT? A collection of software Grid software (Condor, Globus and lots more) Virtual Data System (Origin of the name “VDT”) Utilities An easy installation Goal: Push a button, everything just works Two methods: Pacman: installs and configures it all RPM: installs some of the software, no configuration A support infrastructure
25 What software is in the VDT? Security VOMS (VO membership) GUMS (local authorization) mkgridmap (local authorization) MyProxy (proxy management) GSI SSH CA CRL updater Monitoring MonALISA gLite CEMon Accounting OSG Gratia Job Management Condor (including Condor-G & Condor-C) Globus GRAM Data Management GridFTP (data transfer) RLS (replica location) DRM (storage management) Globus RFT Information Services Globus MDS GLUE schema & providers
26 What software is in the VDT? Client tools Virtual Data System SRM clients (V1 and V2) UberFTP (GridFTP client) Developer Tools PyGlobus PyGridWare Testing NMI Build & Test VDT Tests Support Apache Tomcat MySQL (with MyODBC) Non-standard Perl modules Wget Squid Logrotate Configuration Scripts And More!
27 Due diligence to Security Risk assessment, planning, Service auditing and checking Incident response, Awareness and Training, Configuration management, User access Authentication and Revocation, Auditing and analysis. End to end trust in quality of code executed on remote CPU -signatures? Identity and Authorization: Extended X509 Certificates OSG is a founding member of the US TAGPMA. DOEGrids provides script utilities for bulk requests of Host certs, CRL checking etc. VOMS extended attributes and infrastructure for Role Based Access Controls.
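To make the role-based access idea concrete, here is a minimal sketch of how a site might map a VOMS attribute (an FQAN string such as /cms/Role=production/Capability=NULL) to a local account. It is an illustration only, not the GUMS or mkgridmap implementation: the policy table, account names and function names are all hypothetical.

```python
# Sketch: map a VOMS FQAN to a local account via a site policy table.
# The policy table and account names below are hypothetical examples.
from typing import NamedTuple, Optional


class Fqan(NamedTuple):
    vo: str              # first path component, e.g. "cms"
    group: str           # full group path, e.g. "/cms/higgs"
    role: Optional[str]  # None when the role is absent or "NULL"


def parse_fqan(fqan: str) -> Fqan:
    """Split an FQAN into VO, group path and role; Capability is ignored."""
    groups = []
    role = None
    for part in (p for p in fqan.split("/") if p):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue
        else:
            groups.append(part)
    if not groups:
        raise ValueError(f"no VO/group in FQAN: {fqan!r}")
    return Fqan(vo=groups[0], group="/" + "/".join(groups), role=role)


# Hypothetical site policy: (group path, role) -> local Unix account.
ROLE_TO_ACCOUNT = {
    ("/cms", "production"): "cmsprod",
    ("/cms", None): "cmsuser",
}


def map_to_account(fqan: str) -> Optional[str]:
    """Return the local account for this FQAN, or None if not authorized."""
    parsed = parse_fqan(fqan)
    return ROLE_TO_ACCOUNT.get((parsed.group, parsed.role))


if __name__ == "__main__":
    print(map_to_account("/cms/Role=production/Capability=NULL"))
    print(map_to_account("/cms/Role=NULL/Capability=NULL"))
    print(map_to_account("/atlas"))
```

The point of the design is that authorization keys on the VO, group and role carried in the extended certificate rather than on the individual user, so a site can express policy per role without listing every member of a VO.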
29 VO Registers with Operations Center. User registers with VO. Sites Register with the Operations Center. VOs and Sites provide Support Center Contact and join Operations groups. We’re all fun people!
30 The OSG VO A VO for individual researchers and users. Managed by the OSG itself. Where one can learn how to use the Grid!
