1 Grid Computing in North Carolina: Past and Present. SURA Cyber-infrastructure Workshop, Georgia State University, January 6, 2005. MCNC Grid Computing and Networking Services. Phil Emer, Chuck Kesler.
2 MCNC’s Role in Grid. MCNC is a service provider: it manages production infrastructure for the R&E community across NC. So to us, grid is infrastructure, an access method, and a service delivery platform. MCNC is the experiment support center for the National Lambda Rail.
3 NC Research and Education Network. [Network diagram. Rings shown: an OC48 SRP ring and an OC12 SRP ring, each labeled as a counter-rotating ring with <=50ms reroute and fully active redundancy. External links labeled: Qwest, Internet, Abilene, Level3 (GbE). Locations labeled: Raleigh, RTP, Greensboro, Winston-Salem, Charlotte, Wilmington, Fayetteville, Greenville, Asheville. Sites labeled: Duke (GbE), NCSU (GbE), UNC-CH (GbE), UNC-G, NCAT, ASU, WFU, WSSU, NCSA, UNC-C, ECU, ECSU, CMST, FSU, UNCP, UNCW, UNCA, WCU. Other label: 7609.]
4 Toward Grid. We have many administrative domains. We have a network of distributed points of presence. We provide access to shared resources, which are distributed. So conditions are favorable for attaining a state of grid-ness. What we need is an exercising application: NC BioGrid!
5 Grid Computing in North Carolina: Past and Present. Chuck Kesler, January 2005.
6 The Grid Revolution in NC. NC BioGrid: proving ground for Grid; successful prototype apps; catalyst for collaboration; international recognition.
7 Why Bio + Grid? (circa 2002). Moore’s law has allowed labs to keep ahead of data, but sequence data is now outpacing processing capability. Biotech and pharma industries are highly competitive and capital intensive. Getting ahead and staying ahead of the competition will require the creation of new and unique capabilities. It’s about staying ahead of the curve...
8 The NC BioGrid Partnership. NC Biotech Center: provided the catalyst through the NC Genomics & Bioinformatics Consortium. MCNC: provided the funding and dedicated staff. Sun: donated infrastructure hardware; established the Sun Center of Excellence in Bioinformatics. IBM: donated human capital (application developers). Triangle Universities: focal point for the collaboration; brought early adopters to the table; created collaborative working groups.
9 NC BioGrid Accomplishments. In the summer of 2002, installed a dedicated testbed for evaluating grid middleware and developing grid applications for bioinformatics. The testbed spanned multiple administrative domains, with systems located at MCNC, NC State, UNC-CH, and Duke, and included the representative heterogeneity of hardware and OS platforms found at those sites. Employed a “best of breed” approach to grid middleware deployment. Working groups met up to twice a month during [...]. Created several pilot applications using the testbed.
10 NC BioGrid Middleware: Best-of-Breed Approach. Job Scheduling: Platform LSF, Sun Grid Engine. User Portal: CHEF / OGCE, MyProxy. Compute Grid: Globus V2 (NMI), Avaki V2. Data Grid: Avaki Data Grid V4, GridFTP (Globus).
11 NC BioGrid - Data Grid. Avaki 4.0 Data Grid. Federation of data providers across the WAN: provides a global name space for user home directories, shared project spaces, databases, and applications; results from canned SQL queries can show up as files in the global name space. Variety of access methods: web-based user interface; NFS and CIFS through local “data grid access servers” to provide access at the native OS level. Simple deployment: no kernel mods required; each site can run a “share server” to distribute its local home and project directories to the grid; web-based management interface.
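The global name space idea on this slide can be illustrated with a small sketch. It is not Avaki code or configuration; the mount table, the `/grid/...` prefixes, the local export paths, and the `resolve` function are all hypothetical names chosen for illustration. The sketch assumes each site's share server exports a local directory under a grid-wide prefix and that a lookup picks the longest matching prefix.

```python
from typing import Dict, Tuple

# Hypothetical mount table: grid-wide prefix -> (site, local directory).
# Prefixes and paths are illustrative only, not Avaki configuration.
MOUNTS: Dict[str, Tuple[str, str]] = {
    "/grid/home/ncsu": ("NC State", "/export/home"),
    "/grid/home/duke": ("Duke", "/export/home"),
    "/grid/projects": ("MCNC", "/export/projects"),
}


def resolve(grid_path: str) -> Tuple[str, str]:
    """Map a global path to (site, local path) via the longest matching prefix."""
    best = None
    for prefix in MOUNTS:
        # Match whole path components only, so /grid/homework != /grid/home.
        if grid_path == prefix or grid_path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise KeyError(f"no share server exports {grid_path}")
    site, local_root = MOUNTS[best]
    return site, local_root + grid_path[len(best):]
```

Under these assumptions, a user at any site would refer to the same `/grid/...` path, and only the resolver would need to know which site actually holds the data.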
12 NC BioGrid - Compute Grid. Globus Toolkit, NSF Middleware Initiative (NMI) V2 (Globus 2.4.3): provides “gatekeeper” functionality for submitting jobs through to the local cluster manager; provides GridFTP support for file transfer; provides MDS to track grid resource characteristics. MCNC provides infrastructure services: Certificate Authority (initially based on the Globus SimpleCA); GIIS (master resource directory for the grid).
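The role of a master resource directory can be sketched in a few lines. This is a toy stand-in, not the MDS or GIIS API: the `Resource` fields, the `ResourceDirectory` class, and its `register` and `find` methods are hypothetical, and it assumes each site simply reports an OS name and a count of free CPUs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Resource:
    site: str
    gatekeeper: str  # hypothetical host name of the site's gatekeeper
    os: str
    free_cpus: int


class ResourceDirectory:
    """Toy master resource directory; not the MDS/GIIS interface."""

    def __init__(self) -> None:
        self._resources: List[Resource] = []

    def register(self, resource: Resource) -> None:
        """Record the characteristics a site reports for one resource."""
        self._resources.append(resource)

    def find(self, os: str, min_cpus: int) -> List[Resource]:
        """Return matching resources, most free CPUs first."""
        matches = [
            r for r in self._resources
            if r.os == os and r.free_cpus >= min_cpus
        ]
        return sorted(matches, key=lambda r: r.free_cpus, reverse=True)
```

A client could then pick the first result of `find("linux", 16)` and submit to that resource's gatekeeper, which hands the job to the local cluster manager.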
13 NC BioGrid - Web Portal. CHEF/OGCE, a grid portal framework: implements web-based interfaces for managing job submissions, file access, and online meetings; originally developed as a distance learning tool. MyProxy, a security credential repository: provides the portal with a mechanism for accessing and using Globus security credentials.
14 Portal Example
15 NC BioGrid Proof of Concept Applications. Parameter Space Study with BLAST: BLAST compares a target gene sequence against a known genome to find similarities; Grid BLAST distributed 1,000+ target sequences across the grid for comparison. IBM Extreme Blue Project: built a grid interface to BioPerl libraries. UNC-CH/IBM QSAR Application: grid-enabled version of a drug compound screening application; finds compounds with promising biological activity characteristics that should receive further research.
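The Grid BLAST idea of spreading many target sequences across sites can be sketched as a simple partitioning step. The slides do not say how the work was actually divided, so round-robin assignment is an assumption here, and `partition_round_robin` is a hypothetical helper. Each target is compared independently, so any even split would serve the same purpose.

```python
from typing import Dict, List, Sequence


def partition_round_robin(
    items: Sequence[str], sites: Sequence[str]
) -> Dict[str, List[str]]:
    """Assign items to sites in turn so batch sizes differ by at most one."""
    if not sites:
        raise ValueError("at least one site is required")
    batches: Dict[str, List[str]] = {site: [] for site in sites}
    for index, item in enumerate(items):
        batches[sites[index % len(sites)]].append(item)
    return batches


# Illustrative input: 1,000 placeholder sequence IDs over the four testbed sites.
targets = [f"seq{n:04d}" for n in range(1000)]
batches = partition_round_robin(targets, ["MCNC", "NC State", "UNC-CH", "Duke"])
```

Each batch would then be submitted as a separate job at its site and the per-site results merged afterward; a real scheduler would likely also weigh site capacity, which this sketch ignores.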
16 The Grid Revolution in NC. NC BioGrid. MCNC Enterprise Grid: apply NC BioGrid lessons; cluster and SMP resources; research platform for GTEC; core component in NCGrid.
17 The MCNC Enterprise Grid. [Diagram. Components shown: Users, Campus Grids, Portals, (FIREWALL), Interactive Nodes / Grid Gatekeeper / GridFTP, LSF Master Job Scheduler, Global Grid Resource DB (GIIS), 32-CPU SGI Altix Linux SMP Server, 128-CPU IBM Linux Cluster (64 nodes), 8-TB Storage.]
18 The Enterprise Grid and MCNC’s Services Strategy. [Diagram. Labels shown: NCREN; State-wide Grid Services; Enterprise Grid Services; Value-add Information Systems Services; Self-serve Data Center Services; DATA CENTER Hosting & Infrastructure; Grid Computing; GTEC, NLR, ANR and other Innovation Initiatives; Information Security Services; Data Archival Services; Information Assurance; DEPLOYMENT.]
19 The Grid Revolution in NC. NC BioGrid. MCNC Enterprise Grid. NC Grid Initiative: state-wide partnership; leverage lessons learned; grid education & training resource; enable first mover applications.
20 NC Grid: A Grid for Grid Developers (for now, at least). Provide a development testbed that spans the state. Multi-institutional resources: MCNC offers the Enterprise Grid as a resource; MCNC is also developing a “grid appliance,” which can be easily deployed and remotely supported as a campus or department point of presence on the grid. Currently the community is working together to determine the “middleware stack”: GT4 vs. GT3; OGCE vs. GridSphere; CA architecture; data grid strategy; platform standards; etc.
21 A Sampling of Current Grid Projects in NC. GridNexus at UNC-W: workflow builder for grid applications. UNC-CH, RENCI, and MCNC: portal and grid infrastructure for running the ADCIRC model. BioPortal: RENCI at UNC-CH. Grid Computing CS Course: offered by WCU to campuses across the state via NCREN video service; first offered in Fall 2004 (~30 students), to be offered again in Fall 2005.