Friday, October 20, 2006 Barry Wilkinson Department of Computer Science University of North Carolina Charlotte Grid Computing Activities within the Department of Computer Science at UNC-Charlotte
Outline Brief description of grid computing Activities: Supercomputing 2003 conference demonstration Grid Computing Course (2004- ) Bioinformatics algorithm hardware accelerator (participant) VisualGrid Project (2005- )
Grid Computing Using geographically distributed and interconnected computers together for computing and for resource sharing. “The grid virtualizes heterogeneous geographically disperse resources” from "Introduction to Grid Computing with Globus," IBM Redbooks
Need to harness computers Original driving force behind grid computing the same as behind the early development of networks that became the Internet: – Connecting computers at distributed sites for high performance computing.
Virtual Organization Usually, grid computing involves teams working together on a common goal, sharing computing resources and possibly experimental equipment. Geographically distributed grid computing team called a virtual organization. The resources shared include software and experimental data. Crosses multiple administrative domains.
Applications Originally e-Science applications – Computationally intensive. Not necessarily one big problem, but often a problem that has to be solved repeatedly with different parameters. – Data intensive. – Experimental collaborative projects. Now also e-Business applications to improve business models and practices.
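The "solved repeatedly with different parameters" pattern is often called a parameter sweep. A minimal sketch, assuming a made-up `simulate` function as the unit of work and local threads standing in for grid nodes (neither is from the talk):

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate(rate, steps):
    """Hypothetical stand-in for one independent run: compound growth."""
    value = 1.0
    for _ in range(steps):
        value *= 1.0 + rate
    return value


def parameter_sweep(rates, step_counts, workers=4):
    """Run simulate() once per parameter combination; the runs share no state."""
    combos = list(product(rates, step_counts))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda combo: simulate(*combo), combos))
    return dict(zip(combos, results))


if __name__ == "__main__":
    for params, value in parameter_sweep([0.01, 0.05], [10, 20]).items():
        print(params, round(value, 4))
```

Because each run is independent, the same set of parameter combinations could be handed to separate machines rather than threads; only the dispatch step would change.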
Supercomputing 2003 Demonstration First personal contact with grid computing (November 2003). Participant in Supercomputing 2003 demo organized by the University of Melbourne (Raj Buyya). 21 countries, numerous sites.
Subsequent activities Projects: Grid Computing Course (2004- ) VisualGrid Project (2005- ) Bioinformatics hardware accelerator (participant) SURAGrid participant
Grid Computing Course Taught on North Carolina Research and Education televideo network that connects all 16 state campuses and also private institutions Fall 2004: 8 sites Fall 2005: 12 sites Principally an undergraduate course (some graduate students) Course Home Page ITCS4010F05
Participating Sites, Fall 2005 Participating UNC campuses Private institutions Wake Tech. Community College Lenoir Rhyne College Elon University
Fall 2005 Course grid structure (diagram): MCNC, UNC-W, UNC-A, NCSU, WCU, UNC-C, ASU; CA; Backup facility, not actually used
Challenges - Technical Issues (grid computing) Setting up the grid infrastructure – very “challenging” Providing students with a stable distributed grid computing platform Moving the students through a set of detailed programming assignments in the face of system and student problems. Relied heavily on faculty contacts at each site.
Some Publications
B. Wilkinson and C. Ferner, “Teaching Grid Computing across North Carolina Part II,” IEEE Distributed Systems Online, vol 7, no 7,
B. Wilkinson and C. Ferner, “Teaching Grid Computing across North Carolina Part I,” IEEE Distributed Systems Online, vol 7, no 6,
M. A. Holliday, B. Wilkinson, and J. Ruff, “Using an End-to-End Demonstration in an Undergraduate Grid Computing Course,” ACMSE 2006: 44th ACM Southeast Conference, March 10-12, 2006, Melbourne, Florida.
B. Wilkinson, M. Holliday, and C. Ferner, “Experiences in Teaching a Geographically Distributed Undergraduate Grid Computing Course,” Workshop, IEEE Int. Symp. Cluster Computing and the Grid (CCGrid2005), Cardiff, UK, May ,
B. Wilkinson and M. Holliday, “State-Wide Collaborative Grid Computing Course,” 2005 Teaching and Learning with Technology Conference, March 30, 2005, Raleigh, NC.
M. A. Holliday, B. Wilkinson, J. House, S. Daoud, and C. Ferner, “A Geographically-Distributed, Assignment-Structured Undergraduate Grid Computing Course,” SIGCSE 2005 Technical Symposium on Computer Science Education, St. Louis, Missouri, February , 2005.
National Publicity Science Grid This Week Feature story Gridtoday.com
VisualGrid Project Goal: Collaborative environmental visualization research using a grid computing infrastructure Started Jan 2006 Involves two sites: – UNC-Charlotte – UNC-Asheville plus Environmental Protection Agency, Raleigh, NC (funding agency)
Project Structure (Virtual Organization) Visualization: Charlotte Visualization Center, Dept. of Computer Science, UNC Charlotte. Environmental Modeling: Global Institute for Energy and Environmental Systems, UNC Charlotte; National Environmental Modeling and Analysis Center, UNC Asheville. Grid infrastructure: Departments of Computer Science, UNC Charlotte and UNC-Asheville. Environmental Protection Agency (funding agency)
UNC-Charlotte Group Leaders Visualization: Charlotte Visualization Center. Bill Ribarsky, Bank of America Endowed Chair of Information Technology (VisualGrid PI); Aidong Lu, Asst. Professor of Computer Science. Environmental Studies: Global Inst. of Energy & Environmental Syst. Hilary Inyang, Duke Energy Distinguished Professor; Sunyoung Bae, Research Associate. Grid Infrastructure: Barry Wilkinson, Professor of Computer Science
Environmental Planning Inyang, Fisher, and Mbamalu, 2003 Proposed Power Plant location
VisualGrid Infrastructure Group: Goal: To create a geographically distributed set of resources and facilitate collaboration between VisualGrid researchers. Team: Barry Wilkinson Jeremy Villalobos (MS student) Nikul Suthar (MS Student) Keyur Sheth (MS student) Jasper Land (BS student) Department of Computer Science UNC-Charlotte Infrastructure Support 52-node University Research Cluster Chuck Price, Director of University Research Computing Mike Mosley, Senior Systems Developer
Achievements Created a secure multi-institutional research grid between UNC-Charlotte and UNC-Asheville with distributed compute and data storage resources. Developed VisualGrid portal, a customized web-based interface to access combined VisualGrid resources and execute applications. Single sign-on to all resources. HTTPS server. Provided simplified one-step on-line registration for new users. Provided a Certificate Authority for authenticating users. Developed “portlets” within the portal, including one for the CMAQ application to greatly simplify its use.
VisualGrid Configuration UNC-Charlotte resources: Development system (four 3.4 GHz dual Xeons), visualgrid.uncc.edu; Visualization lab data server (4 Tbytes); Compute resources: 52-node (104-processor) University Research Cluster; CA (Certificate Authority). UNC-Asheville resources: transylvania.tr.cs.unca.edu (8-node system). Software: Globus 4.0, Condor. VisualGrid portal
National Attention Listed as one of the portals to use OGCE2
X.509 Certificates X.509 certificates are used to provide security in a grid system. Each user needs a certificate issued by a “certificate authority” (CA). Grid systems use so-called user proxy certificates, which allow one resource to access other resources on the user’s behalf.
Users certified by a local CA UNC-C CA
CAs with Mutual Trust UNC-C CA UNC-A CA GT4
Multiple Grid Nodes With multiple grid nodes, users need: An account on each system, with access control set accordingly. A certificate acceptable to each system (i.e., signed by a CA that system trusts)
Getting an account Go to portal and select “register” New User VisualGrid on-line registration form CA/System Administrator Create accounts, set access control, sign certificate, … Fill in form Provide password and other information Request Confirmation Acknowledgement Contact other grid resource administrators if the user requests an account on their resource
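The two requirements above (an account on each node, and a certificate signed by a CA that node trusts) can be sketched as a toy model. This is an illustration only: it does no real cryptography, does not use the Globus API, and the names `Certificate`, `GridNode`, `authorize`, and the sample users are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Certificate:
    subject: str  # who the certificate identifies
    issuer: str   # name of the CA that signed it (signature check omitted)


@dataclass
class GridNode:
    name: str
    trusted_cas: set = field(default_factory=set)
    accounts: dict = field(default_factory=dict)  # certificate subject -> local account

    def authorize(self, cert):
        """Return the local account for cert, or None if access is refused."""
        if cert.issuer not in self.trusted_cas:
            return None  # not signed by a CA this node trusts
        return self.accounts.get(cert.subject)  # None if no account exists here


if __name__ == "__main__":
    # Mutual trust: each node accepts certificates from both CAs.
    both = {"UNC-C CA", "UNC-A CA"}
    charlotte = GridNode("UNC-C", set(both), {"alice": "alice01"})
    asheville = GridNode("UNC-A", set(both), {})
    cert = Certificate(subject="alice", issuer="UNC-A CA")
    print(charlotte.authorize(cert))  # account exists and issuer is trusted
    print(asheville.authorize(cert))  # issuer trusted, but no account on this node
```

The sketch shows why both conditions are needed: mutual trust between CAs makes the certificate acceptable everywhere, but a user still cannot run jobs on a node where no account has been created.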
UNC-Asheville Bioinformatics hardware accelerator 52-node UNC-Charlotte university research cluster UNC-C Dept of CS grid computing development system 4TB Windows 2003 data server reached through coit-grid02.uncc.edu (samba mount)
Sample portlets Grid Resource Information Portlet CMAQ portlet, main page CMAQ settings portlet Tabs for various CMAQ actions
VisualGrid Links VisualGrid Infrastructure group page VisualGrid portal VisualGrid Portal User’s Guide wiki
Grid-Enabling Bioinformatics Accelerator Project to develop a grid-enabled hardware accelerator for the Smith-Waterman algorithm Uses Xilinx FPGA modules Principal Investigators: – Arun A Ravindran (EE dept) – Arindam Mukherjee (EE dept) EE PhD student – Kushal Datta NSF funding for hardware accelerator received.
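For reference, the Smith-Waterman algorithm computes the best local alignment score between two sequences by dynamic programming. A minimal software sketch follows; the scoring values (match 2, mismatch -1, linear gap -1) are illustrative assumptions, and nothing here reflects how the FPGA design itself is organized.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of strings a and b."""
    prev = [0] * (len(b) + 1)  # previous row of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below 0, which is what makes the alignment local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTXX", "ACGT"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells along an anti-diagonal can be computed in parallel; this regular structure is a common reason the algorithm is targeted at hardware.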
Publication R. K. Karanam, A. Ravindran, A. Mukherjee, C. Gibas, B. Wilkinson, “Using FPGA-Based Hybrid Computers for Bioinformatics Applications – Seamless Integration of FPGAs into Grid computing infrastructures is key to the adoption of FPGAs by bioinformaticians,” Xilinx XCell Journal, 3rd quarter, pp. 80–83, 2006.
Collaboration with SURAGrid Develop and offer Grid course(s) using SURAGrid – Crosses state boundaries. Integrate bioinformatics accelerator into SURAGrid (possible)
Acknowledgements Partial support for the work described here was provided by the National Science Foundation, University of North Carolina Office of the President, and Environmental Protection Agency.
National Science Foundation, “Introducing Grid Computing into the Undergraduate Curricula,” ref. DUE , PI: A. B. Wilkinson, co-PIs Mark Holliday and D. Luginbuhl, $100,000, , Additional Funding,” ref. DUE , PI: B. Wilkinson, $8216,
University of North Carolina Office of President, “A Consortium to Promote Computational Science and High Performance Computing,” PI: B. Kurtz (Appalachian State University) co-PIs: B. Berg, W. Campbell, W. Hightower, M. Holliday, J. Hollingworth, R. Hull, D-H Hwang, S. Lea, Y. Li, S. V. Providence, D. Powell, R. Shore, S. Suthaharan, R. Tashakkori, and B. Wilkinson, total $650,000,
University of North Carolina Office of President, “Fostering Undergraduate Research Partnerships through a Graphical User Environment for the North Carolina Computing Grid,” PI: R. Vetter (UNC-Wilmington), co-PIs: L. Bartolotii, D. R. Berman, R. Boston, J. Brown, C. Ferner, T. Hudson, T. Janicki, N. Martin, M. McClelland, J. Porter, A. Stapleton, and B. Wilkinson, total $557,634,
Environmental Protection Agency, “Proposal to Establish the VisualGrid” PI W. Ribarsky, co-PIs S. Bae, B. Wilkinson, H. Inyang, A. Lu, $485,000, 01/02/ /31/2006.
Questions?