
1 A Regional Analysis Center at the University of Florida
Jorge L. Rodriguez, University of Florida
September 27, 2004
jorge@phys.ufl.edu

2 Outline
– Facility's Function and Motivation
– Physical Infrastructure
– Facility Systems Administration
– Future Plans

3 Facility's Function & Motivation
Operational support for various organizations
– Experimental High Energy Physics: CMS (Compact Muon Solenoid @ the LHC), CDF (Collider Detector @ Fermilab), CLEO (e+e- collider experiment @ Cornell)
– Computer Science: GriPhyN, iVDGL
– "Friends and Neighbors": US Grid computing projects

4 High Energy Physics Activities
US-CMS Tier2 Center
– Monte Carlo (MC) production: a small number of expert, dedicated users; the primary consumer of computing resources
– Support for the US-CMS regional analysis community: a larger number (~40) of less expert users; a large consumer of disk resources
CLEO & CDF analysis and MC simulation
– Local CDF activities in the recent past
– Expect a ramp-up of local CLEO-c activities

5 Computer Science Research
Grid computing research projects
– iVDGL & Grid3 production cluster(s): Grid3 production site; provides resources to hundreds of users via grid "logins"
– GriPhyN & iVDGL Grid development: middleware infrastructure development
– grid3dev and other testbeds: middleware and Grid application development
  GridCat: http://www.ivdgl.org/gridcat
  Sphinx: http://www.griphyn.org/sphinx
  CAVES: http://caves.phys.ufl.edu
All of these need to be tested in real cluster environments

6 The Primary Challenge
All of this needs to be supported with minimal staffing, "cheap" hardware, and moderate expertise

7 Physical Infrastructure

8 Physical Infrastructure
Server Room
– Dedicated floor space from the department
– Our feed into the campus backbone; FLR upgrade to 10 Gbps by 2005
Hardware: currently 75 servers
– Servers: mix of dual PIII & P4 Xeon
– LAN: mix of FastE and GigE
– Total of 9.0 TB of storage, 5.4 TB on dCache: 4U dual-Xeon fileservers w/ dual 3ware RAID controllers; Sun Enterprise with FC and RAID enclosures
– More storage and servers on order: new dedicated analysis farm; an additional 4 TB dual-Opteron system, 10 GigE ready (S2io cards)

9 Video Conferencing
Two Polycom-equipped conference rooms
– Polycom ViewStations (H.323 and H.262)
– Windows XP PCs
Access Grid
– Broadcast and participate in lectures from our large conference room

10 Visitors' Offices and Services
Visitor work spaces
– 6 Windows & Linux desktops
– Expanding visitor workspace: workspaces, printers, LCD projector …
Espresso machine
– Cuban coffee
– Lighthearted conversation

11 Facility Systems Administration: Design Overview

12 Facility Design Considerations
Centralization of systems management
– A single repository of server, cluster, and meta-cluster configuration and description: significantly simplifies maintenance and upgrades; allows for easy resource deployment and reassignment
– Organization is a very, very good thing!
Support multiple versions of the RedHat distribution
– Keep up with HEP experiments' expectations
– Currently we support RH7.3 and RHEL 3
– The future is Scientific Linux (based on RHEL)?

13 Cluster Management Technology
We are pretty much a ROCKS shop!
– ROCKS is an open-source cluster distribution
– Its main function is to deploy an OS on a cluster
– ROCKS is layered on top of the RedHat distribution
– ROCKS is extensible
ROCKS provides us with the framework and tools necessary to meet our design requirements simply and efficiently
http://www.rocksclusters.org

14 ROCKS: the 5-Minute Tour
ROCKS builds on top of RedHat's kickstart technology
– A quick kickstart digression: a kickstart is a single ASCII script
  Lists individual RPMs and/or groups of RPMs to be installed
  Provides staged installation sections: %pre, %packages, %post …
– Anaconda, the installer, parses and processes the kickstart commands and installs the system
  This is in fact what you interact with when you install RedHat on your desktop
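For concreteness, a minimal kickstart file might look like the sketch below; the package names, password hash, and partitioning are illustrative placeholders, not our production configuration.

    # minimal RedHat kickstart sketch (all values illustrative)
    install
    lang en_US
    keyboard us
    timezone America/New_York
    rootpw --iscrypted <hash>
    clearpart --all
    part / --size 4096 --grow
    # RPMs and RPM groups to install
    %packages
    @ Base
    openssh-server
    # shell commands run before package installation
    %pre
    echo "pre-install stage"
    # shell commands run after package installation
    %post
    /sbin/chkconfig sshd on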

15 ROCKS: the 5-Minute Tour, cont.
ROCKS enhances kickstart by
– Providing machinery to push installations to servers
  DHCP: node identity assigned to a specific MAC address
  HTTPS: protocol used to exchange data (kickstart, images, RPMs)
  CGI script: generates the kickstart on the fly
– Providing a kickstart management system
  The kickstart generator parses user-defined XML spec files and combines them with node-specific information stored in a MySQL database
  The complete system description is packaged in components grouped into logical object modules
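To make the DHCP step concrete: each node ends up with a stanza like the following ISC dhcpd sketch, which ROCKS generates from its database (the MAC, address, and hostname here are made up); the node then fetches its generated kickstart from the frontend's CGI script over HTTPS.

    # illustrative dhcpd.conf stanza; ROCKS writes these from its MySQL DB
    host grinux01 {
        hardware ethernet 00:0e:0c:12:34:56;   # node identity keyed to the MAC
        fixed-address 10.255.255.251;          # private-LAN address for this node
        option host-name "grinux01";
    }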

16 ROCKS: the 5-Minute Tour, cont.
The standard ROCKS graph describes a single cluster with service nodes
[Graph figure] Note the use of "rolls"

17 ROCKS' MySQL/XML frontend
[Diagram: uflorida-frontend drives several clusters (ufloridaPG, ufloridaDGT, ufgrid01) through appliance types such as gatekeeper-pg, gatekeeper-dgt, gatekeeper-grid01, compute-pg, compute-dgt, and compute-grid01, realized on physical nodes grinux01…grinux40 and grinuxN…grinuxM.]
XML graphs: appliances are like the classes
ROCKS' MySQL database: physical servers are the objects; it stores global variables, MAC addresses, node names, distribution, membership, appliances, …
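The class/object analogy can be read straight out of the database. As a sketch only — the table and column names below are our guess at the flavor of the schema, not the actual ROCKS 3.2 layout:

    -- hypothetical query: which physical node instantiates which appliance
    SELECT a.Name AS appliance, n.Name AS node
    FROM nodes n
    JOIN memberships m ON n.Membership = m.ID
    JOIN appliances a  ON m.Appliance  = a.ID
    ORDER BY a.Name, n.Name;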

18 Facility Systems Administration Implementation

19 ROCKS at UFlorida
Current installation is based on ROCKS 3.2.0
Basic graph architecture modified to meet our requirements
– A single frontend manages multiple clusters and service nodes
– Supports dual distributions: RH7.3 & RHEL 3
– Direct interaction with the ROCKS MySQL database
– Extensive modification of the XML trees
  The XML tree is based on an older ROCKS version, 2.3.2
  Our own "uflorida" graphs, one for each distribution
  Many changes & additions to the standard XMLs
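For flavor, a graph file is XML that wires node files together with edges; a fragment of a custom graph might look like the sketch below. The edge syntax is from memory of that ROCKS era, and the node names (client, globus-gatekeeper) are illustrative rather than taken from our actual XML.

    <?xml version="1.0" standalone="no"?>
    <graph>
      <description>uflorida graph fragment (illustrative)</description>
      <!-- compute appliances inherit the generic client configuration -->
      <edge from="compute-pg" to="client"/>
      <!-- gatekeepers layer grid services on top of the same base -->
      <edge from="gatekeeper-pg" to="client"/>
      <edge from="gatekeeper-pg" to="globus-gatekeeper"/>
    </graph>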

20 The uflorida XML graphs
[Screenshots: uflorida.xml for RHEL 3 servers; uflorida.xml for RH7.3 servers]

21 The Servers
[Network diagram: RHEL 3 and RH7.3 servers split between a private LAN and the WAN: uflorida-frontend, gatoraid1, gatoraid2, nfs-homes, ufdcache; compute nodes grinux01…grinux40, grinux41…grinuxN, grinuxN+1…grinuxM; login/analysis servers alachua, archer, micanopy; grid nodes ufgrid01–ufgrid04; GriPhyN, ufloridaDGT, ufloridaPG.]
Access tiers shown in the diagram:
– Private LAN: no user login and very limited or no access from the WAN
– Grid users only, and limited access from the WAN
– All users, and limited access from the WAN

22 The Services
[Same server diagram, RHEL 3 and RH7.3.]
The big kahuna, uflorida-frontend: our stripped-down version of the ROCKS frontend
– ROCKS administration node: RPM server, ROCKS DB server, kickstart generation, DHCP server, admin NFS server, etc.
– No users, no logins, no home areas …; strict firewall rules
Primary DNS server
NFS servers: users' home areas, users' data
User Interface machines: grid user login; access to the grid via Condor-G
GriPhyN & Analysis Servers: user login machines, user environment
Other services: CVS pserver, webserver
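To illustrate the Condor-G path from a User Interface machine, a submit description of that era looked roughly like the following; the gatekeeper contact string is a placeholder, not necessarily our actual hostname.

    # job.sub -- Condor-G submit description (gatekeeper name is a placeholder)
    universe        = globus
    globusscheduler = ufloridapg.phys.ufl.edu/jobmanager-condor
    executable      = /bin/hostname
    output          = job.out
    error           = job.err
    log             = job.log
    queue

A user would then run condor_submit job.sub after obtaining a grid proxy.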

23 Batch System Services
[Same server diagram, RHEL 3 and RH7.3.]
Gatekeeper / master nodes
– Grid access only: GRAM, GSIFTP …
– Other services kept to a minimum
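For example, a user holding a valid proxy could exercise these doors with the standard pre-WS Globus clients; the hostname below is a placeholder.

    # obtain a proxy, run a job through GRAM, move a file through GSIFTP
    grid-proxy-init
    globus-job-run ufloridapg.phys.ufl.edu/jobmanager-condor /bin/hostname
    globus-url-copy file:///tmp/input.dat gsiftp://ufloridapg.phys.ufl.edu/tmp/input.dat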

24 dCache Disk Storage Service
[Same server diagram, RHEL 3 and RH7.3.]
dCache with SRM: a virtual file system (pnfs) with caching
– dCache pool nodes: 40 x 50 GB partitions
– dCache administrator "admin door"
– SRM & dCache webserver
– RAID fileserver: the entire 2 TB of disk on dCache
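From a user's point of view, pnfs looks like an ordinary file namespace and data moves through the dcap and SRM doors; a sketch of typical client commands follows, with host, ports, and paths as placeholders.

    # read a file out of dCache over dcap (host/port/path illustrative)
    dccp dcap://ufdcache.phys.ufl.edu:22125/pnfs/phys.ufl.edu/data/file.root /tmp/file.root
    # the same file via the SRM interface
    srmcp srm://ufdcache.phys.ufl.edu:8443/pnfs/phys.ufl.edu/data/file.root file:////tmp/file.root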

25 The uflorida XML graphs
[Graph screenshots]
– RHEL 3 graph: the production cluster, with dCache pool compute nodes
– RH 7.3 graph: the analysis cluster; NIS server and clients; analysis environment

26 UFlorida Website
Currently just the default ROCKS website, including the Ganglia monitoring pages; ours is accessible only from the .ufl.edu domain

27 Status and Plans

28 Facility's Status
Participated in all major CMS data challenges
– Since fall 2001: a major US contributor of events
– Since the end of 2002: grid-only MC production
Contributed to CDF analysis
UFlorida group: 4 faculty & staff and 2-3 students
Recently re-deployed the infrastructure
– Motivated mostly by security concerns
– Added new clusters for additional user communities
– Ramp up to support a full US-CMS Tier2 center

29 Facility's Future
Plan to redeploy with ROCKS 3.3.x
– Recast our own 2.3.2-based XMLs as real 3.3.x
– Gain access to new features: recast our tweaked and tuned services in terms of Rolls; make use of the new ROCKS-on-the-WAN support
Improve collaboration with our peers
– Within US-CMS: FNAL (Tier1)
– Statewide: FIU, FSU (Tier3); campus-wide: HPC
– Other Tier2 centers: US CMS & ATLAS

30 Facility's Future
New hardware is always coming
– New dedicated analysis and file servers on order: 6 new dual-Xeon-based servers, 8.0 TB of disk
– New Opteron systems on order: participate in the SC '04 bandwidth challenge (CIT to JAX); connect to the new FLR 10 GigE network equipment
– Official Tier2 center status will bring new hardware: approximately 120 servers and an additional 200 TB of storage

31 Summary
We have successfully built, from scratch, a computing facility at the University of Florida
– Supports HEP experiments (LHC, FNAL …)
– Supports Computer Science activities
– Expects a much more active analysis/user community
The infrastructure is designed to support a large increase in hardware, serving a larger community of users, in anticipation of LHC turn-on and beyond

