Computer Systems Lab
The University of Wisconsin - Madison
Department of Computer Sciences

Linux Clusters
David Thompson
~thomas/madlug
Overview
  The Computer Systems Lab (CSL)
  Clusters
  The condor/db cluster
  Scalable Linux Administration
Computer Systems Lab
  Purpose
  Staff
  Resources
Purpose
  “To support the research and teaching missions of the Department of Computer Sciences”
Staff
  8 Full Time
  Part Time
Responsibilities
  Networks
    –Gigabit, 100BaseT, ATM, FDDI
    –Cisco, Foundry routers
    –3Com, HP, Cisco switches
Responsibilities (cont.)
  Operating Systems
    –Solaris, Linux, Digital Unix, AIX, IRIX, NT
  Applications
    –compilers, dbs, simulators, image processing, ...
Responsibilities (cont.)
  641 software packages installed
    –69 Gbytes
    –multiple versions
    –each package installed for several architectures
    –several thousand builds
Responsibilities (cont.)
  Workstations
    –600 PCs (including cluster)
    –200 Sparcs
    –15 Alphas
    –others
  5600 user home directories
    –69 Gbytes
Responsibilities (more)
  AFS
    –1 Tbyte of ubiquitous file space
    –14 file servers, 3 db servers
    –95% client cache hit rate
  Backups
    –2 week epoch cycle (1 Tbyte)
    –Daily incrementals
Clusters
  Definitions
  Architectures
  Example Applications
Definitions
  NOW - Network of workstations
  COW - Cluster of workstations
    –“Some degree of network isolation”
    –“Dedicated function”
Architectures
  N-dimensional arrays
    –“previous & next” neighbor
    –hypercube
  Simple Network
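As a small illustration of the two neighbor schemes named on this slide, the sketch below computes a node's “previous & next” neighbors in a one-dimensional array and its neighbors in a hypercube, where two nodes are linked when their binary IDs differ in exactly one bit. The function names, the zero-based node numbering, and the wraparound at the ends of the 1-D array are assumptions made for illustration; the talk does not specify them.

```python
def ring_neighbors(node: int, size: int) -> tuple[int, int]:
    """Return the (previous, next) neighbors of `node` in a 1-D array of
    `size` nodes. Wraparound at the ends is an assumption of this sketch."""
    if not 0 <= node < size:
        raise ValueError("node out of range")
    return ((node - 1) % size, (node + 1) % size)


def hypercube_neighbors(node: int, dimensions: int) -> list[int]:
    """Return the neighbors of `node` in a hypercube of 2**dimensions nodes.

    Flipping one bit of the node ID at a time yields one neighbor per dimension.
    """
    if not 0 <= node < 2 ** dimensions:
        raise ValueError("node out of range")
    return [node ^ (1 << bit) for bit in range(dimensions)]


if __name__ == "__main__":
    print(ring_neighbors(0, 8))        # neighbors of node 0 in an 8-node ring
    print(hypercube_neighbors(5, 3))   # neighbors of node 5 in a 3-D hypercube
```

Each hypercube node has exactly as many links as there are dimensions, so the wiring per node grows with the logarithm of the cluster size.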
Architectures
  Distributed
    –MPI
    –PVM
    –condor
Examples - The Hive
Redundant Networks
Cluster Applications
  Image Analysis
    –tilton.html
  Parallel Virtual File System (PVFS)
  Speech Recognition
Cluster Applications (cont.)
  Physics
    –Viscoelasticity
    –Seismology
    –Big Bang
Cluster Applications (cont.)
  Physics (cont.)
    –Laser Interferometer Gravitational-Wave Observatory (LIGO)
    –NA49: Large Acceptance Hadron Detector for an Investigation of Pb-induced Reactions at the CERN SPS
Computer Science Cluster
  Two connected clusters
    –Dual Xeon 550 MHz, 512 KB cache, 1 Gbyte RAM, Ultra2 SCSI 9 Gbyte boot disk, tulip network
    –64 node compute cluster
    –36 node db cluster with 4 extra 9 Gbyte disks and GNIC-II Gigabit Ethernet
    –Red Hat Linux 6.1, kernel
Cluster Architecture
Cluster Picture
Scalable Linux Administration
  What
  Why
  Installation
  Maintenance
Scalable Admin - What
  Leverage Control systems
  Remote monitoring
  Operating system upgrades
  Centralized Services
    –kerberos, afs, logging
Scalable Admin - Why
  Consistent user view
    –Available applications
    –Stability
  Predictable Admin Environment
  Security
Scalable Admin - Installation
  Red Hat Kickstart
    –Configuration file: network config, NFS locations, disk layout, RPMs to install
    –Boot disk, NFS, or bootp/DHCP
    –Post-install script
    –redhat-6.1/i386/doc/HOWTO/KickStart-HOWTO
Sample Kickstart Script
  # $Id: ks.cfg,v /10/07 18:57:24 thomas Exp $
  lang en_US
  network --bootproto bootp
  nfs --server pinstall.cs.wisc.edu --dir /install/redhat-6.0/i386
  keyboard us
  zerombr yes
  clearpart --all
  part / --size 100
  #part /tmp --size 300
  part /var --size 75
  part /usr --size 570
  part swap --size 127
  part /var/vice/cache --size 120
  part /local --size 2 --grow --maxsize 4000
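To check that a partition layout like the one in the sample fits on a node's boot disk, the `part` lines can be parsed and their `--size` values (megabytes in Kickstart) summed. The following is a minimal sketch, assuming the layout is available as plain text; the `SAMPLE` string copies the partition lines from the slide, and the function name `parse_partitions` is hypothetical, not something the talk provides.

```python
import shlex

# Partition lines copied from the sample script on the slide.
SAMPLE = """\
part / --size 100
#part /tmp --size 300
part /var --size 75
part /usr --size 570
part swap --size 127
part /var/vice/cache --size 120
part /local --size 2 --grow --maxsize 4000
"""


def parse_partitions(text: str) -> dict[str, int]:
    """Map each mount point to its --size value in MB.

    Commented-out lines (such as the /tmp entry) are skipped because
    shlex drops everything after '#' when comments=True.
    """
    sizes: dict[str, int] = {}
    for line in text.splitlines():
        tokens = shlex.split(line, comments=True)
        if len(tokens) >= 2 and tokens[0] == "part" and "--size" in tokens:
            sizes[tokens[1]] = int(tokens[tokens.index("--size") + 1])
    return sizes


if __name__ == "__main__":
    layout = parse_partitions(SAMPLE)
    # Minimum space requested before /local grows toward its --maxsize.
    print(sum(layout.values()), "MB minimum")
```

The sum is only the minimum requested space: `/local` starts at 2 MB and, because of `--grow --maxsize 4000`, may expand up to 4000 MB if the disk has room.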
Scalable Admin - Maintenance
  Update RPMs
    –Create list of RPMs, versions, and files to install
    –Each computer updates based on list
  Special files
    –package (afs)
    –cfengine (gnu)
    –config files (filedist)
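The list-driven update described on this slide can be sketched as a comparison between a central list of desired package versions and what a host currently has installed. This is only an illustration under stated assumptions: the talk does not show the list format or the update tool, so the dictionaries, the package names, and the function `plan_updates` are all hypothetical. The sketch also treats any version mismatch as "needs install" and does not attempt RPM's real version-ordering rules.

```python
def plan_updates(desired: dict[str, str], installed: dict[str, str]) -> list[str]:
    """Return sorted 'name-version' strings a host should install.

    A package is selected when it is missing from the host or when its
    installed version string differs from the desired one (a simplifying
    assumption; no newer/older comparison is made).
    """
    plan = []
    for name, version in sorted(desired.items()):
        if installed.get(name) != version:
            plan.append(f"{name}-{version}")
    return plan


if __name__ == "__main__":
    # Hypothetical package names and versions, for illustration only.
    desired = {"pkg-a": "1.0-2", "pkg-b": "2.1-1"}
    installed = {"pkg-a": "1.0-1"}
    for item in plan_updates(desired, installed):
        print("would install", item)
```

Because every host computes its own plan from the same list, a machine that was powered off during a rollout catches up the next time it runs the comparison.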