Published by Irene Norris. Modified over 8 years ago.
1
Condor: The CCLRC Experience
John Kewley, Grid Technology Group, e-Science Centre
UK Condor Week 2004
2
John Kewley, Grid Technology Group, e-Science Centre
11th October 2004, UK Condor Week

Outline
o The Challenge of Condor on Personal Workstations
o The Pools: configuration and status
o Our Users
3
The Challenge of Condor on Personal Workstations
4
(Under)abundance of machines
o Windows workstations (but centrally administered)
o Linux desktops (but administered by “owners”)
o Commodity clusters (unavailable, many being decommissioned, no access to root)
o Servers for CVS, backup, external web access, Access Grid (production systems – mission critical)
o Training machines (turned off when not in use – only 4 at present)
o HPCx (No comment!)
5
Security / Paranoia
o 2-zone firewall separates machines
o No root access to server machines
o No root access to personal Linux workstations
o Personal firewalls
“Not on MY machine you’re not”
6
Site Firewalls + Flocking
[Figure: Internal Pool and External Pool]
7
Site Firewall(s)
o 2 levels of firewall
o Every request for a change to the site firewall needs justification and takes up to 2 working days.
o In theory, every submit node needs to be able to talk to some fixed (configurable) and ephemeral ports on every execute node as well as on the central node.
o In addition, both UDP and TCP need to be opened.
o It would be good to have a more precise definition of exactly what is necessary.
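To give a feel for how quickly the "in theory" requirement grows, here is a minimal sketch that enumerates the openings implied by the bullets above: every submit node to every execute node and to the central node, over both TCP and UDP. The host names and the port range are hypothetical, and the function is an illustration only, not a statement of what Condor strictly requires.

```python
from itertools import product


def required_openings(submit_nodes, execute_nodes, central_node, port_range):
    """List (source, destination, protocol, ports) openings implied by the slide."""
    low, high = port_range
    # Every submit node must reach every execute node and the central node.
    destinations = sorted(set(execute_nodes) | {central_node})
    openings = []
    for src, dst, proto in product(sorted(submit_nodes), destinations, ("tcp", "udp")):
        if src == dst:
            continue  # a node needs no firewall opening to itself
        openings.append((src, dst, proto, f"{low}-{high}"))
    return openings


# Hypothetical pool: two workstations that submit and execute, one execute-only
# node, and a central node. The port range is an assumption for illustration.
rules = required_openings(
    submit_nodes=["ws1", "ws2"],
    execute_nodes=["ws1", "ws2", "ws3"],
    central_node="cm",
    port_range=(9600, 9700),
)
print(len(rules), "openings needed")
```

The count grows with the product of submit nodes and destinations, which is why each new submit node means another round of firewall change requests.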
8
Firewalls within a Condor Pool
o Some resource owners have firewalls on their personal workstations.
o Since Condor needs each submit node to be able to talk to every potential execute node, every firewall in the pool must be opened to each new submit node when it is added.
o Between adding the new node and the firewalls being updated, the firewalled nodes will be unavailable for use.
Or are they? Maybe someone should tell Condor!
9
Adding a new machine to the pool
o If we add a new machine to the pool, the existing firewalls may not have anticipated this.
o The firewalls will likely block this new machine.
o A job from the newly added machine may still be matched to a firewalled resource.
o This job will not be able to run.
o Parts of the system can jam as a result:
  – condor_q on submitting node
  – Subsequent parts of the submit script
  – (maybe also parts of the central node)
10
Private networks
o Similar “jams” occur if part of your pool (or flock of pools) is on a network that is unavailable to some of the other nodes.
o How can we permit jobs from submit nodes that can access the private network to run on these nodes whilst preventing Condor sending jobs from other submit nodes there?
11
Workaround Solution
o Mirror the firewall settings using ClassAds.
o The firewall settings can be updated at the whim of the machine owner as long as the ClassAd settings are kept in step.
o New users can be added at any time without disruption.
For more details, see my talk in the Security WG
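The idea of mirroring firewall settings in ClassAds can be pictured with a toy matchmaker. This is a sketch under stated assumptions, not the configuration used at CCLRC: the attribute names `FirewallAllows` and `SubmitHost`, the host names, and the matching function are all hypothetical stand-ins for whatever the Security WG talk describes.

```python
def can_match(job_ad, machine_ad):
    """Match only if the machine's mirrored firewall list admits the submit host."""
    allowed = machine_ad.get("FirewallAllows")  # hypothetical attribute name
    if allowed is None:
        return True  # no personal firewall advertised for this machine
    return job_ad["SubmitHost"] in allowed


# Hypothetical machine ads: one unfirewalled, one with a personal firewall.
machines = [
    {"Name": "open-node", "FirewallAllows": None},
    {"Name": "guarded-node", "FirewallAllows": {"submit-a"}},
]


def matches(job_ad):
    """Names of the machines a job from this submit host may be sent to."""
    return [m["Name"] for m in machines if can_match(job_ad, m)]


new_job = {"SubmitHost": "submit-new"}

# Before the owner opens the firewall, the new submit node is never matched
# to the guarded machine, so nothing jams.
print(matches(new_job))

# The owner opens the firewall and mirrors the change in the advertised list.
machines[1]["FirewallAllows"].add("submit-new")
print(matches(new_job))
```

The point of the design is that a job which the firewall would block is never matched in the first place, so a new submit node can join the pool before every owner has updated their settings.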
12
Other problems
o Lack of root access – I had to go and grovel to each resource owner not only for permission to install Condor, but for them to log me in as root so I could do the installation.
o Many different Linuxes. Condor installs neatly with the rpm on Red Hat family Linuxes. I had no trouble on the other ones, but the additional installation steps I had to perform for updating init.d were different in each case. I now use an updated version of the condor.root issued with the release.
13
The Pools: Configuration and Status
14
Strategy
o “Community” approach: everyone has the right to run jobs from their machine.
o 2 Condor pools
  – One for internal use only
  – One for access by external collaborators and testing
15
Internal Pool
o Comprises a central node, personal workstations and other “spare” machines.
o Inside the “thick” part of the site firewall, so no submission access from outside DL (although we expect to flock to/from other CCLRC sites).
o Build up trust by gradually growing the pool.
16
External Pool
o Comprises the remains of a “broken-down” cluster.
o Originally a dual “head” node plus 8 workers on a private network. Now the dual node + 4 standalone nodes.
o Inside a “thin” firewall, so external access can be granted to collaborators (e.g. ETF/OMII Distributed Build and Test project).
o Originally could be flocked to from the Internal Pool.
17
Configuration (1)
o Always run jobs (this may change at some point).
o The majority of machines are set up for both execute and submit (even the central node at present). There is only one node set up for submit only.
o Additional ClassAds
  – OS flavour and version
  – To mirror firewall settings (see Firewall “Avoidance” talk in WG2 tomorrow)
o Dual-boot nodes are configured for Condor in both of their manifestations.
18
Configuration (2)
o All machines set up the same way (in /opt/condor)

condor.sh for installation in /etc/profile.d:

    # Point Bourne-style shells at the shared Condor installation
    CONDOR_ROOT=/opt/condor
    export CONDOR_CONFIG=${CONDOR_ROOT}/etc/condor_config
    export PATH=${PATH}:${CONDOR_ROOT}/bin

condor.csh for installation in /etc/profile.d:

    # Same settings for csh-style shells
    set condor_root = /opt/condor
    setenv CONDOR_CONFIG "${condor_root}/etc/condor_config"
    set path = ( ${path} ${condor_root}/bin )

o Common condor_config.local for inclusion
o Common condor init.d script with several enhancements over the packaged one
19
The Pools: Configuration and Status
20
Internal Pool Stats
o 11 resource “owners” at 2 sites
o 11 OS variants
o 1 submit-only node (head node of e-HTPX cluster – Red Hat 9)
o 27 processors on 21 execution machines (including central node)

6 Windows
  – 3x Windows XP Professional
  – 2x Windows 2000 Professional
  – 1x Windows NT 4.0 Workstation
21 Linux
  – 6x SuSE Linux 9.0
  – 2x SuSE Linux 8.0
  – 5x White Box Enterprise Linux 3.0
  – 1x Red Hat Enterprise Linux 3.0
  – 3x Red Hat Linux 9
  – 2x Red Hat Linux 8.0
  – 1x Mandrake Linux 10.0
  – 1x Gentoo Linux 1.4
21
condor_status

    $ condor_status -f "%-6s" Arch -f "%-7s" OpSys \
          -f " %-12s" OPSYS_FLAVOUR \
          -f "\n" OpSys | sort | uniq -c
          1
          1 INTEL LINUX   Gentoo
          1 INTEL LINUX   Mandrake10
          2 INTEL LINUX   RH80
          3 INTEL LINUX   RH9
          1 INTEL LINUX   RHEL2
          2 INTEL LINUX   SUSE80
          6 INTEL LINUX   SUSE90
          5 INTEL LINUX   WBL
          1 INTEL WINNT40
          3 INTEL WINNT50
          2 INTEL WINNT51
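As a cross-check, the per-flavour counts in the listing can be totalled by OS family and compared with the figures on the Internal Pool Stats slide. The sketch below copies the counts from the listing (leaving out the lone count line that has no fields) and groups them with a Python `Counter`. The grouping of WINNT40, WINNT50 and WINNT51 under "Windows" is the only interpretation added here.

```python
from collections import Counter

# Counts transcribed from the condor_status listing on the slide.
listing = """\
1 INTEL LINUX Gentoo
1 INTEL LINUX Mandrake10
2 INTEL LINUX RH80
3 INTEL LINUX RH9
1 INTEL LINUX RHEL2
2 INTEL LINUX SUSE80
6 INTEL LINUX SUSE90
5 INTEL LINUX WBL
1 INTEL WINNT40
3 INTEL WINNT50
2 INTEL WINNT51
"""

totals = Counter()
for line in listing.splitlines():
    # The flavour column is absent on the Windows rows, hence the starred name.
    count, arch, opsys, *flavour = line.split()
    family = "Windows" if opsys.startswith("WINNT") else "Linux"
    totals[family] += int(count)

print(dict(totals), "total:", sum(totals.values()))
```

The totals agree with the 21 Linux and 6 Windows processors (27 in all) quoted on the Internal Pool Stats slide.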
22
External Pool Stats
o 2 resource “owners”
o 2 OS variants
o Can flock to/from pools at 4 other sites
o In the process of adding GSI security
o 5 machines containing 6 Linux processors:
  – 2x Red Hat Linux 7.3
  – 4x White Box Enterprise Linux 3.0 (currently disabled since inaccessible from outside due to firewall restrictions)
23
Our Users
24
e-HTPX
The e-HTPX project is developing a Grid-based e-science environment to allow structural biologists remote, integrated access to web and grid technologies associated with protein crystallography.
http://clyde.dl.ac.uk/e-htpx/index.htm
25
e-HTPX Workflow
A single all-encompassing web interface from which users can initiate, plan, direct and document the experimental workflow either locally or remotely from a desktop computer.
Stage 1 – Select protein target
Stage 2 – Crystallization of protein
Stage 3 – Data collection (X-ray diffraction images, scaling and integration)
Stage 4 – Structure solution (HPC data processing to derive digital protein model)
Stage 5 – Submit model into public database
[Figure labels: Start, Finish, Target Selection, Structure Solution]
26
e-HTPX Structure Solution
o Given a target sequence for a protein, the Protein Data Bank (PDB) is searched for similar sequences.
o The corresponding structures are downloaded for use in a high-throughput system for determining the structure of the target protein.
o Depending on the protein structure size and matching criteria, up to several hundred structures can be downloaded. The modelling for these is carried out by submitting multiple jobs to the cluster and/or Condor pool.
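Fanning out one modelling job per downloaded structure might look roughly like the sketch below, which builds the text of a Condor submit description with one `queue` statement per structure. The executable name, the log file name and the structure identifiers are made up for illustration; the e-HTPX project's actual submission mechanism is not described in the slides.

```python
def submit_description(executable, structure_ids):
    """Build submit-description text with one queued job per structure."""
    lines = [
        "universe = vanilla",
        f"executable = {executable}",
        "log = modelling.log",  # hypothetical shared log file name
    ]
    for sid in structure_ids:
        # Per-job settings, followed by a queue statement for that job.
        lines += [
            f"arguments = {sid}",
            f"output = {sid}.out",
            f"error = {sid}.err",
            "queue",
        ]
    return "\n".join(lines) + "\n"


# Hypothetical executable and placeholder structure identifiers.
text = submit_description("run_model.sh", ["1abc", "2xyz"])
print(text)
```

The resulting text would be saved to a file and passed to `condor_submit`. With several hundred structures the same loop simply emits several hundred `queue` statements.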
27
e-HTPX Structure Solution
28
CCP1 / GAMESS-UK
CCP1: “The Electronic Structure of Molecules”
http://www.ccp1.ac.uk/
GAMESS-UK is a multi-method ab initio molecular electronic structure program.
29
CCP1 / GAMESS-UK
o GAMESS-UK is a quantum-mechanical molecular modelling program used by chemists, physicists and biologists to run molecular calculations.
o Given the nuclear coordinates of a molecule, GAMESS-UK calculates a wavefunction that describes its electronic properties.
o From the wavefunction, various molecular properties (e.g. shape, energetics and reactivity) can be calculated.
http://www.cfs.dl.ac.uk/
30
GAMESS-UK + Condor
The following are being investigated:
o Building GAMESS-UK and running its tests in a variety of environments (OS, compilers, libraries)
o Using the pool to build release packages of a cut-down evaluation version of the software.
o Using Condor as it is intended: submitting many jobs to ascertain.
31
ETF “Build and Test” Testbed
o The external pool is part of the ETF “Build and Test” testbed.
o Software bundles are distributed to a variety of OS types around the flocked pool for building and testing.
o This type of (flocked) pool relies on heterogeneity; small numbers of each type are all that are required.
http://polaris.ecs.soton.ac.uk:65000/
http://wiki.nesc.ac.uk/read/sfct?HomePage
32
Other non-HTC Uses
o I want to ensure my code compiles without warnings and/or runs its basic tests
  – On as many OSs as possible
  – With as many different compilers as possible
o I want to perform a release build of my product for platform X, but I only have accounts on A, B and C.
o I have several server-licensed products and many potential occasional users. How can these be made available to them more easily (within the bounds of the licence of course!)?
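Targeting a build at one particular platform comes down to a requirements expression that names the operating system and, where the pool advertises it, the OS flavour. The sketch below assembles such expressions from the `OpSys` values and the site-specific `OPSYS_FLAVOUR` attribute seen in the condor_status listing earlier in the talk. The helper function and the chosen platform list are hypothetical.

```python
def requirements_for(opsys, flavour=None):
    """Build a requirements expression selecting one platform."""
    expr = f'(OpSys == "{opsys}")'
    if flavour is not None:
        # OPSYS_FLAVOUR is the pool's own additional ClassAd attribute.
        expr += f' && (OPSYS_FLAVOUR == "{flavour}")'
    return expr


# A hypothetical selection of platforms to build and test on.
platforms = [("LINUX", "SUSE90"), ("LINUX", "RH9"), ("WINNT51", None)]
exprs = [requirements_for(opsys, flavour) for opsys, flavour in platforms]
for expr in exprs:
    print("requirements =", expr)
```

Each printed line could go into its own submit description, giving one build job per platform without needing an account on any of the target machines.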
33
In Conclusion
34
Summary
o 12 brave souls have offered up their personal workstations so others can run arbitrary vanilla jobs.
o Installations have been made on 12 different operating systems.
o Both pools are now in use. Provision of administrative support is underway – web page, user guide, etc.
o Distributed build is great!
o Firewalls are not (although I now understand firewalls a lot better)!
35
Final Thoughts
o Setting up a Condor pool of personal workstations requires considerable coaxing, convincing, coercion and cajoling.
o Flocking through firewalls should be easier. Something needs doing, at least for flocking.
o Distributed build can be very useful, but Condor’s default ClassAds could do with extending (at least to describe the OS more accurately).
o What use can be made of pools which are seriously heterogeneous?