1
Installing Galaxy on a cluster:
issues around the DB server, queue system, external authentication, etc.
Nikolay Vazov
University Center for Information Technologies (USIT)
University of Oslo, Norway
2
Swiss Galaxy Workshop, Wednesday, October 3rd, Bern
The new UiO HPC cluster: Abel (in operation since October 1st 2012)
Some facts about the Abel cluster:
Ranked 96 in the world Top 500 list with 178.6 TFlops
178.6 TFlops correspond roughly to ca. 2700 PCs
The fastest in Norway, 3rd in Scandinavia
Ranked 68 in the Green500 list
652 compute nodes and 20 administration nodes
All compute nodes on the cluster have a minimum of 64 GB RAM and 16 physical CPU cores, and are connected by FDR (56 Gbps) InfiniBand
Cores used for computing: … (correspond to … quad-core PCs)
400 TB shared disk
350 TB of local disk on the compute nodes
Compute nodes have a total of 48 TB RAM
Power consumption: 230 kW (full load)
Trivia: all the nodes were mounted in 14 hours (approx. 1'15" per node!)
3
The existing service – the Bioportal
4
Bioportal features - jobs
5
Bioportal features - files
6
Galaxy in Abel - configuration
Components shown in the diagram:
External authentication (FEIDE) and locally registered users
Apache proxy
Paster (WSGI)
PostgreSQL DB server, located on a different host, reached over an SSL connection
Interface between Galaxy and SLURM: DRMAA
Job scheduler: SLURM
Cluster with compute nodes
8
Job scheduling with Galaxy
Galaxy: specifies the job runners
DRMAA library: generic interface to various scheduling systems
SLURM: schedules the jobs (client/server)
9
Job scheduling Galaxy -> DRMAA -> SLURM
Galaxy server is outside the cluster. We prefer this to having the Galaxy server be part of the cluster.
Galaxy, DRMAA and SLURM are located on an NFS-mounted partition.
Galaxy: universe_wsgi.ini
    # -- Job Execution
    # Comma-separated list of job runners to start. local is always started. If
    # ... The runners currently available are 'pbs' and 'drmaa'.
    start_job_runners = drmaa
    # The URL for the default runner to use when a tool doesn't explicitly define a
    # runner below.
    default_cluster_job_runner = drmaa:///
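As a rough illustration of the two settings above, the following Python sketch reads them from an INI fragment and splits the runner URL into its parts. The `[app:main]` section header, the helper names, the `local:///` fallback, and the reading of the text between `://` and the trailing `/` as scheduler options are assumptions made for this example; none of them is shown on the slides.

```python
import configparser

# Fragment modelled on the slide; the [app:main] header is an assumption.
SAMPLE_INI = """\
[app:main]
start_job_runners = drmaa
default_cluster_job_runner = drmaa:///
"""


def read_job_runner_settings(ini_text):
    """Return (list of runners to start, default runner URL)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    section = parser["app:main"]
    raw = section.get("start_job_runners", "")
    runners = [name.strip() for name in raw.split(",") if name.strip()]
    # Fallback value is an assumption for this sketch.
    default_url = section.get("default_cluster_job_runner", "local:///")
    return runners, default_url


def split_runner_url(url):
    """Split 'scheme://options/' into (scheme, options)."""
    scheme, sep, rest = url.partition("://")
    if not sep:
        raise ValueError("not a runner URL: %r" % url)
    if rest.endswith("/"):
        rest = rest[:-1]  # drop only the single trailing slash
    return scheme, rest


if __name__ == "__main__":
    runners, url = read_job_runner_settings(SAMPLE_INI)
    print(runners, split_runner_url(url))
```

With the values from the slide, the options part is empty, which would leave all scheduler settings to the DRMAA side.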
10
Job scheduling Galaxy -> DRMAA -> SLURM
    export DRMAA_PATH=/kevlar/projects/drmaa/lib/libdrmaa.so.1.0.2
    export SLURM_DRMAA_CONF=/etc/slurm_drmaa.conf

    hpc-dev01 etc# cat slurm_drmaa.conf
    job_categories: {
      default: "-A staff -p normal --mem-per-cpu= comment=hello",
    }
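Before starting Galaxy it can help to confirm that both variables are set and point at real files. The sketch below is one hypothetical way to do that in Python; the function name is invented, and the variable names are simply the ones exported on the slide.

```python
import os

# Variable names as exported on the slide.
REQUIRED_VARS = ("DRMAA_PATH", "SLURM_DRMAA_CONF")


def check_drmaa_environment(environ=None):
    """Return a list of problems; an empty list means the checks passed."""
    environ = os.environ if environ is None else environ
    problems = []
    for name in REQUIRED_VARS:
        value = environ.get(name)
        if not value:
            problems.append("%s is not set" % name)
        elif not os.path.isfile(value):
            problems.append("%s points to a missing file: %s" % (name, value))
    return problems


if __name__ == "__main__":
    for problem in check_drmaa_environment():
        print("WARNING:", problem)
```

The check only looks at the environment and the file system; it does not try to load the library or parse the configuration file.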
11
Job scheduling Galaxy -> DRMAA -> SLURM
Plus a couple of changes in the DRMAA egg (drmaa-0.4b3-py2.6.egg):
Find munge
Display the web form to specify node, cores, memory, partition, etc.
Parse the data from the web form and write the resulting string to <path-to-galaxy>/database/pbs/slurm_settings.tmp
Create a real sbatch file, add missing parameters, module load, etc., and send the job to the cluster
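The last two steps can be sketched roughly as follows. This is not the code from the modified egg, which the slides do not show; it is a minimal Python illustration of turning web-form values into an sbatch script with defaults for missing parameters. The function name, the form field names, and the default values for nodes, tasks, memory and time are hypothetical; only the account `staff` and partition `normal` come from the slides. It also skips the intermediate slurm_settings.tmp file.

```python
def build_sbatch_script(form, command, modules=()):
    """Render an sbatch script from web-form values, filling in defaults."""
    # Hypothetical defaults; 'staff' and 'normal' are taken from the slides.
    defaults = {
        "account": "staff",
        "partition": "normal",
        "nodes": "1",
        "ntasks_per_node": "1",
        "mem_per_cpu": "1G",
        "time": "01:00:00",
    }
    unknown = sorted(set(form) - set(defaults))
    if unknown:
        raise ValueError("unknown form fields: %s" % ", ".join(unknown))

    settings = dict(defaults)
    # Empty form fields count as missing, so the default is used instead.
    settings.update(
        {key: str(value) for key, value in form.items() if value not in (None, "")}
    )

    lines = ["#!/bin/bash"]
    for key in defaults:
        lines.append("#SBATCH --%s=%s" % (key.replace("_", "-"), settings[key]))
    lines.append("")
    for module in modules:
        lines.append("module load %s" % module)
    lines.append(command)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_sbatch_script({"nodes": 2}, "echo hello", modules=["python2"]))
```

Rejecting unknown fields keeps a typo in the form handler from silently producing a script without the intended option.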
12
Job scheduling Galaxy -> DRMAA -> SLURM (thanks to Katerina Michalickova)
SLURM (client has to be installed on the mounted partition): /etc/slurm/slurm.conf
    hpc-dev01 slurm# cat slurm.conf
    ## slurm.conf: main configuration file for SLURM
    ## $Id: slurm_2.2.conf,v /09/20 15:13:58 root Exp $
    ## FIXME: check GroupUpdate*, TopologyPlugin,
    ## UnkillableStepProgram, UsePAM
    ###
    ### Cluster
    ClusterName=titan # NOW abel
    SlurmctldPort=6817
    SlurmdPort=6818
    TmpFs=/work
    TreeWidth=5
    ## Timers:
    #default: MessageTimeout=10
    ## FIXME: should be reduced when/if we see that slurmd is behaving:
    #SlurmdTimeout=36000
    WaitTime=0
    ###
    ### Slurmctld
    ControlMachine=blaster.teflon.uio.no
    SlurmUser=slurm
    StateSaveLocation=/tmp
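To check that the client on the Galaxy host sees the same controller and ports as the cluster, the simple `Key=Value` lines of slurm.conf can be read with a few lines of Python. The helper below is a hypothetical sketch: it handles only single-setting lines like those on the slide and ignores everything after a `#`; real slurm.conf files also contain multi-field lines (node and partition definitions) that it does not try to interpret.

```python
def parse_slurm_conf(text):
    """Collect simple Key=Value settings, skipping comments and blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Excerpt of the file shown on the slide.
SAMPLE_CONF = """\
### Cluster
ClusterName=titan # NOW abel
SlurmctldPort=6817
SlurmdPort=6818
#SlurmdTimeout=36000
ControlMachine=blaster.teflon.uio.no
"""

if __name__ == "__main__":
    conf = parse_slurm_conf(SAMPLE_CONF)
    print(conf["ControlMachine"], conf["SlurmctldPort"])
```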
13
SSL to the PostgreSQL server (thanks to Nate Coraor)
Downloaded and recompiled a psycopg egg
In universe_wsgi.ini:
    database_connection = /<dbname>?sslmode=require
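The `sslmode=require` query parameter is what asks the PostgreSQL client library to insist on an encrypted connection. As a small sketch, the Python helper below adds or overrides that parameter on a connection URL. The helper name and the example URL (user, host, database name) are hypothetical, since the slide shows only the tail of the real one.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_sslmode(url, mode="require"):
    """Return the URL with its sslmode query parameter set to `mode`."""
    parts = urlsplit(url)
    # Keep every other parameter, replace any existing sslmode.
    query = [(key, value) for key, value in parse_qsl(parts.query) if key != "sslmode"]
    query.append(("sslmode", mode))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # Hypothetical connection URL, for illustration only.
    print(with_sslmode("postgresql://galaxy@db.example.org/galaxy_db"))
```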
14
Authentication (thanks to Roland Hedberg)
pysaml
Modify lib/galaxy/web/controllers/user.py
Authentication is working, but we cannot capture the POST from the IdP. Any help is appreciated :)
15
Thank you
hpc/abel/index.html