The CRI compute cluster
CRUK Cambridge Research Institute
The CRUK Cambridge Research Institute
- Founded to enable translational research: basic biology, early phase clinical trials, late phase translational studies
- Leveraging the specialist experience and facilities provided by the University of Cambridge and Addenbrooke's Hospital
- A CRUK facility, hosting CRUK core services (including Information Systems)
- CRUK groups and group leaders; Cambridge University groups and group leaders
Research objectives with significant Information Systems demands
- Genomics: clonal sequencing (Solexa) generating ~32TB per annum per sequencer, using 8-16 CPU cores full time
- Histopathology: scanners generating 16TB per annum
- Microscopy: generating 8+TB per annum; processing time series sequences
- In vivo imaging: MRI, PET-CT
- Systems biology: 20+ systems biology researchers working on expression data, network models etc.
Multiple groups, similar requirements
- Groups: MRI imaging, Genomics, Bioinformatics, the Tavaré Group, the Institute
- Common needs: compute, high performance storage, long term storage
2007/2008: Architectural consolidation
- MRI imaging, Genomics and Bioinformatics consolidated onto shared infrastructure:
  - HP blade cluster
  - HP Lustre SFS storage
  - Long term storage
  - MacOS X SAN I/O storage
"Virtual" group infrastructure using LSF
- The Platform LSF job scheduler partitions the shared HP blade cluster between groups (Genomics, MRI imaging, Bioinformatics, Institute, Tavaré)
- Per-group storage policies cover the HP Lustre SFS storage, long term storage and MacOS X SAN I/O storage
2008/2009: Storage consolidation
- I/O storage consolidated onto an EMC SAN
- The HP blade cluster, HP Lustre SFS storage and long term storage remain, with the Platform LSF job scheduler and per-group storage policies (Genomics, MRI imaging, Bioinformatics, Institute, Tavaré) unchanged
The CRUK CRI cluster
Cluster components
- Blades, head node, I/O node
- SFS storage; Solexa, Aperio, Ariol and I/O storage
- Networking
How the components are used
- Desktop client: input files in, output files out, LSF job submission
- Linux home directories and shared binaries served to the blades
- /data for input and output to the network
- /usr/local/bin for shared binaries
- /lustre for high performance storage
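As a sketch of how a job moves through this layout, the fragment below assembles an LSF bsub command line that reads input from /data and writes results to /lustre. The queue name, job name, script path and file names are illustrative assumptions, not values taken from the slides.

```python
import shlex

def build_bsub_command(queue, job_name, script, args):
    """Assemble (but do not run) an LSF bsub command line.

    Illustrative only: queue/script names here are assumptions;
    the real queue names appear on the queue-structure slide.
    """
    return (["bsub", "-q", queue, "-J", job_name,
             "-o", "/lustre/%s.%%J.out" % job_name]  # %J = LSF job ID
            + [script] + list(args))

# Hypothetical submission: stage input on /data, process on /lustre.
cmd = build_bsub_command("cluster", "align01", "/usr/local/bin/align.sh",
                         ["/data/sample1.fastq", "/lustre/sample1.out"])
print(shlex.join(cmd))
```

Building the argument list first (rather than a single shell string) keeps filenames with spaces safe and makes the command easy to inspect before submission.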
Seeing the cluster from the desktop
The I/O storage and Linux home directories are visible from the CRI network.
Filesystems
- /home (100GB): Linux home directories, visible from all the cluster nodes; use for local code, scripts etc.; backed up
- /data (2.7TB): use for delivering data to and from the cluster; lower performance to the blades, so not used for processing; not backed up, and files over 2 weeks old may be deleted without warning
- /lustre (16TB): high performance, use for processing; not backed up, and files over 1 month old may be deleted without warning
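The age-based deletion policy on /data and /lustre can be mimicked with a few lines of Python. The sketch below finds files older than a cutoff, the way a scratch-cleanup job might; the function name and the dry-run approach are illustrative assumptions, not the institute's actual cleanup script.

```python
import time
from pathlib import Path

def files_older_than(root, days):
    """Return files under `root` not modified in the last `days` days.

    Illustrative sketch of the '/data: files over 2 weeks old may be
    deleted' policy; a real cleanup job would delete, this only lists.
    """
    cutoff = time.time() - days * 86400  # seconds per day
    return [p for p in Path(root).rglob("*")
            if p.is_file() and p.stat().st_mtime < cutoff]
```

Listing candidates first (a dry run) before any deletion is the safer pattern on shared scratch space.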
Platform LSF: queue structure
Ownership of blades:
- Core facilities:
  - Genomics: genomics (6x8 cores)
  - Imaging: imaging (3x8 cores)
  - Bioinformatics: bioinformatics
  - Information Systems: information_systems
- Groups:
  - Tavaré Lab: stlab (18x8 cores), high_memory (2x8 cores)
  - Other groups: cluster (4x8 cores)
…but ownership doesn't necessarily match daily usage patterns.
Balanced scheduling: fairshare policy

Group                 Share
Simon Tavaré Group    20
Genomics              5
Bioinformatics        6
Imaging               3
General               4

Dynamic priority = number_of_shares / (cpu_time * CPU_TIME_FACTOR + run_time * RUN_TIME_FACTOR + (1 + job_slots) * RUN_JOB_FACTOR)
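The fairshare formula can be evaluated directly. The sketch below computes the dynamic priority for two groups; the factor values are LSF's commonly documented defaults used here as assumptions, since the CRI's actual CPU_TIME_FACTOR, RUN_TIME_FACTOR and RUN_JOB_FACTOR settings are not given on the slide.

```python
def dynamic_priority(shares, cpu_time, run_time, job_slots,
                     cpu_time_factor=0.7, run_time_factor=0.7,
                     run_job_factor=3.0):
    """LSF-style fairshare dynamic priority: shares divided by
    a weighted sum of recent usage. Factor defaults are assumed
    LSF defaults, not the CRI's actual configuration."""
    usage = (cpu_time * cpu_time_factor
             + run_time * run_time_factor
             + (1 + job_slots) * run_job_factor)
    return shares / usage

# A group with many shares but heavy recent usage can rank below
# a small group that has been idle (illustrative numbers):
tavare = dynamic_priority(shares=20, cpu_time=5000, run_time=2000, job_slots=40)
imaging = dynamic_priority(shares=3, cpu_time=10, run_time=5, job_slots=1)
print(tavare < imaging)  # heavy usage lowers priority despite more shares
```

This is the point of fairshare: priority decays with consumption, so ownership shares set long-run entitlement rather than instantaneous precedence.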
Information Systems processes
- User accounts: managed via the central Service Desk; Linux accounts bound to AD (Windows/Mac) accounts
- Troubleshooting: Linux support in London and Cambridge, accessed via the Service Desk
- Software installation: all blades share binaries; users can put local code in their home directory; the IS department will install common code in /usr/local/bin
Summary
The CRUK Cambridge Research Institute is delivering a shared computational science infrastructure.
- Principle: "virtualisation" to make scalable, easy to administer systems; a common architecture to deliver cost and service benefits
- Practice: a blade architecture suits most computing needs; networking and storage need careful design
- Benefits: optimal use of resources, low wastage; excess capacity "buffers" new experimental and development techniques
…to date, provision of compute power hasn't limited science at the CRI.