IM&T Vacation Program
Benjamin Meyer
Virtualisation and Hyper-Threading in Scientific Computing
HPC Clusters
- A large set of connected computers
- Used for computation-intensive workloads rather than I/O-orientated operations
- Each node runs its own instance of the OS
CSIRO's Bragg Cluster
- 128 compute nodes with 16 CPUs each
- 2048 cores in total
- 128 GB of RAM per node
- 384 Fermi Tesla M2050 GPUs
- 172,032 streaming cores
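The totals quoted for the cluster can be checked with a few lines of arithmetic. The sketch below uses only the figures on the slide. The total RAM, the GPUs-per-node figure (which assumes the GPUs are spread evenly across the compute nodes) and the per-GPU core count are derived values, not numbers stated in the source.

```python
# Figures quoted on the slide.
NODES = 128
CPU_CORES_PER_NODE = 16
RAM_GB_PER_NODE = 128
GPUS = 384
STREAMING_CORES_TOTAL = 172_032

# Derived values (not stated on the slide).
total_cpu_cores = NODES * CPU_CORES_PER_NODE             # matches the quoted 2048
total_ram_gb = NODES * RAM_GB_PER_NODE                   # aggregate RAM in GB
gpus_per_node = GPUS // NODES                            # assumes an even spread
streaming_cores_per_gpu = STREAMING_CORES_TOTAL // GPUS  # cores on each GPU

print("CPU cores in total:     ", total_cpu_cores)
print("RAM in total (GB):      ", total_ram_gb)
print("GPUs per node:          ", gpus_per_node)
print("Streaming cores per GPU:", streaming_cores_per_gpu)
```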
What is Virtualisation?
Hypervisor
- Software which allows multiple, different operating systems to run on the same underlying hardware
- Ensures all privileged operations are appropriately handled to maintain system integrity
- Invisible to the operating system: the OS thinks it is running natively
- VMware ESXi hypervisor used for this project
Benefits: Heterogeneous Clusters
Benefits: Live Migration
- Running jobs can be moved to other hardware
- Allows dynamic scheduling
- Pre-emptive failure/downtime evasion
Benefits
Checkpointing
- Status of the OS, application and memory is saved at intervals
- Allows for easy failure recovery
- Software debugging
Clean compute
- Security
- Run time/failure isolation
- Clean start
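The idea behind checkpointing can be sketched at the application level. This is only an analogy: a hypervisor checkpoints the whole virtual machine (OS, application and memory), whereas the hypothetical `run_job` below saves just its own small state dictionary. The file name, the function names and the `fail_at` parameter are all illustrative assumptions.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temporary file, then rename it over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(state, handle)
    os.replace(tmp_path, path)  # avoids leaving a half-written checkpoint


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as handle:
        return pickle.load(handle)


def run_job(path, total_steps, checkpoint_every, fail_at=None):
    """Sum 0..total_steps-1, saving progress every `checkpoint_every` steps.

    `fail_at` is a hypothetical knob that simulates a failure at that step.
    """
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated node failure")
        state["total"] += state["step"]
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    return state["total"]
```

If the job fails part-way through, calling `run_job` again with the same path resumes from the last saved interval rather than from step 0, which is the recovery behaviour the slide describes.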
Performance Comparison: Floating point operations per second
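A floating-point rate is obtained by counting the arithmetic operations a kernel performs and dividing by the elapsed wall time. The toy sketch below does this for a naive dense matrix multiply in pure Python. It is not the benchmark behind the chart, and the absolute numbers it reports are far below what compiled code achieves; the function names and matrix size are assumptions for illustration.

```python
import time


def matmul(a, b):
    """Naive dense multiply of two square matrices given as lists of lists."""
    n = len(a)
    result = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            a_ik = a[i][k]
            for j in range(n):
                result[i][j] += a_ik * b[k][j]
    return result


def measure_flops(n=60):
    """Return floating-point operations per second for an n x n multiply."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    matmul(a, b)
    elapsed = time.perf_counter() - start
    operations = 2 * n ** 3  # one multiply and one add per inner iteration
    return operations / elapsed


if __name__ == "__main__":
    print(f"{measure_flops() / 1e6:.1f} MFLOPS (pure Python, illustrative only)")
```

Running the same measurement natively and inside a virtual machine, then taking the ratio, gives a relative figure of the kind a comparison chart would show.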
Performance Comparison: Updates to random memory locations per second (GUPS)
Relative values shown on the chart: 100%, 87.1%, 54.8%
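The metric on this slide counts how many read-modify-write updates to pseudo-random table locations complete per second, expressed in billions (giga-updates per second). A minimal sketch of that access pattern follows. The xorshift generator, table size and update count are assumptions chosen for brevity and are not the sequence or parameters of any particular benchmark suite.

```python
import time

MASK = (1 << 64) - 1  # keep generator values within 64 bits


def xorshift64(state):
    """One step of a 64-bit xorshift pseudo-random generator."""
    state ^= (state << 13) & MASK
    state ^= state >> 7
    state ^= (state << 17) & MASK
    return state


def random_updates(table, num_updates, seed=1):
    """XOR pseudo-random values into pseudo-random slots of the table.

    The table length must be a power of two so the index can be masked.
    """
    index_mask = len(table) - 1
    value = seed
    for _ in range(num_updates):
        value = xorshift64(value)
        table[value & index_mask] ^= value
    return table


def measure_gups(log2_size=16, num_updates=200_000):
    """Return giga-updates per second for the update loop above."""
    table = list(range(1 << log2_size))
    start = time.perf_counter()
    random_updates(table, num_updates)
    elapsed = time.perf_counter() - start
    return num_updates / elapsed / 1e9


if __name__ == "__main__":
    print(f"{measure_gups():.6f} GUPS (pure Python, illustrative only)")
```

Because XOR is its own inverse, applying the same update sequence twice restores the original table, which gives a simple correctness check for the kernel.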
Performance Comparison: MPI (message passing) latency
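Message latency is commonly measured with a ping-pong test: one side sends a small message, the partner echoes it back, and half the average round-trip time is taken as the one-way latency. The sketch below shows the method using a local socket pair and a thread as a stand-in for two MPI ranks. It does not use MPI, so its numbers say nothing about a cluster interconnect; the function names are assumptions.

```python
import socket
import threading
import time


def echo(sock, rounds):
    """Partner side: send each received byte straight back."""
    for _ in range(rounds):
        sock.sendall(sock.recv(1))


def ping_pong_latency(rounds=1000):
    """Return the estimated one-way latency in seconds (round trip / 2)."""
    left, right = socket.socketpair()
    partner = threading.Thread(target=echo, args=(right, rounds))
    partner.start()
    start = time.perf_counter()
    for _ in range(rounds):
        left.sendall(b"x")  # ping
        left.recv(1)        # pong
    elapsed = time.perf_counter() - start
    partner.join()
    left.close()
    right.close()
    return elapsed / (2 * rounds)


if __name__ == "__main__":
    print(f"one-way latency: {ping_pong_latency() * 1e6:.1f} microseconds")
```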
Hyper-Threading
Hyper-Threading Example
[Diagram: Thread 1 and Thread 2 on physical cores (non Hyper-Threaded) compared with the same threads on logical cores, as seen by the OS, backed by physical cores (Hyper-Threaded). For each case a timeline plots the use of Resources 1 to 4 against time.]
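The distinction between logical cores (seen by the OS) and physical cores can be inspected on a Linux system. The sketch below assumes the sysfs files `thread_siblings_list`, which list the logical CPUs sharing each physical core. Counting the distinct sibling sets gives the number of physical cores. The helper names are hypothetical, and on other operating systems the sysfs path will not exist.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,16' or '0-1' into a frozenset."""
    cpus = set()
    for part in text.strip().split(","):
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        elif part:
            cpus.add(int(part))
    return frozenset(cpus)


def count_physical_cores(sibling_lists):
    """Count distinct sibling sets; each set corresponds to one physical core."""
    return len({parse_cpu_list(text) for text in sibling_lists})


def read_sibling_lists():
    """Read every CPU's sibling list from sysfs (Linux only, assumed path)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as handle:
            lists.append(handle.read())
    return lists


if __name__ == "__main__":
    sibling_lists = read_sibling_lists()
    print("logical cores seen by the OS:", os.cpu_count())
    if sibling_lists:
        print("physical cores:", count_physical_cores(sibling_lists))
    else:
        print("physical core count unavailable on this platform")
```

With Hyper-Threading enabled, the logical count is typically twice the physical count; with it disabled, the two numbers match.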
Performance Comparison: Floating point operations per second
Performance Comparison: Updates to random memory locations per second