Partitioned Multistack Environments for Exascale Systems
Jack Lange
Assistant Professor
University of Pittsburgh
What I’ve heard…
In-situ everything…
– Network will not support current behaviors
– Must collapse multiple functions onto single platform
  Visualization, data analysis, checkpointing, etc…
[Figure labels: Visualization Cluster, Supercomputer, Storage Cluster, Exascale Machine]
What does this mean for the OS?
At Petascale we could optimize each environment separately
– Each had its own OS and hardware
At Exascale workloads will be co-located
– Can a single OS handle all workloads effectively?
Claim: Probably not
– Each workload has different resource requirements and behaviors
– Exascale will need to support multiple OS environments on the same hardware
[Figure: Exascale Node. Labels: Lightweight Kernel, HPC application, Management Processes, Resource Manager, Embedded Linux, Analysis + Visualization, Linux, Debugger]
Challenges
Increase in complexity at both hardware and software layers
– Heterogeneous hardware
  GPUs, Lightweight cores, SSDs, …
– Complex Topologies
  NUMA on chip, NUMA on node, dedicated GPU nodes, …
– Heterogeneous applications
– Hardware and software failures
– Power constraints
How can we manage all of this at the OS layer?
– A unified and monolithic OS environment isn’t going to work
Approaches
Exascale machines will need the ability to run multiple OS instances in parallel
– Each targeting a particular application/workload
  Linux for vis., LWK for application, etc.
Hopefully virtualization can help…
– But there will probably be limited hardware support for it
Will need other techniques for partitioning resources
– Virtualization-lite?
– Lightweight and distributed resource managers
– Flexible communication channels
– Many others…
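To make "partitioning resources" concrete, here is a minimal sketch of one way a node-level resource manager might hand out disjoint cores and memory to co-located OS instances. It is only an illustration, not a mechanism described in the talk: the names `Enclave`, `partition_node`, and `example_node`, and the core and memory figures, are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Enclave:
    """One OS instance and the hardware it owns exclusively (hypothetical model)."""
    name: str
    os_kind: str
    cores: set = field(default_factory=set)
    mem_mb: int = 0


def partition_node(total_cores, total_mem_mb, requests):
    """Assign disjoint cores and memory to each requested OS instance.

    requests: list of (name, os_kind, n_cores, mem_mb) tuples.
    Raises ValueError if the node cannot satisfy a request.
    """
    free_cores = list(range(total_cores))
    free_mem = total_mem_mb
    enclaves = []
    for name, os_kind, n_cores, mem_mb in requests:
        if n_cores > len(free_cores) or mem_mb > free_mem:
            raise ValueError(f"not enough resources for {name}")
        # Take the next n_cores free cores; no core is ever shared.
        cores = set(free_cores[:n_cores])
        free_cores = free_cores[n_cores:]
        free_mem -= mem_mb
        enclaves.append(Enclave(name, os_kind, cores, mem_mb))
    return enclaves


def example_node():
    """Illustrative 16-core, 32 GB node; the sizes are assumptions."""
    return partition_node(
        total_cores=16,
        total_mem_mb=32768,
        requests=[
            ("app", "LWK", 12, 24576),            # HPC application
            ("vis", "Linux", 3, 6144),            # analysis + visualization
            ("mgmt", "Embedded Linux", 1, 2048),  # management processes
        ],
    )
```

Calling `example_node()` returns three enclaves whose core sets do not overlap. A real system would also have to account for NUMA topology, devices, and communication channels between the instances, none of which this sketch models.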