Partitioned Multistack Environments for Exascale Systems. Jack Lange, Assistant Professor, University of Pittsburgh.

What I’ve heard…
In-situ everything…
– Network will not support current behaviors
– Must collapse multiple functions onto single platform
  Visualization, data analysis, checkpointing, etc…
[Figure labels: Visualization Cluster; Supercomputer; Storage Cluster; Exascale Machine]

What does this mean for the OS?
At Petascale we could optimize each environment separately
– Each had their own OS and hardware
At Exascale workloads will be co-located
– Can a single OS handle all workloads effectively?
Claim: Probably not
– Each has different resource requirements and behaviors
– Exascale will need to support multiple OS environments on the same hardware
[Figure labels: Exascale Node; Lightweight Kernel; HPC application; Management Processes; Resource Manager; Embedded Linux; Analysis + Visualization; Linux; Debugger]

Challenges
Increase in complexity at both hardware and software layers
– Heterogeneous hardware
  GPUs, lightweight cores, SSDs, …
– Complex topologies
  NUMA on chip, NUMA on node, dedicated GPU nodes, …
– Heterogeneous applications
– Hardware and software failures
– Power constraints
How can we manage all of this at the OS layer?
– A unified and monolithic OS environment isn’t going to work

Approaches
Exascale machines will need the ability to run multiple OS instances in parallel
– Each targeting a particular application/workload
  Linux for vis., LWK for application, etc.
Hopefully virtualization can help…
– But there will probably be limited hardware support for it
Will need other techniques for partitioning resources
– Virtualization-lite?
– Lightweight and distributed resource managers
– Flexible communication channels
– Many others…
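
The idea of partitioning one node among several OS instances can be illustrated with a toy model. The sketch below is not from the slides: the names `Enclave` and `NodePartitioner`, the core counts, and the memory sizes are all hypothetical. It only shows the bookkeeping a lightweight per-node resource manager might do, handing each OS instance an exclusive set of cores and a memory budget. It does not boot or isolate anything.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Enclave:
    """One OS instance and the resources it owns exclusively (hypothetical model)."""
    name: str                      # e.g. "lwk" or "linux-vis" (illustrative names)
    workload: str                  # what the OS instance is meant to run
    cores: list[int] = field(default_factory=list)
    mem_mb: int = 0


class NodePartitioner:
    """Toy per-node resource manager: tracks free cores and memory only."""

    def __init__(self, num_cores: int, mem_mb: int) -> None:
        self.free_cores = list(range(num_cores))
        self.free_mem_mb = mem_mb
        self.enclaves: dict[str, Enclave] = {}

    def create(self, name: str, workload: str, num_cores: int, mem_mb: int) -> Enclave:
        """Carve an exclusive partition out of the free pool, or fail loudly."""
        if name in self.enclaves:
            raise ValueError(f"enclave {name!r} already exists")
        if num_cores > len(self.free_cores) or mem_mb > self.free_mem_mb:
            raise RuntimeError("insufficient free resources on this node")
        cores = self.free_cores[:num_cores]
        self.free_cores = self.free_cores[num_cores:]
        self.free_mem_mb -= mem_mb
        enclave = Enclave(name, workload, cores, mem_mb)
        self.enclaves[name] = enclave
        return enclave

    def destroy(self, name: str) -> None:
        """Return an enclave's cores and memory to the free pool."""
        enclave = self.enclaves.pop(name)
        self.free_cores = sorted(self.free_cores + enclave.cores)
        self.free_mem_mb += enclave.mem_mb


if __name__ == "__main__":
    # Assumed node size: 16 cores, 64 GiB. Purely illustrative numbers.
    node = NodePartitioner(num_cores=16, mem_mb=65536)
    lwk = node.create("lwk", "HPC application", num_cores=12, mem_mb=49152)
    vis = node.create("linux-vis", "analysis + visualization", num_cores=4, mem_mb=8192)
    print(lwk.cores, vis.cores, node.free_mem_mb)
```

In this model the partitions never share a core, which mirrors the slide's point that each OS instance targets one workload. A real system would also have to account for NUMA placement, devices, and the communication channels between instances, none of which this sketch attempts.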