Toward a Unified HPC and Big Data Runtime
Joshua Suetterlein, University of Delaware
Joshua Landwehr, Joseph Manzano, Andres Marquez, PNNL
11/18/2018
Motivation
Why should we consider joining HPC and Big Data?
- Boundaries are beginning to blur
- Next generation systems can cost billions of dollars in R&D and millions to maintain
- Moreover, the technical challenges provide a unique opportunity
HPC: Computational science has become essential for engineering and science; computational models enable the impossible (or impractical)
Big Data: Reliably manages exponentially growing data; data analytics explores complex relationships among ever-growing data
http://cacm.acm.org/magazines/2015/7/188732-exascale-computing-and-big-data/fulltext
Motivation Cont.
Known exascale challenges:
- Energy efficiency
- Resilience
- Memory technology
- Interconnect technology
- Algorithms
- Scientific productivity
http://ascr-discovery.science.doe.gov/2014/11/exascale-road-bumps/

Hidden challenge: I/O vs. concurrency (a back-of-the-envelope comparison follows this table)

System            | 2009     | 2015       | 2024
Total Concurrency | 225,000  | O(million) | O(billion)
System Memory     | 0.3 PB   | 5 PB       | 10 PB
I/O               | 0.2 TB/s | 10 TB/s global PFS + 100 TB/s burst buffer | 20 TB/s global PFS + 500 TB/s burst buffer

Science at Extreme Scale: Architectural Challenges and Opportunities, Lucy Nowell, April 2014
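The hidden challenge becomes visible when the table's global PFS bandwidth is divided by total concurrency. The following back-of-the-envelope sketch uses only the numbers above; treating O(million) and O(billion) literally as 1e6 and 1e9 is an assumption made for illustration, not a measurement.

```c
#include <stdio.h>

/* Back-of-the-envelope arithmetic from the table above: global PFS
 * bandwidth divided by total concurrency. All figures are the table's
 * projections; O(million)/O(billion) are taken as 1e6 and 1e9. */
int main(void) {
    const char *year[]    = { "2009", "2015", "2024" };
    double concurrency[]  = { 225e3, 1e6, 1e9 };
    double pfs_tb_per_s[] = { 0.2, 10.0, 20.0 };

    for (int i = 0; i < 3; i++) {
        double bytes_per_s = pfs_tb_per_s[i] * 1e12 / concurrency[i];
        printf("%s: ~%.1f KB/s of PFS bandwidth per unit of concurrency\n",
               year[i], bytes_per_s / 1e3);
    }
    return 0;
}
```

At O(billion)-way concurrency, each unit of parallelism is left with roughly 20 KB/s of global file system bandwidth, which is why simply writing everything out stops being an option.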
Motivation Cont.
How will we perform science?
- Increased parallelism enables larger experiments
- Weak scaling: more concurrency ≈ more data
- Current paradigm: offload results to another machine, i.e., post-mortem analysis
- New paradigm: leverage in-situ/Big Data analytics, then offload the reduced results (see the sketch after this slide)
- Another view: if a simulation is the source, we can leverage streaming analytics to hide I/O latency
Synergistic Challenges in Data-Intensive Science and Exascale Computing, March 2013
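A minimal sketch of the contrast between the two paradigms. The simulation, domain size, and summary statistics here are hypothetical stand-ins chosen for illustration; the point is that the in-situ path emits two doubles per step instead of the whole field.

```c
#include <stdio.h>
#include <stdlib.h>

#define N (1 << 20)   /* cells per rank; stand-in for a real domain */
#define STEPS 10

/* Hypothetical stand-in for a real simulation step. */
static void simulate_step(double *field, int n, int step) {
    for (int i = 0; i < n; i++) field[i] = (double)(i % 97) + step;
}

int main(void) {
    double *field = malloc(N * sizeof *field);
    if (!field) return 1;

    for (int step = 0; step < STEPS; step++) {
        simulate_step(field, N, step);

        /* Post-mortem paradigm: write all N doubles (8 MB here) every
         * step and analyze on another machine.
         *
         * In-situ paradigm: reduce while the data is still resident and
         * offload only the summary (two doubles per step). */
        double sum = 0.0, max = field[0];
        for (int i = 0; i < N; i++) {
            sum += field[i];
            if (field[i] > max) max = field[i];
        }
        printf("step %d: mean=%f max=%f\n", step, sum / N, max);
    }
    free(field);
    return 0;
}
```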
Fine-Grain
How is computational science traditionally performed?
- MPI + OpenMP dominate scientific code
Architectural trends previously amenable to OpenMP and MPI are changing:
- Unprecedented concurrency
- Highly coupled memory and processing elements
- Deepening memory hierarchies
- Increasing heterogeneity
- Strict energy and power budgets
A dataflow-inspired, asynchronous, event-driven, fine-grain execution model appears flexible enough to express and exploit sufficient parallelism to utilize these new architectures (a minimal sketch follows this slide).
Gao, Fran Allen's Retirement Workshop, July 2002
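To make "event-driven, fine-grain" concrete: a task (codelet) carries a dependence count and fires only when all of its input events have been satisfied. The Task/satisfy/run names below are a hypothetical illustration of the idea, not any runtime's actual API.

```c
#include <stdio.h>

/* Minimal sketch of event-driven, fine-grain execution: a task fires
 * only when its last outstanding input event is satisfied. */

typedef struct Task Task;
struct Task {
    const char *name;
    int pending;              /* unsatisfied input events             */
    void (*fn)(Task *);       /* codelet body: runs to completion     */
    Task *successor;          /* event consumer to satisfy when done  */
};

static void satisfy(Task *t);

static void run(Task *t) {
    t->fn(t);
    if (t->successor) satisfy(t->successor);
}

/* Decrement the dependence count; fire the task when it hits zero.
 * A real runtime would enqueue to a scheduler instead of running inline. */
static void satisfy(Task *t) {
    if (--t->pending == 0) run(t);
}

static void body(Task *t) { printf("codelet %s fired\n", t->name); }

int main(void) {
    Task sink = { "analysis", 2, body, NULL };   /* waits on 2 events  */
    Task a    = { "sim-A",    1, body, &sink };
    Task b    = { "sim-B",    1, body, &sink };
    satisfy(&a);   /* initial events arrive                            */
    satisfy(&b);   /* sink fires only after both producers complete    */
    return 0;
}
```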
Fine-Grain Cont.
- Open Community Runtime (OCR) is a codelet execution model developed under the UHPC and X-Stack initiatives
- OCR exploits asynchronous, event-driven, fine-grain execution
- Can OCR be augmented with streaming Big Data techniques to reduce I/O pressure and hide latency? (A sketch of the idea follows this slide.)
[Figure: proposed software stack - an application on a fine-grain runtime with Big Data extensions, over RDMA, a scheduler (SLURM), and a file system (Lustre), across compute nodes]
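One way to read the question on this slide: interpose a streaming operator between completing fine-grain tasks and the file system, so that only windowed aggregates reach the slow I/O path. The window size, operator, and "write" path below are hypothetical choices for illustration; none of this is OCR's API.

```c
#include <stdio.h>

#define WINDOW 4   /* aggregate 4 task outputs before touching I/O */

typedef struct {
    double acc;
    int    count;
} StreamOp;

/* Called by each completing simulation task with its block summary.
 * Only one value per WINDOW completions reaches the (slow) I/O path,
 * so the stream operator both reduces I/O pressure and decouples task
 * completion from write latency. */
static void stream_push(StreamOp *op, double block_result) {
    op->acc += block_result;
    if (++op->count == WINDOW) {
        printf("write to PFS: windowed mean = %f\n", op->acc / WINDOW);
        op->acc = 0.0;
        op->count = 0;
    }
}

int main(void) {
    StreamOp op = { 0.0, 0 };
    /* 16 fine-grain tasks complete; only 4 writes are issued. */
    for (int task = 0; task < 16; task++)
        stream_push(&op, (double)task);
    return 0;
}
```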
Opportunities
- Fine-grain runtimes have the potential to provide more than weak scaling
- For exascale, system memory is getting bigger (~10 PB)
- Big Data trends provide new abstractions for analytics
- In-situ analysis has not been very flexible
- Events, memory abstractions, and compression are already being explored
Challenges
- Programming model disconnect: how can we effectively map simulation data generated by fine-grain tasks into a partitioned key/value store?
- Load balancing: transparent parallel execution over a partition of data
- Priorities: simulation vs. analysis
  - This is similar to streaming back-pressure, except we control the source... How does this affect the critical path? (See the sketch after this slide.)
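A sketch of the first and third challenges together, using made-up structures: task output keys are hashed into key/value partitions, and when a partition's queue fills, back-pressure is applied to the source (the simulation task is deferred) rather than dropping or buffering data. The hash, queue capacity, and key scheme are all hypothetical.

```c
#include <stdio.h>
#include <string.h>

#define PARTITIONS 4
#define QUEUE_CAP  8

typedef struct { int depth; } Partition;   /* queue depth only, for brevity */

static Partition parts[PARTITIONS];

static unsigned hash_key(const char *key) {
    unsigned h = 5381;                      /* djb2 string hash */
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h % PARTITIONS;
}

/* Returns 1 if accepted, 0 if the partition is full; in the latter case
 * the runtime would defer the producing task - back-pressure applied to
 * the source we control - instead of buffering more data. */
static int kv_put(const char *key) {
    Partition *p = &parts[hash_key(key)];
    if (p->depth >= QUEUE_CAP) return 0;
    p->depth++;
    return 1;
}

int main(void) {
    char key[32];
    for (int task = 0; task < 64; task++) {
        snprintf(key, sizeof key, "block-%d", task);
        if (!kv_put(key))
            printf("task %d stalled: partition %u full -> slow the simulation\n",
                   task, hash_key(key));
    }
    return 0;
}
```

Deferring the producer keeps analysis from falling behind, but every stall is time the simulation is not advancing, which is exactly the critical-path question raised above.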
Questions?
Thank You!