On the Role of Burst Buffers in Leadership-Class Storage Systems


1 On the Role of Burst Buffers in Leadership-Class Storage Systems
Ning Liu, Jason Cope, Philip Carns, Christopher Carothers, Robert Ross, Gary Grider, Adam Crume, Carlos Maltzahn Contact:

2 Why do we need a burst buffer?
Modern HPC storage systems are provisioned to absorb peak I/O bursts, which wastes bandwidth the rest of the time. This design cannot keep pace with growing data demands as supercomputers move into the exascale era. For example, statistics from the ALCF Intrepid Blue Gene/P system show that 98.8% of the time the I/O system is at 33% of its bandwidth capacity or less, and 69.6% of the time it is at 5% or less.
Figure: Aggregate I/O throughput over one-minute intervals on ALCF BG/P [1].
[1] CODES: Enabling Co-Design of Multi-Layer Exascale Storage Architectures, project proposal.
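The utilization figures above are shares of one-minute intervals that fall at or below a fraction of peak bandwidth. The following is a minimal sketch of that calculation; the sample values and the `capacity` figure are hypothetical placeholders, not Intrepid measurements, and `fraction_at_or_below` is a name introduced here for illustration.

```python
def fraction_at_or_below(samples, capacity, threshold):
    """Return the share of intervals whose throughput is <= threshold * capacity."""
    if not samples:
        raise ValueError("samples must not be empty")
    limit = threshold * capacity
    return sum(1 for s in samples if s <= limit) / len(samples)


# Hypothetical one-minute aggregate throughput samples in GiB/s (not real data).
samples = [0.5, 1.0, 2.0, 0.2, 30.0, 1.5, 0.8, 12.0, 0.1, 3.0]
capacity = 35.0  # assumed peak bandwidth in GiB/s, for illustration only

for threshold in (0.33, 0.05):
    share = fraction_at_or_below(samples, capacity, threshold)
    print(f"at or below {threshold:.0%} of capacity: {share:.0%} of intervals")
```

Applied to a real throughput log, the same function would reproduce the kind of "X% of the time at Y% capacity or less" statement quoted on this slide.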

3 Storage System of Intrepid (BG/P)
Assuming the audience needs to know the details of the target storage system we modeled.

4 Software Stack on Intrepid
Give the audience a brief view of the software stack on Intrepid, with the burst buffer (BB) added at the IOFSL layer.

5 Adding Burst Buffer to the Storage System Model
We built a parallel discrete-event model of a leadership-class storage system [2]. We propose adding a tier of solid-state disk (SSD) burst buffers integrated into the existing I/O nodes.
[2] N. Liu, C. Carothers, J. Cope, P. Carns, R. Ross, A. Crume, C. Maltzahn. Modeling a Leadership-scale Storage System. 9th International Conference on Parallel Processing and Applied Mathematics (PPAM 2011), 2011.
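To make the idea of an absorbing SSD tier concrete, here is a deliberately simplified sketch, not the model from [2]. It assumes the buffer starts empty, does not drain while it is absorbing, and that any overflow is written directly at the external storage bandwidth. All function names and numbers are hypothetical.

```python
def perceived_write_time(burst_gib, bb_capacity_gib, bb_bw, storage_bw):
    """Seconds the application spends writing one burst.

    Data that fits in the burst buffer is absorbed at bb_bw (GiB/s);
    the overflow is written straight to external storage at storage_bw.
    """
    absorbed = min(burst_gib, bb_capacity_gib)
    overflow = burst_gib - absorbed
    return absorbed / bb_bw + overflow / storage_bw


def drain_time(burst_gib, bb_capacity_gib, storage_bw):
    """Seconds needed afterwards to flush the absorbed data to external storage."""
    return min(burst_gib, bb_capacity_gib) / storage_bw


# Hypothetical numbers: a 100 GiB burst, 10 GiB/s of SSD bandwidth,
# 2 GiB/s of external storage bandwidth.
with_bb = perceived_write_time(100, bb_capacity_gib=400, bb_bw=10, storage_bw=2)
without_bb = perceived_write_time(100, bb_capacity_gib=0, bb_bw=10, storage_bw=2)
print(f"with burst buffer: {with_bb:.0f} s, without: {without_bb:.0f} s")
print(f"background drain: {drain_time(100, 400, 2):.0f} s")
```

The application sees the shorter absorb time, while the drain overlaps with computation as long as the next burst does not arrive before the buffer is flushed.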

6 Methodology: Why Parallel Discrete-Event Simulation (DES)?
Large-scale systems are difficult to understand.
Analytical models are often constrained.
Parallel DES simulation offers:
  Dramatically shorter model execution time
  Prediction of the performance of future “what-if” systems
  Potential for real-time decision support
  Minutes instead of days
  Analysis can be done right away
Example models: national air space (NAS), ISP backbone(s), distributed content caches, next-generation supercomputer systems.
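For readers new to discrete-event simulation, the core loop can be sketched in a few lines. This is a sequential toy, not a parallel simulator: events are kept in a priority queue and processed in timestamp order, and each handler may schedule further events. The event kinds, handler names, and bandwidth value are assumptions made for illustration.

```python
import heapq


def run_simulation(initial_events, handlers, end_time):
    """Process (time, kind, payload) events in timestamp order until end_time."""
    queue = []
    seq = 0  # tie-breaker so events with equal timestamps keep insertion order
    for time, kind, payload in initial_events:
        heapq.heappush(queue, (time, seq, kind, payload))
        seq += 1
    log = []
    while queue and queue[0][0] <= end_time:
        time, _, kind, payload = heapq.heappop(queue)
        log.append((time, kind, payload))
        # A handler returns the new events it wants to schedule.
        for new_time, new_kind, new_payload in handlers[kind](time, payload):
            heapq.heappush(queue, (new_time, seq, new_kind, new_payload))
            seq += 1
    return log


BANDWIDTH_MIB_PER_S = 400.0  # hypothetical device bandwidth


def on_write_request(time, size_mib):
    # Completion is scheduled after the transfer time of the request.
    return [(time + size_mib / BANDWIDTH_MIB_PER_S, "write_done", size_mib)]


def on_write_done(time, size_mib):
    return []


handlers = {"write_request": on_write_request, "write_done": on_write_done}
log = run_simulation(
    [(0.0, "write_request", 4), (0.005, "write_request", 4)], handlers, end_time=1.0
)
for time, kind, size in log:
    print(f"t={time:.3f}s {kind} ({size} MiB)")
```

A parallel simulator distributes this event processing across many processors, which is where the execution-time savings listed above come from.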

7 ROSS: Parallel Discrete Event Simulator
Developed in ANSI C with a simple, lean API. Uses Jefferson’s Time Warp event scheduling mechanism. Uses reverse computation for rollback. The global virtual time algorithm exploits IBM Blue Gene’s fast barrier and collective networks. ROSS main page:
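Reverse computation can be illustrated without any simulator API. In an optimistic (Time Warp) scheme, events are processed speculatively; when a straggler with an earlier timestamp arrives, the already-processed later events are undone by running a reverse handler instead of restoring a saved state copy. The sketch below is a conceptual Python illustration with hypothetical names; it is not the ROSS API, which is written in C.

```python
class ServerState:
    """Toy file-server state touched by write events."""

    def __init__(self):
        self.bytes_written = 0
        self.requests_seen = 0


def forward_write(state, event):
    state.bytes_written += event["size"]
    state.requests_seen += 1


def reverse_write(state, event):
    # Exact inverse of forward_write, applied in the opposite order.
    state.requests_seen -= 1
    state.bytes_written -= event["size"]


def rollback(state, processed, to_time):
    """Undo processed events with timestamp > to_time, newest first.

    Returns the undone events in timestamp order so they can be re-executed.
    """
    undone = []
    while processed and processed[-1]["time"] > to_time:
        event = processed.pop()
        reverse_write(state, event)
        undone.append(event)
    return undone[::-1]


state = ServerState()
processed = []
for event in [{"time": 1.0, "size": 4}, {"time": 2.0, "size": 4}, {"time": 3.0, "size": 8}]:
    forward_write(state, event)  # optimistic execution
    processed.append(event)

straggler = {"time": 2.5, "size": 2}  # arrives late with an earlier timestamp
undone = rollback(state, processed, straggler["time"])
for event in [straggler] + undone:
    forward_write(state, event)
    processed.append(event)

print(state.bytes_written, state.requests_seen, [e["time"] for e in processed])
```

The design trade-off is that each handler needs a matching inverse, in exchange for not having to snapshot state before every event.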

8 Study of Bursty Applications
TABLE I: Top four write-intensive jobs on Intrepid, December 2011
Say that these are the drivers of the simulation. Explain them in detail as needed.

9 Burst Buffer Parallel DES Model

10 Burst Buffer Model Parameters
TABLE II: Summary of relevant SSD device parameters and technology available as of January 2012
Columns: Vendor | Size (TiB) | NAND | Bandwidth (GiB/s): Write, Read | Latency (μs): Write, Read
FusionIO 0.40 SLC 1.30 1.40 15 47 1.20 MLC 68 Intel 0.25 0.32 0.50 80 65 Virident 0.30 1.10 16 0.60 19 62
Burst buffer latency model:

11 Single I/O Workload Case
Using the IOR benchmark ( consecutive 4 MiB write requests)
Captures write time; excludes open/close time
The full storage system uses 128 file servers
The half storage system uses 64
Starting from this slide, we will focus on the results.
Figure: Simulated performance of IOR for various storage system and burst buffer configurations

12 Investigate Parameter Space
Simulated IOR performance for various burst buffer configurations and I/O workloads Simulated application runtime performance for various I/O and burst buffer configurations

13 Mixed Workload Experiments (1)
Full storage system with burst buffer disabled. (Compute Node View) Job execution time = 5.5 hours Full storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours

14 Mixed Workload Experiments (2)
Full storage system with burst buffer disabled. (Enterprise Storage View) Full storage system with burst buffer enabled. (Enterprise Storage View)

15 Mixed Workload Experiments (3)
Full storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours Full storage system with burst buffer enabled. (Enterprise Storage View)

16 Mixed Workload Experiments (4)
Half storage system with burst buffer disabled. (Compute Node View) Job execution time = 5.8 hours Half storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours

17 Mixed Workload Experiments (5)
Full storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours Half storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours
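The relative improvement follows directly from the execution times reported on the mixed-workload slides (5.5 and 5.8 hours with the burst buffer disabled, 4.4 hours with it enabled). A short sketch of the arithmetic, with `runtime_reduction` as a helper name introduced here:

```python
def runtime_reduction(baseline_hours, improved_hours):
    """Fraction of the baseline runtime saved by the improved configuration."""
    return (baseline_hours - improved_hours) / baseline_hours


# (burst buffer disabled, burst buffer enabled) execution times in hours,
# taken from the mixed-workload slides.
results = {"full": (5.5, 4.4), "half": (5.8, 4.4)}

for name, (disabled, enabled) in results.items():
    saved = runtime_reduction(disabled, enabled)
    print(f"{name} storage system: {saved:.1%} shorter with the burst buffer")
```

Because both enabled configurations finish in 4.4 hours, halving the storage system costs no runtime in these experiments once the burst buffer is present.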

18 Mixed Workload Experiments (6)
Half storage system with burst buffer disabled. (Enterprise Storage View) Half storage system with burst buffer enabled. (Enterprise Storage View)

19 Mixed Workload Experiments (7)
Half storage system with burst buffer enabled. (Compute Node View) Job execution time = 4.4 hours Half storage system with burst buffer enabled. (Enterprise Storage View)

20 Mixed Workload Experiments (8)
Full storage system with burst buffer enabled. (Enterprise Storage View) Half storage system with burst buffer enabled. (Enterprise Storage View)

21 Summary and Future Work
Burst buffers enable the following:
  Increase the application-perceived bandwidth
  Reduce the size of the storage system while still achieving desirable bandwidth
  Decrease application time to solution
  Increase storage system bandwidth utilization
  Absorb bursty I/O workloads
Future work:
  Investigate the use of burst buffers on different tiers of the storage system
  Extend the simulator framework to extreme scale, building on the current leadership-class storage system model
  Understand complex system components and facilitate their design through simulation

22 Acknowledgements
Special thanks to Dr. Jason Cope, to my advisor Prof. Christopher Carothers, and to my thesis advisor Dr. Robert Ross. Thanks to Philip Carns, Kevin Harms, Gary Grider, Carlos Maltzahn, and Adam Crume. Thanks to everyone here today! Questions?

