1
Versioning Architectures for Local and Global Memory
Hajime Fujita (1, 2, 3), Kamil Iskra (2), Pavan Balaji (2), Andrew A. Chien (1, 2)
1 University of Chicago, 2 Argonne National Laboratory, 3 Intel
Dec 15, 2015, Hajime Fujita, ICPADS 2015
2
Funding Acknowledgment and Legal Disclaimers
Intel and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information go to http://www.intel.com/performance. *Other names and brands may be claimed as the property of others.
This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Award DE-SC0008603 and Contract DE-AC02-06CH11357 and completed in part with resources provided by the University of Chicago Research Computing Center.
3
Background
High error rate in large-scale supercomputers
Growing concern about latent errors (e.g. silent data corruption)
o Errors that have latency between their occurrence and detection
Multi-versioned data stores are a promising approach to addressing latent errors
4
How Does Multi-versioning Help?
Multi-versioning enables flexible recovery from latent errors
[Figure: three recovery timelines, each running from an error occurrence through corrupted versions or checkpoints to error detection: (a) Traditional checkpoint/restart, (b) Rollback using an old version, (c) Forward error correction using an old version. Other labels in the figure: Start, Restart, Rollback, restored version, new state, Recovery using a part of an old version]
5
Programming with GVR (Global View Resilience)
Globally-shared, multi-version array for application state preservation
Explicit library calls for array manipulation/version creation
[Figure: processes issue Put and Get operations on Array A and Array B, each of which holds multiple versions (Version 1, Version 2, ...)]
6
Many Versions are Partial Updates
Opportunity for saving storage/bandwidth requirements
H. Fujita, et al., Log-Structured Global Array for Efficient Multi-Version Snapshots, CCGrid 2015
7
How to Make Versions Efficiently?
Approach 1: Copy entire array each time
Approach 2: Keep updated data only
Approach 3: Allocate memory block on-demand
[Figure: the three approaches arranged along a Low-to-High scale of runtime overhead and memory savings, with Current and Old copies drawn for each]
H. Fujita, et al., Empirical Comparison of Three Versioning Architecture, Cluster 2015
8
Approach 1: Flat Array
Copy and keep entire array on each version creation
[Figure: Current Version, Version 1, Version 0]
✔ Simple structure, fast access
✖ High memory demand, copy overhead
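A minimal sketch of the flat-array idea, assuming a single-process byte array. The class and method names (`FlatVersionedArray`, `put`, `get`, `create_version`, `memory_bytes`) are hypothetical and are not the GVR API; the point is only that every version costs one full copy.

```python
class FlatVersionedArray:
    """Approach 1 sketch: every version is a full copy of the array."""

    def __init__(self, size):
        self.current = bytearray(size)  # live data that put/get operate on
        self.versions = []              # one full snapshot per version

    def put(self, offset, data):
        # Assumes offset + len(data) stays within the array.
        self.current[offset:offset + len(data)] = data

    def get(self, offset, length, version=None):
        # version=None reads the live array; otherwise read a snapshot.
        src = self.current if version is None else self.versions[version]
        return bytes(src[offset:offset + length])

    def create_version(self):
        # The whole array is copied, even if only a few bytes changed.
        self.versions.append(bytes(self.current))
        return len(self.versions) - 1

    def memory_bytes(self):
        return len(self.current) + sum(len(v) for v in self.versions)


arr = FlatVersionedArray(16)
arr.put(0, b"abcd")
v0 = arr.create_version()
arr.put(0, b"WXYZ")
v1 = arr.create_version()
```

Reads of any version are a direct offset lookup, which is the "fast access" side of the trade-off; `memory_bytes()` grows by the full array size per version, which is the "high memory demand" side.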
9
Approach 2: Flat with Change Tracking
Use a flat array for current version, then only record updated regions upon version creation
[Figure: Current Version, Version 1, Version 0]
✔ Relatively fast access, small footprint
✖ At least one full array, change tracking overhead
10
Approach 2: Flat with Change Tracking (Cont.)
Detecting updated region
User: GVR library records updates on write operations (e.g. put() or acc())
Kernel: Page write protection + page fault handling
HW: Use CPU-based dirty-page tracking
o Requires special modification to the kernel
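The user-level scheme can be sketched as follows: the write path itself records which blocks were touched. This is an illustration under assumptions, not GVR code; `TrackedArray`, `take_dirty_blocks`, and the 4096-byte `BLOCK_SIZE` are hypothetical names and values. The kernel and HW schemes find the same dirty set without instrumenting writes (via page faults or dirty bits) and are not shown here.

```python
BLOCK_SIZE = 4096  # assumed tracking granularity, in bytes


class TrackedArray:
    """User-level change tracking: put() notes every block it touches."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()  # block numbers written since the last version

    def put(self, offset, payload):
        if not payload:
            return
        # Assumes offset + len(payload) stays within the array.
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def take_dirty_blocks(self):
        # Called at version creation: return the dirty set and reset it.
        blocks = sorted(self.dirty)
        self.dirty.clear()
        return blocks


arr = TrackedArray(4 * BLOCK_SIZE)
arr.put(4090, b"x" * 10)       # straddles blocks 0 and 1
arr.put(3 * BLOCK_SIZE, b"y")  # touches block 3 only
changed = arr.take_dirty_blocks()
```

The cost of this scheme is paid on every write (the set update), whereas page-protection-based tracking pays only on the first write to each page after a version is created.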
11
Approach 2: Flat with Change Tracking (Cont.)
Versioning directions (keep new or old values?)
Redo versioning: Keep new values
Undo versioning: Keep original values
[Figure label: One additional copy]
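The two directions can be contrasted with a small sketch. Assumptions made here that the slides do not state: the array starts zero-filled, writes are whole blocks of exactly `BLOCK` bytes, redo restores by replaying deltas forward from the zero base, and undo restores by rolling the current array backward. `RedoArray` and `UndoArray` are hypothetical names.

```python
BLOCK = 4  # tiny block size, for illustration only


def _span(b):
    return slice(b * BLOCK, (b + 1) * BLOCK)


class RedoArray:
    """Keep NEW values: each version stores the blocks changed since the last one."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.current = bytearray(nblocks * BLOCK)
        self.dirty = set()
        self.deltas = []  # deltas[v] = {block: contents at version v}

    def put_block(self, b, data):
        self.current[_span(b)] = data
        self.dirty.add(b)

    def create_version(self):
        # Copy the new contents of every dirty block into the delta.
        self.deltas.append({b: bytes(self.current[_span(b)]) for b in self.dirty})
        self.dirty.clear()
        return len(self.deltas) - 1

    def restore(self, v):
        out = bytearray(self.nblocks * BLOCK)  # assumed zero-filled base
        for delta in self.deltas[:v + 1]:      # replay oldest to target
            for b, data in delta.items():
                out[_span(b)] = data
        return bytes(out)


class UndoArray:
    """Keep ORIGINAL values: save a block's old contents before overwriting it."""

    def __init__(self, nblocks):
        self.current = bytearray(nblocks * BLOCK)
        self.undo = []  # undo[v] = {block: contents as of version v}

    def put_block(self, b, data):
        # First write to this block since the latest version: save old data.
        if self.undo and b not in self.undo[-1]:
            self.undo[-1][b] = bytes(self.current[_span(b)])
        self.current[_span(b)] = data

    def create_version(self):
        self.undo.append({})
        return len(self.undo) - 1

    def restore(self, v):
        out = bytearray(self.current)
        for saved in reversed(self.undo[v:]):  # roll back newest to target
            for b, data in saved.items():
                out[_span(b)] = data
        return bytes(out)


results = {}
for cls in (RedoArray, UndoArray):
    a = cls(2)
    a.put_block(0, b"aaaa")
    v0 = a.create_version()
    a.put_block(1, b"bbbb")
    v1 = a.create_version()
    a.put_block(0, b"cccc")
    results[cls.__name__] = (a.restore(v0), a.restore(v1), bytes(a.current))
```

Both classes reconstruct the same versions; they differ in when the copy happens (at version creation for redo, on the first write after a version for undo) and in which end of the history is cheap to restore.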
12
Approach 3: Log-structured Array
Allocate memory block on-demand
o Allocated regions form a log
o Log = data + metadata (index)
[Figure: Current Version, Version 1, Version 0, Log]
✔ Small footprint
✖ High access overhead
H. Fujita, et al., Log-Structured Global Array for Efficient Multi-Version Snapshots, CCGrid 2015
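A rough sketch of the log idea: data blocks are appended to a log only when first written within a version, and an index maps block numbers to log positions. This is an assumption-laden illustration, not the structure from the cited paper; `LogArray` is a hypothetical name, unwritten blocks are assumed to read as zeros, and the index is copied whole at each version for simplicity.

```python
BLOCK = 4  # tiny block size, for illustration only


class LogArray:
    """Log = data blocks (self.log) + metadata (self.index, self.snapshots)."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        self.log = []        # append-only storage for block contents
        self.index = {}      # current version: block number -> log position
        self.snapshots = []  # one frozen index per created version
        self.frozen = 0      # log entries below this belong to old versions

    def put_block(self, b, data):
        pos = self.index.get(b)
        if pos is not None and pos >= self.frozen:
            self.log[pos] = bytes(data)   # already allocated in this version
        else:
            self.log.append(bytes(data))  # allocate a new block on demand
            self.index[b] = len(self.log) - 1

    def create_version(self):
        self.snapshots.append(dict(self.index))
        self.frozen = len(self.log)
        return len(self.snapshots) - 1

    def get_block(self, b, version=None):
        idx = self.index if version is None else self.snapshots[version]
        pos = idx.get(b)  # every read pays for this index lookup
        return bytes(BLOCK) if pos is None else self.log[pos]


la = LogArray(4)
la.put_block(0, b"aaaa")
la.put_block(0, b"AAAA")  # same version: overwritten in place
v0 = la.create_version()
la.put_block(0, b"bbbb")  # new version: appended, old data is kept
la.put_block(2, b"cccc")
v1 = la.create_version()
```

Only three blocks are stored for a four-block array with two versions, which reflects the small footprint; the index lookup on every access reflects the access overhead.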
13
Problem Statement
Which array architecture brings the best performance and the lowest memory consumption, under varied workloads?
Flat
Flat + change tracking
o Change tracking: user/kernel/HW
o Versioning direction: undo/redo
Log-structured array
Scopes: Global Memory Versioning, Local Memory Versioning
14
Evaluation 1: Runtime Performance of Local Memory Versioning
15
Local Memory Versioning: Setup
RandomAccess benchmark from HPC Challenge
o Repeat 8-byte load/store to uniformly random locations
o Create a version at a certain interval
Intel® Xeon® processor E5620 (2.4 GHz, 4 cores)
Linux kernel 3.18.4
o Applied a patch to enable access to dirty bit information, based on [Vasavada 2011]
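The workload shape described above can be sketched as a loop of 8-byte read-modify-write updates with a version taken every fixed number of operations. This is a simplification: the real HPC Challenge RandomAccess uses its own pseudo-random stream, whereas this sketch uses Python's `random` module, and `random_access`, `ops_per_version`, and the `create_version` callback are hypothetical names.

```python
import random

MASK64 = (1 << 64) - 1


def random_access(table, num_updates, ops_per_version, create_version, seed=0):
    """XOR random 64-bit values into uniformly random table slots."""
    rng = random.Random(seed)
    for i in range(1, num_updates + 1):
        value = rng.getrandbits(64)
        idx = rng.randrange(len(table))
        table[idx] = (table[idx] ^ value) & MASK64  # 8-byte load + store
        if i % ops_per_version == 0:
            create_version(table)  # versioning at a fixed interval


snapshots = []
table = [0] * 1024
# Stand-in for a versioning backend: keep a full copy per version.
random_access(table, 10_000, 2_500, lambda t: snapshots.append(list(t)))
```

Shrinking `ops_per_version` raises the versioning frequency, which is the parameter varied in the runtime comparison of tracking schemes.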
16
Runtime Performance with Various Tracking Schemes
Redo performance relative to flat (no versioning)
Plot annotation: = how many read/write ops per version
Best tracking scheme
o Low frequency: kernel/HW
o High frequency: user
17
Runtime Performance with Different Versioning Directions
Compare Undo vs Redo
Redo performance relative to Undo
Redo is up to 22% slower due to extra copy
18
Evaluation 2: Runtime Performance and Memory Consumption of Global Memory Versioning
19
Synthetic Benchmark
Get() and Put() to random locations + version creation
Based on APEX-Map [E. Strohmaier et al. 2004]
Parameter: Versioning frequency
Environment: UChicago RCC Midway
o Intel® Xeon® processor E5-2670 (8 cores x2)
o InfiniBand FDR-10
o MVAPICH2 (gcc)
Locality  Example
High      Radix sort
Medium    N-body
Low       Matmul
[Figure: access probability over array index for processes P0, P1, P2]
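One way to generate accesses with tunable locality is the power-law form commonly associated with APEX-Map: draw r uniformly from [0, 1) and use r^(1/k) as a normalized position, so that k = 1 is uniform and smaller k concentrates accesses. The sketch below assumes that form; the exact generator used in the benchmark is not shown in the slides, and `apex_like_indices` is a hypothetical name.

```python
import random


def apex_like_indices(n, k, count, seed=0):
    """Return `count` indices in [0, n); smaller k means more localized access."""
    if not 0.0 < k <= 1.0:
        raise ValueError("k must be in (0, 1]")
    rng = random.Random(seed)
    # r ** (1/k) stays in [0, 1); min() guards against rounding up to n.
    return [min(n - 1, int(n * rng.random() ** (1.0 / k))) for _ in range(count)]


n = 1_000_000
uniform = apex_like_indices(n, 1.0, 10_000)    # low locality
medium = apex_like_indices(n, 0.025, 10_000)   # the k used for "medium locality"
mean_uniform = sum(uniform) / len(uniform)
mean_medium = sum(medium) / len(medium)
```

With k = 0.025 most indices fall near the start of the range, so repeated Put() calls tend to hit the same blocks and each version's set of updated blocks stays small.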
20
Runtime Performance with Various Global Versioning
Flat with change tracking is best for performance
[Figure: throughput (Kops/s), medium locality (k=0.025)]
#procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%
21
Memory Usage with Various Global Versioning
Log-structured array is best for memory usage
[Figure: memory usage (MiB)]
#procs=32, block size=4096 B, array size=256 MiB/proc, read ratio=50%, versioning frequency=1e-5
22
Evaluation 3: Version Retrieval Cost
23
Version Retrieval Cost
Partial retrieval
o e.g. Localized recovery
1. Create 256 versions with certain fill ratio
2. Pick one version
3. Read from 10,000 random locations in that version
Full retrieval
o e.g. Full rollback
1. Create 256 versions with certain fill ratio
2. Pick one version
3. Read the entire contents of that version
24
Full Version Retrieval Cost
Flat and log-structured arrays have a constant cost of version rollback
Redo versioning is good at restoring older versions, whereas undo versioning is good at restoring newer versions
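The asymmetry between redo and undo can be illustrated with a simple count of how many per-version deltas a full restore must apply. This is a back-of-the-envelope model, not a measurement: it assumes redo replays deltas forward from an initial base and undo rolls back from the current array, and `deltas_to_apply` is a hypothetical helper.

```python
def deltas_to_apply(num_versions, target, direction):
    """Count the deltas touched when fully restoring version `target` (0 = oldest)."""
    if not 0 <= target < num_versions:
        raise ValueError("target out of range")
    if direction == "redo":
        return target + 1             # replay deltas 0..target
    if direction == "undo":
        return num_versions - target  # roll back newest..target
    raise ValueError("direction must be 'redo' or 'undo'")


# 256 versions, as in the retrieval experiment.
costs = {
    v: (deltas_to_apply(256, v, "redo"), deltas_to_apply(256, v, "undo"))
    for v in (0, 128, 255)
}
```

Under this model the oldest version costs 1 delta with redo and 256 with undo, and the newest costs the reverse, while a flat or log-structured array reads any version directly.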
25
Partial Version Retrieval Cost
Fill ratio = 1%
Flat and log-structured arrays have more uniform, shorter latency
Flat with tracking encounters higher variation and higher average latency
26
Related Work
Log-structured file systems
o LFS [Rosenblum 1992], PLFS [Bent 2009]
o Focused on improving write performance, while our focus is on capturing writes
Log-structured distributed data stores
o RAMCloud [Ongaro 2011, Rumble 2014], SILT [Lim 2011], Pilaf [Mitchell 2013]
o Similar structure to log-structured array
o GVR is array-oriented (not KV-oriented)
Incremental checkpointing
o [Plank 1995], TICK [Gioiosa 2005], [Agarwal 2004]
o Not focusing on RDMA, a new challenge to transparent change tracking
27
Summary
Compared local and global memory versioning architectures for efficient versioning
Findings from evaluation
o Flat with change tracking: best performance in most cases
o Log-structured array: best choice for memory savings, uniform and low-cost recovery
Future Work
o Analysis of data redundancy inside the array, seeking a way to harden the array (e.g. error correction coding)
o Investigation of hardware/software architectures that allow fine-grain, efficient change tracking on remote memory
http://gvr.cs.uchicago.edu
28
Backup
29
Fine-grain Comparison on Memory Change Tracking (1)
Memory access latency of the first write to each page
Kernel change tracking has higher latency due to page fault handling
30
Performance Comparison (2)
Performance over various versioning frequencies
RMA, #procs=32, block size=4096 B, array size=512 MB/proc, read ratio=50%
Log-structured array works better for localized (smaller k) access patterns
31
Memory Consumption
Log-structured array requires the least amount of memory
Undo versioning requires additional memory for the undo buffer
Flat array requires a fixed amount of memory, regardless of locality
For flat with tracking and log array, higher locality incurs lower memory consumption
32
Incremental/decremental