Storage Issues
Replica Placement Most existing work focuses on how to place replicas at low cost. It may be safer to separate replicas as far apart as possible: ◦ Same server => lost on a server crash ◦ Same rack => lost on a rack failure ◦ Same datacenter => lost in an earthquake or other cataclysm Placement should consider both distance and cost.
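The distance-vs-cost trade-off above can be sketched as a greedy placement: pick each replica to maximize its minimum failure-domain separation from the replicas already chosen, penalized by cost. This is a minimal illustration, not the method of any cited work; all names (`Node`, `place_replicas`, the `alpha` weight) are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    name: str
    rack: str
    datacenter: str
    cost: float  # e.g., network distance or storage price (illustrative)

def separation(a: Node, b: Node) -> int:
    """Coarse failure-domain distance: 0 same server, 1 same rack,
    2 same datacenter, 3 different datacenters."""
    if a.name == b.name:
        return 0
    if a.rack == b.rack:
        return 1
    if a.datacenter == b.datacenter:
        return 2
    return 3

def place_replicas(nodes, k, alpha=1.0):
    """Greedy placement: start from the cheapest node, then repeatedly
    pick the node maximizing (min separation to chosen replicas) - alpha*cost."""
    chosen = [min(nodes, key=lambda n: n.cost)]
    while len(chosen) < k:
        best = max(
            (n for n in nodes if n not in chosen),
            key=lambda n: min(separation(n, c) for c in chosen) - alpha * n.cost,
        )
        chosen.append(best)
    return chosen
```

With `alpha` small, the greedy step spreads replicas across racks and datacenters; with `alpha` large, it degenerates to cheapest-first placement.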
Data Deduplication Data deduplication is a specialized data compression technique for eliminating coarse-grained redundant data. ◦ Improves storage utilization. Issues: ◦ How to improve the efficiency of duplicate detection and chunk-existence queries: efficient chunking, faster hash indexing, locality-preserving index caching, efficient Bloom filters, etc. ◦ Compressing the unique chunks and performing (fixed-size) large writes through containers or similar structures.
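The core dedup loop described above (chunk, fingerprint, store only unique chunks) can be sketched with fixed-size chunking and SHA-256 fingerprints. This is a simplified model; real systems use content-defined chunking and on-disk indexes, and the names `dedup_store`/`restore` are illustrative.

```python
import hashlib

def dedup_store(data: bytes, chunk_size: int = 8, index=None):
    """Split data into fixed-size chunks, fingerprint each with SHA-256,
    store only previously unseen chunks in `index`, and return the
    recipe (fingerprint sequence) needed to reconstruct the stream."""
    index = {} if index is None else index
    recipe = []
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in index:      # duplicate detection via hash lookup
            index[fp] = chunk    # write the unique chunk only once
        recipe.append(fp)
    return recipe, index

def restore(recipe, index) -> bytes:
    """Reconstruct the original stream from its recipe."""
    return b"".join(index[fp] for fp in recipe)
```

The `index` lookup is exactly the "chunk existence query" the slide mentions: it is the hot path that hash indexing, locality-preserving caching, and Bloom filters all try to speed up.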
Read Performance of Deduplication Storage Publication of David H. C. Du, HPCC'11. Read performance is critical for reconstructing the original data stream.
Read Performance of Deduplication Storage (Cont.) One example is storing images of VMs (process/memory/disk state) on shared network storage. ◦ VM images of idle desktops are migrated to network storage to save energy.
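Why restore read performance suffers can be illustrated by counting container fetches: when a recipe's chunks are scattered across many containers (fragmentation), a small container cache thrashes and the restore issues far more reads. This toy model is an assumption for illustration, not the paper's evaluation method.

```python
from collections import OrderedDict

def container_reads(recipe, chunk_to_container, cache_size=2):
    """Simulate a restore: for each fingerprint in the recipe, fetch its
    container unless it is already in a small LRU container cache.
    Returns the number of container fetches (proxy for read cost)."""
    cache = OrderedDict()
    fetches = 0
    for fp in recipe:
        c = chunk_to_container[fp]
        if c in cache:
            cache.move_to_end(c)      # LRU hit: refresh recency
        else:
            fetches += 1              # cache miss: read container from disk
            cache[c] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return fetches
```

A sequential recipe (chunks grouped by container) needs one fetch per container, while a fragmented recipe alternating between containers can force a fetch per chunk.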
Benchmarks ◦ Filebench: http://sourceforge.net/apps/mediawiki/filebench/index.php ◦ Phoronix Test Suite (disk test suite): http://www.phoronix-test-suite.com/ ◦ Bonnie++: http://freecode.com/projects/bonnie