Storage Issues. Last Time Deduplication storage ◦ Read performance is critical to reconstruct the original data stream.


1 Storage Issues

2 Last Time
Deduplication storage
◦ Read performance is critical to reconstruct the original data stream.

3 Problem
Assume that data streams A, B and C are stored on different disks. After deduplication, every chunk is unique, so each chunk is stored on exactly one disk. Decide a chunk placement such that the number of cross-disk accesses each data stream needs during reconstruction is almost the same.
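A minimal sketch of how the quantity in this problem could be measured. The slide does not define a cross-disk access precisely, so this assumes that each stream has one home disk and that every chunk of the stream placed on any other disk costs one cross-disk access. The function name, the stream contents, and the placement are all hypothetical.

```python
from statistics import pstdev


def cross_disk_accesses(streams, home_disk, placement):
    """Per stream, count the chunks stored on a disk other than the stream's home disk.

    Assumption (not from the slides): one off-disk chunk = one cross-disk access.
    """
    return {
        name: sum(1 for chunk in chunks if placement[chunk] != home_disk[name])
        for name, chunks in streams.items()
    }


# Hypothetical input: three streams that share some deduplicated chunks.
streams = {"A": [1, 2, 3], "B": [2, 3, 4], "C": [1, 4, 5]}
home_disk = {"A": 0, "B": 1, "C": 2}
# Each unique chunk lives on exactly one disk.
placement = {1: 0, 2: 1, 3: 0, 4: 2, 5: 2}

counts = cross_disk_accesses(streams, home_disk, placement)
spread = pstdev(counts.values())  # population standard deviation of the counts
print(counts, spread)
```

A placement is better, in the sense of the problem statement, when `spread` is smaller for the same average count.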

4 Example
Streams A, B, C. C_a, C_b, C_c are the cross-disk access counts of each stream.

C_a   C_b   C_c   Avg.   Std.
0     4     2     2      √(8/3)
5     0     1     2      √(14/3)
3     3     0     2      √(6/3)
3     2     1     2      √(2/3)
2     2     2     2      0        Optimal Solution!
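The standard deviations in this example can be checked with a few lines of Python. This is only a verification sketch: it assumes the population standard deviation (divide by the number of streams, 3), which is what the √(8/3) style values imply, and the helper name `access_stats` is made up for illustration.

```python
from fractions import Fraction
from math import sqrt


def access_stats(counts):
    """Return (average, population variance) of the counts as exact fractions."""
    n = len(counts)
    avg = Fraction(sum(counts), n)
    var = sum((Fraction(c) - avg) ** 2 for c in counts) / n
    return avg, var


# (C_a, C_b, C_c) for several deployments; the last one is perfectly balanced.
deployments = [(0, 4, 2), (5, 0, 1), (3, 3, 0), (2, 2, 2)]
stats = [access_stats(d) for d in deployments]

for d, (avg, var) in zip(deployments, stats):
    # The standard deviation is the square root of the exact variance.
    print(d, "avg =", avg, "std = sqrt(%s) = %.3f" % (var, sqrt(var)))
```

Every deployment has the same average of 2, so only the standard deviation distinguishes them, and it reaches 0 when all three counts are equal.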

5 Another Example
Streams A, B, C, ……
Quantities shown: C_a, C_b, C_c, Avg., Std.
Values shown: k, k/2, k, 0, √(k²/12); then k/2, 0. Optimal Solution!

6 Bipartite Graph
[Figure: bipartite graph between nodes A, B, C and nodes 1, 2, 3, 4, 5. Labels shown: A1 B2 A3 B4 A5 B1 A2 C3 A4 C4 B5]

7 Maximum Flow
[Figure: flow network over nodes A, B, C, nodes C1 to C5, and nodes 1 to 5. Five edges are labeled 1; three edges are labeled 4-2=2, 2-2=0 and 5-2=3. *Avg = 2]

8 David Du’s Keynote Speech
A New Era after the Convergence of Network Centric and Data Centric Computing
Challenges in the new environment:
◦ Network is no longer just end-to-end.
◦ Data is no longer just structured.
◦ Data -> Information -> Knowledge
◦ Data/Network Security, Data/Information Privacy
◦ Long-Term Data Preservation
◦ Scalability (Internet, cloud storage and exascale computing)

9 Some More Challenges
How to model dynamically changing data relationships?
◦ A relationship can be changed by an event, by situations, or by interests.
How to decide which data is to be sent (to whom?), stored (for how long?), or dropped?

10 Intelligent Storage Based on Object-based Storage Devices (OSD)

11 Application of Data Dedupe and Cloud Storage
Scalable storage architecture (including hierarchical network architecture) for data center
Data Center-Wide Deduplication
◦ Distributed shared storage environment.
HDFS Dedup

12 Application of Data Dedupe and Cloud Storage (Cont.)
User and Task Dynamic Allocations
Power Saving with Performance Guaranteed

13 His Conclusion
Many Research Challenges
New Internet Architecture Is Required
Thinking Beyond Existing Storage Hierarchy
Migrating Higher Level Functions into Lower Level Is the Key for Scalability
Exascale and Cloud Computing Are Just Beginning
Security and Privacy Issues Are Paramount
Long-Term Data Preservation Is A Crisis Now
