Published by Amice Welch. Modified over 9 years ago.
1
Reclaiming Space from Duplicate Files in a Serverless Distributed File System From Microsoft Research
2
Motivation
Desktop computers have unused disk space
A lot of files are identical
The unused space can be used to build a “central” file server
Provide high availability & reliability
Farsite
–Convergent encryption
–SALAD
3
Convergent encryption
Identical files are still identical after encryption, even when users hold different keys
K1 = Hash(P)
C1 = E1(P, K1)
M = E2(K1, Ku)
C1 is the same for identical files, but M is different for each user.
Without Ku, nobody can read P.
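The three equations on this slide can be traced in a short Python sketch. It is illustrative only: `keystream_xor` is a toy XOR cipher standing in for both E1 and E2, SHA-256 is assumed as the hash, and the user key Ku is modeled as a shared secret byte string. None of these choices comes from the slide, and the toy cipher is not secure.

```python
import hashlib


def keystream_xor(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher (assumption, NOT secure): XOR the data with a
    SHA-256 based keystream. Applying it twice with the same key decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def convergent_encrypt(plaintext: bytes, user_key: bytes):
    """Return (C1, M) as on the slide."""
    k1 = hashlib.sha256(plaintext).digest()   # K1 = Hash(P)
    c1 = keystream_xor(plaintext, k1)         # C1 = E1(P, K1)
    m = keystream_xor(k1, user_key)           # M  = E2(K1, Ku)
    return c1, m


def convergent_decrypt(c1: bytes, m: bytes, user_key: bytes) -> bytes:
    """Recover K1 from M with the user key, then recover P from C1."""
    k1 = keystream_xor(m, user_key)
    return keystream_xor(c1, k1)


if __name__ == "__main__":
    data = b"same file contents"
    c1_alice, m_alice = convergent_encrypt(data, b"alice-key")
    c1_bob, m_bob = convergent_encrypt(data, b"bob-key")
    print(c1_alice == c1_bob)   # identical files give identical C1
    print(m_alice == m_bob)     # M differs per user
```

Because K1 depends only on the file contents, a storage node can detect that two users hold the same file by comparing C1, without being able to read either copy.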
4
THEX (Tree Hash EXchange format)
               ROOT=H(E+F)
              /           \
      E=H(A+B)             F=H(C+D)
      /      \             /      \
A=H(S1)   B=H(S2)    C=H(S3)   D=H(S4)
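The tree in this slide can be computed bottom-up, as in the sketch below. It follows the slide's simplified diagram rather than the full THEX specification: SHA-256 is an assumed choice of H, "+" is taken to mean byte concatenation, and promoting an unpaired node to the next level unchanged is an assumption of this sketch.

```python
import hashlib


def H(data: bytes) -> bytes:
    """Hash function H from the diagram (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def tree_root(segments):
    """Compute the root hash over a list of file segments S1..Sn."""
    if not segments:
        raise ValueError("at least one segment is required")
    level = [H(s) for s in segments]           # leaves: A=H(S1), B=H(S2), ...
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                parents.append(H(level[i] + level[i + 1]))  # e.g. E=H(A+B)
            else:
                parents.append(level[i])       # unpaired node moves up as-is
        level = parents
    return level[0]


if __name__ == "__main__":
    root = tree_root([b"S1", b"S2", b"S3", b"S4"])
    print(root.hex())
```

Changing any single segment changes the root, so a receiver who trusts the root can verify segments individually by checking the hashes along one path of the tree.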
5
SALAD
Self-Arranging, Lossy, Associative Database
Leaf: all nodes
Cell: a set of nodes; files are fully duplicated within a cell
Every file has a fingerprint
Cell-ID width W = lg(L/Λ)
–L: system size, Λ: target redundancy factor
Dimensionality parameter D
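As a worked example of the cell-ID width formula, the sketch below computes W from a system size and a target redundancy factor. Rounding W down to an integer, and deriving a cell-ID from the W low-order bits of a fingerprint, are assumptions made for illustration; the slide gives only the formula.

```python
import math


def cell_id_width(system_size: int, redundancy: int) -> int:
    """W = lg(L / redundancy), rounded down (rounding is an assumption)."""
    return max(0, math.floor(math.log2(system_size / redundancy)))


def cell_id(fingerprint: int, width: int) -> int:
    """Hypothetical mapping: keep the `width` low-order fingerprint bits."""
    return fingerprint & ((1 << width) - 1)


if __name__ == "__main__":
    w = cell_id_width(system_size=1024, redundancy=4)
    print(w)                                  # lg(1024 / 4) = lg(256) = 8
    print(bin(cell_id(0b101101101101, w)))    # low 8 bits of the fingerprint
```

With L = 1024 nodes and a redundancy factor of 4, there are 2^8 = 256 cells of about 4 nodes each.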
6
SALAD
7
Files are fully duplicated inside cells. Each node maintains a routing table of all vector-aligned nodes.
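The slide does not define "vector-aligned". The sketch below assumes one reading: the bits of a cell-ID are dealt round-robin into D coordinates, and two cells are vector-aligned when their coordinates differ in at most one position. Both the bit layout and the alignment rule are assumptions, and all function names are hypothetical.

```python
def coordinates(cell_id: int, width: int, dims: int):
    """Split a `width`-bit cell-ID into `dims` coordinates, round-robin by bit."""
    coords = [0] * dims
    for bit in range(width):
        if (cell_id >> bit) & 1:
            coords[bit % dims] |= 1 << (bit // dims)
    return coords


def vector_aligned(a: int, b: int, width: int, dims: int) -> bool:
    """Assumed rule: cells are aligned if at most one coordinate differs."""
    ca = coordinates(a, width, dims)
    cb = coordinates(b, width, dims)
    return sum(1 for x, y in zip(ca, cb) if x != y) <= 1


def routing_table(me: int, all_cells, width: int, dims: int):
    """Cells that `me` would track under the assumed alignment rule."""
    return [c for c in all_cells if c != me and vector_aligned(me, c, width, dims)]


if __name__ == "__main__":
    table = routing_table(0, range(16), width=4, dims=2)
    print(len(table), "of 15 other cells are tracked")
```

Under these assumptions, with W = 4 and D = 2, a cell tracks 6 of the 15 other cells, which illustrates why the routing table stays smaller than the full membership list.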
8
SALAD: properties
Each node estimates the system size separately
Inconsistent estimates do not cause malfunction, only reduced efficiency
Routing table is relatively small
Robust to attack
9
A Demand based Algorithm for Rapid Updating of Replicas
From Polytechnic University of Catalonia, Spain
In weak consistency algorithms, if the replicas with the most demand are updated first, a greater number of clients gain access to updated content in a shorter period of time.
Anti-entropy session: two servers mutually exchange summary vectors and then exchange data to build consistent content
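An anti-entropy session of the kind described here might look like the following sketch. The `Replica` class, the per-origin sequence numbers, and the definition of a summary vector as "highest sequence number seen per origin" are assumptions chosen for illustration; the slide does not specify the data structures.

```python
class Replica:
    """A server holding a log of updates keyed by (origin, sequence number)."""

    def __init__(self, name: str):
        self.name = name
        self.log = {}

    def write(self, value):
        """Record a local update with the next sequence number for this origin."""
        seq = self.summary().get(self.name, 0) + 1
        self.log[(self.name, seq)] = value

    def summary(self):
        """Summary vector: highest sequence number seen from each origin."""
        vec = {}
        for origin, seq in self.log:
            vec[origin] = max(vec.get(origin, 0), seq)
        return vec

    def missing_for(self, other_summary):
        """Updates this replica has that the other summary does not cover."""
        return {key: value for key, value in self.log.items()
                if key[1] > other_summary.get(key[0], 0)}


def anti_entropy_session(a: Replica, b: Replica) -> None:
    """Exchange summary vectors, then send each side what it lacks."""
    summary_a, summary_b = a.summary(), b.summary()
    to_b = a.missing_for(summary_b)
    to_a = b.missing_for(summary_a)
    a.log.update(to_a)
    b.log.update(to_b)


if __name__ == "__main__":
    a, b = Replica("a"), Replica("b")
    a.write("x")
    a.write("y")
    b.write("z")
    anti_entropy_session(a, b)
    print(a.log == b.log)
```

Exchanging summaries first means only the missing updates cross the network, not the whole log.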
10
Algorithm
Each node has a number denoting its demand for some replica
Choose the neighbor which has the highest demand to start the session
After a session, the node that has just received the new update continues this process if it has some neighbor with higher demand than itself.
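The propagation rule on this slide can be sketched as a walk over the neighbor graph. The graph, the demand values, and the choice to skip neighbors that already hold the update are assumptions for illustration, and `propagate` is a hypothetical name.

```python
def propagate(start, neighbors, demand):
    """Return the order in which nodes receive an update that begins at `start`.

    neighbors: dict mapping node -> list of adjacent nodes
    demand:    dict mapping node -> demand value
    """
    path = [start]
    current = start
    while True:
        # Assumption: neighbors that already have the update are skipped.
        candidates = [n for n in neighbors[current] if n not in path]
        if not candidates:
            return path
        best = max(candidates, key=lambda n: demand[n])
        # After the first session, continue only toward higher demand.
        if current != start and demand[best] <= demand[current]:
            return path
        path.append(best)
        current = best


if __name__ == "__main__":
    neighbors = {
        "A": ["B", "C"],
        "B": ["A", "D"],
        "C": ["A", "D"],
        "D": ["B", "C", "E"],
        "E": ["D"],
    }
    demand = {"A": 5, "B": 2, "C": 3, "D": 9, "E": 1}
    print(propagate("A", neighbors, demand))
```

In this example the update moves from A to C (A's highest-demand neighbor), then to D because its demand exceeds C's, and stops there because D's remaining neighbors all have lower demand.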
11
Algorithm
Demand: number of requests per unit time
–What exactly does it mean? How is it obtained?
Dynamic algorithm:
–The demand of neighbors may change over time, so neighbors exchange their demand periodically.
–How does the static algorithm work? How and when does a node get the demand of its neighbors?