Samsara: Honor Among Thieves in Peer-to-Peer Storage Landon P. Cox and Brian D. Noble University of Michigan
Samsara
From Wikipedia, the free encyclopedia:
Saṅsāra or Saṃsāra (Sanskrit: संसार) literally means "continuous flow".
It is the cycle of birth, life, death, rebirth or reincarnation within many Eastern religions.
Paper overview
Proposes an incentive mechanism motivating participants in a P2P distributed file system to contribute as much space as they consume.
Addresses the tragedy of the commons.
Requires each peer that requests storage from another peer to hold a claim for the same amount of storage.
Claims can be exchanged.
The tragedy of the commons
Assume a group of herders that share a common pasture, on which they are entitled to let their cows graze.
To maximize his/her personal benefit, each herder will put as many cows as he/she can on the common pasture.
As a result, the common pasture becomes overgrazed and useless.
This happened to the Boston Common.
Boston Common
Introduction
P2P file systems have many advantages.
They require users to consume storage according to their contribution; otherwise the system will collapse.
The solution is a mechanism enforcing "storage fairness": an incentive mechanism.
Extant solutions
A trusted third party enforcing quotas: requires a centralized administration.
Letting people buy and sell storage space: requires a trusted clearance infrastructure.
Using certified identities and trusted keys: requires a trusted certification authority.
Enforcing total symmetry within pairs of peers: impractical.
Samsara key idea (I)
Manufacture symmetric relations through claim forwarding.
All exchanges of data for claims form symmetric contracts.
Each node periodically checks the other for compliance, in a probabilistic fashion.
When a node breaches the contract, the other node is free to drop the data of its partner.
Samsara key idea (II)
Nodes can forward claims rather than honoring them, but they remain responsible for the claims they have forwarded.
The mechanism penalizes unresponsive nodes in a probabilistic fashion.
A node suffering a short outage may lose some replicas of its data.
Background
Samsara is an add-on to Pastiche, a P2P cooperative backup system (to be discussed later).
Pastiche is itself built on top of the Pastry network.
[Figure: layer diagram with boxes labeled Pastiche, Samsara, Pastry, and OS + Disks]
Overall design
The objective is equal exchange.
If A stores data for B, then B must store an equal-size claim for A.
If B discards A's claim, then A can discard B's data.
Equal exchange is enforced by periodic queries.
Not answering a query is a sufficient reason to have your data dropped.
The problem
This simple claim model punishes nodes too severely for transient failures.
The new approach is probabilistic and takes transient failures into account.
When a node fails to answer a query, each of its replica sites drops the node's data with some probability.
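The effect of probabilistic dropping can be illustrated with a small sketch. It assumes a fixed, hypothetical per-query drop probability and independent decisions at each replica site; the slide only says "with some probability", so the names `drop_probability`, `replicas_after_missed_query`, and `prob_all_replicas_lost` are illustrative and not taken from Samsara.

```python
import random


def replicas_after_missed_query(replicas, drop_probability, rng):
    """Return the replica sites that still keep the data after one missed query.

    Each site independently drops the data with probability drop_probability
    (an assumed constant; the real policy may differ).
    """
    return [site for site in replicas if rng.random() >= drop_probability]


def prob_all_replicas_lost(n_replicas, drop_probability, missed_queries):
    """Probability that every replica is gone after some missed queries.

    Assumes each site sees the same number of missed queries and decides
    independently of the other sites.
    """
    per_site_lost = 1.0 - (1.0 - drop_probability) ** missed_queries
    return per_site_lost ** n_replicas


if __name__ == "__main__":
    rng = random.Random(0)
    sites = ["site1", "site2", "site3", "site4"]
    print(replicas_after_missed_query(sites, 0.25, rng))
    # A short outage (one missed query) versus a long one (twenty missed queries).
    print(prob_all_replicas_lost(4, 0.25, 1))
    print(prob_all_replicas_lost(4, 0.25, 20))
```

Under these assumptions a short outage is likely to cost a node only some of its replicas, while a node that stays unresponsive eventually loses all of them.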
Claim construction (I)
Claims are “incompressible placeholders”.
Computing a claim requires a secret passphrase P, a secret symmetric key K, and a location in storage space.
Claim construction (II)
Assume we have 512-byte claims.
The first claim C_0 would contain twenty-five 20-byte hashes h_i = SHA1(P, i), where P is the secret passphrase and i the hash index, followed by the first 12 bytes of the next hash in the sequence (25 × 20 + 12 = 512 bytes), all encrypted with the symmetric key K:
C_0 = {h_0, h_1, …, h_24, first 12 bytes of h_25}_K
Claim construction (III)
Successive claims are built by repeating the process:
C_1 = {h_26, h_27, …, h_50, first 12 bytes of h_51}_K
C_i = {h_j, h_j+1, …, h_j+24, first 12 bytes of h_j+25}_K, where j = 26i
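The construction can be sketched in a few lines of Python. A SHA-1 digest is 20 bytes long, so 25 digests plus 12 bytes of a 26th fill a 512-byte claim exactly. The sketch only builds the plaintext of a claim: the encryption step with K is left out, and the way P and i are combined inside SHA1(P, i) (here, P followed by an 8-byte big-endian index) is an assumption, as are the helper names `hash_at` and `claim_plaintext`.

```python
import hashlib

HASH_LEN = 20                                   # SHA-1 digest size in bytes
CLAIM_LEN = 512                                 # claim size used on the slide
FULL_HASHES = CLAIM_LEN // HASH_LEN             # 25 whole hashes per claim
TAIL_LEN = CLAIM_LEN - FULL_HASHES * HASH_LEN   # 12 leftover bytes
HASHES_PER_CLAIM = FULL_HASHES + 1              # 26 hash indices used per claim


def hash_at(passphrase: bytes, index: int) -> bytes:
    """h_index = SHA1(P, index); the (P, index) encoding is an assumption."""
    return hashlib.sha1(passphrase + index.to_bytes(8, "big")).digest()


def claim_plaintext(passphrase: bytes, i: int) -> bytes:
    """Plaintext of claim C_i, before the (omitted) encryption with key K."""
    j = HASHES_PER_CLAIM * i
    body = b"".join(hash_at(passphrase, j + k) for k in range(FULL_HASHES))
    tail = hash_at(passphrase, j + FULL_HASHES)[:TAIL_LEN]
    return body + tail


if __name__ == "__main__":
    c0 = claim_plaintext(b"my secret passphrase", 0)
    print(len(c0))  # length of one claim in bytes
```

Because every claim can be regenerated from P, K, and its index, the owner does not need to keep a copy of the claims it hands out, while the holder cannot compress them.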
Answering claim queries
Can be done with a single SHA1 hash.
The querying party provides a unique value h_0 and a list of objects to verify.
The responding party appends h_0 to the first object O_0 in the list and computes h_1 = SHA1(O_0, h_0), recursively computes h_i+1 = SHA1(O_i, h_i), and returns the last h_j.
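The chained hash described above can be sketched as follows. SHA1(O_i, h_i) is taken here to mean the SHA-1 digest of the object bytes followed by the previous hash, which matches "append h_0 to the first object"; the function name `answer_query` is hypothetical.

```python
import hashlib
from typing import Iterable


def answer_query(h0: bytes, objects: Iterable[bytes]) -> bytes:
    """Fold the queried objects into one hash, starting from the nonce h0.

    Computes h_{i+1} = SHA1(O_i || h_i) for each object in order and
    returns the final value.
    """
    h = h0
    for obj in objects:
        h = hashlib.sha1(obj + h).digest()
    return h


if __name__ == "__main__":
    nonce = b"unique value chosen by the querier"
    stored = [b"object 0 contents", b"object 1 contents"]
    print(answer_query(nonce, stored).hex())
```

The querier can run the same function over its own copies (or over regenerated claims) and compare results. Since h_0 is fresh for each query, the responder cannot precompute the answer and must actually hold every listed object.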
Example (I)
Example B has claim β1 on A and C has claim γ1 on B
Example Node B does not have enough space to hold claim γ1
Example Node B forwards its claim for space on node A to node C
Claim forwarding
If a node X has a claim ξ on another node Y and holds a claim ζ for a third node Z, it can forward claim ζ to node Y.
Everything works fine until a node fails.
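A toy model makes the bookkeeping concrete. It reuses the A, B, C example: A holds B's claim β1, B holds C's claim γ1, and B forwards γ1 to A in place of β1. The `Node` class, its fields, and its methods are purely illustrative assumptions; claims are represented by their names and no storage or hashing is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    stored: set = field(default_factory=set)       # claims held on this node's disk
    forwarded: dict = field(default_factory=dict)  # claim -> node it was forwarded to

    def forward(self, incoming_claim, own_claim, holder):
        """Replace own_claim on holder with incoming_claim, freeing local space."""
        if own_claim not in holder.stored:
            raise ValueError(f"{holder.name} does not hold {own_claim}")
        holder.stored.remove(own_claim)
        holder.stored.add(incoming_claim)
        self.stored.discard(incoming_claim)
        self.forwarded[incoming_claim] = holder

    def can_answer(self, claim) -> bool:
        """A node stays responsible for a claim even after forwarding it."""
        if claim in self.stored:
            return True
        next_holder = self.forwarded.get(claim)
        return next_holder is not None and next_holder.can_answer(claim)


if __name__ == "__main__":
    a = Node("A", stored={"beta1"})
    b = Node("B", stored={"gamma1"})
    b.forward("gamma1", "beta1", a)
    print(a.stored, b.stored, b.can_answer("gamma1"))
    a.stored.clear()                 # A fails or discards the claim
    print(b.can_answer("gamma1"))    # B can no longer satisfy C's query
```

The last two lines show why forwarding is risky: once A stops holding γ1, B fails C's query even though B itself behaved correctly.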
Failures in dependency chains
Before failure, B stores data for A, C stores data for B, …, and E stores data for D and holds a claim ε1 on A.
When C fails and stops answering queries from B, B uses its storage rights on A and replaces claim ε1 by its own claim β1.
Failures in dependency chains
After that we have a cascade of damaging actions:
A fails to answer queries from E.
E holds D responsible for the loss of claim ε1 and discards the data it had stored for D.
D loses its backup data on E even though it had always operated in a correct fashion.
Forwarding claims increases the risk of data losses.
Failures in dependency cycles
The effect of a failure is much less dramatic when we have a dependency cycle, where B stores data for A, C stores data for B, …, E stores data for D, and A stores data for E.
Failures in dependency cycles
When C fails and stops answering queries from B, B uses its storage rights on A and requests it to store its claim β1.
Since A stores data for E, it can forward claim β1 to E.
Since E stores data for D, it can forward claim β1 to D.
D keeps claim β1 because it has data on E.
Evaluation
Samsara is faster than scp.
Most chains are short as long as there is free space. Great news!
Nodes should forward claims in a very conservative fashion to minimize data losses.