Published by Autumn Mudgett. Modified over 9 years ago.
1
Sharding and the Isis 2 DHT
Did you understand how the Isis 2 distributed hash table works?
2
DHT Basics
Suppose we have a group containing 500 members, and decide to store data in shards of size 3.
A. This isn’t going to work: the shard size needs to divide evenly into the group size.
B. In Isis 2 some shards can be a little too big, or too small. The value you specify is more of a target.
3
DHT Basics
Suppose we have a group containing 500 members, and decide to store data in shards of size 3.
A. This isn’t going to work: the shard size needs to divide evenly into the group size.
B. In Isis 2 some shards can be a little too big, or too small. The value you specify is more of a target.
You do get to specify a minimum size for the group as a whole, below which Isis 2 temporarily disables the DHT functionality.
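The arithmetic behind a "target" shard size can be sketched in a few lines of Python. This is only an illustration, not the Isis 2 implementation: the function name `plan_shards`, the floor division used to pick the number of shards, the rule that leftover members are spread one per shard, and the minimum group size of 6 are all assumptions.

```python
def plan_shards(group_size, target_shard_size, min_group_size):
    """Return a list of shard sizes, or None if the DHT would be disabled.

    Hypothetical helper: it does not correspond to any Isis 2 call.
    """
    if group_size < min_group_size:
        return None  # below the minimum group size: DHT temporarily disabled
    # Assumption: the number of shards is the floor of group size / target size.
    num_shards = max(1, group_size // target_shard_size)
    base, extra = divmod(group_size, num_shards)
    # Assumption: the first `extra` shards each absorb one leftover member.
    return [base + 1 if i < extra else base for i in range(num_shards)]


sizes = plan_shards(500, 3, min_group_size=6)
print(len(sizes), sizes.count(3), sizes.count(4))  # prints: 166 164 2
```

Under these assumptions, 500 members with a target of 3 give 166 shards, two of which hold 4 members instead of 3.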
4
DHT Basics
With a DHT storing data
A. Both Put and Get operations have costs roughly proportional to the time to do a remote procedure call: one RTT to each participant, issued in parallel.
B. Like other DHTs, the Isis 2 DHT has costs proportional to the log of the group size. This relates to needing to route requests in a binary search manner: halfway, then a quarter of the way, etc.
5
DHT Basics
With a DHT storing data
A. Both Put and Get operations have costs roughly proportional to the time to do a remote procedure call: one RTT to each participant, issued in parallel.
B. Like other DHTs, the Isis 2 DHT has costs proportional to the log of the group size. This relates to needing to route requests in a binary search manner: halfway, then a quarter of the way, etc.
Isis 2 offers a so-called “1-hop” DHT. No indirect routing occurs, and none of these log(N) delays arise in this approach. Indirect routing is perceived as a problem with many other DHTs, like Chord or Pastry, but doesn’t apply to the Isis 2 version.
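To make the difference between the two cost models concrete, here is a rough Python comparison. It is a back-of-the-envelope sketch: the function names are hypothetical, and charging exactly one round trip per routing hop is a simplifying assumption rather than a measured property of any real DHT.

```python
import math


def one_hop_cost(group_size, rtt_ms):
    # 1-hop model: requests go straight to the shard members in parallel,
    # so the cost is about one round trip regardless of group size.
    return rtt_ms


def routed_cost(group_size, rtt_ms):
    # Assumption: binary-search-style routing, one round trip per hop.
    hops = max(1, math.ceil(math.log2(group_size)))
    return hops * rtt_ms


for n in (500, 5000, 50000):
    print(n, one_hop_cost(n, 1.0), routed_cost(n, 1.0))
```

With a 1 ms round trip and 500 members, the routed model comes to 9 ms (since 2^9 = 512), while the 1-hop model stays at 1 ms.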
6
DHT Basics
Within a group, shard membership
A. Counts off by rank: 0…NS-1, 0…NS-1, etc. (where NS is the number of shards: the group size divided by the target shard size).
B. The first shard will be on the left and includes members 0..S-1, where S is the shard size. The second shard will include members S..(2*S-1).
C. Membership is pretty random: you need to hash the address of the member onto a ring, and then the shard is that member and the next S-1 along the edge.
7
DHT Basics
Within a group, shard membership
A. Counts off by rank: 0…NS-1, 0…NS-1, etc. (where NS is the number of shards: the group size divided by the target shard size).
B. The first shard will be on the left and includes members 0..S-1, where S is the shard size. The second shard will include members S..(2*S-1).
C. Membership is pretty random: you need to hash the address of the member onto a ring, and then the shard is that member and the next S-1 along the edge.
We use this scheme because it has low cost when a failure occurs.
We ruled this approach out because “churn” after a failure is too expensive: large numbers of members might need to be reinitialized.
This is how Chord and Pastry work, but not the way that the Isis 2 DHT works.
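The first two options can be written down directly, which makes the difference between them easy to see. The Python below is a sketch with hypothetical function names; the ring-hashing scheme of option C is left out, and the number of shards is taken as the floor of group size over shard size, which is an assumption.

```python
def shard_by_count_off(rank, num_shards):
    # Option A: ranks count off 0..NS-1, 0..NS-1, ...
    return rank % num_shards


def shard_by_block(rank, shard_size):
    # Option B: contiguous blocks 0..S-1, S..2S-1, ...
    return rank // shard_size


def members(assign, group_size, shard):
    # All ranks in the group that the given rule assigns to `shard`.
    return [r for r in range(group_size) if assign(r) == shard]


group_size, shard_size = 500, 3
num_shards = group_size // shard_size  # 166, assuming floor division

count_off_shard0 = members(lambda r: shard_by_count_off(r, num_shards), group_size, 0)
block_shard0 = members(lambda r: shard_by_block(r, shard_size), group_size, 0)
print(count_off_shard0)  # prints: [0, 166, 332, 498]
print(block_shard0)      # prints: [0, 1, 2]
```

With counting off, shard 0 ends up with four members because 500 is not a multiple of 166; with contiguous blocks, shard 0 is simply the first three ranks.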
8
When inserting an item…
The key is first hashed by computing the hashcode modulo the number of shards. This gives the desired shard number. Then…
A. A multicast is sent to just the shard members.
B. The value is multicast to the entire group and those members that are in the matching shard retain it. Others ignore the multicast.
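The key-to-shard step, followed by picking the members that would receive a shard-only multicast, might look like this in Python. It is a sketch under stated assumptions: CRC32 stands in for the key's hashcode, shard membership is assumed to count off by rank, and the names `shard_for_key` and `put_targets` are hypothetical.

```python
import zlib


def shard_for_key(key, num_shards):
    # Assumption: CRC32 stands in for the key's hashcode. Python's built-in
    # hash() is randomized per process for strings, so it is avoided here.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def put_targets(key, group_size, num_shards):
    # Assumption: shard membership counts off by rank (rank % num_shards).
    shard = shard_for_key(key, num_shards)
    return [r for r in range(group_size) if r % num_shards == shard]


targets = put_targets("user:42", group_size=500, num_shards=166)
print(targets)  # the ranks that would receive a shard-only multicast
```

Only three or four of the 500 members appear in `targets`; under option B, every member would receive the value and most would discard it.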
10
When inserting a list of items
The list of keys is first hashed by computing the hashcode item by item modulo the number of shards. This yields a list of shards.
A. Now each item is inserted in a separate parallel action, independently.
B. Now all items are inserted using a single multicast that goes to only the full set of shard members in the list.
11
When inserting a list of items
The list of keys is first hashed by computing the hashcode item by item modulo the number of shards. This yields a list of shards.
A. Now each item is inserted in a separate parallel action, independently.
B. Now all items are inserted using a single multicast that goes to only the full set of shard members in the list.
By using a single multicast, we guarantee all-or-nothing behavior. But this is a special protocol that only reaches the subset of members that are in the target shards. Moreover, it is an exceptionally fast protocol, like a parallel unicast.
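Planning a multi-item insert amounts to grouping the items by shard and taking the union of the members of those shards as the destination set of one multicast. The Python below sketches only that planning step, not the all-or-nothing delivery protocol itself. CRC32 as the hash, rank-based count-off for membership, and the name `plan_multi_put` are assumptions.

```python
import zlib
from collections import defaultdict


def shard_for_key(key, num_shards):
    # Assumption: CRC32 stands in for the key's hashcode.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def plan_multi_put(items, group_size, num_shards):
    # Group the key-value pairs by the shard each key hashes to.
    by_shard = defaultdict(dict)
    for key, value in items.items():
        by_shard[shard_for_key(key, num_shards)][key] = value
    # Assumption: membership counts off by rank, so a member belongs to a
    # target shard when rank % num_shards is one of the shard numbers.
    destinations = [r for r in range(group_size) if r % num_shards in by_shard]
    return dict(by_shard), destinations


items = {"a": 1, "b": 2, "c": 3}
by_shard, destinations = plan_multi_put(items, group_size=500, num_shards=166)
print(sorted(by_shard), destinations)
```

The single multicast would carry all of `items` to `destinations`, and each recipient would keep only the pairs belonging to its own shard.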
12
Consistency
To obtain strong consistency guarantees from the Isis 2 DHT
A. The insert and get should be done using the DHTOrderedPut and DHTOrderedGet methods.
B. An OrderedSend must be done targeting the entire group.
C. The Isis 2 system doesn’t offer consistency for DHT operations.
14
DHT Fault Tolerance
True or False. The Isis 2 DHT automatically masks faults, retrieving data redundantly in DHTGet and deduplicating to return just one key-value pair for any given key.
15
DHT Fault Tolerance
True. If DHTGet or DHTOrderedGet lacks a response from some participant, the Isis 2 DHT automatically retries. An exception is thrown if an entire shard fails, since that would prevent the system from getting even a single response for keys mapped to that shard. When using OrderedGet, the entire OrderedGet will be reissued if necessary, to ensure that the responses are collected along a consistent cut.
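The retry-and-deduplicate behavior can be imitated with a small Python simulation. This sketch does not use or mirror the Isis 2 API: replicas are modeled as plain callables, a missing response is modeled as a `TimeoutError`, and `dht_get`, `ShardFailedError`, and the retry limit are hypothetical. It also does not model the consistent-cut reissue described for OrderedGet.

```python
class ShardFailedError(Exception):
    """Raised when no member of the shard answers (hypothetical name)."""


def dht_get(key, replicas, max_attempts=3):
    """Query every replica, retry non-responders, return one deduplicated value."""
    pending = list(replicas)
    answers = []
    for _ in range(max_attempts):
        still_pending = []
        for replica in pending:
            try:
                value = replica(key)
            except TimeoutError:
                still_pending.append(replica)  # no response: retry next round
                continue
            if value not in answers:  # deduplicate identical responses
                answers.append(value)
        pending = still_pending
        if not pending:
            break
    if not answers:
        raise ShardFailedError(f"no member of the shard answered for {key!r}")
    return answers[0]


store = {"k": "v"}


def healthy(key):
    return store[key]


def dead(key):
    raise TimeoutError("no response")


print(dht_get("k", [healthy, dead, healthy]))  # prints: v
```

If every replica in the list is `dead`, the call raises `ShardFailedError`, matching the case where an entire shard has failed.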