1 High Availability, Scalable Storage, Dynamic Peer Networks: Pick Two Nov. 24, 2003 Byung-Gon Chun.



2 Contents
–Introduction
–Basic Model
–Availability and Redundancy
–Discussion
–High Availability, Scalable Storage, Dynamic Peer Networks: Pick Three

3 Introduction
Peer-to-peer lookup: robust and scalable with dynamic membership
→ Robust and scalable storage with dynamic membership?
Pick two
–Lookup is not the bottleneck
–(Upstream) bandwidth limitation
–Disk space grows faster than access bandwidth

4 Basic Model
Assumptions
–Simple redundancy maintenance mechanism (enter and exit)
–Static data placement strategy (f: RB-> N)
–Identical per-node space and bandwidth contributions
–Constant rate of entering and exiting
–Independence of exit events
–Constant steady-state number of nodes and total data size
–Maintenance bandwidth
Average case analysis

5 Basic Model
N: number of hosts
D: data
S: data + redundancy (S = kD)
α: entering rate
λ: exiting rate (α = λ)
T: lifetime (T = N/α)
B: bandwidth
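
A minimal sketch of how the model's quantities combine. It assumes (this step is not spelled out on the slide) that every entering host downloads one host's share of the stored data, S/N, and that every exiting host's share must be re-created elsewhere. Under that assumption the aggregate maintenance bandwidth works out to 2S/T. The function names and the one-day lifetime in the usage example are hypothetical.

```python
def lifetime(num_hosts, entering_rate):
    """Average membership lifetime T = N / (entering rate)."""
    return num_hosts / entering_rate


def aggregate_maintenance_bandwidth(stored_bytes, lifetime_seconds):
    """Assumed model: each join and each leave moves S/N bytes.

    Joins and leaves each occur at rate N/T, so total traffic is
    2 * (N/T) * (S/N) = 2 * S / T bytes per second.
    """
    return 2.0 * stored_bytes / lifetime_seconds


def per_node_maintenance_bandwidth(stored_bytes, num_hosts, lifetime_seconds):
    """Per-host share of the aggregate, in bytes per second."""
    total = aggregate_maintenance_bandwidth(stored_bytes, lifetime_seconds)
    return total / num_hosts


if __name__ == "__main__":
    D = 10**12        # 1 TB of unique data (decimal units assumed)
    k = 20            # redundancy factor quoted on the scaling slide
    N = 33_000        # host count from the Gnutella example
    T = 24 * 3600     # hypothetical one-day average lifetime, in seconds
    bits_per_second = 8 * per_node_maintenance_bandwidth(k * D, N, T)
    print(f"per-node maintenance bandwidth: {bits_per_second / 1e3:.1f} kbps")
```

Because T appears in the denominator, halving the average membership lifetime doubles the maintenance traffic for the same stored data, which is why short-lived membership is so costly in this model.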

6 Understanding the Scaling
–Short membership: an enormous number of nodes is needed to scale
–How fast can the storage of such systems grow? (k = 20)

7 Availability & Redundancy
Membership timeout: distinguishes true departures from temporary downtime by delaying the response to failures
Counting offline hosts as members
–Lifetime is longer
–Hosts serve only a fraction of the time (a: availability)
–More redundancy is needed
–Effective bandwidth is reduced
Redundancy: replication vs. erasure coding
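
The trade-off between replication and erasure coding can be illustrated with a short sketch. It assumes each host is online independently with probability a (an assumption made for this example, not a claim from the slides). With replication, data is available if at least one replica's host is online. With an (m, n) erasure code, it is available if at least m of the n fragment hosts are online. The function names and the fragment count m are hypothetical.

```python
import math


def replication_availability(a, replicas):
    """Probability that at least one of `replicas` hosts is online."""
    return 1.0 - (1.0 - a) ** replicas


def replicas_needed(a, target):
    """Smallest replica count whose availability meets `target`."""
    count = 1
    while replication_availability(a, count) < target:
        count += 1
    return count


def coding_availability(a, m, n):
    """Probability that at least m of n fragment hosts are online."""
    return sum(
        math.comb(n, i) * a**i * (1.0 - a) ** (n - i)
        for i in range(m, n + 1)
    )


def fragments_needed(a, m, target):
    """Smallest n such that an (m, n) code meets `target`."""
    n = m
    while coding_availability(a, m, n) < target:
        n += 1
    return n


if __name__ == "__main__":
    a = 0.5              # hypothetical host availability
    target = 0.999999    # six nines
    m = 8                # hypothetical number of original fragments
    k_rep = replicas_needed(a, target)
    n = fragments_needed(a, m, target)
    print(f"replication: {k_rep} copies (redundancy factor {k_rep})")
    print(f"coding: {n} fragments of size 1/{m} (redundancy factor {n / m:.2f})")
```

The redundancy factor is the replica count for replication and n/m for coding, so the two printed factors can be compared directly for a given availability and target.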

8 Model

9 Availability & Redundancy
Gnutella network of 33,000 hosts, 1 TB of data, six-nines data availability
30-fold savings from membership timeout
Additional 8-fold savings from erasure coding
–75 Kbps maintenance bandwidth per node
–500 MB of disk contributed per host
5,000 of the 33,000 hosts usually available
–Aggregate bandwidth of 500 Mbps
–5 dedicated, reliable PCs with 250 GB drives and 50 Mbps connections, up 99% of the time
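
The totals behind this comparison can be worked out with a few lines of arithmetic. The sketch below only multiplies the per-host figures quoted on the slide. It assumes decimal units (1 TB = 10^12 bytes, 1 Mbps = 10^6 bits per second), and the helper name is hypothetical.

```python
MB, GB, TB = 10**6, 10**9, 10**12   # decimal byte units (assumed)
KBPS, MBPS = 10**3, 10**6           # decimal bit rates (assumed)


def fleet_totals(machines, disk_bytes_each, link_bps_each):
    """Total disk (bytes) and total bandwidth (bits/s) across a fleet."""
    return machines * disk_bytes_each, machines * link_bps_each


if __name__ == "__main__":
    # Peer-to-peer side: 33,000 hosts, 500 MB and 75 Kbps each.
    p2p_disk, p2p_maintenance = fleet_totals(33_000, 500 * MB, 75 * KBPS)
    # Dedicated side: 5 PCs, 250 GB and 50 Mbps each.
    pc_disk, pc_link = fleet_totals(5, 250 * GB, 50 * MBPS)

    print(f"P2P disk contributed:      {p2p_disk / TB:.2f} TB for 1 TB of data")
    print(f"P2P maintenance traffic:   {p2p_maintenance / MBPS:.0f} Mbps in total")
    print(f"Dedicated PCs, total disk: {pc_disk / TB:.2f} TB")
    print(f"Dedicated PCs, total link: {pc_link / MBPS:.0f} Mbps")
```

Read this way, the peer-to-peer system stores many times the unique data to reach its availability target, while a handful of reliable machines needs only slightly more disk than the data itself.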

10 Membership Timeout

11 Replication vs. Coding

12 Admission Control, Load-Shifting
Do not admit highly volatile nodes; shift responsibility to non-volatile hosts
The 5% most available hosts account for 40% of service years
–30 Kbps per node per unique TB using coding
–1000-fold savings using delayed response, coding, and admission control
Still bounded by bandwidth
–100 Kbps maintenance bandwidth, 3 GB disk space
–10 universities with 1/3 OC3
Two million cable modem users at 40% availability ~ 2000 universities with ½ OC3

13 Hardware Trends
Participation must become more stable for hosts to contribute a meaningful fraction of their disks

14 Incentive Issues
Stable membership is necessary. How can it be incentivized?
–Added value of service guarantees
–Allow client bandwidth usage only in proportion to contributed bandwidth
–Prioritizing traffic

15 Discussion
High availability, scale, dynamic membership: high service bandwidth
→ Current DHT research trajectory???
Static membership: small lookup-state optimizations do more harm than good (another approach: one-hop lookup; another approach: distributed directory)
Dynamic membership: why leverage many flaky nodes to serve data when a few reliable ones could serve it?

16 Discussion
Why worry about lookup guarantees if storage guarantees are inappropriate?
When anonymity or related security properties are the high priority, why not plan to include the defense from the beginning?

17 Availability [Bhagwan, Savage, and Voelker 2003]

18 Pick Three
Distributed directory (DD)
–Uses a level of indirection
–Controls the data placement
–Exploits heterogeneity (availability, lifetime, and bandwidth)
Pick Three!!!

19 Discussion?

