Published by Edward Johnston. Modified over 9 years ago.
1
Distributed Storage Allocation Problems
Derek Leong, Alexandros G. Dimakis, Tracey Ho
California Institute of Technology
NetCod 2009, 2009-06-16
2
Motivation
3
0.1
4
Figure: allocations A, B, and C
5
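Success probability of allocation A:
$$
\begin{aligned}
  & 0.9^0 \times 0.1^5 \times 0 \quad \text{(successful 0-subsets)} \\
+ {} & 0.9^1 \times 0.1^4 \times 2 \quad \text{(successful 1-subsets)} \\
+ {} & 0.9^2 \times 0.1^3 \times 7 \quad \text{(successful 2-subsets)} \\
+ {} & 0.9^3 \times 0.1^2 \times 9 \quad \text{(successful 3-subsets)} \\
+ {} & 0.9^4 \times 0.1^1 \times 5 \quad \text{(successful 4-subsets)} \\
+ {} & 0.9^5 \times 0.1^0 \times 1 \quad \text{(successful 5-subsets)} \\
= {} & 0.99
\end{aligned}
$$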
6
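Motivation
Success probability of allocation B:
$$
\begin{aligned}
  & 0.9^0 \times 0.1^5 \times 0 \quad \text{(successful 0-subsets)} \\
+ {} & 0.9^1 \times 0.1^4 \times 0 \quad \text{(successful 1-subsets)} \\
+ {} & 0.9^2 \times 0.1^3 \times 0 \quad \text{(successful 2-subsets)} \\
+ {} & 0.9^3 \times 0.1^2 \times 10 \quad \text{(successful 3-subsets)} \\
+ {} & 0.9^4 \times 0.1^1 \times 5 \quad \text{(successful 4-subsets)} \\
+ {} & 0.9^5 \times 0.1^0 \times 1 \quad \text{(successful 5-subsets)} \\
= {} & 0.99144
\end{aligned}
$$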
7
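Motivation
Success probability of allocation C:
$$
\begin{aligned}
  & 0.9^0 \times 0.1^5 \times 0 \quad \text{(successful 0-subsets)} \\
+ {} & 0.9^1 \times 0.1^4 \times 0 \quad \text{(successful 1-subsets)} \\
+ {} & 0.9^2 \times 0.1^3 \times 6 \quad \text{(successful 2-subsets)} \\
+ {} & 0.9^3 \times 0.1^2 \times 10 \quad \text{(successful 3-subsets)} \\
+ {} & 0.9^4 \times 0.1^1 \times 5 \quad \text{(successful 4-subsets)} \\
+ {} & 0.9^5 \times 0.1^0 \times 1 \quad \text{(successful 5-subsets)} \\
= {} & 0.9963
\end{aligned}
$$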
8
Motivation
Success probabilities: A = 0.99, B = 0.99144, C = 0.9963
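The subset-counting arithmetic on the Motivation slides can be reproduced by brute force. The sketch below is only illustrative: the three allocations are hypothetical choices (the slides show them only as figures) picked because they use a total budget of 12/5 over 5 nodes and reproduce the successful-subset counts quoted above. The function names are also assumptions.

```python
from itertools import combinations

def successful_subset_counts(x, tol=1e-9):
    """counts[k] = number of k-subsets of nodes whose stored data totals at least 1."""
    n = len(x)
    counts = [0] * (n + 1)
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if sum(x[i] for i in subset) >= 1 - tol:
                counts[k] += 1
    return counts

def success_probability(x, p):
    """Each node is accessed independently with probability p."""
    n = len(x)
    counts = successful_subset_counts(x)
    return sum(c * p**k * (1 - p)**(n - k) for k, c in enumerate(counts))

# Hypothetical allocations (assumed, not taken from the slides' figures).
ALLOCATIONS = {
    "A": [1.2, 1.2, 0, 0, 0],
    "B": [0.48] * 5,
    "C": [0.6] * 4 + [0],
}

if __name__ == "__main__":
    for name, x in ALLOCATIONS.items():
        print(name, successful_subset_counts(x), round(success_probability(x, 0.9), 5))
```

Enumeration costs 2^n subset checks, so this is only practical for small n such as the 5-node example.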
9
Figure: access model (label: 0.1)
10
Problem Description
How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget?
Storage Allocation
Access by the Data Collector
Objective
11
Problem Description
Storage Allocation
Source s has a data object of unit size.
It can use n storage nodes to store amounts $x_1, x_2, \dots, x_n$ of data, but faces an aggregate storage budget T, i.e. $\sum_{i=1}^{n} x_i \le T$.
12
Problem Description
Access by the Data Collector
Data collector t attempts to recover the data object by accessing a subset r of storage nodes.
It succeeds when the total amount of data accessed is at least the size of the data object, i.e. $\sum_{i \in r} x_i \ge 1$.
13
Problem Description
Objective
We seek the optimal allocation that maximizes the probability of successful recovery.
14
Problem Description
Difficulty
The problem is nonconvex.
There is a large space of possible symmetric and nonsymmetric allocations (an allocation is symmetric if all its nonzero elements are equal, and nonsymmetric otherwise).
15
[1] Deterministic Allocation with Probabilistic Access
The data collector accesses each storage node independently with constant probability p.
16
[1] Deterministic Allocation with Probabilistic Access
Symmetric allocations can be suboptimal. †
Given n = 5 storage nodes, budget T = 12/5, and p = 0.9, the nonsymmetric allocation performs better than the optimal symmetric allocation.
Finding the optimal symmetric allocation is also nontrivial.
† Originally from a discussion among R. Karp, R. Kleinberg, C. Papadimitriou, E. Friedman, and others at UC Berkeley.
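To illustrate why the symmetric case already needs a search, here is a minimal sketch that tries every number m of nonempty nodes. It assumes a symmetric allocation spends the whole budget, storing T/m on each of m nodes, so recovery needs at least ceil(m/T) of those m nodes to be accessed. The function names are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb

def symmetric_success(T, p, m):
    """Success probability when m nodes each store T/m and each is accessed w.p. p."""
    need = ceil(Fraction(m) / Fraction(T))  # smallest k with k * (T/m) >= 1
    return sum(comb(m, k) * p**k * (1 - p)**(m - k) for k in range(need, m + 1))

def best_symmetric(n, T, p):
    """Return (m, probability) for the best number of nonempty nodes, 1 <= m <= n."""
    return max(((m, symmetric_success(T, p, m)) for m in range(1, n + 1)),
               key=lambda pair: pair[1])

if __name__ == "__main__":
    T = Fraction(12, 5)
    for m in range(1, 6):
        print(m, round(symmetric_success(T, 0.9, m), 5))
    print("best:", best_symmetric(5, T, 0.9))
```

The success probability is not monotone in m, which is why neither "concentrate" nor "spread" can be assumed in advance.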
17
[2] Deterministic Allocation with Fixed Access
The data collector accesses an r-subset of storage nodes, selected uniformly at random from the collection of all possible r-subsets, where r < n is a constant.
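Under this access model the success probability is simply the fraction of r-subsets that hold at least one unit of data. A brute-force sketch (the function name and the sample allocations are assumptions for illustration):

```python
from itertools import combinations
from math import comb

def fixed_access_success(x, r, tol=1e-9):
    """Fraction of r-subsets of nodes whose stored data totals at least 1."""
    n = len(x)
    good = sum(
        1
        for subset in combinations(range(n), r)
        if sum(x[i] for i in subset) >= 1 - tol
    )
    return good / comb(n, r)

if __name__ == "__main__":
    # Hypothetical allocations for n = 6 nodes and r = 2.
    print(fixed_access_success([0.5] * 6, 2))           # every pair holds 1 unit
    print(fixed_access_success([1, 1, 0, 0, 0, 0], 2))  # pairs touching node 0 or 1
```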
18
[2] Deterministic Allocation with Fixed Access
Equivalently, we can seek the allocation that minimizes the budget T, among all allocations that achieve a given probability of successful recovery.
19
[2] Deterministic Allocation with Fixed Access
Example: (n, r) = (6, 2)
Question: For any budget T, is there always a symmetric allocation that produces the maximum success probability?
20
[2] Deterministic Allocation with Fixed Access
Question: What is the optimal symmetric allocation?
For most choices of (n, r, T), the optimal allocation either concentrates the budget over a minimal number of nodes, or spreads it out maximally.
An example of an exception is (n, r, T) = (15, 3, 4.6), for which the optimal number of nodes to use, 9, is neither of the extremes.
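The (15, 3, 4.6) exception can be checked with a short search. This sketch assumes a symmetric allocation puts T/m on each of m nodes, so the number of nonempty nodes among the r accessed follows a hypergeometric distribution and recovery needs at least ceil(m/T) of them. The function names are hypothetical, and exact fractions avoid rounding issues at the threshold.

```python
from fractions import Fraction
from math import ceil, comb

def symmetric_fixed_access(n, r, T, m):
    """Exact success probability with m nonempty nodes (T/m each), r nodes accessed."""
    need = ceil(Fraction(m) / Fraction(T))  # nonempty nodes required among the r accessed
    good = sum(comb(m, k) * comb(n - m, r - k) for k in range(need, min(m, r) + 1))
    return Fraction(good, comb(n, r))

def best_symmetric_fixed_access(n, r, T):
    """Return (m, probability) maximizing the success probability over 1 <= m <= n."""
    return max(((m, symmetric_fixed_access(n, r, T, m)) for m in range(1, n + 1)),
               key=lambda pair: pair[1])

if __name__ == "__main__":
    T = Fraction(23, 5)  # 4.6
    for m in range(1, 16):
        print(m, float(symmetric_fixed_access(15, 3, T, m)))
    print("best:", best_symmetric_fixed_access(15, 3, T))
```

For this instance the candidates at the two extremes are m = 4 (the most nodes that can each hold a full copy) and m = 13 (the most nodes for which three of them still suffice), and m = 9 beats both.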
21
[2] Deterministic Allocation with Fixed Access
For probability-1 recovery, the problem reduces to a simple LP.
Result 1: If we require all possible r-subsets to allow successful recovery, then we need a minimum budget of $T = n/r$, which corresponds to the allocation $x_i = 1/r$ for all $i$, i.e. it is optimal to spread the budget maximally.
We can also bound the success probability above which this allocation is optimal.
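One way to see where the minimum budget in Result 1 comes from is the following sketch of the LP and a counting argument. It is offered as an illustration and is not necessarily the authors' own proof.

$$
\begin{aligned}
\text{minimize} \quad & \sum_{i=1}^{n} x_i \\
\text{subject to} \quad & \sum_{i \in S} x_i \ge 1 \quad \text{for every } S \subseteq \{1, \dots, n\} \text{ with } |S| = r, \\
& x_i \ge 0 \quad \text{for all } i.
\end{aligned}
$$

Summing the constraint over all $\binom{n}{r}$ subsets, and noting that each $x_i$ appears in exactly $\binom{n-1}{r-1}$ of them, gives

$$
\binom{n-1}{r-1} \sum_{i=1}^{n} x_i \ge \binom{n}{r}
\quad\Longrightarrow\quad
\sum_{i=1}^{n} x_i \ge \frac{\binom{n}{r}}{\binom{n-1}{r-1}} = \frac{n}{r}.
$$

The allocation $x_i = 1/r$ for all $i$ satisfies every constraint with equality and uses exactly $n/r$, so it attains the bound.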
22
[3] Symmetric Probabilistic Allocation with Fixed Access
Each storage node is used independently with constant probability s/n to store the same amount of data 1/ℓ, and the total storage used must be at most the budget T in expectation.
23
[3] Symmetric Probabilistic Allocation with Fixed Access
The probability of successful recovery can be written as (expression missing in source), where "Bin(n, p)" denotes the binomial random variable with n trials and success probability p.
Reparameterizing in terms of the budget T gives the success probability (expression missing in source).
Each nonempty node stores 1/ℓ amount of data.
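The sketch below computes a binomial-tail success probability under one plausible reading of the model, stated here as an assumption rather than as the slide's own formula: each of the r accessed nodes is nonempty independently with probability q, recovery needs at least ℓ nonempty nodes, and q = min(ℓT/n, 1) makes the expected storage n · q · (1/ℓ) equal to the budget T (capped when q reaches 1). The function names are hypothetical.

```python
from math import comb

def binomial_tail(trials, q, k):
    """P[Bin(trials, q) >= k]."""
    return sum(comb(trials, j) * q**j * (1 - q)**(trials - j)
               for j in range(k, trials + 1))

def success_probability(n, r, T, ell):
    """Assumed model: each node is nonempty w.p. q = min(ell*T/n, 1) and stores 1/ell."""
    q = min(ell * T / n, 1.0)
    return binomial_tail(r, q, ell)

if __name__ == "__main__":
    n, r = 100, 5  # illustrative values, not from the slides
    for T in (5, 15, 20):
        print(T, [round(success_probability(n, r, T, ell), 4) for ell in range(1, r + 1)])
```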
24
[3] Symmetric Probabilistic Allocation with Fixed Access
Result 2: For any r ≥ 2, and at any budget T large enough to support a success probability P(r, T, ℓ) > 0.9 for some ℓ, the choice of ℓ = r is optimal, i.e. it is best to spread the budget maximally.
Each nonempty node stores 1/ℓ amount of data.
25
[3] Symmetric Probabilistic Allocation with Fixed Access
As we increase the budget T, we observe a sharp change in the optimal allocation.
For small budgets and therefore low success probabilities, it is optimal to store the data object in its entirety (ℓ = 1) and hope the data collector accesses at least one of the nonempty nodes.
For large budgets and therefore high success probabilities, it is optimal to store only 1/r amount of data in each node used (ℓ = r) and hope the data collector accesses r of them.
Figure: r = 5
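The change in the optimal ℓ can be observed numerically by sweeping the budget. This is a sketch under an assumed form of the success probability, P[Bin(r, q) ≥ ℓ] with q = min(ℓT/n, 1), which is a reconstruction from the model description and not a formula quoted from the slides. The value n = 100 and the function names are illustrative assumptions.

```python
from math import comb

def success_probability(n, r, T, ell):
    """Assumed model: P[Bin(r, q) >= ell] with q = min(ell*T/n, 1)."""
    q = min(ell * T / n, 1.0)
    return sum(comb(r, j) * q**j * (1 - q)**(r - j) for j in range(ell, r + 1))

def best_ell(n, r, T):
    """Value of ell in 1..r that maximizes the assumed success probability."""
    return max(range(1, r + 1), key=lambda ell: success_probability(n, r, T, ell))

if __name__ == "__main__":
    n, r = 100, 5
    for T in range(1, 21):
        ell = best_ell(n, r, T)
        print(T, ell, round(success_probability(n, r, T, ell), 4))
```

With these illustrative parameters the best ℓ stays at 1 for small budgets and switches to r once the budget is close to n/r, without passing through intermediate values at the budgets tested below.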
26
[3] Symmetric Probabilistic Allocation with Fixed Access
We conjecture that for any r and T, the optimal choice of ℓ that maximizes the success probability P(r, T, ℓ) is either ℓ = 1 or ℓ = r.
Each nonempty node stores 1/ℓ amount of data.
Figure: r = 5
27
[3] Symmetric Probabilistic Allocation with Fixed Access
We conjecture that for any r and T, the optimal choice of ℓ that maximizes the success probability P(r, T, ℓ) is either ℓ = 1 or ℓ = r.
Each nonempty node stores 1/ℓ amount of data.
Figure: r = 5 (labels: increasing budget; store more per node; store less per node)
28
Summary & Future Work
[1] Deterministic Allocation with Probabilistic Access
  Suboptimality of symmetric allocations
[2] Deterministic Allocation with Fixed Access
  Optimal allocation for high-probability recovery
  Extreme point solutions not necessarily optimal for symmetric allocations
  Is there always a symmetric optimal allocation?
[3] Symmetric Probabilistic Allocation with Fixed Access
  Optimal allocation in high-probability regime
  Is there a phase transition in optimal allocation with increasing budget?