
Distributed Storage Allocation Problems

Derek Leong, Alexandros G. Dimakis, Tracey Ho. California Institute of Technology. NetCod 2009, 2009-06-16.


Presentation Transcript


  1. Distributed Storage Allocation Problems Derek Leong, Alexandros G. Dimakis, Tracey Ho California Institute of Technology NetCod 2009 2009-06-16

  2. Motivation

  3. Motivation [Figure: five storage nodes with unknown allocations, each marked "?"; labels 0.1 and 2; question posed: Σ ≥ 1?]

  4. Motivation • Allocation A: (1, 1, 0, 0, 0) • Allocation B: (2/5, 2/5, 2/5, 2/5, 2/5) • Allocation C: (1/2, 1/2, 1/2, 1/2, 0)

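  5. Motivation Allocation A: (1, 1, 0, 0, 0). Success probability
     = $0.9^0 \times 0.1^5 \times 0$ (successful 0-subsets)
     + $0.9^1 \times 0.1^4 \times 2$ (successful 1-subsets)
     + $0.9^2 \times 0.1^3 \times 7$ (successful 2-subsets)
     + $0.9^3 \times 0.1^2 \times 9$ (successful 3-subsets)
     + $0.9^4 \times 0.1^1 \times 5$ (successful 4-subsets)
     + $0.9^5 \times 0.1^0 \times 1$ (successful 5-subsets)
     = 0.99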

  6. Motivation Allocation B: (2/5, 2/5, 2/5, 2/5, 2/5). Success probability
     = $0.9^0 \times 0.1^5 \times 0$ (successful 0-subsets)
     + $0.9^1 \times 0.1^4 \times 0$ (successful 1-subsets)
     + $0.9^2 \times 0.1^3 \times 0$ (successful 2-subsets)
     + $0.9^3 \times 0.1^2 \times 10$ (successful 3-subsets)
     + $0.9^4 \times 0.1^1 \times 5$ (successful 4-subsets)
     + $0.9^5 \times 0.1^0 \times 1$ (successful 5-subsets)
     = 0.99144

  7. Motivation Allocation C: (1/2, 1/2, 1/2, 1/2, 0). Success probability
     = $0.9^0 \times 0.1^5 \times 0$ (successful 0-subsets)
     + $0.9^1 \times 0.1^4 \times 0$ (successful 1-subsets)
     + $0.9^2 \times 0.1^3 \times 6$ (successful 2-subsets)
     + $0.9^3 \times 0.1^2 \times 10$ (successful 3-subsets)
     + $0.9^4 \times 0.1^1 \times 5$ (successful 4-subsets)
     + $0.9^5 \times 0.1^0 \times 1$ (successful 5-subsets)
     = 0.9963

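  8. Motivation • Allocation A: (1, 1, 0, 0, 0), success probability 0.99 • Allocation B: (2/5, 2/5, 2/5, 2/5, 2/5), success probability 0.99144 • Allocation C: (1/2, 1/2, 1/2, 1/2, 0), success probability 0.9963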
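The three success probabilities on the motivation slides can be reproduced by brute force. The following is a minimal sketch, assuming that the label 0.1 on the slides is the per-node failure probability (so each node is accessed independently with probability 0.9) and that recovery succeeds when the accessed amounts sum to at least 1. The function name `success_probability` is illustrative and not taken from the presentation.

```python
from fractions import Fraction
from itertools import combinations


def success_probability(allocation, p):
    """Probability that independently accessed nodes together hold >= 1 unit.

    Mirrors the slide formula: for each subset size k, count the k-subsets
    whose stored amounts sum to at least 1, then weight by p^k (1-p)^(n-k).
    """
    n = len(allocation)
    total = 0.0
    for k in range(n + 1):
        successful = sum(
            1 for subset in combinations(allocation, k) if sum(subset) >= 1
        )
        total += successful * p**k * (1 - p) ** (n - k)
    return total


# Exact fractions avoid rounding trouble when testing "sum >= 1".
allocations = {
    "A": (Fraction(1), Fraction(1), Fraction(0), Fraction(0), Fraction(0)),
    "B": (Fraction(2, 5),) * 5,
    "C": (Fraction(1, 2),) * 4 + (Fraction(0),),
}

results = {name: success_probability(x, 0.9) for name, x in allocations.items()}
for name, prob in results.items():
    print(name, round(prob, 5))
```

All three allocations use the same total storage of 2, yet the sketch ranks them C > B > A, matching the values 0.9963, 0.99144, and 0.99 reported on the slides.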

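  9. Motivation [Figure: the five-node picture from slide 3 (labels 0.1 and 2, nodes marked "?", question Σ ≥ 1?), now annotated with "allocation model" and "access model"]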

  10. Problem Description How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget? • Storage Allocation • Access by the Data Collector • Objective

  11. Problem Description How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget? • Storage Allocation: source s has a data object of unit size; it can use n storage nodes to store amounts $x_1, x_2, \ldots, x_n$ of data, but faces an aggregate storage budget T, i.e. $\sum_{i=1}^{n} x_i \le T$ • Access by the Data Collector • Objective

  12. Problem Description How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget? • Storage Allocation • Access by the Data Collector: data collector t attempts to recover the data object by accessing a subset r of storage nodes; it succeeds when the total amount of data accessed is at least the size of the data object, i.e. $\sum_{i \in \mathbf{r}} x_i \ge 1$ • Objective

  13. Problem Description How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget? • Storage Allocation • Access by the Data Collector • Objective: we seek the optimal allocation that maximizes the probability of successful recovery

  14. Problem Description How do we use storage nodes to store a data object reliably, subject to an aggregate storage budget? • Difficulty: the problem is nonconvex, and there is a large space of possible symmetric and nonsymmetric allocations (an allocation is symmetric if all its nonzero elements are equal, and nonsymmetric otherwise)

  15. [1] Deterministic Allocation with Probabilistic Access Data collector accesses each storage node independently with constant probability p

  16. [1] Deterministic Allocation with Probabilistic Access • Symmetric allocations can be suboptimal • Given n = 5 storage nodes, budget T = 12/5, and p = 0.9, the nonsymmetric allocation performs better than the optimal symmetric allocation† • Finding the optimal symmetric allocation is also nontrivial †Originally from a discussion among R. Karp, R. Kleinberg, C. Papadimitriou, E. Friedman, and others at UC Berkeley

  17. [2] Deterministic Allocation with Fixed Access Data collector accesses an r-subset of storage nodes, selected uniformly at random from the collection of all possible r-subsets, where r < n is a constant
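Under this fixed-access model the success probability is simply the fraction of r-subsets that hold enough data. The sketch below assumes the same recovery condition as in the problem description (accessed amounts sum to at least 1); the function name `fixed_access_success` and the choice r = 2 are illustrative assumptions, and the allocations are borrowed from the motivation slides purely as test inputs.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def fixed_access_success(allocation, r):
    """Fraction of r-subsets of nodes whose stored amounts sum to at least 1."""
    successful = sum(
        1 for subset in combinations(allocation, r) if sum(subset) >= 1
    )
    return Fraction(successful, comb(len(allocation), r))


# Allocations A, B, C from the motivation slides, evaluated with r = 2
# (r = 2 is a hypothetical choice made for this example only).
alloc_a = (Fraction(1), Fraction(1), Fraction(0), Fraction(0), Fraction(0))
alloc_b = (Fraction(2, 5),) * 5
alloc_c = (Fraction(1, 2),) * 4 + (Fraction(0),)

for name, x in (("A", alloc_a), ("B", alloc_b), ("C", alloc_c)):
    print(name, fixed_access_success(x, 2))
```

With only two nodes accessed, the ranking differs from the probabilistic-access case: concentrating the budget (A) now beats both of the more spread-out allocations, which illustrates why the access model matters.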

  18. [2] Deterministic Allocation with Fixed Access • Equivalently, we can seek the allocation that minimizes the budget T, among all allocations that achieve a given probability of successful recovery

  19. [2] Deterministic Allocation with Fixed Access • Example: (n, r) = (6, 2) • Question: For any budget T, is there always a symmetric allocation that produces the maximum success probability?

  20. [2] Deterministic Allocation with Fixed Access • Question: What is the optimal symmetric allocation? • For most choices of (n, r, T), the optimal symmetric allocation either concentrates the budget over a minimal number of nodes, or spreads it out maximally • An example of an exception is (n, r, T) = (15, 3, 4.6), for which the optimal number of nodes to use, 9, is neither of the extremes
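The exception quoted on this slide can be checked numerically. The following is a sketch under stated assumptions: a symmetric allocation splits the budget T evenly over m of the n nodes, the data collector draws a uniformly random r-subset, and recovery needs at least ceil(m/T) nonempty nodes among those accessed (a hypergeometric count). The names `symmetric_success` and `best_m` are hypothetical.

```python
from fractions import Fraction
from math import ceil, comb


def symmetric_success(n, r, T, m):
    """Success probability when budget T is split evenly over m of n nodes
    and a uniformly random r-subset of nodes is accessed."""
    # Each nonempty node stores T/m, so at least ceil(m/T) of them are needed.
    need = ceil(m / T)
    hits = sum(
        comb(m, k) * comb(n - m, r - k) for k in range(need, min(m, r) + 1)
    )
    return Fraction(hits, comb(n, r))


n, r, T = 15, 3, Fraction(23, 5)  # T = 4.6 kept exact to avoid rounding issues
probs = {m: symmetric_success(n, r, T, m) for m in range(1, n + 1)}
best_m = max(probs, key=probs.get)
print(best_m, probs[best_m], float(probs[best_m]))
```

For these parameters the search selects m = 9, in agreement with the slide: using 4 nodes (minimal spreading that still lets one node suffice) or 13 nodes (the widest spreading that can still succeed with r = 3) both give a lower success probability.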

  21. [2] Deterministic Allocation with Fixed Access • For Probability-1 Recovery, the problem reduces to a simple LP • Result 1: If we require all possible r-subsets to allow successful recovery, then we need a minimum budget of $T = n/r$, which corresponds to the allocation $x_i = 1/r$ for all i, i.e. it is optimal to spread the budget maximally • We can also bound the success probability above which this allocation is optimal
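One way to see why the minimum budget for probability-1 recovery is n/r is an averaging argument over all r-subsets. This is a sketch of a standard argument and not necessarily the derivation used in the presentation; it assumes the recovery constraint $\sum_{i \in \mathbf{r}} x_i \ge 1$ must hold for every r-subset $\mathbf{r}$.

```latex
% Sum the recovery constraint over all \binom{n}{r} subsets of size r.
% Each x_i appears in exactly \binom{n-1}{r-1} of those subsets.
\sum_{\mathbf{r}:\,|\mathbf{r}| = r} \; \sum_{i \in \mathbf{r}} x_i \;\ge\; \binom{n}{r}
\quad\Longrightarrow\quad
\binom{n-1}{r-1} \sum_{i=1}^{n} x_i \;\ge\; \binom{n}{r}
\quad\Longrightarrow\quad
T \;\ge\; \sum_{i=1}^{n} x_i \;\ge\; \frac{\binom{n}{r}}{\binom{n-1}{r-1}} \;=\; \frac{n}{r}.
```

The bound is met with equality by storing 1/r in every node, since any r nodes then hold exactly one unit of data.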

  22. [3] Symmetric Probabilistic Allocation with Fixed Access Each storage node is used independently with constant probability s/n to store the same amount of data 1/ℓ, and the total storage used must be at most budget T in expectation

  23. [3] Symmetric Probabilistic Allocation with Fixed Access • Probability of successful recovery can be written as [equation not preserved in the transcript], where "Bin(n, p)" denotes the binomial random variable with n trials and success probability p • Reparameterizing in terms of budget T gives the success probability [equation not preserved in the transcript] [Figure label: each nonempty node stores 1/ℓ amount of data]
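The equations on this slide did not survive in the transcript, but a plausible form follows from the model stated on the previous slide. The following is a hedged reconstruction, not the authors' own expression: it assumes the data collector accesses a fixed set of r nodes, each of which is nonempty independently with probability s/n, and the slide's normalization of T may differ.

```latex
% K = number of nonempty nodes among the r accessed (assumed reading of the model).
K \sim \mathrm{Bin}\!\left(r, \tfrac{s}{n}\right), \qquad
\Pr[\text{success}]
  = \Pr\!\left[\tfrac{K}{\ell} \ge 1\right]
  = \Pr\!\left[\mathrm{Bin}\!\left(r, \tfrac{s}{n}\right) \ge \ell\right].
% Expected storage is n \cdot (s/n) \cdot (1/\ell) = s/\ell \le T,
% so spending the full budget gives s = \ell T:
\Pr[\text{success}]
  = \Pr\!\left[\mathrm{Bin}\!\left(r, \tfrac{\ell T}{n}\right) \ge \ell\right]
  \quad \text{(valid while } \ell T / n \le 1\text{)}.
```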

  24. [3] Symmetric Probabilistic Allocation with Fixed Access • Result 2: For any r ≥ 2, and at any budget T large enough to support a success probability P(r, T, ℓ) > 0.9 for some ℓ, the choice of ℓ = r is optimal, i.e. it is best to spread the budget maximally (each nonempty node stores 1/ℓ amount of data)

  25. [3] Symmetric Probabilistic Allocation with Fixed Access • As we increase the budget T, we observe a sharp change in the optimal allocation • For small budgets and therefore low success probabilities, it is optimal to store the data object in its entirety (ℓ = 1) and hope the data collector accesses at least one of the nonempty nodes • For large budgets and therefore high success probabilities, it is optimal to store only 1/r amount of data in each node used (ℓ = r) and hope the data collector accesses r of them [Plot shown for r = 5]
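The change in the optimal ℓ as the budget grows can be explored numerically. This is a rough sketch under assumptions that are not confirmed by the transcript: the success probability is taken to be P[Bin(r, ℓT/n) ≥ ℓ] (each accessed node is nonempty with probability ℓT/n, capped at 1), and n = 100 is a hypothetical value chosen only for illustration. The function names are illustrative as well.

```python
from math import comb


def binomial_tail(trials, q, k):
    """P[Bin(trials, q) >= k]."""
    return sum(
        comb(trials, j) * q**j * (1 - q) ** (trials - j)
        for j in range(k, trials + 1)
    )


def probabilistic_success(n, r, T, ell):
    """Assumed model: each accessed node is nonempty with probability ell*T/n
    and stores 1/ell, so recovery needs at least ell nonempty nodes among r."""
    q = min(ell * T / n, 1.0)
    return binomial_tail(r, q, ell)


def best_ell(n, r, T):
    """Value of ell in 1..r that maximizes the assumed success probability."""
    return max(range(1, r + 1), key=lambda ell: probabilistic_success(n, r, T, ell))


n, r = 100, 5  # n = 100 is a hypothetical choice; r = 5 matches the slide's plot
for T in (1, 5, 10, 15, 19):
    print(T, best_ell(n, r, T))
```

Under these assumptions a small budget favors ℓ = 1 and a large budget favors ℓ = r, which is consistent with the behavior described on the slide; the sketch makes no claim about where the switch happens in the authors' own parameterization.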

  26. [3] Symmetric Probabilistic Allocation with Fixed Access • We conjecture that for any r and T, the optimal choice of ℓ that maximizes success probability P(r, T, ℓ) is either ℓ = 1 or ℓ = r [Plot shown for r = 5; each nonempty node stores 1/ℓ amount of data]

  27. [3] Symmetric Probabilistic Allocation with Fixed Access • We conjecture that for any r and T, the optimal choice of ℓ that maximizes success probability P(r, T, ℓ) is either ℓ = 1 or ℓ = r [Plot shown for r = 5; each nonempty node stores 1/ℓ amount of data; annotations: "increasing budget per node", "store less", "store more"]

  28. Summary & Future Work
     [1] Deterministic Allocation with Probabilistic Access • Suboptimality of symmetric allocations
     [2] Deterministic Allocation with Fixed Access • Optimal allocation for high probability recovery • Extreme point solutions not necessarily optimal for symmetric allocations • Is there always a symmetric optimal allocation?
     [3] Symmetric Probabilistic Allocation with Fixed Access • Optimal allocation in high-probability regime • Is there a phase transition in optimal allocation with increasing budget?

  29. Distributed Storage Allocation Problems Derek Leong, Alexandros G. Dimakis, Tracey Ho California Institute of Technology NetCod 2009 2009-06-16
