
Minimizing Stall Time in Single Disk

Minimizing Stall Time in Single Disk. Susanne Albers, Naveen Garg, Stefano Leonardi, Carsten Witt Presented by Ruibin Xu. Introduction. Prefetching and caching are powerful techniques for increasing performance in disk systems


Presentation Transcript


  1. Minimizing Stall Time in Single Disk Susanne Albers, Naveen Garg, Stefano Leonardi, Carsten Witt Presented by Ruibin Xu

  2. Introduction • Prefetching and caching are powerful techniques for increasing performance in disk systems • Prefetching: load memory blocks into the cache before the actual references (needs to evict blocks simultaneously) • Caching: maintain the most frequently accessed blocks in cache

  3. Introduction • Both techniques have been studied extensively, but separately • Now look at them in an integrated manner • Focus on the offline problem

  4. The problem definition • Assume all blocks reside on one disk • The cache size is k • Serving a request takes one time unit • Fetching a block takes F time units • Given a request sequence σ = r_1, … , r_n, how should prefetches be scheduled to minimize the total stall time?

  5. An example • k = 4 • F = 5 • Blocks a, b, c and d are initially in the cache • The minimum stall time is 3

  6. Big question • Cao et al. designed a 2-approximation algorithm • Can this problem be solved exactly in polynomial time? • Yes, this paper answers this question

  7. The idea • Use linear programming • At first thought, one needs to prove that the optimum solution is integral by arguing that all vertices of the corresponding polytope are integral • By showing that the constraint matrix is totally unimodular (e.g. bipartite matching) • By a combinatorial argument (e.g. matching and matroid polytopes)

  8. Main novelty • At second thought, the polytope corresponding to the LP for this problem has nonintegral vertices • Now if we can show that any solution to the LP can be written as a convex combination of (polynomially many) integral solutions, …

  9. The roadmap • Construct the LP • Solve the LP • Find the convex decomposition to integral solutions

  10. The LP formulation • This is a 0-1 LP • The length of the request sequence is n • The cache size is k • The fetching time is F • The cache initially contains k blocks never requested in the sequence

  11. The variables of the LP • Consider all the intervals of the request sequence of length at most F : interval I = (i, j) of length |I|=j – i – 1, i = 0, … , n-1, j = 1, … , n, i < j
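
The index set on this slide can be enumerated directly. The following is a minimal Python sketch; the function names `enumerate_intervals` and `interval_length` are illustrative and do not come from the presentation.

```python
def interval_length(interval):
    """Length |I| = j - i - 1 of the interval I = (i, j)."""
    i, j = interval
    return j - i - 1


def enumerate_intervals(n, F):
    """All intervals (i, j) with i = 0..n-1, j = 1..n, i < j and |I| <= F."""
    intervals = []
    for i in range(n):                  # i = 0, ..., n-1
        for j in range(i + 1, n + 1):   # j = i+1, ..., n
            if interval_length((i, j)) <= F:
                intervals.append((i, j))
    return intervals


if __name__ == "__main__":
    # For n = 3 and F = 1 the interval (0, 3) has length 2 and is excluded.
    print(enumerate_intervals(3, 1))
```

Since each start point i admits at most F + 1 end points, the LP has O(nF) interval variables.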

  12. The variables of the LP • Associate each interval I with an indicator variable x(I), where x(I) = 1 indicates a prefetch starting after request i and ending before request j and x(I) = 0 indicates that no prefetch is performed in this interval • With each interval I and distinct block a, associate a variable f_{I,a} (e_{I,a}), which is 1 if block a is fetched (evicted) in interval I and 0 otherwise

  13. The objective func. of the LP • The prefetch occurring in interval I has a stall time F - |I| • Thus the objective function is to minimize Σ_I x(I) · (F - |I|)
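
Reading the objective as the sum over all intervals of x(I) · (F - |I|), it can be evaluated as in the sketch below. This reading is an assumption derived from the per-interval stall time stated on the slide; the name `total_stall` and the dictionary representation of x are illustrative.

```python
def total_stall(x, F):
    """Sum of x(I) * (F - |I|) over intervals I = (i, j), with |I| = j - i - 1.

    `x` maps each interval (i, j) to its value x(I); values may be fractional.
    """
    return sum(value * (F - (j - i - 1)) for (i, j), value in x.items())


if __name__ == "__main__":
    # Two prefetches with F = 5: one overlaps 2 requests (stall 3),
    # the other overlaps 5 requests (stall 0).
    print(total_stall({(0, 3): 1, (4, 10): 1}, F=5))
```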

  14. The constraints of the LP • There are 7 kinds of constraints • A definition: an interval (a, b) is contained in an interval (c, d) if c ≤ a and d ≥ b, denoted by (a, b) ⊆ (c, d)

  15. The 1st constraint • To ensure that two prefetches are not performed simultaneously

  16. The 2nd constraint • For any interval I, the total amount of fetch should be exactly equal to the total amount of eviction, and this amount should not exceed the value x(I) of the interval
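
A check of this constraint for a single interval might look as follows. The sketch assumes that "the value of the interval" means x(I), and the function name and the per-block dictionaries are hypothetical.

```python
def satisfies_balance(x_I, fetch, evict, tol=1e-9):
    """Check sum_a f_{I,a} == sum_a e_{I,a} <= x(I) for one interval I.

    `fetch` and `evict` map block names to the amounts f_{I,a} and e_{I,a}.
    `tol` absorbs floating-point rounding in fractional solutions.
    """
    total_fetch = sum(fetch.values())
    total_evict = sum(evict.values())
    balanced = abs(total_fetch - total_evict) <= tol
    within_value = total_fetch <= x_I + tol
    return balanced and within_value


if __name__ == "__main__":
    # Half a unit of block e is fetched while a and b are each evicted by a quarter.
    print(satisfies_balance(0.5, {"e": 0.5}, {"a": 0.25, "b": 0.25}))
```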

  17. The 3rd constraint • A block should be in cache when it is referenced • After each reference to a block, the block is in cache. It can then be evicted at most once up until the next reference to that block, and if it is, it must also be fetched back prior to that next reference

  18. The 4th and 5th constraints • To ensure that every block is in cache at its first reference, the total fetch of a block on intervals before its first reference should be 1 and the total eviction of the block on these intervals should be 0

  19. The 6th constraint • A block is not evicted for more than 1 unit after its last reference

  20. The last constraint • On each request, the requested block is neither prefetched nor evicted • And

  21. Solving the LP relaxation • First solve the LP relaxation. If we get an integral solution, we are done. • If not, find the convex combination

  22. Modify the intervals • The goal: to obtain a total order of intervals • An interval I1 = (i1, j1) is properly contained in interval I2 = (i2, j2) iff i1 > i2 and j1 < j2 • We want no interval to be properly contained in another interval

  23. Modify the intervals • For each pair of nested intervals, remove one of them and add two new intervals

  24. Order the intervals • Now we can order the intervals by increasing start points • If two intervals have the same start point, then they are ordered by increasing end points
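
The ordering rule, together with the proper-containment test that must fail for every pair beforehand, can be sketched in Python as follows. The function names are illustrative; the sketch only checks for nested pairs and does not perform the interval replacement step.

```python
def properly_contained(inner, outer):
    """True iff inner = (i1, j1) lies strictly inside outer = (i2, j2)."""
    (i1, j1), (i2, j2) = inner, outer
    return i1 > i2 and j1 < j2


def has_nested_pair(intervals):
    """True iff some interval is properly contained in another one."""
    return any(properly_contained(a, b) for a in intervals for b in intervals)


def order_intervals(intervals):
    """Sort by increasing start point, breaking ties by increasing end point."""
    if has_nested_pair(intervals):
        raise ValueError("remove properly nested intervals before ordering")
    # Python compares tuples lexicographically: start first, then end.
    return sorted(intervals)


if __name__ == "__main__":
    print(order_intervals([(2, 5), (0, 3), (0, 2)]))
```

Without nested pairs, intervals sorted this way also have non-decreasing end points, which is what makes the order usable as a timeline.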

  25. Properties of the optimum sol. • Let C denote the cache configuration after we have performed the fetches and evicts corresponding to the first i intervals; let I be the (i+1)-st interval • There exists an optimum solution for which the next two claims are satisfied

  26. Properties of the optimum sol. • Claim 1: In interval I, we fetch the block that is not completely in C and whose next reference is earliest • Claim 2: In interval I, we evict the block which is partially or completely in C whose next reference is furthest • Both claims can be proven by contradiction

  27. Properties of the optimum sol. • The amount of fetch of a block prescribed by claim 1 might be less than x(I). In this case, we apply the same rule to fetch another block in I • The same holds for the case of evictions
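
For intuition, the two selection rules can be written down for the integral case, where every block is either fully in cache or fully out. This is a simplified sketch under that assumption and not the fractional procedure from the slides; the function names and the `position` argument (index of the next request to be served) are hypothetical.

```python
def next_reference(block, requests, position):
    """Index of the first request for `block` at or after `position` (inf if none)."""
    for idx in range(position, len(requests)):
        if requests[idx] == block:
            return idx
    return float("inf")


def choose_fetch(cache, requests, position):
    """Claim 1, integral case: the missing block whose next reference is earliest."""
    missing = {b for b in requests[position:] if b not in cache}
    if not missing:
        return None
    return min(missing, key=lambda b: next_reference(b, requests, position))


def choose_evict(cache, requests, position):
    """Claim 2, integral case: the cached block whose next reference is furthest."""
    # sorted() only makes tie-breaking deterministic.
    return max(sorted(cache), key=lambda b: next_reference(b, requests, position))


if __name__ == "__main__":
    requests = list("abeacd")
    cache = {"a", "b", "c", "d"}
    print(choose_fetch(cache, requests, 0), choose_evict(cache, requests, 0))
```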

  28. Another view of the process of fetching/evicting • Define the distance of interval I • View the process of fetching/evicting as a process in time by associating the time interval [dist(I), dist(I)+x(I)) with interval I

  29. Another view of the process of fetching/evicting • There is a unique interval associated with each time instant • Also associate a unique fetch/evict with each time instant
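
The timeline view can be made concrete with prefix sums. The definition of dist(I) is not reproduced in the transcript, so the sketch below assumes that dist(I) is the total x-value of all intervals preceding I in the total order; the function names are illustrative.

```python
from bisect import bisect_right
from fractions import Fraction


def distances(x_values):
    """Assumed dist(I): sum of x(I') over the intervals I' ordered before I."""
    dist, total = [], Fraction(0)
    for x in x_values:
        dist.append(total)
        total += x
    return dist


def interval_at(t, x_values):
    """Index of the interval whose slot [dist(I), dist(I) + x(I)) contains t."""
    dist = distances(x_values)
    idx = bisect_right(dist, t) - 1      # last interval starting at or before t
    if idx >= 0 and t < dist[idx] + x_values[idx]:
        return idx
    return None                          # t lies beyond the end of the timeline


if __name__ == "__main__":
    x = [Fraction(1, 2), Fraction(1, 4), Fraction(1)]
    print(distances(x), interval_at(Fraction(1, 2), x))
```

Because the slots are half-open and laid end to end, every time instant below the total of all x-values belongs to exactly one interval.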

  30. Properties of the optimum sol. • From Claims 1 and 2 and the ordering of fetches/evictions within an interval, it follows that a block a is fetched continuously until it is fully in cache • But the eviction of a could be interrupted before it is completely out of cache

  31. Properties of the optimum sol. • Consider the fetches/evictions of a block a between two consecutive references to a • Lemma 1. Every interruption in the eviction of a is for some integral time units

  32. Properties of the optimum sol. • A block a is partially fetched/evicted if the total extent to which a is fetched/evicted between two consecutive references is strictly less than 1 • Lemma 2. If a is partially fetched/evicted, then the fetch of a begins some integral time units after the start of its eviction

  33. Properties of the optimum sol. • Lemma 3. If a is evicted at time t and referenced again, then there is a time t’ = t + i, for some integer i, at which a is fetched back

  34. The convex decomposition • Let t be in the range [0, 1) and let ti = i + t for every integer i, 0 ≤ i ≤ x(I) • Claim 3. Let t1, t2 be two time instants such that t2 = t1 + i for some positive integer i, and let I1, I2 be the intervals associated with these time instants. Then I1 and I2 are disjoint.

  35. The convex decomposition • Lemma 4. For any time t in [0,1), the set of intervals that correspond to ti forms a feasible solution • Note that each solution is obtained not for just one value of t but for a range of values, say for all t in the range [a, b]. We associate a weight b – a in the decomposition.
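
The selection step of the decomposition can be sketched as follows: for each range of t over which the chosen set stays the same, collect the intervals hit by the instants t_i = i + t and record the range length as the weight. The sketch assumes the intervals are already in total order, that interval m occupies the slot starting at the sum of the preceding x-values, and that every x-value is at most 1. It only performs the selection and does not verify feasibility (Lemma 4). The function name is illustrative.

```python
import math
from fractions import Fraction


def convex_decomposition(x_values):
    """Map each selected index set (as a tuple) to its weight; weights sum to 1."""
    # Slot [lo, hi) of each interval on the timeline.
    slots, total = [], Fraction(0)
    for x in x_values:
        slots.append((total, total + x))
        total += x

    # The selection can only change where t crosses the fractional part
    # of some slot boundary.
    cuts = sorted({Fraction(0), Fraction(1)}
                  | {bound % 1 for slot in slots for bound in slot})

    weights = {}
    for a, b in zip(cuts, cuts[1:]):
        t = (a + b) / 2                  # representative t inside the range (a, b)
        chosen = tuple(
            m for m, (lo, hi) in enumerate(slots)
            # smallest instant i + t that is >= lo; selected if it is below hi
            if math.ceil(lo - t) + t < hi
        )
        weights[chosen] = weights.get(chosen, Fraction(0)) + (b - a)
    return weights


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(convex_decomposition([half, half, half, half]))
```

In the example, four intervals of value 1/2 split into the two integral selections {0, 2} and {1, 3}, each with weight 1/2, so every interval is covered with total weight equal to its x-value.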

  36. Conclusion • An optimum prefetching/caching schedule for a single disk can be computed in polynomial time

  37. Open problem • Now the problem can be solved exactly in polynomial time by using LP. Does there exist a combinatorial, polynomial-time algorithm? • Yes, by using multicommodity network flows

  38. The roadmap • Construct the LP • Solve the LP • Construct the multicommodity network • Solve the network • Find the convex decomposition to integral solutions

  39. Problem • No combinatorial polynomial-time algorithm for computing an optimal non-integral min-cost multicommodity flow is known • But we know an approximation algorithm: for any ε > 0, δ > 0, the algorithm computes a flow such that a fraction of at least 1 - ε of each demand in the network is satisfied and the cost of the flow is at most (1 + δ) times the optimum

  40. The network • Given a request sequence of length n, construct a network with n+1 commodities • Associate each request σ(i) with a commodity i, which has a source s_i, a sink t_i and a demand d_i = 1 • For each request σ(i), introduce two vertices x_i and x'_i

  41. An example network • Sketch of the network for request sequence abcbc and F = 2 (figure not included in the transcript)

  42. The problem of previous network • The construction allows a flow algorithm to saturate more than one of the edges that correspond to fetches executed simultaneously • Needs to make sure at most one fetch operation is executed at any time

  43. Solution • Split the “super edge” (s_i, x_j) into several parts and add one more commodity • For any l, 1 ≤ l ≤ n-1, let [l, l+1) be the time interval starting at the service of σ(l) and ending immediately before the service of σ(l+1)

  44. Solution • For any fixed i and j, with 1 ≤ i ≤ n and p_i + 1 ≤ j < i, introduce vertices v_{i,j,l} and w_{i,j,l}, where l = j, … , min{j+F, i} - 1 • For any fixed i, with 1 ≤ i ≤ n, introduce vertices v_{i,i,i-1} and w_{i,i,i-1} • How to connect? How to assign cost and capacity?

  45. Solution • Now add the (n+1)-st commodity • Let f_l be the number of prefetches whose execution overlaps with [l, l+1) • Commodity n+1 has a source s_{n+1}, a sink t_{n+1} and a demand d_{n+1}

  46. Solution • The flow from s_{n+1} to t_{n+1} is routed through the edges (v_{i,j,l}, w_{i,j,l}) and newly introduced “subsinks” t_{n+1,l}, 1 ≤ l ≤ n-1 • How to connect? How to assign cost and weight?

  47. Optimal flows • Any feasible integral flow of cost C in the network corresponds to a feasible prefetching/caching schedule with stall time C for σ, and vice versa • A non-integral flow corresponds to a fractional prefetching/caching schedule

  48. Apply the approximation algo. • Unfortunately, the flow computed by the algorithm does not correspond to a feasible fractional prefetching/caching schedule • It is possible that (1) more than one block is fetched at any time and (2) blocks are not completely in cache when requested

  49. Apply the approximation algo. • The solution is to choose ε and δ properly and modify the flow • Choose ε = 1/(4F²n³) and δ = 1/(3nF) • Let Φ be the flow returned by the approximation algorithm

  50. Apply the approximation algo. • The flow out of each source s_i, i = 1, …, n, is lower bounded by 1 - ε. Moreover, commodity n+1 might lack an amount of ε·d_{n+1} ≤ εFn² • Let ρ = 1 - ε - ε·d_{n+1}; transform the flow Φ into a uniform flow Φ’ which directs exactly ρ units of flow from s_i to t_i
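
To see how small the lost flow is, the parameters can be evaluated exactly for concrete F and n. The sketch assumes the choice of ε reads 1/(4·F²·n³) and uses the stated bound ε·d_{n+1} ≤ ε·F·n² in place of the actual demand d_{n+1}, so it yields a lower bound on ρ rather than ρ itself. The function name is illustrative.

```python
from fractions import Fraction


def approximation_parameters(F, n):
    """Return (eps, delta, shortfall_bound, rho_lower_bound) as exact fractions."""
    eps = Fraction(1, 4 * F**2 * n**3)
    delta = Fraction(1, 3 * n * F)
    # Bound on what commodity n+1 may lack: eps * d_{n+1} <= eps * F * n^2.
    shortfall_bound = eps * F * n**2      # simplifies to 1 / (4 * F * n)
    rho_lower_bound = 1 - eps - shortfall_bound
    return eps, delta, shortfall_bound, rho_lower_bound


if __name__ == "__main__":
    for value in approximation_parameters(F=2, n=5):
        print(value)
```

For F = 2 and n = 5 the shortfall bound is 1/40, so at least 1949/2000 of each unit demand is still routed.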
