This paper studies how to minimize total flow time when scheduling jobs on multiple machines. It considers preemptive scheduling, where jobs can be interrupted, and reviews online algorithms with competitive guarantees for total flow time. The main result is a (1 + ε)-approximation scheme that runs in quasi-polynomial time n^O(m log n), where n is the number of jobs and m the number of machines. The scheme combines input rounding with dynamic programming over a compact description of the schedule state, and the paper also discusses the NP-hardness of the problem.
Minimizing Flow Time on Multiple Machines. Nikhil Bansal, IBM Research, T.J. Watson
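Scheduling: A collection of m machines and n jobs. Each job j has an arrival time or release time (r_j) and a service requirement or size (p_j). [Figure: timeline for m=1 starting at t=0, with releases r1, r2, r3 and completions C1, C3, C2; one job is preempted.]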
Scheduling: Flow time = time a job spends in the system = completion time − release time = waiting + processing. [Figure: the same timeline for m=1, highlighting the flow time of job 2.]
Scheduling: Goal: minimize total flow time. [Figure: the same timeline for m=1, highlighting the flow time of job 3.]
Total Flow Time (Another View): Imagine each job costs $1 per unit time. Cost of a job = its flow time. Total cost = total flow time. Total cost = Σ_t (cost at time t) = Σ_t (# jobs at time t).
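As a small sanity check of the identity on this slide, the following Python sketch counts total flow time in both ways on a made-up schedule. The function names and the example release and completion times are illustrative assumptions, not taken from the talk; times are assumed to be non-negative integers.

```python
def total_flow_time(schedule):
    """Sum of (completion - release) over all jobs."""
    return sum(c - r for r, c in schedule)


def total_cost_by_time(schedule):
    """Sum over unit time steps t of the number of jobs alive during [t, t+1)."""
    horizon = max(c for _, c in schedule)
    return sum(
        sum(1 for r, c in schedule if r <= t < c)
        for t in range(horizon)
    )


# Hypothetical schedule: (release time, completion time) per job.
example = [(0, 4), (1, 2), (2, 8)]
print(total_flow_time(example), total_cost_by_time(example))
```

Both counts agree because each job contributes one unit of cost for every time step between its release and its completion.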
Total Flow Time (Single Machine): Total cost = Σ_t (# jobs at time t). The processor has a “to do” list of jobs. Goal: minimize the number of jobs on the list. Work on the job it can finish earliest. Shortest Remaining Processing Time (SRPT): optimal algorithm.
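The SRPT rule can be sketched as a short single-machine simulation. This is a minimal illustration, assuming integer release times and sizes; the function name and the example instance are hypothetical and not from the talk.

```python
import heapq


def srpt_total_flow_time(jobs):
    """Total flow time of SRPT on one machine.

    jobs: list of (release, size) pairs with non-negative integer values.
    """
    jobs = sorted(jobs)
    heap = []          # (remaining time, release) of released, unfinished jobs
    t = 0              # current time
    i = 0              # index of the next job to be released
    total = 0
    n = len(jobs)
    while i < n or heap:
        if not heap:
            # Machine is idle: jump to the next release.
            t = max(t, jobs[i][0])
        while i < n and jobs[i][0] <= t:
            release, size = jobs[i]
            heapq.heappush(heap, (size, release))
            i += 1
        # Run the job with the shortest remaining time until it finishes
        # or until the next release, whichever comes first.
        remaining, release = heapq.heappop(heap)
        next_release = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_release - t)
        t += run
        if run == remaining:
            total += t - release       # job completes; add its flow time
        else:
            heapq.heappush(heap, (remaining - run, release))  # preempted
    return total


print(srpt_total_flow_time([(0, 3), (1, 1), (2, 4)]))
```

On this instance the job released at time 1 preempts the first job, which gives a total flow time of 11, compared with 12 if the jobs were run to completion in release order.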
Flow Time on Multiple Machines (m ≥ 2): NP-hard. Breakthrough: O(log n)-competitive [Leonardi, Raz 97]. Works for an arbitrary number of machines (m). Any online algorithm: Ω(log n)-competitive. Improvements: no migrations [Awerbuch et al 99]; immediate dispatch [Avrahami and Azar 03].
Flow Time on Multiple Machines: What about approximation algorithms? O(log n) is the best known, even for m=2. Lower bounds: NP-hard; APX-hard?
Flow Time on Multiple Machines: Main result: a (1+ε)-approximation scheme. Running time = n^O(m log n), or n^O(log n) for m = O(1). Suggests: a PTAS is likely for O(1) machines.
Basic Idea: Rounding: simplify the input without losing too much quality. Search: dynamic programming over some reasonable space of schedules.
Related Problem: Minimizing total completion time (Σ_i c_i, or equivalently Σ_i (r_i + f_i)). Same as flow time with respect to optimality, but easier to approximate. PTASes known with runtime poly(n, m) [Afrati et al 99]. Techniques not applicable to flow time.
Rounding for Flow Time: Flow time is quite sensitive. Suppose we round sizes to powers of (1+ε). Then we cannot distinguish between these two inputs: a job of size 1 arrives at each of t = 1, 2, …, n; a job of size 1+ε arrives at each of t = 1, 2, …, n. Very different: Θ(n) vs Θ(n^2) total flow time!
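The gap between the two inputs can be checked numerically. The sketch below is an illustration under stated assumptions: it uses a single machine, exact rational arithmetic, and the fact that with equal-size jobs SRPT simply runs them in release order. The function name and the chosen values of n and ε are hypothetical.

```python
from fractions import Fraction


def equal_jobs_total_flow_time(n, size):
    """n jobs of the same size released at t = 1, 2, ..., n on one machine.

    With equal sizes, SRPT never preempts the running job, so the jobs
    are served in release order.
    """
    t = Fraction(0)
    total = Fraction(0)
    for release in range(1, n + 1):
        start = max(t, Fraction(release))
        t = start + size               # completion time of this job
        total += t - release           # its flow time
    return total


n, eps = 100, Fraction(1, 10)
print(equal_jobs_total_flow_time(n, Fraction(1)))   # each job has flow time 1
print(equal_jobs_total_flow_time(n, 1 + eps))       # backlog grows by eps per job
```

With size 1 every job finishes before the next one arrives; with size 1+ε the i-th job waits for an accumulated backlog of (i−1)ε, so the total grows quadratically in n.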
Rounding for Flow Time: Can show: let B be the largest size; rounding r_i, p_i to multiples of εB/n^2 is fine. Proof: each job is affected by ≤ εB/n, and Opt ≥ B. Implies: sizes ∈ [1, n^2/ε], events at [1, n^3/ε]. Still bad for exhaustive search over all schedules.
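The rounding step can be sketched as follows. The slide does not say whether values are rounded up or down; rounding up is an assumption made here, as are the function name and the example instance.

```python
import math
from fractions import Fraction


def round_instance(jobs, eps):
    """Round release times and sizes to multiples of eps * B / n^2.

    jobs: list of (release, size) pairs; B is the largest size.
    Rounding up is an assumption of this sketch.
    Returns the rounded jobs and the rounding unit.
    """
    n = len(jobs)
    B = max(size for _, size in jobs)
    unit = Fraction(eps) * B / (n * n)

    def round_up(x):
        return math.ceil(Fraction(x) / unit) * unit

    return [(round_up(r), round_up(p)) for r, p in jobs], unit


rounded, unit = round_instance([(0, 3), (1, 1), (2, 4)], Fraction(1, 2))
print(unit, rounded)
```

Each value moves by less than one unit, and after dividing by the unit every size is an integer between 1 and n^2/ε, which matches the range stated on the slide.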
Restricting possible schedules: Jobs are assigned to a machine and worked on in SRPT order. Given a machine, which jobs are assigned to it? (2^n possibilities.) Approximate the state under SRPT with O(log^2 n) bits of information. Store this for each machine. Dynamic program: for each (state, t), what is the best flow time achievable?
State Properties: 1) Enough information: the state at time t+1 is computable from the state at time t. 2) Gives the number of jobs to within a 1+ε factor.
Property of SRPT: At any time, among jobs with size ∈ [a,b], at most one has remaining processing time < a. Proof: Once one such job (blue in the figure) has remaining time below a, no other job with size in [a,b] is executed until the blue job finishes, so two of them cannot both be below a at the same time. [Figure: two jobs with sizes between a and b.]
Property of SRPT: At any time, among jobs with size ∈ [a,b], at most one has remaining processing time < a. Suppose a = (1+ε)^i, b = (1+ε)^(i+1). Given the total remaining size x of the jobs with p_j ∈ [a,b]: x/b ≤ # of jobs ≤ x/a + 1.
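The two bounds follow directly from the property: every job in the class has remaining time at most b, and all but one have remaining time at least a. A minimal sketch, with a hypothetical function name and made-up numbers:

```python
from fractions import Fraction


def job_count_bounds(x, a, b):
    """Bounds on the number of unfinished jobs in the size class [a, b].

    x is their total remaining processing time. Every job has remaining
    time <= b, and under SRPT at most one has remaining time < a.
    """
    x = Fraction(x)
    return x / b, x / a + 1


# Hypothetical class [4, 5] holding jobs with remaining times 9/2, 4 and 1.
remaining = [Fraction(9, 2), Fraction(4), Fraction(1)]
low, high = job_count_bounds(sum(remaining), 4, 5)
print(low, len(remaining), high)
```

Since b/a = 1+ε, the two bounds differ by roughly a 1+ε factor plus the additive 1 for the single partially processed job.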
Configuration on a machine: Consider O(log n / ε) size classes [(1+ε)^i, (1+ε)^(i+1)]. For each class, store: • the total remaining processing time x; • the 1/ε largest remaining processing times. Then x/(1+ε)^(i+1) ≤ # of jobs ≤ x/(1+ε)^i + 1. Class 1: (Total_1, x_1, x_2, …, x_{1/ε}) … Class k: (Total_k, y_1, y_2, …, y_{1/ε}), with k = O(log n). In all, O(log^2 n) bits.
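One way to picture the per-machine configuration is as a small record per size class. The sketch below is only an illustration of the data described on this slide: the names `ClassSummary`, `size_class`, and `build_configuration` are hypothetical, floating-point logarithms are used for simplicity (so sizes lying exactly on a class boundary may land in the neighbouring class), and it does not implement the update steps or the dynamic program.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ClassSummary:
    total_remaining: float = 0.0
    largest: list = field(default_factory=list)  # up to 1/eps largest remaining times


def size_class(size, eps):
    """Index i such that (1+eps)^i <= size < (1+eps)^(i+1), up to float error."""
    return math.floor(math.log(size) / math.log(1 + eps))


def build_configuration(jobs, eps):
    """jobs: list of (original size, remaining time) on one machine."""
    keep = math.ceil(1 / eps)
    by_class = {}
    for size, remaining in jobs:
        by_class.setdefault(size_class(size, eps), []).append(remaining)
    return {
        i: ClassSummary(sum(rems), sorted(rems, reverse=True)[:keep])
        for i, rems in by_class.items()
    }


config = build_configuration(
    [(1.2, 1.0), (1.3, 0.4), (1.4, 1.4), (3.0, 2.5), (4.0, 4.0)], eps=0.5
)
for i in sorted(config):
    print(i, config[i])
```

The summary deliberately forgets which individual jobs are present; it keeps only what the slide lists for each class.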
Updating a configuration: At most O(m log^2 n) bits of information. Gives the number of jobs to within 1+ε. How to update as time passes? Class 1: (Total_1, x_1, x_2, …, x_{1/ε}) … Class j: (Total_j, y_1, y_2, …, y_{1/ε}). On arrival: guess the machine and update the state (m branches).
Updating a configuration: Working step: for each machine, guess the class containing the job with the smallest remaining time [(log n)^m choices].
Fitting it all together: At any time, O(m log^2 n / ε^2) total bits of information. We know how to update them. Dynamic program over all possible states.
Weighted Flow Time (Σ_i w_i f_i): NP-hard for m=1. No o(n) approximation known, even for m=1. m=1: (1+ε)-approximation in time n^O(log B log W) [Chekuri, Khanna 02], where B = max/min size and W = max/min weight. This paper: extend to m = O(1), time n^O(m log Bn log Wn). Hardness: exponential dependence on m is likely, since a (1+ε)-approximation with running time 2^O(polylog(n,m,W,B)) ⇒ NP ⊆ DTIME(n^polylog(n)).
Open Problems 1) PTAS or O(1) approx for minimizing flow time on O(1) machines? [Our QPTAS => PTAS likely] 2) For arbitrary number of machines. PTAS or APX-Hard?