
Fair Queuing for Aggregated Multiple Links

Fair Queuing for Aggregated Multiple Links. Josep M. Blanquer and Banu Özden. Proceedings of the ACM SIGCOMM, August 2001. Abstract: fair queuing algorithms proportionally share a single server among competing flows; they do not address the problem of sharing multiple servers.


Presentation Transcript


  1. Fair Queuing for Aggregated Multiple Links Josep M. Blanquer and Banu Özden Proceedings of the ACM SIGCOMM, August 2001

  2. ABSTRACT • Fair queuing algorithms • Proportionally share a single server among competing flows • Do not address the problem of sharing multiple servers • Multi-server applications • Link aggregation • Multiprocessors • Multi-path storage I/O

  3. We introduce a new service discipline for multi-server systems, MSF2Q, that provides guarantees for competing flows. • We prove that this new service discipline is a close approximation of the idealized Generalized Processor Sharing (GPS) discipline. • We calculate its maximum packet delay and service discrepancy with respect to GPS.

  4. 1. INTRODUCTION • A large increase in networked services brings a much larger variety of traffic, with different network requirements to be met simultaneously over the same links. • High bandwidth guarantees: backups • Low jitter guarantees: video streaming • Low delay guarantees: network data acquisition • Network resources must be appropriately scheduled.

  5. Fair queuing service disciplines allocate bandwidth fairly among competing traffic. • Protection from “misbehaving” traffic • Effective congestion control • Better service for rate-adaptive applications • Strict QoS guarantees, with admission control.

  6. Growing demand for bandwidth • Incremental scaling techniques • Grouping multiple links into a single logical interface [3] • Implementations • [1] 3Com’s Dynamic Access • [2] Adaptec Duralink Software Suite • [12] Hewlett Packard’s Auto-Port Aggregation • [14] Intel Load Balancing • [6] J. Blanquer et al., Resource Management for QoS in Eclipse/BSD, Proceedings of the First FreeBSD Conference, Berkeley, California, Oct. 1999.

  7. Adaptec Duralink

  8. HP Auto Port Aggregation

  9. Intel Load Balancing

  10. 2. BACKGROUND • GPS (Generalized Processor Sharing) • Guaranteed fairness • Wx(τ, t) = the amount of flow x traffic served in the interval [τ, t]; the fairness guarantee applies to any flow x that is continuously backlogged during [τ, t]. • ψx = weight of flow x = proportion of the server bandwidth that flow x receives when it is backlogged. • Guaranteed rate • ri = rate of flow i • r = server rate
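
The fairness and rate formulas on this slide did not survive as text. For reference, the standard GPS definitions, as usually stated following Parekh and Gallager and assumed here to match the slide's notation, can be written as:

```latex
% Fairness: for any flow x continuously backlogged during [\tau, t] and any flow y
\frac{W_x(\tau, t)}{W_y(\tau, t)} \ge \frac{\psi_x}{\psi_y}

% Guaranteed rate of flow i on a server of rate r
r_i = \frac{\psi_i}{\sum_j \psi_j}\, r
```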

  11. Generalized Processor Sharing (GPS) • An idealized system that serves as a reference model for the fair queuing disciplines. • GPS assumes that the server can transmit more than one flow simultaneously and that the traffic is infinitely divisible. • A number of packetized approximations to GPS have been devised. • WFQ (Weighted Fair Queueing) ’89 Demers et al. • VC (Virtual Clock) ’90 Zhang • PGPS (Packet-by-packet Generalized Processor Sharing) ’93 Parekh et al. • SCFQ (Self-Clocked Fair Queueing) ’94 Golestani • WF2Q (Worst-case Fair Weighted Fair Queueing) ’96 Bennett et al. • SFQ (Start-time Fair Queueing) ’96 Goyal et al.

  12. * A New Priority Calculation Method for Sorted-priority Fair Queuing, Liu et al., 2004 • B. Current packet priority calculation methods • The three best-known packet priority calculation methods are [9]: • Smallest Finish time First (SFF) • Packet selection: PiX(t) + li/I (li = packet length) • Used by WFQ and SCFQ • Smallest Start time First (SSF) • Packet selection: PiX(t) • Used by SFQ • Smallest Eligible Finish time First (SEFF) • Pre-selection: sessions with session potentials smaller than the system potential • Packet selection: (SFF) PiX(t) + li/i • Used by WF2Q

  13. 3. PROPORTIONAL SHARING OF MULTISERVER SYSTEMS • Numerous applications use multi-server systems and can benefit from service guarantees: • Network: Multiple network adapters to a web or file server • Storage: Multiple I/O channels to a RAID server

  14. System Model • (MSFQ, N, r): WFQ scheduling over N servers, each with a rate of r [figure]

  15. (GPS, 1, Nr) WFQ

  16. 3.1 A Packetized Fair Queuing Discipline for Multi-Servers • MSFQ’s scheduling discipline is the same as WFQ: • When a server is idle and there is a packet waiting for service, MSFQ schedules the “next” packet. • The “next” packet is defined as the first packet that would complete service in the (GPS, 1, Nr) system if no more packets were to arrive. • To compare how well a (MSFQ, N, r) system approximates a (GPS, 1, Nr) system, calculate: (i) the worst-case delay (ii) the traffic discrepancy
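
The dispatch rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes every packet arrives at t = 0 (so the GPS completion order is simply the order of the virtual finish tags), and the names `gps_finish_tags` and `msfq_schedule`, the tie-break by arrival order, and the demo flows are all hypothetical.

```python
import heapq

def gps_finish_tags(flows):
    """Tag each packet with its GPS virtual finish time.

    flows maps a flow name to (weight, [packet lengths]); every packet is
    assumed to arrive at t = 0, so tags alone give the GPS completion order.
    """
    tagged = []
    seq = 0  # arrival order, used only to break ties (an assumption)
    for name, (weight, lengths) in flows.items():
        tag = 0.0
        for length in lengths:
            tag += length / weight
            tagged.append((tag, seq, name, length))
            seq += 1
    return sorted(tagged)

def msfq_schedule(flows, n_servers, server_rate):
    """Return (start_time, server, flow) for each packet, in dispatch order."""
    free_at = [(0.0, s) for s in range(n_servers)]  # (time server frees up, id)
    heapq.heapify(free_at)
    schedule = []
    for _tag, _seq, name, length in gps_finish_tags(flows):
        # The next packet in GPS finish order goes to the first idle server.
        start, server = heapq.heappop(free_at)
        schedule.append((start, server, name))
        heapq.heappush(free_at, (start + length / server_rate, server))
    return schedule

if __name__ == "__main__":
    demo = {"A": (0.5, [1, 1, 1, 1]), "B": (0.25, [1]), "C": (0.25, [1])}
    for start, server, name in msfq_schedule(demo, n_servers=2, server_rate=0.5):
        print(f"t={start:4.1f}  server {server}  flow {name}")
```

With arrivals spread over time, the finish tags would have to be computed against a GPS virtual clock; that part is deliberately left out of this sketch.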

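  17. 3.2 Preliminary Properties • Delay and service properties of MSFQ do not trivially follow from the single-server case, WFQ. • GPS and MSFQ busy periods do not coincide. • Example (figure): a single packet of length L arrives at an idle system. • (GPS, 1, Nr): one server of rate Nr; finish time Δ1 = L / (Nr). • (MSFQ, N, r): N servers of rate r; at time Δ1 the bits left = L – [r * (L / (Nr))] = L – (L/N) = (N-1)L / N; finish time Δ2 = L / r. • Hence, at any time τ, W(0, τ) ≥ W’(0, τ).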

  18. When GPS is busy, MSFQ is busy. However, the converse is not true. • Thus for any τ, W(0, τ) ≥ W’(0, τ) (2), where W(0, τ) and W’(0, τ) denote the total number of bits serviced by GPS and MSFQ, respectively, by time τ. • We will use the term busy period to refer to a busy period in the reference (GPS, 1, Nr) system.

  19. Work from previous busy periods can accumulate under MSFQ. • This may happen either at the beginning or in the middle of a busy period. • Figure: arrival time versus service time for packets 1 to 7, showing a delayed finish.

  20. Figure: arrival time versus service time for packets 1 to 7, showing a delayed start.

  21. Theorem 1: For any τ, W(0, τ) − W’(0, τ) ≤ (N − 1) Lmax, where Lmax denotes the maximum packet length. • Proof: • The slope of W (GPS) alternates between Nr (during a busy period) and 0 (idle, between two consecutive busy periods). • The slope of W’ (MSFQ) is at most Nr at any given time.

  22. Figure (assuming 3 servers): W(0, t) versus t, with arrivals a1 to a9. The GPS curve has slope 0 or Nr; the MSFQ curve has slope r, 2r, or 3r.

  23. [Case 1] At most N − 1 MSFQ servers are busy at t: • Since MSFQ is work-conserving, if a server is idle, we know that there is no packet waiting for transmission. • In the worst case, all the k busy servers have just started transmitting a packet of maximum length (Lmax). W(0, t) − W’(0, t) ≤ k Lmax ≤ (N − 1) Lmax (a), where k ≤ N − 1 is the number of busy servers.

  24. [Case 2] All MSFQ servers are busy at t: • Let [to, t] be the largest interval in which all MSFQ servers are busy. • Since in [to, t] the slope of W’ is Nr, and the slope of W is at most Nr, W(0, t) − W’(0, t) ≤ W(0, to) − W’(0, to) (b) • Figure: W(to, t) for the GPS server and W’(to, t) for MSFQ with all servers busy, both with slope Nr.

  25. If to = 0, then W(0, t) = W’(0, t). Otherwise, if to > 0, we know from (a) that W(0, to) − W’(0, to) ≤ (N − 1) Lmax (c) • From (b) and (c), we have W(0, τ) − W’(0, τ) ≤ (N − 1) Lmax. • This theorem implies the need for a buffer space of (N − 1) Lmax.
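
As a quick sanity check of Theorem 1, consider the simplest scenario: one packet of length L arriving at t = 0 in an idle system. The sketch below is only an illustration under that assumption; the function name `discrepancy` and the sample values are hypothetical.

```python
def discrepancy(tau, length, n_servers, rate):
    """W(0, tau) - W'(0, tau) for one packet arriving at t = 0 in an idle system."""
    served_gps = min(n_servers * rate * tau, length)  # single server of rate N*r
    served_msfq = min(rate * tau, length)             # one of N servers, rate r
    return served_gps - served_msfq

if __name__ == "__main__":
    L, N, r = 1500, 4, 1
    # The gap peaks when GPS finishes the packet, at tau = L / (N*r).
    worst = max(discrepancy(t, L, N, r) for t in range(0, L // r + 1))
    print(worst, "<=", (N - 1) * L)
```

In this scenario the gap peaks at (N − 1)L / N, which is well inside the (N − 1) Lmax bound; the bound is approached only when several servers hold maximum-length packets at once.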

  26. The discrepancy of packet departure times (i.e., the times at which packets begin transmission or service) between the multi-server and single-server systems • Let dp be the time at which packet p departs from the (GPS, 1, Nr) system. • MSFQ packets may not depart in increasing order of dp.

  27. Lemma 1: Packet k will be scheduled no later than a bound given as an equation on the slide, where ak and bk are, respectively, the arrival time and scheduling time of packet k over N servers, each with a rate of r; P is the set of packets scheduled before packet k since time ak, including the packets in service at ak; and Li is the length of packet i.

  28. Figure: packet arrivals from all flows between ak and bk. • Proof: • Given a load that must be scheduled before packet k, a work-conserving service discipline schedules packet k latest if the load is equally divided among the N servers such that all of them finish the work at the same time.
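
The bound in Lemma 1 is not preserved in the transcript. One reading consistent with the proof sketch, offered as an assumption rather than the authors' exact statement: if the load in P is spread evenly over the N servers of rate r, all of it is cleared after the total length divided by Nr, which gives

```latex
b_k \le a_k + \frac{\sum_{i \in P} L_i}{N r}
```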

  29. 4. PACKET DELAY • Theorem 2: For all packets p, a delay bound (given as an equation on the slide) holds, where dp’ and dp are the times at which packet p departs from the (MSFQ, N, r) and (GPS, 1, Nr) systems, respectively. • Proof: • Skipped

  30. 5. SERVICE PER-FLOW • Theorem 3: For any τ , Wi(0, τ) − Wi’(0, τ) ≤ NLmax • Proof: • Skipped

  31. 6. FAIRNESS • Example 3: • 4 servers: • 11 flows: (fixed packet length) • F1: Weight = 0.5, 10 packets at t = 0 • F2 ~ F11: Weight = 0.05, each with 1 packet at t = 0

  32. GPS scheduled by WFQ (finish time): F1A = 0 + L / 0.5; F1B = F1A + L / 0.5 = 2L / 0.5; …… F2 = 0 + L / 0.05; F3 = 0 + L / 0.05; ……

  33. MSFQ scheduled by WFQ (finish time): [figure]

  34. GPS scheduled by WF2Q (eligible start time (HOL) + finish time): * Not smooth?

  35. The direct application of the WF2Q technique to multi-server systems does not fix the undesired burstiness problem; moreover, it makes the discipline non-work-conserving. • A packet is not eligible until the previous packet is scheduled, so servers can sit idle while packets wait.

  36. 6.1 MSF2Q • (MSF2Q, N, r) • A packet is outstanding if it is being transmitted. • Let ôi(t) denote the number of outstanding flow i packets in the MSF2Q system at time t. • Ŵi(τ, t) = the work completed for flow i under MSF2Q over the interval [τ, t]

  37. At time t, when a server is idle and there is a packet waiting for service, MSF2Q considers only the eligible flows, i.e., those that satisfy an eligibility condition given as equations on the slide (of the form A or [B and C]). • Among these, it schedules the packet that would complete service in the GPS system earliest. • Example 3: F1: r1 = 0.5; F2~F11: rx = 0.05; r = 1/4 = 0.25 • Limit on ô1 = 0.5/0.25 = 2; limit on ôx = ⌈0.05/0.25⌉ = 1

  38. The output of MSF2Q in Example 3: * Smooth scheduling • Example 3: F1: r1 = 0.5; F2~F11: rx = 0.05; r = 1/4 = 0.25 • Limit on ô1 = 0.5/0.25 = 2; limit on ôx = ⌈0.05/0.25⌉ = 1
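
The smoothing effect in Example 3 can be reproduced with a small simulation. This is a deliberately simplified sketch, not MSF2Q itself: it assumes the only eligibility test is a cap of ⌈ri / r⌉ outstanding packets per flow (the numbers 2 and 1 quoted in the example) and omits the rest of the eligibility condition, which is not preserved in the transcript. The names `FLOWS` and `simulate`, unit packet lengths, and tie-breaking by flow order are all assumptions.

```python
import heapq
from fractions import Fraction
from math import ceil

# Example 3: weights sum to 1, unit-length packets, everything arrives at t = 0.
FLOWS = {"F1": (Fraction(1, 2), 10)}
FLOWS.update({f"F{i}": (Fraction(1, 20), 1) for i in range(2, 12)})

def simulate(flows, n_servers, cap_outstanding):
    """Dispatch unit packets to n_servers of rate 1/n_servers each.

    Returns a list of (start_time, flow). With cap_outstanding=False this is
    plain smallest-GPS-finish-tag-first (MSFQ-like); with True, a flow is
    skipped while it already has ceil(weight * n_servers) packets in service.
    """
    service_time = n_servers  # unit packet on a server of rate 1/N
    next_tag = {f: 1 / w for f, (w, _) in flows.items()}
    remaining = {f: n for f, (_, n) in flows.items()}
    outstanding = {f: 0 for f in flows}
    limit = {f: ceil(w * n_servers) for f, (w, _) in flows.items()}
    busy, idle, now, schedule = [], n_servers, 0, []
    while any(remaining.values()):
        while idle > 0:
            eligible = [f for f in flows if remaining[f] > 0
                        and (not cap_outstanding or outstanding[f] < limit[f])]
            if not eligible:
                break
            flow = min(eligible, key=lambda f: next_tag[f])  # earliest GPS finish
            schedule.append((now, flow))
            next_tag[flow] += 1 / flows[flow][0]
            remaining[flow] -= 1
            outstanding[flow] += 1
            idle -= 1
            heapq.heappush(busy, (now + service_time, flow))
        if not busy:
            break
        now, done = heapq.heappop(busy)  # advance to the next packet completion
        outstanding[done] -= 1
        idle += 1
    return schedule

if __name__ == "__main__":
    for label, cap in (("uncapped", False), ("capped", True)):
        print(label, simulate(FLOWS, 4, cap))
```

Without the cap, F1 occupies all four servers for the first two service rounds; with the cap, each round carries two F1 packets and two packets from the light flows, which matches F1's 50% share.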

  39. 6.2 Properties of MSF2Q • Theorem 4: Let Li,max denote the maximum packet length of flow i. For any time τ and flow i, the following property holds: (8) • Proof: • Skipped

  40. 7. APPLICATIONS • Link Aggregation • Logical grouping of several Ethernet network interfaces to allow for cost-effective load balancing, better scalability, and fault tolerance. • IEEE 802.3ad • Currently ranges from two to eight Fast/Gigabit Ethernet ports in either servers or switching elements.

  41. Access to storage I/O • Connect the RAID system to a host (e.g., a Web server) with multiple SCSI or Fibre Channel links to improve I/O performance. • Load balancing, failover

  42. 8. RELATED WORK • Skipped

  43. 9. CONTRIBUTIONS AND FUTURE WORK • Link aggregation, or the aggregation of multiple interfaces into a single logical link, is becoming the predominant approach for bandwidth scaling. • Numerous fair queuing results previously obtained for single-server systems do not directly apply to multi-server systems.

  44. We first analyzed the cumulative service, packet delay and per-flow cumulative service bounds for Weighted Fair Queuing (WFQ) applied to a multi-server system. • We then presented a new fair queuing algorithm, MSF2Q, that leads to smooth and fair schedules at finer time scales.

  45. Our future plans include: • Investigation of implementation issues • Quantitative comparison of the approach presented in this paper to the alternative approach of partitioning flows among servers • Enhancing the algorithms for multiprocessors and clusters of servers • Hierarchical GPS • Servers with different rates • Misordering of packets
