Open Issues in Buffer Sizing

Open Issues in Buffer Sizing • Amogh Dhamdhere, Constantine Dovrolis • College of Computing, Georgia Tech

Outline • Motivation and previous work • The Stanford model for buffer sizing • Important issues in buffer sizing • Simulation results for the Stanford model • Buffer sizing for bounded loss rate (Infocom’05)

Motivation • Router buffers are crucial elements of packet networks • Absorb rate variations of incoming traffic • Prevent packet losses during traffic bursts • Increasing the router buffer size: • Can increase link utilization (especially with TCP traffic) • Can decrease packet loss rate • Can also increase queuing delays

Common operational practices • Major router vendor recommends 500ms of buffering • Implication: buffer size increases proportionally to link capacity • Why 500ms? • Bandwidth Delay Product (BDP) rule: • Buffer size B = link capacity C x typical RTT T (B = CxT) • What does “typical RTT” mean? • Measurement studies showed that RTTs vary from 1ms to 10sec! • How do different types of flows (TCP elephants vs mice) affect buffer requirement? • Poor performance is often due to buffer size: • Under-buffered switches: high loss rate and poor utilization • Over-buffered DSL modems: excessive queuing delay for interactive apps
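The proportionality between buffer size and link capacity under the BDP rule can be made concrete with a short sketch. The function name and the 1 Gb/s example link are assumptions chosen for illustration; only the 500 ms figure comes from the slide.

```python
def bdp_buffer_bytes(capacity_bps: float, rtt_s: float) -> float:
    """Buffer size B = C x T, converted from bits to bytes."""
    return capacity_bps * rtt_s / 8.0

# Hypothetical 1 Gb/s link combined with the 500 ms rule of thumb.
print(bdp_buffer_bytes(1e9, 0.5))  # 62500000.0 bytes, i.e. 62.5 MB
```

Doubling the capacity doubles the buffer, which is why a fixed "500 ms of buffering" recommendation scales memory linearly with link speed.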

Previous work • Approaches based on queuing theory (e.g. M|M|1|B) • Assume a certain input traffic model, service model and buffer size • Loss probability for M|M|1|B system is given by p = (1 − ρ) ρ^B / (1 − ρ^(B+1)), where ρ is the offered load • TCP is not open-loop; TCP flows react to congestion • There is no universally accepted Internet traffic model • Morris’ Flow Proportional Queuing (Infocom ’00) • Proposed a buffer size proportional to the number of active TCP flows (B = 6*N) • Did not specify which flows to count in N • Objective: limit loss rate • High loss rate causes unfairness and poor application performance
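As an illustration of the queuing-theory approach, the sketch below evaluates the textbook blocking probability of an M|M|1|B queue. It assumes that B counts the maximum number of packets in the system and that ρ is the offered load (arrival rate divided by service rate); the function name is hypothetical.

```python
def mm1b_loss_probability(rho: float, B: int) -> float:
    """Textbook blocking probability of an M/M/1/B queue (at most B packets)."""
    if rho < 0 or B < 0:
        raise ValueError("rho and B must be non-negative")
    if abs(rho - 1.0) < 1e-12:
        # Limit of the closed form as rho -> 1: all B + 1 states are equally likely.
        return 1.0 / (B + 1)
    return (1.0 - rho) * rho**B / (1.0 - rho ** (B + 1))

# Loss falls geometrically with B when rho < 1.
for size in (5, 10, 20):
    print(size, mm1b_loss_probability(0.9, size))
```

The formula treats arrivals as open-loop Poisson traffic, which is exactly the assumption the slide questions for TCP.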

TCP window dynamics for long flows • TCP-aware buffer sizing must take into account TCP dynamics • Saw-tooth behavior • Window increases until packet loss • Single loss results in cwnd reduction by factor of two • Square-root TCP model • TCP throughput can be approximated by R ≈ 0.87 / (T √p), for RTT T and loss rate p • Valid when loss rate p is small (less than 2-5%) • Average window size is independent of RTT
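A minimal sketch of the square-root model follows. The constant 0.87 is an assumption (it is not present in the extracted slide text), chosen because it reproduces the window sizes of about 6 and 9 packets quoted later for 2% and 1% loss. Function names are hypothetical.

```python
import math

def tcp_throughput_pps(rtt_s: float, loss_rate: float, k: float = 0.87) -> float:
    """Square-root model: throughput in packets/s for a given RTT and loss rate."""
    return k / (rtt_s * math.sqrt(loss_rate))

def avg_window_packets(loss_rate: float, k: float = 0.87) -> float:
    """Average window = throughput x RTT, so the RTT cancels out."""
    return k / math.sqrt(loss_rate)

print(tcp_throughput_pps(0.1, 0.01))  # about 87 packets/s at 100 ms RTT, 1% loss
print(avg_window_packets(0.01))       # about 8.7 packets, whatever the RTT
```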

Origin of BDP rule • Consider a single flow with RTT T • Window follows TCP’s saw-tooth behavior • Maximum window size = CT + B • At this point packet loss occurs • Window size after packet loss = (CT + B)/2 • Key step: Even when window size is minimum, link should be fully utilized • (CT + B)/2 ≥ CT which means B ≥ CT • Known as the bandwidth delay product rule • Same result for N homogeneous TCP connections
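The key step of the derivation can be checked numerically. This sketch assumes a single flow, capacity expressed in packets per second, and hypothetical function names; it only restates the inequality (CT + B)/2 ≥ CT.

```python
def window_after_loss(capacity_pps: float, rtt_s: float, buffer_pkts: float) -> float:
    """Window right after a loss: half of the maximum window CT + B."""
    return (capacity_pps * rtt_s + buffer_pkts) / 2.0

def link_stays_full(capacity_pps: float, rtt_s: float, buffer_pkts: float) -> bool:
    """The link stays busy only if the halved window still covers CT packets."""
    return window_after_loss(capacity_pps, rtt_s, buffer_pkts) >= capacity_pps * rtt_s

# Hypothetical link: 1000 packets/s, 250 ms RTT, so CT = 250 packets.
print(link_stays_full(1000, 0.25, 250))  # True: B equals CT
print(link_stays_full(1000, 0.25, 249))  # False: B is just below CT
```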

Outline • Motivation and previous work • The Stanford model for buffer sizing • Important issues in buffer sizing • Simulation results for the Stanford model • Buffer sizing for bounded loss rate (BSCL)

Stanford Model - Appenzeller et al. • Objective: Find the minimum buffer size to achieve full utilization of target link • Assumption: Most traffic is from TCP flows • If N is large, flows are independent and unsynchronized • Aggregate window size distribution tends to normal • Queue size distribution also tends to normal • Flows in congestion avoidance (linear increase of window between successive packet drops) • Buffer for full utilization is given by B = CT / √N • N is the number of “long” flows at the link • CT: Bandwidth delay product
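A short sketch of the Stanford rule, assuming it takes the form B = CT/√N. That form is consistent with the two numerical examples given elsewhere in these slides (CT=200, N=400 gives 10 packets; CT=250, N=200 gives 18 packets). The function name is hypothetical.

```python
import math

def stanford_buffer(bdp_packets: float, n_long_flows: int) -> float:
    """Buffer for full utilization with N desynchronized long flows."""
    return bdp_packets / math.sqrt(n_long_flows)

print(stanford_buffer(200, 400))         # 10.0 packets
print(round(stanford_buffer(250, 200)))  # 18 packets
```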

Stanford Model (cont’) • If link has only short flows, buffer size depends only on offered load and average flow size • Flow size determines the size of bursts during slow start • For a mix of short and long flows, buffer size is determined by number of long flows • Small flows do not have a significant impact on buffer sizing • Resulting buffer can achieve full utilization of target link • Loss rate at target link is not taken into account

What are the objectives ? • Network layer vs. application layer objectives • Network’s perspective: Utilization, loss rate, queuing delay • User’s perspective: Per-flow throughput, fairness etc. • Stanford Model: Focus on utilization & queueing delay • Can lead to high loss rate (> 10% in some cases) • BSCL: Both utilization and loss rate • Can lead to large queuing delay • Buffer sizing scheme that bounds queuing delay • Can lead to high loss rate and low utilization • A certain buffer size cannot meet all objectives • Which problem should we try to solve?

Saturable/congestible links • A link is saturable when offered load is sufficient to fully utilize it, given large enough buffer • A link may not be saturable at all times • Some links may never be saturable • Advertised-window limitation, other bottlenecks, size-limited • Small buffers are sufficient for non-saturable links • Only needed to absorb short term traffic bursts • Stanford model applicable: when N is large • Backbone links are usually not saturable due to over-provisioning • Edge links are more likely to be saturable • But N may not be large for such links

Which flows to count ? • N: Number of “long” flows at the link • “Long” flows show TCP’s saw-tooth behavior • “Short” flows do not exit slow start • Does size matter? • Size does not indicate slow start or congestion avoidance behavior • If no congestion, even large flows do not exit slow start • If highly congested, small flows can enter congestion avoidance • Should the following flows be included in N ? • Flows limited by congestion at other links • Flows limited by sender/receiver socket buffer size • N varies with time. Which value should we use ? • Min ? Max ? Time average ?

Which traffic model to use ? • Traffic model has major implications on buffer sizing • Early work considered traffic as exogenous process • Not realistic. The offered load due to TCP flows depends on network conditions • Stanford model considers mostly persistent connections • No ambiguity about number of “long” flows (N) • N is time-invariant • In practice, TCP connections have finite size and duration, and N varies with time • Open-loop vs closed-loop flow arrivals

Traffic model (cont’) • Open-loop TCP traffic: • Flows arrive randomly with average size S, average rate λ • Offered load λS, link capacity C • Offered load is independent of system state (delay, loss) • The system is unstable if λS > C • Closed-loop TCP traffic: • Each user starts a new transfer only after the completion of previous transfer • Random think time between consecutive transfers • Offered load depends on system state • The system can never be unstable
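The open-loop stability condition is simple arithmetic on the flow arrival rate and the mean flow size. The sketch below assumes sizes in bits and capacity in bits per second; the names and the example numbers are hypothetical.

```python
def open_loop_offered_load(arrival_rate_fps: float, mean_flow_size_bits: float) -> float:
    """Offered load = flow arrival rate x mean flow size, in bits/s."""
    return arrival_rate_fps * mean_flow_size_bits

def is_unstable(arrival_rate_fps: float, mean_flow_size_bits: float,
                capacity_bps: float) -> bool:
    """Open-loop traffic keeps arriving regardless of delay or loss."""
    return open_loop_offered_load(arrival_rate_fps, mean_flow_size_bits) > capacity_bps

# Hypothetical 10 Mb/s link with 1 Mb flows.
print(is_unstable(9.5, 1e6, 10e6))   # False: 95% offered load
print(is_unstable(11.0, 1e6, 10e6))  # True: offered load exceeds capacity
```

In the closed-loop case no such check applies, because a user who is still waiting for a transfer to finish generates no new load.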

Why worry about loss rate? • The Stanford model gives very small buffer if N is large • E.g., CT=200 packets, N=400 flows: B=10 packets • What is the loss rate with such a small buffer size? • Per-flow throughput and transfer latency? • Compare with BDP-based buffer sizing • Distinguish between large and small flows • Small flows that do not see losses: limited only by RTT • Flow size: k segments • Large flows depend on both losses & RTT:

Simulation setup • Use ns-2 simulations to study the effect of buffer size on loss rate for different traffic models • Heterogeneous RTTs (20ms to 530ms) • TCP NewReno with SACK option • BDP = 250 packets (1500 B) • Model-1: persistent flows + mice • 200 “infinite” connections – active for whole simulation duration • mice flows - 5% of capacity, size between 3 and 25 packets, exponential inter-arrivals

Simulation setup (cont’) • Flow size distribution for finite size flows: • Sum of 3 exponential distributions: Small files (avg. 15 packets), medium files (avg. 50 packets) and large files (avg. 200 packets) • 70% of total bytes come from the largest 30% of flows • Model-2: Closed-loop traffic • 675 source agents • Think time exponentially distributed with average 5 s • Time average of 200 flows in congestion avoidance • Model-3: Open-loop traffic • Exponentially distributed flow inter-arrival times • Offered load is 95% of link capacity • Time average of 200 flows in congestion avoidance

Simulation results – Loss rate • CT=250 packets, N=200 for all traffic types • Stanford model gives a buffer of 18 packets • High loss rate with Stanford buffer • Greater than 10% for open loop traffic • 7-8% for persistent and closed loop traffic • Increasing buffer to BDP or small multiple of BDP can significantly decrease loss rate

Per-flow throughput • Transfer latency = flow-size / flow-throughput • Flow throughput depends on both loss rate and queuing delay • Loss rate decreases with buffer size (good) • Queuing delay increases with buffer size (bad) • Major tradeoff: Should we have low loss rate or low queuing delay ? • Answer depends on various factors • Which flows are considered: Long or short ? • Which traffic model is considered?

Persistent connections and mice • Application layer throughput for B=18 (Stanford buffer) and larger buffer B=500 • Two flow categories: Large (>100KB) and small (<100KB) • Majority of large flows get better throughput with large buffer • Large difference in loss rates • Smaller variability of per-flow throughput with larger buffer • Majority of short flows get better throughput with small buffer • Lower RTT and smaller difference in loss rates

Closed-loop traffic • Per-flow throughput for large flows is slightly better with larger buffer • Majority of small flows see better throughput with smaller buffer • Similar to persistent case • Not a significant difference in per-flow loss rate • Reason: Loss rate decreases slowly with buffer size

Open-loop traffic • Both large and small flows get much better throughput with large buffer • Significantly smaller per-flow loss rate with larger buffer • Reason: Loss rate decreases very quickly with buffer size

Our buffer sizing objectives • Full utilization: • The average utilization of the target link should be at least a target ρ̂% when the offered load is sufficiently high • Bounded loss rate: • The loss rate p should not exceed a target p̂, typically 1-2% for a saturated link • Minimum queuing delays and buffer requirement, given previous two objectives: • Large queuing delay causes higher transfer latencies and jitter • Large buffer size increases router cost and power consumption • So, we aim to determine the minimum buffer size that meets the given utilization and loss rate constraints

Why limit the loss rate? • End-user perceived performance is very poor when loss rate is more than 5-10% • Particularly true for short and interactive flows • High loss rate is also detrimental for large TCP flows • High variability in per-flow throughput • Some “unlucky” flows suffer repeated losses and timeouts • We aim to bound the packet loss rate to a target p̂ = 1-2%

Traffic classes • Locally Bottlenecked Persistent (LBP) TCP flows • Large TCP flows limited by losses at target link • Loss rate p is equal to loss rate at target link • Remotely Bottlenecked Persistent (RBP) TCP flows • Large TCP flows limited by losses at other links • Loss rate is greater than loss rate at target link • Window Limited Persistent TCP flows • Large TCP flows limited by advertised window, instead of congestion window • Short TCP flows and non-TCP traffic

Scope of our model • Key assumption: • LBP flows account for most of the traffic at the target link (80-90 %) • Reason: we ignore buffer requirement of non-LBP traffic • Scope of our model: • Congested links that mostly carry large TCP flows, bottlenecked at target link

Minimum buffer requirement for full utilization: homogeneous flows • Consider a single LBP flow with RTT T • Window follows TCP’s saw-tooth behavior • Maximum window size = CT + B • At this point packet loss occurs • Window size after packet loss = (CT + B)/2 • Key step: Even when window size is minimum, link should be fully utilized • (CT + B)/2 ≥ CT which means B ≥ CT • Known as the bandwidth delay product rule • Same result for N homogeneous TCP connections

Minimum buffer requirement for full utilization: heterogeneous flows • Nb heterogeneous LBP flows with RTTs {Ti} • Initially, assume Global Loss Synchronization • All flows decrease windows simultaneously in response to single congestion event • We derive that: B ≥ C Nb / Σ(1/Ti) • As a bandwidth-delay product: B ≥ C Te • Te = Nb / Σ(1/Ti): “effective RTT” is the harmonic mean of RTTs • Practical Implication: • Few connections with very large RTTs cannot significantly increase buffer requirement, as long as most flows have small RTTs
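The practical implication can be seen by computing the effective RTT directly. This sketch assumes the buffer requirement under global synchronization is C times the harmonic mean of the RTTs; the RTT values and the capacity are made up for illustration, and the function names are hypothetical.

```python
from statistics import harmonic_mean, mean

def effective_rtt(rtts_s):
    """Effective RTT Te: harmonic mean of the per-flow RTTs."""
    return harmonic_mean(rtts_s)

def global_sync_buffer(capacity_pps: float, rtts_s) -> float:
    """Buffer (packets) under global loss synchronization: C x Te."""
    return capacity_pps * effective_rtt(rtts_s)

# Nine flows at 20 ms plus one flow at 500 ms (illustrative values).
rtts = [0.02] * 9 + [0.5]
print(effective_rtt(rtts))             # about 0.022 s, close to the small RTTs
print(mean(rtts))                      # about 0.068 s, pulled up by the outlier
print(global_sync_buffer(1000, rtts))  # about 22 packets at 1000 packets/s
```

The harmonic mean is dominated by the small RTTs, so the single 500 ms flow barely moves the buffer requirement.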

Minimum buffer requirement for full utilization (cont’) • More realistic model: partial loss synchronization • Loss burst length L(Nb): number of packets lost by Nb flows during single congestion event • Assumption: loss burst length increases almost linearly with Nb, i.e., L(Nb) = α Nb • α: synchronization factor (around 0.5-0.6 in our simulations) • Minimum buffer size requirement: B ≥ [q(Nb) C Te − 2 M Nb (1 − q(Nb))] / (2 − q(Nb)) • q(Nb): Fraction of flows that see losses in a congestion event • M: Average segment size • Partial loss synchronization reduces buffer requirement
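A hedged sketch of how partial synchronization lowers the requirement. It assumes the bound has the form B ≥ [q C Te − 2 M Nb (1 − q)] / (2 − q), where q is the fraction of flows that see losses. That form follows from requiring the total window after a congestion event to stay at least C Te when a fraction q of the flows halve their windows and each remaining flow grows by one segment. The symbol q, the function name, and all numbers are assumptions for illustration.

```python
def partial_sync_buffer(capacity_Bps: float, te_s: float, nb: int,
                        seg_bytes: float, q: float) -> float:
    """Minimum buffer (bytes) when a fraction q of the flows see losses."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must lie in [0, 1]")
    b = (q * capacity_Bps * te_s - 2.0 * seg_bytes * nb * (1.0 - q)) / (2.0 - q)
    return max(0.0, b)  # a negative bound means no buffer is needed

# Hypothetical link: 1 MB/s, Te = 100 ms, 10 flows, 1500-byte segments.
print(partial_sync_buffer(1e6, 0.1, 10, 1500, 1.0))  # global sync: about C x Te
print(partial_sync_buffer(1e6, 0.1, 10, 1500, 0.5))  # partial sync: much smaller
```

With q = 1 the expression collapses to the global-synchronization result C Te, which is a quick sanity check on the assumed form.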

Validation (ns2 simulations) • Heterogeneous flows (RTTs vary between 20ms & 530ms) • Partial synchronization model: accurate • Global synchronization (deterministic) model overestimates buffer requirement by factor 3-5

Relation between loss rate and N • Nb homogeneous LBP flows at target link • Link capacity: C, flows’ RTT: T • If flows saturate target link, then flow throughput is given by R = C/Nb = 0.87 / (T √p) • Loss rate is proportional to square of Nb • Hence, to keep loss rate less than a target p̂ we must limit number of flows • But this would require admission control (not deployed)
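The quadratic dependence can be sketched by equating the fair share C/Nb with the square-root throughput model and solving for p. The constant 0.87 is an assumption (it is not in the extracted slide text), and the function names and link parameters are hypothetical.

```python
import math

def loss_rate(nb: int, capacity_pps: float, rtt_s: float, k: float = 0.87) -> float:
    """Solve C/Nb = k / (T sqrt(p)) for p: the loss rate grows as Nb squared."""
    return (k * nb / (capacity_pps * rtt_s)) ** 2

def max_flows(p_target: float, capacity_pps: float, rtt_s: float,
              k: float = 0.87) -> int:
    """Largest Nb that keeps the loss rate at or below p_target."""
    return math.floor(capacity_pps * rtt_s * math.sqrt(p_target) / k)

# Hypothetical link with CT = 250 packets.
print(loss_rate(100, 1000.0, 0.25))  # about 0.12
print(max_flows(0.01, 1000.0, 0.25))
```

Doubling Nb quadruples the loss rate, which is why a fixed RTT forces admission control if the loss rate is to stay bounded.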

Flow Proportional Queuing (FPQ) • First proposed by Morris (Infocom’00) • Bound loss rate by: • Increasing RTT proportionally to number of flows • Solving for T gives: T = Kp Nb / C • Where Kp = 0.87/√p̂ for target loss rate p̂, T = Tq + Tp, Tq: RTT’s queuing delay, and Tp: RTT’s propagation delay • Set Tq ≈ B/C, and solve for B: B = Kp Nb − C Tp • Window of each flow should be Kp packets, consisting of • Packets in target link buffer (B term) • Packets “on the wire” (CTp term) • Practically, Kp=6 packets for 2% loss rate, and Kp=9 packets for 1% loss rate
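A small sketch of the FPQ sizing rule, assuming the buffer is B = Kp Nb − C Tp with Kp = 0.87/√p for a target loss rate p. The constant 0.87 is an assumption that reproduces the quoted Kp values of 6 and 9; the function names are hypothetical.

```python
import math

def kp_for_loss(p_target: float, k: float = 0.87) -> float:
    """Per-flow window (packets) that yields loss rate p_target."""
    return k / math.sqrt(p_target)

def fpq_buffer(nb: int, c_tp_packets: float, kp: float) -> float:
    """Buffer (packets): per-flow windows minus the packets on the wire."""
    return max(0.0, kp * nb - c_tp_packets)

print(round(kp_for_loss(0.02)), round(kp_for_loss(0.01)))  # 6 9
print(fpq_buffer(200, 250, kp=9))                          # 1550 packets
```

With Nb = 200, C Tp = 250 packets and Kp = 9, the result appears consistent with the 1550-packet BSCL buffer used in the per-flow throughput comparison.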

Buffer size requirement for both full utilization and bounded loss rate • We previously showed separate results for full utilization and bounded loss rate • To meet both goals, provide enough buffers to satisfy the more stringent of the two requirements • Buffer requirement: • Decreases with Nb (full utilization objective) • Increases with Nb (loss rate objective) • Crossover point: the value of Nb at which the two requirements are equal • Previous result is referred to as the BSCL formula
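Taking the more stringent of the two requirements amounts to a maximum. The sketch below is a simplification under stated assumptions, not the exact BSCL formula: it works in packets (segment size M = 1), takes the synchronization fraction q as an input, and uses the assumed forms [q C Te − 2 Nb (1 − q)] / (2 − q) for utilization and Kp Nb − C Tp for loss. All names are hypothetical.

```python
def utilization_buffer(nb: int, c_te_packets: float, q: float) -> float:
    """Full-utilization requirement in packets (assumed form, M = 1 packet)."""
    return max(0.0, (q * c_te_packets - 2.0 * nb * (1.0 - q)) / (2.0 - q))

def loss_buffer(nb: int, c_tp_packets: float, kp: float) -> float:
    """Bounded-loss requirement in packets (assumed FPQ form)."""
    return max(0.0, kp * nb - c_tp_packets)

def combined_buffer(nb: int, c_te_packets: float, c_tp_packets: float,
                    q: float, kp: float) -> float:
    """Provide enough buffer for the more stringent of the two requirements."""
    return max(utilization_buffer(nb, c_te_packets, q),
               loss_buffer(nb, c_tp_packets, kp))

# Few flows: utilization dominates. Many flows: the loss constraint dominates.
print(combined_buffer(10, 250, 250, q=1.0, kp=9))   # 250.0
print(combined_buffer(200, 250, 250, q=1.0, kp=9))  # 1550
```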

Model validation • Heterogeneous flows • Given utilization and loss rate constraints • Figure labels: Utilization constraint, Loss rate constraint

Parameter estimation • Number of LBP flows: • With LBP flows, all rate reductions occur due to packet losses at target link • RBP flows: some rate reductions due to losses elsewhere • Effective RTT: • Jiang et al. (2002): simple algorithms to measure TCP RTT from packet traces • Loss burst lengths or loss synchronization factor: • Measure loss burst lengths from packet loss trace or use approximation L(Nb) = α Nb

Results: Bound loss rate to 1%

Per-flow throughput with BSCL • BSCL can achieve network layer objectives of full utilization and bounded loss rate • Can lead to large queuing delay due to larger buffer • How does this affect application throughput ? • BSCL loss rate target set to 1% • BSCL buffer size is 1550 packets • Compare with the buffer of 500 packets • BSCL is able to bound the loss rate to 1% target for all traffic models

Persistent connections and mice • BSCL buffer gives better throughput for large flows • Also reduces variability of per-flow throughputs • Loss rate decrease favors large flows in spite of larger queuing delay • All smaller flows get worse throughput with the BSCL buffer • Increase in queuing delay harms small flows

Closed-loop traffic • Similar to persistent traffic case • BSCL buffer improves throughput for large flows • Also reduces variability of per-flow throughputs • Loss rate decrease favors large flows in spite of larger queuing delay • All smaller flows get worse throughput with the BSCL buffer • Increase in queuing delay harms small flows

Open-loop traffic • No significant difference between B=500 and B=1550 • Reason: Loss rate for open-loop traffic decreases quickly with buffer size • Loss rate for B=500 is already less than 1% • Further increase in buffer reduces loss rate to ≈ 0 • Large buffer does not increase queuing delays significantly

Summary • We derived a buffer sizing formula (BSCL) for congested links that mostly carry TCP traffic • Objectives: • Full utilization • Bounded loss rate • Minimum queuing delay, given previous two objectives • BSCL formula is applicable for links with more than 80-90% of traffic coming from large and locally bottlenecked TCP flows • BSCL accounts for the effects of heterogeneous RTTs and partial loss synchronization • Validated BSCL through simulations
