Soft Timers: Efficient Microsecond Software Timer Support for Network Processing


Presentation Transcript


  1. Soft Timers: Efficient Microsecond Software Timer Support for Network Processing. MOHIT ARON and PETER DRUSCHEL, Rice University. Published in ACM Transactions on Computer Systems, vol. 18(3), pp. 197–228, 2000. Presented by Glenn Diviney

  2. What’s wrong with “Hard” timers? • Polling vs. Interrupts • Interrupts have high overhead but low latency • Polling has high latency but low overhead • Interruption is expensive: the CPU pipeline gets disrupted, and the cache and TLB get dirtied • Generally not significant, so long as interrupts arrive at millisecond intervals • Example: network interrupts can arrive at intervals of tens of microseconds • Gigabit Ethernet requires a packet transmission every 12 µs (1500 bytes each)! • This amounts to a significant burden on the system if a context switch is involved each time

  3. Interrupts • Device interrupts have low latency but high overhead due to the added context switching • The executing thread gets preempted • Interrupts can occur at inopportune times and slow down other work: cache pollution, TLB pollution, and pipeline flushes result in high indirect costs

  4. Polling • Polling has low overhead, but can have high latency depending on how frequently the poll runs • The OS’s timer granularity is limited by the frequency of its timer interrupts and by the overhead each interrupt incurs • The cache, TLB, and pipeline costs can be avoided if the polling is done at the right time

  5. What’s a Soft Timer? • “An operating system facility that allows efficient scheduling of software events at microsecond granularity.” • Takes advantage of states where handlers can be invoked at low cost: “Trigger States” • As in the case when the system is already context-switched to the kernel… why not see if other work can be done “while you’re in there?” • Schedule future events probabilistically

  6. How Soft Timers work: hardware • Pentiums usually ship with a programmable timer chip, which can be told how often to interrupt the CPU • These interrupts are usually assigned the highest priority in the OS, which can lead to TLB and cache misses • Testing measured a total cost of about 4.45 µs per interrupt on a 300 MHz web server, which is insignificant at ms intervals but terrible at 20 µs intervals (see the arithmetic below) • The timer chip is programmed to interrupt at ms intervals
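
To make the comparison concrete, here is the back-of-the-envelope overhead fraction implied by the 4.45 µs per-interrupt figure (my arithmetic, not a number from the paper):

$$
\frac{4.45\ \mu\text{s}}{1000\ \mu\text{s}} \approx 0.45\%\ \text{of the CPU at 1 ms intervals},
\qquad
\frac{4.45\ \mu\text{s}}{20\ \mu\text{s}} \approx 22\%\ \text{at 20 µs intervals}
$$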

  7. How Soft Timers work: software • At unpredictable intervals, the system will arrive at “trigger states” • End of a system call • End of an exception handler • End of an interrupt handler • CPU idle • In these states, invoking an event handler is just a function call’s worth of overhead • TLB and Cache are already “disturbed” due to the triggering event, so no additional cost should be incurred • In these states, the OS’s Soft Timer facility checks for any pending events without incurring the cost of the hardware timer • Checks the clock (usually a CPU register) and compares it to the scheduled time of the earliest soft timer event.
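
As an illustration of the check performed at each trigger state, here is a minimal sketch in C; the names (`soft_timer_check`, `read_cycle_counter`, `st_event`) are hypothetical and not the paper's actual FreeBSD code:

```c
/* Hypothetical sketch of the per-trigger-state check; not the paper's code. */
#include <stdint.h>
#include <stddef.h>

struct st_event {
    uint64_t expiry;                   /* scheduled time, in clock ticks        */
    void   (*handler)(void *arg);      /* callback to run when the time arrives */
    void    *arg;
    struct st_event *next;             /* pending events, sorted by expiry      */
};

static struct st_event *pending;       /* earliest event first                  */

extern uint64_t read_cycle_counter(void);  /* e.g. the CPU timestamp counter    */

/* Called at each trigger state: end of a system call, exception handler,
 * interrupt handler, or when the CPU goes idle.  When nothing is due, the
 * cost is roughly a function call plus one comparison. */
void soft_timer_check(void)
{
    uint64_t now = read_cycle_counter();

    while (pending != NULL && pending->expiry <= now) {
        struct st_event *ev = pending;
        pending = ev->next;
        ev->handler(ev->arg);          /* run the expired handler inline */
    }
}
```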

  8. The catch • Events might get delayed past their scheduled time • Only the hardware interrupt is guaranteed to happen (providing an upper bound on how late an event can fire) • Other trigger states appear as random events to the system, or may not happen at all between hardware interrupts

  9. Implementation • Soft timers provide the following operations • measure_resolution(): returns a 64-bit value representing the clock resolution in hertz • measure_time(): returns a 64-bit value representing the current time, whose resolution is given by measure_resolution() • schedule_soft_event(T, handler): schedules “handler” to run “T” ticks in the future • interrupt_clock_resolution(): returns the minimal resolution, which is that of the hardware timer interrupt • When invoked, the Soft Timer facility executes every handler whose scheduled time is at least one tick earlier than the value returned by measure_time() (see the usage sketch below)
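
A usage sketch of the interface listed above, assuming the slide's operation names and that T is expressed in ticks of the measured clock; the microsecond-conversion helper and `send_next_packet` are mine, added for illustration:

```c
/* Sketch only: assumes the interface named on the slide, with T in clock ticks. */
#include <stdint.h>

extern uint64_t measure_resolution(void);           /* clock frequency, in Hz     */
extern uint64_t measure_time(void);                 /* current time, in ticks     */
extern void     schedule_soft_event(uint64_t T, void (*handler)(void));
extern uint64_t interrupt_clock_resolution(void);   /* hardware timer resolution  */

static void send_next_packet(void)
{
    /* ... hand one packet to the network interface ... */
}

/* Schedule 'handler' to run roughly 'usec' microseconds from now. */
static void schedule_in_usec(uint64_t usec, void (*handler)(void))
{
    uint64_t ticks_per_usec = measure_resolution() / 1000000;  /* Hz -> ticks/µs */
    schedule_soft_event(usec * ticks_per_usec, handler);
}

/* Example: pace Gigabit Ethernet, one 1500-byte packet every ~12 µs:
 *     schedule_in_usec(12, send_next_packet);                                   */
```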

  10. Implementation (cont) • If X is the resolution of the hardware timer interrupt, the events are bounded by: • T < Actual Event Time < T + X + 1 • This is just a reassurance that the event will happen eventually • Generally, the assumption is that the event will happen as: Actual Event Time = T + d • “d” is the “random” extra delay until the next trigger state occurs

  11. Applications • Rate-based clocking • Recall the 12 µs transmission interval for gigabit Ethernet • The transmission rate becomes variable, but the protocol can maintain a running average of the “actual” rate and adjust the scheduling accordingly to achieve a target rate • Network polling • Pure polling reduces interrupts and their memory-access impact, but it can also add latency by delaying packet processing • Soft Timers are a perfect alternative to pure polling or to a hybrid hardware approach with a network-poll timer (a polling sketch follows below) • Soft Timers show latency close to interrupt-driven processing in the common case
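
A sketch of the network-polling idea from this slide, driven by a self-rescheduling soft-timer handler; the NIC-facing helpers (`nic_rx_pending`, `process_rx_ring`) and the interval constant are hypothetical:

```c
/* Hypothetical sketch of soft-timer-driven network polling; not the paper's code. */
#include <stdint.h>

#define POLL_INTERVAL_TICKS 1000u   /* e.g. ~10 µs if the clock runs at 100 MHz */

extern int  nic_rx_pending(void);          /* packets waiting in the receive ring?   */
extern void process_rx_ring(void);         /* drain the ring into the protocol stack */
extern void schedule_soft_event(uint64_t T, void (*handler)(void));

/* Instead of taking a device interrupt per packet, poll the interface from a
 * soft-timer handler that keeps rescheduling itself. */
static void net_poll(void)
{
    if (nic_rx_pending())
        process_rx_ring();

    schedule_soft_event(POLL_INTERVAL_TICKS, net_poll);
}
```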

  12. Base overhead test setup • FreeBSD was extended to include the Soft Timer facilities • The authors also added support for the on-chip APIC timer in addition to the already-supported 8253 off-chip timer • Connected “a number” of 300–500 MHz machines to a 100 Mbps network • One acted as a web server • The others repeatedly requested a 6 KB file until the web server was saturated

  13. Base overhead test results • Used a “null handler” to measure the per-timer event costs: • Of note: • The results suggest that the overhead does not scale with processor speed • Soft Timers caused no observable cost

  14. Base overhead test results • What about TLB and cache misses? • Touched 50 data cache lines • Touched 2 instruction cache lines on 2 separate pages • All lines touched were different each time, and events occurred at 10 µs and then 20 µs intervals • Results for events scheduled every 10 µs could not be obtained for the 8253-based timer due to the high overhead of that facility

  15. Base overhead test results • The earlier reasoning about Soft Timers reducing TLB and cache misses is confirmed • Data cache misses reduced by 20–31% • Instruction cache misses not reduced • The authors attribute this to only 2 instruction cache lines being touched • TLB misses reduced by 7–13%

  16. Different workload test setup • Intended to induce variation in when the trigger events occur, which is the Achilles’ heel of Soft Timers • Measured the distribution of times between successive trigger states for various workloads on a 300 MHz PII machine • Mean granularity is in the tens of µs, with less than 6% of intervals over 100 µs

  17. Stats on the distributions

  18. Rate-Based Clocking: Timer Overhead • Web server TCP implementation using Soft Timers vs. hardware timers • At 100 Mbps, a 1500-byte packet takes 120 µs to transmit, so the timer choice has no observable impact on the network • Therefore, the metric to isolate is the timer overhead; the possible benefits of rate-based clocking are not exposed • Cache/TLB pollution is 4–8% lower • Average time between transmissions is only slightly higher with Soft Timers • Huge reduction in overhead

  19. TCP: targeting average transmission interval • It was suggested that TCP could control transmission intervals by noting the average time since transmitting vs. the requested transmission interval, and adjusting the next Soft Timer interval accordingly (see the sketch below) • Two tests on a busy Apache web server (300 MHz PII): one with a target of 40 µs, the other with a target of 60 µs • In most cases the target rate was hit, although with more deviation than the same rate with hardware timers • At a line speed allowing a packet every 12 µs: • For the 60 µs target, the Soft Timer transmit interval was 60 µs with a std dev of 35.9, vs. the hardware timer at 63 µs with a std dev of 27.7 • For the 40 µs target, the Soft Timer transmit interval was 40 µs with a std dev of 34.5, vs. the hardware timer at 43.6 µs with a std dev of 26.8 • The extra delay in the hardware-timer intervals is attributed to interrupt disabling in FreeBSD
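
A sketch of the interval-adjustment idea described on this slide; the variable names are mine, and the slide only describes the adjustment informally:

```c
/* Hypothetical sketch: keep the *average* transmit interval near a target even
 * though each soft-timer event fires with some extra random delay d. */
#include <stdint.h>

extern uint64_t measure_time(void);
extern void     schedule_soft_event(uint64_t T, void (*handler)(void));
extern void     transmit_packet(void);

static uint64_t target_interval;   /* desired average interval, in clock ticks */
static uint64_t next_deadline;     /* ideal time of the next transmission      */

static void paced_transmit(void)
{
    transmit_packet();

    /* Advance the ideal schedule by exactly one interval, then request only the
     * time remaining.  If this event fired late (d > 0), the next request is
     * correspondingly shorter, so the long-run average stays near the target. */
    next_deadline += target_interval;

    uint64_t now  = measure_time();
    uint64_t wait = (next_deadline > now) ? (next_deadline - now) : 0;
    schedule_soft_event(wait, paced_transmit);
}
```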

  20. Network Performance • Substantial improvements in response time and throughput with rate-based clocking

  21. Network polling • Significant improvements across the board with Soft Timers

  22. Using the on-chip Timer (APIC) … defeats “the catch” • Used to shorten the tail on the event-time distribution • This timer can be scheduled and cancelled at a very low cost • Invoked when a deadline is specified while scheduling the next Soft Timer event. • This is used to provide an upper bound on execution with low overhead because it gets cancelled when the Soft Timer “beats it to the punch”
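
A sketch of combining the soft timer with an APIC one-shot backstop, as described above; `apic_oneshot_arm` and `apic_oneshot_cancel` are hypothetical names for the on-chip timer operations:

```c
/* Hypothetical sketch: soft timer with an APIC one-shot as a deadline backstop. */
#include <stdint.h>

extern void schedule_soft_event(uint64_t T, void (*handler)(void));
extern void apic_oneshot_arm(uint64_t ticks, void (*handler)(void)); /* hard deadline */
extern void apic_oneshot_cancel(void);

static void (*user_handler)(void);
static int   fired;

static void fire_once(void)
{
    if (fired)                       /* whichever path loses becomes a no-op       */
        return;
    fired = 1;
    apic_oneshot_cancel();           /* harmless if the APIC interrupt already ran */
    user_handler();
}

/* Schedule 'handler' T ticks out; it normally runs cheaply at a trigger state,
 * but the APIC one-shot guarantees it runs no later than 'deadline' ticks out. */
void schedule_with_deadline(uint64_t T, uint64_t deadline, void (*handler)(void))
{
    user_handler = handler;
    fired        = 0;
    schedule_soft_event(T, fire_once);
    apic_oneshot_arm(deadline, fire_once);
}
```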

  23. Conclusions • Soft timers allow for higher granularity and lower overhead than hardware timers • Their useful range lies between the highest granularity of the hardware timer and the Soft Timer trigger interval (~10 µs to ~100 µs on 300–500 MHz CPUs) • The useful range appears to widen roughly linearly as CPUs get faster • They should be used for events requiring this kind of granularity, provided those events can tolerate probabilistic delays • They can be integrated with the on-chip APIC to provide fine-grained events with tight deadlines and low overhead • When restricted to the appropriate class of problems, they consistently improve performance

  24. Q/A
