1 / 17

Using Commodity OSs for Time Sensitive Processing Experiences from a Study of Windows NT

Using Commodity OSs for Time Sensitive Processing Experiences from a Study of Windows NT. Mike Jones John Regehr Microsoft Research Redmond. OR… The Problems You’re Having May Not Be the Problems You Think You’re Having. Preliminary NT Latency Results.

yeshaya
Download Presentation

Using Commodity OSs for Time Sensitive Processing Experiences from a Study of Windows NT

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Using Commodity OSs for Time Sensitive ProcessingExperiences from a Study of Windows NT Mike Jones John Regehr Microsoft Research Redmond

  2. OR…The Problems You’re Having May Not Be the Problems You Think You’re Having

  3. Preliminary NT Latency Results • Typically can schedule tasks every small number of milliseconds • But ill-behaved drivers, hardware can take many milliseconds • Software delays of up to 16ms observed • Hardware delays of up to 30ms observed Results from NT 5, Pentium II-333

  4. Deferred Procedure Call Properties DPCs: • Are preempted by interrupts • Preempt normal threads • May not block • Are run in FIFO order Typical Uses: • I/O Completion Processing • Background Driver Housekeeping

  5. Non-Problem: Interrupts • Never observed interrupt handler taking substantial fraction of ms • Despite conventional wisdom • Interrupts needing substantial work queue DPCs

  6. Non-Problem: Ethernet Packet Processing • With continuous, back-to-back 100Mbit incoming packets of UDP or TCP data: • Longest observed DPC 600µs • Longest delay of user code ~2ms • Cards tested: • Intel EtherExpress Pro 100b • SMC EtherPower II 10/100 • Compaq Netelligent 10/100 Tx • DEC dc21x4 Fast 10/100 Ethernet

  7. Problem: “Unimportant” Background Work • DEC dc21x4 PCI Fast 10/100 Ethernet • 6ms periodic DPC every 5s • Autosense processing • Most of 6ms in five 0.88ms calls to routine that reads device register that: • Writes a HW register – 1.5µs • Stalls for 5µs • Writes HW register again – 1.5µs • Stalls for 5µs • Reads a HW register – 1.5µs • Stalls for 5µs • And does this 16 times! (once per bit)

  8. Another Long DPC: Intel EE 16 • Intel EtherExpress 16 ISA Ethernet • 17ms DPC every 10s • Card reset for no received packets Amusing Observation • Unplugging Ethernet makes latency worse! • Despite conventional wisdom to the contrary

  9. Even Worse: Video Cards • Video cards and drivers conspire to hog the PCI bus • Dragging large window locks out interrupts for up to 30ms • Obliterates sound I/O, for instance • Can set registry key to ask drivers to behave, but not default • No problem when set correctly • Manufacturers’ motivation: WinBench ~ 5% improvement

  10. Video CardMisbehavior Details • Don’t check if card FIFO full before write • Eliminates one PCI read • Stalls PCI bus if full to prevent overflow • Uses “PCI disconnect” feature

  11. Offending Video Cards • Those we observed: • AccelStar II AGP – 3D Labs Permedia 2 • Matrox Millenium II • Those we’ve read about: • Tseng Labs ET 6000 • Hercules Dynamite 128 • S3 cards • ATI – no option to fix! • Number 9 – no option to fix! • For more details: • http://www.zefiro.com/vgakills.txt • http://www.zdnet.com/pcmag/news/trends/t980619a.htm

  12. Example Bug:Timer Wakeup Time • MP HAL uses 976µs interrupt period using RTC interrupts on IRQ8 • (UP HAL uses 1003µs interrupt period using 8254 clock chip interrupts on IRQ1) • Multimedia timers compute absolute time for next wakeup • Converted to relative wakeup time and passed to kernel

  13. Timer Wakeup Bug Cont. • Bug: Wakeup computed in whole ms • Consequence: 1ms wakeup request w/ 976µs interrupts occurs 1.952 ms later. • Callback Timings: • 0ms, 1.952ms, 1.952ms, 3.904ms, 3.904ms, … • Fix: Compute wakeup in full 100ns time units

  14. Lesson Your Intuition About Performance is Wrong Only Measurement Reveals the Truth!

  15. Schedulers • Yes, NT could use a RT scheduler • Have done prototypes • But will be of limited value if unscheduled activities continue taking tens of milliseconds

  16. Bottom Line • NT developed, tested for throughput • Not small numbers of ms of latency • Latency for time dependent tasks typically < 1ms • But outliers can be delayed for tens of ms • Already fine for some tasks, e.g. Tiger • Others not OK • Improvement will require: • Systematic latency testing • Latency requirements specifications

  17. Now For the Fun! Tell Us Your War Stories!

More Related