Combining Events and Threads for Scalable Network Services. Peng Li and Steve Zdancewic, University of Pennsylvania. PLDI 2007, San Diego.

Combining Events and Threads for Scalable Network Services

Peng Li and Steve Zdancewic

University of Pennsylvania

PLDI 2007, San Diego


Overview

  • A Haskell framework for massively concurrent network applications
    • Haskell: a lazy, purely functional programming language
    • Servers, P2P systems, load generators
  • Massive concurrency ::=
        1,000 threads? (easy)
      | 10,000 threads? (common)
      | 100,000 threads? (challenging)
      | 1,000,000 threads? (20 years later?)
      | 10,000,000 threads? (in 15 minutes)

  • How to write such programs?
    • The very first decision to make: the programming model

Shall we use threads or events?

Threads vs. Events
The multithreaded model

One thread ↔ one client

Synchronous I/O

Scheduling: OS/runtime libs

int send_data(int fd1, int fd2) {
    while (!EOF(fd1)) {
        size = read_chunk(fd1, buf, count);   /* blocks until data arrives */
        write_chunk(fd2, buf, size);          /* blocks until data is sent */
    }
    return 0;
}


The event-driven model:

One thread ↔ 10000 clients

Asynchronous I/O

Scheduling: programmer

while (1) {
    nfds = epoll_wait(kdpfd, events, MAXEVT, -1);
    for (n = 0; n < nfds; ++n)
        ...
}


“Why events are a bad idea (for high-concurrency servers)” [HotOS 2003]

“Why threads are a bad idea (for most purposes)” [USENIX 1996]



Can we get the best of both worlds?

One application program

  • Programming with each client: threads
    • Synchronous I/O
    • Intuitive control-flow primitives

The bridge between threads/events?

(some kind of “continuation” support)

  • Resource scheduling: events
    • Written as part of the application
    • Tailored to application’s needs
Roads to lightweight, application-level concurrency
  • Direct language support for continuations:
    • Good if you have them
  • Source-to-source CPS translations
    • Requires hacking on compiler/runtime
    • Often not very elegant
  • Other solutions?
    • (no language support)
    • (no compiler/runtime hacks)
The poor man’s concurrency monad
  • “A poor man’s concurrency monad” by Koen Claessen, JFP 1999. (Functional Pearl)
    • The thread interface:
      • The CPS monad
    • The event interface:
      • A lazy, tree-like data structure called “trace”
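The two interfaces above can be sketched in a few lines. This is a minimal illustration in the spirit of Claessen's pearl, not the framework's actual code: the constructor names (SysNbio, SysFork, SysYield, SysRet), the helper names, and the list-based round-robin scheduler are all assumptions made for the example. Thread code is written in do-notation against the CPS monad M, and the scheduler only ever sees the lazy Trace that the thread unfolds into.

```haskell
import Data.IORef (IORef, newIORef, modifyIORef, readIORef)

-- The event interface: a lazy tree of system calls (the "trace").
-- Constructor names are illustrative assumptions.
data Trace
  = SysNbio (IO Trace)    -- run a nonblocking I/O action, then continue
  | SysFork Trace Trace   -- child trace, parent continuation
  | SysYield Trace        -- give other threads a turn
  | SysRet                -- thread has finished

-- The thread interface: a CPS monad whose final answer is a Trace.
newtype M a = M { runM :: (a -> Trace) -> Trace }

instance Functor M where
  fmap f (M g) = M (\k -> g (k . f))

instance Applicative M where
  pure x = M (\k -> k x)
  M f <*> M g = M (\k -> f (\h -> g (k . h)))

instance Monad M where
  M g >>= f = M (\k -> g (\x -> runM (f x) k))

-- "System calls" as seen by thread code.
sysNbio :: IO a -> M a
sysNbio io = M (\k -> SysNbio (fmap k io))

sysFork :: M () -> M ()
sysFork child = M (\k -> SysFork (buildTrace child) (k ()))

sysYield :: M ()
sysYield = M (\k -> SysYield (k ()))

-- Turn a thread into its trace by supplying the final continuation.
buildTrace :: M a -> Trace
buildTrace (M g) = g (const SysRet)

-- A round-robin scheduler written as an ordinary event loop.
-- A plain list is used as the ready queue to keep the sketch short.
roundRobin :: [Trace] -> IO ()
roundRobin [] = return ()
roundRobin (t : ts) = case t of
  SysNbio io         -> io >>= \t' -> roundRobin (t' : ts)
  SysFork child next -> roundRobin (ts ++ [child, next])
  SysYield t'        -> roundRobin (ts ++ [t'])
  SysRet             -> roundRobin ts

-- Thread-style code: log a line, then yield, n times.
worker :: IORef [String] -> String -> Int -> M ()
worker logRef name n = mapM_ step [1 .. n]
  where
    step i = do
      sysNbio (modifyIORef logRef ((name ++ show i) :))
      sysYield

-- Run two interleaved threads and return the lines in logging order.
demoLog :: IO [String]
demoLog = do
  logRef <- newIORef []
  roundRobin [buildTrace (sysFork (worker logRef "a" 2) >> worker logRef "b" 2)]
  fmap reverse (readIORef logRef)
```

Evaluating demoLog in GHCi shows the two workers taking turns at each yield. Swapping roundRobin for a different traversal of the trace changes the scheduling policy without touching the thread code.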


Questions on the poor man’s approach

Does it work for high-performance network services?

(using a pure, lazy, functional language?)

  • How does the design scale up to real systems?
    • Symmetric multiprocessing? Synchronization? I/O?
  • How cheap is it?
    • How much does a poor man’s thread cost?
  • How poor is it?
    • Does it offer acceptable performance?
Our experiment

A high-performance Haskell framework for massively concurrent network services!

  • Supported features:
    • Linux Asynchronous IO (AIO)
    • epoll() and nonblocking IO
    • OS thread pools
    • SMP support
    • Thread synchronization primitives
  • Applications developed
    • IO benchmarks on FIFO pipes / Disk head scheduling
    • A simple web server for static files
    • HTTP load generator
    • Prototype of an application-level TCP stack

We used the Glasgow Haskell Compiler (GHC)

Multithreaded code example

Nested function calls

Exception handling

Conditional branches

Synchronous call to I/O lib


Event-driven code example

A wrapper function to the C library call using the Haskell Foreign Function Interface (FFI)

An event loop running in a separate OS thread

Put events in queues for processing in other OS threads
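The three annotations above can be illustrated with a small, hedged sketch; the slide's own code is not reproduced in this transcript, so everything here is an assumption. The wrapper imports the C library's poll and calls it with no descriptors, so it only waits for the timeout; a real event loop would more likely wrap epoll_wait. The Event type and the names eventLoop and demoEvents are hypothetical, and the nfds_t argument is assumed to be an unsigned long, as on Linux.

```haskell
import Control.Concurrent (forkIO, forkOS, rtsSupportsBoundThreads)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forM_, replicateM)
import Foreign.C.Types (CInt (..), CULong (..))
import Foreign.Ptr (Ptr, nullPtr)

-- A wrapper for a C library call, imported through the FFI.
-- Marked "safe" because the call may block inside C.
foreign import ccall safe "poll"
  c_poll :: Ptr () -> CULong -> CInt -> IO CInt

-- Hypothetical event type for this sketch.
data Event = Tick Int
  deriving (Eq, Show)

-- The event loop: block in the C call, then put an event in the queue
-- so that other threads can process it.
eventLoop :: Chan Event -> Int -> IO ()
eventLoop queue n = forM_ [1 .. n] $ \i -> do
  _ <- c_poll nullPtr 0 1   -- no descriptors: waits for the 1 ms timeout
  writeChan queue (Tick i)

-- Start the loop in its own OS thread when the runtime supports bound
-- threads (programs linked with -threaded), otherwise in a green thread.
demoEvents :: Int -> IO [Event]
demoEvents n = do
  queue <- newChan
  let fork = if rtsSupportsBoundThreads then forkOS else forkIO
  _ <- fork (eventLoop queue n)
  replicateM n (readChan queue)
```

Calling demoEvents 3 returns the three queued events in the order the loop produced them. The consumer never touches the C call directly, which is what lets the loop be replaced or duplicated per CPU.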

A complete event-driven I/O subsystem

One “virtual processor” event loop for each CPU

Haskell Foreign Function Interface (FFI)

Each event loop runs in a separate OS thread

Modular and customizable I/O system (add a TCP stack if you like)

Define / interpret TCP syscalls (22 lines)

Event loop for incoming packets (7 lines)

Event loop for timers (9 lines)
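The slide's TCP code is not included in this transcript, but the pattern it describes (define a new system call, then interpret it in an event loop) can be sketched with a much smaller example. Here a hypothetical SysSleep call is added to the trace and a timer loop resumes sleeping threads against a simulated clock. All names are assumptions for illustration, and the clock is a plain Int rather than real time, so the result is deterministic.

```haskell
import Data.List (insertBy)
import Data.Ord (comparing)

-- Trace with a hypothetical timer system call added.
data Trace
  = SysLog String Trace   -- emit one line of output, then continue
  | SysSleep Int Trace    -- resume the continuation after n ticks
  | SysRet                -- thread has finished

newtype M a = M { runM :: (a -> Trace) -> Trace }

instance Functor M where
  fmap f (M g) = M (\k -> g (k . f))

instance Applicative M where
  pure x = M (\k -> k x)
  M f <*> M g = M (\k -> f (\h -> g (k . h)))

instance Monad M where
  M g >>= f = M (\k -> g (\x -> runM (f x) k))

-- Step 1: define the system calls that thread code may use.
sysLog :: String -> M ()
sysLog s = M (\k -> SysLog s (k ()))

sysSleep :: Int -> M ()
sysSleep n = M (\k -> SysSleep n (k ()))

-- Step 2: interpret them. Ready threads run first; when none are
-- ready, the timer loop jumps the clock to the earliest deadline.
run :: [M ()] -> [String]
run ms = go 0 (map build ms) []
  where
    build (M g) = g (const SysRet)

    go :: Int -> [Trace] -> [(Int, Trace)] -> [String]
    go _ [] [] = []
    go _ [] ((t, th) : timers) = go t [th] timers
    go now (th : ready) timers = case th of
      SysLog s k   -> (show now ++ ": " ++ s) : go now (k : ready) timers
      SysSleep n k -> go now ready (insertBy (comparing fst) (now + n, k) timers)
      SysRet       -> go now ready timers

-- Thread-style client code using the new call.
client :: String -> Int -> M ()
client name delay = do
  sysLog (name ++ " start")
  sysSleep delay
  sysLog (name ++ " done")

demoTimers :: [String]
demoTimers = run [client "slow" 30, client "fast" 10]
```

The client code reads as straight-line blocking code, while the sleep is implemented entirely by the interpreter. Adding another device, such as a packet queue, would follow the same two steps.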

How cheap is a poor man’s thread?

48 bytes

  • Minimal memory consumption: 48 bytes
    • Each thread just loops and does nothing
  • Actual size determined by thread-local states
    • Even an Ethernet packet can be >1,000 bytes…
    • Pay as you go: only pay for what is needed

In contrast:

    • A Linux POSIX thread’s stack is 2 MB by default
    • The state-of-the-art user-level thread system (Capriccio) uses at least a few KB for each thread


The poor man’s thread is extremely memory-efficient

(Challenging most event-driven systems)

I/O scalability test
  • Comparison against the Linux POSIX Thread Library (NPTL)
    • Highly optimized OS thread implementation
    • Each NPTL thread’s stack limited to 32KB
  • Mini-benchmarks used:
    • Disk head scheduling (all threads running)
    • FIFO pipe scalability with idle threads (128 threads running)
How poor is the poor man’s monad?
  • Not too shabby
    • Benchmarks show performance comparable to (if not higher than) existing, optimized systems
  • An elegant design is more important than a 10% performance improvement
  • Added benefit: type safety for many dangerous things
    • Continuations, thread queues, schedulers, asynchronous I/O
Related Work
  • We are motivated by two projects:
    • Twisted: the Python event-driven framework for scalable internet applications

- The programmer must write code in CPS

    • Capriccio: a high-performance user-level thread system for network servers

- Requires C compiler hacks

- Difficult to customize (e.g. adding SMP support)

  • Continuation-based concurrency
    • [Wand 80], [Shivers 97], …
  • Other languages and programming models:
    • CML, Erlang, …
  • Haskell and The Poor Man’s Concurrency Monad are a promising solution for high-performance, massively-concurrent networking applications:

Get the best of both threads and events!

  • This poor man’s approach is actually very cheap, and not so poor!