Combining Events and Threads for Scalable Network Services. Peng Li and Steve Zdancewic, University of Pennsylvania. PLDI 2007, San Diego.

Combining Events and Threads for Scalable Network Services

Peng Li and Steve Zdancewic

University of Pennsylvania

PLDI 2007, San Diego


Overview

(Haskell: a lazy, purely functional programming language)

  • A Haskell framework for massively concurrent network applications
    • Servers, P2P systems, load generators
  • Massive concurrency ::=
        1,000 threads? (easy)
      | 10,000 threads? (common)
      | 100,000 threads? (challenging)
      | 1,000,000 threads? (20 years later?)
      | 10,000,000 threads? (in 15 minutes)

  • How to write such programs?
    • The very first decision to make: the programming model

Shall we use threads or events?

Threads vs. Events
The multithreaded model

One thread ↔ one client

Synchronous I/O

Scheduling: OS/runtime libs

    int send_data(int fd1, int fd2) {
        /* Copy fd1 to fd2 chunk by chunk; each call blocks only this thread. */
        while (!EOF(fd1)) {
            size = read_chunk(fd1, buf, count);
            write_chunk(fd2, buf, size);
        }
    }

The event-driven model:

One thread ↔ 10000 clients

Asynchronous I/O

Scheduling: programmer

    while (1) {
        nfds = epoll_wait(kdpfd, events, MAXEVT, -1);
        for (n = 0; n < nfds; ++n)
            ...  /* per-event handling is not shown on the slide */
    }


“Why events are a bad idea (for high-concurrency servers)” [HotOS 2003]

“Why threads are a bad idea (for most purposes)”



Can we get the best of both worlds?

One application program

  • Programming with each client: threads
    • Synchronous I/O
    • Intuitive control-flow primitives

The bridge between threads/events?

(some kind of “continuation” support)

  • Resource scheduling: events
    • Written as part of the application
    • Tailored to application’s needs
Roads to lightweight, application-level concurrency
  • Direct language support for continuations:
    • Good if you have them
  • Source-to-source CPS translations
    • Requires hacking on compiler/runtime
    • Often not very elegant
  • Other solutions?
    • (no language support)
    • (no compiler/runtime hacks)
The poor man’s concurrency monad
  • “A poor man’s concurrency monad” by Koen Claessen, JFP 1999. (Functional Pearl)
    • The thread interface:
      • The CPS monad
    • The event interface:
      • A lazy, tree-like data structure called “trace”
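The two interfaces can be sketched in a few lines of Haskell. This is a minimal illustration in the spirit of Claessen's pearl, not the framework's actual code: the names `Trace`, `M`, `buildTrace`, `sysYield`, `sysFork`, `sysPrint`, and `roundRobin` are assumptions made for this example, and the scheduler is a pure toy that only collects printed strings.

```haskell
-- Event interface: a lazy tree describing what a thread wants to do next.
-- (Constructor names are illustrative assumptions.)
data Trace
  = SysRet                  -- the thread has finished
  | SysYield Trace          -- let other threads run, then continue
  | SysFork Trace Trace     -- child thread, then the parent's continuation
  | SysPrint String Trace   -- an effect requested from the scheduler

-- Thread interface: a CPS monad whose final answer is a Trace.
newtype M a = M { runM :: (a -> Trace) -> Trace }

instance Functor M where
  fmap f (M g) = M (\k -> g (k . f))

instance Applicative M where
  pure x = M (\k -> k x)
  M f <*> M g = M (\k -> f (\h -> g (k . h)))

instance Monad M where
  M g >>= f = M (\k -> g (\a -> runM (f a) k))

-- Turn a monadic thread into its trace by supplying the final continuation.
buildTrace :: M a -> Trace
buildTrace (M g) = g (const SysRet)

sysYield :: M ()
sysYield = M (\k -> SysYield (k ()))

sysPrint :: String -> M ()
sysPrint s = M (\k -> SysPrint s (k ()))

sysFork :: M () -> M ()
sysFork child = M (\k -> SysFork (buildTrace child) (k ()))

-- A toy round-robin scheduler: it walks the traces in a queue and
-- returns the strings printed, in scheduling order.
roundRobin :: [Trace] -> [String]
roundRobin [] = []
roundRobin (t : ts) = case t of
  SysRet       -> roundRobin ts
  SysYield k   -> roundRobin (ts ++ [k])
  SysFork c k  -> roundRobin (ts ++ [c, k])
  SysPrint s k -> s : roundRobin (k : ts)

-- Thread code is written in ordinary do-notation.
worker :: String -> M ()
worker name = do
  sysPrint (name ++ "1")
  sysYield
  sysPrint (name ++ "2")

demo :: [String]
demo = roundRobin [buildTrace (sysFork (worker "a") >> worker "b")]
```

Evaluating `demo` in GHCi gives `["a1","b1","a2","b2"]`: each `sysYield` hands control back to the scheduler, which is just a function over the trace tree.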


Questions on the poor man’s approach

Does it work for high-performance network services?

(using a pure, lazy, functional language?)

  • How does the design scale up to real systems?
    • Symmetrical multiprocessing? Synchronization? I/O?
  • How cheap is it?
    • How much does a poor man’s thread cost?
  • How poor is it?
    • Does it offer acceptable performance?
Our experiment

A high-performance Haskell framework for massively concurrent network services!

  • Supported features:
    • Linux Asynchronous IO (AIO)
    • epoll() and nonblocking IO
    • OS thread pools
    • SMP support
    • Thread synchronization primitives
  • Applications developed
    • IO benchmarks on FIFO pipes / Disk head scheduling
    • A simple web server for static files
    • HTTP load generator
    • Prototype of an application-level TCP stack

We used the Glasgow Haskell Compiler (GHC)

Multithreaded code example

Nested function calls

Exception handling

Conditional branches

Synchronous call to I/O lib
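The code that these callouts annotated is not part of the transcript. As a rough stand-in, the sketch below shows thread-style code with all four features on top of a toy CPS monad. Everything in it is an assumption made for illustration (`sysRead`, `sysWrite`, `sysThrow`, `sysCatch`, the `Trace` constructors, and the pure interpreter `interp` that feeds input chunks from a list); it is not the framework's real API.

```haskell
-- Trace nodes for I/O and exceptions (illustrative names only).
data Trace
  = SysRet
  | SysRead (Maybe String -> Trace)         -- Nothing signals end of input
  | SysWrite String Trace
  | SysThrow String
  | SysCatch Trace (String -> Trace) Trace  -- body, handler, normal continuation

newtype M a = M { runM :: (a -> Trace) -> Trace }

instance Functor M where
  fmap f (M g) = M (\k -> g (k . f))
instance Applicative M where
  pure x = M (\k -> k x)
  M f <*> M g = M (\k -> f (\h -> g (k . h)))
instance Monad M where
  M g >>= f = M (\k -> g (\a -> runM (f a) k))

buildTrace :: M a -> Trace
buildTrace (M g) = g (const SysRet)

sysRead :: M (Maybe String)
sysRead = M SysRead

sysWrite :: String -> M ()
sysWrite s = M (\k -> SysWrite s (k ()))

sysThrow :: String -> M a
sysThrow e = M (\_ -> SysThrow e)

-- The handler is given the same continuation as the normal path.
sysCatch :: M () -> (String -> M ()) -> M ()
sysCatch body handler =
  M (\k -> SysCatch (buildTrace body) (\e -> runM (handler e) k) (k ()))

-- Thread-style code ------------------------------------------------------

copyChunk :: String -> M ()
copyChunk c
  | null c    = sysThrow "empty chunk"   -- conditional branch
  | otherwise = sysWrite c

sendData :: M ()
sendData = do
  r <- sysRead                           -- reads like a synchronous call
  case r of
    Nothing -> return ()
    Just c  -> copyChunk c >> sendData   -- nested function call

session :: M ()
session = do
  sysCatch sendData (\e -> sysWrite ("error: " ++ e))  -- exception handling
  sysWrite "closed"

-- A pure interpreter: input chunks in, written strings out.
-- The second argument is the stack of active (handler, continuation) frames.
interp :: [String] -> [(String -> Trace, Trace)] -> Trace -> [String]
interp inp fs t = case (t, fs) of
  (SysRet, [])                -> []
  (SysRet, (_, k) : rest)     -> interp inp rest k
  (SysRead k, _)              -> case inp of
                                   []     -> interp [] fs (k Nothing)
                                   c : cs -> interp cs fs (k (Just c))
  (SysWrite s k, _)           -> s : interp inp fs k
  (SysCatch b h k, _)         -> interp inp ((h, k) : fs) b
  (SysThrow e, [])            -> ["uncaught: " ++ e]
  (SysThrow e, (h, _) : rest) -> interp inp rest (h e)

runSession :: [String] -> [String]
runSession inp = interp inp [] (buildTrace session)
```

In GHCi, `runSession ["a", "b"]` yields `["a","b","closed"]`, while `runSession ["a", "", "b"]` yields `["a","error: empty chunk","closed"]`. The thread body never mentions continuations; the monad threads them through behind the do-notation.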


Event-driven code example

A wrapper function to the C library call using the Haskell Foreign Function Interface (FFI)

An event loop running in a separate OS thread

Put events in queues for processing in other OS threads
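The three pieces named above can be sketched as follows. This is a hedged toy, not the framework's code: `strlen` stands in for a real blocking call such as `epoll_wait`, `forkIO` is used where the real system would use a dedicated OS thread, and the names `Trace`, `SysCall`, `ioLoop`, `workerLoop`, and `client` are assumptions made for this example.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)
import Control.Monad (forever)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Foreign.C.String (CString, withCString)
import Foreign.C.Types (CSize)

-- FFI wrapper around a C library call. strlen is only a harmless
-- stand-in here; a real system would wrap a call such as epoll_wait.
foreign import ccall unsafe "string.h strlen"
  c_strlen :: CString -> IO CSize

data Trace
  = SysRet
  | SysLog String Trace
  | SysCall String (Int -> Trace)  -- request served by the I/O event loop

type Request = (String, Int -> Trace)

-- Event loop for "I/O": serves one request at a time and puts the
-- resumed thread back on the ready queue.
ioLoop :: Chan Request -> Chan Trace -> IO ()
ioLoop ioQ readyQ = forever $ do
  (arg, k) <- readChan ioQ
  n <- withCString arg c_strlen
  writeChan readyQ (k (fromIntegral n))

-- Worker loop: runs ready threads until `live0` of them have returned,
-- then reports the log lines in the order they were produced.
workerLoop :: Chan Request -> Chan Trace -> Int -> IO [String]
workerLoop ioQ readyQ live0 = do
  out <- newIORef []
  let loop :: Int -> IO ()
      loop 0    = return ()
      loop live = readChan readyQ >>= step live
      step :: Int -> Trace -> IO ()
      step live t = case t of
        SysRet      -> loop (live - 1)
        SysLog s k  -> modifyIORef' out (s :) >> step live k
        SysCall a k -> writeChan ioQ (a, k) >> loop live
  loop live0
  reverse <$> readIORef out

-- A client written directly as a trace: two calls, each followed by a log.
client :: String -> Trace
client name =
  SysCall name $ \n ->
  SysLog (name ++ " has length " ++ show n) $
  SysCall (name ++ name) $ \m ->
  SysLog (name ++ name ++ " has length " ++ show m) SysRet

demo :: IO [String]
demo = do
  ioQ    <- newChan
  readyQ <- newChan
  _ <- forkIO (ioLoop ioQ readyQ)   -- the event loop gets its own thread
  mapM_ (writeChan readyQ . client) ["ab", "xyz"]
  workerLoop ioQ readyQ 2
```

Because both queues are FIFO with a single producer and a single consumer, `demo` returns `["ab has length 2","xyz has length 3","abab has length 4","xyzxyz has length 6"]`: the two clients interleave at each `SysCall`, even though neither was written with explicit callbacks registered by hand.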

A complete event-driven I/O subsystem

One “virtual processor” event loop for each CPU

Haskell Foreign Function Interface (FFI)

Each event loop runs in a separate OS thread

Modular and customizable I/O system (add a TCP stack if you like)

Define / interpret TCP syscalls (22 lines)

Event loop for incoming packets (7 lines)

Event loop for timers (9 lines)

How cheap is a poor man’s thread?

48 bytes

  • Minimal memory consumption: 48 bytes
    • Each thread just loops and does nothing
  • Actual size determined by thread-local states
    • Even an Ethernet packet can be >1,000 bytes…
    • Pay as you go: only pay for the things you need

In contrast:

    • A Linux POSIX thread’s stack is 2 MB by default
    • The state-of-the-art user-level thread system (Capriccio) uses at least a few KB for each thread


The poor man’s thread is extremely memory-efficient

(Challenging most event-driven systems)

I/O scalability test
  • Comparison against the Linux POSIX Thread Library (NPTL)
    • Highly optimized OS thread implementation
    • Each NPTL thread’s stack limited to 32KB
  • Mini-benchmarks used:
    • Disk head scheduling (all threads running)
    • FIFO pipe scalability with idle threads (128 threads running)
How poor is the poor man’s monad?
  • Not too shabby
    • Benchmarks show comparable (if not higher) performance to existing, optimized systems
  • An elegant design is more important than 10% performance improvement
  • Added benefit: type safety for many dangerous things
    • Continuations, thread queues, schedulers, asynchronous I/O
Related Work
  • We are motivated by two projects:
    • Twisted: the Python event-driven framework for scalable internet applications
      • The programmer must write code in CPS
    • Capriccio: a high-performance user-level thread system for network servers
      • Requires C compiler hacks
      • Difficult to customize (e.g. adding SMP support)

  • Continuation-based concurrency
    • [Wand 80], [Shivers 97], …
  • Other languages and programming models:
    • CML, Erlang, …
  • Haskell and The Poor Man’s Concurrency Monad are a promising solution for high-performance, massively-concurrent networking applications:

Get the best of both threads and events!

  • This poor man’s approach is actually very cheap, and not so poor!