Concurrency Checking with CHESS: Learning from Experience

Tom Ball, Sebastian Burckhardt, Chris Dern, Madan Musuvathi, Shaz Qadeer


Outline

  • What is CHESS?

    • a testing tool, plus

    • a test methodology (concurrency unit tests)

    • a platform for research and teaching

  • CHESS design decisions

  • Learnings from CHESS user forum, champions


What is CHESS?

  • CHESS is a user-mode scheduler

  • Controls all scheduling nondeterminism

    • “Hijacks” scheduling control from the OS

  • Guarantees:

    • Every run takes a different thread schedule

    • The schedule of every run can be reproduced


Concurrency Unit Tests

“Generally, in our test environment, we want to test what we call scenarios.  A scenario might be a specific feature or API usage.  In my case I am trying to test the scenario of a user canceling a command execution on a different thread.”

Steve Hale, Microsoft 


A Concurrency Unit Test Pattern: Fork-Join

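void ForkJoinTest() {
    var t1 = new Thread(() => { S1 });   // S1, S2: the two operations under test
    var t2 = new Thread(() => { S2 });
    t1.Start(); t2.Start();              // fork
    t1.Join(); t2.Join();                // join
    Debug.Assert(...);                   // check the expected property after both threads finish
}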
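
A concrete instance of the fork-join pattern might look like the sketch below. The `Account` class, the deposit amounts, and the method names are hypothetical stand-ins for S1, S2, and the assertion; they are not taken from the talk or from CHESS.

```csharp
using System;
using System.Threading;

// Hypothetical class under test; not part of CHESS or the original slides.
public class Account
{
    private readonly object gate = new object();
    private int balance;

    public void Deposit(int amount)
    {
        lock (gate) { balance += amount; }
    }

    public int Balance
    {
        get { lock (gate) { return balance; } }
    }
}

public static class ForkJoinExample
{
    // Fork two threads, join both, then return the state the test asserts on.
    public static int RunOnce()
    {
        var account = new Account();
        var t1 = new Thread(() => account.Deposit(10));   // plays the role of S1
        var t2 = new Thread(() => account.Deposit(20));   // plays the role of S2
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        return account.Balance;
    }

    public static void Main()
    {
        int result = RunOnce();
        if (result != 30) throw new Exception("invariant violated: " + result);
        Console.WriteLine("balance = " + result);
    }
}
```

The test stays small on purpose: two threads, one operation each, and a single invariant checked after the join.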


Concurrency Unit Tests

  • Small scope hypothesis

    • For most bugs, there exists a short-running scenario with only a few threads that can find it

  • Unit tests provide

    • Better coverage of schedules

    • Easier debugging, regression, etc.


CHESS as Research/Teaching Platform

http://research.microsoft.com/chess/

  • Source code release

    • chesstool.codeplex.com

  • Courseware with CHESS

    • Practical Parallel and Concurrent Programming

    • coming this fall!

  • Preemption bounding [PLDI07]

    • speed search for bugs

    • simple counterexamples

  • Fair stateless exploration [PLDI08]

    • scales to large programs

  • Architecture [OSDI08]

    • Tasks and SyncVars

    • API wrappers

  • Store buffer simulation [CAV08]

  • Preemption sealing [TACAS10]

    • orthogonal to preemption bounding

    • where (not) to search for bugs

  • Best-first search [PPoPP10]

  • Automatic linearizability checking [PLDI10]

  • More features

    • Data race detection

    • Partial order reduction

    • More monitors…


CHESS Design Decisions

  • Stateless state space exploration

  • No change to underlying scheduler

  • Ability to enumerate all/only feasible schedules

  • Schedule points = synchronization points; race detection makes up the difference

  • Serialize concurrent behavior

  • Suite of search/reduction strategies

    • preemption bounding, sealing

    • best-first search

  • Monitor API to easily add new checking capability


Stateless model checking [Verisoft]

  • Given a program with an acyclic state space

  • Systematically enumerate all paths

  • Don’t capture program states

    • Not necessary for termination

    • Precisely capturing states is hard and expensive

  • At the cost of potentially revisiting states

    • Partial-order reduction alleviates redundant exploration
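
The idea of enumerating all paths without capturing states can be sketched as follows. This is a minimal illustration, not the CHESS engine: the explorer remembers only the sequence of choices made, and re-executes the program from the start for every path. The `ExploreAll` name and the `choose` callback are assumptions made for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StatelessExplorer
{
    // Explores every path of a terminating (acyclic) program without storing
    // program states. The program calls choose(n) whenever it faces n
    // alternatives. Returns the number of paths executed.
    public static int ExploreAll(Action<Func<int, int>> program)
    {
        var path = new List<int[]>();   // each entry: { chosen index, number of options }
        int runs = 0;
        while (true)
        {
            int depth = 0;
            program(options =>
            {
                // New choice point: start with option 0. Otherwise replay the record.
                if (depth == path.Count) path.Add(new[] { 0, options });
                return path[depth++][0];
            });
            runs++;

            // Backtrack: drop exhausted choice points, then advance the deepest one left.
            while (path.Count > 0 && path[path.Count - 1][0] + 1 == path[path.Count - 1][1])
                path.RemoveAt(path.Count - 1);
            if (path.Count == 0) return runs;
            path[path.Count - 1][0]++;
        }
    }

    public static void Main()
    {
        // Toy program: two binary choices followed by one three-way choice.
        int runs = ExploreAll(choose =>
        {
            string trace = "" + choose(2) + choose(2) + choose(3);
            Console.WriteLine(trace);
        });
        Console.WriteLine(runs + " paths explored");
    }
}
```

Because nothing but the choice sequence is kept, the same program state can be reached and re-explored along different paths, which is the cost the slide mentions.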


CHESS architecture

[Figure: architecture diagram. Labeled boxes: Unmanaged Program, Win32 Wrappers, Windows; Managed Program, .NET Wrappers, CLR; CHESS Scheduler; CHESS Exploration Engine]

  • Capture scheduling nondeterminism

  • Drive the program along an interleaving of choice

Running Example

Thread 1:
    Lock(l);
    bal += x;
    Unlock(l);

Thread 2:
    Lock(l);
    t = bal;
    Unlock(l);

    Lock(l);
    bal = t - y;
    Unlock(l);


Introduce Schedule() points

  • Instrument calls to the CHESS scheduler

  • Each call is a potential preemption point

Thread 1:
    Schedule(); Lock(l);
    bal += x;
    Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l);
    t = bal;
    Schedule(); Unlock(l);

    Schedule(); Lock(l);
    bal = t - y;
    Schedule(); Unlock(l);


First-cut solution: Random sleeps

  • Introduce random sleep at schedule points

  • Does not introduce new behaviors

    • Sleep models a possible preemption at each location

    • Sleeping for a finite amount guarantees starvation-freedom

Thread 1:
    Sleep(rand()); Lock(l);
    bal += x;
    Sleep(rand()); Unlock(l);

Thread 2:
    Sleep(rand()); Lock(l);
    t = bal;
    Sleep(rand()); Unlock(l);

    Sleep(rand()); Lock(l);
    bal = t - y;
    Sleep(rand()); Unlock(l);


Improvement 1: Capture the “happens-before” graph

  • Delays that result in the same “happens-before” graph are equivalent

  • Avoid exploring equivalent interleavings

Thread 1:
    Schedule(); Lock(l);
    bal += x;
    Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l);
    t = bal;
    Schedule(); Unlock(l);

    Schedule(); Lock(l);
    bal = t - y;
    Schedule(); Unlock(l);

[Figure: the slide repeats Thread 2's code next to two Sleep(5) labels]
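
As a rough illustration of why many interleavings are equivalent, the sketch below counts interleavings and groups them by a happens-before signature. It assumes a simplified model in which each operation touches one named synchronization variable and any two operations on the same variable are ordered; the class name, method name, and variable names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HappensBefore
{
    // threads[t][i] names the sync variable touched by the i-th operation of thread t.
    // Returns { total interleavings, distinct happens-before signatures }.
    public static int[] CountClasses(string[][] threads)
    {
        var signatures = new HashSet<string>();
        int total = 0;
        Visit(threads, new int[threads.Length], new List<string>(), signatures, ref total);
        return new[] { total, signatures.Count };
    }

    private static void Visit(string[][] threads, int[] next, List<string> order,
                              HashSet<string> signatures, ref int total)
    {
        bool done = true;
        for (int t = 0; t < threads.Length; t++)
        {
            if (next[t] == threads[t].Length) continue;   // thread t has finished
            done = false;
            order.Add(threads[t][next[t]] + ":" + t + "." + next[t]);
            next[t]++;
            Visit(threads, next, order, signatures, ref total);
            next[t]--;
            order.RemoveAt(order.Count - 1);
        }
        if (!done) return;

        total++;
        // Program order is fixed, so in this model the graph is determined by
        // the order in which each variable is accessed.
        var perVariable = order.GroupBy(op => op.Split(':')[0])
                               .OrderBy(g => g.Key, StringComparer.Ordinal)
                               .Select(g => string.Join(",", g));
        signatures.Add(string.Join(" | ", perVariable));
    }

    public static void Main()
    {
        // Two threads that share only the variable "m".
        var counts = CountClasses(new[] { new[] { "l", "m" }, new[] { "n", "m" } });
        Console.WriteLine(counts[0] + " interleavings, " + counts[1] + " equivalence classes");
    }
}
```

In the example, only the relative order of the two accesses to `m` matters, so exploring one interleaving per class is enough under this model.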


Improvement 2: Understand synchronization semantics

  • Avoid exploring delays that are impossible

  • Identify when threads can make progress

  • CHESS maintains a run queue and a wait queue

    • Mimics OS scheduler state

Thread 1:
    Schedule(); Lock(l);
    bal += x;
    Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l);
    t = bal;
    Schedule(); Unlock(l);

    Schedule(); Lock(l);
    bal = t - y;
    Schedule(); Unlock(l);

[Figure: the slide repeats Thread 2's code]
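
The effect of tracking which threads can make progress can be sketched on the running example. The model below is an assumption-laden simplification, not CHESS code: each thread is a string of one-letter operations, and a thread whose next operation is an acquire of a held lock is treated as waiting and is never scheduled. The starting balance and the values of x and y are arbitrary.

```csharp
using System;
using System.Collections.Generic;

public static class FeasibleSchedules
{
    // A = Lock(l), R = Unlock(l), d: bal += x, r: t = bal, w: bal = t - y.
    // Thread 1 is "AdR"; Thread 2 is "ArR" followed by "AwR".
    private static readonly string[] Threads = { "AdR", "ArRAwR" };

    // Explores only schedules in which every step picks an enabled thread.
    // Returns the final balance of each feasible schedule, in exploration order.
    public static List<int> Explore(int bal, int x, int y)
    {
        var results = new List<int>();
        Step(new int[Threads.Length], -1, bal, 0, x, y, results);
        return results;
    }

    private static void Step(int[] pc, int holder, int bal, int t, int x, int y, List<int> results)
    {
        bool anyLeft = false;
        for (int id = 0; id < Threads.Length; id++)
        {
            if (pc[id] == Threads[id].Length) continue;   // thread finished
            anyLeft = true;
            char op = Threads[id][pc[id]];
            if (op == 'A' && holder != -1) continue;      // waiting: the lock is held
            int newHolder = holder, newBal = bal, newT = t;
            switch (op)
            {
                case 'A': newHolder = id; break;
                case 'R': newHolder = -1; break;
                case 'd': newBal = bal + x; break;        // Thread 1: bal += x
                case 'r': newT = bal; break;              // Thread 2: t = bal
                case 'w': newBal = t - y; break;          // Thread 2: bal = t - y
            }
            pc[id]++;
            Step(pc, newHolder, newBal, newT, x, y, results);
            pc[id]--;
        }
        if (!anyLeft) results.Add(bal);
    }

    public static void Main()
    {
        var finals = Explore(100, 10, 5);
        Console.WriteLine(finals.Count + " feasible schedules; final balances: "
                          + string.Join(", ", finals));
    }
}
```

Ignoring the lock, 3 and 6 operations can be interleaved in 84 ways; with the lock semantics only 3 schedules are feasible in this model. One of them loses Thread 1's deposit, because Thread 2 reads `bal` before the deposit and writes it back afterwards.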


Emulate execution on a uniprocessor

  • Enable only one thread at a time

  • Linearizes a partial order into a total order

  • Controls the order of data races

Thread 2:
    Schedule(); Lock(l);
    t = bal;
    Schedule(); Unlock(l);

Thread 1:
    Schedule(); Lock(l);
    bal += x;
    Schedule(); Unlock(l);

Thread 2:
    Schedule(); Lock(l);
    bal = t - y;
    Schedule(); Unlock(l);


CHESS modes: speed vs coverage

  • Fast-mode

    • Introduce schedule points before synchronizations, volatile accesses, and interlocked operations

    • Finds many bugs in practice

  • Data-race mode

    • Repeat

      • Find data races

      • Introduce schedule points before racing memory accesses

    • Captures all sequentially consistent (SC) executions


Capture all sources of nondeterminism? No.

  • Scheduling nondeterminism? Yes

  • Timing nondeterminism? Yes

    • Controls when and in what order the timers fire

  • Nondeterministic system calls? Mostly

    • CHESS uses precise abstractions for many system calls

  • Input nondeterminism? No

    • Rely on users to provide inputs

      • Program inputs, files read, packets received,…

    • Good tradeoff in the short term

      • But can’t find race conditions in error-handling code


CHESS architecture

[Figure: architecture diagram. Labeled boxes: Unmanaged Program, Win32 Wrappers, Windows; Managed Program, .NET Wrappers, CLR; CHESS Scheduler; CHESS Exploration Engine]


CHESS wrappers

  • Translate Win32/.NET synchronizations

  • Into CHESS scheduler abstractions

    • Tasks : schedulable entities

      • Threads, threadpool work items, async. callbacks, timer functions

    • SyncVars : resources used by tasks

      • Generate happens-before edges during execution

  • Executable specification for complex APIs

    • Most time-consuming and error-prone part of CHESS

  • Enables CHESS to handle multiple platforms


Learning from Experience: User forum, Champions

http://msdn.microsoft.com/en-us/devlabs/cc950526.aspx

http://social.msdn.microsoft.com/Forums/en-US/chess/threads/


“CHESS Doesn’t Scale”

  • Hmm… we just ran CHESS on the Singularity operating system (and found bugs in the bootup/shutdown sequence)

  • What they usually mean:

    • “CHESS isn’t very effective on a long-running test”

    • “There are a lot of possible schedules!”

  • Time for enumerative model checking

    • (Time to execute one test) x (# schedules)
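
To get a feel for the second factor, the sketch below uses the standard count of interleavings for n threads that each execute k atomic steps, (n·k)! / (k!)^n. It ignores blocking, so it is an upper bound rather than what CHESS would actually explore; the class and method names and the per-test time are assumptions for illustration.

```csharp
using System;
using System.Numerics;

public static class ScheduleCount
{
    // Number of ways to interleave n threads of k atomic steps each.
    public static BigInteger Interleavings(int n, int k)
    {
        BigInteger result = Factorial(n * k);
        BigInteger perThread = Factorial(k);
        for (int i = 0; i < n; i++) result /= perThread;
        return result;
    }

    private static BigInteger Factorial(int m)
    {
        BigInteger f = 1;
        for (int i = 2; i <= m; i++) f *= i;
        return f;
    }

    // (Time to execute one test) x (# schedules), in seconds.
    public static double EstimatedSeconds(int n, int k, double secondsPerTest)
    {
        return (double)Interleavings(n, k) * secondsPerTest;
    }

    public static void Main()
    {
        int[][] cases = { new[] { 2, 3 }, new[] { 3, 4 }, new[] { 4, 10 } };
        foreach (var c in cases)
        {
            Console.WriteLine(c[0] + " threads x " + c[1] + " steps: "
                + Interleavings(c[0], c[1]) + " schedules, about "
                + EstimatedSeconds(c[0], c[1], 0.01) + " s at 10 ms per run");
        }
    }
}
```

The count grows so quickly with test length that short tests with few threads are the only ones that can be enumerated exhaustively.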


Find lots of bugs with 2 preemptions
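
How much a preemption bound shrinks the search can be sketched with a small counter. The model is an assumption made for this example: n threads of k non-blocking steps each, where a preemption is a switch away from a thread that could still run, and a switch after a thread finishes is free.

```csharp
using System;

public static class PreemptionBound
{
    // Counts schedules of n threads with k steps each that use at most
    // 'bound' preemptions.
    public static long Count(int n, int k, int bound)
    {
        var remaining = new int[n];
        for (int i = 0; i < n; i++) remaining[i] = k;
        return Visit(remaining, -1, bound);
    }

    private static long Visit(int[] remaining, int current, int budget)
    {
        long count = 0;
        bool anyLeft = false;
        for (int t = 0; t < remaining.Length; t++)
        {
            if (remaining[t] == 0) continue;            // thread t has finished
            anyLeft = true;
            bool preempts = current >= 0 && t != current && remaining[current] > 0;
            if (preempts && budget == 0) continue;      // over the bound: prune
            remaining[t]--;
            count += Visit(remaining, t, preempts ? budget - 1 : budget);
            remaining[t]++;
        }
        return anyLeft ? count : 1;                     // a complete schedule counts once
    }

    public static void Main()
    {
        for (int bound = 0; bound <= 3; bound++)
            Console.WriteLine("3 threads x 4 steps, at most " + bound
                              + " preemptions: " + Count(3, 4, bound) + " schedules");
    }
}
```

With a bound of zero only the run-to-completion orders remain, and each extra preemption adds schedules in a controlled way.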


“CHESS Isn’t Push Button”

Concurrency Unit Tests

“The more I look at CHESS the more I realize that I could use some general guidance on how to author test code that will actually help CHESS reveal concurrency bugs.”

Daniel Stolt


Challenge -> Opportunity: New “Push button” concurrency tools

  • Cuzz [ASPLOS 2010]: Concurrency Fuzzing

    • Attach to any running executable

    • Find concurrency bugs faster through smart fuzzing

  • Lineup [PLDI 2010]: Automatic Linearizability Checking

    • Generate “thread-safety” tests for a class automatically

    • Use sequential behavior as oracle for concurrent behavior

    • CHESS underneath


“CHESS Doesn’t Find This Bug”

void ForkJoinTest() {
    int x = 0;
    var t1 = new Thread(() => { x=x+1; });   // unsynchronized read-modify-write
    var t2 = new Thread(() => { x=x+1; });   // races with t1 on x
    t1.Start(); t2.Start();
    t1.Join(); t2.Join();
    Debug.Assert(x==2);                      // fails if one update is lost
}

  • RTFM is not helpful

  • Instead, generate helpful warning messages

    • “Warning: running CHESS without race detection can miss bugs”

  • Or, turn race detection on for a few executions.
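
The missed interleaving can be spelled out by hand. In the sketch below, `RacyInterleaving` replays on a single thread the order in which both reads happen before either write; a scheduler that only switches threads at synchronization, volatile, and interlocked operations never produces this order for plain `x=x+1`. `AtomicIncrements` shows one possible repair using `Interlocked.Increment`. The class and method names are hypothetical.

```csharp
using System;
using System.Threading;

public static class LostUpdate
{
    // x = x + 1 is a read followed by a write. This is the interleaving
    // that loses an update, replayed deterministically.
    public static int RacyInterleaving()
    {
        int x = 0;
        int r1 = x;      // thread 1 reads 0
        int r2 = x;      // thread 2 reads 0
        x = r1 + 1;      // thread 1 writes 1
        x = r2 + 1;      // thread 2 writes 1, overwriting thread 1's update
        return x;
    }

    // One possible repair: make each increment a single atomic operation.
    public static int AtomicIncrements()
    {
        int x = 0;
        var t1 = new Thread(() => Interlocked.Increment(ref x));
        var t2 = new Thread(() => Interlocked.Increment(ref x));
        t1.Start(); t2.Start();
        t1.Join(); t2.Join();
        return x;
    }

    public static void Main()
    {
        Console.WriteLine("racy interleaving ends with x = " + RacyInterleaving());
        Console.WriteLine("atomic increments end with x = " + AtomicIncrements());
    }
}
```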


“CHESS Can’t Avoid Finding Bugs”

“Solution is working  and found two bug with CHESS . To get the second bug, I had to fix first bug first”

“That liveness bug is such a minor performance problem that I won’t fix it.”


Playing CHESS with George


“CHESS is Confusing Me”

RunTest is Not Idempotent


The Nondeterminism Saga: static data, lazily initialized

If replay of p.E fails, yielding p.F, then try again and see if p.F replays

Report lost coverage

[Figure: schedule prefix p with two continuations, E and F]


Nondeterminism Junkie: Too much information

“Why does this test pass instead of say ‘Detected nondeterminism’ outside the control of CHESS"?


!?!

“Is this good behavior for CHESS to return three different results for the same code?”


“CHESS Time Isn’t Real Time”: It’s a feature, not a bug.

“The call to WaitOne(60000, false) immediately returns false, which isn’t correct. If I use WaitOne() or WaitOne(Timeout.Infinite, false) instead of WaitOne(60000, false), the WaitHandle waits till the Event is set, returns true and everything goes fine. But waiting without a timeout isn't an option in my case.”
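
One way to write test code that tolerates a timed wait returning early is sketched below. It assumes only what the quote describes, namely that a `WaitOne` call with a timeout may return false long before the real timeout elapses. The `WaitWithRetry` helper and its retry limit are hypothetical, not part of CHESS or .NET.

```csharp
using System;
using System.Threading;

public static class TimedWait
{
    // Waits for 'handle' with a timeout, treating a false return as
    // "not signaled yet" rather than as a failure. Gives up after
    // 'maxAttempts' timeouts.
    public static bool WaitWithRetry(WaitHandle handle, int timeoutMs, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            if (handle.WaitOne(timeoutMs, false)) return true;   // signaled
            // Timed out: under a controlled scheduler this may happen immediately.
        }
        return false;
    }

    public static void Main()
    {
        using (var done = new ManualResetEvent(false))
        {
            var worker = new Thread(() => done.Set());
            worker.Start();
            bool signaled = WaitWithRetry(done, 60000, 3);
            worker.Join();
            Console.WriteLine(signaled ? "event observed" : "gave up after repeated timeouts");
        }
    }
}
```

The point of the sketch is that both outcomes of the timed wait are handled explicitly, so the test's verdict does not depend on wall-clock time.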


The expected: “I can’t play CHESS on”

  • x64

  • Multi-process programs

  • Message passing, distributed systems

  • The Boost library

  • .NET without the CLR Profiler

  • Java

  • Unix


Learning from Experience: Forums, Champions

Chris Dern, Steve Hale,

Ram Natarajan, Roy Tan


“Congratulations CHESS team!!!!!  I have proven outside of CHESS that the issue it is finding in our product on the 106th thread schedule looks like a valid product bug!!

I wrote a quick application to launch my CHESS test outside of CHESS and by freezing/thawing threads I was able to reproduce the issue independently.  This is incredibly exciting!!!  Many thanks for your patience, perseverance, and CHESS bug fixes as I’ve struggled to understand CHESS.”

Steve Hale, Microsoft, 2/12/2009

More Great Quotes Like This…


BORING!


Learning By Flailing… With PFX


PLINQ

Parallel.For

TaskScheduler

Task

ConcurrentBag

BlockingCollection

ConcurrentDictionary

Barrier

SemaphoreSlim

ManualResetEventSlim


“As the true value of a test is in its ability to find bugs, let’s take a look at how our CHESS tests did. Over the development cycle to date, the CHESS test found seven bugs, and was used to reproduce another seven for a total of 14, out of the 276 high priority bugs over the same time. While only 14 bugs against 276 appear sadly anemic, it’s important to dig a bit deeper. If we address each of the issues raised, would we find more bugs?”

Chris Dern, PFX_CHESS_Review_Final.docx


“Early on the adoption of CHESS, we made a fatal mistake. Perhaps it was wishful thinking on our part, or perhaps we believed too much in the marketing hype and didn’t read the fine print. We believed early on that CHESS was a turnkey solution capable of using existing tests and test approaches and ‘finding the bugs’.”

C. Dern


“The schedule for any product group is always under attack. Over the life cycle of a product, features are in constant flux, with managers always balancing risk and reward. In the face of this pressure, any untried tool, methodology, or approach faces an uphill battle.”

C. Dern


“For tool developers, it’s important that once you engage with a customer you help find then drive to some level of success. Finding a single bug is a priceless commodity when arguing to continue the time investment in a specific tool. Take small bites, set modest goals and drive to success. Perfect is the enemy of good, or at least good enough right now.”

C. Dern


Dern’s DO’s and DON’Ts

Do not expect that CHESS will ‘magically’ find your bugs. CHESS is a tool, mainly focused on enumerating schedules for a given bound. While it can find specific types of concurrency bugs, e.g. deadlocks, for ‘free’, the value and benefit of CHESS comes with deliberate tests.


Do develop an understanding of what properties, invariants, and behaviors your test is testing.

Do run your tests. While this may seem a silly tip, it’s important to remember that CHESS enables the familiar write, run, refactor test experience for concurrent tests, which we enjoy with sequential tests today.


Do not add artificial spinning/busy work in the test. CHESS will explore all schedules for your specified bound. Adding busy work, like you may find in a ‘stress’ test to increase coverage, only increases the test runtime when under CHESS.


Avoid blindly converting an existing ‘stress’ style unit test into a CHESS test. The size, scale, and assertions that one tends to find in those types of tests make for a weak CHESS test at best, or an unusable CHESS test at worst.


Stepping Back from the Fray: High-level Learnings

  • Proper expectation setting

  • Good methodology

  • Good default behavior

  • Good warnings and messages

  • Minimize cognitive dissonance

  • Cultivate champions

  • Listen to them and learn!


Three CHESS Learnings

  • 1. If you want

    • deterministic scheduling

    • with the ability to explore all schedules

    • without changing the underlying scheduler

    then it’s hard to achieve

    • high API coverage

    • robustness

    Action: we need observable and controllable schedulers!

  • 2. Concurrency unit testing

    • can be effective, but

    • requires careful planning and scoping

  • 3. Search/reduction strategies

    • are absolutely essential


Uplifting Message and Blatant Advertisement for LineUp Talk

“Partnerships and Collaborations

The success of the LineUp work is a perfect example of [the benefits of] an open dialog between the teams along with continual experimentation by both sides. Combining innovations from both research and product testing group, we create[d] a complete solution to one area of concurrency testing.”

C. Dern

