
DRFx

DRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages. Dan Marino (UC Los Angeles), Abhay Singh (University of Michigan), Todd Millstein (UC Los Angeles), Madan Musuvathi (MSR, Redmond), Satish Narayanasamy (University of Michigan).


Presentation Transcript


  1. DRFx: A Simple and Efficient Memory Model for Concurrent Programming Languages • Dan Marino (UC Los Angeles) • Abhay Singh (University of Michigan) • Todd Millstein (UC Los Angeles) • Madan Musuvathi (MSR, Redmond) • Satish Narayanasamy (University of Michigan)

  2. State of the Art: SC for Data Race Free Memory Models • sequential consistency [Lamport 79] • intuitive for programmers • limits compiler and hardware optimizations • DRF0 [Adve & Hill 90] models balance performance and ease of programming • SC behavior guaranteed for race-free programs • most optimizations allowed • e.g. Java and C++0x memory models [Manson et al. 2005] [Boehm et al. 2008]

  3. Program Behavior under DRF0 • X* x = null; bool init = false; • Thread t: A: x = new X(); B: init = true; • Thread u: C: if(init) D: x->f++; • Optimizing compiler and hardware: B doesn't depend on A, so it might be faster to reorder them! • reordered execution: B: init = true; C: if(init) D: x->f++; A: x = new X(); ⇒ null pointer dereference at D • under DRF0 the programmer must declare init atomic to rule this out
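The race-free variant of the slide's example can be sketched in C++ as follows. This is only an illustration under assumptions: the struct `X`, the function name `run_example`, and the spin loop (used in place of the slide's single `if` so that the outcome is deterministic) are not from the talk.

```cpp
#include <atomic>
#include <thread>

struct X { int f = 0; };

// Runs the two-thread example once and returns the final value of x->f.
// `init` is a std::atomic<bool>, so the program has no data race and the
// write to x (A) is visible to any thread that has read init == true.
int run_example() {
    X* x = nullptr;
    std::atomic<bool> init{false};

    std::thread t([&] {
        x = new X();        // A
        init.store(true);   // B: seq_cst store, A may not be moved after it
    });
    std::thread u([&] {
        // C: spin instead of a single `if` (an assumption of this sketch)
        while (!init.load()) {
            std::this_thread::yield();
        }
        x->f++;             // D: safe, because A is visible once init is true
    });

    t.join();
    u.join();
    int result = x->f;
    delete x;
    return result;
}
```

If `init` were a plain `bool`, the program would contain a data race and a DRF0-style model would give no guarantee about D.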

  4. Deficiencies of DRF0 • weak or no semantics for racy programs • unintentional data races easy to introduce • problematic for debuggability, safety, and compiler correctness • programmer must assume non-SC behavior for all programs • [Boehm et al., PLDI 2008]: optimization + data race = jump to arbitrary code! • Java must maintain safety at the cost of complexity [Ševčík & Aspinall, ECOOP 2008]

  5. Our Solution: The DRFx Memory Model • data race: programming error • Memory Model (MM) Exception: fatal runtime error • debuggability: SC for all executions • safety: halt program before non-SC behavior is exhibited • compiler correctness: most sequentially valid optimizations permitted

  6. DRFx Allows Relaxed Data Race Detection • source program → observed behavior • data race free → SC behavior • has data races → MM Exception • simplify detection: precise runtime data race detection is slow in software and complex in hardware [Flanagan & Freund 2009] [Prvulovic & Torrellas 2003]

  7. Detecting an SC Violation • X* x = null; bool init = false; • Thread t: A: x = new X(); B: init = true; • Thread u: C: if(init) D: x->f++; • Insight: the compiler can communicate to the runtime the regions in which reordering may have occurred • the runtime must detect conflicting accesses in regions that execute concurrently • compiled thread t: region fence; B: init = true; A: x = new X(); region fence • compiled thread u: region fence; C: if(init) D: x->f++; region fence • conflicting regions execute concurrently → MM Exception • races need not be reported between regions that do not execute concurrently: data race, but no SC violation • region-serializable for compiled ⇒ SC for source
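The detection rule on this slide (report only conflicting accesses in regions that execute concurrently) can be sketched as below. This is a simplified software model under assumptions, not the hardware design from the talk: the `Region` struct, its logical `start`/`end` fence timestamps, and all function names are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// One dynamic region: the code between two region fences on one thread.
// start/end are hypothetical logical timestamps of those two fences.
struct Region {
    int thread;
    int start;
    int end;
    std::set<std::string> reads;
    std::set<std::string> writes;
};

// Two regions on different threads execute concurrently if their
// [start, end) intervals overlap.
bool concurrent(const Region& a, const Region& b) {
    return a.thread != b.thread && a.start < b.end && b.start < a.end;
}

static bool intersects(const std::set<std::string>& p,
                       const std::set<std::string>& q) {
    for (const auto& loc : p) {
        if (q.count(loc)) return true;
    }
    return false;
}

// Regions conflict if they touch a common location and at least one
// of the two accesses to it is a write.
bool conflicting(const Region& a, const Region& b) {
    return intersects(a.writes, b.writes) ||
           intersects(a.writes, b.reads) ||
           intersects(a.reads, b.writes);
}

// True if some pair of concurrent regions conflicts, i.e. the case in
// which an MM exception must be raised.
bool mm_exception(const std::vector<Region>& regions) {
    for (std::size_t i = 0; i < regions.size(); ++i) {
        for (std::size_t j = i + 1; j < regions.size(); ++j) {
            if (concurrent(regions[i], regions[j]) &&
                conflicting(regions[i], regions[j])) {
                return true;
            }
        }
    }
    return false;
}
```

For the slide's program, thread t's region writes `x` and `init` while thread u's region reads both. If the two regions overlap in time, `mm_exception` returns true; if thread u's region starts only after thread t's region has ended, the race is still present but no exception is required.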

  8. DRFx Compiler and Runtime Requirements • DRFx Compiler • communicate regions in which optimizations were made by using fence instructions • synchronization in their own region • no speculative memory accesses • DRFx Execution Environment • trap on conflicting accesses in concurrent regions • global order on region fences • memory order consistent with fence order
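The compiler-side requirement that each synchronization operation sits in its own region can be illustrated with a toy pass over a straight-line operation list. This is a sketch under assumptions: the `Op` type, the `"fence"` marker string, and `insert_region_fences` are hypothetical and do not reflect the actual LLVM-based implementation.

```cpp
#include <string>
#include <vector>

enum class Kind { Plain, Sync };

struct Op {
    Kind kind;
    std::string text;
};

// Hypothetical helper: emits the operations with "fence" markers inserted
// so that every synchronization operation is alone in its own region and
// plain operations between synchronization points share one region.
std::vector<std::string> insert_region_fences(const std::vector<Op>& ops) {
    std::vector<std::string> out;
    out.push_back("fence");  // region boundary at entry
    for (const Op& op : ops) {
        if (op.kind == Kind::Sync) {
            // Close the preceding region unless a fence was just emitted.
            if (out.back() != "fence") out.push_back("fence");
            out.push_back(op.text);
            out.push_back("fence");
        } else {
            out.push_back(op.text);
        }
    }
    if (out.back() != "fence") out.push_back("fence");  // boundary at exit
    return out;
}
```

Optimizations would then be confined to the plain operations between two consecutive fences, which is what lets the runtime reason about reordering one region at a time.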

  9. Formalization • compiler requirements • how program is split into regions • permitted optimizations • all non-speculative, sequentially valid optimizations • execution environment requirements • when conflict may/must be reported • memory orderings allowed w.r.t. fences • prove • no MM exception ⇒ SC behavior for source program • MM exception ⇒ data race in source program
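The two properties to be proved can be written compactly as below. This is a notational sketch only: $P$ (the source program), $e$ (an execution of the compiled program in a DRFx-compliant environment), $\mathrm{SC}(P)$ (the set of sequentially consistent executions of $P$), and the two predicates are names assumed here, not the talk's formal definitions.

```latex
\begin{align*}
\neg\,\mathrm{MMException}(e) &\;\Longrightarrow\; e \text{ has the behavior of some } e' \in \mathrm{SC}(P) \\
\mathrm{MMException}(e) &\;\Longrightarrow\; \exists\, e' \in \mathrm{SC}(P).\ \mathrm{HasDataRace}(e')
\end{align*}
```

The second line uses the usual DRF0-style reading of "data race in the source program": some sequentially consistent execution of $P$ contains a race.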

  10. Efficient & Simple Conflict Detection • perform detection in hardware • like transactional memory hardware – but simpler • no rollback • we control region boundaries • compiler bounds number of memory locations dynamically accessed in a region • limits optimization opportunities • distinguish “bounding” region fence • hardware can merge regions separated by a bounding fence when resources available
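The idea that hardware can merge regions separated by a bounding fence when resources are available can be sketched as a greedy pass. The assumptions are loud here: the per-region access counts, the single `capacity` number standing in for hardware tracking resources, and the function name are all hypothetical, and summing counts over-approximates the number of distinct locations.

```cpp
#include <cstddef>
#include <vector>

// Each element of `sizes` is the number of locations accessed by one
// region; consecutive regions are assumed to be separated only by
// bounding fences. Returns the sizes of the regions after merging
// neighbours while the hypothetical tracking capacity is not exceeded.
std::vector<std::size_t> merge_bounded_regions(
        const std::vector<std::size_t>& sizes, std::size_t capacity) {
    std::vector<std::size_t> merged;
    std::size_t current = 0;
    bool open = false;
    for (std::size_t s : sizes) {
        if (open && current + s <= capacity) {
            current += s;  // ignore the bounding fence, extend the region
        } else {
            if (open) merged.push_back(current);  // honor the fence
            current = s;
            open = true;
        }
    }
    if (open) merged.push_back(current);
    return merged;
}
```

With five loop iterations of 4 accesses each and a capacity of 10, the sketch yields regions of 8, 8, and 4 accesses, so fewer region boundaries are enforced than the compiler inserted.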

  11. Compiler Implementation • built conservative DRFx-compliant compiler • LLVM [Lattner & Adve 2004] • naïve bounding analysis • bounding fence at all loop back edges • disable speculative optimizations • measured performance • PARSEC benchmark suite • stock x86 hardware – no architectural simulator

  12. DRFx Overhead on PARSEC Benchmarks • slowdown over unmodified, fully optimizing LLVM

  13. Related Work • memory models, e.g. [Lamport 1979], [Dubois et al. 1986], [Adve & Hill 1990] • hardware race detection [Adve et al. 1991], [Muzahid et al. 2009], [Prvulovic & Torrellas 2003] • software race detection, e.g. [Yu et al. 2005], [Flanagan & Freund 2009], [Elmas et al. 2007] • detecting SC violations [Gharachorloo & Gibbons, SPAA 1991] • conflict exception [Lucia et al., ISCA 2010] • stronger guarantee: serializability of sync-free regions • requires unbounded detection scheme • focused on hardware

  14. DRFx Conclusion • regions and MM Exception: lightweight form of data race detection • easy to understand: programmer gets understandable behavior for all programs • compiler may perform most sequentially valid optimizations within regions • efficient: straightforward hardware support • compiler restrictions ⇒ only 0% - 7% slowdown
