
EFFICIENT DYNAMIC VERIFICATION ALGORITHMS FOR MPI APPLICATIONS


Presentation Transcript


  1. EFFICIENT DYNAMIC VERIFICATION ALGORITHMS FOR MPI APPLICATIONS Dissertation Defense Sarvani Vakkalanka Committee: Prof. Ganesh Gopalakrishnan (advisor), Prof. Mike Kirby (co-advisor), Prof. Suresh Venkatasubramanian, Prof. Matthew Might, Prof. Stephen Siegel (Univ. of Delaware)

  2. Necessity for Verification • Software testing is ad hoc. • Software errors are expensive – $59.5 billion/yr (2002 NIST study). • Software written today is complex and uses many existing libraries. • Our focus: parallel scientific software written using MPI.

  3. Motivation • Concurrent software debugging is hard! • Very little formal support for Message Passing concurrency. • Active testing (schedule enforcement) is important. • Reducing redundant (equivalent) verification runs is crucial. • Verification for portability – another important requirement.

  4. Approaches to Verification • Testing methods suffer from bug omissions. • Static-analysis-based methods generate many false alarms. • Model-based verification is tedious. • Dynamic verification – no false alarms!

  5. Contributions • New dynamic verification algorithms for MPI. • New Happens-Before models for message passing concurrency. • Verification that handles resource dependency. • MPI dynamic verification tool ISP that handles non-trivial codes for safety properties.

  6. Agenda • Intro to Dynamic Verification • Intro to MPI • Four MPI Operations (S, R, W, B). • MPI Ordering Guarantees. • Applying DPOR to MPI • Dynamic verification algorithms avoiding redundant searches and handling resource dependencies • Formal MPI Transition System • Experimental Results • Conclusions

  7. EFFICIENT DYNAMIC VERIFICATION

  8. Growing Importance of Dynamic Verification • Code written using mature libraries (MPI, OpenMP, PThreads, …). • API calls made from real programming languages (C, Fortran, C++). • Runtime semantics determined by realistic compilers and runtimes. • Dynamic verification abstracts these details away (static analysis and model-based verification can play important supportive roles).

  9. Exponential number of TOTAL interleavings – most are EQUIVALENT – generate only the RELEVANT ones!! (Figure: five processes P0–P4 executing independent actions such as a++ and b-- alongside the dependent writes g=2 and g=3; TOTAL > 10 billion interleavings!)

  10. Dynamic Partial Order Reduction (Figure: the same five processes P0–P4; of the > 10 billion total interleavings, only the 2 orderings of the dependent actions g=2 and g=3 are RELEVANT – all other actions are pairwise independent.)

  11. DPOR • A state σ maintains the following sets: • enabled(σ): the transitions enabled at σ. • backtrack(σ): a sufficient subset of enabled(σ). • If backtrack(σ) = enabled(σ) at every state, the full state space is explored. • DPOR also relies on: • Co-enabledness of transitions. • Dependence among transitions.

  12. Co-enabledness & Dependence (Figure: a state where t1 and t2 are co-enabled, giving backtrack set {t1, t2}, versus states where only {t1} or only {t2} is enabled.)

  13. DPOR Concepts • DPOR requires the identification of dependence and co-enabledness among transitions. • Identifying dependence is simple: • Two lock accesses on the same mutex. • Two writes to the same global variable. • Similar concepts exist for MPI. • Identifying co-enabledness is difficult (akin to may-happen-in-parallel analysis).

  14. Illustration of DPOR Concepts • P1: lock(l); x = 1; x = 2; unlock(l) • P2: lock(l); y = 1; x = 2; unlock(l)

  16. Thread Verification vs MPI Verification • Thread verification – well studied! • Well-known dynamic verification tools exist for threads [CHESS, INSPECT]. • Thread verification follows traditional dynamic partial order reduction. • MPI verification – not so! DPOR does not extend directly to MPI: • it requires a formal definition of dependence, and • MPI has out-of-order completion semantics.

  17. INTRODUCTION TO MPI

  18. The Ubiquity of MPI IBM Blue Gene (Picture Courtesy IBM), LANL's petascale machine "Roadrunner" (AMD Opteron CPUs and IBM PowerXCell) • The choice for ALL large-scale parallel simulations (earthquake, weather, …). • Runs "everywhere". • Very mature codes exist in MPI – tens of person-years. • Performs critical simulations in science and engineering.

  19. Overview of Message Passing Interface (MPI) API • One of the major Standardization Successes. • Lingua franca of Parallel Computing • Runs on parallel machines of a WIDE range of sizes • Standard is published at www.mpi-forum.org • MPI 2.0 includes over 300 functions

  20. MPI Execution Environment • The MPI execution environment consists of two main components: • the MPI processes, and • the MPI runtime daemon. • All processes are statically created. • Each process has a rank between 0 and n-1. • The MPI processes issue MPI calls to the runtime. • The MPI runtime implements the MPI library and executes the calls.

  21. MPI Execution Contd… • Every process starts execution with MPI_Init(int *argc, char ***argv). • MPI_Finalize() – at the end.
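A minimal runnable sketch of this structure (the printf is illustrative; the program is launched under an MPI launcher such as mpirun):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                 /* every process starts here   */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* rank is between 0 and n-1   */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* n: statically created procs */
    printf("process %d of %d\n", rank, size);
    MPI_Finalize();                         /* ...and ends here            */
    return 0;
}
```

Compiled with mpicc and run, e.g., `mpirun -np 4 ./a.out`.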

  22. MPI_Isend (void *buff, …, int dest, int tag, MPI_Comm comm, MPI_Request *handle) • Abbreviated as S.

  23. MPI_Irecv (void *buff, …, int src, int tag, MPI_Comm comm, MPI_Request *handle) • Abbreviated as R.

  24. MPI_Wait (MPI_Request *handle, MPI_Status *status); • Abbreviated as W

  25. MPI_Barrier (MPI_Comm comm) • Abbreviated as B. • All processes must invoke B before any can get past it.
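A sketch combining the four operations for two ranks (assumes at least 2 processes; the payload 42 is illustrative):

```c
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, buf = 42;
    MPI_Request req;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        MPI_Isend(&buf, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);  /* S */
    else if (rank == 1)
        MPI_Irecv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &req);  /* R */
    MPI_Barrier(MPI_COMM_WORLD);                                  /* B */
    if (rank <= 1)
        MPI_Wait(&req, MPI_STATUS_IGNORE);                        /* W */
    MPI_Finalize();
    return 0;
}
```

Because S and R are non-blocking, both ranks can enter B before the message is matched – the out-of-order completion that the later slides exploit.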

  26. MPI Ordering Guarantees

  27. MPI Ordering Guarantees

  28. MPI Ordering Guarantee

  29. Applying DPOR to MPI • Programs like this are almost impossible to test on real platforms.

  30. Why DPOR does not work!

  31. Modifying Runtime Doesn’t Help! • Assume that the MPI runtime is modified to support verification • The sends are matched with receives in the order they are issued to the MPI runtime • Is this sufficient?

  32. Crooked Barrier Example
  P0: Isend(to 1, req); Barrier; Wait(req)
  P1: Irecv(*, req); Barrier; Irecv(from 2, req1); Wait(req); Wait(req1)
  P2: Barrier; Isend(to 1, req); Wait(req)
  Verification support does not work!
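A hedged C rendering of the example (payloads are illustrative). P1's wildcard receive is posted before the barrier but need not complete before it, so after the barrier it can be matched by either P0's or P2's send; a scheduler that only replays runtime-issue order never forces the P2 match:

```c
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, buf1 = 0, buf2 = 0, data = 1;
    MPI_Request req, req1;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0) {            /* P0: Isend(to 1); Barrier; Wait          */
        MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else if (rank == 1) {     /* P1: Irecv(*); Barrier; Irecv(2); Waits  */
        MPI_Irecv(&buf1, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                  MPI_COMM_WORLD, &req);
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Irecv(&buf2, 1, MPI_INT, 2, 0, MPI_COMM_WORLD, &req1);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
        MPI_Wait(&req1, MPI_STATUS_IGNORE);
    } else if (rank == 2) {     /* P2: Barrier; Isend(to 1); Wait          */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Isend(&data, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    } else {
        MPI_Barrier(MPI_COMM_WORLD);
    }
    MPI_Finalize();
    return 0;
}
```

Only the interleaving where P2's send matches the wildcard deadlocks (P0's send and P1's second receive are then left unmatched) – exactly the schedule naive approaches miss.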

  33. Our Main Algorithms • Partial Order avoiding Elusive Interleavings (POE). • POEOPT: reduces interleavings even further. • POEMSE: handles resource dependencies.

  34–37. Illustration of POE (four animation frames). The ISP scheduler sits between the processes and the MPI runtime and intercepts every call; a sendNext signal lets a process advance. The program is the crooked barrier example:
  P0: Isend(to 1, req); Barrier; Wait(req)
  P1: Irecv(*, req); Barrier; Recv(from 2); Wait(req)
  P2: Barrier; Isend(to 1, req); Wait(req)
  The scheduler collects the non-blocking calls, matches and issues all three Barriers together, and holds back the wildcard Irecv(*) until every potential sender is known, recording IntraCB (intra-process completes-before) edges as it goes. In the interleaving where P2's Isend matches the wildcard, P1's Recv(from 2) has no matching send – Deadlock!

  38. Notations • MPI_Isend: Si,j(k), where • i is the process issuing the send, • j is the dynamic execution count of S in process i, and • k is the rank of the destination process. • MPI_Irecv: Ri,j(k), where k is the source. • MPI_Barrier: Bi,j. • MPI_Wait: Wi,j(hi,j'), where hi,j' is the request handle of Si,j'(k) or Ri,j'(k).

  39. POE Issue: Redundancy • POE explores both match-sets, resulting in 2 interleavings where just 1 is sufficient. • Tempting SOLUTION: explore only one match-set of a single wildcard receive. • DOES NOT WORK – it BREAKS PERSISTENCE.

  40. POE and Persistent Sets • Add only this match-set to backtrack? • Maintaining persistent backtrack sets is important; otherwise the verification algorithm is broken.

  41. POE Issue: Buffering Deadlocks • Deadlock when no sends are buffered!
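A classic sketch of such a slack-dependent deadlock (assumes exactly 2 ranks): each rank issues a blocking send first. With buffering, the sends complete locally and the receives then match; with zero slack, each MPI_Send blocks on a matching receive that is never posted:

```c
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, out = 7, in;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    int peer = 1 - rank;   /* rank 0 <-> rank 1 */
    /* Head-to-head blocking sends: this pair completes only if the
       runtime buffers at least one of the two messages. */
    MPI_Send(&out, 1, MPI_INT, peer, 0, MPI_COMM_WORLD);
    MPI_Recv(&in, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Finalize();
    return 0;
}
```

This is the kind of resource (buffer) dependency that the POEMSE algorithm targets.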

  42. POE Issue: Redundancy • Simple optimization: if there are no more sends targeting a wildcard receive, then add only one of the match-sets to the backtrack set.

  43. Redundancy: POEOPT (Figure: an execution trace over four processes P0–P3 in the Si,j(k)/Ri,j(k)/Wi,j(h)/Bi,j notation; P1 posts two wildcard receives, R1,1(*) and R1,5(*), with sends S0,1(1), S1,3(3), S3,5(1), receives R2,1(1), R3,3(1), and the corresponding waits.)

  44. Detecting Matching • Exploring all non-deterministic matchings in a state is not a solution. • The IntraHB relation is not sufficient to detect matchings across processes. • We introduce the notion of InterHB.

  45. InterHB Relation

  46. Redundancy: POEOPT (Figure: a variant of the slide-43 trace in which R2,1(*) is also a wildcard receive and S3,1(2) is its matching send, with the remaining operations and waits as before.)

  47. Redundancy: POEOPT (Figure: a six-process trace over P0–P5 with three wildcard receives R1,1(*), R2,1(*), R4,1(*), sends S0,1(1), S3,1(2), S5,1(1), S3,3(1), receive R1,3(3), and the corresponding waits; annotated "NO PATH".)

  48. Slack/Buffering Deadlocks • Deadlocks arise only when S0,1 or S1,1 (or both) are buffered.

  49. Buffer All Sends ??? ZERO SLACK

  50. Buffer All Sends ??? ZERO SLACK
