
Karma: Scalable Deterministic Record-Replay

Arkaprava Basu, Jayaram Bobba, Mark D. Hill. Work done at University of Wisconsin-Madison.


Presentation Transcript


  1. Karma: Scalable Deterministic Record-Replay • Arkaprava Basu, Jayaram Bobba, Mark D. Hill • Work done at University of Wisconsin-Madison

  2. Executive summary • Applications of deterministic record-replay • Debugging • Fault tolerance • Security • Existing hardware record-replayers • Fast record but • Slow replay or • Require major hardware changes • Karma: Faster replay with nearly-conventional hardware • Extends Rerun • Records more parallelism

  3. Outline • Background & Motivation • Rerun Overview • Karma Insights • Karma Implementation • Evaluation • Conclusion

  4. Deterministic Record-Replay • Multi-threaded execution is non-deterministic • Deterministic record-replay reincarnates a past execution • Record: • Log selected events • Replay: • Use the log to reincarnate the past execution • Key challenge: memory races

  5. Record-Replay Motivation • Debugging • Ensures bugs faithfully reappear (no heisenbugs) • Fault tolerance • Enables a hot backup to shadow the primary server & take over on failure • Security • Real-time intrusion detection & attack analysis • Replay speed matters

  6. Previous work • Record Dependence • Wisconsin Flight Data Recorder [ISCA’03, etc.]: Too much state • UCSD Strata [ASPLOS’06]: Log size grows rapidly with the number of cores • Record Independence • UIUC DeLorean [ISCA’08]: Non-conventional BulkSC hardware • Wisconsin Rerun [ISCA’08]: Sequential replay • Intel MRR [MICRO’09]: Only for snoop-based systems • Timetraveler [ISCA’10]: Extends Rerun to lower log size • Our Goal • Retain Rerun’s near-conventional hardware • Enable faster replay

  7. Outline • Background & Motivation • Rerun Overview • Karma Insights • Karma Implementation • Evaluation • Conclusion

  8. Rerun’s Recording • Most code executes without races • Use race-free regions for ordering • Episodes: independent execution regions • Defined per thread [Figure: load/store sequences of threads T0, T1, and T2 partitioned into episodes; partially adapted from the ISCA’08 talk]

  9. Rerun’s Recording (Contd.) • Capturing causality: • Timestamp via Lamport scalar clock [Lamport ’78] • Replay in timestamp order • Episodes with the same timestamp can be replayed in parallel [Figure: episodes of T0, T1, and T2 labeled with scalar timestamps]

  10. Rerun’s Replay [Figure: episodes of T0, T1, and T2 replayed in timestamp order: TS=22, TS=43, TS=44 (two episodes), TS=45, TS=60, TS=61]
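The timestamp scheme on slides 9 and 10 can be sketched in a few lines. This is a minimal illustration, not Rerun's hardware mechanism: the episode names, the edge list, and the rule of starting every clock at zero are assumptions made for the example (the recorded timestamps in the slides start at different per-thread values).

```python
from collections import defaultdict


def lamport_timestamps(episodes, edges):
    """Assign a scalar timestamp to every episode.

    episodes: episode ids listed so that each one appears after all of
              its predecessors (hypothetical input format).
    edges:    (earlier, later) pairs covering program order and races.
    """
    preds = defaultdict(list)
    for earlier, later in edges:
        preds[later].append(earlier)
    ts = {}
    for ep in episodes:
        # Lamport rule: one more than the largest predecessor timestamp.
        ts[ep] = 1 + max((ts[p] for p in preds[ep]), default=0)
    return ts


def replay_waves(ts):
    """Group episodes by timestamp; each group may replay in parallel."""
    waves = defaultdict(list)
    for ep, t in ts.items():
        waves[t].append(ep)
    return [sorted(waves[t]) for t in sorted(waves)]


# Hypothetical three-thread recording.
episodes = ["T0.a", "T1.a", "T2.a", "T1.b", "T0.b"]
edges = [
    ("T0.a", "T0.b"),  # program order on T0
    ("T1.a", "T1.b"),  # program order on T1
    ("T2.a", "T1.b"),  # cross-thread dependence
    ("T1.b", "T0.b"),  # cross-thread dependence
]
ts = lamport_timestamps(episodes, edges)
for wave in replay_waves(ts):
    print(wave)
```

Replay walks the groups in increasing timestamp order, so an episode waits for every episode with a smaller timestamp, whether or not the two are actually dependent.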

  11. Outline • Background & Motivation • Rerun Overview • Karma Insights • Karma Implementation • Evaluation • Conclusion

  12. Karma’s Insight 1: • Capture order with a DAG (not a scalar clock) • Recording: DAG captured with episode predecessor & successor sets [Figure: episodes of T0, T1, and T2 connected by DAG arcs]

  13. Karma’s Insight 1: [Figure: side-by-side comparison of Karma’s replay and Rerun’s replay for threads T0, T1, and T2]

  14. Karma’s Insight 1: (Contd.) • Naïve approach: DAG arcs point to episodes • Episodes represented by integers • Too much log size overhead! • Our approach: DAG arcs point to cores • Recording: Only one “active” episode per core • Replay: Send wakeup message(s) to core(s) of successor episode(s)
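The wakeup-driven replay described above can be sketched as a small scheduler. For readability this sketch uses the naïve form in which arcs name episodes directly, whereas the slides point arcs at cores; the episode names and the `succ` dictionary format are hypothetical.

```python
def dag_replay_waves(succ):
    """Replay a recorded DAG; returns episodes grouped into parallel steps.

    succ: dict mapping each episode to the list of its successor
          episodes (hypothetical log format, arcs name episodes).
    """
    # Number of wakeup messages each episode still has to receive.
    pending = {ep: 0 for ep in succ}
    for outs in succ.values():
        for s in outs:
            pending[s] = pending.get(s, 0) + 1

    ready = sorted(ep for ep, n in pending.items() if n == 0)
    waves = []
    while ready:
        waves.append(ready)
        woken = []
        for ep in ready:
            for s in succ.get(ep, []):
                pending[s] -= 1  # one wakeup message delivered
                if pending[s] == 0:
                    woken.append(s)  # all predecessors have finished
        ready = sorted(woken)
    return waves


# Hypothetical recording: T0 never interacts with T1 or T2.
succ = {
    "T0.a": ["T0.b"],
    "T0.b": [],
    "T1.a": ["T1.b"],
    "T1.b": [],
    "T2.a": ["T1.b"],
}
for wave in dag_replay_waves(succ):
    print(wave)
```

An episode becomes runnable as soon as its own predecessors finish, so the T0 chain in this example is never held back by T1 or T2.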

  15. Karma’s Insight 1: • Anatomy of a log entry [Figure: example log entry shown as 84 0|0|1 0|0|1 alongside timestamped episodes of T0, T1, and T2]
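One possible reading of the example entry `84 0|0|1 0|0|1` is a reference count followed by two per-core bit vectors. The sketch below encodes that reading only. The field order (REFS, then PRED, then SUCC), the text format, and the convention that the leftmost bit is core 0 are all assumptions, and `LogEntry`, `encode`, `decode`, and `cores` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    refs: int    # assumed: number of memory references in the episode
    pred: tuple  # assumed: one bit per core holding a predecessor episode
    succ: tuple  # assumed: one bit per core holding a successor episode


def encode(entry):
    """Render an entry in the slide's 'REFS p|p|p s|s|s' layout."""
    pred = "|".join(str(b) for b in entry.pred)
    succ = "|".join(str(b) for b in entry.succ)
    return f"{entry.refs} {pred} {succ}"


def decode(text):
    """Parse the layout produced by encode()."""
    refs, pred, succ = text.split()

    def to_bits(field):
        return tuple(int(b) for b in field.split("|"))

    return LogEntry(int(refs), to_bits(pred), to_bits(succ))


def cores(bits):
    """Indices of set bits, assuming the leftmost bit is core 0."""
    return [i for i, b in enumerate(bits) if b]


entry = decode("84 0|0|1 0|0|1")
print(entry.refs, cores(entry.pred), cores(entry.succ))
```

With one bit per core, the size of the predecessor and successor fields depends on the core count rather than on how many episodes have been recorded.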

  16. Karma’s Insight 2: • Not necessary to end the episode on every conflict • As long as the episodes can be ordered during replay [Figure: load/store sequences of threads T0, T1, and T2 with episodes extended past conflicts]

  17. Outline • Background & Motivation • Rerun Overview • Karma Insights • Karma Implementation • Evaluation • Conclusion

  18. Karma Hardware [Figure: block diagram of the base system. Labels: Core 0 to Core 15, each with Pipeline, L1 I, L1 D, and Coherence Controller; L2 0 to L2 15 with Data, Tags, Directory, and Coherence Controller; Interconnect; DRAM; Rerun L2/Memory State; Karma’s Per-Core State: Address Filter (FLT), Reference (REFS), Predecessor (PRED), Successor (SUCC), Timestamp (TS); Total State: 148 bytes/core]

  19. Outline • Background & Motivation • Rerun Overview • Karma Insights • Karma Implementation • Evaluation • Conclusion

  20. Evaluation: • Were we able to speed up the replay?

  21. Evaluation: • Were we able to speed up the replay? • On average, ~4X improvement in replay speed over Rerun

  22. Evaluation • Did we blow up the log size? • On average, Karma does not increase the log size; it reduces it by as much as 40% because it allows larger episodes

  23. Conclusion • Applications of deterministic replay • Debugging • Fault tolerance • Security • Existing hardware record-replayers • Slow replay or • Require major hardware changes • Karma: Faster replay with nearly-conventional hardware • Extends Rerun • Uses a DAG instead of a scalar clock • Extends episodes past conflicts • Wider application + lower cost → more attractive

  24. Questions?
