
Samira Khan

Rethinking System Support for Persistent Memory. Samira Khan.


Presentation Transcript


  1. Rethinking System Support for Persistent Memory Samira Khan

  2. TWO-LEVEL STORAGE MODEL. The CPU accesses MEMORY (DRAM) with Ld/St: volatile, fast, byte-addressable. It accesses STORAGE with file I/O: nonvolatile, slow, block-addressable.

  3. TWO-LEVEL STORAGE MODEL. MEMORY (DRAM, accessed with Ld/St): volatile, fast, byte-addressable. STORAGE (accessed with file I/O): nonvolatile, slow, block-addressable. NVM (PCM, STT-RAM): non-volatile memories combine characteristics of memory and storage.

  4. VISION: UNIFY MEMORY AND STORAGE. The CPU accesses PERSISTENT MEMORY (NVM) with Ld/St. Provides an opportunity to manipulate persistent data directly in memory. Avoids reading and writing back data to/from storage.

  5. CHALLENGE: MEMORY & STORAGE SYSTEM SUPPORT. [Diagram: in the two-level model, APPLICATION over OS/SYSTEM, with Ld/St to MEMORY and FILE I/O to STORAGE; in the unified model, APPLICATION over OS/SYSTEM, with Ld/St to PERSISTENT MEMORY (NVM). System support: crash consistency, availability, compression, integrity check, encryption.] Overhead in the OS/storage layer overshadows the benefit of the nanosecond access latency of NVM.

  6. CHALLENGE: MEMORY & STORAGE SYSTEM SUPPORT. [Diagram: APPLICATION, OS/SYSTEM, Ld/St, FILE I/O, MEMORY, STORAGE, PERSISTENT MEMORY (NVM); system support shown: crash consistency, availability.] In PM, the application layer, not the operating system, is responsible for crash consistency.

  7. CHALLENGE: MEMORY & STORAGE SYSTEM SUPPORT. [Diagram: software (APPLICATION, OS/SYSTEM) and hardware (PERSISTENT MANAGER) layers, with Ld/St to PERSISTENT MEMORY (NVM) and FILE I/O to STORAGE; system support shown: crash consistency, availability, compression, integrity check, encryption.] In PM, hardware, not the operating system, is responsible for much of the system support.

  8. GOAL: END-TO-END SYSTEM FOR PERSISTENT MEMORY. Stack, from software to hardware: PROBLEM, APPLICATION, COMPILER, OS, PERSISTENT MANAGER, ARCHITECTURE, CIRCUITS. The CPU accesses PERSISTENT MEMORY with Ld/St. Questions: How to write consistent code? How to test that the code is correct? How to recover and resume the application and OS? How to provide efficient hardware support? Full-stack support for persistent memory applications.

  9. CURRENT WORKS SPAN THE WHOLE STACK. Stack, from software to hardware: PROBLEM, APPLICATION, COMPILER, OS, PERSISTENT MANAGER, ARCHITECTURE, CIRCUITS. Works: Efficient Persistent Programming (WEED’15); Runtime Consistency Testing (ASPLOS’19); Resumption of the System (submitted to ASPLOS’20); Pre-Execution of System Support (ISCA’19); Efficient Logging Mechanisms (HPCA’18, MICRO’15). Programming and testing techniques for persistent memory applications. Efficient hardware and ISA support for persistent memory.

  11. OUTLINE. Rethinking System Support (non-volatile memory as persistent memory: unified memory and storage). PMTEST: Testing for Correctness (ASPLOS’19). JANUS: Optimizing for Efficiency (ISCA’19). Conclusion.

  12. PERSISTENT MEMORY PROGRAMMING • Support for crash consistency rests on two fundamental guarantees • Durability: writes become persistent in PM • Ordering: one write becomes persistent in PM before another • Durability guarantee: write back data from the cache (e.g., flush A moves A from the core's volatile cache to the persistent PM-DIMM)

  13. PERSISTENT MEMORY PROGRAMMING • Support for crash consistency rests on two fundamental guarantees • Durability: writes become persistent in PM • Ordering: one write becomes persistent in PM before another • Ordering guarantee (write A before B): write back A, barrier, write back B. A then moves from the volatile cache to the persistent PM-DIMM before B does.
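
The writeback, barrier, writeback sequence can be made concrete with a toy simulation. This is only a sketch: `SimPM`, `ordered_writes`, and every method name are hypothetical, and the model is deliberately pessimistic (a writeback that has not been followed by a barrier is treated as lost on a crash, whereas real hardware may or may not have completed it).

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of a volatile cache in front of persistent memory (PM).
// All names are hypothetical; this is a simulation, not real PM code.
class SimPM {
public:
    // A store lands in the volatile cache only.
    void store(const std::string& addr, int value) { cache_[addr] = value; }

    // Analogue of a cache-line writeback: requested here, completed at the
    // next barrier.
    void writeback(const std::string& addr) {
        auto it = cache_.find(addr);
        if (it != cache_.end()) pending_[addr] = it->second;
    }

    // Analogue of a barrier: all requested writebacks become persistent.
    void barrier() {
        for (const auto& kv : pending_) pm_[kv.first] = kv.second;
        pending_.clear();
    }

    // Power failure: cache contents and unfinished writebacks are lost
    // (pessimistic assumption of this model).
    void crash() { cache_.clear(); pending_.clear(); }

    bool persisted(const std::string& addr) const { return pm_.count(addr) != 0; }

    int read_pm(const std::string& addr) const {
        assert(persisted(addr));
        return pm_.at(addr);
    }

private:
    std::map<std::string, int> cache_, pending_, pm_;
};

// Write A before B: write back A, barrier, then write back B.
// crash_before_b_barrier simulates a failure just before B's barrier.
inline SimPM ordered_writes(bool crash_before_b_barrier) {
    SimPM pm;
    pm.store("A", 1);
    pm.writeback("A");
    pm.barrier();                 // A is durable from here on
    pm.store("B", 2);
    pm.writeback("B");
    if (crash_before_b_barrier) { pm.crash(); return pm; }
    pm.barrier();                 // B is durable from here on
    return pm;
}
```

In this model, a crash before B's barrier leaves A in PM and B absent, which is exactly the "A before B" ordering the slide asks for.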

  14. PERSISTENT MEMORY PROGRAMMING. Expert: • Uses low-level primitives • Understands the hardware • Understands the algorithm. Normal: • Uses a high-level interface • Does not need to know details of hardware or algorithm. Two different ways to program persistent applications.

  15. PERSISTENT MEMORY PROGRAMMING (LOW-LEVEL) • Hardware provides low-level primitives for crash consistency • Exposes instructions for cache flush and barriers • sfence, clwb from x86 • dc cvap from ARM • Academic proposals, e.g., ofence, dfence. [Diagram: x86 (clwb, sfence), ARM (dc cvap, dsb), and new instructions, each in front of a PM-DIMM.] [Kiln’13, ThyNVM’15, DPO’16, JUSTDOLogging’16, ATOM’17, HOPS’17, etc.]

  16. PROGRAMMING USING LOW-LEVEL PRIMITIVES
      void listAppend(item_t new_val) {
        node_t* new_node = new node_t(new_val);  // create new_node
        new_node->next = head;                   // update new_node
        head = new_node;                         // update head pointer
        persist_barrier();                       // write back updates
      }
      Writes to PM can reorder: head can reach PM while new_node is still in the cache. new_node is then lost after a failure, leaving an inconsistent linked list.

  17. PROGRAMMING USING LOW-LEVEL PRIMITIVES
      void listAppend(item_t new_val) {
        node_t* new_node = new node_t(new_val);
        new_node->next = head;
        persist_barrier();                       // enforce writeback before changing head
        head = new_node;
        persist_barrier();
      }
      new_node is in PM before head changes in the cache, so the linked list stays consistent. Ensuring crash consistency with low-level primitives is HARD!

  18. PERSISTENT MEMORY PROGRAMMING. Expert: • Uses low-level primitives • Understands the hardware • Understands the algorithm. Normal: • Uses a high-level interface • Does not need to know details of hardware or algorithm.

  19. PERSISTENT MEMORY PROGRAMMING (HIGH-LEVEL) • Libraries provide transactions on top of low-level primitives • Intel’s PMDK • Academic proposals [NV-Heaps’11, Mnemosyne’11, ATLAS’14, REWIND’15, NVL-C’16, NVThreads’17, LSNVMM’17, etc.] Example: AtomicBegin { Append a new node; } AtomicEnd; Uses logging mechanisms to atomically commit the updates.

  20. PROGRAMMING USING TRANSACTIONS
      void ListAppend(item_t new_val) {
        TX_BEGIN {
          node_t *new_node = makeNode(new_val);  // create new_node
          TX_ADD(list.head, sizeof(node_t*));    // back up head
          list.head = new_node;                  // update head
          list.length++;                         // update length
        } TX_END
      }
      length is not backed up before update!

  21. PROGRAMMING USING TRANSACTIONS
      void ListAppend(item_t new_val) {
        TX_BEGIN {
          node_t *new_node = makeNode(new_val);
          TX_ADD(list.head, sizeof(node_t*));
          list.head = new_node;
          TX_ADD(list.length, sizeof(unsigned));  // back up length before update
          list.length++;
        } TX_END
      }
      Ensuring crash consistency with transactions is still HARD!
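
Why the missing backup matters can be shown with a small undo-log sketch. It assumes a hypothetical `UndoLog` class whose `tx_add` plays the role of the slides' `TX_ADD`; it is not PMDK's API, it runs entirely in ordinary memory, and a crash is simulated by calling `abort()` before the commit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical undo log: tx_add() saves the old bytes of a range before it is
// modified; abort() (what recovery would run after a crash) restores them.
class UndoLog {
public:
    void tx_add(void* addr, std::size_t size) {
        assert(addr != nullptr);
        Entry e{addr, std::vector<unsigned char>(size)};
        std::memcpy(e.old_bytes.data(), addr, size);
        entries_.push_back(std::move(e));
    }
    void commit() { entries_.clear(); }
    void abort() {
        // Restore in reverse order so the oldest backup of a range wins.
        for (auto it = entries_.rbegin(); it != entries_.rend(); ++it)
            std::memcpy(it->addr, it->old_bytes.data(), it->old_bytes.size());
        entries_.clear();
    }

private:
    struct Entry { void* addr; std::vector<unsigned char> old_bytes; };
    std::vector<Entry> entries_;
};

struct Node { int value; Node* next; };
struct List { Node* head = nullptr; unsigned length = 0; };

// Appends a node to a one-element list, then simulates a crash before the
// transaction ends. backup_length selects the buggy (false) or fixed (true)
// variant from the slides.
inline List append_then_crash(bool backup_length) {
    static Node old_node{1, nullptr};
    static Node new_node{2, nullptr};
    List list;
    list.head = &old_node;
    list.length = 1;

    UndoLog log;
    new_node.next = list.head;
    log.tx_add(&list.head, sizeof(list.head));        // back up head
    list.head = &new_node;                            // update head
    if (backup_length)
        log.tx_add(&list.length, sizeof(list.length)); // back up length
    list.length++;                                    // update length
    log.abort();                                      // crash: roll back
    return list;
}
```

With `backup_length == false`, recovery restores `head` but leaves `length` at 2 for a one-node list; with `true`, both fields return to their pre-transaction values.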

  22. PERSISTENT MEMORY PROGRAMMING IS HARD. Expert: • Uses low-level primitives • Understands the hardware • Understands the algorithm. Normal: • Uses a high-level interface • Does not need to know details of hardware or algorithm. Both expert and normal programmers can make mistakes.

  23. PERSISTENT MEMORY PROGRAMMING IS HARD. We need a tool to detect crash consistency bugs!

  24. REQUIREMENTS OF THE TOOL. Flexible: • PM libraries [PMDK, NV-Heaps’11, Mnemosyne’11, ATLAS’14, REWIND’15, NVL-C’16, NVThreads’17, LSNVMM’17, etc.] • Kernel modules [PMFS’14, BPFS’09, NOVA’16, NOVA-Fortis’17, Strata’17, SCMFS’11, etc.] • Custom programs, e.g., custom database, key-value store, etc. • Existing HW [x86, ARM, etc.] • Future HW and models [DPO’16, HOPS’17, etc.]. Fast.

  25. Our work: PMTest. Flexible: PM libraries, kernel modules, custom programs, academic proposals, existing HW. Fast: less than 2X overhead in real workloads. PMTest has detected new bugs in PMFS and PMDK applications. Artifact available at pmtest.persistentmemory.org

  26. PMTEST KEY IDEAS: FLEXIBLE • Many different programming models and hardware primitives are available. [Diagram: PM programs calling the Mnemosyne and PMDK libraries, and a PM kernel module, each issuing low-level primitives: write, sfence, clwb on x86; write, dc cvap, dsb on ARM.] The challenge is to support different hardware and software models.

  27. PMTEST KEY IDEAS: FLEXIBLE. Operations that maintain crash consistency are similar: ordering and durability guarantees. [Diagram: PM programs calling the Mnemosyne and PMDK libraries, and a PM kernel module, each issuing low-level primitives: write, sfence, clwb on x86; write, dc cvap, dsb on ARM.] Our key idea is to test for these two fundamental guarantees, which in turn can cover all hardware-software variations.

  28. PMTEST KEY IDEAS: FAST • Prior work [Yat’14] uses exhaustive testing: the n writes between two sfences (e.g., write A, write B, write C) can persist in any of O(n!) orders (A B C, A C B, B A C, B C A, C A B, C B A), and each order is checked: recoverable? Exhaustive testing is time consuming and not practical.
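
To get a feel for the O(n!) growth, the sketch below simply enumerates every order in which n unordered writes could persist. The function name `count_persist_orders` is hypothetical, and the loop body only counts; an exhaustive tester would run a recovery check at that point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Counts every order in which n unordered writes (issued between two fences)
// could reach PM, by enumerating all permutations.
inline std::uint64_t count_persist_orders(int n) {
    assert(n >= 0);
    std::vector<int> writes(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) writes[static_cast<std::size_t>(i)] = i;  // sorted start
    std::uint64_t count = 0;
    do {
        ++count;  // an exhaustive tester would check recoverability here
    } while (std::next_permutation(writes.begin(), writes.end()));
    return count;
}
```

Three writes give the 6 orders listed on the slide; eight writes already give 40320.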

  29. PMTEST KEY IDEAS: FAST • Reduce test time by using only one dynamic trace: a single runtime trace of the persistent memory application (e.g., sfence, write C, write B, write A, ..., sfence) is checked: recoverable? A significant improvement over O(n!) testing.

  30. PMTEST KEY IDEAS: FAST • PMTest infers the persistence interval from the PM operation trace: the interval in which a write can possibly become persistent. Trace: write A, clwb A, sfence, write B, clwb B, sfence. A persists before B. A disjoint interval indicates that no reordering in the hardware will lead to a case where A does not persist before B.

  31. PMTEST KEY IDEAS: FAST • PMTest infers the persistence interval from the PM operation trace: the interval in which a write can possibly become persistent. Trace: write A, write B, clwb A, sfence, clwb B, sfence. The intervals of A and B interleave, so A may NOT persist before B. An overlapping interval indicates that there is a case where A does not persist before B.

  32. PMTEST KEY IDEAS: FAST • PMTest infers the persistence interval from the PM operation trace: the interval in which a write can possibly become persistent. Trace: write A, write B, clwb A, sfence, clwb B, sfence. Query: A persists before B? No. Querying the trace can detect any violation of the ordering and durability guarantees at runtime.
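
The interval check can be sketched in a few lines. This is one simplified reading of the slides, not PMTest's actual implementation: intervals are measured in trace positions, each address is assumed to be written at most once, and all type and function names (`Event`, `infer_intervals`, `persists_before`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <vector>

enum class Op { Write, Clwb, Sfence };
struct Event { Op op; std::string addr; };  // addr is unused for Sfence

struct Interval {
    std::size_t begin;  // trace index of the write
    std::size_t end;    // index of the sfence that completes its clwb
};

// Marks a write whose writeback was never completed by an sfence.
constexpr std::size_t kNever = std::numeric_limits<std::size_t>::max();

// Infers one persistence interval per written address from a single trace.
inline std::map<std::string, Interval> infer_intervals(const std::vector<Event>& trace) {
    std::map<std::string, Interval> intervals;
    std::vector<std::string> flushed;  // clwb issued, waiting for an sfence
    for (std::size_t i = 0; i < trace.size(); ++i) {
        const Event& e = trace[i];
        if (e.op == Op::Write) {
            intervals[e.addr] = Interval{i, kNever};
        } else if (e.op == Op::Clwb) {
            if (intervals.count(e.addr)) flushed.push_back(e.addr);
        } else {  // Sfence closes the interval of every flushed write
            for (const auto& a : flushed) {
                Interval& iv = intervals.at(a);
                if (iv.end == kNever) iv.end = i;
            }
            flushed.clear();
        }
    }
    return intervals;
}

// A is guaranteed to persist before B only if the two intervals are disjoint
// and A's comes first.
inline bool persists_before(const std::map<std::string, Interval>& iv,
                            const std::string& a, const std::string& b) {
    assert(iv.count(a) && iv.count(b));
    return iv.at(a).end < iv.at(b).begin;
}

// The two traces from the slides.
inline std::vector<Event> disjoint_trace() {
    return {{Op::Write, "A"}, {Op::Clwb, "A"}, {Op::Sfence, ""},
            {Op::Write, "B"}, {Op::Clwb, "B"}, {Op::Sfence, ""}};
}
inline std::vector<Event> overlapping_trace() {
    return {{Op::Write, "A"}, {Op::Write, "B"}, {Op::Clwb, "A"}, {Op::Sfence, ""},
            {Op::Clwb, "B"}, {Op::Sfence, ""}};
}
```

For the first trace the query `persists_before(iv, "A", "B")` holds; for the second it does not, because B is written before A's sfence. A single pass over one trace answers the question, instead of replaying every permutation.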

  33. PMTEST OVERVIEW. [Diagram labels: Persistent Memory Application, Testing Annotation, Checking Rules, PMTest, Testing Results; phases: Offline, Online.]

  34. SUMMARY SO FAR • It is hard to guarantee crash consistency in persistent memory applications • Our tool PMTest is fast and flexible • Flexible: Supports kernel modules, custom PM programs, transaction-based programs • Fast: Incurs < 2X overhead in real-workload applications • PMTest has detected 3 new bugs in PMFS and PMDK applications. pmtest.persistentmemory.org

  35. CHALLENGE: MEMORY & STORAGE SYSTEM SUPPORT. [Diagram: software (APPLICATION) and hardware (PERSISTENT MANAGER) layers, with Ld/St to PERSISTENT MEMORY (NVM); system support shown: crash consistency (PMTest), compression, integrity check, encryption.] In PM, hardware, not the operating system, is responsible for much of the system support.

  36. OUTLINE. Rethinking System Support (non-volatile memory as persistent memory: unified memory and storage). PMTEST: Testing for Correctness (ASPLOS’19). JANUS: Optimizing for Efficiency (ISCA’19). Conclusion.

  37. MEMORY AND STORAGE SUPPORT. The memory and storage support is designed for: • Security: prevent attackers from stealing or tampering with data (encryption, integrity verification, etc.) • Bandwidth: improve NVM’s limited bandwidth (deduplication, compression, etc.) • Endurance: extend NVM’s limited lifetime (wear-leveling, error correction, etc.). We refer to the memory and storage support as backend memory operations.

  38. BACKEND MEMORY OPERATION LATENCY. [Diagram: Core, Cache, Memory Controller, NVM. Write access timeline: cache writeback, memory controller, NVM access.]

  39. BACKEND MEMORY OPERATION LATENCY. [Diagram: Core, Cache, Memory Controller, NVM. Write access timeline: cache writeback, memory controller with backend memory operations, NVM access; the timeline is split into a volatile part and a non-volatile part.] Recent NVM support guarantees that writes accepted by the memory controller are non-volatile.

  40. BACKEND MEMORY OPERATION LATENCY. [Diagram: Core, Cache, Memory Controller, NVM. Write access timeline: cache writeback, memory controller with backend memory operations, NVM access; latency labels: ~15 ns and >100 ns; the timeline is split into a volatile part and a non-volatile part.] Latency to persistence.

  41. WHY IS WRITE LATENCY IMPORTANT? • NVM programs need to use crash consistency mechanisms that enforce data writeback (e.g., persist_barrier, which writes data back from the core's volatile cache to the non-volatile NVM).

  42. WRITE LATENCY IN NVM PROGRAMS. Example: steps in an undo logging transaction: Backup, persist_barrier (writeback from cache), Update, Commit. Execution cannot continue until the writeback completes.

  43. WRITE LATENCY IN NVM PROGRAMS. Example: steps in an undo logging transaction: Backup, Update, Commit. The write latency of each step is on the critical path. The crash consistency mechanism puts write latency on the critical path.

  44. WRITE LATENCY IN NVM PROGRAMS. [Timeline: Backup, Update, Commit, shown without and with backend memory operations; with them, each step takes longer.] Backend memory operations increase the writeback latency.

  45. Backend memory operations are on the critical path How to reduce the latency?

  46. OBSERVATION. Each backend memory operation seems indivisible. Integration leads to serialized operations: Counter-mode Encryption, Integrity Verification, Deduplication.

  47. OBSERVATION. However, it is possible to decompose them into sub-operations. Example: counter-mode encryption decomposes into: generate counter; encrypt counter; combine the data with the encrypted counter; generate MAC (for integrity verification).
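
The decomposition can be illustrated with a toy counter-mode sketch. Everything here is an assumption made for illustration: `toy_encrypt` is a simple mixing function standing in for a real block cipher and is NOT secure, the seed layout in `encrypt_counter` is hypothetical, and the data is combined with the encrypted counter by XOR as in standard counter-mode encryption. The point is only the structure: the first two sub-operations need no data.

```cpp
#include <cassert>
#include <cstdint>

// TOY stand-in for a block cipher (a 64-bit mixing function). NOT secure;
// it only marks where a real cipher would be invoked.
inline std::uint64_t toy_encrypt(std::uint64_t key, std::uint64_t block) {
    std::uint64_t x = block ^ key;
    x ^= x >> 33; x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33; x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

struct EncryptedLine { std::uint64_t counter, ciphertext, mac; };

// Sub-operation 1: generate a fresh counter for this write.
inline std::uint64_t generate_counter(std::uint64_t previous) { return previous + 1; }

// Sub-operation 2: encrypt the counter (mixed with the address) into a pad.
// It needs no data, so it could start before the data value is known.
inline std::uint64_t encrypt_counter(std::uint64_t key, std::uint64_t addr,
                                     std::uint64_t counter) {
    return toy_encrypt(key, addr ^ (counter << 32));  // hypothetical seed layout
}

// Sub-operation 3: XOR the data with the encrypted counter.
inline std::uint64_t apply_pad(std::uint64_t data, std::uint64_t pad) { return data ^ pad; }

// Sub-operation 4: generate a MAC for integrity verification (toy keyed hash).
inline std::uint64_t generate_mac(std::uint64_t key, std::uint64_t ciphertext,
                                  std::uint64_t counter) {
    return toy_encrypt(key ^ 0x5bd1e995ULL, ciphertext ^ counter);
}

// The whole write path, expressed as the four sub-operations in sequence.
inline EncryptedLine encrypt_write(std::uint64_t key, std::uint64_t addr,
                                   std::uint64_t prev_counter, std::uint64_t data) {
    const std::uint64_t counter = generate_counter(prev_counter);
    const std::uint64_t pad = encrypt_counter(key, addr, counter);
    const std::uint64_t ciphertext = apply_pad(data, pad);
    return EncryptedLine{counter, ciphertext, generate_mac(key, ciphertext, counter)};
}

// Read path: verify the MAC, then remove the pad.
inline std::uint64_t decrypt_read(std::uint64_t key, std::uint64_t addr,
                                  const EncryptedLine& line) {
    assert(line.mac == generate_mac(key, line.ciphertext, line.counter));
    return apply_pad(line.ciphertext, encrypt_counter(key, addr, line.counter));
}
```

Written this way, only `apply_pad` and `generate_mac` wait for the data; `generate_counter` and `encrypt_counter` depend only on the address and the previous counter.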

  48. KEY IDEA I: PARALLELIZATION After decomposing the example operations: Counter-mode Encryption Integrity Verification Deduplication

  49. KEY IDEA I: PARALLELIZATION. There are two types of dependencies among the sub-operations of Counter-mode Encryption, Integrity Verification, and Deduplication: 1. Intra-operation dependency: dependency within each operation. 2. Inter-operation dependency: dependency across different operations when they cooperate.

  50. KEY IDEA I: PARALLELIZATION. There are two types of dependencies among the sub-operations of Counter-mode Encryption, Integrity Verification, and Deduplication: inter-operation dependency and intra-operation dependency. Sub-operations without dependency can execute in parallel.
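
How much parallelization can help depends on the longest dependency chain. The sketch below computes that chain length for a dependency graph, assuming unit latency per sub-operation. The graph in `example_ops` is hypothetical: the slides' actual dependency diagram did not survive in this transcript, so the edges are chosen only to show one intra-operation and one inter-operation dependency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SubOp {
    std::string name;
    std::vector<int> deps;  // indices of sub-operations that must finish first
};

// Length of the longest dependency chain, counted in sub-operations.
// Assumes unit latency and that deps only point to earlier indices
// (that is, ops is already in topological order).
inline int parallel_steps(const std::vector<SubOp>& ops) {
    std::vector<int> level(ops.size(), 1);
    int longest = 0;
    for (std::size_t i = 0; i < ops.size(); ++i) {
        for (int d : ops[i].deps) {
            assert(d >= 0 && static_cast<std::size_t>(d) < i);
            level[i] = std::max(level[i], level[static_cast<std::size_t>(d)] + 1);
        }
        longest = std::max(longest, level[i]);
    }
    return longest;
}

// Hypothetical graph, for illustration only.
inline std::vector<SubOp> example_ops() {
    return {
        {"generate counter", {}},         // 0: encryption
        {"encrypt counter", {0}},         // 1: intra-operation dependency on 0
        {"XOR data with pad", {1}},       // 2: intra-operation dependency on 1
        {"hash data (dedup)", {}},        // 3: deduplication, independent
        {"look up hash (dedup)", {3}},    // 4: intra-operation dependency on 3
        {"generate MAC", {2}},            // 5: inter-operation, needs ciphertext
    };
}
```

Run serially, the six sub-operations of this example take six steps; with independent sub-operations overlapped, the longest chain is four steps.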
