Rethinking Database Algorithms for Phase Change Memory
This presentation explores the impact of phase change memory (PCM) on database management systems and on the algorithms they rely on. PCM is a byte-addressable, non-volatile memory technology that is denser and more energy efficient than DRAM and far better than NAND flash in read/write latency and endurance. Key topics include the transition from DRAM to PCM in main memory, the challenges posed by PCM writes, and the design of PCM-friendly database algorithms for B+ trees and hash joins. It highlights the potential of PCM to reshape data management practices.
Presentation Transcript
Rethinking Database Algorithms for Phase Change Memory • Shimin Chen, Intel Labs Pittsburgh, shimin.chen@intel.com • Phillip B. Gibbons, Intel Labs Pittsburgh, phillip.b.gibbons@intel.com • Suman Nath, Microsoft Research, sumann@microsoft.com • Presented by: Pradeep Kumar Gali
Outline • What is PCM? • PCM vs. other technologies • Why PCM? • PCM in main memory organization • Challenges with PCM • PCM-friendly DB algorithms • B+ tree index • Hash joins • Conclusion
What is PCM? • Phase change memory: a byte-addressable, non-volatile memory • Amorphous state (0) <=> Crystalline state (1) • SET & RESET the cell
Why PCM? • “how should database systems be modified to best take advantage of this emerging trend towards PCM?” • Non-volatile • Byte-addressable • 2-4X denser than DRAM • More energy efficient than DRAM • Far better than NAND flash in read/write latency and endurance • “PCM will replace DRAM to be in main memory”
PCM in Main Memory Organization • Replace DRAM with PCM • PCM + software-controlled DRAM buffer • PCM + DRAM buffer as transparent hardware cache
Challenges with PCM • Major disadvantage – Writes • High energy consumption • Incur high voltage, high current • High latency and low bandwidth • Longer SET time • Limited number of bits per iteration • Limited endurance • Wear leveling at the memory controller
Challenges with PCM (continued) • Hardware optimizations to reduce writes • “read-modify-write”: read the old contents and write only the bits that changed • Partial writes for only dirty words [Animation: a cache line compared bit by bit against the contents stored in PCM] Animation courtesy: http://www.cidrdb.org/cidr2011/Talks/CIDR11_Chen.pptx
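The two hardware optimizations on this slide can be illustrated with a small sketch. It compares an old and a new cache line and counts how many words are dirty (the unit a partial write would send to PCM) and how many bits actually differ (the unit a read-modify-write would flip). The function name `pcm_write_cost`, the 8-byte word size, and the 64-byte line are assumptions made for illustration; they are not taken from the slides.

```python
WORD_BYTES = 8  # assumed word size, for illustration only


def pcm_write_cost(old_line: bytes, new_line: bytes, word_bytes: int = WORD_BYTES):
    """Return (dirty_words, modified_bits) for overwriting old_line with new_line."""
    if len(old_line) != len(new_line):
        raise ValueError("cache lines must have equal length")
    dirty_words = 0
    modified_bits = 0
    for start in range(0, len(old_line), word_bytes):
        old_word = old_line[start:start + word_bytes]
        new_word = new_line[start:start + word_bytes]
        if old_word != new_word:
            # Partial write: only words that changed are sent to PCM.
            dirty_words += 1
            # Read-modify-write: XOR exposes exactly the bits that differ.
            modified_bits += sum(bin(a ^ b).count("1")
                                 for a, b in zip(old_word, new_word))
    return dirty_words, modified_bits


old = bytes(64)             # a 64-byte cache line of zeros (assumed line size)
new = bytearray(old)
new[0] = 0xFF               # flips 8 bits in word 0
new[17] = 0x01              # flips 1 bit in word 2
print(pcm_write_cost(old, bytes(new)))  # (2, 9)
```

In this example only 2 of the 8 words and 9 of the 512 bits would need to be written, which is the kind of saving the slide refers to.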
PCM-Friendly DB Algorithms • Design goals • Low computational complexity • Good CPU cache performance • Power efficiency (more recent) • Minimize PCM writes (PCM specific) • Algorithm analysis & granularity of writes • Bits • Words • Cache lines • Analytical metrics • Total wear • Energy • Total PCM access latency
B+ Tree Index • B+ tree • Records at leaf nodes • High fan-out • Suitable for file systems • Cache-friendly B+-Tree • Node: one or a few cache lines • Fewer pointers per node • Problem • Writes!! Inserting into or deleting from a sorted node shifts the entries after it [Figure: CSB+ tree of order 1; node of a cache-friendly B+ tree]
B+ Tree Index • Unsorted node with bitmap • Leaf nodes are organized with bitmaps • Unsorted node • Sorted non-leaf nodes and unsorted leaf nodes [Figure: bitmap node with fields bitmap (1011 1010), keys (8 2 9 4 7), pointers; unsorted node with fields num (5), keys (8 2 9 4 7), pointers]
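To make the write savings of the bitmap layout concrete, here is a minimal sketch that inserts the keys from the figure (8, 2, 9, 4, 7) into a sorted leaf and into an unsorted leaf with a bitmap, counting word-sized writes for each. The classes `SortedLeaf` and `BitmapLeaf` and the cost accounting (one word per key slot, one word for the count or bitmap, pointers ignored, no node splits) are simplifying assumptions, not the authors' implementation.

```python
import bisect


class SortedLeaf:
    """Leaf that keeps keys sorted; an insert shifts every later key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []
        self.words_written = 0

    def insert(self, key):
        if len(self.keys) >= self.capacity:
            raise OverflowError("leaf full; splits are not modeled")
        pos = bisect.bisect_left(self.keys, key)
        shifted = len(self.keys) - pos
        # shifted keys + the new key + the count field
        self.words_written += shifted + 2
        self.keys.insert(pos, key)


class BitmapLeaf:
    """Unsorted leaf; a bitmap marks which slots hold valid keys."""

    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.bitmap = 0
        self.words_written = 0

    def insert(self, key):
        for i in range(len(self.slots)):
            if not (self.bitmap >> i) & 1:      # first free slot
                self.slots[i] = key
                self.bitmap |= 1 << i
                self.words_written += 2          # the slot + the bitmap
                return
        raise OverflowError("leaf full; splits are not modeled")

    def search(self, key):
        # Unsorted, so a lookup scans every valid slot.
        return any((self.bitmap >> i) & 1 and self.slots[i] == key
                   for i in range(len(self.slots)))


sorted_leaf, bitmap_leaf = SortedLeaf(8), BitmapLeaf(8)
for k in (8, 2, 9, 4, 7):
    sorted_leaf.insert(k)
    bitmap_leaf.insert(k)
print(sorted_leaf.words_written, bitmap_leaf.words_written)  # 15 10
```

The trade-off is visible in `search`: the unsorted leaf writes less but must scan the node on lookup, which is cheap when a node spans only a few cache lines.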
B+ Tree Index [Figure panels: total wear, energy, execution time] • Unsorted leaf gives the best performance Image courtesy: http://www.cidrdb.org/cidr2011/Talks/CIDR11_Chen.pptx
Hash Join • Simple hash join • Build and probe • Problem: cache misses • Build relation and hash table exceed CPU cache size • Small record size [Figure: hash table built from R in the build phase; S is the probe relation]
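The build and probe steps of a simple hash join can be sketched as follows. This is a generic textbook version under assumed conventions (records are tuples, the join key is the first field, and the function name `simple_hash_join` is hypothetical); it does not model CPU caches or PCM.

```python
from collections import defaultdict


def simple_hash_join(build, probe, key=lambda rec: rec[0]):
    """Join two lists of records on key(rec) using one in-memory hash table."""
    # Build phase: hash every record of the build relation.
    table = defaultdict(list)
    for r in build:
        table[key(r)].append(r)
    # Probe phase: look up each probe record. Each lookup lands at an
    # effectively random spot in the table, which is where cache misses
    # come from once the table outgrows the CPU cache.
    result = []
    for s in probe:
        for r in table.get(key(s), ()):
            result.append((r, s))
    return result


R = [(1, "a"), (2, "b"), (2, "c")]
S = [(2, "x"), (3, "y"), (1, "z")]
print(simple_hash_join(R, S))
# [((2, 'b'), (2, 'x')), ((2, 'c'), (2, 'x')), ((1, 'a'), (1, 'z'))]
```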
Hash Join • Cache partitioning • Hash partitioning • Problem: too many writes! [Figure: the partition phase splits R and S into R1..R4 and S1..S4; the join phase joins each pair of partitions]
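A minimal sketch of cache partitioning shows where the extra writes come from: every record of both relations is copied into a partition before any joining happens. The function names, the default of four partitions (matching the figure), and the `records_written` counter are illustrative assumptions.

```python
def partition(relation, num_partitions, key):
    """Copy each record into the partition chosen by hashing its join key."""
    parts = [[] for _ in range(num_partitions)]
    for rec in relation:
        parts[hash(key(rec)) % num_partitions].append(rec)  # one record copy
    return parts


def cache_partitioned_join(R, S, num_partitions=4, key=lambda rec: rec[0]):
    """Return (join_result, records_written_during_partitioning)."""
    r_parts = partition(R, num_partitions, key)
    s_parts = partition(S, num_partitions, key)
    # Every input record is rewritten once in the partition phase.
    records_written = len(R) + len(S)
    result = []
    # Join phase: each (Ri, Si) pair is small enough to stay cache resident.
    for r_part, s_part in zip(r_parts, s_parts):
        table = {}
        for r in r_part:
            table.setdefault(key(r), []).append(r)
        for s in s_part:
            for r in table.get(key(s), ()):
                result.append((r, s))
    return result, records_written


R = [(1, "a"), (2, "b"), (2, "c")]
S = [(2, "x"), (3, "y"), (1, "z")]
pairs, written = cache_partitioned_join(R, S)
print(sorted(pairs), written)
```

The join phase has good cache behavior, but `records_written` grows with the full size of both inputs, which is the cost the slide flags for PCM.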
Hash Join • Virtual partitioning • Compressed record ID lists* • Advantages • Reduction in writes • Good CPU cache performance [Figure: virtual partitioning of R and S, followed by the join phase] *It is assumed that there is a simple mapping between a record ID and the record location in memory
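Virtual partitioning can be sketched by writing only record IDs, not records, during the partition phase. In the sketch below a record ID is simply the record's index in its list, which stands in for the "simple mapping" the footnote assumes. Delta encoding is used as one possible way to compress an ascending ID list; it is an illustrative choice, and the slides do not specify the compression scheme. All function names are hypothetical.

```python
def delta_encode(ids):
    """Store ascending record IDs as gaps between neighbours."""
    deltas, prev = [], 0
    for rid in ids:
        deltas.append(rid - prev)
        prev = rid
    return deltas


def delta_decode(deltas):
    """Recover the original record IDs from their gaps."""
    ids, current = [], 0
    for d in deltas:
        current += d
        ids.append(current)
    return ids


def virtual_partition(relation, num_partitions, key):
    """Return one compressed record ID list per partition; records stay put."""
    id_lists = [[] for _ in range(num_partitions)]
    for rid, rec in enumerate(relation):
        id_lists[hash(key(rec)) % num_partitions].append(rid)
    return [delta_encode(ids) for ids in id_lists]


def virtual_partitioned_join(R, S, num_partitions=4, key=lambda rec: rec[0]):
    r_lists = virtual_partition(R, num_partitions, key)
    s_lists = virtual_partition(S, num_partitions, key)
    result = []
    for r_deltas, s_deltas in zip(r_lists, s_lists):
        # Join phase: fetch records in place through their IDs.
        table = {}
        for rid in delta_decode(r_deltas):
            table.setdefault(key(R[rid]), []).append(R[rid])
        for sid in delta_decode(s_deltas):
            for r in table.get(key(S[sid]), ()):
                result.append((r, S[sid]))
    return result


R = [(1, "a"), (2, "b"), (2, "c")]
S = [(2, "x"), (3, "y"), (1, "z")]
print(sorted(virtual_partitioned_join(R, S)))
```

Compared with copying whole records, the partition phase here writes only small integers, while each partition's working set is still small in the join phase.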
Hash Join [Figure panels: total wear, energy, execution time] • Best: virtual partitioning • Worst: cache partitioning Image courtesy: http://www.cidrdb.org/cidr2011/Talks/CIDR11_Chen.pptx
Conclusion • High expectations for PCM • Adapting DBMS to PCM • New B+ tree and hash join designs are proposed • Future work • Optimizing PCM writes for different aspects of DBMS • Study fine-grained non-volatility of PCM
PCM Metrics • General Terms • Key PCM metrics