
Heterogeneous Compute Platforms: Data management


Presentation Transcript


  1. Heterogeneous Compute Platforms: Data management
  Dan Tsafrir
  May 2013, ICRI-CI Retreat

  2. Data sharing – the problem
  • Sharing data between heterogeneous devices
    • Oftentimes cumbersome & device-specific
    • In the OS, in apps, or in both
  • Programmers need to address questions like:
    • Can the device work directly on app memory? Or must it have its own copy of the data?
    • Can the device deal with app virtual addresses? Or must the memory be mapped in some other way?
    • Should the memory be pinned before passing it to the device (see the sketch below)? Or can the device withstand I/O page faults, thereby allowing memory overcommitment?
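The pinning question has a concrete, everyday flavor. Below is a minimal userspace sketch (plain Linux C, not from the slides) of the "pin before passing it to the device" answer: mlock() keeps the buffer's pages resident while a hypothetical hand_to_device() call lets the device access them. Real drivers usually pin on the kernel side (e.g., with get_user_pages()) because DMA also needs stable physical addresses, so take this only as an illustration of the programmer-visible step.

    /* Minimal sketch: pin an app buffer before handing it to a device.
     * hand_to_device() is a hypothetical placeholder for a device API. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    #define BUF_SIZE (1 << 20)                /* 1 MiB buffer */

    int main(void)
    {
        void *buf = malloc(BUF_SIZE);
        if (!buf)
            return 1;

        if (mlock(buf, BUF_SIZE) != 0) {      /* pin: keep pages resident */
            perror("mlock");
            return 1;
        }

        /* hand_to_device(buf, BUF_SIZE);        hypothetical device call */

        munlock(buf, BUF_SIZE);               /* unpin once the device is done */
        free(buf);
        return 0;
    }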

  3. Data sharing – goal
  • Big goal
    • Data sharing between heterogeneous PEs should "just work"
    • HW/SW interfaces should keep app programmers mostly ignorant of the details
    • Need to develop interfaces & a runtime layer that
      • Abstract away the details of each device
      • Present to apps a simplified, efficient programming model (a hypothetical sketch follows)
  • Concrete goal
    • Focusing on the MMU and IOMMU
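To make the "simplified programming model" goal more tangible, here is a purely hypothetical sketch of what such a runtime interface could look like from the application's side; none of these names exist in any real library, and the slides do not prescribe this API.

    /* Hypothetical interface sketch: the app registers an ordinary
     * virtual-memory buffer once; the runtime decides per device whether
     * to map it through the IOMMU, copy it, pin it, etc. Illustrative only. */
    #include <stddef.h>

    typedef struct shared_buf shared_buf_t;                  /* opaque runtime handle  */

    shared_buf_t *share_register(void *vaddr, size_t len);   /* register app buffer    */
    int  share_attach(shared_buf_t *buf, int device_id);     /* make usable by device  */
    int  share_detach(shared_buf_t *buf, int device_id);     /* release device mapping */
    void share_unregister(shared_buf_t *buf);                /* drop the registration  */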

  4. Unifying MMU and IOMMU spaces
  Ilya Lesokhin, Muli Ben-Yehuda, Assaf Schuster, Dan Tsafrir

  5. IOMMU in a nutshell
  • IOMMU vs. MMU
    • The IOMMU serves I/O devices that perform DMAs (see the mapping sketch below)
    • Like the MMU serves processes that access virtual memory
  • But
    • No I/O page faults (IOPFs)
    • If the memory isn't there => crash
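For context (not part of the slides): on Linux, a driver creates the translation the device will use via the DMA API, roughly as in the fragment below; with an IOMMU present, dma_map_single() installs an I/O page-table entry for the buffer, and a device access to an IOVA that was never mapped has no recoverable fault path the way a CPU access does. The function name and surrounding variables are illustrative.

    /* Kernel-side fragment (Linux DMA API): obtain a DMA address (an IOVA
     * when an IOMMU is enabled) for 'buf' before programming the device. */
    #include <linux/dma-mapping.h>

    static dma_addr_t map_for_device(struct device *dev, void *buf, size_t len)
    {
        dma_addr_t iova = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

        if (dma_mapping_error(dev, iova))
            return 0;                     /* mapping failed; caller handles */

        /* ...program the device with 'iova', let it DMA, and afterwards:
         *    dma_unmap_single(dev, iova, len, DMA_TO_DEVICE);             */
        return iova;
    }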

  6. No IOPFs – consequences
  • IOMMU management is crippled compared to MMU management
    • Virtual memory must be pre-allocated & pinned to physical memory
    • Can't do memory overcommitment
      • Consider a set of uncooperative VMs with assigned NICs (SR-IOV)
      • Must pin their entire memory images! (see the sketch below)
  • Kernel's MMU & IOMMU management subsystems
    • Developed separately & used differently
    • Causes numerous headaches and performance penalties
      • E.g., can't use an app's virtual memory space to do I/O
  • Thus, to be able to unify them (and get rid of the above drawbacks)
    • We must have IOPFs
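The SR-IOV point can be seen directly in the VFIO interface used for device assignment. The userspace sketch below (assuming a Linux type-1 IOMMU; the group path /dev/vfio/26 and the 16 MiB size are arbitrary examples, and the usual status/version checks are omitted) maps a "guest memory" region for the device. That single VFIO_IOMMU_MAP_DMA call pins every backing page for as long as the mapping exists, which is why the entire memory image of a VM with an assigned NIC becomes unswappable.

    /* Userspace sketch: map (and thereby pin) memory for an assigned device. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <linux/vfio.h>

    int main(void)
    {
        int container = open("/dev/vfio/vfio", O_RDWR);
        int group = open("/dev/vfio/26", O_RDWR);   /* example IOMMU group */
        struct vfio_iommu_type1_dma_map map = { .argsz = sizeof(map) };
        void *mem;

        if (container < 0 || group < 0)
            return 1;
        if (ioctl(group, VFIO_GROUP_SET_CONTAINER, &container) ||
            ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU))
            return 1;

        /* "Guest memory": an anonymous 16 MiB region */
        mem = mmap(NULL, 16 << 20, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (mem == MAP_FAILED)
            return 1;

        map.vaddr = (unsigned long)mem;   /* process virtual address */
        map.iova  = 0;                    /* device-visible address  */
        map.size  = 16 << 20;
        map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;

        /* Pins all 16 MiB until the mapping is removed or the fds close */
        if (ioctl(container, VFIO_IOMMU_MAP_DMA, &map))
            perror("VFIO_IOMMU_MAP_DMA");
        return 0;
    }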

  7. IOPF support – current state of affairs
  • Recently defined industry spec for supporting IOPFs:
    • "PRI" (Page Request Interface)
    • Part of the PCI-SIG ATS (Address Translation Services) specification (enabling sketch below)
  • Bleeding-edge I/O devices do (experimentally) support IOPFs
    • We are working on such experimental NICs
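On the kernel side, Linux already exposes helpers for the ATS/PRI capabilities named above; a driver (or IOMMU driver) that wants a device to take recoverable I/O page faults would enable them roughly as sketched below. This is a fragment under stated assumptions: error handling is trimmed, the request-credit value 32 is arbitrary, and additional IOMMU/PASID setup is needed in practice.

    /* Kernel-side fragment: enable the PCIe ATS and PRI capabilities. */
    #include <linux/pci.h>
    #include <linux/pci-ats.h>

    static int enable_iopf_capabilities(struct pci_dev *pdev)
    {
        int ret = pci_enable_ats(pdev, PAGE_SHIFT); /* device-side translation cache */
        if (ret)
            return ret;

        ret = pci_enable_pri(pdev, 32);             /* outstanding page requests */
        if (ret)
            pci_disable_ats(pdev);
        return ret;
    }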

  8. Research
  • Status
    • Have a working environment
    • Handling send-IOPFs (currently the NIC drops receive-IOPFs)
    • Measured IOPF handling (breakdown into HW and SW components)
  • Next steps
    • Attempt to reduce the overhead
    • Develop a strategy to handle receive-IOPFs (10 Gb/sec => 1.25 MB/ms; see the calculation below)
    • Characterize IOPFs: how often? Performance penalty? Dropped packets?
    • Show that I/O memory-space overcommitment is possible & advantageous
  • Longer term
    • Unify process & I/O address spaces
      • Processes use their VA buffers; the I/O subsystem works directly on them
    • Does the PRI spec make sense? Is it optimal? Could it be improved? How?
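A quick sanity check of the 1.25 MB/ms figure, and of what it implies for receive-IOPF handling, takes only a few lines of arithmetic (the 100 µs fault-service latency below is an assumed, illustrative number, not a measurement from the slides):

    /* Back-of-the-envelope: data arriving while a receive IOPF is serviced. */
    #include <stdio.h>

    int main(void)
    {
        double rate_bps         = 10e9;         /* 10 Gb/s line rate           */
        double bytes_per_ms     = rate_bps / 8 / 1000;
        double fault_latency_ms = 0.1;          /* assumed 100 us IOPF service */

        printf("arriving per ms : %.2f MB\n", bytes_per_ms / 1e6);      /* 1.25 */
        printf("during one IOPF : %.0f KB\n",
               bytes_per_ms * fault_latency_ms / 1e3);                  /* 125  */
        return 0;
    }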

  9. Rethink the IOMMU
  Moshe Malka, Nadav Amit, Dan Tsafrir

  10. IOMMU architected similarly to MMU
  [Figure: multi-level page-table walk of a virtual address, rooted at CR3]
  • Has an IOTLB
  • Upon an IOTLB miss => HW walks the page table (the index arithmetic is sketched below)
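For reference, the walk on an IOTLB (or TLB) miss splits the address into per-level indices and follows one pointer per level from the root register (CR3 on the CPU side; the IOMMU's root/context entry plays the analogous role). The standalone snippet below assumes the common x86-64 4-level, 4 KiB-page layout and only shows the index arithmetic; it is an illustration, not the slides' figure.

    /* Toy illustration: split a 48-bit address into 4-level table indices. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t va = 0x00007f8a12345678ULL;    /* example 48-bit address */

        unsigned idx4 = (va >> 39) & 0x1ff;     /* level-4 (root) index   */
        unsigned idx3 = (va >> 30) & 0x1ff;
        unsigned idx2 = (va >> 21) & 0x1ff;
        unsigned idx1 = (va >> 12) & 0x1ff;     /* level-1 (leaf) index   */
        unsigned off  = (unsigned)(va & 0xfff); /* 4 KiB page offset      */

        /* A miss costs up to four dependent memory reads before the DMA
         * (or load/store) can proceed. */
        printf("indices: %u %u %u %u, offset 0x%x\n",
               idx4, idx3, idx2, idx1, off);
        return 0;
    }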

  11. Does this make sense?
  • We submit that it does not…
  • Specifically, it seems that
    • Since NICs work with rings, IOTLB accesses are completely predictable (more important than for the TLB, because I/O page tables are un-cached)
    • Since NICs map each DMA descriptor just before using it, and unmap it just after (see the sketch below), there is no need for a page-table hierarchy
    • Performance can be greatly improved by redesigning the IOMMU to take advantage of the above
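The map-just-before / unmap-just-after pattern is the standard transmit path in Linux NIC drivers. The fragment below is an illustrative (not real-driver) sketch of it, showing both the short-lived mapping of the buffer referenced by each descriptor and the strictly sequential ring index that makes the resulting IOTLB accesses predictable.

    /* Illustrative TX-ring fragment: per-buffer map before posting, unmap
     * on completion; ring slots are consumed strictly in order. */
    #include <linux/dma-mapping.h>
    #include <linux/errno.h>
    #include <linux/types.h>

    struct tx_desc { dma_addr_t addr; u32 len; };   /* simplified descriptor */

    struct tx_ring {
        struct device  *dev;
        struct tx_desc *desc;      /* descriptor array the NIC reads */
        unsigned int    size;      /* number of entries (power of 2) */
        unsigned int    next;      /* producer index                 */
    };

    static int xmit_one(struct tx_ring *r, void *data, size_t len)
    {
        unsigned int i = r->next & (r->size - 1);   /* sequential ring slot */
        dma_addr_t iova = dma_map_single(r->dev, data, len, DMA_TO_DEVICE);

        if (dma_mapping_error(r->dev, iova))
            return -ENOMEM;

        r->desc[i].addr = iova;       /* NIC will DMA-read from this IOVA */
        r->desc[i].len  = len;
        r->next++;
        /* ...ring the doorbell; on TX completion the driver calls:
         *    dma_unmap_single(r->dev, iova, len, DMA_TO_DEVICE);      */
        return 0;
    }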

  12. Research
  • Status
    • Working hard toward proving all the claims from the previous slide
    • Environment: a KVM/QEMU setup (10 Gb/s NICs) that logs all IOMMU accesses
  • Future
    • Not just NICs (we have reason to believe other I/O devices behave similarly)
    • Reducing overheads for virtualization (vIOMMU)
    • What would be the impact of unifying I/O and process address spaces? (previous project)
