
VIA and Its Extension To TCP/IP Network

Yingping Lu (lu@cs.umn.edu). Based on the paper “Queue Pair IP, …” by Philip Buonadonna.


Presentation Transcript


  1. VIA and Its Extension To TCP/IP Network Yingping Lu (lu@cs.umn.edu) Based on Paper “Queue Pair IP, …” by Philip Buonadonna

  2. Outline • Motivation • VIA Overview • QP/IP Architecture • QP/IP Performance • Summary

  3. Motivation • High-performance computing and clustering applications require a high-throughput, low-latency communication facility • Traditional TCP/IP was not designed for high-throughput, low-latency communication • Application software has not kept pace with the increase in I/O speed • Memory copy • Checksum computation • Interrupts • Context switching

  4. Typical Communication Data Path

  5. Bandwidth Comparison

  6. VIA Solution • VIA is an industry standard developed by Microsoft, Compaq, and Intel • Key features of VIA: • Reduce memory copies (zero-copy) • Direct user-level access to NIC hardware • Eliminate the OS kernel from the critical path • Collapse the ISO/OSI model • Offload CPU processing to an intelligent NIC

  7. VIA Architecture

  8. VIA Components • Consumer • The end entity that uses VIA functions to communicate; can be user-level or kernel-level • Uses VIPL for programming • VI User Agent • Implements the OS-bypass agent • Kernel Agent • Device driver; handles security and OS-related issues • VIA-capable NIC (Channel Adapter) • Implements VIA communications

  9. Programming Abstraction • Queue Pairs • Components • Send queue • Receive queue • Completion queue (status) • Data Movement Operations • Send/Receive • RDMA Read • RDMA Write

  10. Virtual Interface (Queue Pair)

  11. Memory Access • Memory Registration • Memory must be registered before use • The system pins the memory region • The NIC uses DMA to transfer data from memory to the NIC • Memory Protection • Registered memory is associated with a VI consumer and is valid only for that consumer • Gather/Scatter List • Gather list: a list of registered source data buffers (read) • Scatter list: a list of registered destination data buffers (write)
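The registration, protection, and gather/scatter rules on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: `RegistrationTable`, `Region`, the integer handles, and the `(handle, offset, length)` segment format are hypothetical names chosen for illustration and are not part of VIPL, and a `bytearray` stands in for a pinned memory region.

```python
from dataclasses import dataclass


@dataclass
class Region:
    owner: str          # the VI consumer that registered the memory
    buffer: bytearray   # stands in for a pinned memory region


class RegistrationTable:
    """Toy stand-in for the table of registered memory (hypothetical)."""

    def __init__(self):
        self._regions = {}
        self._next_handle = 1

    def register(self, owner, buffer):
        """Record the region and return a handle for later descriptors."""
        handle = self._next_handle
        self._next_handle += 1
        self._regions[handle] = Region(owner, buffer)
        return handle

    def _lookup(self, owner, handle, offset, length):
        region = self._regions.get(handle)
        if region is None:
            raise KeyError(f"handle {handle} is not registered")
        if region.owner != owner:
            # Memory protection: a region is valid only for its own consumer.
            raise PermissionError("region belongs to another consumer")
        if offset < 0 or length < 0 or offset + length > len(region.buffer):
            raise ValueError("segment lies outside the registered region")
        return region.buffer

    def gather(self, owner, segments):
        """Read (handle, offset, length) source segments into one message."""
        out = bytearray()
        for handle, offset, length in segments:
            buf = self._lookup(owner, handle, offset, length)
            out += buf[offset:offset + length]
        return bytes(out)

    def scatter(self, owner, segments, data):
        """Write data across (handle, offset, length) destination segments."""
        pos = 0
        for handle, offset, length in segments:
            buf = self._lookup(owner, handle, offset, length)
            chunk = data[pos:pos + length]
            buf[offset:offset + len(chunk)] = chunk
            pos += len(chunk)
```

A gather list lets one send draw from several registered buffers (for example a header and a payload) without first copying them into a contiguous staging buffer, which is the point of the zero-copy design.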

  12. Memory Model (figure; labels: Virtual Memory Space, Registered Memory Region, Physical Memory, Page 0, Page 1, …, Page n-1)

  13. Descriptor • A work queue element placed into a queue pair (send or receive queue) • Contains a control segment and a list of address segments • Specifies the operation command, memory address, and size
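As a rough picture of what a descriptor carries, the sketch below models the control segment as an operation code and the address segments as a list of (handle, address, length) records. The field names, the `Op` enum, and the `memory_handle` field are assumptions made for illustration; the real VIA descriptor layout is not given on the slide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Op(Enum):
    """Operation command held in the control segment (names are assumed)."""
    SEND = "send"
    RECEIVE = "receive"
    RDMA_READ = "rdma_read"
    RDMA_WRITE = "rdma_write"


@dataclass(frozen=True)
class AddressSegment:
    memory_handle: int   # hypothetical handle from memory registration
    address: int         # start address of the buffer
    length: int          # size of the buffer in bytes


@dataclass
class Descriptor:
    op: Op                                         # control segment
    segments: list = field(default_factory=list)   # address segments

    def total_length(self):
        """Total number of bytes described by all address segments."""
        return sum(seg.length for seg in self.segments)
```

For example, `Descriptor(Op.SEND, [AddressSegment(1, 0x1000, 64), AddressSegment(1, 0x2000, 36)])` describes a single 100-byte send drawn from two buffers.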

  14. Descriptor Door Bell • An asynchronous mechanism to notify the VI NIC of a new work queue post • The doorbell can be a register in the NIC accessed by both the CPU and the NIC (figure; labels: VIPL, VI NIC)
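The asynchronous nature of the doorbell can be sketched as a counter shared by the posting side and the NIC side. This is a simplified simulation, assuming a single work queue and a doorbell that counts unseen posts; `ToyNic`, `post`, and `service` are hypothetical names, and real hardware would expose the doorbell as a memory-mapped register.

```python
from collections import deque


class ToyNic:
    """Simulated VI NIC with one work queue and one doorbell (assumed)."""

    def __init__(self):
        self.work_queue = deque()   # descriptors posted by the consumer
        self.doorbell = 0           # number of posts the NIC has not yet seen
        self.processed = []         # descriptors the NIC has handled

    def post(self, descriptor):
        """Consumer side: enqueue the descriptor, then ring the doorbell."""
        self.work_queue.append(descriptor)
        self.doorbell += 1          # returns immediately; no work happens here

    def service(self):
        """NIC side: handle every post announced so far; return the count."""
        handled = 0
        while self.doorbell > 0:
            self.processed.append(self.work_queue.popleft())
            self.doorbell -= 1
            handled += 1
        return handled
```

Posting only records the work and rings the doorbell; the NIC picks the descriptors up later on its own schedule, so the consumer does not block on the transfer.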

  15. Operation Example - Send/Receive • Sender • Consumer: Register the send buffer; post a Send work queue element • Channel Adapter: Send out the data and header; data is retrieved directly from consumer memory • Receiver • Consumer: Register the receive buffer; post a receive buffer in the receive queue • Channel Adapter: Receive packets from the sender; find a receive queue element in the receive queue; move data directly to the buffer specified in that element
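The send/receive steps above can be walked through in a small in-memory simulation. It is a sketch only: `Endpoint` and `channel_adapter_deliver` are hypothetical names, both channel adapters and the wire are collapsed into one function, and the completion entries are an assumed format.

```python
from collections import deque


class Endpoint:
    """One side of a connection: registered buffers plus a queue pair."""

    def __init__(self):
        self._registered = {}        # id(buffer) -> buffer
        self.send_queue = deque()
        self.recv_queue = deque()
        self.completions = deque()   # assumed (kind, length) entries

    def register(self, buffer):
        self._registered[id(buffer)] = buffer
        return buffer

    def _check(self, buffer):
        if id(buffer) not in self._registered:
            raise ValueError("buffer must be registered before use")

    def post_send(self, buffer, length):
        self._check(buffer)
        if length > len(buffer):
            raise ValueError("length exceeds the send buffer")
        self.send_queue.append((buffer, length))

    def post_recv(self, buffer):
        self._check(buffer)
        self.recv_queue.append(buffer)


def channel_adapter_deliver(sender, receiver):
    """Move one posted send into the oldest posted receive buffer."""
    if not receiver.recv_queue:
        raise RuntimeError("no receive buffer has been posted")
    src, length = sender.send_queue.popleft()
    dst = receiver.recv_queue.popleft()
    if length > len(dst):
        raise ValueError("posted receive buffer is too small")
    dst[:length] = src[:length]      # data lands directly in the posted buffer
    sender.completions.append(("send", length))
    receiver.completions.append(("recv", length))
```

Note that the receiver must post its buffer before the data arrives; in this toy model a send with no matching receive is rejected rather than buffered.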

  16. Operation Example - RDMA Write • Initiator • Consumer: Register the sending buffer address; get the receiver's address; post an RDMA Write • Channel Adapter: Send out data with a header (the operation, receiving address); data is retrieved directly from the sender buffer • Receiver • Consumer: Register the receiving buffer address; send the address, R-key, and length to the initiator • Channel Adapter: Receive data; check the validity of the address in the RDMA header; move data directly to the memory specified in the RDMA header
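The receiver-side validity check is the key difference from Send/Receive, so it is worth a small sketch. The model below assumes that the R-key identifies a registered region and that the check is a simple bounds test against that region; `RdmaTarget`, its method names, and the sequential R-key values are hypothetical and chosen only for illustration.

```python
class RdmaTarget:
    """Receiver-side toy model: registered regions looked up by R-key."""

    def __init__(self):
        self._regions = {}   # r_key -> (base_address, buffer)
        self._next_key = 1

    def register(self, base_address, buffer):
        """Register a region; return (address, r_key, length) to advertise."""
        r_key = self._next_key
        self._next_key += 1
        self._regions[r_key] = (base_address, buffer)
        return base_address, r_key, len(buffer)

    def handle_rdma_write(self, address, r_key, data):
        """Validate the RDMA header, then place data with no posted receive."""
        region = self._regions.get(r_key)
        if region is None:
            raise PermissionError("unknown R-key")
        base, buf = region
        offset = address - base
        if offset < 0 or offset + len(data) > len(buf):
            raise ValueError("address range outside the registered region")
        buf[offset:offset + len(data)] = data
```

Unlike Send/Receive, the target consumer posts no receive descriptor for each transfer; the address carried in the RDMA header decides where the data lands, which is why the bounds and key checks matter.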

  17. Summary of VIA • Goal: low latency and high throughput through direct access to the NIC and zero copy • Architecture components: consumer (VIPL), UA, KA, VI-NIC • Main concepts: queue pairs, memory pinning, gather/scatter, descriptor, doorbell • Operations: Send/Receive, RDMA Read, RDMA Write

  18. Why QP/IP • TCP/IP networks are robust and ubiquitous • However, TCP/IP was not designed for high-performance, low-latency use • The Queue Pair abstraction provides a way to offload CPU processing, shorten the critical data path, and enable zero-copy memory transfers • Integrating QP and IP may reduce latency and improve throughput between end-node applications connected through a TCP/IP network

  19. Challenges to QP/IP • Provide a VIPL supporting QP/IP • Integrate connection setup • Handle message segmentation • Implement TCP/IP mechanisms on the NIC • Handle message boundaries for TCP • Handle zero-copy in the event of packet loss
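The message-boundary challenge arises because TCP delivers a byte stream, while queue pairs exchange discrete messages. One common way to recover boundaries, shown here purely as an illustration and not as the method used by QP/IP, is to prefix each message with its length. The 4-byte big-endian header and the names `frame` and `Deframer` are assumptions.

```python
import struct

HEADER = struct.Struct("!I")   # assumed 4-byte big-endian length prefix


def frame(message):
    """Prefix a message with its length so the receiver can find its end."""
    return HEADER.pack(len(message)) + message


class Deframer:
    """Rebuild whole messages from arbitrarily split stream chunks."""

    def __init__(self):
        self._pending = bytearray()

    def feed(self, chunk):
        """Add received bytes; return every message completed so far."""
        self._pending += chunk
        messages = []
        while len(self._pending) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._pending)
            end = HEADER.size + length
            if len(self._pending) < end:
                break               # the rest of this message has not arrived
            messages.append(bytes(self._pending[HEADER.size:end]))
            del self._pending[:end]
        return messages
```

Feeding the stream in 3-byte chunks still yields the original messages in order, because the deframer buffers partial data until a full length-prefixed message is available.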

  20. QP/IP Architecture

  21. QPIP Components • FSMs: • Doorbell FSM • Sched/XMT FSM • RECV FSM • Mgmt FSM • Major Data Abstractions • QPs • CQs • TCP Control Block (TCB)

  22. QP/IP State Machines

  23. QPIP Prototype • Three components • Application library • PostSend(), PostRecv(), Poll(), Wait() • Kernel driver • Initialization • Address mapping mechanism • Interrupt service • Network interface firmware • Implements the TCP, UDP, and IPv6 protocols

  24. Application-Application RTT

  25. Application Throughput & CPU Utilization

  26. Network Interface Processing Cost

  27. QPIP Based on NBD

  28. NBD Client Throughput and CPU Effectiveness

  29. Summary • Integrates the QP concept from VIA with the ubiquitous TCP/IP network • Provides low latency and high throughput for SANs • QP/IP contains the Doorbell, Sched/XMT, RECV, and Mgmt FSMs, along with the QP, CQ, and TCB data structures • Demonstrates comparable performance and much lower CPU utilization with modest hardware • Programmability also adds flexibility to adapt to the evolution of TCP/IP and to scheduling requirements

  30. Issues • How to integrate TOE into the mechanism? • How to effectively handle message boundaries in TCP to support upper-level applications, e.g. iSCSI? How to handle segmentation? • How to support zero-copy in the case of packet loss? • How to extend this to a WAN environment (more unpredictability, fluctuating latency and available bandwidth, congestion, LFN)? • How to effectively support OSD communication?

  31. Questions?
