Explore how Remote Direct Memory Access over IP boosts network performance, minimizes CPU utilization, and eliminates data copying bottlenecks. Learn about the motivation, architecture, protocols, and potential applications of RDMA.
Remote Direct Memory Access (RDMA) over IP
PFLDNet 2003, Geneva
Stephen Bailey, Sandburst Corp., steph@sandburst.com
Allyn Romanow, Cisco Systems, allyn@cisco.com
RDDP Is Coming Soon
"ST [RDMA] Is The Wave Of The Future" – S. Bailey & C. Good, CERN 1999
• Need:
  • standard protocols
  • host software
  • accelerated NICs (RNICs)
  • faster host buses (for > 1 Gb/s)
• Vendors are finally serious: Broadcom, Intel, Agilent, Adaptec, Emulex, Microsoft, IBM, HP (Compaq, Tandem, DEC), Sun, EMC, NetApp, Oracle, Cisco & many, many others
Overview
• Motivation
• Architecture
• Open Issues
CFP: SigComm Workshop
• NICELI, SigComm 2003 Workshop on Network-I/O Convergence: Experience, Lessons, Implications
• http://www.acm.org/sigcomm/sigcomm2003/workshop/niceli/index.html
High Speed Data Transfer
• Bottlenecks
  • Protocol performance
  • Router performance
  • End station performance: host processing, CPU utilization
• The I/O bottleneck
  • Interrupts
  • TCP checksum
  • Copies
What is RDMA?
• Avoids copying by allowing the network adapter, under application control, to steer data directly into application buffers (see the sketch below)
• Bulk data transfer, or kernel bypass for small messages
• Targets grid, cluster, and supercomputing workloads and data centers
• Historically ran on special-purpose fabrics – Fibre Channel, VIA, InfiniBand, Quadrics, ServerNet
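A minimal sketch of the buffer-steering idea, using a hypothetical RNIC API: rnic_register and advertise are invented stand-ins (not from any real library), stubbed here only so the example compiles and runs.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uint32_t stag_t;  /* steering tag naming a registered buffer */

/* Stub standing in for "pin this buffer and map it on the RNIC". */
static stag_t rnic_register(void *buf, size_t len) {
    (void)buf;
    return len ? 42u : 0u;  /* made-up tag value */
}

/* Stub standing in for "tell the peer it may write into (stag, offset)". */
static void advertise(stag_t stag, uint64_t offset, size_t len) {
    printf("advertised stag=%u offset=%llu len=%zu\n",
           (unsigned)stag, (unsigned long long)offset, len);
}

int main(void) {
    size_t len = 64 * 1024;
    char *app_buf = malloc(len);      /* the application's receive buffer */
    if (!app_buf) return 1;

    stag_t stag = rnic_register(app_buf, len);  /* one-time setup cost */
    advertise(stag, 0, len);
    /* From here on, the peer's RDMA Writes would land directly in app_buf:
       no kernel packet buffer, no per-byte copy, no per-packet interrupt
       on the fast path. */
    free(app_buf);
    return 0;
}

The key point is that registration and advertisement happen once, up front; after that, payload lands in the application buffer without per-byte CPU involvement.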
Traditional Data Center
[Diagram: a machine running an application and database connects to the world's servers over Ethernet/IP, to a storage network over Fibre Channel, and to an intermachine network over VIA, InfiniBand, or proprietary fabrics.]
Why RDMA over IP? The Business Case
• TCP/IP is not used for high-bandwidth interconnection – host processing costs are too high
• High-bandwidth transfer will become more prevalent – 10 GE, data centers
• Special-purpose interfaces are expensive
• IP NICs are cheap and ship in volume
The Technical Problem – the I/O Bottleneck
• With TCP/IP, host processing can't keep up with link bandwidth, especially on receive
• Per-byte costs dominate – Clark (1989)
• Well researched by the distributed systems community in the mid-1990s, and confirmed by industry experience
• Memory bandwidth doesn't scale – the processor–memory performance gap – Hennessy (1997), D. Patterson & T. Anderson (1997)
• See the Stream benchmark
Copying
• Using IP transports (TCP & SCTP) requires data copying (illustrated in the snippet below)
[Diagram: data arrives at the NIC, is copied into a kernel packet buffer (copy 1), then copied again into the user buffer (copy 2).]
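A self-contained illustration of the conventional path, assuming nothing beyond POSIX sockets (a socketpair stands in for the network; the point is the kernel-to-user copy that every read incurs):

#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int sv[2];
    char user_buf[4096];                  /* application buffer */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return 1;

    write(sv[0], "payload", 7);           /* stands in for arriving packets */
    ssize_t n = read(sv[1], user_buf, sizeof user_buf);
    /* That read() is a kernel -> user copy of n bytes, on top of the
       NIC -> kernel transfer; at 1 Gb/s these copies are what eat the
       bus and CPU budget. */
    printf("received %zd bytes through the copy path\n", n);
    close(sv[0]);
    close(sv[1]);
    return 0;
}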
Why Is Copying Important?
• Heavy resource consumption at high speed (1 Gb/s and up)
• Uses a large percentage of available CPU
• Uses a large fraction of available bus bandwidth – a minimum of 3 trips across the bus
(Measurement setup: 64 KB window, 64 KB I/Os, 2-processor 600 MHz Pentium III, 9000 B MTU)
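Rough arithmetic behind the bus claim (the bus figure is an era-appropriate assumption, not from the slide): 1 Gb/s of payload is 125 MB/s, so three bus crossings consume 3 × 125 MB/s = 375 MB/s – roughly half the ~800 MB/s peak of a 64-bit, 100 MHz front-side bus on a 600 MHz Pentium III system.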
What's In RDMA For Us?
• Network I/O becomes `free' (though latency remains)
• 1750 machines spending 0% of CPU on I/O do the same work as 2500 machines spending 30% of CPU on I/O
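The machine counts follow directly: 2500 × (1 − 0.30) = 1750, i.e. reclaiming the 30% of CPU spent on I/O lets 1750 machines deliver the useful work of 2500.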
Approaches to Copy Reduction
• On-host: special-purpose software and/or hardware, e.g., zero-copy TCP, page flipping
  • Unreliable, idiosyncratic, expensive
• Memory-to-memory copies, using network protocols to carry placement information
  • Satisfactory experience – Fibre Channel, VIA, ServerNet
  • FOR HARDWARE, not software
RDMA over IP Standardization
• IETF RDDP (Remote Direct Data Placement) WG
  • http://ietf.org/html.charters/rddp-charter.html
• RDMAC (RDMA Consortium)
  • http://www.rdmaconsortium.org/home
RDMA over IP Architecture
[Diagram: protocol stack – ULP on top of RDMA control, on top of DDP, on top of a transport, on top of IP.]
Two layers (placement fields sketched below):
• DDP – Direct Data Placement
• RDMA – control
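The placement information DDP carries can be pictured as a small header. This is a simplified sketch of the tagged-buffer model from the IETF RDDP work; the layout is illustrative, not wire-accurate.

#include <stdint.h>

/* Illustrative only: the tagged-buffer placement fields DDP carries. */
struct ddp_tagged_hdr {
    uint32_t stag;  /* steering tag: names a registered destination buffer */
    uint64_t to;    /* tagged offset: where in that buffer this payload goes */
};
/* Because every segment names its own (stag, to), the receiving NIC can
   place segments directly in application memory, even out of order. */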
Upper and Lower Layers
• ULPs: SDP (Sockets Direct Protocol), iSCSI, MPI
• DAFS is standardized NFSv4 on RDMA
• SDP provides the SOCK_STREAM API (see the example below)
• Runs over a reliable transport – TCP, SCTP
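SDP's selling point is API transparency: ordinary SOCK_STREAM code like the following needs no changes when SDP carries the stream over RDMA underneath. As written this is plain TCP; the port and address are hypothetical examples.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* unchanged under SDP */
    if (fd < 0) return 1;

    struct sockaddr_in peer = {0};
    peer.sin_family = AF_INET;
    peer.sin_port = htons(5001);                       /* example port */
    inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);   /* example address */

    if (connect(fd, (struct sockaddr *)&peer, sizeof peer) == 0) {
        const char msg[] = "unchanged application code";
        write(fd, msg, sizeof msg - 1);
    }
    close(fd);
    return 0;
}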
Open Issues
• Security
• TCP order processing, framing
• Atomic operations
• Ordering constraints – performance vs. predictability
• Other transports: SCTP, TCP, unreliable
• Impact on network & protocol behaviors
• What is the next performance bottleneck?
• What new applications does this enable?
• Does it eliminate the need for large MTUs (jumbo frames)?