Mapping of scalable RDMA protocols to ASIC/FPGA platforms

Mapping of scalable RDMA protocols to ASIC/FPGA platforms. Yosef Gavriel Tirat-Gefen, PhD Senior Member IEEE Chief Scientist Castel Systems Inc. & Dept. Physics and Astronomy George Mason University Fairfax, VA yosefgavriel@computer.org. Presentation Overview. Motivation

Presentation Transcript

Mapping of scalable RDMA protocols to ASIC/FPGA platforms

Yosef Gavriel Tirat-Gefen, PhD

Senior Member IEEE

Chief Scientist

Castel Systems Inc.

& Dept. Physics and Astronomy

George Mason University

Fairfax, VA

yosefgavriel@computer.org


Presentation Overview

Motivation

TCP Off-loading

Zero-copying

RDMA protocol

RDMA protocol stack

Structure of an RDMA card

Results

Conclusion

Motivation

[Figure: supercomputers or server farms, terabyte storage systems, and a workstation connected through a WAN]

Enabling high-bandwidth WAN applications


Applications

Distributed Command and Control.

Signal processing (e.g. RADAR)

Real-time sharing of intelligence data.

Distributed large-scale computation/simulation of aerospace problems.

Extension of storage area networks over a wide area network (WAN).

Enabling technology for modern supercomputing installations.

Traditional TCP/IP Networking

[Figure: two end hosts, each with the stack Application/O.S., TCP, Layer 3 (IP), Layer 2 (MAC), Layer 1 (PHY), connected through a router that implements Layers 1 to 3]

Standard Data Flow on TCP/IP

[Figure: Application A and Application B, each with its own application memory space and a separate TCP buffer/stack memory space, connected through Layers 1 to 3 over a WAN/LAN]

Standard Data Flow on TCP/IP
  • Traditional TCP/IP copies data from the application memory space to the TCP memory buffer
  • This costs CPU cycles that are lost to buffer copying
  • The CPU is overwhelmed at rates above 2.5 Gbps
  • TCP/IP off-loading helps, but it does not solve the problem on the receiver side
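
The copy overhead listed above can be made concrete with a back-of-the-envelope model. The sketch below is an illustration only. It assumes that the NIC writes each payload byte to host memory once by DMA and that every additional buffer copy reads and writes each byte once. The function name and the cost model are assumptions and do not come from the presentation.

```python
def memory_traffic_gbps(line_rate_gbps: float, copies: int) -> float:
    """Estimate host memory traffic for a receive path.

    Assumed cost model (hypothetical): one DMA write per payload byte,
    plus one read and one write per payload byte for every buffer copy.
    """
    if line_rate_gbps < 0 or copies < 0:
        raise ValueError("line rate and copy count must be non-negative")
    return line_rate_gbps * (1 + 2 * copies)


# One copy from the TCP buffer to the application buffer at 2.5 Gbps.
with_copy = memory_traffic_gbps(2.5, copies=1)
# Direct placement into application memory at the same line rate.
zero_copy = memory_traffic_gbps(2.5, copies=0)
print(with_copy, zero_copy)
```

Under this simplified model a single extra copy triples the memory traffic relative to direct placement, which is one way to see why copying becomes the bottleneck as the line rate grows.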
TCP/IP off-load processing

[Figure: the conventional stack (Application/O.S., TCP, Layer 3 (IP), Layer 2 (MAC), Layer 1 (PHY)) next to the off-loaded stack, in which the Application/O.S. sits on top of a TCP/IP offload processor (TOE) that is mapped to hardware]

Zero-copying and TCP offloading processing

[Figure: block diagram showing the host CPU, host CPU cache memory, host main memory, receive buffer, TOE/NIC card, TCP off-load processor, network buffer, and WAN/LAN]

Zero-copying and TCP offloading processing
  • Zero-copying is still not achieved, because the receive buffer must still be copied into the application memory space
  • TCP/IP off-loading is not scalable
  • RDMA protocols provide a solution
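
To illustrate how RDMA-style direct placement avoids the receive-side copy, here is a minimal toy model. It assumes a tagged-buffer scheme in which each incoming segment names a registered buffer and a byte offset, similar in spirit to the tagged buffers of DDP. The class `ToyRdmaReceiver`, the `Segment` fields, and the tag value are hypothetical names for illustration and are not a real RDMA API or the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    tag: int        # identifies a registered buffer (hypothetical field name)
    offset: int     # byte offset inside that buffer
    payload: bytes


class ToyRdmaReceiver:
    """Toy model of direct data placement; not a real RDMA API."""

    def __init__(self) -> None:
        self._regions = {}  # tag -> writable view of an application buffer

    def register(self, tag: int, buffer: bytearray) -> None:
        # Expose the application buffer itself, not a copy of it.
        self._regions[tag] = memoryview(buffer)

    def place(self, seg: Segment) -> None:
        region = self._regions[seg.tag]
        end = seg.offset + len(seg.payload)
        if seg.offset < 0 or end > len(region):
            raise ValueError("segment falls outside the registered buffer")
        # Write straight into application memory; no staging buffer.
        region[seg.offset:end] = seg.payload


app_buffer = bytearray(8)
receiver = ToyRdmaReceiver()
receiver.register(tag=7, buffer=app_buffer)
# Segments may arrive out of order; offsets make placement order-independent.
receiver.place(Segment(tag=7, offset=4, payload=b"DATA"))
receiver.place(Segment(tag=7, offset=0, payload=b"RDMA"))
print(bytes(app_buffer))
```

Because every segment carries its own destination, the receiver never needs an intermediate TCP receive buffer that must later be copied into application memory.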
RDMA data-flow for WAN applications

[Figure: Host A and Host B, each with a host CPU, host memory containing the application memory space, and an RDMA NIC card; the two RDMA NIC cards are connected over the WAN]

Scalable WAN-RDMA for bandwidths above 10 Gbps

[Figure: RDMA NIC card for WAN. Blocks shown: host, DMA channel, RDMA engine, Tx buffer, Rx buffer, MAC, and PHY, with 10 Gbps links carrying more than 10 Gbps to the WAN]

The RDMA protocol layers and our prototype

ULP (e.g. iSCSI, NFS)

RDMA

DDP

MPA / SCTP

TCP

Layer 3 (e.g. IP)

Layer 2 (MAC)

Layer 1 (PHY)

[Figure annotations beside the stack, from top to bottom: "Running on host CPU", "FPGA implementation", "FPGA and off-the-shelf MAC/PHY chips"]

Overall Hardware/Firmware Organization of the WAN RDMA card

[Figure: block diagram of the card. Blocks shown, from the host side to the line side: PCI-Express/Hyper-transport interface, IP/firmware module, RDMA protocol engine, Rx and Tx memory controllers, SCTP protocol engine, Layer 3 (IP) processor, Rx memory banks, data stream split/join unit, four SAR units, four 10GE/OC-192 framers, and four PHYs]
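
The data stream split/join unit in front of the four SAR blocks suggests that one logical stream is spread over several 10 Gbps lanes and put back together on the other side. The sketch below shows one way this could work. It assumes that SAR stands for segmentation and reassembly, that cells are numbered, and that lanes are filled in round-robin order. The function names, the cell size, and the round-robin policy are assumptions for illustration, not details taken from the presentation.

```python
from typing import List, Tuple

Cell = Tuple[int, bytes]  # (sequence number, payload)


def segment(message: bytes, cell_size: int) -> List[Cell]:
    """Cut a message into numbered cells; the last cell may be shorter."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    offsets = range(0, len(message), cell_size)
    return [(seq, message[i:i + cell_size]) for seq, i in enumerate(offsets)]


def split(cells: List[Cell], lanes: int) -> List[List[Cell]]:
    """Distribute cells over the lanes in round-robin order."""
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    per_lane: List[List[Cell]] = [[] for _ in range(lanes)]
    for seq, payload in cells:
        per_lane[seq % lanes].append((seq, payload))
    return per_lane


def join(per_lane: List[List[Cell]]) -> bytes:
    """Merge all lanes and reassemble the message by sequence number."""
    merged = sorted((cell for lane in per_lane for cell in lane),
                    key=lambda cell: cell[0])
    return b"".join(payload for _, payload in merged)


message = b"scalable RDMA over several 10 Gbps links"
lanes = split(segment(message, cell_size=8), lanes=4)
assert join(lanes) == message
```

Sequence numbers let the join side tolerate lanes that deliver cells at different times, at the cost of a reorder buffer on the receiver.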


Present Results

We currently use Virtex-II/Virtex-II Pro (Xilinx) devices as targets for our cores.

Data indicate that most of the key cores will fit in one FPGA device (Virtex-II).

The aggregate of all cores currently spans several FPGAs.

Inter-device communication is an issue, so the PCB design requires care.

We are currently trying to accommodate most of the cores in one FPGA.

Most of the cores will be made available free-of-charge to researchers in non-profit or government organizations.


Conclusion

The advent of Hyper-transport/PCI-Express and VITA (embedded computing) standards will enable I/O bandwidths above 10 Gbps locally.

Extending the RDMA protocol enables large bandwidths over wide area networks.

The proposed cores will fulfill the natural growth of bandwidth requirements in commercial/defense/aerospace applications.