BittWare Partner IP Core

RDMA

Low-Latency RoCE v2 at 100Gbps

The GROVF RDMA IP core and host drivers provide a system-level implementation of RDMA over Converged Ethernet (RoCE v2), integrated with the standard Verbs API. The RDMA IP is delivered with a reference design that includes the RDMA IP subsystem, the 100G MAC IP subsystem, the DMA subsystem, host drivers, and an example software application. The host drivers integrate with the standard OFED Verbs API and are compatible with well-known RNIC cards and software. The IP core provides a low-latency FPGA implementation of RoCE v2 at 100 Gb/s throughput.

Key Features

Compatible with RNIC and soft RoCE v2

100 Gb/s throughput, under 2 µs round-trip latency

1023 or more configurable RDMA queue pairs

Features

  • Hardware operated RC, XRC, RD, UC, UD services
  • Incoming and outgoing SEND, RDMA READ, RDMA WRITE
  • Memory protection domains implemented in FPGA
  • Congestion control using ECN
  • 3rd party MAC and DMA IPs
  • Standard Verbs API on Host Machine
  • Dynamic configuration using Verbs API
  • Hardware retransmission and reordering
  • Customizable IP

Enables RNIC use cases with FPGA-based SmartNIC

Block Diagram, Data Sheet and Product Details

Product Operation

The solution is a soft IP implementing the RDMA over Converged Ethernet protocol. It consists of the FPGA IP integrated with MAC and DMA, plus the host CPU drivers. The IP is compatible with BittWare’s IA-840f and IA-420f FPGA cards featuring Altera Agilex 7, and with the XUP-VV8 and XUP-P3R FPGA cards featuring AMD UltraScale+. The solution complies with the Channel Adapter and RoCE v2 requirements stated in the IB specification. The diagram on page 1 shows a simplified architectural overview of the system. The data plane and reliable communication are offloaded to hardware, and the implementation does not include CPU cores in the FPGA.
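For readers new to the wire format, the following Python sketch packs and parses the 12-byte Base Transport Header (BTH) that RoCE v2 carries inside a UDP datagram addressed to port 4791. It is an illustration of the protocol layout only, not part of the GROVF IP or its drivers; the function names `pack_bth` and `unpack_bth` are hypothetical, and the solicited-event, migration, pad-count, and congestion bits are simply left at zero.

```python
import struct

ROCE_V2_UDP_PORT = 4791  # UDP destination port used by RoCE v2
BTH_LEN = 12             # Base Transport Header length in bytes

# A few Reliable Connection (RC) opcodes from the IB specification
RC_SEND_ONLY = 0x04
RC_RDMA_WRITE_ONLY = 0x0A
RC_RDMA_READ_REQUEST = 0x0C


def pack_bth(opcode, dest_qp, psn, pkey=0xFFFF, ack_req=False):
    """Pack a 12-byte BTH (hypothetical helper; optional flag bits stay 0)."""
    if not 0 <= dest_qp < (1 << 24) or not 0 <= psn < (1 << 24):
        raise ValueError("dest_qp and psn are 24-bit fields")
    flags = 0                    # SE(1) | M(1) | PadCnt(2) | TVer(4)
    qp_word = dest_qp            # top byte (FECN, BECN, reserved) left at 0
    psn_word = (int(ack_req) << 31) | psn  # AckReq bit, 7 reserved bits, 24-bit PSN
    return struct.pack("!BBHII", opcode, flags, pkey, qp_word, psn_word)


def unpack_bth(data):
    """Parse the first 12 bytes of a RoCE v2 UDP payload into BTH fields."""
    opcode, _flags, pkey, qp_word, psn_word = struct.unpack("!BBHII", data[:BTH_LEN])
    return {
        "opcode": opcode,
        "pkey": pkey,
        "dest_qp": qp_word & 0xFFFFFF,
        "ack_req": bool(psn_word >> 31),
        "psn": psn_word & 0xFFFFFF,
    }


if __name__ == "__main__":
    header = pack_bth(RC_RDMA_WRITE_ONLY, dest_qp=0x12, psn=100, ack_req=True)
    print(len(header), unpack_bth(header))
```

In this solution the header handling is done in FPGA hardware, so host software normally never builds these fields by hand; the sketch is mainly useful for reading packet captures.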

Detailed Feature List

  • Fully compatible with known RNIC products and soft RoCE implementations (RoCE v2)
  • Under 2.0 µs software-to-software round-trip latency and under 300 ns hardware-to-hardware round-trip latency
  • 100 Gb/s throughput
  • Configurable RDMA queue pairs
    • 1023 or more
  • Hardware retransmission management
  • Memory protection domains implemented in FPGA
  • Congestion control using ECN, PFC
  • Can work with 3rd party MAC and DMA IPs
  • Dynamic configuration using Verbs API
  • Standard Verbs API on host machine user / kernel space
  • Hardware implemented Reliable Connection (RC), Extended Reliable Connection (XRC), Reliable Datagram (RD), Unreliable Connection (UC), and Unreliable Datagram (UD)
  • Incoming and outgoing SEND, RDMA READ, RDMA WRITE (RDMA Atomic is not supported)
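To put the throughput and latency figures above in context, the sketch below works through two back-of-the-envelope calculations: the amount of data that must be in flight to keep a 100 Gb/s link busy across a 2.0 µs round trip, and the payload rate left after per-packet framing. The overhead table assumes IPv4, no VLAN tag, and no extended transport headers, and the payload sizes are assumptions as well; none of these values are measurements of the IP.

```python
LINK_RATE_BPS = 100_000_000_000  # 100 Gb/s, from the feature list
ROUND_TRIP_NS = 2_000            # 2.0 us software-to-software round trip (upper bound)

# Assumed per-packet overhead in bytes (IPv4, no VLAN tag, no extended headers)
OVERHEAD_BYTES = {
    "preamble_sfd": 8,
    "ethernet_header": 14,
    "ipv4_header": 20,
    "udp_header": 8,
    "bth": 12,
    "icrc": 4,
    "ethernet_fcs": 4,
    "inter_frame_gap": 12,
}


def bytes_in_flight(rate_bps, rtt_ns):
    """Bandwidth-delay product: bytes that must be outstanding to keep the link busy."""
    return rate_bps * rtt_ns // (8 * 1_000_000_000)


def goodput_bps(rate_bps, payload_bytes):
    """Payload rate for back-to-back packets carrying payload_bytes each."""
    wire_bytes = payload_bytes + sum(OVERHEAD_BYTES.values())
    return rate_bps * payload_bytes / wire_bytes


if __name__ == "__main__":
    print("bytes in flight:", bytes_in_flight(LINK_RATE_BPS, ROUND_TRIP_NS))
    for payload in (1024, 4096):  # assumed per-packet payload sizes
        gbps = goodput_bps(LINK_RATE_BPS, payload) / 1e9
        print(payload, "B payload:", round(gbps, 2), "Gb/s")
```

Under these assumptions, about 25,000 bytes must be outstanding to cover a 2.0 µs round trip at line rate, which is one reason applications keep several work requests posted per queue pair rather than waiting for each completion.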

Reference Designs

The reference example consists of three parts:

  • Encrypted FPGA IPs, with a reference design, that implement the RDMA protocol
  • Software drivers that provide standard Verbs API support for the FPGA-based RDMA adapter
  • Example application built on top of the Verbs API, demonstrating ping-pong test results for latency and bandwidth
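As a rough illustration of how a ping-pong test turns timings into latency and bandwidth figures, the sketch below times repeated round trips through a caller-supplied `exchange` function and summarizes them. It is not the GROVF example application: `run_pingpong`, `summarize`, and `exchange` are hypothetical names, and in a real test `exchange` would post a send through the Verbs API and wait for the peer's reply, which is not shown here.

```python
import statistics
import time


def run_pingpong(exchange, message, iterations=1000):
    """Time round trips; exchange(message) must block until the reply arrives."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        exchange(message)
        samples.append(time.perf_counter() - start)
    return samples


def summarize(samples, message_bytes):
    """Reduce round-trip samples (in seconds) to latency and bandwidth figures."""
    median_rtt = statistics.median(samples)
    return {
        "median_rtt_us": median_rtt * 1e6,
        "half_rtt_us": median_rtt * 1e6 / 2,
        # Each round trip moves the message once in each direction.
        "bandwidth_gbps": 2 * message_bytes * 8 / median_rtt / 1e9,
    }


if __name__ == "__main__":
    # In-process stand-in for a remote peer; it only exercises the timing loop.
    payload = b"x" * 64
    samples = run_pingpong(lambda msg: bytes(msg), payload, iterations=100)
    print(len(samples), "round trips timed")
```

The median is used rather than the mean so that a few slow iterations, for example from scheduler preemption, do not dominate the reported latency.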

Sample Implementation Results

Device              LUTs    On-Chip Memory
UltraScale+ VU9P    170K    6 Mb
Agilex 7 AGF014     170K    6 Mb
