Line-rate forwarding of 100Gbps Ethernet traffic has long been a huge challenge, if not an outright impossibility, for general-purpose CPUs. However, the major obstacle is not CPU horsepower (modern multicore CPUs have performance to spare); the problem lies in the PCI Express interface itself.
The fastest PCIe Gen3 variant, with 16 full-duplex lanes each running at 8 Gtransfers/sec, moves 128Gbps of raw data in each direction. Due to 128b/130b encoding and other protocol overhead, the realized throughput drops to roughly 100Gbps per direction, which is still good enough for full-duplex 100G Ethernet operation. However, there's currently no FPGA in production that supports a PCIe Gen3 x16 interface. That's a problem for 100G Ethernet designs, and there is now a solution to that problem.
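As a back-of-the-envelope check, the short C program below works through these numbers. The ~20% figure used for TLP-header and flow-control overhead is an assumed round number for illustration, not a measured value from this design:

```c
#include <stdio.h>

int main(void)
{
    /* PCIe Gen3: 8 GT/s per lane, 128b/130b line encoding. */
    const double gt_per_lane  = 8.0;           /* gigatransfers/sec, each direction */
    const int    lanes        = 16;            /* Gen3 x16 */
    const double encoding_eff = 128.0 / 130.0; /* 128b/130b */

    double raw_gbps  = gt_per_lane * lanes;      /* 128 Gbps raw      */
    double line_gbps = raw_gbps * encoding_eff;  /* ~126 Gbps encoded */

    /* Assume TLP headers, flow control, and other protocol overhead
       cost roughly another 20%, leaving ~100 Gbps of usable payload. */
    double payload_gbps = line_gbps * 0.80;

    printf("raw:      %6.1f Gbps per direction\n", raw_gbps);
    printf("post-enc: %6.1f Gbps per direction\n", line_gbps);
    printf("payload:  %6.1f Gbps per direction (approx.)\n", payload_gbps);
    return 0;
}
```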
CESNET and Netcope Technologies have developed an FPGA-based, full-duplex PCIe Gen3 x16 interface (implemented as two bifurcated PCIe Gen3 x8 interfaces) for 100G full-duplex Ethernet designs, using a Virtex-7 H580T 3D FPGA and the appropriate IP instantiated in the FPGA. (Note: You can read about a similar half-duplex design in “Need to get 100G Ethernet data stream into a host Intel CPU? PCIe bifurcation is the answer.”) Here’s a block diagram of the system:
This full-duplex, bifurcated PCIe Gen3 x16 interface didn’t just fall out of a box of standard IP; the final, successful design required some tweaking and optimization. The length and ordering of PCIe transactions were hand-tuned using a PCIe protocol analyzer to achieve optimal performance with Intel Xeon CPUs. Additional PCIe transaction buffers were added to extend the capacity of the standard Xilinx PCIe core and to compensate for long-latency PCIe reads, along with extra transaction-tag space. Finally, eight independent RX ring buffers and eight TX ring buffers were allocated in host RAM so that multiple Xeon CPU cores can work in parallel without any need for inter-core communication (sketched below).
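For illustration, here's a minimal C sketch of what such a per-core ring-buffer layout might look like. The struct names, descriptor fields, and ring depth are hypothetical; the article specifies only that there are eight RX and eight TX rings in RAM, one pair per core, so that cores never have to synchronize with one another:

```c
#include <stdint.h>
#include <stdio.h>

#define NUM_RINGS  8       /* eight RX + eight TX rings, per the design  */
#define RING_SLOTS 1024    /* hypothetical depth; the article gives none */

/* One descriptor in a host-RAM ring shared with the FPGA via DMA.
   Field layout here is illustrative, not the actual Netcope format. */
struct slot {
    uint64_t dma_addr;     /* physical address of the packet buffer */
    uint32_t len;          /* packet length in bytes                */
    uint32_t flags;        /* ownership/status bits                 */
};

/* Single-producer/single-consumer ring: exactly one CPU core (or the
   FPGA DMA engine) advances each index, so no locks are required.   */
struct ring {
    struct slot slots[RING_SLOTS];
    volatile uint32_t head;    /* producer index */
    volatile uint32_t tail;    /* consumer index */
};

static struct ring rx_rings[NUM_RINGS];  /* FPGA writes, one core reads each */
static struct ring tx_rings[NUM_RINGS];  /* one core writes each, FPGA reads */

/* Pin core i to rx_rings[i] and tx_rings[i]. Since no ring is ever
   shared between two cores, the cores never need to synchronize.    */
static struct ring *rx_ring_for_core(unsigned core)
{
    return &rx_rings[core % NUM_RINGS];
}

int main(void)
{
    /* Example: core 3 posts a dummy TX descriptor to its own ring. */
    unsigned core = 3;
    struct ring *tx = &tx_rings[core % NUM_RINGS];

    tx->slots[tx->head % RING_SLOTS] =
        (struct slot){ .dma_addr = 0, .len = 64, .flags = 1 };
    tx->head++;

    printf("core %u -> RX ring %ld, TX head now %u\n",
           core, (long)(rx_ring_for_core(core) - rx_rings), tx->head);
    return 0;
}
```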