The 800Gb and beyond connectivity conundrum

Over the last couple of months there has been a lot of noise about an expected boom of 400Gb, 800Gb and 1.6Tb in the next 2-3 years. Yet it seems like only yesterday that we made the jump from 40Gb to 100Gb. It is similar with latency, where requirements have tightened from milliseconds to microseconds, yet the latency in some of my latest projects, related to gateways to the cloud, was still in the millisecond range. And I thought we were quite advanced in these things; was I so wrong, or behind in my assumptions?

 


Figure 1: 2021 Lightcounting study on transceiver speed market growth

I think there are a few aspects currently making noise that need to be put in the right perspective. Indeed, there are advanced requirements around AI, with increased bandwidth demand and low-latency expectations. But is this going to impact every aspect of the data center?
 
First of all, the AI clusters will become the brain of the IT environment, and you will still need the customer-facing applications that run in your DC or cloud environment. Secondly, not all applications have a need for AI; the application responsible for paying your wages once a month, for example, does not necessarily have to be AI driven. Third, there is the difference between training and inference, where the number of AI clusters needed to train models is a multiple of the hardware needed to apply the inference. All of this will impact the amount of AI hardware needed, so where do the sudden expectations of massive 800G-and-beyond transceiver sales over the next couple of years come from?
 
 
Mainly from the C2C (compute-to-compute) interconnects. These are the scale-up links that interconnect the GPUs with each other and form a cluster in which each GPU is connected to the other GPUs of the nodes. These connections need large bandwidth and low latency, as this increases the training efficiency of the AI. So an 18-node AI cluster with 4 GPUs per node will need 648 connections, which explains the disproportionate increase in very high bandwidth transceivers. In comparison, the requirements for frontend, backend and OoBM links are lower in number, bandwidth and latency.
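As a rough sanity check of that link count, here is a minimal sketch in Python. It assumes a scale-up fabric in which every GPU connects once to each switch tray, and it assumes 9 switch trays (similar to a GB200 NVL72-style rack); neither assumption is stated in the article, so treat the topology as illustrative only.

```python
# Rough link-count arithmetic for the scale-up (C2C) fabric described above.
# Assumption (not stated in the article): each GPU connects once to each of
# 9 scale-up switch trays, a topology similar to an NVL72-style rack.

NODES = 18            # compute trays in the cluster
GPUS_PER_NODE = 4     # GPUs per compute tray
SWITCH_TRAYS = 9      # assumed number of scale-up switch trays (hypothetical)

gpus = NODES * GPUS_PER_NODE       # 72 GPUs in total
c2c_links = gpus * SWITCH_TRAYS    # one link from every GPU to every tray

print(f"{gpus} GPUs -> {c2c_links} C2C links")   # 72 GPUs -> 648 C2C links
```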
 
 

Source: amax.com

How does this impact the structured cabling in the DC? The current trend, mainly driven by NVIDIA, is that the intra-rack C2C links will be supported over Active Copper Cable (ACC) to ensure 800G and above is supported at the lowest latency, power and cost. The cost and power requirements of optical transceivers for 800G and above are at this point a multiple of those of ACC solutions; the disadvantage of ACC is that it only supports short links. For inter-rack C2C links the industry is leaning more toward AOC cables from a cost and power-usage perspective. This means that the C2C cabling will be part of the AI cluster solution and will not hit your structured cabling. The frontend connectivity will be in the order of 100GbE to 200GbE resilient connections per compute tray, most likely using parallel MMF for distances of less than 100m and SMF for longer distances. For the backend connectivity, used to scale out the AI cluster, we are looking more at 800Gb over InfiniBand with 2 redundant connections per compute tray. These will most likely be supported over a ToR topology to allow the use of more cost- and power-efficient PCC (Passive Copper Cable) and ACC DAC cables, and only the 800G or 1.6T spine-leaf links would go over SMF and structured cabling.
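To summarize those media choices, here is a small sketch in Python. The link-class labels, the 100m threshold and the media names simply restate the guidance above; the helper name `pick_medium` and the class names themselves are purely illustrative, not an industry convention.

```python
# Simplified media selection for AI-cluster links, restating the guidance above.
# The function name and the link-class labels are illustrative, not a standard.

def pick_medium(link_class: str, reach_m: float = 0.0) -> str:
    """Return the cabling medium suggested above for a given link class."""
    if link_class == "c2c_intra_rack":   # scale-up GPU links inside the rack
        return "ACC (Active Copper Cable)"
    if link_class == "c2c_inter_rack":   # scale-up links between racks
        return "AOC (Active Optical Cable)"
    if link_class == "frontend":         # 100-200GbE per compute tray
        return "parallel MMF" if reach_m < 100 else "SMF"
    if link_class == "backend_tor":      # 800Gb InfiniBand to the ToR switch
        return "PCC/ACC DAC"
    if link_class == "spine_leaf":       # 800G/1.6T links over structured cabling
        return "SMF"
    raise ValueError(f"unknown link class: {link_class}")

print(pick_medium("frontend", reach_m=30))   # -> parallel MMF
print(pick_medium("spine_leaf"))             # -> SMF
```

Only the last category lands on the structured cabling plant; everything else ships as part of the AI cluster solution, which is the point the paragraph above makes.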
 

So yes, you have to start thinking in your structured cabling design about how to migrate to 800Gb and above, but don't expect a shock change in the next 2 years; this will probably follow the same time trajectory as the previous generations. Which structured cabling supports which data center speeds, what an AI network architecture looks like, and much more will be covered in our DC Handbook, which we will release in a few weeks.
