Networking chips and modules for AI data centers: InfiniBand, Ultra Ethernet, Optical Connections

A growing portion of the billions of dollars being spent on AI data centers will go to the suppliers of networking chips, lasers, and switches that integrate thousands of GPUs and conventional microprocessors into a single AI computer cluster. AI can’t advance without advanced networks, says Nvidia’s networking chief Gilad Shainer: “The network is the most important element because it determines the way the data center will behave.”

Networking chips now account for just 5% to 10% of all AI chip spending, said Broadcom CEO Hock Tan. As AI server clusters grow to 500,000 or even a million processors, Tan expects networking to become 15% to 20% of a data center’s chip budget. A data center with a million or more processors will cost $100 billion to build.

The firms building the biggest AI clusters are the hyperscalers, led by Alphabet’s Google, Amazon.com, Facebook parent Meta Platforms, and Microsoft. Not far behind are Oracle, xAI, Alibaba Group Holding, and ByteDance. Earlier this month, Bloomberg reported that capex for the four leading hyperscalers would exceed $200 billion this year, a year-over-year increase of as much as 50%. Goldman Sachs estimates that AI data center spending will rise another 35% to 40% in 2025. Morgan Stanley expects Amazon and Microsoft to lead the pack with $96.4 billion and $89.9 billion of capex respectively, while Google and Meta will follow at $62.6 billion and $52.3 billion.

AI compute server architectures began scaling in recent years for two reasons:

1. High-end processor chips from Intel neared the end of the speed gains made possible by shrinking a chip’s transistors.

2. Computer scientists at companies such as Google and OpenAI built AI models that performed amazing feats by finding connections within large volumes of training material.

As the parameters of these “Large Language Models” (LLMs) grew to millions, billions, and then trillions, the models began translating languages, doing college homework, handling customer support, and designing cancer drugs. But training an LLM is a huge task: it calculates across billions of data points, rolls those results into new calculations, and then repeats. Even with Nvidia accelerator chips to speed up those calculations, the workload has to be distributed across thousands of Nvidia processors and run for weeks.
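To make the communication pattern concrete, here is a minimal sketch of one data-parallel training step using PyTorch’s collective API. The model, optimizer, and input names are placeholders, and real LLM training layers pipeline and tensor parallelism on top of this, but the core point holds: every step ends with gradients being exchanged across every GPU, which puts the back-end network on the critical path of every iteration.

```python
# Simplified data-parallel training step (illustrative sketch, not production code).
import torch.distributed as dist

def train_step(model, optimizer, inputs, world_size):
    loss = model(inputs)                  # assume the model returns a scalar loss
    loss.backward()                       # compute gradients locally on this GPU
    for p in model.parameters():
        if p.grad is not None:
            # Sum this gradient tensor across all GPUs over the back-end network,
            # then average it; this is the traffic InfiniBand or Ethernet must carry.
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size
    optimizer.step()                      # every GPU applies the same weight update
    optimizer.zero_grad()
```

In practice, frameworks overlap these all-reduces with the backward pass, but the total volume of gradient traffic is unchanged, so interconnect bandwidth directly sets how long training takes.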

To keep up with the distributed computing challenge, AI data centers all have two networks:

  1. The “front-end” network, which sends data to and receives data from external users, like the networks of every enterprise data center or cloud-computing center. It sits at the data center’s outward-facing boundary and typically includes high-end routers, web servers, DNS servers, application servers, load balancers, firewalls, and other devices that connect to the public internet, IP-MPLS VPNs, and private lines.
  2. A “back-end” network that connects every AI processor (GPUs and conventional MPUs) and memory chip with every other processor within the AI data center. “It’s just a supercomputer made of many small processors,” says Ram Velaga, Broadcom’s chief of core switching silicon. “All of these processors have to talk to each other as if they are directly connected.” AI’s back-end networks need high-bandwidth switches and network connections. Delays and congestion are expensive when each Nvidia compute node costs as much as $400,000; idle processors waste money. Back-end networks also carry huge volumes of data. When thousands of processors are exchanging results, the data crossing one of these networks in a second can equal all of the internet traffic in America, as the back-of-the-envelope sketch below illustrates.
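
A rough back-of-the-envelope estimate shows why those volumes get so large. The parameter count, GPU count, and data type below are illustrative assumptions rather than figures from the article, and the sketch assumes plain data parallelism with a ring all-reduce:

```python
# Estimate back-end traffic for one gradient all-reduce (illustrative assumptions).
def allreduce_bytes_per_gpu(num_params: float, num_gpus: int,
                            bytes_per_param: int = 2) -> float:
    """A ring all-reduce moves roughly 2*(N-1)/N times the gradient size per GPU."""
    return 2 * (num_gpus - 1) / num_gpus * num_params * bytes_per_param

num_gpus = 100_000                                  # assumed cluster size
per_gpu = allreduce_bytes_per_gpu(1e12, num_gpus)   # assumed 1-trillion-parameter model, fp16
print(f"{per_gpu / 1e12:.1f} TB per GPU per step")                  # ~4.0 TB
print(f"{per_gpu * num_gpus / 1e15:.0f} PB cluster-wide per step")  # ~400 PB
```

Even spread over a multi-second training step, that is an enormous amount of data on the wire, which is why congestion control and keeping processors busy dominate back-end network design.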

Nvidia became one of today’s largest vendors of network gear through its $6.9 billion acquisition of Israel-based Mellanox in 2020. CEO Jensen Huang and his colleagues realized early on that AI workloads would outgrow a single box, so they adopted InfiniBand, a networking technology designed for scientific supercomputers and supplied by Mellanox. InfiniBand became the de facto standard for AI back-end networks.

While most AI dollars still go to Nvidia GPU accelerator chips, back-end networks are important enough that Nvidia now has a substantial networking business. In the September quarter, those network sales grew 20%, to $3.1 billion. However, Ethernet is now challenging InfiniBand’s lock on AI networks. Fortunately for Nvidia, its Mellanox subsidiary also makes high-speed Ethernet hardware modules; xAI, for example, uses Nvidia Ethernet products in its record-size Colossus system.

While current versions of Ethernet lack InfiniBand’s tools for memory and traffic management, those capabilities are now being added in a version called Ultra Ethernet [Note 1]. Many hyperscalers think Ethernet will outperform InfiniBand as clusters scale to hundreds of thousands of processors. Another attraction is that Ethernet has many competing suppliers. “All the largest guys—with an exception of Microsoft—have moved over to Ethernet,” says an anonymous network industry executive. “And even Microsoft has said that by summer of next year, they’ll move over to Ethernet, too.”

Note 1. Primary goals and mission of the Ultra Ethernet Consortium (UEC): deliver a complete architecture that optimizes Ethernet for high-performance AI and HPC networking, exceeding the performance of today’s specialized technologies. The UEC focuses on functionality, performance, TCO, and developer and end-user friendliness, while limiting changes to those that are strictly required and maintaining Ethernet interoperability. Additional goals include improved bandwidth, latency, tail latency, and scale to match tomorrow’s workloads and compute architectures; backward compatibility with widely deployed APIs; and definition of new APIs that are better optimized for future workloads and compute architectures.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………….

Ethernet back-end networks offer a big opportunity for Arista Networks, which builds switches using Broadcom chips. In the past two years, AI data centers have become an important business for Arista. AI also generates sales for Arista’s switch rivals Cisco and Juniper Networks (soon to be part of Hewlett Packard Enterprise), but those companies aren’t as established among hyperscalers. Analysts expect Arista to book more than $1 billion in AI sales next year and predict that the total market for back-end switches could reach $15 billion within a few years. Three of the five big hyperscale operators are using Arista Ethernet switches in their back-end networks, and the other two are testing them. Arista CEO Jayshree Ullal (a former SCU EECS graduate student of this author, an ex-adjunct professor) says that back-end network sales seem to pull along more orders for front-end gear, too.

The network chips used for AI switching are feats of engineering that rival AI processor chips. Cisco makes its own custom Ethernet switching chips, but some 80% of the chips used in other Ethernet switches come from Broadcom, with the rest supplied mainly by Marvell. These switch chips now move 51 terabits of data a second, as much data as a person would consume by watching video for 200 days straight. Next year, switching speeds will double.
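As a quick sanity check on that comparison, 51 terabits per second works out to roughly 200 days of streaming video if one assumes an HD stream of about 3 megabits per second (an illustrative bitrate, not a figure from the article):

```python
# Check the "200 days of video" analogy, assuming ~3 Mbit/s HD streaming.
switch_bps = 51e12                       # bits per second through one switch chip
video_bps = 3e6                          # assumed HD streaming bitrate
days_of_video = switch_bps / video_bps / 86_400
print(f"{days_of_video:.0f} days of video per second of switching")  # ~197 days
```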

The other important parts of the network are the interconnects between computing nodes and the cables that carry them. As the processor count rises, the number of connections increases at a faster rate: a 25,000-processor cluster needs 75,000 interconnects, while a million processors will need 10 million interconnects. More of those connections will be fiber optic instead of copper or coax, because copper’s reach shrinks as networks speed up. So expanding clusters have to “scale out” by linking their racks with optics. “Once you move beyond a few tens of thousand, or 100,000, processors, you cannot connect anything with copper—you have to connect them with optics,” Velaga says.
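The super-linear growth in interconnects follows from the topology. In a non-blocking folded-Clos (“fat-tree”) network, the total number of links scales roughly as the number of switching tiers times the number of endpoints, and bigger clusters need more tiers. The sketch below assumes 64-port switches and a single network plane; real deployments with multiple rails or planes per GPU need even more links, which is how a million processors can end up needing on the order of 10 million interconnects.

```python
# Rough fat-tree sizing: links ~ tiers x endpoints, and tiers grow with cluster size.
# Radix 64 and a single plane are illustrative assumptions.
def fat_tree_links(num_gpus: int, radix: int = 64) -> int:
    tiers = 2
    while 2 * (radix // 2) ** tiers < num_gpus:   # max endpoints of an n-tier fat-tree
        tiers += 1
    return tiers * num_gpus                       # roughly one link per endpoint per tier

print(fat_tree_links(25_000))      # 75,000 links (3 tiers)
print(fat_tree_links(1_000_000))   # 4,000,000 links (4 tiers, single plane)
```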

AI processing chips (GPUs) exchange data at about 10 times the rate of a general-purpose processor chip. Copper has been the preferred conduit because it’s reliable and requires no extra power. At current network speeds, copper works well at lengths of up to five meters, so hyperscalers have tried to “scale up” within copper’s reach by packing as many processors as they can into each shelf and each rack of shelves.

Back-end connections now run at 400 gigabits per second, equal to about a day and a half of video viewing. Broadcom’s Velaga says network speeds will rise to 800 gigabits in 2025 and 1.6 terabits in 2026.

Nvidia, Broadcom, and Marvell sell optical interface products, with Marvell enjoying a strong lead in 800-gigabit interconnects. A number of companies supply lasers for optical interconnects, including Coherent, Lumentum Holdings, Applied Optoelectronics, and Chinese vendors Innolight and Eoptolink. They will all battle for the AI data center over the next few years.

A 500,000-processor cluster needs at least 750 megawatts, enough to power 500,000 homes. When AI models scale to a million or more processors, they will require gigawatts of power and have to span more than one physical data center, says Velaga.
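That power figure is consistent with a simple estimate, assuming roughly 1.5 kilowatts per accelerator node (including cooling and networking overhead) and about 1.5 kilowatts of average draw per home; both per-unit numbers are assumptions for illustration, not vendor or utility data:

```python
# Back-of-the-envelope check of the 750 MW / 500,000-home comparison.
gpus = 500_000
kw_per_gpu_node = 1.5     # assumed draw per accelerator node, cooling and network included
avg_home_kw = 1.5         # assumed average household draw
cluster_mw = gpus * kw_per_gpu_node / 1_000
print(f"{cluster_mw:.0f} MW")                            # 750 MW
print(f"{cluster_mw * 1_000 / avg_home_kw:,.0f} homes")  # 500,000 homes
```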

The opportunity for optical connections reaches beyond the AI data center, because no single site can obtain enough electrical power for the largest clusters. In September, Marvell, Lumentum, and Coherent demonstrated optical links for data centers as far apart as 300 miles. Nvidia’s next-generation networks will be ready to run a single AI workload across remote locations.

Some worry that AI performance will stop improving as processor counts scale. Nvidia’s Jensen Huang dismissed those concerns on the company’s latest earnings call, saying that clusters of 100,000 processors or more will just be table stakes with Nvidia’s next generation of chips. Broadcom’s Velaga says he is grateful: “Jensen (Nvidia CEO) has created this massive opportunity for all of us.”

References:

https://www.barrons.com/articles/ai-networking-nvidia-cisco-broadcom-arista-bce88c76?mod=hp_WIND_B_1_1  (PAYWALL)

https://www.msn.com/en-us/news/technology/networking-companies-ride-the-ai-wave-it-isn-t-just-nvidia/ar-AA1wJXGa?ocid=BingNewsSerp

https://www.datacenterdynamics.com/en/news/morgan-stanley-hyperscaler-capex-to-reach-300bn-in-2025/

https://ultraethernet.org/ultra-ethernet-specification-update/

Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Canalys & Gartner: AI investments drive growth in cloud infrastructure spending

AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms

AI wave stimulates big tech spending and strong profits, but for how long?

Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029

Using a distributed synchronized fabric for parallel computing workloads- Part I

 

Using a distributed synchronized fabric for parallel computing workloads- Part II

One thought on “Networking chips and modules for AI data centers: InfiniBand, Ultra Ethernet, Optical Connections”

  1. While the Ultra Ethernet Consortium (UEC) is collaborating with the IEEE 802.3 working group (WG) and has a liaison relationship with them, the “Ultra Ethernet” standard is NOT currently being standardized by the IEEE 802.3 WG. Instead, the UEC is developing its own specifications focused on optimizing Ethernet for high-performance AI and HPC workloads, building upon existing IEEE 802.3 standards at the physical layer.
    Current IEEE 802.3 standards address high-speed Ethernet options, but the UEC’s focus on “Ultra Ethernet” indicates a potential need for even higher data rates in the future, which could be addressed by future revisions of the 802.3 standard.
