The case for and against AI-RAN technology using Nvidia or AMD GPUs

Nvidia is proposing a new approach to telco networks dubbed “AI radio access network (AI-RAN).”  The GPU king says: “Traditional CPU or ASIC-based RAN systems are designed only for RAN use and cannot process AI traffic today. AI-RAN enables a common GPU-based infrastructure that can run both wireless and AI workloads concurrently, turning networks from single-purpose to multi-purpose infrastructures and turning sites from cost-centers to revenue sources. With a strategic investment in the right kind of technology, telcos can leap forward to become the AI grid that facilitates the creation, distribution, and consumption of AI across industries, consumers, and enterprises. This moment in time presents a massive opportunity for telcos to build a fabric for AI training (creation) and AI inferencing (distribution) by repurposing their central and distributed infrastructures.”

One of the first principles of AI-RAN technology is the ability to run RAN and AI workloads concurrently without compromising carrier-grade performance. This multi-tenancy can be achieved in either time or space: dividing resources by time of day or by percentage of compute. It also implies the need for an orchestrator that can provision, de-provision, or shift workloads seamlessly based on available capacity.
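To make the two partitioning schemes concrete, here is a minimal Python sketch of time-based and space-based compute splits. All function names and the specific shares are hypothetical illustrations, not part of any Nvidia or SoftBank API:

```python
# Hypothetical sketch of AI/RAN multi-tenancy policies: GPU compute can be
# split in time (RAN gets priority during busy hours) or in space
# (RAN always keeps a fixed slice of compute).

def time_split(hour, ran_busy_hours=range(8, 22)):
    """Return (ran_share, ai_share) of GPU compute for a given hour of day."""
    if hour in ran_busy_hours:
        return 1.0, 0.0   # busy hours: all compute reserved for RAN
    return 0.3, 0.7       # off-peak: most compute freed up for AI workloads

def space_split(ran_load):
    """Reserve enough compute for current RAN load; AI gets the remainder."""
    ran_share = max(0.2, min(ran_load, 1.0))  # assumed carrier-grade floor of 20%
    return ran_share, 1.0 - ran_share

# Example: at 3 a.m. the RAN is quiet, so AI workloads get most of the GPU.
print(time_split(3))    # (0.3, 0.7)
print(space_split(0.5))
```

An orchestrator, as described above, would evaluate a policy like this continuously and migrate AI jobs on or off the shared servers as RAN demand changes.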

Image Credit:  Pitinan Piyavatin/Alamy Stock Photo

ARC-1, an appliance Nvidia showed off earlier this year, comes with a Grace Blackwell “superchip” that would replace either a traditional vendor’s application-specific integrated circuit (ASIC) or an Intel processor. Ericsson and Nokia are exploring the possibilities with Nvidia. Developing RAN software for use with Nvidia’s chips means acquiring competency in CUDA (Compute Unified Device Architecture), Nvidia’s parallel computing platform and programming model. “They do have to reprofile into CUDA,” said Soma Velayutham, the general manager of Nvidia’s AI and telecom business, during a recent interview with Light Reading. “That is an effort.”

Proof of Concept:

SoftBank has turned the AI-RAN vision into reality, with its successful outdoor field trial in Fujisawa City, Kanagawa, Japan, where NVIDIA-accelerated hardware and NVIDIA Aerial software served as the technical foundation.  That achievement marks multiple steps forward for AI-RAN commercialization and provides real proof points addressing industry requirements on technology feasibility, performance, and monetization:

  • World’s first outdoor 5G AI-RAN field trial running on an NVIDIA-accelerated computing platform. This is an end-to-end solution based on a full-stack, virtual 5G RAN software integrated with 5G core.
  • Carrier-grade virtual RAN performance achieved.
  • AI and RAN multi-tenancy and orchestration achieved.
  • Energy efficiency and economic benefits validated compared to existing benchmarks.
  • A new solution to unlock AI marketplace integrated on an AI-RAN infrastructure.
  • Real-world AI applications showcased, running on an AI-RAN network.

Above all, SoftBank aims to commercially release its own AI-RAN product for worldwide deployment in 2026. To help other mobile network operators start their AI-RAN journey now, SoftBank also plans to offer a reference kit comprising the hardware and software elements required to trial AI-RAN quickly and easily.

SoftBank developed its AI-RAN solution by integrating hardware and software components from NVIDIA and ecosystem partners and hardening them to meet carrier-grade requirements. The solution enables a full 5G vRAN stack that is 100% software-defined, running on NVIDIA GH200 (CPU+GPU), NVIDIA BlueField-3 (NIC/DPU), and Spectrum-X for fronthaul and backhaul networking. It integrates with 20 radio units and a 5G core network and connects 100 mobile UEs.

The core software stack includes the following components:

  • SoftBank-developed and optimized 5G RAN Layer 1 functions such as channel mapping, channel estimation, modulation, and forward-error-correction, using NVIDIA Aerial CUDA-Accelerated-RAN libraries
  • Fujitsu software for Layer 2 functions
  • Red Hat’s OpenShift Container Platform (OCP) as the container virtualization layer, enabling different types of applications to run on the same underlying GPU computing infrastructure
  • A SoftBank-developed E2E AI and RAN orchestrator, to enable seamless provisioning of RAN and AI workloads based on demand and available capacity

AI marketplace solution integrated with SoftBank AI-RAN.  Image Credit: Nvidia

The underlying hardware is the NVIDIA GH200 Grace Hopper Superchip, which can be used in various configurations from distributed to centralized RAN scenarios. This implementation uses multiple GH200 servers in a single rack, serving AI and RAN workloads concurrently, for an aggregated-RAN scenario. This is comparable to deploying multiple traditional RAN base stations.

In this pilot, each GH200 server was able to process 20 5G cells, each with 100 MHz of bandwidth, when used in RAN-only mode. For each cell, 1.3 Gbps of peak downlink throughput was achieved in ideal conditions, and 816 Mbps was demonstrated with carrier-grade availability in the outdoor deployment.
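Those per-cell figures imply a per-server aggregate that follows directly from simple arithmetic on the numbers quoted above:

```python
# Aggregate downlink throughput per GH200 server, from the pilot's figures
cells_per_server = 20
peak_dl_gbps_per_cell = 1.3       # ideal conditions
outdoor_dl_gbps_per_cell = 0.816  # carrier-grade availability, outdoor

peak_aggregate = cells_per_server * peak_dl_gbps_per_cell
outdoor_aggregate = cells_per_server * outdoor_dl_gbps_per_cell

print(f"Peak aggregate downlink:    {peak_aggregate:.1f} Gbps per server")
print(f"Outdoor aggregate downlink: {outdoor_aggregate:.2f} Gbps per server")
```

That is roughly 26 Gbps of peak downlink per server, which supports the comparison to deploying multiple traditional RAN base stations.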


Could AMD GPUs be an alternative to Nvidia for AI-RAN?

AMD is certainly valued by NScale, a UK business with a GPU-as-a-service offering, as an AI alternative to Nvidia. “AMD’s approach is quite interesting,” said David Power, NScale’s chief technology officer. “They have a very open software ecosystem. They integrate very well with common frameworks.” So far, though, AMD has said nothing publicly about any AI-RAN strategy.

The other telco concern is about those promised revenues. Nvidia insists it was conservative when estimating that a telco could realize $5 in inferencing revenues for every $1 invested in AI-RAN. But the numbers met with a fair degree of skepticism in the wider market. Nvidia says the advantage of doing AI inferencing at the edge is that latency, the time a signal takes to travel around the network, would be much lower compared with inferencing in the cloud. But the same case was previously made for hosting other applications at the edge, and they have not taken off.
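Nvidia's claimed ratio is easy to state as arithmetic; whether the revenue side ever materializes is the open question. The capex figure below is a hypothetical illustration, not an Nvidia or operator projection:

```python
# Nvidia's claim: $5 of inferencing revenue per $1 of AI-RAN investment
revenue_per_dollar_invested = 5.0

capex = 100e6  # hypothetical $100M AI-RAN investment (illustrative only)
implied_revenue = capex * revenue_per_dollar_invested

print(f"Implied inferencing revenue: ${implied_revenue / 1e6:.0f}M")
```

The skepticism in the market is precisely about the `revenue_per_dollar_invested` input: the multiplier is Nvidia's estimate, not a demonstrated market outcome.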

Even if AI changes that, it is unclear telcos would stand to benefit. Sales generated by applications on the mobile Internet have gone largely to hyperscalers and other software developers, leaving telcos with a dwindling stream of connectivity revenues. Many telcos also remain unconvinced there is a valid economic case for AI-RAN, especially since GPUs consume a lot of power (they are widely perceived as “energy hogs”). Expect AI-RAN to be a big topic for 2025 as operators carefully weigh their options.

References:

AI-RAN Goes Live and Unlocks a New AI Opportunity for Telcos

https://www.lightreading.com/ai-machine-learning/2025-preview-ai-ran-would-be-a-paradigm-shift

Nvidia bid to reshape 5G needs Ericsson and Nokia buy-in

Softbank goes radio gaga about Nvidia in nervy days for Ericsson

T-Mobile emerging as Nvidia’s big AI cheerleader

AI cloud start-up Vultr valued at $3.5B; Hyperscalers gorge on Nvidia GPUs while AI semiconductor market booms

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

Nvidia enters Data Center Ethernet market with its Spectrum-X networking platform

FT: New benchmarks for Gen AI models; Neocloud groups leverage Nvidia chips to borrow >$11B

Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!
