Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?

An increasing focus on deploying AI into radio access networks (RANs) was among the key findings of NVIDIA's third annual "State of AI in Telecommunications" survey, which polled more than 450 telecommunications professionals worldwide. More than a third of respondents indicated they are investing or planning to invest in AI-RAN. The survey revealed continued momentum for AI adoption, including growth in generative AI use cases, and showed how the technology is helping operators optimize customer experiences and increase employee productivity. The percentage of network operators planning to use open source tools rose from 28% in 2023 to 40% in 2025. AvidThink Founder and Principal Roy Chua said one of the biggest challenges network operators will face when using open source models is vetting the outputs they get during training.

Of the telecommunications professionals surveyed, almost all stated that their company is actively deploying or assessing AI projects. Here are some top insights on impact and use cases:

  • 84% said AI is helping to increase their company’s annual revenue
  • 77% said AI helped reduce annual operating costs
  • 60% said increased employee productivity was their biggest benefit from AI
  • 44% said they’re investing in AI for customer experience optimization, which is the No. 1 area of investment for AI in telecommunications
  • 40% said they’re deploying AI into their network planning and operations, including RAN

The percentage of respondents who indicated they will build AI solutions in-house rose from 27% in 2024 to 37% this year.  “Telcos are really looking to do more of this work themselves,” Nvidia’s Global Head of Business Development for Telco Chris Penrose [1.] said. “They’re seeing the importance of them taking control and ownership of becoming an AI center of excellence, of doing more of the training of their own resources.”

With respect to AI inferencing, Penrose said, "We've got 14 publicly announced telcos that are doing this today, and we've got an equally big funnel." He noted that the AI skills gap remains the biggest hurdle for operators. Why? Because, as he put it, just because someone is an AI scientist doesn't mean they are necessarily a generative AI or agentic AI scientist. And to attract the right talent, operators need to demonstrate that they have the infrastructure (GPUs, data center facilities and the like) that will allow top-tier employees to do amazing work.

Note 1.  Penrose represented AT&T’s IoT business for years at various industry trade shows and events before leaving the company in 2020.

Rather than running in the large data centers that process AI large language models (LLMs), AI inferencing could be done more quickly at smaller "edge" facilities that are closer to end users. That's where telecom operators might step in. "Telcos are in a unique position," Penrose told Light Reading. He explained that many countries want to ensure that their AI data and operations remain inside the boundaries of that country. Thus, telcos can be "the trusted providers of [AI] infrastructure in their nations."

“We’ll call it AI RAN-ready infrastructure. You can make money on it today. You can use it for your own operations. You can use it to go drive some services into the market. … Ultimately your network itself becomes a key anchor workload,” Penrose said.

Source: Skorzewiak/Alamy Stock Photo

Nvidia proposes that network operators can not only run their own AI workloads on Nvidia GPUs, they can also sell those inferencing services to third parties and make a profit by doing so.  “We’ve got lots of indications that many [telcos] are having success, and have not only deployed their first [AI compute] clusters, but are making reinvestments to deploy additional compute in their markets,” Penrose added.

Nvidia specifically pointed to AI inferencing announcements by Singtel, Swisscom, Telenor, Indosat and SoftBank.

Other vendors are hoping for similar sales.  “I think this vision of edge computing becoming AI inferencing at the end of the network is massive for us,” HPE boss Antonio Neri said last year, in discussing HPE’s $14 billion bid for Juniper Networks.

That comes after multi-access edge computing (MEC) has failed to live up to its potential, partly because it requires a 5G standalone (SA) core network, and few of those have been commercially deployed. Edge computing disillusionment is clear among hyperscalers and network operators alike. For example, Cox folded its edge computing business into its private networks operation. AT&T no longer discusses the edge computing locations it was building with Microsoft and Google. And Verizon has admitted to edge computing "miscalculations."

Will AI inferencing be the savior for MEC? The jury is still out. However, Nvidia said that 40% of its revenues already come from AI inferencing. Presumably most of that inferencing is happening in larger data centers and then being delivered to nearby users. In other words, a significant amount of inferencing is being done today without the additional facilities, distributed at a network's edge, that could enable speedier, low-latency AI services.

“The idea that AI inferencing is going to be all about low-latency connections, and hence stuff like AI RAN and MEC and assorted other edge computing concepts, doesn’t seem to be a really good fit with the current main direction of AI applications and models,” argued Disruptive Wireless analyst Dean Bubley in a LinkedIn post.

References:

https://blogs.nvidia.com/blog/ai-telcos-survey-2025/

State of AI in Telecommunications

https://www.lightreading.com/ai-machine-learning/telcos-profiting-from-ai-inferencing-we-ve-been-here-before

https://www.fierce-network.com/premium/whitepaper/edge-computing-powered-global-ai-inference

https://www.fierce-network.com/cloud/are-ai-services-telcos-magic-revenue-bullet

The case for and against AI-RAN technology using Nvidia or AMD GPUs

Ericsson’s sales rose for the first time in 8 quarters; mobile networks need an AI boost

AI RAN Alliance selects Alex Choi as Chairman

Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029

AI sparks huge increase in U.S. energy consumption and is straining the power grid; transmission/distribution as a major problem

Tata Consultancy Services: Critical role of Gen AI in 5G; 5G private networks and enterprise use cases

The case for and against AI-RAN technology using Nvidia or AMD GPUs

Nvidia is proposing a new approach to telco networks dubbed “AI radio access network (AI-RAN).”  The GPU king says: “Traditional CPU or ASIC-based RAN systems are designed only for RAN use and cannot process AI traffic today. AI-RAN enables a common GPU-based infrastructure that can run both wireless and AI workloads concurrently, turning networks from single-purpose to multi-purpose infrastructures and turning sites from cost-centers to revenue sources. With a strategic investment in the right kind of technology, telcos can leap forward to become the AI grid that facilitates the creation, distribution, and consumption of AI across industries, consumers, and enterprises. This moment in time presents a massive opportunity for telcos to build a fabric for AI training (creation) and AI inferencing (distribution) by repurposing their central and distributed infrastructures.”

One of the first principles of AI-RAN technology is to be able to run RAN and AI workloads concurrently and without compromising carrier-grade performance. This multi-tenancy can be either in time or space: dividing the resources based on time of day or based on percentage of compute. This also implies the need for an orchestrator that can provision, de-provision, or shift workloads seamlessly based on available capacity.
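The time-and-space multi-tenancy described above can be sketched as a toy scheduler. This is an illustrative Python sketch, not Nvidia's or SoftBank's actual orchestrator: the workload names, compute shares and busy-hour window are assumptions, and a real orchestrator would act on live telemetry rather than static demand figures.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    kind: str      # "ran" or "ai"
    demand: float  # fraction of GPU compute requested (0..1)

def allocate(workloads, hour, ran_share_peak=0.7, ran_share_offpeak=0.3):
    """Split one GPU's compute between RAN and AI workloads.

    Time multi-tenancy: RAN gets a larger share during busy hours
    (assumed here to be 08:00-22:00); AI soaks up the remainder.
    Space multi-tenancy: within each share, workloads are admitted
    until that share's compute budget is exhausted.
    """
    ran_share = ran_share_peak if 8 <= hour < 22 else ran_share_offpeak
    budgets = {"ran": ran_share, "ai": 1.0 - ran_share}
    admitted, deferred = [], []
    # Admit RAN workloads first to protect carrier-grade performance.
    for w in sorted(workloads, key=lambda w: w.kind != "ran"):
        if w.demand <= budgets[w.kind]:
            budgets[w.kind] -= w.demand
            admitted.append(w.name)
        else:
            deferred.append(w.name)
    return admitted, deferred
```

At midday, a cell cluster demanding 60% of the GPU is admitted under the 70% RAN share, while AI jobs compete for the remaining 30%; overnight the split reverses and AI inferencing gets the larger budget.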

Image Credit:  Pitinan Piyavatin/Alamy Stock Photo

ARC-1, an appliance Nvidia showed off earlier this year, comes with a Grace Blackwell “superchip” that would replace either a traditional vendor’s application-specific integrated circuit (ASIC) or an Intel processor. Ericsson and Nokia are exploring the possibilities with Nvidia.  Developing RAN software for use with Nvidia’s chips means acquiring competency in CUDA (compute unified device architecture), Nvidia’s parallel computing platform and programming model. “They do have to reprofile into CUDA,” said Soma Velayutham, the general manager of Nvidia’s AI and telecom business, during a recent interview with Light Reading. “That is an effort.”

Proof of Concept:

SoftBank has turned the AI-RAN vision into reality, with its successful outdoor field trial in Fujisawa City, Kanagawa, Japan, where NVIDIA-accelerated hardware and NVIDIA Aerial software served as the technical foundation.  That achievement marks multiple steps forward for AI-RAN commercialization and provides real proof points addressing industry requirements on technology feasibility, performance, and monetization:

  • World’s first outdoor 5G AI-RAN field trial running on an NVIDIA-accelerated computing platform. This is an end-to-end solution based on a full-stack, virtual 5G RAN software integrated with 5G core.
  • Carrier-grade virtual RAN performance achieved.
  • AI and RAN multi-tenancy and orchestration achieved.
  • Energy efficiency and economic benefits validated compared to existing benchmarks.
  • A new solution to unlock AI marketplace integrated on an AI-RAN infrastructure.
  • Real-world AI applications showcased, running on an AI-RAN network.

Above all, SoftBank aims to commercially release its own AI-RAN product for worldwide deployment in 2026. To help other mobile network operators get started on their AI-RAN journey now, SoftBank is also planning to offer a reference kit comprising the hardware and software elements required to trial AI-RAN quickly and easily.

SoftBank developed its AI-RAN solution by integrating hardware and software components from NVIDIA and ecosystem partners and hardening them to meet carrier-grade requirements. Together, these enable a full 5G vRAN stack that is 100% software-defined, running on NVIDIA GH200 (CPU+GPU), NVIDIA BlueField-3 (NIC/DPU), and Spectrum-X for fronthaul and backhaul networking. It integrates with 20 radio units and a 5G core network and connects 100 mobile UEs.

The core software stack includes the following components:

  • SoftBank-developed and optimized 5G RAN Layer 1 functions such as channel mapping, channel estimation, modulation, and forward-error-correction, using NVIDIA Aerial CUDA-Accelerated-RAN libraries
  • Fujitsu software for Layer 2 functions
  • Red Hat’s OpenShift Container Platform (OCP) as the container virtualization layer, enabling different types of applications to run on the same underlying GPU computing infrastructure
  • A SoftBank-developed E2E AI and RAN orchestrator, to enable seamless provisioning of RAN and AI workloads based on demand and available capacity
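To make the Layer 1 work in the stack above concrete, here is a textbook NumPy sketch of one of the listed functions (modulation), using the standard Gray-coded QPSK mapping from 3GPP TS 38.211. It is purely illustrative: SoftBank's implementation runs these functions on the GPU via NVIDIA's Aerial CUDA-Accelerated-RAN libraries, not NumPy.

```python
import numpy as np

def qpsk_modulate(bits: np.ndarray) -> np.ndarray:
    """Map pairs of bits to unit-energy QPSK symbols per 3GPP TS 38.211:
    d(i) = ((1 - 2*b(2i)) + 1j*(1 - 2*b(2i+1))) / sqrt(2)."""
    b = bits.reshape(-1, 2)
    return ((1 - 2 * b[:, 0]) + 1j * (1 - 2 * b[:, 1])) / np.sqrt(2)
```

In a real pipeline this step sits between channel coding (forward-error-correction) and resource-element mapping, and the payoff of GPU acceleration comes from running it, along with channel estimation and FEC, in parallel across thousands of subcarriers and many cells at once.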

AI marketplace solution integrated with SoftBank AI-RAN.  Image Credit: Nvidia

The underlying hardware is the NVIDIA GH200 Grace Hopper Superchip, which can be used in various configurations from distributed to centralized RAN scenarios. This implementation uses multiple GH200 servers in a single rack, serving AI and RAN workloads concurrently, for an aggregated-RAN scenario. This is comparable to deploying multiple traditional RAN base stations.

In this pilot, each GH200 server was able to process 20 5G cells using 100 MHz of bandwidth when used in RAN-only mode. For each cell, 1.3 Gbps of peak downlink throughput was achieved in ideal conditions, and 816 Mbps was demonstrated with carrier-grade availability in the outdoor deployment.
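A quick back-of-the-envelope check of those figures. The per-server aggregates and spectral efficiency below are derived from the trial numbers, not reported by SoftBank:

```python
cells = 20           # 5G cells per GH200 server (RAN-only mode)
bw_mhz = 100         # per-cell bandwidth
peak_gbps = 1.3      # per-cell downlink, ideal conditions
outdoor_gbps = 0.816 # per-cell downlink, carrier-grade outdoor

peak_aggregate = cells * peak_gbps          # ~26 Gbps per server
outdoor_aggregate = cells * outdoor_gbps    # ~16.3 Gbps per server
peak_se = peak_gbps * 1e9 / (bw_mhz * 1e6)  # ~13 bit/s/Hz spectral efficiency
```

An aggregate of roughly 26 Gbps of peak downlink per server is what supports the claim that one rack of GH200 servers is comparable to multiple traditional RAN base stations.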

……………………………………………………………………………………………………………………………………..

Could AMD GPUs be an alternative to Nvidia for AI-RAN?

AMD is certainly valued by NScale, a UK business with a GPU-as-a-service offer, as an AI alternative to Nvidia. “AMD’s approach is quite interesting,” said David Power, NScale’s chief technology officer. “They have a very open software ecosystem. They integrate very well with common frameworks.” So far, though, AMD has said nothing publicly about any AI-RAN strategy.

The other telco concern is about those promised revenues. Nvidia insists it was conservative when estimating that a telco could realize $5 in inferencing revenues for every $1 invested in AI-RAN. But the numbers met with a fair degree of skepticism in the wider market. Nvidia says the advantage of doing AI inferencing at the edge is that latency, the time a signal takes to travel around the network, would be much lower compared with inferencing in the cloud. But the same case was previously made for hosting other applications at the edge, and they have not taken off.
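Nvidia's claimed ratio is easy to sanity-check. The sketch below takes the 5:1 revenue-to-capex figure at face value; the operating-cost ratio and five-year horizon are hypothetical assumptions added purely for illustration, not figures from Nvidia:

```python
def inferencing_roi(capex, revenue_per_dollar=5.0, annual_opex_ratio=0.15, years=5):
    """Net return implied by Nvidia's claimed $5 of inferencing revenue
    per $1 of AI-RAN capex. The opex ratio and horizon are illustrative
    assumptions, not from the source."""
    revenue = capex * revenue_per_dollar          # total inferencing revenue
    opex = capex * annual_opex_ratio * years      # power, cooling, operations
    return revenue - capex - opex                 # net over the horizon
```

Even under these favorable assumptions, the model only holds if edge inferencing demand actually materializes at the claimed scale, which is precisely what the skeptics question.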

Even if AI changes that, it is unclear whether telcos would stand to benefit. Sales generated by the applications available on the mobile Internet have gone largely to hyperscalers and other software developers, leaving telcos with a dwindling stream of connectivity revenues. Many telcos remain unconvinced there is a valid economic case for AI-RAN, especially since GPUs consume a lot of power (they are perceived as "energy hogs"). Expect AI-RAN to be a big topic for 2025 as operators carefully weigh their options.

References:

AI-RAN Goes Live and Unlocks a New AI Opportunity for Telcos

https://www.lightreading.com/ai-machine-learning/2025-preview-ai-ran-would-be-a-paradigm-shift

Nvidia bid to reshape 5G needs Ericsson and Nokia buy-in

Softbank goes radio gaga about Nvidia in nervy days for Ericsson

T-Mobile emerging as Nvidia’s big AI cheerleader

AI cloud start-up Vultr valued at $3.5B; Hyperscalers gorge on Nvidia GPUs while AI semiconductor market booms

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

Nvidia enters Data Center Ethernet market with its Spectrum-X networking platform

FT: New benchmarks for Gen AI models; Neocloud groups leverage Nvidia chips to borrow >$11B

Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!
