Ericsson goes with custom silicon (rather than Nvidia GPUs) for AI RAN
Ahead of MWC Barcelona 2026, Ericsson unveiled its initial suite of AI-RAN products at a pre-event briefing in London, emphasizing a strategy anchored in proprietary, purpose-built silicon to enhance radio access network (RAN) performance. While the wireless industry is finally moving to virtualized/cloud RAN utilizing general-purpose processors from Intel, Ericsson is defending its continued investment in custom silicon for specialized, high-performance tasks.
Concurrently, the company is demonstrating a strong push toward software-defined flexibility, ensuring its proprietary RAN algorithms and AI-native software are portable across diverse, open silicon platforms. Ericsson had been exploring Nvidia’s Arm-based Grace CPU (rather than its Hopper-branded GPUs) but has opted for custom silicon (ASICs) instead.
Ericsson’s RAN portfolio currently diverges into two primary architectures. The majority of its footprint relies on ASICs—developed through internal design and external partnerships with Intel. The alternative is Cloud RAN, which pairs Ericsson’s software stack with Intel Xeon processors. Despite the industry’s promise that virtualization would decouple hardware from software, Intel remains Ericsson’s sole silicon partner for production-grade deployments.
This hardware lock-in was underscored during Ericsson’s recent London event, where documentation confirmed “commercial support” exclusively for Intel, while AMD, Arm, and NVIDIA remain relegated to “prototype support.” Despite years of industry rhetoric regarding silicon diversity in the vRAN ecosystem, tangible progress remains stalled. Furthermore, the integration of AI into RAN software introduces new layers of complexity that may further entrench hardware dependencies.
Industry observers remain skeptical of Ericsson’s ambition for a “unified software stack” across heterogeneous hardware platforms. While hardware-software disaggregation is achievable in the higher layers (L2/L3), Layer 1 (L1), the most compute-intensive portion of the stack, remains heavily optimized for specific silicon. Ericsson’s initial strategy relied on the portability of L1 code across x86 architectures (via AMD) or on Arm’s SVE2 (Scalable Vector Extension 2) matching Intel’s AVX-512 capabilities. However, achieving performance parity across these platforms without substantial refactoring remains a major engineering hurdle.
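To make the L1 portability problem concrete, consider a simplified dot-product kernel of the kind that dominates channel estimation and equalization. The sketch below is written against standard AVX-512 and SVE intrinsics with an assumed scalar fallback; it illustrates the general problem (two separate, hand-tuned vector paths rather than one portable body) and is not Ericsson code.

```c
/* Illustrative only: a simplified dot-product kernel of the kind that
 * dominates L1 processing (channel estimation, equalization). The AVX-512
 * and SVE paths are separate, hand-tuned code, not one portable body.    */
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#if defined(__AVX512F__)
#include <immintrin.h>

float l1_dot(const float *a, const float *b, size_t n)
{
    __m512 acc = _mm512_setzero_ps();
    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        __m512 va = _mm512_loadu_ps(a + i);
        __m512 vb = _mm512_loadu_ps(b + i);
        acc = _mm512_fmadd_ps(va, vb, acc);      /* 16 fused multiply-adds per step */
    }
    float lanes[16], sum = 0.0f;
    _mm512_storeu_ps(lanes, acc);
    for (int k = 0; k < 16; k++)                 /* horizontal reduction */
        sum += lanes[k];
    for (; i < n; i++)                           /* scalar tail */
        sum += a[i] * b[i];
    return sum;
}

#elif defined(__ARM_FEATURE_SVE)
#include <arm_sve.h>

float l1_dot(const float *a, const float *b, size_t n)
{
    svfloat32_t acc = svdup_n_f32(0.0f);
    for (size_t i = 0; i < n; i += svcntw()) {   /* vector length chosen by hardware */
        svbool_t pg = svwhilelt_b32_u64((uint64_t)i, (uint64_t)n);  /* mask the tail */
        svfloat32_t va = svld1_f32(pg, a + i);
        svfloat32_t vb = svld1_f32(pg, b + i);
        acc = svmla_f32_m(pg, acc, va, vb);
    }
    return svaddv_f32(svptrue_b32(), acc);
}

#else /* portable but slow reference path */

float l1_dot(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}
#endif

int main(void)
{
    float a[20], b[20];
    for (int i = 0; i < 20; i++) { a[i] = (float)i; b[i] = 0.5f; }
    printf("dot = %.1f\n", l1_dot(a, b, 20));
    return 0;
}
```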
A critical bottleneck in PHY Layer (L1) processing is Forward Error Correction (FEC), which traditionally necessitates dedicated hardware acceleration. Ericsson initially addressed this using a lookaside acceleration model, offloading FEC tasks to discrete PCIe-based Intel accelerators. In recent iterations, Intel has moved toward a more integrated System-on-Chip (SoC) approach, embedding the accelerator directly onto the CPU die (e.g., vRAN Boost).
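As a rough sketch of how the lookaside model works in practice: the CPU enqueues LDPC code blocks to an accelerator queue, carries on with the rest of the slot's processing, and polls for completions before the slot deadline. The fec_* names and the in-software "fake accelerator" below are invented so the example is self-contained; they are not Intel's or Ericsson's interfaces.

```c
/* Conceptual sketch of lookaside FEC offload. All fec_* names are hypothetical
 * placeholders for an accelerator driver's queue-based API; the "accelerator"
 * is faked in software so the example compiles and runs on its own.          */
#include <stddef.h>
#include <stdio.h>

#define QUEUE_DEPTH 64

typedef struct {
    int block_id;      /* which code block this is                  */
    int decoded_ok;    /* set by the "accelerator" on completion    */
} fec_job;

/* --- fake driver: a real lookaside accelerator would do this in hardware --- */
static fec_job *queue[QUEUE_DEPTH];
static size_t   q_head, q_tail;

static int fec_enqueue(fec_job *job)                 /* submit one block, non-blocking */
{
    if (q_tail - q_head == QUEUE_DEPTH) return 0;
    queue[q_tail++ % QUEUE_DEPTH] = job;
    return 1;
}

static size_t fec_poll(fec_job **done, size_t max_done)   /* reap completions */
{
    size_t n = 0;
    while (q_head < q_tail && n < max_done) {
        fec_job *j = queue[q_head++ % QUEUE_DEPTH];
        j->decoded_ok = 1;                           /* pretend the LDPC decode succeeded */
        done[n++] = j;
    }
    return n;
}
/* --------------------------------------------------------------------------- */

int main(void)
{
    fec_job jobs[8];
    size_t submitted = 0, completed = 0;

    /* 1. Enqueue every code block for the slot; the CPU spends no cycles on
     *    the decode itself, which is the point of lookaside offload.        */
    for (; submitted < 8; submitted++) {
        jobs[submitted].block_id = (int)submitted;
        jobs[submitted].decoded_ok = 0;
        if (!fec_enqueue(&jobs[submitted])) break;
    }

    /* 2. The CPU is free here for other L1 work: channel estimation,
     *    equalization, MIMO detection, ...                            */

    /* 3. Poll for completions before the slot deadline. */
    fec_job *done[8];
    while (completed < submitted)
        completed += fec_poll(done, 8);

    printf("decoded %zu code blocks via lookaside offload\n", completed);
    return 0;
}
```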
The primary challenge for Ericsson lies in achieving silicon parity across the AMD and NVIDIA ecosystems. While AMD’s FPGA-based accelerators have faced scrutiny regarding power efficiency, NVIDIA’s GPU-based offloading was previously viewed as cost-prohibitive for standard FEC. However, the rise of AI-RAN has recalibrated these economic models, as telcos explore the dual-use potential of GPUs for both RAN and AI workloads. Emerging platforms, such as Google’s Tensor Processing Units (TPUs), have also been identified by Ericsson leadership as viable long-term options.
Despite ambitions for a unified “single software track,” Ericsson’s technical roadmap suggests a more nuanced reality. While L2 and higher layers aim for a universal codebase across hardware platforms, L1 necessitates concurrent feature development and platform-specific tailoring. As CTO Erik Ekudden noted, maximizing the efficiency of advanced accelerators requires a degree of software customization that challenges the ideal of total hardware-software disaggregation.

Ericsson executives are keen to avoid what Executive VP Per Narvinger describes as a “native implementation,” which would create silicon vendor lock-in. To prevent that, the company is prioritizing Hardware Abstraction Layers (HALs). Key initiatives include the adoption of the BBDev (Baseband Device) interface to decouple RAN software from underlying silicon. Furthermore, potential integration with NVIDIA’s CUDA platform is being evaluated to provide the necessary abstraction for heterogeneous compute environments, though this remains contingent on broader industry standardization.
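One way to picture what an abstraction layer like BBDev buys is a driver table: the RAN application calls a fixed set of function pointers, and each silicon vendor supplies its own implementation behind them. The sketch below is a generic C illustration of that pattern with invented backend names; it is not the actual BBDev API (which in DPDK revolves around enqueueing and dequeueing operation descriptors), only the decoupling idea such an interface serves.

```c
/* Generic illustration of a hardware abstraction layer (HAL): the RAN software
 * calls fec_ops, and each backend (names invented here) plugs in its own
 * implementation. This mimics the role an interface such as BBDev plays;
 * it is not the real BBDev API.                                              */
#include <stdio.h>

struct fec_ops {
    const char *name;
    int (*init)(void);
    int (*decode)(const signed char *llrs, int n_llrs, unsigned char *bits);
};

/* Backend 1: pure software decode on the host CPU (placeholder logic). */
static int sw_init(void) { return 0; }
static int sw_decode(const signed char *llrs, int n, unsigned char *bits)
{
    for (int i = 0; i < n; i++)
        bits[i] = llrs[i] < 0;              /* hard decision stands in for LDPC */
    return 0;
}

/* Backend 2: stub for a discrete or integrated accelerator. A real driver
 * would submit the job to hardware; here it just reuses the software path. */
static int accel_init(void) { return 0; }
static int accel_decode(const signed char *llrs, int n, unsigned char *bits)
{
    return sw_decode(llrs, n, bits);
}

static const struct fec_ops backends[] = {
    { "sw-fec",    sw_init,    sw_decode    },
    { "accel-fec", accel_init, accel_decode },
};

int main(void)
{
    /* The application code below stays identical whichever backend is chosen. */
    const struct fec_ops *hal = &backends[1];
    signed char llrs[8] = { -12, 7, -3, 25, -8, 1, -30, 14 };
    unsigned char bits[8];

    hal->init();
    hal->decode(llrs, 8, bits);

    printf("backend %s decoded:", hal->name);
    for (int i = 0; i < 8; i++)
        printf(" %d", bits[i]);
    printf("\n");
    return 0;
}
```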
Ericsson’s AI strategy mirrors this modular approach. By leveraging AI as a functional abstraction layer, the company aims to simplify software portability across diverse platforms while maintaining AI control loops for real-time network management. Unlike competitors tethered to high-TDP GPUs, Ericsson maintains that AI-RAN is commercially viable on general-purpose and purpose-built silicon. Recent London showcases highlighted AI-driven gains in spectral efficiency, channel estimation, and beamforming without external acceleration. A production-ready AI-native link adaptation model recently delivered a 10% spectral efficiency improvement in field tests and is now integrated into the latest baseband portfolio.
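To make the AI control loop idea concrete, here is a toy link-adaptation loop in which a placeholder predictor estimates an effective SINR from recent channel reports and the scheduler selects the highest MCS whose threshold that estimate clears. The two-weight "model" and the thresholds are invented for illustration; Ericsson has not published the internals of its link adaptation model.

```c
/* Toy AI-native link adaptation loop. The "model" (a fixed two-weight
 * predictor) and the MCS thresholds are invented for illustration; they
 * stand in for whatever trained model a vendor would actually deploy.   */
#include <stdio.h>

#define NUM_MCS 5

/* Minimum predicted SINR (dB) required to schedule each MCS index. */
static const double mcs_threshold_db[NUM_MCS] = { -2.0, 3.0, 8.0, 13.0, 18.0 };

/* Placeholder "model": blend the latest CQI-derived SINR with a short
 * history term, mimicking a learned predictor of the effective SINR.   */
static double predict_sinr_db(double latest_sinr_db, double avg_recent_sinr_db)
{
    const double w_latest = 0.7, w_history = 0.3;   /* invented weights */
    return w_latest * latest_sinr_db + w_history * avg_recent_sinr_db;
}

/* Pick the most aggressive MCS whose threshold the prediction clears. */
static int select_mcs(double predicted_sinr_db)
{
    int mcs = 0;
    for (int i = 0; i < NUM_MCS; i++)
        if (predicted_sinr_db >= mcs_threshold_db[i])
            mcs = i;
    return mcs;
}

int main(void)
{
    double latest = 11.5, recent_avg = 9.0;         /* example channel reports */
    double predicted = predict_sinr_db(latest, recent_avg);
    printf("predicted SINR %.1f dB -> MCS %d\n", predicted, select_mcs(predicted));
    return 0;
}
```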
As for radios, a domain less susceptible to full virtualization, Ericsson is embedding Neural Network Accelerators (NNAs) directly into its radio-unit ASICs. These programmable matrix cores are optimized for Massive MIMO inference, enabling sub-millisecond beamforming and channel estimation while adhering to strict site power envelopes. The new AI-ready radios feature Ericsson’s custom silicon with neural network accelerators, which are said to boost on-site AI inference in Massive MIMO radios, enabling real-time optimization and full-stack, fully distributed AI.
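For a feel of what those matrix cores actually compute, the sketch below runs a small dense layer (multiply-accumulate plus activation) over per-subband channel-estimate features, the sort of operation an in-radio NNA would execute in hardware under a tight power budget. The dimensions, weights, and single-layer structure are illustrative assumptions, not Ericsson's model.

```c
/* Plain-C stand-in for the kind of small dense (multiply-accumulate) layer an
 * in-radio neural network accelerator would execute in hardware. Dimensions,
 * weights, and the single-layer structure are illustrative assumptions only. */
#include <stdio.h>

#define IN_DIM  8   /* e.g. per-subband channel-estimate features */
#define OUT_DIM 4   /* e.g. refined beam-weight adjustments       */

static void dense_relu(float w[OUT_DIM][IN_DIM], const float b[OUT_DIM],
                       const float x[IN_DIM], float y[OUT_DIM])
{
    for (int o = 0; o < OUT_DIM; o++) {
        float acc = b[o];
        for (int i = 0; i < IN_DIM; i++)
            acc += w[o][i] * x[i];      /* the multiply-accumulate a matrix core
                                           performs across many lanes at once   */
        y[o] = acc > 0.0f ? acc : 0.0f; /* ReLU activation */
    }
}

int main(void)
{
    /* Illustrative weights and inputs; a real model would be trained offline
       and quantized before being loaded onto the radio's accelerator.       */
    float w[OUT_DIM][IN_DIM];
    float b[OUT_DIM] = { 0.1f, 0.1f, 0.1f, 0.1f };
    float x[IN_DIM]  = { 0.5f, -0.2f, 0.8f, 0.1f, 0.0f, 0.3f, -0.4f, 0.7f };
    float y[OUT_DIM];

    for (int o = 0; o < OUT_DIM; o++)
        for (int i = 0; i < IN_DIM; i++)
            w[o][i] = 0.05f * (float)(o + i);

    dense_relu(w, b, x, y);
    for (int o = 0; o < OUT_DIM; o++)
        printf("y[%d] = %.3f\n", o, y[o]);
    return 0;
}
```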
References:
https://www.lightreading.com/5g/ericsson-does-ai-ran-minus-nvidia-in-push-for-5g-silicon-freedom


GPUs don’t need to be installed at or near cell sites to support AI inferencing, according to Yago Tenorio, Verizon’s chief technology officer. That can be done from a much smaller number of data centers hosting Verizon’s core network software, he told Light Reading during a previous interview. Greg McCall, BT’s incoming chief security and networks officer, similarly doubts whether numerous RAN sites would be needed to support low-latency AI applications in a country the size of the UK.
https://www.lightreading.com/ai-machine-learning/cursed-robot-stalked-mwc-goes-mad-for-ai-but-isn-t-sure-about-6g