SoftBank’s Transformer AI model boosts 5G AI-RAN uplink throughput by 30% compared to a baseline method without AI
SoftBank has developed its own Transformer-based AI model for wireless signal processing. SoftBank used the model to improve uplink channel interpolation, a signal-processing technique in which the network essentially makes an educated guess about the characteristics and current state of a signal’s channel. Enabling this type of intelligence in a network contributes to faster, more stable communication, according to SoftBank. In an earlier demonstration, the Japanese wireless network operator increased uplink throughput by approximately 20% compared to a conventional signal processing method (the baseline method). In the latest demonstration, the new Transformer-based architecture was run on GPUs and tested in a live Over-the-Air (OTA) wireless environment. In addition to confirming real-time operation, the results showed further throughput gains and ultra-low latency.
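To make the idea concrete, here is a minimal sketch of what channel interpolation looks like in code, assuming a toy OFDM grid with pilots on every 4th subcarrier and simple linear interpolation as a stand-in for the conventional (non-AI) baseline. SoftBank has not published its actual pilot layout or baseline, so every detail below is illustrative:

```python
import numpy as np

# Minimal sketch of uplink channel interpolation (illustrative only; the
# pilot layout and baseline below are assumptions, not SoftBank's design).
num_subcarriers = 64
pilot_step = 4                       # assume a pilot on every 4th subcarrier
pilot_idx = np.arange(0, num_subcarriers, pilot_step)

# Synthetic "true" frequency-selective channel (complex gain per subcarrier)
rng = np.random.default_rng(0)
taps = rng.standard_normal(4) + 1j * rng.standard_normal(4)
h_true = np.fft.fft(taps, num_subcarriers)

# The receiver only observes noisy estimates at the pilot positions...
noise = 0.1 * (rng.standard_normal(pilot_idx.size)
               + 1j * rng.standard_normal(pilot_idx.size))
h_pilots = h_true[pilot_idx] + noise

# ...and must "guess" the channel everywhere else. Linear interpolation of
# the real and imaginary parts is a classic non-AI baseline; an AI model
# instead learns a better mapping from the same pilot observations.
k = np.arange(num_subcarriers)
h_est = (np.interp(k, pilot_idx, h_pilots.real)
         + 1j * np.interp(k, pilot_idx, h_pilots.imag))

mse = np.mean(np.abs(h_est - h_true) ** 2)
print(f"baseline interpolation MSE: {mse:.4f}")
```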
Editor’s note: A Transformer model is a type of neural network architecture that emerged in 2017. It excels at interpreting streams of sequential data and underpins today’s large language models (LLMs). Transformer models have also achieved elite performance in other fields of artificial intelligence (AI), including computer vision, speech recognition and time series forecasting. They are versatile and can be made lightweight and efficient – capable of natural language processing (NLP), image recognition and, as this SoftBank demonstration shows, wireless signal processing.
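As an illustration of how a Transformer can be pointed at wireless signals rather than text, the hedged PyTorch sketch below treats each subcarrier as a sequence “token” whose features are the real and imaginary parts of a channel observation. The ChannelTransformer module and all layer sizes are illustrative assumptions; SoftBank has not disclosed its architecture:

```python
import torch
import torch.nn as nn

class ChannelTransformer(nn.Module):
    """Toy Transformer for channel interpolation (illustrative sizes only)."""
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        # Each subcarrier is a "token"; its feature vector is the
        # (real, imag) pair of the observed channel estimate.
        self.embed = nn.Linear(2, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 2)   # predict (real, imag) per token

    def forward(self, x):                   # x: (batch, subcarriers, 2)
        return self.head(self.encoder(self.embed(x)))

model = ChannelTransformer()
pilots = torch.randn(8, 64, 2)              # batch of 8 grids, 64 subcarriers
print(model(pilots).shape)                  # torch.Size([8, 64, 2])
```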
Significant throughput improvement:
- Uplink channel interpolation using the new architecture improved uplink throughput by approximately 8% compared to the conventional CNN model. Compared to the baseline method without AI, this represents an approximately 30% increase in throughput, demonstrating that the continuous evolution of AI models leads to enhanced communication quality in real-world environments.
Higher AI performance with ultra-low latency:
- While real-time 5G communication requires processing in under 1 millisecond, this demonstration with the Transformer achieved an average processing time of approximately 338 microseconds – ultra-low latency that is about 26% faster than the convolutional neural network (CNN)-based approach [1]. Generally, AI model processing slows as model performance increases, so this achievement overcomes the technically difficult challenge of simultaneously achieving higher AI performance and lower latency (see the latency-measurement sketch after Note 1 below). Editor’s note: Perhaps this can overcome the performance limitations in ITU-R M.2150 for URLLC in the RAN, which is based on an incomplete 3GPP Release 16 specification.
Note 1. CNN-based approaches to achieving low latency focus on optimizing model architecture, computation, and hardware to accelerate inference, especially in real-time applications. Rather than relying on a single technique, the best results are often achieved through a combination of methods.
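For readers who want to see what this kind of measurement involves, below is a rough Python/PyTorch sketch of benchmarking average inference latency against a 1-millisecond real-time budget. The placeholder model, batch shape, and warm-up/synchronization pattern are assumptions, not SoftBank’s test harness:

```python
import time
import torch

# Rough latency-measurement sketch against a 1 ms real-time budget
# (the model and input shape are placeholders, not SoftBank's workload).
model = torch.nn.Sequential(torch.nn.Linear(128, 128), torch.nn.ReLU(),
                            torch.nn.Linear(128, 128)).eval()
x = torch.randn(1, 128)

if torch.cuda.is_available():
    model, x = model.cuda(), x.cuda()

with torch.no_grad():
    for _ in range(100):                 # warm-up (autotuning, caching, etc.)
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()         # ensure queued kernels have finished
    t0 = time.perf_counter()
    n = 1000
    for _ in range(n):
        model(x)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    avg_us = (time.perf_counter() - t0) / n * 1e6

print(f"average inference latency: {avg_us:.0f} microseconds (budget: 1000)")
```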
Using the new architecture, SoftBank conducted a simulation of “Sounding Reference Signal (SRS) prediction,” a process required for base stations to assign optimal radio waves (beams) to terminals. Previous research using a simpler Multilayer Perceptron (MLP) AI model for SRS prediction confirmed a maximum downlink throughput improvement of about 13% for a terminal moving at 80 km/h.
In the new simulation with the Transformer-based architecture, the downlink throughput for a terminal moving at 80 km/h improved by up to approximately 29%, and by up to approximately 31% for a terminal moving at 40 km/h. This confirms that enhancing the AI model more than doubled the throughput improvement rate (see Figure 1). This is a crucial achievement that will lead to a dramatic improvement in communication speeds, directly impacting the user experience.
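Conceptually, SRS prediction can be framed as short-horizon sequence forecasting: given a history of recent SRS channel snapshots from a moving terminal, predict the next one so the base station can choose beams before the measurement goes stale. The sketch below shows that framing in PyTorch; the history length, antenna count, and SRSPredictor module are illustrative assumptions, not SoftBank’s published formulation:

```python
import torch
import torch.nn as nn

# Sketch of SRS prediction as sequence forecasting: given the last N channel
# snapshots from a moving terminal, predict the next one. All shapes and the
# single-layer encoder are illustrative assumptions.
N_HIST, N_ANT = 8, 32                        # history length, antenna ports

class SRSPredictor(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Linear(2 * N_ANT, d_model)  # (real, imag) per antenna
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(d_model, 2 * N_ANT)

    def forward(self, hist):                 # hist: (batch, N_HIST, 2*N_ANT)
        z = self.encoder(self.embed(hist))
        return self.head(z[:, -1])           # forecast from the last time step

pred = SRSPredictor()(torch.randn(4, N_HIST, 2 * N_ANT))
print(pred.shape)                            # torch.Size([4, 64])
```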
The most significant technical challenge for the practical application of “AI for RAN” is to further improve communication quality using high-performance AI models while operating under the real-time processing constraint of less than one millisecond. SoftBank addressed this by developing a lightweight and highly efficient Transformer-based architecture that focuses only on essential processes, achieving both low latency and maximum AI performance. The important features are:
(1) Grasps overall wireless signal correlations
By leveraging the “Self-Attention” mechanism, a key feature of Transformers, the architecture can grasp wide-ranging correlations in wireless signals across frequency and time (e.g., complex signal patterns caused by radio wave reflection and interference). This allows it to maintain high AI performance while remaining lightweight. Convolution focuses on a part of the input, while Self-Attention captures the relationships of the entire input (see Figure 2).
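The difference is easy to see in code. In the scaled dot-product self-attention sketch below (plain NumPy, purely illustrative), every output position is a weighted mix of all input positions, whereas a convolution kernel would only see a local window:

```python
import numpy as np

# Scaled dot-product self-attention over a toy sequence of wireless-signal
# features: every output position attends to ALL input positions, unlike a
# convolution, whose kernel only sees a local window.
rng = np.random.default_rng(1)
seq_len, d = 16, 8                      # e.g. 16 subcarrier "tokens"
X = rng.standard_normal((seq_len, d))

Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

scores = Q @ K.T / np.sqrt(d)           # (seq_len, seq_len): global pairwise
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
out = weights @ V

# Each row of `weights` is a probability distribution over the whole
# sequence, so even distant subcarriers can influence one another directly.
print(weights.shape, round(float(weights[0].sum()), 6))   # (16, 16) 1.0
```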
(2) Preserves physical information of wireless signals
While it is common to normalize input data to stabilize learning in AI models, the architecture features a proprietary design that uses the raw amplitude of wireless signals without normalization. This ensures that crucial physical information indicating communication quality is not lost, significantly improving the performance of tasks like channel estimation.
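A toy example makes the point: per-sample standardization maps a strong and a weak observation of the same fading pattern to identical inputs, erasing the amplitude difference that signals link quality. This demo illustrates the general principle only; it is not SoftBank’s actual pipeline:

```python
import numpy as np

# Why normalization can hurt channel estimation: per-sample standardization
# makes a strong and a weak link look identical, discarding the absolute
# amplitude that reflects link quality. (Toy demo, not SoftBank's design.)
rng = np.random.default_rng(2)
shape = rng.standard_normal(32)         # shared fading "shape"
strong = 10.0 * shape                   # high-amplitude (good) observation
weak = 0.1 * shape                      # low-amplitude (poor) observation

def per_sample_norm(x):
    return (x - x.mean()) / x.std()

# After normalization the two links are indistinguishable...
print(np.allclose(per_sample_norm(strong), per_sample_norm(weak)))  # True
# ...while the raw signals still carry the amplitude difference.
print(strong.std(), weak.std())
```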
(3) Versatility for various tasks
The architecture has a versatile, unified design. By making only minor changes to its output layer, it can be adapted to handle a variety of different tasks, including channel interpolation/estimation, SRS prediction, and signal demodulation. This reduces the time and cost associated with developing separate AI models for each task.
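One hedged way to picture this “change only the output layer” design is a single shared encoder with small task-specific heads, as sketched below. The task names and output dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

# One shared encoder, many small heads: a sketch of the "change only the
# output layer" idea. Task names and output sizes are illustrative.
class MultiTaskRANModel(nn.Module):
    def __init__(self, d_model=64):
        super().__init__()
        self.embed = nn.Linear(2, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.heads = nn.ModuleDict({
            "channel_interpolation": nn.Linear(d_model, 2),  # (re, im)/token
            "srs_prediction":        nn.Linear(d_model, 2),
            "demodulation":          nn.Linear(d_model, 4),  # 4 bit LLRs/token
        })

    def forward(self, x, task):          # x: (batch, seq_len, 2)
        z = self.backbone(self.embed(x))
        return self.heads[task](z)       # only the head differs per task

m = MultiTaskRANModel()
x = torch.randn(2, 64, 2)
print(m(x, "srs_prediction").shape)      # torch.Size([2, 64, 2])
```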
The demonstration results show that high-performance AI models like Transformers, and the GPUs that run them, are indispensable for achieving the high communication performance required in the 5G-Advanced and 6G eras. Furthermore, an AI-RAN that controls the RAN on GPUs allows for continuous performance upgrades through software updates as more advanced AI models emerge, even after the hardware has been deployed. This will enable telecommunication carriers to improve the efficiency of their capital expenditures and maximize value.
Moving forward, SoftBank will accelerate the commercialization of the technologies validated in this demonstration. By further improving communication quality and advancing networks with AI-RAN, SoftBank will contribute to innovation in future communication infrastructure. The Japan-based conglomerate strongly endorsed AI-RAN at MWC 2025.
References:
https://www.softbank.jp/en/corp/news/press/sbkk/2025/20250821_02/
https://www.telecoms.com/5g-6g/softbank-claims-its-ai-ran-tech-boosts-throughput-by-30-
https://www.telecoms.com/ai/softbank-makes-mwc-25-all-about-ai-ran
https://www.ibm.com/think/topics/transformer-model
https://www.itu.int/rec/R-REC-M.2150/en