AI/ML
Does AI change the business case for cloud networking?
For several years now, the big cloud service providers – Amazon Web Services (AWS), Microsoft Azure, and Google Cloud – have tried to get wireless network operators to run their 5G SA core network, edge computing and various distributed applications on their cloud platforms. For example, Amazon’s AWS public cloud, Microsoft’s Azure for Operators, and Google’s Anthos for Telecom were intended to get network operators to run their core network functions in a hyperscaler cloud.
AWS had early success with Dish Network’s 5G SA core network which has all its functions running in Amazon’s cloud with fully automated network deployment and operations.
Conversely, AT&T has yet to commercially deploy its 5G SA Core network on the Microsoft Azure public cloud. Also, users on AT&T’s network have experienced difficulties accessing Microsoft 365 and Azure services. Those incidents were often traced to changes within the network’s managed environment. As a result, Microsoft has drastically reduced its early telecom ambitions.
Several pundits now say that AI will significantly strengthen the business case for cloud networking by enabling more efficient resource management, advanced predictive analytics, improved security, and automation, ultimately leading to cost savings, better performance, and faster innovation for businesses utilizing cloud infrastructure.
Markets & Markets forecasts the global cloud AI market (which includes cloud AI networking) will grow at a CAGR of 32.4% from 2024 to 2029.
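To put that forecast in perspective, the short Python sketch below shows what a 32.4% compound annual growth rate implies over the five years from 2024 to 2029. The function names are illustrative, and the only input taken from the forecast is the rate itself; no market size is assumed.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Total growth factor implied by compounding `cagr` for `years` years."""
    return (1.0 + cagr) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Inverse calculation: the CAGR that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# 32.4% per year, compounded over the five years from 2024 to 2029
multiple = growth_multiple(0.324, 5)
print(f"Implied market multiple over 5 years: {multiple:.2f}x")
```

In other words, a market growing at that rate would end the period at roughly four times its starting size.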
- Optimized resource allocation: AI algorithms can analyze real-time data to dynamically adjust cloud resources like compute power and storage based on demand, minimizing unnecessary costs.
- Predictive maintenance: By analyzing network patterns, AI can identify potential issues before they occur, allowing for proactive maintenance and preventing downtime.
- Enhanced security: AI can detect and respond to cyber threats in real-time through anomaly detection and behavioral analysis, improving overall network security.
- Intelligent routing: AI can optimize network traffic flow by dynamically routing data packets to the most efficient paths, improving network performance.
- Automated network management: AI can automate routine network management tasks, freeing up IT staff to focus on more strategic initiatives.
“Now enter AI,” he continued. “With AI … I really have a power to do some amazing things, like enrich customer experiences, automate my network, feed the network data into my customer experience virtual agents. There’s a lot I can do with AI. It changes the business case that we’ve been running.”
Deutsche Telekom and Google Cloud partner on “RAN Guardian” AI agent
Deutsche Telekom and Google Cloud today announced a new partnership to improve Radio Access Network (RAN) operations through the development of a network AI agent. Built using Gemini 2.0 in Vertex AI from Google Cloud, the agent can analyze network behavior, detect performance issues, and implement corrective actions to improve network reliability, reduce operational costs, and enhance customer experiences.
Deutsche Telekom says that as telecom networks become increasingly complex, traditional rule-based automation falls short in addressing real-time challenges. The solution is to use Agentic AI which leverages large language models (LLMs) and advanced reasoning frameworks to create intelligent agents that can think, reason, act, and learn independently.
The RAN Guardian agent, which has been tested and verified at Deutsche Telekom, collaborates in a human-like manner, detecting network anomalies and executing self-healing actions to optimize RAN performance. It will be exhibited at next week’s Mobile World Congress (MWC) in Barcelona, Spain.
This cooperative initiative appears to be a first step towards building autonomous and self-healing networks.
In addition to Gemini 2.0 in Vertex AI, the RAN Guardian also uses Cloud Run, BigQuery, and Firestore to help deliver:
- Autonomous RAN performance monitoring: The RAN Guardian will continuously analyze key network parameters in real time to predict and detect anomalies.
- AI-driven issue classification and routing: The agent will identify and prioritize network degradations based on multiple data sources, including network monitoring data, inventory data, performance data, and coverage data.
- Proactive network optimization: The agent will also recommend or autonomously implement corrective actions, including resource reallocation and configuration adjustments.
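As a rough illustration of the anomaly-detection step in the first bullet, here is a minimal Python sketch. It is not Deutsche Telekom's or Google Cloud's method, which the announcement does not describe. It assumes a simple trailing-window z-score test, and the KPI values, window size and threshold are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the trailing-window mean
    by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(samples):
        # Only score a sample once a full window of history is available.
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Hypothetical dropped-call rate (%) per measurement interval for one cell
drop_rate = [1.0, 1.1, 0.9, 1.0, 1.2, 1.1, 1.0, 4.8, 1.1, 1.0]
print(detect_anomalies(drop_rate))  # flags the spike at index 7
```

A production agent would presumably combine many KPIs and data sources, as the bullets describe. This sketch only shows the basic idea of flagging a reading that departs sharply from recent history.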
“By combining Deutsche Telekom’s deep telecom expertise with Google Cloud’s cutting-edge AI capabilities, we’re building the next generation of intelligent networks,” said Angelo Libertucci, Global Industry Lead, Telecommunications, Google Cloud. “This means fewer disruptions, faster speeds, and an overall enhanced mobile experience for Deutsche Telekom’s customers.”
“Traditional network management approaches are no longer sufficient to meet the demands of 5G and beyond. We are pioneering AI agents for networks, working with key partners like Google Cloud to unlock a new level of intelligence and automation in RAN operations as a step towards autonomous, self-healing networks,” said Abdu Mudesir, Group CTO, Deutsche Telekom.
Mr. Mudesir and Google Cloud’s Muninder Sambi will discuss the role of AI agents in the future of network operations at MWC next week.
References:
https://www.telecoms.com/ai/deutsche-telekom-and-google-cloud-team-up-on-ai-agent-for-ran-operations
Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?
The case for and against AI-RAN technology using Nvidia or AMD GPUs
AI RAN Alliance selects Alex Choi as Chairman
AI sparks huge increase in U.S. energy consumption and is straining the power grid; transmission/distribution as a major problem
Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?
An increasing focus on deploying AI into radio access networks (RANs) was among the key findings of NVIDIA’s third annual “State of AI in Telecommunications” survey, with more than a third of respondents indicating they’re investing or planning to invest in AI-RAN. The survey polled more than 450 telecommunications professionals worldwide, revealing continued momentum for AI adoption, including growth in generative AI use cases, and how the technology is helping optimize customer experiences and increase employee productivity. The percentage of network operators planning to use open source tools increased from 28% in 2023 to 40% in 2025. AvidThink Founder and Principal Roy Chua said one of the biggest challenges network operators will have when using open source models is vetting the outputs they get during training.
Of the telecommunications professionals surveyed, almost all stated that their company is actively deploying or assessing AI projects. Here are some top insights on impact and use cases:
- 84% said AI is helping to increase their company’s annual revenue
- 77% said AI helped reduce annual operating costs
- 60% said increased employee productivity was their biggest benefit from AI
- 44% said they’re investing in AI for customer experience optimization, which is the No. 1 area of investment for AI in telecommunications
- 40% said they’re deploying AI into their network planning and operations, including RAN
The percentage of respondents who indicated they will build AI solutions in-house rose from 27% in 2024 to 37% this year. “Telcos are really looking to do more of this work themselves,” Nvidia’s Global Head of Business Development for Telco Chris Penrose [1.] said. “They’re seeing the importance of them taking control and ownership of becoming an AI center of excellence, of doing more of the training of their own resources.”
With respect to using AI inferencing, Penrose said, “We’ve got 14 publicly announced telcos that are doing this today, and we’ve got an equally big funnel.” Penrose noted that the AI skills gap remains the biggest hurdle for operators. Why? Because, as he put it, just because someone is an AI scientist doesn’t mean they are also necessarily a generative AI or agentic AI scientist specifically. And in order to attract the right talent, operators need to demonstrate that they have the infrastructure that will allow top-tier employees to do amazing work. See also: GPUs, data center infrastructure, etc.
Note 1. Penrose represented AT&T’s IoT business for years at various industry trade shows and events before leaving the company in 2020.
Rather than the large data centers processing AI Large Language Models (LLMs), AI inferencing could be done more quickly at smaller “edge” facilities that are closer to end users. That’s where telecom operators might step in. “Telcos are in a unique position,” Penrose told Light Reading. He explained that many countries want to ensure that their AI data and operations remain inside the boundaries of that country. Thus, telcos can be “the trusted providers of [AI] infrastructure in their nations.”
“We’ll call it AI RAN-ready infrastructure. You can make money on it today. You can use it for your own operations. You can use it to go drive some services into the market. … Ultimately your network itself becomes a key anchor workload,” Penrose said.
Nvidia proposes that network operators can not only run their own AI workloads on Nvidia GPUs, they can also sell those inferencing services to third parties and make a profit by doing so. “We’ve got lots of indications that many [telcos] are having success, and have not only deployed their first [AI compute] clusters, but are making reinvestments to deploy additional compute in their markets,” Penrose added.
Nvidia specifically pointed to AI inferencing announcements by Singtel, Swisscom, Telenor, Indosat and SoftBank.
Other vendors are hoping for similar sales. “I think this vision of edge computing becoming AI inferencing at the end of the network is massive for us,” HPE boss Antonio Neri said last year, in discussing HPE’s $14 billion bid for Juniper Networks.
That comes after multi-access edge computing (MEC) has not lived up to its potential, partially because a 5G SA core network is needed for that and few have been commercially deployed. Edge computing disillusionment is clear among hyperscalers and also network operators. For example, Cox folded its edge computing business into its private networks operation. AT&T no longer discusses the edge computing locations it was building with Microsoft and Google. And Verizon has admitted to edge computing “miscalculations.”
Will AI inferencing be the savior for MEC? The jury is out on that topic. However, Nvidia said that 40% of its revenues already come from AI inferencing. Presumably that inferencing is happening in larger data centers and then delivered to nearby users. Meaning, a significant amount of inferencing is being done today without additional facilities, distributed at a network’s edge, that could enable speedier, low-latency AI services.
“The idea that AI inferencing is going to be all about low-latency connections, and hence stuff like AI RAN and MEC and assorted other edge computing concepts, doesn’t seem to be a really good fit with the current main direction of AI applications and models,” argued Disruptive Wireless analyst Dean Bubley in a LinkedIn post.
References:
https://blogs.nvidia.com/blog/ai-telcos-survey-2025/
State of AI in Telecommunications
https://www.fierce-network.com/premium/whitepaper/edge-computing-powered-global-ai-inference
https://www.fierce-network.com/cloud/are-ai-services-telcos-magic-revenue-bullet
The case for and against AI-RAN technology using Nvidia or AMD GPUs
Ericsson’s sales rose for the first time in 8 quarters; mobile networks need an AI boost
AI RAN Alliance selects Alex Choi as Chairman
Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029
AI sparks huge increase in U.S. energy consumption and is straining the power grid; transmission/distribution as a major problem
Tata Consultancy Services: Critical role of Gen AI in 5G; 5G private networks and enterprise use cases
Cisco CEO sees great potential in AI data center connectivity, silicon, optics, and optical systems
It’s no surprise to IEEE Techblog readers that Cisco’s networking business – still its biggest unit, generating nearly half its total sales – reported just under $6.9 billion in revenue for the three-month period ending in January (Cisco’s second fiscal quarter). That was down 3% compared with the same quarter the year before. For its first half year, networking sales dropped 14% year-over-year, to about $13.6 billion.
However, total second-quarter revenues grew 9% year-over-year, to just less than $14 billion, boosted by the Splunk (security company) acquisition in March 2024. Thanks to that deal, Cisco’s security revenues more than doubled for the first half, to about $4.1 billion. But net income fell 8%, to roughly $2.4 billion, due partly to higher costs for research and development, as well as sales and marketing expenses.
Cisco groused about an “inventory correction” as networking customers digested stock they had already bought, but that surely is not the case now, as that inventory has been worked off by its customers (ISPs, telcos, enterprise and government end users). Cisco CFO Richard Scott Herren addressed the question of whether “the demand that we’re seeing today [is] a function of extended lead times like we saw a couple of years ago. That’s not the case. Our lead times are not extending.”
Currently, Cisco firmly believes that Ethernet connectivity sales to owners of AI data centers are an “emerging opportunity.” That refers to Cisco’s data center switching solutions for “web-scale” and enterprise customer intra-data center communications. The company’s AI strategy is described here.
………………………………………………………………………………………………………………………………………
AI investments “will lead to our networking equipment being combined with Nvidia GPUs, and that’s how we’ll accomplish that in the future,” CEO Chuck Robbins told industry analysts on a call to discuss second-quarter results, according to a Motley Fool transcript. “There’s so much change going on right now from a technology perspective that there’s both excitement about the opportunity, and candidly, there’s a little bit of fear of slowing down too much and letting your competition get too much ahead of you. So, we saw solid demand,” he said.
However, Cisco will face mighty competition in that space.
- Nokia is targeting the same opportunity and last month said it would spend an additional €100 million (US$104 million) on its Internet Protocol unit annually with the goal of generating another €1 billion ($1.04 billion) in data center revenues by 2028.
- Arista Networks is another rival in this market, selling high performance Ethernet switches to cloud service providers like Microsoft.
- Nvidia is a further competitor. Its $7 billion acquisition of Mellanox in 2019 gave it effective control of InfiniBand, an alternative to Ethernet that had represented the main option for connecting GPU clusters when analysts published research on the topic in August 2023. Just as important, the Mellanox division of Nvidia also is a leader in Ethernet connectivity within data centers as described in this IEEE Techblog post.
- Juniper Networks (being acquired by HPE) is also focusing on networking the AI data center as per a white paper you can download after filling out this form.
During the Q & A, Robbins elaborated: “On the $700 million in AI orders, it’s a combination of systems, silicon, optics, and optical systems. And I think if you break it down, it’s about half is in silicon and systems. And it continues to accelerate. And I’d say the teams have done a great job on the silicon front. We’ve invested heavily in more resources there. The team is running parallel development efforts for multiple chips that are staggered in their time frames. They’ve worked hard. They were increasing the yield, which is a positive thing. And so, we feel good about it, but it’s a combination of all those things that we’re selling to the customers.”
…………………………………………………………………………………………………………………………………………………………………………………………
Enterprise AI:
“What we’re seeing on the enterprise side relative to AI is it’s still — customers are still in the very early days, and they all realize they need to figure out exactly what their use cases are. We’re starting to see some spending though on specific AI-driven infrastructure. And we think as we get AI pods out there — we got Hyperfabric coming. We got AI defense coming.
We have Hypershield in the market. And we got this new DPU switch, they are all going to be a part of the infrastructure to support these AI applications. So, we’re beginning to see it happen, but I think it’s also really important to understand that as the enterprises leverage their private data, their proprietary data, and they’ll do some training on that and then they’ll run inference obviously against that. We believe that opportunity is an order of magnitude higher than what we’ve seen in training today. We’re going to continue to innovate and build capabilities to put ourselves in a better position to be a real beneficiary as this continues to accelerate. But as of today, we feel like we’re in pretty good shape.”
“If you look at AI defense with the AI Summit that we did recently, there’s — I think there’s about 20-some-odd customers who are interested in going to proof of concept with us right now on it. We had almost half the Fortune 100 there for that event. So, I feel good about where we are. It will turn into greater demand as we just continue to scale these products.”
Telco use of AI Edge Applications:
“We see some of the European network operators are looking at delivering AI as a service,” said Robbins. “We see a lot of them planning for AI edge applications that are sitting at the edge of their networks that they’re managing for customers.”
………………………………………………………………………………………………………………………………………
Cisco raised its guidance and now expects revenues for the full year of between $56 billion and $56.5 billion, up from its earlier range of $55.3 billion to $56.3 billion.
………………………………………………………………………………………………………………………………………
References:
https://www.cisco.com/site/uk/en/solutions/artificial-intelligence/index.html
https://www.juniper.net/content/dam/www/assets/white-papers/us/en/networking-the-ai-data-center.pdf
Nokia selects Intel’s Justin Hotard as new CEO to increase growth in IP networking and data center connections
Initiatives and Analysis: Nokia focuses on data centers as its top growth market
Nvidia enters Data Center Ethernet market with its Spectrum-X networking platform
AT&T and Verizon cut jobs another 6% last year; AI investments continue to increase
In 2018, after AT&T acquired Time Warner, the enlarged AT&T had approximately 230,000 employees and annual revenues of about $172 billion. Both figures have declined subsequently, but headcount has been reduced at a much sharper rate. Data published this week, after AT&T reported full-year sales of $122.3 billion, shows another 9,500 jobs were cut in 2024, decreasing the total to 141,000 workers. Back in 2017, counting the Time Warner business it was then trying to buy, AT&T had as many as 280,000 employees. Is AI being used to replace jettisoned employees at AT&T?
During Monday’s earnings call, AT&T’s CEO John Stankey said AI is already powering a variety of functions at the telco. “What we’ve been able to do in our call centers and how we operate within our customer base, a lot of that has been driven by AI tool applications. And it’s not that we’re necessarily exclusively replacing individuals with the technology, but we’re making them a lot more effective and efficient,” Stankey said, adding that AT&T is also using AI to develop its computer code. “We’re spending less right now to develop new code internally … and it’s through the application of AI and technology.”
Stankey added that AT&T plans to invest more heavily in AI this year, including by using its customer data to more effectively target customers with promotions and other offerings. “If I were to say I had a goal for 2025, I would like to … be talking about good momentum we’ve received in business as a result of executing on some of those things moving forward.”
AI was also highlighted during recent calls about financial results as something that would help to sharpen the axe. Those updates came after AT&T said in December 2024 that its latest aim was to slash another $3 billion in annual costs by the end of 2027. “In 2025, we will make progress on this goal by further integrating AI throughout our operations,” said Stankey this week. AT&T has also put AI to use on writing code and adapting the capabilities of the mobile network based on analysis of traffic patterns.
AT&T officials have previously discussed the possibility that AT&T central offices could host AI capabilities. During a recent analyst event, AT&T CTO Jeremy Legg said that some of the company’s central offices could be used for AI, but that the company would have to address the power requirements for those AI computing functions.
……………………………………………………………………………………………………………………………………………………………………………………….
Job cuts at Verizon have also been dramatic. Under CEO Hans Vestberg, it has managed to grow sales while axing jobs. Annual revenues at the company rose by $6.5 billion between 2020 and 2024, to about $134.8 billion, just as 32,600 jobs were slashed over this period. Workforce shrinkage left Verizon with fewer than 100,000 employees at the end of 2024. The rate of Verizon job cuts was relatively low in 2024, when fewer than 6,000 positions were eliminated, down from the nearly 12,000 that were scrapped in 2023. Yet Verizon employs about 78,000 fewer people today than it did ten years ago, and there has been no sign the trend might go into reverse.
At Verizon, “driving down costs in our operations” is part of a “three-pronged strategy for AI,” Vestberg told analysts. AI Connect, which should be the most exciting prong, positions AI in the same way edge computing was positioned years ago. The idea is to host the resources needed for AI applications in the network facilities that dot the US – some 16,000 “near net” enterprise locations, according to Verizon, along with between 100 and 200 acres of land partially “zoned” for data center build – and then charge for the privilege.
“If you think about where we are on generative AI today, it’s where large language models are trained at large data centers and that require enormous capacities. Over time, that will, of course, come much closer to the edge of the network,” Verizon CEO Hans Vestberg explained on the operator’s quarterly conference call.
Kyle Malady, head of Verizon Business and the executive leading the operator’s AI efforts, offered more details: “Power, space and cooling are the currencies that are in demand right now, and we have all three,” he said in discussing the AI sector. “As we look across our assets, take inventory and compare against other players in the market, we believe that we are in a leadership position when it comes to usable power and space. We have facilities across the United States that either have spare power, space and cooling, or can be retrofitted. As we sit here today, we have 2-10+ megawatts of usable power across many of our sites. … In addition, we have between 100 and 200 acres of undeveloped land, some currently zoned for data center builds, and much of it in prime, data center-friendly areas.” Malady added that Verizon would deploy Vultr’s GPU-as-a-Service (GPUaaS) in its data centers in order to support the AI computing applications that require those kinds of high-performance graphical processing units (GPUs).
Malady added that Verizon sees a total addressable market (TAM) of $40 billion or more in this new area.
……………………………………………………………………………………………………………………………………………………………………………………………………
CEOs Stankey and Vestberg have made cost cutting a priority, but most of the attendant layoffs so far are not due to the impact of AI. Stankey said, “It’s not that we’re necessarily exclusively replacing individuals with the technology, but we’re making them a lot more effective and efficient in how they handle customer needs and then complementing that with customer-supported AI.”
If AI does create new types of jobs, as many AI cheerleaders say, they have clearly not increased the headcount at AT&T or Verizon. A key take-away is that over the last few years, telcos were able to operate a business of roughly the same size with just a fraction of the workforce they previously employed. Average revenues per employee rose 6% at Verizon last year, to about $1.35 million, and have soared from less than $717,000 a decade ago. At AT&T, they grew 7% in 2024, to nearly $868,000, and are up from less than $544,000 in 2014.
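The revenue-per-employee figures can be checked against the numbers quoted earlier in this post. The Python sketch below uses AT&T's reported 2024 sales ($122.3 billion) and year-end headcount (141,000). Verizon's exact headcount is not given in the text (only "fewer than 100,000"), so the sketch instead derives the headcount implied by its $134.8 billion revenue and $1.35 million per employee. That derived value is an estimate, not a reported figure.

```python
def revenue_per_employee(revenue_usd: float, employees: float) -> float:
    """Average annual revenue generated per employee, in U.S. dollars."""
    return revenue_usd / employees


# AT&T 2024: full-year sales and year-end workforce as quoted in the post
att_2024 = revenue_per_employee(122.3e9, 141_000)
print(f"AT&T 2024 revenue per employee: ${att_2024:,.0f}")

# Verizon 2024: headcount is not stated exactly, so work backwards from
# the quoted revenue and revenue-per-employee figures (derived estimate).
verizon_implied_headcount = 134.8e9 / 1.35e6
print(f"Verizon implied 2024 headcount: {verizon_implied_headcount:,.0f}")
```

The AT&T result lands just under $868,000, and the implied Verizon headcount comes out slightly below 100,000, both consistent with the figures cited above.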
References:
https://www.lightreading.com/ai-machine-learning/at-t-and-verizon-cut-another-15-3k-jobs-in-2024-as-ai-advanced
Verizon and AT&T cut 5,100 more jobs with a combined 214,350 fewer employees than 2015
https://www.verizon.com/about/news/verizon-unveils-ai-strategy-power-next-gen-ai-demands
https://www.lightreading.com/the-edge-network/at-t-and-verizon-are-pivoting-into-the-landlord-biz-for-ai
AT&T to deploy Fujitsu and Mavenir radio’s in crowded urban areas
AT&T’s leads the pack of U.S. fiber optic network service providers
AT&T’s fiber business grows along with FWA “Internet Air” in Q4-2023
U.S. Cellular to Sell Spectrum Licenses to Verizon in $1 Billion Deal
Verizon to buy Frontier Communications
Verizon Business sees escalating risks in mobile and IoT security
Gartner: Gen AI nearing trough of disillusionment; GSMA survey of network operator use of AI
Global IT spending is expected to total $5.61 trillion in 2025, an increase of 9.8% from 2024, according to the latest forecast by Gartner, Inc.
“While budgets for CIOs are increasing, a significant portion will merely offset price increases within their recurrent spending,” said John-David Lovelock, Distinguished VP Analyst at Gartner. “This means that, in 2025, nominal spending versus real IT spending will be skewed, with price hikes absorbing some or all of budget growth. All major categories are reflecting higher-than-expected prices, prompting CIOs to defer and scale back their true budget expectations.”
GenAI will Influence IT Spending, but IT Spending Won’t Be on GenAI Itself:
Segments including data center systems, devices and software will see double-digit growth in 2025, largely due to generative AI (GenAI) hardware upgrades (see Table 1). However, these upgraded segments will not differentiate themselves in terms of functionality yet, even with new hardware.
Table 1. Worldwide IT Spending Forecast (Millions of U.S. Dollars)
Segment | 2024 Spending | 2024 Growth (%) | 2025 Spending | 2025 Growth (%) |
Data Center Systems | 329,132 | 39.4 | 405,505 | 23.2 |
Devices | 734,162 | 6.0 | 810,234 | 10.4 |
Software | 1,091,569 | 12.0 | 1,246,842 | 14.2 |
IT Services | 1,588,121 | 5.6 | 1,731,467 | 9.0 |
Communications Services | 1,371,787 | 2.3 | 1,423,746 | 3.8 |
Overall IT | 5,114,771 | 7.7 | 5,617,795 | 9.8 |
Source: Gartner (January 2025)
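The 2025 growth percentages in Table 1 follow directly from the two spending columns. As a quick check, the Python sketch below recomputes them. The figures are copied from the table, while the variable and function names are illustrative.

```python
# (2024 spending, 2025 spending) in millions of U.S. dollars, from Table 1
spending = {
    "Data Center Systems": (329_132, 405_505),
    "Devices": (734_162, 810_234),
    "Software": (1_091_569, 1_246_842),
    "IT Services": (1_588_121, 1_731_467),
    "Communications Services": (1_371_787, 1_423_746),
}
overall = (5_114_771, 5_617_795)


def yoy_growth_pct(previous: float, current: float) -> float:
    """Year-over-year growth, in percent."""
    return (current / previous - 1.0) * 100.0


for segment, (y2024, y2025) in spending.items():
    print(f"{segment}: {yoy_growth_pct(y2024, y2025):.1f}%")
print(f"Overall IT: {yoy_growth_pct(*overall):.1f}%")

# The five segments should add up to the Overall IT row (within rounding).
total_2024 = sum(y2024 for y2024, _ in spending.values())
total_2025 = sum(y2025 for _, y2025 in spending.values())
print(total_2024, total_2025)
```

The recomputed rates agree with the published 2025 growth column to one decimal place, and the segment totals match the Overall IT row to within one unit of rounding.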
“GenAI is sliding toward the trough of disillusionment which reflects CIOs’ declining expectations for GenAI, but not their spending on this technology,” said Lovelock. “For instance, the new AI ready PCs do not yet have ‘must have’ applications that utilize the hardware. While both consumers and enterprises will purchase AI-enabled PCs, tablets and mobile phones, those purchases will not be overly influenced by the GenAI functionality.”
Spending on AI-optimized servers easily doubles spending on traditional servers in 2025, reaching $202 billion.
“IT services companies and hyperscalers account for over 70% of spending in 2025,” said Lovelock. “By 2028, hyperscalers will operate $1 trillion dollars’ worth of AI optimized servers, but not within their traditional business model or IaaS Market. Hyperscalers are pivoting to be part of the oligopoly AI model market.”
Gartner’s IT spending forecast methodology relies heavily on rigorous analysis of the sales by over a thousand vendors across the entire range of IT products and services. Gartner uses primary research techniques, complemented by secondary research sources, to build a comprehensive database of market size data on which to base its forecast.
More information on the forecast can be found in the complimentary Gartner webinar “IT Spending Forecast, 4Q24 Update: GenAI’s Impact on a $7 Trillion IT Market.”
………………………………………………………………………………………………………….
Gartner’s 2025 forecast for IT spending is consistent with the market research firm’s predictions from late last year that the move to AI is driving a surge in spending on data center infrastructure and IT services in Europe. IT spending across the continent will come in at US$1.28 trillion in 2025, the firm said. Presumably it takes a little longer to gather up the data necessary for predictions across the whole world.
……………………………………………………………………………………………………
Separately, Citi analysts expect 2025 growth to be largely driven by continued AI spending as data center capital expenditure for the biggest cloud service providers is forecasted to increase by 40% this year.
……………………………………………………………………………………………………
In a recent survey of network operators, GSMA found that telcos are allocating more resources to in-house and out-of-house AI capabilities and projects, but only a subset are spending more than 15% of their digital budgets on AI. Nearly half of operators are dedicating 5% to 15% of their digital budgets towards AI, covering a range of categories, including data systems, large language models and infrastructure upgrades, the GSMA survey found. That AI money is also being allocated toward AI teams, tools and partnerships, said GSMA. The association, which primarily represents mobile operators, has been asked for more details about the size, scope and methodology of its latest study.
AI Status at Network Operators:
References:
Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?
Canalys & Gartner: AI investments drive growth in cloud infrastructure spending
AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms
AI wave stimulates big tech spending and strong profits, but for how long?
Telco spending on RAN infrastructure continues to decline as does mobile traffic growth
CES 2025: Intel announces edge compute processors with AI inferencing capabilities
At CES 2025 today, Intel unveiled the new Intel® Core™ Ultra (Series 2) processors, designed to revolutionize mobile computing for businesses, creators and enthusiast gamers. Intel said “the new processors feature cutting-edge AI enhancements, increased efficiency and performance improvements.”
“Intel Core Ultra processors are setting new benchmarks for mobile AI and graphics, once again demonstrating the superior performance and efficiency of the x86 architecture as we shape the future of personal computing,” said Michelle Johnston Holthaus, interim co-CEO of Intel and CEO of Intel Products. “The strength of our AI PC product innovation, combined with the breadth and scale of our hardware and software ecosystem across all segments of the market, is empowering users with a better experience in the traditional ways we use PCs for productivity, creation and communication, while opening up completely new capabilities with over 400 AI features. And Intel is only going to continue bolstering its AI PC product portfolio in 2025 and beyond as we sample our lead Intel 18A product to customers now ahead of volume production in the second half of 2025.”
Intel also announced new edge computing processors, designed to provide scalability and superior performance across diverse use cases. Intel Core Ultra processors were said to deliver remarkable power efficiency, making them ideal for AI workloads at the edge, with performance gains that surpass competing products in critical metrics like media processing and AI analytics. Those edge processors are targeted at compute servers running in hospitals, retail stores, factory floors and other “edge” locations that sit between big data centers and end-user devices. Such locations are becoming increasingly important to telecom network operators hoping to sell AI capabilities, private wireless networks, security offerings and other services to those enterprise locations.
Intel edge products launching today at CES include:
- Intel® Core™ Ultra 200S/H/U series processors (code-named Arrow Lake).
- Intel® Core™ 200S/H series processors (code-named Bartlett Lake S and Raptor Lake H Refresh).
- Intel® Core™ 100U series processors (code-named Raptor Lake U Refresh).
- Intel® Core™ 3 processor and Intel® Processor (code-named Twin Lake).
“Intel has been powering the edge for decades,” said Michael Masci, VP of product management in Intel’s edge computing group, during a media presentation last week. According to Masci, AI is beginning to expand the edge opportunity through inferencing [1.]. “Companies want more local compute. AI inference at the edge is the next major hotbed for AI innovation and implementation,” he added.
Note 1. Inferencing in AI refers to the process where a trained AI model makes predictions or decisions based on new data, rather than on the data it was trained on. It’s essentially AI’s ability to apply learned knowledge to fresh inputs in real time. Edge computing plays a critical role in inferencing because it brings the computation closer to users. That lowers latency (much faster AI responses) and can also reduce bandwidth costs and improve privacy and security.
Editor’s Note: Intel’s edge compute business – the one pursuing AI inferencing – is in its Client Computing Group (CCG) business unit. Intel’s chips for telecom operators reside inside its NEX business unit.
Intel’s Masci specifically called out Nvidia’s GPU chips, claiming Intel’s new silicon lineup supports up to 5.8x faster performance and better usage per watt. Indeed, Intel claims its “Core™ Ultra 7 processor uses about one-third fewer TOPS (Trillions of Operations Per Second) than Nvidia’s Jetson AGX Orin, but beats its competitor with media performance that is up to 5.6 times faster, video analytics performance that is up to 3.4x faster and performance per watt per dollar up to 8.2x better.”
………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….
However, Nvidia has been using inference in its AI chips for quite some time. Company officials last month confirmed that 40% of Nvidia’s revenues come from AI inference, rather than AI training efforts in big data centers. Colette Kress, Nvidia Executive Vice President and Chief Financial Officer, said, “Our architecture allows an end-to-end scaling approach for them to do whatever they need to in the world of accelerated computing and AI. And we’re a very strong candidate to help them, not only with that infrastructure, but also with the software.”
“Inference is super hard. And the reason why inference is super hard is because you need the accuracy to be high on the one hand. You need the throughput to be high so that the cost could be as low as possible, but you also need the latency to be low,” explained Nvidia CEO Jensen Huang during his company’s recent quarterly conference call.
“Our hopes and dreams is that someday, the world does a ton of inference. And that’s when AI has really succeeded, right? It’s when every single company is doing inference inside their companies for the marketing department and forecasting department and supply chain group and their legal department and engineering, and coding, of course. And so we hope that every company is doing inference 24/7.”
……………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….
Sadly for its many fans (including this author), Intel continues to struggle in both data center processors and AI/GPU chips. The Wall Street Journal recently reported that “Intel’s perennial also-ran, AMD, actually eclipsed Intel’s revenue for chips that go into data centers. This is a stunning reversal: In 2022, Intel’s data-center revenue was three times that of AMD.”
Even worse for Intel, more and more of the chips that go into data centers are GPUs and Intel has minuscule market share of these high-end chips. GPUs are used for training and delivering AI. The WSJ notes that many of the companies spending the most on building out new data centers are switching to chips that have nothing to do with Intel’s proprietary architecture, known as x86, and are instead using a combination of a competing architecture from ARM and their own custom chip designs. For example, more than half of the CPUs Amazon has installed in its data centers over the past two years were its own custom chips based on ARM’s architecture, Dave Brown, Amazon vice president of compute and networking services, said recently.
This displacement of Intel is being repeated all across the big providers and users of cloud computing services. Microsoft and Google have also built their own custom, ARM-based CPUs for their respective clouds. In every case, companies are moving in this direction because of the kind of customization, speed and efficiency that custom silicon supports.
References:
https://www.intel.com/content/www/us/en/newsroom/news/2025-ces-client-computing-news.html#gs.j0qbu4
https://www.intel.com/content/www/us/en/newsroom/news/2025-ces-client-computing-news.html#gs.j0qdhd
https://www.wsj.com/tech/intel-microchip-competitors-challenges-562a42e3
Massive layoffs and cost cutting will decimate Intel’s already tiny 5G network business
WSJ: China’s Telecom Carriers to Phase Out Foreign Chips; Intel & AMD will lose out
The case for and against AI-RAN technology using Nvidia or AMD GPUs
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
FT: Nvidia invested $1bn in AI start-ups in 2024
AI winner Nvidia faces competition with new super chip delayed
AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions
Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections
A growing portion of the billions of dollars being spent on AI data centers will go to the suppliers of networking chips, lasers, and switches that integrate thousands of GPUs and conventional micro-processors into a single AI computer cluster. AI can’t advance without advanced networks, says Nvidia’s networking chief Gilad Shainer. “The network is the most important element because it determines the way the data center will behave.”
Networking chips now account for just 5% to 10% of all AI chip spending, said Broadcom CEO Hock Tan. As the size of AI server clusters hits 500,000 or a million processors, Tan expects that networking will become 15% to 20% of a data center’s chip budget. A data center with a million or more processors will cost $100 billion to build.
The firms building the biggest AI clusters are the hyperscalers, led by Alphabet’s Google, Amazon.com, Facebook parent Meta Platforms, and Microsoft. Not far behind are Oracle, xAI, Alibaba Group Holding, and ByteDance. Earlier this month, Bloomberg reported that capex for the four leading hyperscalers would exceed $200 billion this year, making the year-over-year increase as much as 50%. Goldman Sachs estimates that AI data center spending will rise another 35% to 40% in 2025. Morgan Stanley expects Amazon and Microsoft to lead the pack with $96.4bn and $89.9bn of capex respectively, while Google and Meta will follow at $62.6bn and $52.3bn.
AI compute server architectures began scaling in recent years for two reasons.
1.] High end processor chips from Intel neared the end of speed gains made possible by shrinking a chip’s transistors.
2.] Computer scientists at companies such as Google and OpenAI built AI models that performed amazing feats by finding connections within large volumes of training material.
As the components of these “Large Language Models” (LLMs) grew to millions, billions, and then trillions, they began translating languages, doing college homework, handling customer support, and designing cancer drugs. But training an AI LLM is a huge task, as it calculates across billions of data points, rolls those results into new calculations, then repeats. Even with Nvidia accelerator chips to speed up those calculations, the workload has to be distributed across thousands of Nvidia processors and run for weeks.
To keep up with the distributed computing challenge, AI data centers all have two networks:
- The “front end” network which sends and receives data to/from external users —like the networks of every enterprise data center or cloud-computing center. It’s placed on the network’s outward-facing front end or boundary and typically includes equipment like high end routers, web servers, DNS servers, application servers, load balancers, firewalls, and other devices which connect to the public internet, IP-MPLS VPNs and private lines.
- A “back end” network that connects every AI processor (GPUs and conventional MPUs) and memory chip with every other processor within the AI data center. “It’s just a supercomputer made of many small processors,” says Ram Velaga, Broadcom’s chief of core switching silicon. “All of these processors have to talk to each other as if they are directly connected.” AI’s back-end networks need high bandwidth switches and network connections. Delays and congestion are expensive when each Nvidia compute node costs as much as $400,000. Idle processors waste money. Back-end networks carry huge volumes of data. When thousands of processors are exchanging results, the data crossing one of these networks in a second can equal all of the internet traffic in America.
Nvidia became one of today’s largest vendors of network gear via its acquisition of Israel-based Mellanox in 2020 for $6.9 billion. CEO Jensen Huang and his colleagues realized early on that AI workloads would exceed a single box. They started using InfiniBand—a network designed for scientific supercomputers—supplied by Mellanox. InfiniBand became the standard for AI back-end networks.
While most AI dollars still go to Nvidia GPU accelerator chips, back-end networks are important enough that Nvidia has large networking sales. In the September quarter, those network sales grew 20%, to $3.1 billion. However, Ethernet is now challenging InfiniBand’s lock on AI networks. Fortunately for Nvidia, its Mellanox subsidiary also makes high speed Ethernet hardware modules. For example, xAI uses Nvidia Ethernet products in its record-size Colossus system.
While current versions of Ethernet lack InfiniBand’s tools for memory and traffic management, those are now being added in a version called Ultra Ethernet [1.]. Many hyperscalers think Ethernet will outperform InfiniBand, as clusters scale to hundreds of thousands of processors. Another attraction is that Ethernet has many competing suppliers. “All the largest guys—with an exception of Microsoft—have moved over to Ethernet,” says an anonymous network industry executive. “And even Microsoft has said that by summer of next year, they’ll move over to Ethernet, too.”
Note 1. Primary goals and mission of Ultra Ethernet Consortium (UEC): Deliver a complete architecture that optimizes Ethernet for high performance AI and HPC networking, exceeding the performance of today’s specialized technologies. UEC specifically focuses on functionality, performance, TCO, and developer and end-user friendliness, while minimizing changes to only those required and maintaining Ethernet interoperability. Additional goals: Improved bandwidth, latency, tail latency, and scale, matching tomorrow’s workloads and compute architectures. Backwards compatibility to widely-deployed APIs and definition of new APIs that are better optimized to future workloads and compute architectures.
……………………………………………………………………………………………………………………………………………………………………………………………………………………………….
Ethernet back-end networks offer a big opportunity for Arista Networks, which builds switches using Broadcom chips. In the past two years, AI data centers became an important business for Arista. AI also generates sales for Arista’s switch rivals Cisco and Juniper Networks (soon to be a part of Hewlett Packard Enterprise), but those companies aren’t as established among hyperscalers. Analysts expect Arista to get more than $1 billion from AI sales next year and predict that the total market for back-end switches could reach $15 billion in a few years. Three of the five big hyperscale operators are using Arista Ethernet switches in back-end networks, and the other two are testing them. Arista CEO Jayshree Ullal (a former SCU EECS grad student of this author/ex-adjunct Professor) says that back-end network sales seem to pull along more orders for front-end gear, too.
The network chips used for AI switching are feats of engineering that rival AI processor chips. Cisco makes its own custom Ethernet switching chips, but some 80% of the chips used in other Ethernet switches come from Broadcom, with the rest supplied mainly by Marvell. These switch chips now move 51 terabits of data a second; it’s the same amount of data that a person would consume by watching videos for 200 days straight. Next year, switching speeds will double.
The other important parts of a network are connections between computing nodes and cables. As the processor count rises, connections increase at a faster rate. A 25,000-processor cluster needs 75,000 interconnects. A million processors will need 10 million interconnects. More of those connections will be fiber optic, instead of copper or coax. As networks speed up, copper’s reach shrinks. So, expanding clusters have to “scale-out” by linking their racks with optics. “Once you move beyond a few tens of thousand, or 100,000, processors, you cannot connect anything with copper—you have to connect them with optics,” Velaga says.
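The two cluster sizes quoted above imply that the number of interconnects per processor itself grows with scale. A minimal Python sketch of that arithmetic follows; the function name is illustrative, and the only inputs are the figures quoted in the text.

```python
def interconnects_per_processor(processors: int, interconnects: int) -> float:
    """Average number of interconnects attached to each processor."""
    return interconnects / processors

# Figures quoted in the text above.
small_cluster = interconnects_per_processor(25_000, 75_000)
large_cluster = interconnects_per_processor(1_000_000, 10_000_000)

print(f"25,000-processor cluster: {small_cluster:.0f} interconnects per processor")
print(f"1,000,000-processor cluster: {large_cluster:.0f} interconnects per processor")
```

On these numbers the ratio rises from 3 to 10 interconnects per processor, which is what "connections increase at a faster rate" means in practice.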
AI processing chips (GPUs) exchange data at about 10 times the rate of a general-purpose processor chip. Copper has been the preferred conduit because it’s reliable and requires no extra power. At current network speeds, copper works well at lengths of up to five meters. So, hyperscalers have tried to “scale-up” within copper’s reach by packing as many processors as they can within each shelf, and rack of shelves.
Back-end connections now run at 400 gigabits per second, which is equal to a day and a half of video viewing. Broadcom’s Velaga says network speeds will rise to 800 gigabits in 2025, and 1.6 terabits in 2026.
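As a rough sanity check, the two video-viewing comparisons in this article (51 terabits for 200 days, 400 gigabits for a day and a half) can be converted into the average video bitrate each one implies. The sketch below assumes continuous viewing and decimal units (1 terabit = 10^12 bits); the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400

def implied_video_mbps(bits: float, days: float) -> float:
    """Average bitrate in Mbps if `bits` of data lasts `days` of nonstop viewing."""
    return bits / (days * SECONDS_PER_DAY) / 1e6

# One second of switch-chip traffic: 51 terabits, said to equal 200 days of video.
switch_chip_mbps = implied_video_mbps(51e12, 200)

# One second of back-end link traffic: 400 gigabits, said to equal 1.5 days of video.
backend_link_mbps = implied_video_mbps(400e9, 1.5)

print(f"switch chip comparison implies {switch_chip_mbps:.2f} Mbps video")
print(f"back-end link comparison implies {backend_link_mbps:.2f} Mbps video")
```

Both comparisons work out to roughly 3 Mbps, so the two analogies are consistent with each other under these assumptions.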
Nvidia, Broadcom, and Marvell sell optical interface products, with Marvell enjoying a strong lead in 800-gigabit interconnects. A number of companies supply lasers for optical interconnects, including Coherent, Lumentum Holdings, Applied Optoelectronics, and Chinese vendors Innolight and Eoptolink. They will all battle for the AI data center over the next few years.
A 500,000-processor cluster needs at least 750 megawatts, enough to power 500,000 homes. When AI models scale to a million or more processors, they will require gigawatts of power and have to span more than one physical data center, says Velaga.
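The power figures above can be broken down per processor. The sketch below uses only the numbers quoted in the text; extrapolating linearly to a million processors is an assumption made here for illustration, not a claim from Velaga.

```python
def watts_per_unit(total_megawatts: float, units: int) -> float:
    """Average power draw per unit, in watts."""
    return total_megawatts * 1e6 / units

# 750 MW for 500,000 processors, also described as enough for 500,000 homes.
watts_per_processor = watts_per_unit(750, 500_000)
watts_per_home = watts_per_unit(750, 500_000)

# Assumption: power scales linearly with processor count.
gigawatts_for_million = watts_per_processor * 1_000_000 / 1e9

print(f"{watts_per_processor:.0f} W per processor")
print(f"{gigawatts_for_million:.1f} GW for a million processors")
```

At about 1.5 kW per processor, a million-processor cluster would need on the order of 1.5 GW, in line with the "gigawatts of power" estimate.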
The opportunity for optical connections reaches beyond the AI data center, because a single site cannot supply enough power for the largest clusters. In September, Marvell, Lumentum, and Coherent demonstrated optical links for data centers as far apart as 300 miles. Nvidia’s next-generation networks will be ready to run a single AI workload across remote locations.
Some worry that AI performance will stop improving as processor counts scale. Nvidia’s Jensen Huang dismissed those concerns on his last conference call, saying that clusters of 100,000 processors or more will just be table stakes with Nvidia’s next generation of chips. Broadcom’s Velaga says he is grateful: “Jensen (Nvidia CEO) has created this massive opportunity for all of us.”
References:
https://www.datacenterdynamics.com/en/news/morgan-stanley-hyperscaler-capex-to-reach-300bn-in-2025/
https://ultraethernet.org/ultra-ethernet-specification-update/
Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!
Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?
Canalys & Gartner: AI investments drive growth in cloud infrastructure spending
AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms
AI wave stimulates big tech spending and strong profits, but for how long?
Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029
Using a distributed synchronized fabric for parallel computing workloads- Part I
Using a distributed synchronized fabric for parallel computing workloads- Part II
FT: Nvidia invested $1bn in AI start-ups in 2024
Nvidia invested $1bn in artificial intelligence companies in 2024, as it emerged as a crucial backer of start-ups using the company’s graphics processing units (GPUs). The king of AI semiconductors, which surpassed a $3tn market capitalization in June due to huge demand for its high-performing GPUs, has significantly invested into some of its own customers.
According to corporate filings and Dealroom research, Nvidia spent a total of $1bn across 50 start-up funding rounds and several corporate deals in 2024, compared with 2023, which saw 39 start-up rounds and $872mn in spending. The vast majority of deals were with “core AI” companies with high computing infrastructure demands, and so in some cases also buyers of its own chips. Tech companies have spent tens of billions of dollars on Nvidia’s chips over the past year since the debut of ChatGPT two years ago kick-started an unprecedented surge of investment in AI. Nvidia’s uptick in deals comes after it amassed a $9bn war chest of cash with its GPUs becoming one of the world’s hottest commodities.
The company’s shares rose more than 170% in 2024, as it and other tech giants helped power the S&P 500 index to its best two-year run this century. Nvidia’s $1bn worth of investments in “non-affiliated entities” in the first nine months of last year includes both its venture and corporate investment arms.
According to company filings, that sum was 15% more than in 2023 and more than 10 times as much as it invested in 2022. Some of Nvidia’s largest customers, such as Microsoft, Amazon and Google, are actively working to reduce their reliance on its GPUs by developing their own custom chips. Such a development could make smaller AI companies a more important generator of revenues for Nvidia in the future.
“Right now Nvidia wants there to be more competition and it makes sense for them to have these new players in the mix,” said a fund manager with a stake in a number of companies it had invested in.
In 2024, Nvidia struck more deals than Microsoft and Amazon, although Google remains far more active, according to Dealroom. Such prolific dealmaking has raised concerns about Nvidia’s grip over the AI industry, at a time when it is facing heightened antitrust scrutiny in the US, Europe and China. Bill Kovacic, former chair of the US Federal Trade Commission, said competition watchdogs were “keen” to investigate a “dominant enterprise making these big investments” to see if buying company stakes was aimed at “achieving exclusivity”, although he said investments in a customer base could prove beneficial. Nvidia strongly rejects the idea that it connects funding with any requirement to use its technology.
The company said it was “working to grow our ecosystem, support great companies and enhance our platform for everyone. We compete and win on merit, independent of any investments we make.” It added: “Every company should be free to make independent technological choices that best suit their needs and strategies.”
The Santa Clara-based company’s most recent start-up deal was a strategic investment in Elon Musk’s xAI. Other significant 2024 investments included its participation in funding rounds for OpenAI, Cohere, Mistral and Perplexity, some of the most prominent AI model providers.
Nvidia also has a start-up incubator, Inception, which separately has helped the early evolution of thousands of fledgling companies. The Inception program offers start-ups “preferred pricing” on hardware, as well as cloud credits from Nvidia’s partners.
There has been an uptick in Nvidia’s acquisitions, including a takeover of Run:ai, an Israeli AI workload management platform. The deal closed this week after coming under scrutiny from the EU’s antitrust regulator, which ultimately cleared the transaction. The US Department of Justice was also looking at the deal, according to Politico. Nvidia also bought AI software groups Nebulon, OctoAI, Brev.dev, Shoreline.io and Deci. Collectively it has made more acquisitions in 2024 than the previous four years combined, according to Dealroom.
The company is investing widely, pouring millions of dollars into AI groups involved in medical technology, search engines, gaming, drones, chips, traffic management, logistics, data storage and generation, natural language processing and humanoid robots. Its portfolio includes a number of start-ups whose valuations have soared to billions of dollars. CoreWeave, an AI cloud computing service provider and significant purchaser of Nvidia chips, is preparing to float early this year at a valuation as high as $35bn — increasing from about $7bn a year ago.
Nvidia invested $100mn in CoreWeave in early 2023, and participated in a $1bn equity fundraising round by the company in May. Another start-up, Applied Digital, was facing a plunging share price in 2024, with revenue misses and considerable debt obligations, before a group of investors led by Nvidia provided $160mn of equity capital in September, prompting a 65 per cent surge in its share price.
“Nvidia is using their massive market cap and huge cash flow to keep purchasers alive,” said Nate Koppikar, a short seller at Orso Partners. “If Applied Digital had died, that’s [a large volume] of sales that would have died with it.”
Neocloud groups such as CoreWeave, Crusoe and Lambda Labs have acquired tens of thousands of Nvidia’s high-performance GPUs, which are crucial for developing generative AI models. Those Nvidia AI chips are now also being used as collateral for huge loans. The frenzied dealmaking has shone a light on a rampant GPU economy in Silicon Valley that is increasingly being supported by deep-pocketed financiers in New York. However, its rapid growth has raised concerns about the potential for more risky lending, circular financing and Nvidia’s chokehold on the AI market.
References:
https://www.ft.com/content/f8acce90-9c4d-4433-b189-e79cad29f74e
https://www.ft.com/content/41bfacb8-4d1e-4f25-bc60-75bf557f1f21
The case for and against AI-RAN technology using Nvidia or AMD GPUs
Nvidia is proposing a new approach to telco networks dubbed “AI radio access network (AI-RAN).” The GPU king says: “Traditional CPU or ASIC-based RAN systems are designed only for RAN use and cannot process AI traffic today. AI-RAN enables a common GPU-based infrastructure that can run both wireless and AI workloads concurrently, turning networks from single-purpose to multi-purpose infrastructures and turning sites from cost-centers to revenue sources. With a strategic investment in the right kind of technology, telcos can leap forward to become the AI grid that facilitates the creation, distribution, and consumption of AI across industries, consumers, and enterprises. This moment in time presents a massive opportunity for telcos to build a fabric for AI training (creation) and AI inferencing (distribution) by repurposing their central and distributed infrastructures.”
One of the first principles of AI-RAN technology is to be able to run RAN and AI workloads concurrently and without compromising carrier-grade performance. This multi-tenancy can be either in time or space: dividing the resources based on time of day or based on percentage of compute. This also implies the need for an orchestrator that can provision, de-provision, or shift workloads seamlessly based on available capacity.
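The two multi-tenancy modes described above can be illustrated with a toy allocation policy. This is a minimal sketch, not Nvidia's or SoftBank's orchestrator: the `Split` type, the 10% headroom, the 07:00 to 23:00 busy window and the 80/20 shares are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Split:
    """Share of GPU compute given to each workload, in percent."""
    ran_pct: int
    ai_pct: int

def space_split(ran_demand_pct: int, headroom_pct: int = 10) -> Split:
    """Multi-tenancy in space: RAN gets its current demand plus headroom,
    and AI workloads receive whatever compute is left over."""
    ran = min(100, ran_demand_pct + headroom_pct)
    return Split(ran_pct=ran, ai_pct=100 - ran)

def time_split(hour: int) -> Split:
    """Multi-tenancy in time: a fixed schedule keyed on hour of day.
    Hypothetical policy: RAN-heavy in busy hours, AI-heavy overnight."""
    if 7 <= hour < 23:
        return Split(ran_pct=80, ai_pct=20)
    return Split(ran_pct=20, ai_pct=80)

print(space_split(50))  # RAN demand at 50% of compute
print(time_split(3))    # 03:00, an overnight hour
```

A real orchestrator would also have to provision and de-provision workloads without violating carrier-grade RAN guarantees; this sketch only shows how the compute share might be decided.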
ARC-1, an appliance Nvidia showed off earlier this year, comes with a Grace Blackwell “superchip” that would replace either a traditional vendor’s application-specific integrated circuit (ASIC) or an Intel processor. Ericsson and Nokia are exploring the possibilities with Nvidia. Developing RAN software for use with Nvidia’s chips means acquiring competency in compute unified device architecture (CUDA), Nvidia’s parallel computing platform and programming model. “They do have to reprofile into CUDA,” said Soma Velayutham, the general manager of Nvidia’s AI and telecom business, during a recent interview with Light Reading. “That is an effort.”
Proof of Concept:
SoftBank has turned the AI-RAN vision into reality, with its successful outdoor field trial in Fujisawa City, Kanagawa, Japan, where NVIDIA-accelerated hardware and NVIDIA Aerial software served as the technical foundation. That achievement marks multiple steps forward for AI-RAN commercialization and provides real proof points addressing industry requirements on technology feasibility, performance, and monetization:
- World’s first outdoor 5G AI-RAN field trial running on an NVIDIA-accelerated computing platform. This is an end-to-end solution based on a full-stack, virtual 5G RAN software integrated with 5G core.
- Carrier-grade virtual RAN performance achieved.
- AI and RAN multi-tenancy and orchestration achieved.
- Energy efficiency and economic benefits validated compared to existing benchmarks.
- A new solution to unlock AI marketplace integrated on an AI-RAN infrastructure.
- Real-world AI applications showcased, running on an AI-RAN network.
Above all, SoftBank aims to commercially release its own AI-RAN product for worldwide deployment in 2026. To help other mobile network operators get started on their AI-RAN journey now, SoftBank is also planning to offer a reference kit comprising the hardware and software elements required to trial AI-RAN in a fast and easy way.
SoftBank developed its AI-RAN solution by integrating hardware and software components from NVIDIA and ecosystem partners and hardening them to meet carrier-grade requirements. Together, the solution enables a full 5G vRAN stack that is 100% software-defined, running on NVIDIA GH200 (CPU+GPU), NVIDIA BlueField-3 (NIC/DPU), and Spectrum-X for fronthaul and backhaul networking. It integrates with 20 radio units and a 5G core network and connects 100 mobile UEs.
The core software stack includes the following components:
- SoftBank-developed and optimized 5G RAN Layer 1 functions such as channel mapping, channel estimation, modulation, and forward-error-correction, using NVIDIA Aerial CUDA-Accelerated-RAN libraries
- Fujitsu software for Layer 2 functions
- Red Hat’s OpenShift Container Platform (OCP) as the container virtualization layer, enabling different types of applications to run on the same underlying GPU computing infrastructure
- A SoftBank-developed E2E AI and RAN orchestrator, to enable seamless provisioning of RAN and AI workloads based on demand and available capacity
AI marketplace solution integrated with SoftBank AI-RAN. Image Credit: Nvidia
The underlying hardware is the NVIDIA GH200 Grace Hopper Superchip, which can be used in various configurations from distributed to centralized RAN scenarios. This implementation uses multiple GH200 servers in a single rack, serving AI and RAN workloads concurrently, for an aggregated-RAN scenario. This is comparable to deploying multiple traditional RAN base stations.
In this pilot, each GH200 server was able to process 20 5G cells using 100-MHz bandwidth, when used in RAN-only mode. For each cell, 1.3 Gbps of peak downlink performance was achieved in ideal conditions, and 816 Mbps was demonstrated with carrier-grade availability in the outdoor deployment.
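The per-cell figures above can be rolled up to a per-server number. The sketch below assumes, purely for illustration, that all 20 cells reach the quoted rate at the same time; the pilot results do not state an aggregate figure.

```python
CELLS_PER_SERVER = 20      # cells per GH200 server in RAN-only mode
PEAK_MBPS_PER_CELL = 1300  # 1.3 Gbps peak downlink, ideal conditions
FIELD_MBPS_PER_CELL = 816  # demonstrated in the outdoor deployment

def aggregate_gbps(cells: int, mbps_per_cell: int) -> float:
    """Total downlink in Gbps if every cell runs at `mbps_per_cell` simultaneously."""
    return cells * mbps_per_cell / 1000

peak_gbps = aggregate_gbps(CELLS_PER_SERVER, PEAK_MBPS_PER_CELL)
field_gbps = aggregate_gbps(CELLS_PER_SERVER, FIELD_MBPS_PER_CELL)
field_to_peak = FIELD_MBPS_PER_CELL / PEAK_MBPS_PER_CELL

print(f"peak: {peak_gbps} Gbps per server, field: {field_gbps} Gbps per server")
print(f"outdoor rate is {field_to_peak:.0%} of the ideal-condition peak")
```

Under that assumption a single server would carry about 26 Gbps at peak and about 16 Gbps at the outdoor rate, with the outdoor per-cell rate at roughly 63% of the ideal-condition peak.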
……………………………………………………………………………………………………………………………………..
Could AMD GPUs be an alternative to Nvidia AI-RAN?
AMD is certainly valued by NScale, a UK business with a GPU-as-a-service offer, as an AI alternative to Nvidia. “AMD’s approach is quite interesting,” said David Power, NScale’s chief technology officer. “They have a very open software ecosystem. They integrate very well with common frameworks.” So far, though, AMD has said nothing publicly about any AI-RAN strategy.
The other telco concern is about those promised revenues. Nvidia insists it was conservative when estimating that a telco could realize $5 in inferencing revenues for every $1 invested in AI-RAN. But the numbers met with a fair degree of skepticism in the wider market. Nvidia says the advantage of doing AI inferencing at the edge is that latency, the time a signal takes to travel around the network, would be much lower compared with inferencing in the cloud. But the same case was previously made for hosting other applications at the edge, and they have not taken off.
Even if AI changes that, it is unclear telcos would stand to benefit. Sales generated by the applications available on the mobile Internet have gone largely to hyperscalers and other software developers, leaving telcos with a dwindling stream of connectivity revenues. Expect AI-RAN to be a big topic for 2025 as operators carefully weigh their options. Many telcos are unconvinced there is a valid economic case for AI-RAN, especially since GPUs consume a lot of power (they are perceived as “energy hogs”).
References:
AI-RAN Goes Live and Unlocks a New AI Opportunity for Telcos
https://www.lightreading.com/ai-machine-learning/2025-preview-ai-ran-would-be-a-paradigm-shift
Nvidia bid to reshape 5G needs Ericsson and Nokia buy-in
Softbank goes radio gaga about Nvidia in nervy days for Ericsson
T-Mobile emerging as Nvidia’s big AI cheerleader