Dell’Oro: Analysis of the Nokia-NVIDIA partnership on AI RAN

According to Dell’Oro VP Stefan Pongratz, Nokia has outlined a clear plan to arrest its declining RAN revenue share (see chart below), with NVIDIA now a central pillar of that strategy. The partnership is designed to deliver AI RAN [1.] while meeting wireless network operators’ near-term constraints and concerns on performance, power, and TCO (Total Cost of Ownership). IEEE Techblog has noted in many past blog posts that telcos harbor serious doubts about AI RAN, which suggests they may be reluctant to buy into the new RAN architecture.

This is especially relevant considering the monumental failure of multi-vendor Open RAN, which was promoted as a game changer but has dismally failed to attain that vision.

Note 1.  AI RAN is a mobile RAN architecture where AI and machine learning are embedded into the RAN software and underlying compute platform to optimize how the network is planned, configured, and operated. It is being pushed by NVIDIA to get its GPUs into 5G, 5G-Advanced and 6G base stations and other wireless network equipment in the RAN.

……………………………………………………………………………………………………………………………………………………..

Nokia aims to use its collaboration with NVIDIA (which invested $1B in the Finland-based company) to stabilize its RAN market share in the near term and create a platform for long-term growth in AI-native 5G-Advanced and 6G networks. The timing—following a dense cadence of disclosures at NVIDIA’s GPU Technology Conference and Nokia’s Capital Markets Day—makes this an ideal time to reassess the scope of the joint announcements, the RAN implications, and Nokia’s broader competitive posture in an increasingly concentrated market.

Both companies share a belief that telecom networks will evolve from best-effort connectivity into a distributed compute fabric underpinning autonomous machines, self-driving vehicles, humanoids, and industrial digital twins. From that perspective, the RAN becomes an “AI grid” that executes and orchestrates AI workloads at the edge, enabling massive numbers of latency-sensitive, bandwidth-intensive AI use cases.

Unlike prior attempts to penetrate the RAN market with its GPUs, NVIDIA is now taking a more pragmatic approach, explicitly targeting parity with incumbent, purpose-built RAN equipment based on performance, power, and TCO rather than leading with speculative multi-tenant or new-revenue narratives. Nokia, acutely aware of wireless telco risk tolerance, is positioning the solution so that the ROI must be justifiable on a pure RAN basis, with additional AI and edge-compute upside treated as optional rather than foundational.

A quick recap of NVIDIA’s entry into RAN: Based on the announcement and subsequent discussions, our understanding is that NVIDIA will invest $1B in Nokia and that NVIDIA-powered AI-RAN products will be incorporated into Nokia’s RAN portfolio starting in 2027 (with trials beginning in 2026). While RAN compute—which represents less than half of the $30B+ RAN market—is immaterial relative to NVIDIA’s $4+ trillion market cap, the potential upside becomes more meaningful when viewed in the context of NVIDIA’s broader telecom ambitions and its $165B in trailing-twelve-month revenue.

With a deployed base of more than 1 million BTS, Nokia is prioritizing three migration vectors to GPU/AI-RAN, in order of expected impact:

  • Purpose-built D-RAN [2.], by inserting a new card into existing AirScale slots.

  • D-RAN vRAN [3.], using COTS servers at the cell site.

  • Cloud RAN [4.] or vRAN, using centralized COTS infrastructure.

This approach aligns with wireless network operators’ desire to sweat existing AirScale assets while minimizing operational disruption.

Note 2.  Purpose-built D-RAN is a distributed RAN architecture where the baseband processing runs on dedicated, vendor-specific hardware at or very close to the cell site, rather than on generic COTS servers. It is “purpose-built” because the silicon, boards, and software stack are tightly integrated and optimized for RAN performance, power efficiency, and footprint, not general-purpose compute.

Note 3. vRAN or virtual RAN is a technology that virtualizes the functions of a cellular network’s radio access network, moving them from dedicated hardware to software running on general-purpose servers. This approach makes mobile networks more flexible, scalable, and cost-efficient by replacing proprietary hardware with software on common-off-the-shelf (COTS) hardware.

Note 4. Cloud RAN (C-RAN) is a centralized cellular network architecture that uses cloud computing to virtualize and process radio access network (RAN) functions. This architecture centralizes baseband units in a “BBU hotel,” allowing for more flexible and scalable network management, efficient resource allocation, and improved network performance. It allows operators to pool resources, adjust capacity based on demand, and support new services, which is a key enabler for 5G networks.

………………………………………………………………………………………………………………………………………………

In the purpose-built D-RAN model, the Distributed Unit, and often the higher-layer functions, are physically collocated with the radio unit at the site, making each site a largely self-contained RAN node. This contrasts with Cloud RAN or vRAN, where baseband functions are centralized or virtualized on shared cloud infrastructure, and with cloud/AI-RAN approaches that rely on GPUs or other general-purpose accelerators instead of custom RAN hardware.

The macro-RAN market (baseband plus radio) is roughly a $30 billion annual opportunity, with on the order of 1–2 million macro sites shipped per year. In that context, operators have limited appetite to pay more than $10,000 per GPU per sector, even if software-led benefits accumulate over time. That is why NVIDIA is signaling GPU pricing in line with ARC-Compact but at roughly double the capacity, and why Nokia is targeting 48–50% gross margins in Mobile Infrastructure by 2028, slightly above the current run-rate.
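As a rough back-of-the-envelope check on those numbers (a sketch, not a Dell’Oro figure; the three-sectors-per-site value is our assumption, not from the text):

```python
# Back-of-the-envelope: average RAN revenue per macro site vs. the cost of
# GPUs at the ~$10,000-per-sector ceiling operators are said to tolerate.
# Market size and site volume come from the article; sectors_per_site is an
# assumption (a typical tri-sector macro site), not a figure from it.
market_usd = 30e9          # annual macro-RAN market (baseband plus radio)
sites_per_year = 1.5e6     # midpoint of the 1-2 million macro sites shipped/year
sectors_per_site = 3       # assumption: typical tri-sector macro site

avg_revenue_per_site = market_usd / sites_per_year
gpu_cost_per_site = 10_000 * sectors_per_site  # GPUs alone, at the ceiling

print(f"Average RAN revenue per site: ${avg_revenue_per_site:,.0f}")
print(f"GPU cost per site at ceiling: ${gpu_cost_per_site:,.0f}")
```

On these assumptions, a full tri-sector GPU fit-out at the ceiling price would already exceed the average per-site RAN spend, which illustrates why GPU pricing discipline is central to the AI-RAN business case.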

If the TCO and performance-per-watt gap versus custom silicon continues to narrow, the partnership could materially influence AI-RAN and Cloud-RAN trajectories while also supporting Nokia’s margin expansion goals. AI-RAN was already expected to scale to roughly one-third of the RAN market by 2029; Nokia’s decision to lean harder into GPUs amplifies this structural shift without fundamentally changing the long-term 6G direction.

In the near term, GPU-enabled D-RAN using empty AirScale slots is expected to dominate deployments, reflecting operators’ preference for incremental, site-level upgrades. At the same time, the Nokia-NVIDIA partnership is not expected to meaningfully alter the overall Cloud RAN vs. D-RAN mix, Open RAN adoption (slow or non-existent), or the trajectory of multi-tenant RAN, which remain more dependent on network operator architecture and commercial decisions than on a single vendor–silicon alignment.

Nokia plans to remain disciplined and focus on areas where it can differentiate and unlock value—particularly through software and faster innovation cycles via its recently announced partnership with NVIDIA. The company sees meaningful opportunities to capture incremental share in North America, Europe, India, and select APAC markets, and it is already off to a solid start—we estimate that Nokia’s 1Q25–3Q25 RAN revenue share outside North America improved slightly relative to 2024. Following this stabilization phase, Nokia is betting that its investments will pay off and that it will be well-positioned to lead with AI-native networks and 6G.

Nokia’s objective is clear: stabilize RAN in the short term, then grow by leading in AI-native networks and 6G over the longer horizon. Success now hinges on Nokia’s ability to operationalize the GPU-based RAN roadmap at scale and on NVIDIA’s ability to deliver carrier-grade economics and performance—turning the AI-RAN narrative into production-grade, repeatable deployments.


References:

Nokia and NVIDIA Take on RAN

Nokia in major pivot from traditional telecom to AI, cloud infrastructure, data center networking and 6G

Dell’Oro: RAN market stable, Mobile Core Network market +14% Y/Y with 72 5G SA core networks deployed

Indosat Ooredoo Hutchison, Nokia and Nvidia AI-RAN research center in Indonesia amongst telco skepticism

Nvidia pays $1 billion for a stake in Nokia to collaborate on AI networking solutions

RAN silicon rethink – from purpose built products & ASICs to general purpose processors or GPUs for vRAN & AI RAN

Dell’Oro: AI RAN to account for 1/3 of RAN market by 2029; AI RAN Alliance membership increases but few telcos have joined

Dell’Oro: RAN revenue growth in 1Q2025; AI RAN is a conundrum

AI RAN Alliance selects Alex Choi as Chairman

Expose: AI is more than a bubble; it’s a data center debt bomb

NTT DOCOMO successful outdoor trial of AI-driven wireless interface with 3 partners

NTT DOCOMO has successfully executed the world’s first outdoor field trial of real-time transceiver systems leveraging artificial intelligence (AI)-driven wireless technology, a critical advancement for sixth-generation (6G) mobile communications (AKA IMT-2030).

Conducted in collaboration with parent company NTT, Inc. (NTT), Nokia Bell Labs, and SK Telecom Co., Ltd, the field trials were held across three sites in Yokosuka City, Kanagawa Prefecture. The results validated that the application of AI optimized system throughput (transmission speed), achieving up to a 100% improvement over conventional, non-AI methods under identical environmental conditions, effectively doubling communication speeds.

Wireless communication quality can be compromised by fluctuations in radio propagation environments, leading to unstable connections. To mitigate this challenge, the partners developed “AI-AI technology,” which applies AI to both the transmitting and receiving ends of the wireless interface. This system dynamically optimizes modulation and demodulation schemes based on prevailing radio conditions, facilitating stable communication across diverse use cases. The efficacy of this technology had previously been confirmed in indoor environments.

The recent field trials aimed to verify the technology’s stable performance in complex outdoor settings, where radio conditions are subject to greater variability from factors such as temperature, weather, and physical obstructions.


This innovative AI wireless technology was evaluated across three distinct outdoor courses with varying propagation conditions, including the presence of obstacles and terminal mobility:

  • Course 1: A public road featuring gentle curves, with a test vehicle traveling up to 40 km/h.
  • Course 2: An environment with partial signal obstructions.
  • Course 3: A road with minimal obstructions, with a test vehicle traveling up to 60 km/h.

In all test scenarios, the technology demonstrated its ability to compensate for signal degradation, confirming enhanced communication speeds. Specifically, in the highly complex propagation conditions of Course 1, the AI-AI technology yielded an average throughput improvement of 18% and a maximum increase of 100% compared to traditional methods.

These findings enable higher-speed data transmission for users and allow network operators to enhance spectrum efficiency and deliver superior quality of service (QoS). The successful outdoor validation marks a significant milestone toward the practical implementation of 6G systems, which promise a combination of high wireless transmission efficiency and reduced power consumption.  NTT DOCOMO remains committed to refining this technology under a wide range of conditions and accelerating R&D efforts toward 6G realization, while simultaneously collaborating with global partners on 6G standardization (in 3GPP and ITU-R WP5D) and deployment.

This new technology will be featured at the NTT R&D FORUM 2025 hosted by NTT, scheduled from November 19–21 and November 25–26, 2025.

…………………………………………………………………………………………………………………………………………………………………………………….

These three AI-wireless field trials represent the latest joint effort stemming from the collaborative AI research partnership of DOCOMO, parent NTT, Nokia Bell Labs, and SK Telecom, which was established at Mobile World Congress (MWC) in February 2024.

NTT Docomo has forged additional 6G alliances with a range of partners, including Ericsson, domestic Japanese suppliers Fujitsu and NEC, and testing specialists Keysight Technologies and Rohde & Schwarz.

This collaboration highlights the extensive international cooperation in 6G development involving Japanese, Korean, and Western corporations. This contrasts sharply with 6G development initiatives in the People’s Republic of China, which remain predominantly insular and confined almost exclusively to domestic Chinese entities.

This year has seen an increase in partnerships among Korean and Japanese operators. Earlier this month, KDDI‘s research partnership with Nokia Bell Labs was announced, focusing on achieving 6G energy efficiency and enhanced network resilience. Samsung and SoftBank entered into a memorandum of understanding (MoU) last month to co-develop prospective next-generation technologies, encompassing 6G, AI-driven Radio Access Networks (AI RAN), and Large Telecom Models (LTMs).

In a separate MoU signed in March, KT‘s and Samsung’s collaboration was formalized to jointly advance 6G antenna technology. Additionally, KT has maintained a separate research engagement with Nokia centered on semantic communications research.

………………………………………………………………………………………………………………………………………………………………………………………….

About NTT DOCOMO:

NTT DOCOMO, Japan’s leading mobile operator with over 91 million subscribers, is one of the global leaders in 3G, 4G and 5G mobile network technologies.
Under the slogan “Bridging Worlds for Wonder & Happiness,” DOCOMO is actively collaborating with global partners to expand its business scope from mobile services to comprehensive solutions, aiming to deliver unsurpassed value and drive innovation in technology and communications, ultimately to support positive change and advancement in global society.

………………………………………………………………………………………………………………………………………………………………………………………….

References:

https://www.docomo.ne.jp/english/info/media_center/pr/2025/1117_00.html

https://www.docomo.ne.jp/english/

https://www.lightreading.com/6g/ntt-docomo-doubles-6g-throughput-in-ai-trials

NTT Docomo will use its wireless technology to enter the metaverse

Expose: AI is more than a bubble; it’s a data center debt bomb

We’ve previously described the tremendous debt that AI companies have assumed, expressing serious doubts that it will ever be repaid. This article expands on that by pointing out the huge losses incurred by the AI startup darlings and that AI poster child OpenAI won’t have the cash to cover its costs (which are greater than most analysts assume). Also, we quote from The Wall Street Journal, the Financial Times, and Barron’s, along with a dire forecast from the Center for Public Enterprise.

In Saturday’s print edition, The Wall Street Journal notes:

OpenAI and Anthropic are the two largest suppliers of generative AI with their chatbots ChatGPT and Claude, respectively, and founders Sam Altman and Dario Amodei have become tech celebrities.

What’s only starting to become clear is that those companies are also sinkholes for AI losses that are the flip side of chunks of the public-company profits.

OpenAI hopes to turn profitable only in 2030, while Anthropic is targeting 2028. Meanwhile, the amounts of money being lost are extraordinary.

It’s impossible to quantify how much cash flowed from OpenAI to big tech companies. But OpenAI’s loss in the quarter equates to 65% of the rise in underlying earnings of Microsoft, Nvidia, Alphabet, Amazon and Meta together. That ignores Anthropic, from which Amazon recorded a profit of $9.5B from its holding in the loss-making company in the quarter.

OpenAI committed to spend $250 billion more on Microsoft’s cloud and has signed a $300 billion deal with Oracle, $22 billion with CoreWeave and $38 billion with Amazon, which is a big investor in rival Anthropic.

OpenAI doesn’t have the income to cover its costs. It expects revenue of $13 billion this year to more than double to $30 billion next year, then to double again in 2027, according to figures provided to shareholders. Costs are expected to rise even faster, and losses are predicted to roughly triple to more than $40 billion by 2027. Things don’t come back into balance even in OpenAI’s own forecasts until total computing costs finally level off in 2029, allowing it to scrape into profit in 2030.
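The trajectory described above can be sketched numerically. The 2027 revenue figure is derived by doubling again and is not stated exactly in the WSJ piece, and the implied cost number is a crude estimate built from revenue plus the projected loss:

```python
# OpenAI's projected revenue and losses, per the WSJ figures quoted above.
revenue = {2025: 13e9, 2026: 30e9}
revenue[2027] = 2 * revenue[2026]   # "double again in 2027" -> ~$60B
loss_2027 = 40e9                    # losses "roughly triple to more than $40B"

# Implied total costs in 2027: revenue plus the projected loss (a rough
# lower bound, since the loss is stated as "more than" $40B).
implied_costs_2027 = revenue[2027] + loss_2027

for year in sorted(revenue):
    print(f"{year}: projected revenue ~${revenue[year]/1e9:.0f}B")
print(f"2027: projected loss > ${loss_2027/1e9:.0f}B, "
      f"implying costs near ${implied_costs_2027/1e9:.0f}B")
```

Even with revenue doubling twice, costs on this sketch approach $100B a year before the curve is supposed to bend in 2029.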

The losses at OpenAI that have helped boost the profits of Big Tech may, in fact, understate the true nature of the problem. According to the Financial Times:

OpenAI’s running costs may be a lot higher than previously thought, and its main backer Microsoft is doing very nicely out of their revenue share agreement.

OpenAI appears to have spent more than $12.4bn at Azure on inference compute alone in the last seven calendar quarters. Its implied revenue for the period was a minimum of $6.8bn. Even allowing for some fudging between annualised run rates and period-end totals, the apparent gap between revenues and running costs is a lot more than has been reported previously.

If the data is accurate, it would call into question the business model of OpenAI and nearly every other general-purpose LLM vendor.
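Taken at face value, the FT’s arithmetic works out as follows (a sketch using only the two figures quoted above):

```python
# Implied gap between OpenAI's Azure inference spend and its revenue over
# the last seven calendar quarters, per the FT figures cited above.
inference_spend = 12.4e9   # spent at Azure on inference compute alone
min_revenue = 6.8e9        # implied minimum revenue for the same period

gap = inference_spend - min_revenue
spend_per_revenue_dollar = inference_spend / min_revenue

print(f"Shortfall on inference alone: ${gap/1e9:.1f}B")
print(f"Inference spend per $1 of revenue: ${spend_per_revenue_dollar:.2f}")
```

In other words, roughly $1.82 spent on inference compute for every dollar of revenue, before training, staff, or any other costs are counted.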

Also, the financing needed to build out the data centers at the heart of the AI boom is increasingly becoming an exercise in creative accounting. The Wall Street Journal reports:

The Hyperion deal is a Frankenstein financing that combines elements of private-equity, project finance and investment-grade bonds. Meta needed such financial wizardry because it already issued a $30B bond in October that roughly doubled its debt load overnight.

Enter Morgan Stanley, with a plan to have someone else borrow the money for Hyperion. Blue Owl invested about $3 billion for an 80% private-equity stake in the data center, while Meta retained 20% for the $1.3 billion it had already spent. The joint venture, named Beignet Investor after the New Orleans pastry, got another $27 billion by issuing bonds that pay off in 2049, $18 billion of which Pimco purchased. That debt is on Beignet’s balance sheet, not Meta’s.

“We are good at taking credit risk,” Dan Fuss, vice chairman of Loomis Sayles, told Barron’s, cheerfully admitting to having the scars to show for it. That is, he added, if they know the credit. But that has become less clear with the recent spate of mind-bendingly complex megadeals, with myriad entities funding multibillion-dollar data centers. Fuss thinks current data-center deals are too speculative: the risk is too great, future revenue too uncertain, and yields aren’t enough to compensate, he concluded.

Increased wariness about monster hyper-scaler borrowings has sent the cost of insuring their debt against default soaring. Credit default swaps (CDS) more than doubled for Oracle since September, after it issued $18 billion in public bonds and took out a $38 billion private loan. CoreWeave’s CDS gapped higher this past week, mirroring the slide of the data-center company’s stock.

According to the Bank Credit Analyst (BCA), capex busts weigh on the economy, which further hits asset prices, the firm says. Following the dot-com bust, a housing bubble grew, which burst in the 2008-09 financial crisis. “It is far from certain that a new bubble will emerge (after the AI bubble bursts) this time around, in which case the resulting recession could be more severe than the one in 2001,” BCA notes.

………………………………………………………………………………………………………………………………………………

The furious push by AI hyperscalers to build out data centers will need about $1.5 trillion of investment-grade bonds over the next five years and extensive funding from every other corner of the market, according to an analysis by JPMorgan Chase & Co.  “The question is not ‘which market will finance the AI-boom?’ Rather, the question is ‘how will financings be structured to access every capital market?’” according to the strategists.
Leveraged finance is primed to provide around $150 billion over the next half decade, they said. Even with funding from the investment-grade and high-yield bond markets, as well as up to $40 billion per year in data-center securitizations, it will still be insufficient to meet demand, the strategists added. Private credit and governments could help cover a remaining $1.4 trillion funding gap, the report estimates. The bank calculates an at least $5 trillion tab that could climb as high as $7 trillion, single-handedly driving a reacceleration in growth in the bond and syndicated loan markets, the strategists wrote in a report Monday.
Data center demand — which the analysts said will be limited only by physical constraints like computing resources, real estate, and energy — has gone parabolic in recent months, defying some market-watchers’ fears of a bubble. A $30 billion bond sale by Meta Platforms Inc. last month set a record for the largest order book in the history of the high-grade bond market, and investors were ready to fork over another $18 billion to Oracle Corp. last week to fund a data center campus.
Warning signs that investor exuberance about data centers may be approaching irrational levels have been flashing brighter in recent weeks. More than half of data industry executives are worried about future industry distress in a recent poll, and others on Wall Street have expressed concern about the complex private debt instruments hyperscalers are using to keep AI funding off their balance sheets.
……………………………………………………………………………………………………………………………………………….

The widening gap between the expenditures needed to build out AI data centers and the cash flows generated by the products they enable creates a colossal risk which could crash asset values of AI companies. The Center for Public Enterprise reports that it’s “Bubble or Nothing.”

Should economic conditions in the tech sector sour, the burgeoning artificial intelligence (AI) boom may evaporate—and, with it, the economic activity associated with the boom in data center development.

Circular financing, or “roundabouting,” among so-called hyperscaler tenants—the leading tech companies and AI service providers—creates an interlocking liability structure across the sector. These tenants comprise an incredibly large share of the market and are financing each other’s expansion, creating concentration risks for lenders and shareholders.

Debt is playing an increasingly large role in the financing of data centers. While debt is a quotidian aspect of project finance, and while it seems like hyperscaler tech companies can self-finance their growth through equity and cash, the lack of transparency in some recent debt-financed transactions and the interlocked liability structure of the sector are cause for concern.

If there is a sudden stop in new lending to data centers, Ponzi finance units ‘with cash flow shortfalls will be forced to try to make position by selling out position’—in other words to force a fire sale—which is ‘likely to lead to a collapse of asset values.’

The fact that the data center boom is threatened by, at its core, a lack of consumer demand and the resulting unstable investment pathways, is itself an ironic miniature of the U.S. economy as a whole. Just as stable investment demand is the linchpin of sectoral planning, stable aggregate demand is the keystone in national economic planning. Without it, capital investment crumbles.

……………………………………………………………………………………………………………..

Postscript (November 23, 2025):

In addition to cloud/hyperscaler AI spending, AI start-ups (especially OpenAI) and newer IT infrastructure companies (like Oracle) play a prominent role. It’s often a “scratch my back and I’ll scratch yours” type of deal. Let’s look at the “circular financing” arrangement between Nvidia and OpenAI, where capital flows from Nvidia to OpenAI and then back to Nvidia, ensuring Nvidia a massive, long-term customer while providing OpenAI with the necessary capital and guaranteed access to critical, high-demand hardware. Here’s the scoop:

  • Nvidia has agreed to invest up to $100 billion in OpenAI over time. This investment will be in cash, likely for non-voting equity shares, and will be made in stages as specific data center deployment milestones are met.
  • OpenAI has committed to building and deploying at least 10 gigawatts of AI data center capacity using Nvidia’s silicon and equipment, which will involve purchasing millions of Nvidia’s expensive GPU chips.

Here’s the Circular Flow of this deal:

  • Nvidia provides a cash investment to OpenAI.
  • OpenAI uses that capital (and potentially raises additional debt using the commitment as collateral) to build new data centers.
  • OpenAI then uses the funds to purchase Nvidia GPUs and other data center infrastructure.
  • The revenue from these massive sales flows back to Nvidia, helping to justify its soaring stock price and funding further investments.
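The four steps above can be sketched as a toy ledger (the amounts are illustrative placeholders, not the actual staged-investment tranches):

```python
# Toy ledger for the circular flow: cash round-trips between the two firms,
# yet the GPU purchase is still booked as product revenue at Nvidia.
# Amounts are placeholders for illustration, not the real deal terms.
ledger = {"Nvidia": 0.0, "OpenAI": 0.0}
nvidia_revenue = 0.0

def transfer(src, dst, amount_bn, memo):
    """Move amount_bn (in $B) from src to dst and log the step."""
    ledger[src] -= amount_bn
    ledger[dst] += amount_bn
    print(f"{memo}: {src} -> {dst} ${amount_bn}B")

transfer("Nvidia", "OpenAI", 10, "Staged equity investment")
transfer("OpenAI", "Nvidia", 10, "GPU and infrastructure purchases")
nvidia_revenue += 10  # the purchase is recognized as Nvidia sales

print(ledger)          # net cash positions are back where they started...
print(nvidia_revenue)  # ...but Nvidia still reports $10B of revenue
```

This is the critics’ “mirage of growth” concern in miniature: the same dollars can support reported revenue without any net new cash entering the system.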

What’s wrong with such an arrangement you ask? Anyone remember the dot-com/fiber optic boom and bust? Critics have drawn parallels to the “vendor financing” practices of the dot-com era, arguing these interconnected deals could create a “mirage of growth” and potentially an AI bubble, as the actual organic demand for the products is difficult to assess when companies are essentially funding their own sales.

However, supporters note that, unlike the dot-com bubble, these deals involve the creation of tangible physical assets (data centers and chips) and reflect genuine, booming demand for AI compute capacity although it’s not at all certain how they’ll be paid for.

There’s a similar cozy relationship in the $1B Nvidia invested in Nokia, with the Finnish company now planning to ditch Marvell’s silicon and replace it with the more expensive, power-hungry Nvidia GPUs in its wireless network equipment. Nokia has only now become a strong supporter of Nvidia’s AI RAN (Radio Access Network) architecture, about which many telcos remain skeptical.

………………………………………………………………………………………………………………………………………………….

References:

https://www.wsj.com/tech/ai/big-techs-soaring-profits-have-an-ugly-underside-openais-losses-fe7e3184

https://www.ft.com/content/fce77ba4-6231-4920-9e99-693a6c38e7d5

https://www.wsj.com/tech/ai/three-ai-megadeals-are-breaking-new-ground-on-wall-street-896e0023

https://www.barrons.com/articles/ai-debt-megadeals-risk-uncertainty-boom-bust-7de307b9?mod=past_editions

Bubble or Nothing

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?

AI Data Center Boom Carries Huge Default and Demand Risks

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025; much more in 2026!

Gartner: AI spending >$2 trillion in 2026 driven by hyperscalers data center investments

Amazon’s Jeff Bezos at Italian Tech Week: “AI is a kind of industrial bubble”

FT: Scale of AI private company valuations dwarfs dot-com boom

 

 

IBM and Groq Partner to Accelerate Enterprise AI Inference Capabilities

IBM and Groq [1.] today announced a strategic market and technology partnership designed to give clients immediate access to Groq’s inference technology, GroqCloud, on watsonx Orchestrate, providing clients high-speed AI inference capabilities at a cost that helps accelerate agentic AI deployment. As part of the partnership, Groq and IBM plan to integrate and enhance Red Hat’s open source vLLM technology with Groq’s LPU architecture. IBM Granite models are also planned to be supported on GroqCloud for IBM clients.

………………………………………………………………………………………………………………………………………………….

Note 1. Groq is a privately held company founded by Jonathan Ross in 2016. As a startup, its ownership is distributed among its founders, employees, and a variety of venture capital and institutional investors, including BlackRock Private Equity Partners. Groq developed the LPU and GroqCloud to make compute faster and more affordable. The company says it is trusted by over two million developers and teams worldwide and is a core part of the American AI Stack.

NOTE that Grok, a conversational AI assistant developed by Elon Musk’s xAI, is a completely different entity.

………………………………………………………………………………………………………………………………………………….

Enterprises moving AI agents from pilot to production still face challenges with speed, cost, and reliability, especially in mission-critical sectors like healthcare, finance, government, retail, and manufacturing. This partnership combines Groq’s inference speed, cost efficiency, and access to the latest open-source models with IBM’s agentic AI orchestration to deliver the infrastructure needed to help enterprises scale.

Powered by its custom LPU, GroqCloud delivers over 5X faster and more cost-efficient inference than traditional GPU systems. The result is consistently low latency and dependable performance, even as workloads scale globally. This is especially powerful for agentic AI in regulated industries.

For example, IBM’s healthcare clients receive thousands of complex patient questions simultaneously. With Groq, IBM’s AI agents can analyze information in real-time and deliver accurate answers immediately to enhance customer experiences and allow organizations to make faster, smarter decisions.

This technology is also being applied in non-regulated industries. IBM clients across retail and consumer packaged goods are using Groq for HR agents to help enhance automation of HR processes and increase employee productivity.

“Many large enterprise organizations have a range of options with AI inferencing when they’re experimenting, but when they want to go into production, they must ensure complex workflows can be deployed successfully to ensure high-quality experiences,” said Rob Thomas, SVP, Software and Chief Commercial Officer at IBM. “Our partnership with Groq underscores IBM’s commitment to providing clients with the most advanced technologies to achieve AI deployment and drive business value.”

“With Groq’s speed and IBM’s enterprise expertise, we’re making agentic AI real for business. Together, we’re enabling organizations to unlock the full potential of AI-driven responses with the performance needed to scale,” said Jonathan Ross, CEO & Founder at Groq. “Beyond speed and resilience, this partnership is about transforming how enterprises work with AI, moving from experimentation to enterprise-wide adoption with confidence, and opening the door to new patterns where AI can act instantly and learn continuously.”

IBM will offer access to GroqCloud’s capabilities starting immediately and the joint teams will focus on delivering the following capabilities to IBM clients, including:

  • High speed and high-performance inference that unlocks the full potential of AI models and agentic AI, powering use cases such as customer care, employee support and productivity enhancement.
  • Security and privacy-focused AI deployment designed to support the most stringent regulatory and security requirements, enabling effective execution of complex workflows.
  • Seamless integration with IBM’s agentic product, watsonx Orchestrate, providing clients flexibility to adopt purpose-built agentic patterns tailored to diverse use cases.

The partnership also plans to integrate and enhance Red Hat’s open source vLLM technology with Groq’s LPU architecture to offer different approaches to common AI challenges developers face during inference. The solution is expected to enable watsonx to leverage capabilities in a familiar way and let customers stay in their preferred tools while accelerating inference with GroqCloud. This integration will address key AI developer needs, including inference orchestration, load balancing, and hardware acceleration, ultimately streamlining the inference process.

Together, IBM and Groq provide enhanced access to the full potential of enterprise AI, one that is fast, intelligent, and built for real-world impact.

References:

https://www.prnewswire.com/news-releases/ibm-and-groq-partner-to-accelerate-enterprise-ai-deployment-with-speed-and-scale-302588893.html

FT: Scale of AI private company valuations dwarfs dot-com boom

AI adoption to accelerate growth in the $215 billion Data Center market

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?