AWS to deploy AI inference chips from Cerebras in its data centers; Annapurna Labs/Amazon in-house AI silicon products

Amazon Web Services (AWS) announced it plans to integrate AI processors from Cerebras Systems [1.]  into its data centers, signaling growing confidence in the AI-focused semiconductor startup. Under a new multiyear partnership announced Friday, AWS will deploy Cerebras’s Wafer-Scale Engine (WSE) to accelerate inference workloads—the stage of AI operations where models generate responses to user queries. Financial details of the agreement were not disclosed.

Note 1.  Founded in 2015 and headquartered in Sunnyvale, CA, Cerebras claims to have the world’s fastest AI inference and training platform.

The collaboration reflects a significant realignment in compute infrastructure strategies across the AI ecosystem. While initial industry focus centered on model training, the rapid expansion of deployed AI services is driving demand for optimized inference performance. Traditional GPUs, though unmatched for training, can be suboptimal for inference scenarios that require ultra-low latency and high throughput. Cloud and AI platform providers are therefore diversifying their silicon portfolios to better match workload profiles and to scale capacity efficiently.

AWS, the world’s largest cloud infrastructure provider, has traditionally relied on its in-house semiconductor division, Annapurna Labs, for custom chip design. Annapurna’s Trainium processors compete with GPUs from major suppliers such as Nvidia and AMD, offering cost and performance advantages for AI training workloads. The new partnership introduces Cerebras technology into AWS infrastructure, where it will work alongside Trainium to enhance large-scale inference capabilities.

Cerebras, best known for its wafer-scale architecture, markets its WSE processors as a high-speed inference platform capable of executing the decode phase of generative AI processing—where text, images, or other outputs are generated—at up to 25 times the speed of conventional GPU solutions. The company, valued at approximately $23 billion following a $1 billion funding round in February, has attracted backing from Fidelity, Benchmark, Tiger Global, Atreides, and Coatue.

The Cerebras deal underscores a major shift in the market for computing power. Image Credit: Rebecca Lewington/Cerebras Systems/Reuters

The AWS collaboration follows Cerebras’s major compute partnership with OpenAI, which reportedly involves deploying up to 750 MW of computing capacity powered by its chips. AWS and Cerebras will position their joint offering as a premium cloud inference solution, targeting enterprise AI developers requiring high-performance and scalable compute.

“The scale of AI demand is shifting from model creation to global deployment,” said Andrew Feldman, CEO of Cerebras. “Working with AWS aligns our technology with the industry’s largest cloud, giving us reach to a broad enterprise and developer base. If you want slow inference, there will be cheaper ways to go,” Feldman said. “But if you want fast tokens, if speed matters to you, if you’re doing coding or agentic work, not only are we the absolute fastest, but we intend to set the bar. We’re in this to win it.”

AWS and Cerebras will support both aggregated and disaggregated configurations. Disaggregated serving, which separates the prefill and decode phases onto different hardware, is ideal for large, stable workloads; most customers run a mix of workloads with different prefill/decode ratios, for which the traditional aggregated approach remains the better fit. The startup expects most customers will want access to both, with the ability to route workloads to whichever configuration serves them best, as sketched below.
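For intuition, here is a rough, purely illustrative Python sketch of that routing idea. The request fields, thresholds, and pool names are hypothetical; this is not AWS's or Cerebras's actual scheduler.

from dataclasses import dataclass

@dataclass
class InferenceRequest:
    prompt_tokens: int           # prefill work: ingesting the prompt
    expected_output_tokens: int  # decode work: generating the response

def choose_pool(req: InferenceRequest, steady_traffic: bool) -> str:
    # Ratio of prefill to decode work for this request.
    ratio = req.prompt_tokens / max(req.expected_output_tokens, 1)
    # Large, stable workloads with a skewed prefill/decode split benefit
    # from dedicating separate hardware to each phase (disaggregated).
    if steady_traffic and (ratio > 4 or ratio < 0.25):
        return "disaggregated"
    # Mixed traffic with varying ratios stays on aggregated nodes that
    # run both phases on the same hardware.
    return "aggregated"

print(choose_pool(InferenceRequest(8000, 500), steady_traffic=True))   # -> disaggregated
print(choose_pool(InferenceRequest(1200, 900), steady_traffic=False))  # -> aggregated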

The move intensifies competition in the inference silicon segment, where Nvidia faces growing pressure from purpose-built processor architectures such as Cerebras’s WSE and other emerging alternatives. Nvidia, which recently announced a $20 billion licensing deal with Groq and plans to unveil a new inference-optimized platform, remains the dominant supplier but now contends with an accelerating wave of specialization across the AI compute stack.

AWS vice president and Annapurna Labs co-founder Nafea Bshara emphasized the company’s goal of offering flexible performance tiers. “Our job is to push the speed and lower the price,” he said, noting that AWS will continue to offer cost-optimized Trainium-only options alongside high-performance Cerebras-Trainium configurations.

………………………………………………………………………………………………………………………………………………………………………………………………….

Amazon’s Internally Designed AI Silicon:

Amazon has built a fairly broad internal AI-oriented silicon portfolio through Annapurna Labs, primarily for AWS:

  • Inferentia (Inferentia, Inferentia2) – Custom machine learning accelerators designed for high-throughput, low-cost inference at cloud scale. These power many AWS inference instances and are positioned as an alternative to Nvidia GPUs for production model serving.

  • Trainium (Trainium, Trainium2, Trainium3) – AI training accelerators optimized for large-scale model training (including frontier and foundation models), with Trainium2 and Trainium3 as newer generations offering materially higher performance and better $/compute than the first generation. These are central to projects such as the Rainier supercomputer for Anthropic.

  • Graviton (Graviton, Graviton2/3/4) – Arm-based general-purpose CPUs used heavily across EC2, increasingly in AI-adjacent roles (pre/post-processing, orchestration, model-serving microservices) and as part of cost-optimized AI stacks, even though they are not dedicated accelerators.

  • Nitro system – While not an AI accelerator per se, the Nitro family (offload cards and system) is an internally developed data-plane and virtualization offload architecture that underpins EC2 and works in tandem with Graviton, Inferentia, and Trainium to free CPU cycles and improve I/O for AI/ML workloads.

All of these are designed and iterated internally by Annapurna Labs for exclusive use in AWS data centers, then exposed to customers via AWS services rather than as standalone merchant silicon.

Amazon’s Annapurna Labs is an internal chip design group that has become a core strategic asset for AWS, especially for custom data center and AI silicon.

Origins and acquisition:

  • Annapurna Labs is an Israeli chip design startup founded in 2011 by semiconductor veterans of Intel and Broadcom, including Avigdor Willenz and Nafea Bshara.

  • “When we talked with market sources and consulted with experts in the fields of data and servers, at that time only Amazon had a holistic vision and the ability to execute on a large scale,” Bshara recalls of the start of the relationship with Amazon. “We were prepared to build the technology and at the same time were open to working with startups. From there we began a journey together with many meetings and shared thinking, among others with James Hamilton (formerly a database architect at Microsoft and now an AWS SVP), and from there within six months we found ourselves inside Amazon.”
  • Amazon began working with the company around 2013 and acquired it in 2015 for an estimated $350–$400 million.

  • Before the deal, Annapurna was in stealth, focusing on low‑power networking and server chips to improve data center efficiency.

Role inside Amazon and AWS:

  • Post‑acquisition, Annapurna was folded into AWS as a specialist microelectronics and custom silicon group, designing chips to reduce cost and power per unit of compute.

  • The group underpins several key AWS technologies: the Nitro system for offloading virtualization and I/O, Arm‑based Graviton CPUs for general compute, and Trainium and Inferentia accelerators for AI training and inference.

  • These chips let AWS optimize performance per watt and per dollar versus x86 servers and third‑party accelerators, improving margins and competitive pricing.

Key products and architectures:

  • Nitro: A combination of custom hardware and software that offloads storage, networking, and security functions from the host CPU, increasing tenant isolation and freeing CPU cycles for workloads.

  • Graviton: A family of Arm‑based server CPUs introduced in 2018; Graviton is now used by most AWS customers for general cloud infrastructure workloads due to better price‑performance and energy efficiency.

  • Inferentia and Trainium: Custom accelerators designed by Annapurna for machine learning inference (Inferentia) and training (Trainium), intended to reduce AWS’s dependence on high‑priced Nvidia GPUs for AI workloads.

Strategic importance and AI focus:

  • Annapurna’s work is central to Amazon’s strategy of vertical integration in the cloud: owning the silicon stack as much as the software and services.

  • The group designs chips that power Amazon’s AI infrastructure, including systems used both by internal teams and external customers such as Anthropic, for which AWS is the primary cloud and silicon provider.

  • Amazon and Anthropic are collaborating on “Project Rainier,” a massive supercomputer built around hundreds of thousands of Annapurna‑designed Trainium2 chips, targeting more than five times the compute used to train current frontier models.

Organization, footprint, and industry impact:

  • Annapurna Labs maintains a significant presence in Israel, employing hundreds of engineers focused on advanced AI and networking processors for AWS.

  • It also operates major engineering hubs such as an Austin, Texas lab where advanced semiconductors and AI systems are designed and tested.

  • Analysts often describe the acquisition as one of Amazon’s most successful, arguing that Annapurna’s custom silicon is a “secret sauce” that helps AWS compete with Microsoft, Google, and others on performance, cost, and energy efficiency.

…………………………………………………………………………………………………………………………………………………………..

References:

https://www.cerebras.ai/company

https://www.cerebras.ai/blog/cerebras-is-coming-to-aws

https://www.wsj.com/tech/amazon-announces-inference-chips-deal-with-cerebras-109ecd31

https://www.marketwatch.com/story/how-the-ceo-of-this-upstart-nvidia-rival-hopes-to-seize-on-the-lucrative-market-for-ai-chips-d5ccdab0

https://en.globes.co.il/en/article-nafea-bshara-the-israeli-behind-amazons-graviton-chip-1001420744

Intel and AI chip startup SambaNova partner; SN50 AI inferencing chip max speed said to be 5X faster than competitive AI chips

Custom AI Chips: Powering the next wave of Intelligent Computing

RAN silicon rethink – from purpose built products & ASICs to general purpose processors or GPUs for vRAN & AI RAN

Will “AI at the Edge” transform telecom or be yet another telco monetization failure?

Huawei to Double Output of Ascend AI chips in 2026; OpenAI orders HBM chips from SK Hynix & Samsung for Stargate UAE project

OpenAI and Broadcom in $10B deal to make custom AI chips

U.S. export controls on Nvidia H20 AI chips enables Huawei’s 910C GPU to be favored by AI tech giants in China

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

2026 Consumer Electronics Show Preview: smartphones, AI in devices/appliances and advanced semiconductor chips

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Google announces Gemini: it’s most powerful AI model, powered by TPU chips

 

AT&T and AWS to deliver last mile connectivity for AI workloads; AT&T Geo Modeler™ AI simulation tool

AT&T is strategically re-architecting its infrastructure for the AI era through high-capacity network modernization and deep integration with hyperscale cloud providers.

In addition to its almost six-year-old deal to run its 5G SA core network in Microsoft Azure’s cloud, AT&T announced at MWC 2026 that it is now working with Amazon Web Services (AWS) to extend 5G and fiber connectivity from business customer locations directly into AWS environments, creating secure, resilient and reliable premises‑to‑cloud architectures for AI workloads. The collaboration is designed to reduce network complexity and latency while supporting real‑time analytics, machine learning, and agentic AI use cases.

This collaboration continues a long-standing relationship between AT&T and AWS and follows recent news outlining broader efforts to modernize the nation’s connectivity infrastructure by providing high-capacity fiber to AWS data centers, migrating AT&T workloads to AWS cloud capabilities, and exploring emerging satellite technologies.

AWS Interconnect – last mile embeds AT&T‑delivered connectivity directly into AWS workflows. It is designed to enable customers to provision and manage last‑mile connectivity from within the AWS environment, and it lays the foundation for AI agents that monitor and manage the AI experience from the user to the cloud. This streamlined, self‑managed approach helps enterprises reduce network complexity while maintaining control of their extended enterprise network, allowing businesses to move faster as they scale AI.

High level illustration of the planned AWS Interconnect – last mile architecture, showing how resilient interconnections and AT&T Fiber and fixed wireless access are intended to simplify private connectivity from customer locations into AWS environments. 

Diagram Source: AT&T

………………………………………………………………………………………………………

“AI does not just need more compute; it needs flatter networks and faster connections,” said Shawn Hakl, SVP & Head of Product, AT&T Business. “By bringing high‑capacity connectivity closer to cloud platforms, integrating the management of the networks directly into the cloud provisioning process and engineering for resiliency at the metro level, AT&T is helping enterprises streamline their networks, improve performance, security, and scale AI with confidence.”

AT&T says it is building an AI‑ready network (?) designed to scale performance through continued network investment, including capacity growth of up to 1.6 Tbps across key metro and long‑haul routes.

AT&T also announced it would work with Nvidia, Microsoft and MicroAI through its Connected AI platform for “smart manufacturing.”

………………………………………………………………………………………………………………..

Finally, AT&T described AT&T Geo Modeler, which is designed to better predict connectivity for emerging technologies like autonomous vehicles, drones, and robotics.

The Geo Modeler is an AI-powered simulation tool that helps predict, in near real time, how a wireless network will perform in the real world. Inspired by the video games its creator, AT&T scientist Velin Kounev, played with his family growing up, the virtual model and simulation is “essentially like a giant video game of the United States” that, infused with AI tools, gives engineers a clearer picture of where potential weak spots may appear, so issues can be addressed earlier and fixes can roll out faster.

“The Geo Modeler helps us see how the real world will shape coverage before we build, so we can deliver connectivity that’s ready for what’s next,” Kounev said.

Matt Harden, VP of Connected Solutions at AT&T, agrees. “The Geo Modeler is a foundational capability for the connected mobility era,” he said. “By marrying advanced geospatial simulation with AI-driven network orchestration, we can deliver predictable, high-performance connectivity that adapts with the environment. Whether it’s a hurricane, a packed stadium, or a city corridor full of autonomous vehicles, we will be prepared.”

References:

https://about.att.com/story/2026/aws-collaboration-scalable-business-ai.html

https://about.att.com/blogs/2026/150-years-of-connection.html

https://about.att.com/blogs/2025/geo-modeler.html

AT&T and Ericsson boost Cloud RAN performance with AI-native software running on Intel Xeon 6 SoC

AT&T deploys nationwide 5G SA while Verizon lags and T-Mobile leads

AT&T to buy spectrum licenses from EchoStar for $23 billion

AT&T’s convergence strategy is working as per its 3Q 2025 earnings report

Progress report: Moving AT&T’s 5G core network to Microsoft Azure Hybrid Cloud platform

AT&T 5G SA Core Network to run on Microsoft Azure cloud platform

 

Intel and AI chip startup SambaNova partner; SN50 AI inferencing chip max speed said to be 5X faster than competitive AI chips

Intel and AI chip startup SambaNova have entered into a multi-year strategic collaboration to deploy high-performance, cost-efficient AI inference solutions [1.] tailored for AI-native firms, enterprises, and government sectors. This global initiative leverages Intel® Xeon® infrastructure, with Intel Capital further signaling commitment through participation in SambaNova’s $350M Series E financing round. The collaboration will give customers a powerful alternative to GPU‑centric solutions, offering optimized performance for leading open‑source models with predictable throughput and total cost of ownership. Founded in 2017, the Palo Alto, CA-based company specializes in AI chips and software. SambaNova’s Chairman is Lip-Bu Tan, who is also the CEO of Intel!

Note 1. AI inferencing is the process of using a trained AI model to make real-time predictions, decisions, or generated content from new, previously unseen data. It transforms inputs (a query, image, sensor reading) into useful results (a sentence, classification, alert). Unlike training, which builds the model, inference is about prompt execution, often requiring low latency (speed) and high efficiency, as the toy example below illustrates. AI inference chips have attracted intense investor interest following a wave of deal making around rivals to Nvidia, as AI companies seek faster and more efficient hardware. See References below for more information.
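To make the distinction concrete, here is a minimal Python sketch that uses scikit-learn as a stand-in for any trained model (a toy illustration, unrelated to the specific chips discussed in this article):

import time
from sklearn.linear_model import LogisticRegression

# Training: performed once, offline, on labeled historical data.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0, 0, 1, 1]
model = LogisticRegression().fit(X_train, y_train)

# Inference: executed per request on new, unseen input; latency to a
# useful answer is the figure of merit.
start = time.perf_counter()
prediction = model.predict([[1.7]])   # previously unseen data point
latency_ms = (time.perf_counter() - start) * 1e3
print(f"prediction={prediction[0]}, latency={latency_ms:.2f} ms")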

………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

For high-scale AI workloads, the integration of Intel CPUs with SambaNova’s specialized AI platform was said to offer a high-efficiency rack-level inference alternative. This partnership serves as a strategic bridge as Intel scales its independent GPU-based offerings. Intel remains fully committed to its internal GPU roadmap, continuing aggressive investment across architecture, software, and systems. This collaboration enhances Intel’s edge-to-cloud strategy without altering its competitive trajectory in the GPU market. By combining Xeon processors, Intel networking, and SambaNova systems, the two companies are positioned to capture a significant share of the multi-billion-dollar inference market through heterogeneous data center architectures.

As part of the collaboration, Intel plans to make a strategic investment in SambaNova to accelerate the rollout of an Intel‑powered AI cloud. The collaboration is expected to span three key areas:

  • AI Cloud Expansion – Scaling SambaNova’s vertically integrated AI cloud, built on Intel Xeon‑based infrastructure and optimized for large language and multimodal models. The platform will deliver low‑latency, high‑throughput AI services, supported by reference architectures, deployment blueprints, and partnerships with system integrators and software vendors.
  • Integrated AI Infrastructure – Combining SambaNova’s systems with Intel’s CPUs, accelerators, and networking technologies to power scalable, production‑ready inference for reasoning, code generation, multimodal applications, and agentic workflows.
  • Go‑to‑Market Execution – Joint co‑selling and co‑marketing through Intel’s global enterprise, cloud, and partner channels to accelerate adoption across the AI ecosystem.

Together, SambaNova and Intel aim to shape the next generation of heterogeneous AI data centers — integrating Intel Xeon processors, Intel GPUs, Intel networking and storage, and SambaNova systems — to unlock a multi‑billion‑dollar inference market opportunity.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………………

SambaNova also announced its SN50 AI chip, which boasts a max speed that’s 5X faster than competitive chips, according to the company.

Image Credit: SambaNova

Positioned as the most efficient chip for agentic AI, the SN50 offers enterprises a 3X lower total cost of ownership – a powerful foundation to scale fast inference and bring autonomous AI agents into full production. The SN50 will be shipping to customers later this year. To quickly scale and distribute SN50, SambaNova is collaborating with Intel, and has obtained $350 million in strategic Series E financing to expand manufacturing and cloud capacity.

“AI is no longer a contest to build the biggest model,” said Rodrigo Liang, co‑founder and CEO of SambaNova. “With the SN50 and our deep collaboration with Intel, the real race is about who can light up entire data centers with AI agents that answer instantly, never stall, and do it at a cost that turns AI from an experiment into the most profitable engine in the cloud.”

“Customers are asking for more choice and more efficient ways to scale AI,” said Kevork Kechichian, EVP, General Manager, Data Center Group, Intel. “By combining Intel’s leadership in compute, networking, and memory with SambaNova’s full-stack AI systems and inference cloud platform, we are delivering a compelling option for organizations looking for GPU alternatives to deploy advanced AI at scale.”

The SN50 delivers five times more compute per accelerator and four times more network bandwidth than the previous generation. It links up to 256 accelerators over a multi‑terabyte‑per‑second interconnect, cutting time‑to‑first‑token and supporting larger batch sizes. The result: Enterprises can deploy bigger, longer‑context AI models with higher throughput and responsiveness — while keeping performance high and costs and latency under control.

“AI is moving from a software story to an infrastructure story,” said Landon Downs, co-founder and managing partner at Cambium Capital. “SN50 is engineered for the real-world latency and economic requirements that will determine who successfully deploys agentic AI at scale.”

Built on SambaNova’s Reconfigurable Data Unit (RDU) architecture, SN50 delivers:

  • Instant AI Experiences – Ultra‑low latency delivers real‑time responsiveness for next‑gen enterprise apps like voice assistants.
  • Unmatched Scale and Concurrency – Power thousands of simultaneous AI sessions with consistent high performance.
  • Breakthrough Model Capacity – Three‑tier memory architecture unlocks 10T+ parameter models and 10M+ context lengths for deeper reasoning and richer outputs.
  • Maximum Efficiency at Scale – Higher hardware utilization lowers cost‑per‑token, driving greater performance and ROI.
  • Smarter Memory, Smarter Efficiency – Resident multi‑model memory and agentic caching optimize the three‑tier architecture, cutting infrastructure costs for enterprise‑scale AI deployments.

“The new SambaNova SN50 RDU changes the tokenomics of AI inference at scale. By delivering both high performance and high throughput with a chip that uses existing power and is air cooled, SambaNova is changing the game,” said Peter Rutten, Research Vice President, Performance Intensive Computing, at analyst firm IDC.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………………

SoftBank Corp. will be the first customer to deploy SN50 within its next‑generation AI data centers in Japan. The deployment will power low‑latency inference services for sovereign and enterprise customers across Asia‑Pacific, supporting both open‑source and proprietary frontier models with aggressive latency and throughput requirements.

“With SN50, we are building an AI inference fabric for Japan that can serve our customers and partners with the speed, resiliency and sovereignty they expect from SoftBank,” said Hironobu Tamba, Vice President and Head of the Data Platform Strategy Division of the Technology Unit at SoftBank Corp. “By standardizing on SN50, we gain the ability to deliver world‑class AI services on our own terms — with the performance of the best GPU clusters, but with far better economics and control.”

The SN50 deployment deepens SambaNova’s existing relationship with SoftBank Corp., which already hosts SambaCloud to provide ultra‑fast inference for developers in the region. By anchoring its newest clusters on SN50, SoftBank positions SambaNova as the inference backbone for its sovereign AI initiatives and future large‑scale agentic services.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………………

References:

https://newsroom.intel.com/data-center/intel-and-sambanova-planning-multi-year-collaboration-for-xeon-based-ai-inference

https://sambanova.ai/press/sambanova-unveils-fastest-chip-for-agentic-ai-collaborates-with-intel-and-raises-350m

Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?

CES 2025: Intel announces edge compute processors with AI inferencing capabilities

Groq and Nvidia in non-exclusive AI Inference technology licensing agreement; top Groq execs joining Nvidia

Analysis: Edge AI and Qualcomm’s AI Program for Innovators 2026 – APAC for startups to lead in AI innovation

Custom AI Chips: Powering the next wave of Intelligent Computing

RAN silicon rethink – from purpose built products & ASICs to general purpose processors or GPUs for vRAN & AI RAN

OpenAI and Broadcom in $10B deal to make custom AI chips

Huawei to Double Output of Ascend AI chips in 2026; OpenAI orders HBM chips from SK Hynix & Samsung for Stargate UAE project

U.S. export controls on Nvidia H20 AI chips enables Huawei’s 910C GPU to be favored by AI tech giants in China

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

China vs U.S.: Race to Generate Power for AI Data Centers as Electricity Demand Soars

The International Energy Agency (IEA) forecasts that over the next five years, global demand for power (electricity) will grow roughly 50% faster than it did during the previous decade – and more than twice as fast as energy demand overall. That tremendous increase in demand is driven largely by power-hungry AI data centers, along with electric cars and buses, electric-powered industrial machines, and electric home heating.

Global AI growth will be contingent on generating more power for data centers (a quick arithmetic check of the figures below follows the list):

  • Global data center power demand is now expected to rise to a record 1,596 terawatt-hours by 2035 – a 255% increase from 2025 levels.
  • The U.S. is set to remain the leader in energy consumption, with demand surging 144% over this period to 430 terawatt-hours.
  • China’s demand is projected to rise 255%, to 397 terawatt-hours.
  • European demand is expected to surge 303%, to 274 terawatt-hours.
  • New data centers coming online between now and 2030 will need more than 600 terawatt-hours of electricity – enough to power ~60 million homes.
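As a quick sanity check, the 2025 baselines implied by these projections can be backed out in a few lines of Python. These are derived figures, assuming each percentage is measured against 2025 demand; they are not stated in the cited reports.

# Back out the 2025 demand baselines implied by the projected 2035
# figures and growth percentages quoted above (all values in TWh).
projections = {
    "Global data centers": (1596, 255),  # (2035 demand, % growth vs 2025)
    "United States":       (430, 144),
    "China":               (397, 255),
    "Europe":              (274, 303),
}
for region, (twh_2035, pct) in projections.items():
    baseline_2025 = twh_2035 / (1 + pct / 100)
    print(f"{region}: implied 2025 demand ~ {baseline_2025:.0f} TWh")
# Global ~450, U.S. ~176, China ~112, Europe ~68 TWh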

 

Power for AI Data Centers: China vs U.S.:

China is currently ahead of the United States in generating and building out power infrastructure to support AI data centers, a phenomenon sometimes described by industry observers as an “electron gap.”

China’s rapid, centralized expansion of electricity generation—including both massive renewable projects and traditional, dispatchable power—has created a significant capacity advantage in the race to support AI workloads, which are increasingly limited by energy availability rather than just chip access.

Key factors in China’s power advantage for AI include:

Massive Generation Growth: Between 2010 and 2024, China’s power production increased by more than the rest of the world combined. In 2024 alone, China added 543 gigawatts of power capacity—more than the total capacity added by the U.S. in its entire history.

Significant Surplus Capacity: By 2030, China is projected to have roughly 400 gigawatts of spare power capacity, which is triple the expected power demand of the global data center fleet at that time.

“Eastern Data, Western Computing” Initiative: China is actively shifting energy-intensive data centers to its resource-rich western regions (like Inner Mongolia) while powering them with surplus renewable energy, such as wind and solar.

Lower Costs and Faster Buildouts: Data centers in China can pay less than half the rates for electricity that American data centers do. Furthermore, projects in China can move from planning to operation in months, compared to years in the U.S. due to faster permitting and fewer regulatory hurdles.

Conclusions:

While the U.S. currently leads in advanced AI chips and model development, it is facing a severe “energy bottleneck” for new data centers, with some requiring over a gigawatt of power. U.S. power demand has remained relatively flat for 20 years, resulting in a lag in building new capacity, whereas China has traditionally built power infrastructure in anticipation of high demand. Morgan Stanley has forecast that U.S. data centers could face a 44-gigawatt electricity shortfall in the next three years.

Despite China’s advantage in energy, U.S. export controls on high-end AI chips (such as Nvidia’s GPUs) have acted as a significant constraint on China’s actual AI compute power. This has led to a situation where the U.S. has the best “brains” (chips) but limited power to run them, while China has the “muscle” (energy) but limited access to top-tier AI brains.

However, the rapid improvements in Chinese AI models (such as DeepSeek), which are more energy-efficient and optimized for lower-tier hardware, may help mitigate this constraint.

References:

https://www.bloomberg.com/news/newsletters/2026-02-14/ai-battle-turbocharged-by-50-power-demand-surge-new-economy

https://www.iea.org/reports/electricity-2026

https://x.com/KobeissiLetter/status/2023437717888250284

How will the United States and China power the AI race?

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Analysis: Ethernet gains on InfiniBand in data center connectivity market; White Box/ODM vendors top choice for AI hyperscalers

Fiber Optic Boost: Corning and Meta in multiyear $6 billion deal to accelerate U.S. data center buildout

How will fiber and equipment vendors meet the increased demand for fiber optics in 2026 due to AI data center buildouts?

Analysis: Cisco, HPE/Juniper, and Nvidia network equipment for AI data centers

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Nvidia CEO Huang: AI is the largest infrastructure buildout in human history; AI Data Center CAPEX will generate new revenue streams for operators

 

Nvidia CEO Huang: AI is the largest infrastructure buildout in human history; AI Data Center CAPEX will generate new revenue streams for operators

Executive Summary:

In a February 6, 2026 CNBC interview with Scott Wapner, Nvidia CEO Jensen Huang [1.] characterized the current AI build‑out as “the largest infrastructure buildout in human history,” driven by exceptionally high demand for compute from hyperscalers and AI companies. “Through the roof” is how he described AI infrastructure spending. It’s a “once-in-a-generation infrastructure buildout,” he said, specifically highlighting that demand for Nvidia’s Blackwell chips and the upcoming Vera Rubin platform is “sky-high.” He emphasized that the shift from experimental AI to AI as a fundamental utility has reached a definitive inflection point for every major industry.

Huang forecasts that a roughly 7-to-8‑year AI investment cycle lies ahead, with capital intensity justified because deployed AI infrastructure is already generating rising cash flows for operators. He maintains that the widely cited ~$660 billion AI data center capex pipeline is sustainable, on the grounds that GPUs and surrounding systems are revenue‑generating assets, not speculative overbuild. In his view, as long as customers can monetize AI workloads profitably, they will “keep multiplying their investments,” which underpins continued multi‑year GPU demand, including for prior‑generation parts that remain fully leased.

Note 1.  Being the undisputed leader of AI hardware (GPU chips and networking equipment via its Mellanox acquisition), Nvidia MUST ALWAYS MAKE POSITIVE REMARKS AND FORECASTS related to the AI build out boom.  Reader discretion is advised regarding Huang’s extremely bullish, “all-in on AI” remarks.

Huang reiterated that AI will “fundamentally change how we compute everything,” shifting data centers from general‑purpose CPU‑centric architectures to accelerated computing built around GPUs and dense networking. He emphasizes Nvidia’s positioning as a full‑stack infrastructure and computing platform provider—chips, systems, networking, and software—rather than a standalone chip vendor. He accurately stated that Nvidia designs “all components of AI infrastructure” so that system‑level optimization (GPU, NIC, interconnect, software stack) can deliver performance gains that outpace what is possible with a single chip under a slowing Moore’s Law. The installed base is presented as productive: even six‑year‑old A100‑class GPUs are described as fully utilized through leasing, underscoring persistent elasticity of AI compute demand across generations.

AI Poster Children – OpenAI and Anthropic:

Huang praised OpenAI and Anthropic, the two leading artificial intelligence labs, which both use Nvidia chips through cloud providers. Nvidia invested $10 billion in Anthropic last year, and Huang said earlier this week that the chipmaker will invest heavily in OpenAI’s next fundraising round.

“Anthropic is making great money. OpenAI is making great money,” Huang said. “If they could have twice as much compute, the revenues would go up four times as much.”

He said that all the graphics processing units that Nvidia has sold in the past — even six-year-old chips such as the A100 — are currently being rented, reflecting sustained demand for AI computing power.

“To the extent that people continue to pay for the AI and the AI companies are able to generate a profit from that, they’re going to keep on doubling, doubling, doubling, doubling,” Huang said.

Economics, utilization, and returns:

On economics, Huang’s central claim is that AI capex converts into recurring, growing revenue streams for cloud providers and AI platforms, which differentiates this cycle from prior overbuilds. He highlights very high utilization: GPUs from multiple generations remain in service, with cloud operators effectively turning them into yield‑bearing infrastructure.

This utilization and monetization profile underlies his view that the capex “arms race” is rational: when AI services are profitable, incremental racks of GPUs, network fabric, and storage can be modeled as NPV‑positive infrastructure projects rather than speculative capacity. He implies that concerns about a near‑term capex cliff are misplaced so long as end‑market AI adoption continues to inflect.

Competitive and geopolitical context:

Huang acknowledges intensifying global competition in AI chips and infrastructure, including from Chinese vendors such as Huawei, especially under U.S. export controls that have reduced Nvidia’s China revenue share to roughly half of pre‑control levels. He frames Nvidia’s strategy as maintaining an innovation lead so that developers worldwide depend on its leading‑edge AI platforms, which he sees as key to U.S. leadership in the AI race.

He also ties AI infrastructure to national‑scale priorities in energy and industrial policy, suggesting that AI data centers are becoming a foundational layer of economic productivity, analogous to past buildouts in electricity and the internet.

Implications for hyperscalers and chips:

Hyperscalers (and also Nvidia customers) Meta, Amazon, Google/Alphabet and Microsoft recently stated that they plan to dramatically increase spending on AI infrastructure in the years ahead. In total, these hyperscalers could spend $660 billion on capital expenditures in 2026 [2.], with much of that spending going toward buying Nvidia’s chips. Huang’s message to them is that AI data centers are evolving into “AI factories” where each gigawatt of capacity represents tens of billions of dollars of investment spanning land, compute, and networking. He suggests that the hyperscaler industry—roughly a $2.5 trillion sector with about $500 billion in annual capex transitioning from CPU to GPU‑centric generative AI—still has substantial room to run.

Note 2. An understated point is that while these hyperscalers are spending hundreds of billions of dollars on AI data centers and Nvidia chips/equipment, they are simultaneously laying off tens of thousands of employees. For example, Amazon recently announced 16,000 job cuts this year after 14,000 layoffs last October.

From a chip‑level perspective, he argues that Nvidia’s competitive moat stems from tightly integrated hardware, networking, and software ecosystems rather than any single component, positioning the company as the systems architect of AI infrastructure rather than just a merchant GPU vendor.

References:

https://www.cnbc.com/2026/02/06/nvidia-rises-7percent-as-ceo-says-660-billion-capex-buildout-is-sustainable.html

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Analysis: Cisco, HPE/Juniper, and Nvidia network equipment for AI data centers

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

184K global tech layoffs in 2025 to date; ~27.3% related to AI replacing workers

 

Analysis: SpaceX FCC filing to launch up to 1M LEO satellites for solar powered AI data centers in space

SpaceX has applied to the Federal Communications Commission (FCC) for permission to launch up to 1 million LEO satellites for a new solar-powered AI data center system in space. The private company, 40% owned by Elon Musk, envisions an orbital data center system with “unprecedented computing capacity” needed to run large-scale AI inference and applications for billions of users, according to SpaceX’s filing submitted late on Friday.

Data centers are the physical backbone of artificial intelligence, requiring massive amounts of power. “By directly harnessing near-constant solar power with little operating or maintenance costs, these satellites will achieve transformative cost and energy efficiency while significantly reducing the environmental impact associated with terrestrial data centers,” the FCC filing said. Musk would need the telecom regulator’s approval to move forward.

Credit: Blueee/Alamy Stock Photo

The proposed new satellites would operate in “narrow orbital shells” of up to 50 kilometers each, at altitudes between 500 kilometers and 2,000 kilometers, in 30‑degree and “sun-synchronous” orbit inclinations to capture power from the sun. The system is designed to be interconnected via optical links with existing Starlink broadband satellites, which would transmit data traffic back to ground stations on Earth.

SpaceX’s request bets heavily on reduced costs of Starship, the company’s next-generation reusable rocket under development. Starship has test-launched 11 times since 2023. Musk expects the rocket, which is crucial for expanding Starlink with more powerful satellites, to put its first payloads into orbit this year.

“Fortunately, the development of fully reusable launch vehicles like Starship that can deploy millions of tons of mass per year to orbit when launching at rate, means on-orbit processing capacity can reach unprecedented scale and speed compared to terrestrial buildouts, with significantly reduced environmental impact,” SpaceX said.

SpaceX is positioning orbital AI compute as the definitive solution to the terrestrial capacity crunch, arguing that space-based infrastructure represents the most efficient path for scaling next-generation workloads. As ground-based data centers face increasing grid density constraints and power delivery limitations, SpaceX intends to leverage high-availability solar irradiation to bypass Earth’s energy bottlenecks. The company’s technical rationale hinges on several key architectural advantages:
  • Energy Density & Sustainability: By tapping into “near-constant solar power,” SpaceX aims to utilize a fraction of the Sun’s output—noting that even a millionth of its energy exceeds current civilizational demand by four orders of magnitude.
  • Thermal Management: To address the cooling requirements of high-density AI clusters, these satellites will utilize radiative heat dissipation, eliminating the water-intensive cooling loops required by terrestrial facilities.
  • Opex & Scalability: The financial viability of this orbital layer is tethered to the Starship launch platform. SpaceX anticipates that the radical reduction in $/kg launch costs provided by a fully reusable heavy-lift vehicle will enable rapid scaling and ensure that, within years, the lowest LCOA (Levelized Cost of AI) will be achieved in orbit.
The transition to orbital AI compute introduces a fundamental shift in network topology, moving processing from terrestrial hubs to a decentralized, space-based edge layer. The latency implications are characterized by three primary architectural factors:
  • Vacuum-Speed Data Transmission: In a vacuum, light propagates roughly 50% faster than through terrestrial fiber optic cables (a back-of-the-envelope comparison follows this list). By utilizing Starlink’s optical inter-satellite links (OISLs)—a “petabit” laser mesh—data can bypass terrestrial bottlenecks and subsea cables. This potentially reduces intercontinental latency for AI inference to under 50ms, surpassing many long-haul terrestrial routes.
  • Edge-Native Processing & Data Gravity: Current workflows require downlinking massive raw datasets (e.g., Synthetic Aperture Radar imagery) for terrestrial processing, a process that can take hours. Shifting to orbital edge computing allows for “in-situ” AI inference, processing data onboard to deliver actionable insights in minutes rather than hours. This “Space Cloud” architecture eliminates the need to route raw data back to the Earth’s internet backbone, reducing data transmission volumes by up to 90%.
  • LEO Proximity vs. Terrestrial Hops: While terrestrial fiber remains the “gold standard” for short-range latency (typically 1–10ms), it is often hindered by inefficient routing and multiple hops. SpaceX’s LEO constellation, operating at altitudes between 340km and 614km, currently delivers median peak-hour latencies of ~26ms in the US. Future orbital configurations may feature clusters at varying 50km intervals to optimize for specific workload and latency tiers.
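A back-of-the-envelope check on the propagation claim in the first bullet, assuming a straight 10,000 km path and a typical silica-fiber refractive index of about 1.47 (real routes add path stretch and switching delay):

# One-way propagation delay over an assumed 10,000 km intercontinental
# path: vacuum (laser inter-satellite links) vs. silica fiber.
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in vacuum
FIBER_INDEX = 1.47                # typical refractive index of silica fiber

distance_km = 10_000
t_vacuum_ms = distance_km / C_VACUUM_KM_PER_S * 1e3
t_fiber_ms = t_vacuum_ms * FIBER_INDEX
print(f"vacuum: {t_vacuum_ms:.1f} ms, fiber: {t_fiber_ms:.1f} ms")
# ~33.4 ms vs ~49.0 ms: light in vacuum covers the same path about
# 47% faster, consistent with the 'roughly 50%' figure above.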

………………………………………………………………………………………………………………………………………………………………………………………

The SpaceX FCC filing on Friday follows an exclusive report by Reuters that Elon Musk is considering merging SpaceX with his xAI (Grok chatbot) company ahead of an IPO later this year. Under the proposed merger, shares of xAI would be exchanged for shares in SpaceX. Two entities have been set up in Nevada to facilitate the transaction, Reuters said.  Musk also runs electric automaker Tesla, tunnel company The Boring Co. and neurotechnology company Neuralink.

………………………………………………………………………………………………………………………………………………………………………………………

References:

https://www.reuters.com/business/aerospace-defense/spacex-seeks-fcc-nod-solar-powered-satellite-data-centers-ai-2026-01-31/

https://www.lightreading.com/satellite/spacex-seeks-fcc-approval-for-mega-ai-data-center-constellation

https://www.reuters.com/world/musks-spacex-merger-talks-with-xai-ahead-planned-ipo-source-says-2026-01-29/

Google’s Project Suncatcher: a moonshot project to power ML/AI compute from space

Blue Origin announces TeraWave – satellite internet rival for Starlink and Amazon Leo

China ITU filing to put ~200K satellites in low earth orbit while FCC authorizes 7.5K additional Starlink LEO satellites

Amazon Leo (formerly Project Kuiper) unveils satellite broadband for enterprises; Competitive analysis with Starlink

Telecoms.com’s survey: 5G NTNs to highlight service reliability and network redundancy

 

Huge significance of EchoStar’s AWS-4 spectrum sale to SpaceX

U.S. BEAD overhaul to benefit Starlink/SpaceX at the expense of fiber broadband providers

Telstra selects SpaceX’s Starlink to bring Satellite-to-Mobile text messaging to its customers in Australia

SpaceX launches first set of Starlink satellites with direct-to-cell capabilities

AST SpaceMobile to deliver U.S. nationwide LEO satellite services in 2026

GEO satellite internet from HughesNet and Viasat can’t compete with LEO Starlink in speed or latency

How will fiber and equipment vendors meet the increased demand for fiber optics in 2026 due to AI data center buildouts?

Subsea cable systems: the new high-capacity, high-resilience backbone of the AI-driven global network

Fiber Optic Boost: Corning and Meta in multiyear $6 billion deal to accelerate U.S. data center buildout

Corning Incorporated and Meta Platforms, Inc. (previously known as Facebook) have entered a multiyear agreement valued at up to $6 billion. This strategic collaboration aims to accelerate the deployment of cutting-edge data center infrastructure within the U.S. to bolster Meta’s advanced applications, technologies, and ambitious artificial intelligence initiatives.   The agreement specifies that Corning will furnish Meta with its latest advancements in optical fiber, cable, and comprehensive connectivity solutions. As part of this commitment, Corning plans to significantly scale its manufacturing capabilities across its North Carolina facilities.

A key element of this expansion is a substantial capacity increase at its fiber optic cable manufacturing plant in Hickory, NC, for which Meta will serve as the foundational anchor customer. The construction and operation of these data centers — critical infrastructure that supports Meta’s technologies and moves the company toward what it calls “personalized superintelligence” — necessitate robust server and hardware systems designed to facilitate information transfer and connectivity with minimal latency. Fiber optic cabling is a cornerstone component for enabling this high-speed, near real-time connectivity, powering applications from sophisticated wearable technology like the Ray-Ban Meta AI glasses to the global connectivity services utilized by billions of individuals and enterprises.

“This long-term partnership with Meta reflects Corning’s commitment to develop, innovate, and manufacture the critical technologies that power next-generation data centers here in the U.S.,” said Wendell P. Weeks, Chairman and Chief Executive Officer, Corning Incorporated. “The investment will expand our manufacturing footprint in North Carolina, support an increase in Corning’s employment levels in the state by 15 to 20 percent, and help sustain a highly skilled workforce of more than 5,000 — including the scientists, engineers, and production teams at two of the world’s largest optical fiber and cable manufacturing facilities. Together with Meta, we’re strengthening domestic supply chains and helping ensure that advanced data centers are built using U.S. innovation and advanced manufacturing.”

Meta is expanding its commitment to build industry-leading data centers in the U.S. and to source advanced technology made domestically.  Here are two quotes from them:

  1. “Building the most advanced data centers in the U.S. requires world-class partners and American manufacturing,” said Joel Kaplan, Chief Global Affairs Officer at Meta. “We’re proud to partner with Corning – a company with deep expertise in optical connectivity and commitment to domestic manufacturing – for the high-performance fiber optic cables our AI infrastructure needs. This collaboration will help create good-paying, skilled U.S. jobs, strengthen local economies, and help secure the U.S. lead in the global AI race.”
  2. “As digital tools and generative AI continue to transform our economy — in fields like healthcare, finance, agriculture, and more — the demand for fiber connectivity will continue to grow. By supporting American companies like Corning and building and operating data centers in America, we’re helping ensure that our nation maintains its competitive edge in the digital economy and the global race for AI leadership.”

Key elements of the agreement:

  • Multiyear, up to $6 billion commitment.
  • Corning to supply latest generation optical fiber, cable and connectivity products designed to meet the density and scale demands of advanced AI data centers.
  • New optical cable manufacturing facility in Hickory, North Carolina, in addition to expanded production capacity across Corning’s North Carolina operations.
  • Agreement supports Corning’s projected employment growth in North Carolina by 15 to 20 percent, sustaining a skilled workforce of more than 5,000 employees in the state, including thousands of jobs tied to two of the world’s largest optical fiber and cable manufacturing facilities.

…………………………………………………………………………………………………………………………………………………………….

Comment and Analysis:

Corning’s “up to $6 billion” Meta agreement is essentially a long‑term, anchor‑tenant bet that AI‑era data centers will be fundamentally more fiber‑intensive than legacy cloud data centers, with Corning positioning itself as the default U.S. optical plant for Meta’s buildout through ~2030. In practice, this deal is a long‑term, take‑or‑pay‑style capacity lock that de‑risks Corning’s capex while giving Meta priority access to scarce, high‑performance, data‑center‑grade fiber and cabling.

AI data centers are becoming the new FTTH in the sense that hyperscale AI buildouts are now the primary structural driver of incremental fiber demand, design innovation, and capex prioritization—but with far higher fiber intensity per site and far tighter performance constraints than residential access ever imposed.

Why “AI Data Centers are the new FTTH” for fiber optic vendors:

For fiber‑optic vendors, AI data centers now play the role that FTTH did in the 2005–2015 cycle: the anchor use case that justifies new glass, cable, and connectivity capacity.

  • AI‑optimized data centers need 2–4× more fiber cabling than traditional hyperscalers, and in some designs more than 10×, driven by massively parallel GPU fabrics and east–west traffic.

  • U.S. hyperscale capacity is expected to triple by 2029, forcing roughly a 2× increase in fiber route miles and a 2.3× increase in total fiber miles, a demand shock comparable to or larger than the early FTTH boom but concentrated in fewer, much larger customers.

  • This is already reshaping product roadmaps toward ultra‑high‑fiber‑count (UHFC) cable, bend‑insensitive fiber, and very‑small‑form‑factor connectors to handle hundreds to thousands of fibers per rack and per duct.

In other words, where FTTH once dictated volume and economies of scale, AI data centers now dictate density, performance, and margin mix.

Carrier‑infrastructure: from access to fabric:

From a carrier perspective, the “new FTTH” analogy is about what drives long‑haul and metro planning: instead of last‑mile penetration, it’s AI fabric connectivity and east–west inter‑DC routes.

  • Each new hyperscale/AI data center is modeled to require on the order of 135 new fiber route miles just to reach three core network interconnection points, plus additional miles for new long‑haul routes and capacity upgrades.

  • An FBA‑commissioned study projects U.S. data centers alone will need on the order of 214 million additional fiber miles by 2029, more than doubling the installed base from ~160M to ~373M fiber miles (a quick consistency check follows this list); that is the new “build everywhere” narrative operators once used for FTTH.

  • Carriers now plan backbone routes, ILAs, and regional rings around dense clusters of AI campuses, treating them as primary traffic gravity wells rather than as just a handful of peering sites at the edge of a consumer broadband network.
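A quick consistency check on these figures in Python; the installed-base and additional-mile numbers are quoted from the study above, while the multiplier is derived:

# Check the projected growth in U.S. fiber miles against the ~2.3x
# multiplier for total fiber miles cited earlier in this article.
installed_base_millions = 160   # ~160M fiber miles installed today
additional_millions = 214       # ~214M more needed by 2029 (FBA study)

projected = installed_base_millions + additional_millions  # ~374M (~373M above)
multiplier = projected / installed_base_millions
print(f"projected base ~ {projected}M fiber miles, growth ~ {multiplier:.2f}x")
# ~2.34x, in line with the "2.3x increase in total fiber miles" figure.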

The strategic shift: FTTH made the access network fiber‑rich; AI makes the entire cloud and transport fabric fiber‑hungry.

Strategic implications:

  • AI is now the dominant incremental fiber use case: residential fiber adds subscribers; AI adds orders of magnitude more fibers per site and per route.

  • Network economics are moving from passing more homes to feeding more GPUs: route miles, fiber counts, and connector density are being dimensioned to training clusters and inference fabrics, not household penetration curves.

  • Policy and investment narratives should treat AI inter‑DC and campus fiber as “national infrastructure” on par with last‑mile FTTH, given the scale of projected doubling in route miles and more than doubling in fiber miles by 2029.

In summary, the next decade of fiber innovation and capex will be written less in curb‑side PON and more in ultra‑dense, AI‑centric data centers with internal optical‑fiber fabrics and interconnects.

……………………………………………………………………………………………………………………………………………………………………………………………….

References:

https://www.corning.com/worldwide/en/about-us/news-events/news-releases/2026/01/corning-and-meta-announce-multiyear-up-to-6-billion-agreement-to-accelerate-us-data-center-buildout.html

Meta Announces Up to $6 Billion Agreement With Corning to Support US Manufacturing

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Analysis: Cisco, HPE/Juniper, and Nvidia network equipment for AI data centers

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

Lumen Technologies to connect Prometheus Hyperscale’s energy efficient AI data centers

Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers

Hyper Scale Mega Data Centers: Time is NOW for Fiber Optics to the Compute Server

Hyperscaler capex > $600 bn in 2026, a 36% increase over 2025, while global spending on cloud infrastructure services skyrockets

Hyperscaler capex for the “big five” (Amazon, Alphabet/Google, Microsoft, Meta/Facebook, Oracle) is now widely forecast to exceed $600 bn in 2026, a 36% increase over 2025. Roughly 75%, or $450 bn, of that spend is directly tied to AI infrastructure (i.e., servers, GPUs, data centers, equipment), rather than traditional cloud. Hyperscalers are increasingly leaning on debt markets to bridge the gap between rapidly rising AI capex budgets and internal free cash flow, transforming historically cash-funded business models into ones utilizing leverage, albeit with still very strong balance sheets. Aggregate capex for the big five, once buybacks and dividends are included, now exceeds projected cash flows, thereby necessitating external funding.
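A rough check of what those headline numbers imply, in Python (the inputs are the forecasts quoted above; the 2025 base is derived):

# Back out the implied 2025 capex base and the AI-directed share from
# the 2026 forecasts quoted above (US$ billions).
capex_2026 = 600
growth_pct = 36
ai_share = 0.75

capex_2025 = capex_2026 / (1 + growth_pct / 100)
ai_capex_2026 = capex_2026 * ai_share
print(f"implied 2025 capex ~ ${capex_2025:.0f}bn; "
      f"AI-directed 2026 capex ~ ${ai_capex_2026:.0f}bn")
# ~$441bn in 2025; $450bn of AI-directed spend in 2026, matching the text.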

……………………………………………………………………………………………………………………………………………………………………………………………………………………..

According to market research from Omdia (owned by Informa), global spending on cloud infrastructure services reached $102.6 billion in Q3 2025 — a 25% year-on-year increase. It was the fifth consecutive quarter in which cloud spending growth remained above 20%. Omdia says this “reflects a significant shift in the technology landscape as enterprise demand for AI moves beyond early experimentation toward scaled production deployment.” The top three providers – AWS, Microsoft Azure, and Google Cloud – maintained their market rankings from the previous quarter, and collectively accounted for 66% of global cloud infrastructure spending. Together, the three firms recorded 29% year-on-year growth.

Hyperscaler AI strategies are shifting from a focus on incremental model performance to platform-driven, production-ready approaches. Enterprises are now evaluating AI platforms based not solely on model capabilities, but also on their support for multi-model strategies and agent-based applications. This evolution is accelerating hyperscalers’ move toward platform-level AI capabilities. According to the report, Amazon Web Services (AWS), Microsoft Azure, and Google Cloud are integrating proprietary foundation models with a growing range of third-party and open-weight models to meet these new demands.

“Collaboration across the ecosystem remains critical,” said Rachel Brindley, Senior Director at Omdia. “Multi-model support is increasingly viewed as a production requirement rather than a feature, as enterprises seek resilience, cost control, and deployment flexibility across generative AI workloads.”

As enterprise-level deployment proves more intricate than anticipated, major cloud providers are boosting resources for AI agent lifecycle management, from agent creation through operationalization.

Yi Zhang, Senior Analyst at Omdia, said, “Many enterprises still lack standardized building blocks that can support business continuity, customer experience, and compliance at the same time, which is slowing the real-world deployment of AI agents. This is where hyperscalers are increasingly stepping in, using platform-led approaches to make it easier for enterprises to build and run agents in production environments.”

This past October, Omdia released a report forecasting that cloud adoption growth among communications service providers (CSPs) will double this year. It also projected a compound annual growth rate (CAGR) of 7.3% through 2030, by which point the telco cloud market would be worth $24.8 billion.
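For context, the CAGR arithmetic implies a 2025 telco cloud market of roughly $17.4 billion; this back-calculation is ours, not an Omdia figure:

V(2030) = V(2025) × (1 + 0.073)^5  ⇒  V(2025) ≈ $24.8 bn / 1.073^5 ≈ $17.4 bn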

………………………………………………………………………………………………………………………………………………………………………………………………………………………..

Editor’s Note:  Does anyone remember the stupendous increase in fiber optic spending from 1998 to 2001, before that bubble burst?  Caveat Emptor!

………………………………………………………………………………………………………………………………………………………………………………………………………………………..

References:

https://www.mufgamericas.com/sites/default/files/document/2025-12/AI_Chart_Weekly_12_19_Financing_the_AI_Supercycle.pdf

https://www.telecoms.com/public-cloud/global-cloud-infrastructure-spend-up-25-in-q3

https://www.telecoms.com/public-cloud/telco-investment-in-cloud-infrastructure-is-accelerating-omdia

AI infrastructure spending boom: a path towards AGI or speculative bubble?

Expose: AI is more than a bubble; it’s a data center debt bomb

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?

AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025; much more in 2026!

Gartner: AI spending >$2 trillion in 2026 driven by hyperscalers data center investments

AI spending is surging; companies accelerate AI adoption, but job cuts loom large

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Canalys & Gartner: AI investments drive growth in cloud infrastructure spending

Sovereign AI infrastructure for telecom companies: implementation and challenges

AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms

Custom AI Chips: Powering the next wave of Intelligent Computing

 

Sovereign AI infrastructure for telecom companies: implementation and challenges

Sovereign AI infrastructure refers to the domestic capability of a nation or an organization to own and control the entire technology stack for artificial intelligence (AI) systems within its own borders, subject to local laws and governance. This stack spans the physical data centers, specialized hardware (like GPUs), software, data, and skilled workforce, and is designed to ensure national control and reduce reliance on foreign providers. A few key features:

  • Data Localization and Security: Policies and technical controls (e.g., data localization, encryption) to ensure that sensitive data used for training and inference remains within the jurisdiction (see the sketch after this list).
  • Local Models and Frameworks: Development and hosting of proprietary or locally tailored AI models and software frameworks that align with national values, languages, and ethical standards.
  • Workforce Development: Investing in domestic talent, including data scientists, engineers, and legal experts, to build and maintain the local AI ecosystem.
  • Regulatory Framework: A comprehensive legal and ethical framework for AI development and deployment that ensures compliance with national laws and standards.
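As an illustration of the first bullet, here is a minimal, hypothetical sketch of a data-localization guard in Python. The region names, policy, and stand-in cipher are assumptions invented for the example, not any specific national scheme:

```python
# Minimal sketch of a data-localization guard (illustrative only; the region
# names, policy, and toy cipher below are hypothetical).
from dataclasses import dataclass

ALLOWED_REGIONS = {"eu-de-1", "eu-de-2"}  # hypothetical in-country regions

@dataclass
class Record:
    payload: bytes
    region: str  # where the data will be physically stored

def store(record: Record, encrypt) -> bytes:
    """Refuse to persist data outside the jurisdiction; encrypt before storage."""
    if record.region not in ALLOWED_REGIONS:
        raise PermissionError(f"Residency violation: {record.region}")
    return encrypt(record.payload)  # e.g., envelope encryption via a local KMS

# Usage: a stand-in XOR "cipher" keeps the example dependency-free; a real
# deployment would use a vetted cipher (e.g., AES-GCM) keyed from a local KMS.
ciphertext = store(Record(b"citizen data", "eu-de-1"),
                   lambda p: bytes(b ^ 0x5A for b in p))
print(len(ciphertext), "bytes stored in-region")
```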

Why It’s Important – The pursuit of sovereign AI infrastructure is driven by several strategic considerations for both governments and private enterprises:

  • National Security: To ensure that critical systems in defense, intelligence, and public infrastructure are not dependent on potentially adversarial foreign technologies or subject to extraterritorial access laws (like the U.S. CLOUD Act).
  • Economic Competitiveness: To foster a domestic tech industry, create high-skilled jobs, protect intellectual property, and capture the significant economic benefits of AI-driven growth.
  • Data Privacy and Compliance: To comply with stringent local data protection regulations (e.g., GDPR in the EU) and build public trust by ensuring citizen data is handled securely and according to local laws.
  • Cultural Preservation: To train AI models on local datasets and languages, preserving cultural nuances and avoiding bias found in generalized, globally trained models.

Image Credit: Nvidia

………………………………………………………………………………………………………………………………………………………………………………………………………..

Governments around the world are starting to build sovereign AI infrastructure, and a new report from Morningstar DBRS opines that major telecommunications companies are uniquely positioned to benefit from that shift. Here are a few takeaways from the report:

  • Sovereign AI funding opens a new growth path for telcos – Governments investing in domestic AI infrastructure are increasingly turning to operators, whose network and regulatory strengths position them to capture a large share of this emerging market.
  • Telcos’ capabilities align with sovereignty needs – Their expertise in large-scale networks, local presence, and established government relationships give them an edge over hyperscalers for sensitive, sovereignty-focused AI projects.
  • Early adopters gain advantage – Operators in Canada and Europe are already moving into sovereign AI, positioning themselves to secure higher-margin enterprise and government workloads as national AI buildouts accelerate.

Infrastructure advantages provide a strategic head start for telecommunications companies. Telcos already manage extensive data centers, fiber optic networks, and computing infrastructure nationwide. Leveraging these established physical assets can significantly reduce the barriers to implementing sovereign AI solutions, contrasting favorably with the greenfield development required by other entrants.

The sophisticated data governance expertise within telcos is well suited to the stringent requirements of sovereign AI. Their decades of experience managing and processing massive datasets have produced mature data handling practices directly applicable to the data infrastructure demands of secure, sovereign AI systems.

Furthermore, existing edge computing capabilities offer a distinct competitive advantage. Telecom networks facilitate localized AI processing near data sources while adhering to data residency requirements, a crucial combination for sovereign AI deployments. This translates to “embedding AI within their network fabric for both optimization and distributed inference,” enabling AI consumption with lower latency, reduced cost, and suitability for high-sensitivity use cases in sectors like government and national security.

The opportunity to integrate AI workloads with emerging 5G and 6G infrastructures creates additional strategic value. Sovereign AI represents a pivotal opportunity for telecom operators to position themselves as central players in national AI strategies, evolving their role beyond primary connectivity provisioning.
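To make the residency-plus-latency point concrete, here is a minimal, hypothetical routing sketch in Python; the endpoint names, countries, and latencies are invented for illustration, and a production router would also weigh load, model availability, and regulatory tags:

```python
# Illustrative sketch of residency-aware inference routing at the telco edge.
EDGE_ENDPOINTS = [
    {"name": "edge-munich", "country": "DE", "latency_ms": 8},
    {"name": "edge-paris",  "country": "FR", "latency_ms": 9},
    {"name": "cloud-us",    "country": "US", "latency_ms": 95},
]

def route(request_country: str, sovereign: bool) -> dict:
    """Pick the lowest-latency endpoint, restricted to in-country sites
    when the workload is tagged as sovereign."""
    candidates = [e for e in EDGE_ENDPOINTS
                  if not sovereign or e["country"] == request_country]
    if not candidates:
        raise RuntimeError("No in-country capacity for a sovereign workload")
    return min(candidates, key=lambda e: e["latency_ms"])

print(route("DE", sovereign=True)["name"])   # edge-munich (in-country only)
print(route("FR", sovereign=False)["name"])  # edge-munich (lowest latency overall)
```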
……………………………………………………………………………………………………………………………………………………………………………….

Implementing sovereign AI presents substantial challenges despite its strategic potential. Key bottlenecks and technical complexities include:

  • Infrastructure Demands: Building robust domestic AI ecosystems requires specialized expertise spanning hardware, software, data governance, and policy.
  • Resource Constraints: Dr. Matt Hasan, CEO at aiRESULTS and a former AT&T executive, highlights specific bottlenecks:
    • Compute Density at Scale.
    • Spectrum Allocation amidst political pressures.
    • Energy Demand exceeding existing grid capacity.
  • Intensified Reliability Requirements: Sovereign AI implementation places heightened demands on telecom providers for system uptime, reliability, quality, and data privacy. This necessitates a focus on efficient power consumption, resilient routing and backups, robust encryption, and comprehensive cybersecurity measures.
  • Supply Chain Vulnerabilities: Geopolitical tensions introduce risks to the supply of critical components such as GPUs and specialized chips, underscoring the interconnected nature of global hardware supply chains.
  • Rapid Technological Change: The fast evolution of AI technology mandates continuous investment and technical agility to ensure sovereign deployments remain current.

Competitive landscape dynamics:

  • The interplay between global hyperscalers and regional telecom operators is expected to shift.
  • Hasan predicts a collaborative model, with regional telcos leveraging their position as sovereign partners through joint ventures, rather than an outright displacement of hyperscalers.

Ultimately, the objective of sovereign AI is strategic resilience, not complete digital isolation. Nations must judiciously balance sovereignty goals with the advantages of global technological collaboration. For telecom operators, adeptly managing these complexities and investment demands will determine whether sovereign AI becomes a viable growth opportunity.
…………………………………………………………………………………………………………………………………………………………………………….

References:

Telcos Across Five Continents Are Building NVIDIA-Powered Sovereign AI Infrastructure

https://dbrs.morningstar.com/research/468155/telecoms-are-well-placed-to-benefit-from-sovereign-ai-infrastructure-plans

How “sovereign AI” could shape telecom

https://www.rcrwireless.com/20251202/ai/sovereign-ai-telcos

Subsea cable systems: the new high-capacity, high-resilience backbone of the AI-driven global network

Analysis: OpenAI and Deutsche Telekom launch multi-year AI collaboration

AI infrastructure spending boom: a path towards AGI or speculative bubble?

Market research firms Omdia and Dell’Oro: impact of 6G and AI investments on telcos

Omdia: How telcos will evolve in the AI era

OpenAI announces new open weight, open source GPT models which Orange will deploy

Expose: AI is more than a bubble; it’s a data center debt bomb

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?

Custom AI Chips: Powering the next wave of Intelligent Computing

AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025; much more in 2026!

IBM and Groq Partner to Accelerate Enterprise AI Inference Capabilities

Dell’Oro: Analysis of the Nokia-NVIDIA-partnership on AI RAN

 

AI infrastructure spending boom: a path towards AGI or speculative bubble?

by Rahul Sharma, Indxx, with Alan J Weissberger, IEEE Techblog

Introduction:

The ongoing wave of artificial intelligence (AI) infrastructure investment by U.S. mega-cap tech firms marks one of the largest corporate spending cycles in history. Aggregate annual AI investments, mostly for cloud-resident mega data centers, are expected to exceed $400 billion in 2025 and potentially surpass $500 billion in 2026. The scale of this buildout rivals that of past industrial revolutions, from the railroads to the internet era.[1]

At its core, this spending surge represents a strategic arms race for computational dominance. Meta, Alphabet, Amazon and Microsoft are racing to secure leadership in artificial intelligence capabilities, a contest in which data, energy, and compute capacity are the new determinants of market power.

AI Spending & Debt Financing:

Leading technology firms are racing to secure dominance in compute capacity — the new cornerstone of digital power:

  • Meta plans to spend $72 billion on AI infrastructure in 2025.
  • Alphabet (Google) has expanded its capex guidance to $91–93 billion.[3]
  • Microsoft and Amazon are doubling data center capacity, while AWS will drive most of Amazon’s $125 billion 2026 investment.[4]
  • Even Apple, typically conservative in R&D, has accelerated AI infrastructure spending.


Analysts estimate that AI could add up to 0.5% to U.S. GDP annually over the next several years. Reflecting this optimism, Morgan Stanley forecasts $2.9 trillion in AI-related investments between 2025 and 2028. The scale of commitment from Big Tech is reshaping expectations across financial markets, enterprise strategies, and public policy, marking one of the most intense capital spending cycles in corporate history.[2]

Meanwhile, OpenAI’s trillion-dollar partnerships with Nvidia, Oracle, and Broadcom have redefined the scale of ambition, turning compute infrastructure into a strategic asset comparable to energy independence or semiconductor sovereignty.[5]

Growth Engine or Speculative Bubble?

As Big Tech pours hundreds of billions of dollars into AI infrastructure, analysts and investors remain divided: some view it as a rational, long-term investment cycle, while others warn of a speculative bubble. Uncertainty persists, especially around Meta’s long-term monetization of AGI-related efforts.[8]

Some analysts view this huge AI spending as a necessary step towards achieving Artificial General Intelligence (AGI) – an unrealized type of AI that possesses human-level cognitive abilities, allowing it to understand, learn, and adapt to any intellectual task a human can. Unlike narrow AI, which is designed for specific functions like playing chess or image recognition, AGI could apply its knowledge to a wide range of different situations and problems without needing to be explicitly programmed for each one.

Other analysts believe this is a speculative bubble, fueled by debt that can never be repaid. Tech sector valuations have soared to dot-com era levels – and, based on price-to-sales ratios, are well beyond them. Some of AI’s biggest proponents acknowledge that valuations are overinflated, including OpenAI chairman Bret Taylor: “AI will transform the economy… and create huge amounts of economic value in the future,” Taylor told The Verge. “I think we’re also in a bubble, and a lot of people will lose a lot of money,” he added.

Here are a few AI bubble data points:

  • AI-related capex is projected to consume up to 94% of operating cash flows by 2026, according to Bank of America.[6]
  • Over $75 billion in AI-linked corporate bonds have been issued in just two months — a signal of mounting leverage. Still, strong revenue growth from AI services (particularly cloud and enterprise AI) keeps optimism alive.[7]
  • Meta, Google, Microsoft, Amazon and xAI (Elon Musk’s company) are all using off-balance-sheet debt vehicles, including special-purpose vehicles (SPVs) to fund part of their AI investments. A slowdown in AI demand could render the debt tied to these SPVs worthless, potentially triggering another financial crisis.
  • Alphabet (Google’s parent company) CEO Sundar Pichai sees “elements of irrationality” in the current scale of AI investing, which he says exceeds even the excessive investment of the late-1990s dot-com/fiber optic build-out boom. If the AI bubble bursts, Pichai said, no company will be immune, including Alphabet, despite its breakthrough Gemini technology fueling gains in the company’s stock price.

…………………………………………………………………………………………………………………..

From Infrastructure to Intelligence:

Executives justify the massive spend by citing acute compute shortages and exponential demand growth:

  • Microsoft’s CFO Amy Hood admitted, “We’ve been short on capacity for many quarters” and confirmed that the company will increase its spending on GPUs and CPUs in 2026 to meet surging demand.
  • Amazon’s Andy Jassy noted that “every new tranche of capacity is immediately monetized”, underscoring strong and sustained demand for AI and cloud services.
  • Google reported billions in quarterly AI revenue, offering early evidence of commercial payoff.

Macro Ripple Effects – Industrializing Intelligence:

AI data centers have become the factories of the digital age, fueling demand for:

  • Semiconductors, especially GPUs (Nvidia, AMD, Broadcom)
  • Cloud and networking infrastructure (Oracle, Cisco)
  • Energy and advanced cooling systems for AI data centers (Vertiv, Schneider Electric, Johnson Controls, and specialists such as LiquidStack and Green Revolution Cooling).

Leading Providers of Energy and Cooling Systems for AI Data Centers:

  • Vertiv (critical infrastructure: power & cooling): Offers full-stack solutions with air and liquid cooling, power distribution units (PDUs), and monitoring systems, including the AI-ready Vertiv 360AI portfolio.
  • Schneider Electric (energy management & automation): Provides integrated power and thermal management via its EcoStruxure platform, specializing in modular and liquid cooling solutions for HPC and AI applications.
  • Johnson Controls (HVAC & building solutions): Offers integrated, energy-efficient solutions from design to maintenance, including Silent-Aire cooling and YORK chillers, with a focus on large-scale operations.
  • Eaton (power management): Specializes in electrical distribution systems, uninterruptible power supplies (UPS), and switchgear, which are crucial for reliable energy delivery to high-density AI racks.

A second group of specialists focuses on innovative liquid cooling technologies, which are essential for managing the extreme heat generated by high-density AI servers and GPUs (a rough thermal calculation follows this list):

  • LiquidStack: A leader in two-phase and modular immersion cooling and direct-to-chip systems, trusted by large cloud and hardware providers.
  • Green Revolution Cooling (GRC): Pioneers in single-phase immersion cooling solutions that simplify thermal management and improve energy efficiency.
  • Iceotope: Focuses on chassis-level precision liquid cooling, delivering dielectric fluid directly to components for maximum efficiency and reduced operational costs.
  • Asetek: Specializes in direct-to-chip (D2C) liquid cooling solutions and rack-level coolant distribution units (CDUs) for high-performance computing.
  • CoolIT Systems: Known for its custom direct liquid cooling technologies, working closely with server OEMs (original equipment manufacturers) to integrate cold plates and CDUs for AI and HPC workloads.
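To see why liquid cooling dominates at these power densities, a rough energy-balance calculation helps; the rack power and coolant temperature rise below are illustrative assumptions, not vendor specifications:

```python
# Rough sizing of coolant flow for a high-density AI rack (illustrative numbers).
rack_power_w = 100_000  # assumed heat load of a dense GPU rack, in watts
delta_t_k = 10.0        # assumed coolant temperature rise, in kelvin
cp_water = 4186.0       # specific heat of water, J/(kg*K)

# Energy balance: Q = m_dot * c_p * delta_T  =>  m_dot = Q / (c_p * delta_T)
m_dot = rack_power_w / (cp_water * delta_t_k)
print(f"Required water flow: {m_dot:.2f} kg/s (~{m_dot * 60:.0f} L/min)")
# ~2.39 kg/s (about 143 L/min) per rack; air at the same temperature rise would
# need roughly 3,500x the volumetric flow, which is why liquid wins at this density.
```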

This new AI ecosystem is reshaping global supply chains — but also straining local energy and water resources. For example, Meta’s massive data center in Georgia has already triggered environmental concerns over energy and water usage.

Global Spending Outlook:

  • According to UBS, global AI capex will reach $423 billion in 2025 and $571 billion in 2026, rising to $1.3 trillion by 2030, a 25% CAGR over the period 2025–2030 (a quick sanity check follows this list).
  • Compute demand is outpacing expectations: Google’s Gemini saw a 130-fold rise in AI token usage over the past 18 months, and Meta’s infrastructure needs are expanding sharply.[9]
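The 25% CAGR figure is easy to sanity-check against the quoted endpoints; this uses only the numbers in the bullets above:

```python
# Sanity check of the UBS forecast: does $423 bn growing at a 25% CAGR
# from 2025 reach ~$1.3 trillion by 2030?
capex_2025 = 423e9
cagr = 0.25
capex_2030 = capex_2025 * (1 + cagr) ** 5
print(f"Implied 2030 capex: ${capex_2030 / 1e12:.2f} T")  # ~$1.29 T
```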

Conclusions:

The AI infrastructure boom reflects a bold, forward-looking strategy by Big Tech, built on the belief that compute capacity will define the next decade’s leaders. If Artificial General Intelligence (AGI) or large-scale AI monetization unfolds as expected, today’s investments will be seen as visionary and transformative. If monetization falls short, much of this capital may be written down, as happened after the late-1990s fiber optic build-out. Either way, the AI era is well underway, and the race for computational excellence is reshaping the future of global markets and innovation.

…………………………………………………………………………………………………………………………………………………………………………………………………………………………….

Footnotes:

[1] https://www.investing.com/news/stock-market-news/ai-capex-to-exceed-half-a-trillion-in-2026-ubs-4343520

[2] https://www.venturepulsemag.com/2025/08/01/big-techs-400-billion-ai-bet-the-race-thats-reshaping-global-technology/

[3] https://www.businessinsider.com/big-tech-capex-spending-ai-earnings-2025-10

[4] https://www.investing.com/analysis/meta-plunged-12-amazon-jumped-11–same-ai-race-different-economics-200669410

[5] https://www.cnbc.com/2025/10/15/a-guide-to-1-trillion-worth-of-ai-deals-between-openai-nvidia.html

[6] https://finance.yahoo.com/news/bank-america-just-issued-stark-152422714.html

[7] https://news.futunn.com/en/post/64706046/from-cash-rich-to-collective-debt-how-does-wall-street

[8] https://www.businessinsider.com/big-tech-capex-spending-ai-earnings-2025-10

[9] https://finance.yahoo.com/news/ai-capex-exceed-half-trillion-093015889.html

……………………………………………………………………………………………………………………………………………………………………………………………………………………………

About the Author:

Rahul Sharma is President & Co-Chief Executive Officer at Indxx, a provider of end-to-end indexing services, data, and technology products. He has been instrumental in leading the firm’s growth since 2011. Rahul manages Indxx’s Sales, Client Engagement, Marketing and Branding teams while also helping to set the firm’s overall strategic objectives and vision.

Rahul holds a BS from Boston College and an MBA with Beta Gamma Sigma honors from Georgetown University’s McDonough School of Business.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………

References:

Curmudgeon/Sperandeo: New AI Era Thinking and Circular Financing Deals

Expose: AI is more than a bubble; it’s a data center debt bomb

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?

AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025; much more in 2026!

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

FT: Scale of AI private company valuations dwarfs dot-com boom

Amazon’s Jeff Bezos at Italian Tech Week: “AI is a kind of industrial bubble”

AI Data Center Boom Carries Huge Default and Demand Risks

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Dell’Oro: Analysis of the Nokia-NVIDIA-partnership on AI RAN

RAN silicon rethink – from purpose built products & ASICs to general purpose processors or GPUs for vRAN & AI RAN

Nokia in major pivot from traditional telecom to AI, cloud infrastructure, data center networking and 6G

Reuters: US Department of Energy forms $1 billion AI supercomputer partnership with AMD

………………………………………………………………………………………………………………………………………………………………………….

 
