Inside Nokia’s new AI Networking Innovation Lab

As AI workloads reshape how data center networks must operate, operators face challenges in performance, scale, and precision as they try to keep pace with the demands placed on network infrastructure. To address those needs, Nokia today announced the launch of its AI Networking Innovation Lab, a new facility designed to bolster co-innovation with AI and cloud partners and to accelerate development of next-generation AI infrastructure.
……………………………………………………………………………………………………………………………………….
Located within Nokia’s Sunnyvale, California facility, the lab serves as an innovation hub where Nokia will work across advanced AI networking technologies, architectures and ecosystems with a variety of partners to help shape the future of data center networking. The lab will serve as a testing center for Nokia Validated Designs and a co-innovation hub with its global partners, assessing real-world scenarios, commercial technologies, and the latest networking solutions.
Nokia has teamed up with several prominent infrastructure and platform providers. Early lab partners include AMD, Everpure, Keysight, Lenovo, Nscale, Supermicro and Weka.
  • Silicon & Compute: Collaborating with AMD to optimize enterprise AI workloads alongside Nokia data center switches.
  • Testing & Infrastructure: Partnering with Keysight Technologies to emulate workloads across Ultra Ethernet Consortium (UEC) and RoCEv2 transports.
  • Hardware & Servers: Integrating high-performance platforms from Lenovo and Supermicro.
  • Data Storage & Cloud: Working with Weka and cloud builders like Nscale to eliminate storage bottlenecks during heavy computational training.

Nokia’s AI Networking Innovation Lab is built upon three fundamental pillars: Technology Innovation, Ecosystem Collaboration, and Validation.  Image credit: Nokia

………………………………………………………………………………………………………………….

Technology Innovation: The lab provides a dedicated space for AI partners to experiment with next-gen solutions across the entire networking stack – driving emerging standards forward with pioneering approaches to new protocols, switching silicon, congestion control, real-time telemetry, and automation.

Ram Periakaruppan, Vice President and General Manager, Network Applications and Security business at Keysight:
“Partnering with Nokia in the AI Networking Innovation Lab has enabled us to benchmark and optimize AI networks under real-world conditions…Together, we are helping accelerate AI network adoption by giving operators and hyperscalers the validated insights needed for confident, large-scale deployment.”

Ecosystem Collaboration: True progress depends on a strong ecosystem of technology providers – silicon manufacturers, GPU developers, system, storage and test vendors, and cloud platforms – that work together to create highly compatible AI-ready solutions. The lab facilitates joint interoperability testing, improves integration, and keeps roadmaps aligned across different hardware, software, and orchestration layers.

Travis Karr, Corporate Vice President, HPC and Sovereign AI at AMD believes customer collaboration and an open ecosystem are fundamental to accelerating AI innovation:

“By co-developing solutions with partners, such as Nokia in their AI networking innovation lab, we ensure our AMD enterprise AI solutions are tested with Nokia data center switches on real-world workloads and network demands. An open, standards-driven approach empowers customers to integrate seamlessly across heterogeneous environments, avoiding lock-in and fostering industry-wide advancement in AI.”

Validation: The lab serves as the testing ground for Nokia Validated Designs (NVDs), where customers and partners rigorously validate multi-vendor data center architectures under authentic AI training and inference workloads. By testing failure scenarios, congestion behavior, and operational automation, the lab turns NVDs into proven, deployable solutions, enabling predictable performance, faster deployment, and reduced operational complexity and risk for organizations navigating the AI era.

Arno van Huyssteen, Vice President of Global Telecommunications for Nscale:

“Nokia is a strategic networking partner for Nscale as we build towards AI Grid, and the engineering rigour behind their Validated Designs reflects the kind of innovation needed to enable next-generation AI infrastructure. The depth of hardware, software and failure testing behind those blueprints is what will give operators the confidence to deploy complex AI environments faster, with fewer integration risks and less operational disruption. We’re excited to collaborate in the AI Networking Innovation Lab to help push the boundaries of AI-native networking and validate the next generation of solutions before they reach production.”

A primary focal point inside the lab is managing data center congestion. Unlike traditional cloud traffic, back-end AI networks feature high-density data synchronization across massive GPU clusters. The lab uses advanced automation, AIOps, and lossless Ethernet solutions—such as the Nokia 7220 IXR-H6 switches—to handle these intense uplink and synchronization demands safely.

The AI Networking Innovation Lab supports Nokia’s broader strategy to accelerate the next era of AI-driven connectivity. As demand for AI infrastructure continues to grow, data center networking has become one of the most critical foundations of the global AI ecosystem. Through this investment, Nokia is strengthening its capabilities in AI and cloud infrastructure while advancing its vision of AI-native networking.

Rudy Hoebeke, Vice President of Software Product Management at Nokia:

“The launch of Nokia’s AI Networking Innovation Lab marks a major milestone in our commitment to drive the next era of AI-native connectivity. As the industry continues to evolve with solutions like scale-across and AI-Grid, this lab is poised to accelerate AI networking technology that will not only support but optimize these emerging industry offerings. This center gives our customers and partners early access to new technologies, deeper collaboration with the world’s leading AI ecosystem players, and the confidence that their networks are validated under more realistic AI conditions. By accelerating innovation and reducing deployment risks, we’re enabling the industry to deliver faster, more reliable, and more sustainable AI experiences to people and businesses everywhere.”

………………………………………………………………………………………………………………………

References:

https://www.nokia.com/newsroom/nokia-launches-ai-networking-lab-to-drive-co-innovation-with-partners-and-accelerate-next-era-of-ai-native-data-center-networking/


Why Batch Pipelines Break AI Agents: The Case For Streaming-First Network Operations

By Shazia Hasnie, Ph.D.; editorial review by IEEE Techblog team member Sridhar Talari Rajagopal

Abstract:

The adoption of AI agents in network operations has exposed a critical architectural gap. Most enterprise data pipelines were designed for dashboards and reporting, not autonomous decision-making. When AI agents consume data from batch-oriented pipelines, five distinct failure modes emerge: stale data, memory gaps, delete blindness, schema fragility, and coordination failure. This article examines each failure mode, explains the underlying mechanism, and proposes architectural remedies grounded in streaming-first design principles. It also connects each technical failure to measurable business outcomes—extended downtime, recurring incidents, compliance exposure, silent decision degradation, and cascading impact. The result is both a diagnostic framework for I&O leaders and a financial argument for treating streaming data infrastructure as the prerequisite for autonomous operations.

Introduction: The Data Foundation Gap

Artificial intelligence is reshaping network operations. AI agents promise to detect anomalies, diagnose root causes, and execute remediation faster than human engineers. The industry has focused attention on models, GPUs, and orchestration frameworks. The data layer remains largely unexamined.

This is a critical oversight. Most enterprise data pipelines were built for human consumers. They serve dashboards, weekly reports, and historical analysis. Humans tolerate latency. Humans bring context. Humans notice when something looks wrong.

AI agents require something fundamentally different. They need real-time context. They need historical state. They need accurate representations of current reality. When these requirements are not met, agents do not complain. They act—on incomplete information, with incorrect assumptions, producing wrong outcomes.

The gap between what batch pipelines deliver and what agents require creates failure modes that most teams do not see until an agent makes the wrong decision. Recent analysis has identified the economic dimensions of this gap [1], while industry resources have begun documenting the specific failure patterns that arise when batch processing meets autonomous agents [6]. This article extends that work by identifying five distinct failure modes and proposing a streaming-first architectural response.

FIVE FAILURE MODES: ANATOMY OF BATCH-TO-AGENT MISMATCH

The following five failure modes represent the specific ways batch data pipelines undermine autonomous network operations. Each is examined through its mechanism—how the batch pipeline architecture produces the failure—its operational consequence, and the streaming-first architectural remedy that eliminates it. Together, they form a diagnostic taxonomy for any I&O team evaluating whether their data foundation is ready for Agentic AI.

Failure Mode 1: Stale Data

Mechanism: Batch telemetry pipelines poll, collect, and process data in cycles. Data is extracted on a schedule, transformed in bulk, and loaded into a destination—a warehouse, data lake, time-series database, or feature store that holds a static, point-in-time snapshot of the source. Between cycles, the pipeline holds no current state. An AI agent that spins up between cycles receives a snapshot of the past.

Consequence: The agent diagnoses an outage using telemetry from five minutes ago. The network state has changed during that interval. Routes have shifted. Traffic has been redirected. Thus, the agent’s diagnosis is based on a reality that no longer exists. Remediation actions applied to a past state can worsen the current incident. The agent becomes a liability rather than an asset. Industry documentation confirms that AI agents require continuous data freshness to function correctly [5].

Architectural Remedy: Streaming telemetry replaces cyclical polling with continuous event push. Data flows from source to consumer in real time, ingested directly into the streaming platform’s durable event log [2]. The agent consumes from a live stream, not a stale snapshot. Context acquisition takes milliseconds. The cognitive loop remains intact. This is not an add-on to the batch pipeline. It is a structural replacement of the ingestion layer.
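The freshness difference can be illustrated with a minimal sketch. The names `TelemetryEvent`, `is_fresh`, and `MAX_AGE_S`, and the one-second freshness budget, are assumptions made for illustration and do not come from any specific platform.

```python
from dataclasses import dataclass

MAX_AGE_S = 1.0  # assumed freshness budget for an agent decision (illustrative)


@dataclass(frozen=True)
class TelemetryEvent:
    device: str
    metric: str
    value: float
    observed_at: float  # seconds since epoch, stamped at the source


def is_fresh(event: TelemetryEvent, now: float, max_age_s: float = MAX_AGE_S) -> bool:
    """Return True when the event is recent enough to act on."""
    return (now - event.observed_at) <= max_age_s


now = 1_000_000.0
# Batch: the newest record dates from the last poll cycle, five minutes ago.
batch_snapshot = TelemetryEvent("edge-1", "if_oper_status", 1.0, observed_at=now - 300)
# Streaming: the newest record was pushed moments ago.
stream_event = TelemetryEvent("edge-1", "if_oper_status", 0.0, observed_at=now - 0.05)

print(is_fresh(batch_snapshot, now))  # False
print(is_fresh(stream_event, now))    # True
```

An agent that gates every decision on a check like this would refuse to act on the batch snapshot, which makes the staleness visible instead of silent.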

Failure Mode 2: Memory Gap

Mechanism: Batch pipelines deliver windows of data—the last hour, the last day, the last processing cycle. They do not preserve the sequence of events that led to the current moment. Historical context is stripped away with each new extract. The pipeline knows what happened. It does not know what happened before.

Consequence: An agent responding to an interface flap cannot answer the most basic diagnostic question: has this happened before? It cannot correlate the current event with the three similar events that occurred in the preceding 24 hours. It cannot detect the pattern that would reveal a degrading optical module. Every incident appears isolated. Pattern recognition—the core value proposition of AI-driven operations—is structurally impossible. The distinction between streaming and batch architectures for these use cases has been well-documented [4].

Architectural Remedy: A durable event log with configurable retention serves as the agent’s memory [2]. Unlike a batch window, which discards history with each new extract, the event log preserves the ordered sequence of all events within the retention period. The agent seeks backward in the log on startup and replays the preceding window of telemetry. Pattern detection across time becomes native to the architecture. This is not a separate cache layered on top. It is the storage layer itself—immutable, ordered, and built for event replay from any offset.
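As a rough sketch of replay-based memory, the following uses an in-memory list as a stand-in for a durable log. `EventLog` and `count_flaps_in_window` are hypothetical names, and a production platform would add persistence, partitioning, and retention.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """In-memory stand-in for a durable, ordered, append-only log."""
    events: list = field(default_factory=list)

    def append(self, event: dict) -> int:
        self.events.append(event)
        return len(self.events) - 1  # offset of the new event

    def replay(self, from_offset: int = 0):
        """Yield (offset, event) pairs in order, starting at from_offset."""
        for offset in range(from_offset, len(self.events)):
            yield offset, self.events[offset]


def count_flaps_in_window(log: EventLog, interface: str, since_ts: float) -> int:
    """Count flap events for one interface at or after since_ts."""
    return sum(
        1
        for _, e in log.replay()
        if e["type"] == "interface_flap"
        and e["interface"] == interface
        and e["ts"] >= since_ts
    )


log = EventLog()
for ts in (1_000, 20_000, 50_000):  # three earlier flaps on eth1/1
    log.append({"type": "interface_flap", "interface": "eth1/1", "ts": ts})
log.append({"type": "interface_flap", "interface": "eth1/2", "ts": 60_000})
current_ts = 80_000
log.append({"type": "interface_flap", "interface": "eth1/1", "ts": current_ts})

window_s = 24 * 3600
flaps = count_flaps_in_window(log, "eth1/1", since_ts=current_ts - window_s)
print(flaps)  # 4: the current flap plus three earlier ones in the window
```

With a batch window that holds only the latest extract, the same question would return 1, and the degrading-module pattern would stay hidden.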

Failure Mode 3: Delete Blindness

Mechanism: Batch Extract, Transform, Load (ETL) processes compare snapshots of source data. They do not watch the database transaction log. They identify what exists at two points in time and process the difference. When a record is deleted from the source system, the pipeline has no way of distinguishing between a row that was deleted and a row that was simply omitted due to extraction error, filtering logic, or schema mismatch. The absence of a row is not an event. It is a gap. Batch pipelines are not designed to interpret gaps as meaningful signals. The record simply vanishes from the next extract. The downstream consumer, whether an AI agent or any other system, has no way of knowing the record ever existed.

Consequence: The agent queries the downstream data store and finds no record for a deactivated account, a revoked certificate, or a cancelled change order. It cannot distinguish between “never existed” and “was deleted,” so it treats the absence as neutral.

The agent makes decisions on ghosts—data that no longer exists in source systems. In access control scenarios, this is not an operational error. It is a security incident. This specific failure mode has been identified in analyses of batch processing limitations for AI agents [6].

Architectural Remedy: Change data capture (CDC), implemented through Kafka Connect with Debezium connectors, reads the database transaction log directly [2], [8]. Debezium provides CDC source connectors for MySQL, PostgreSQL, MongoDB, SQL Server, and other databases — capturing inserts, updates, and deletes as discrete events with explicit operation types by tailing the database’s native transaction log. Nothing is invisible to the pipeline. The streaming architecture knows not only what exists but what ceased to exist. This is not an ETL workaround with soft-delete flags. It is a structural capability of the integration layer, converting database changes into first-class events the moment they occur.
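The contrast between a snapshot and an explicit delete event can be sketched as follows. The event shape loosely follows Debezium's change-event envelope (`op`, `before`, `after`) but is heavily simplified. The helper names `snapshot_view` and `apply_change` and the sample rows are assumptions for illustration.

```python
def snapshot_view(extract: list[dict]) -> dict:
    """Batch view: whatever rows appear in the latest extract, keyed by id."""
    return {row["id"]: row for row in extract}


def apply_change(state: dict, deleted_ids: set, event: dict) -> None:
    """Apply one CDC-style change event, recording deletes explicitly."""
    op = event["op"]
    if op in ("c", "u", "r"):  # create, update, snapshot read
        row = event["after"]
        state[row["id"]] = row
        deleted_ids.discard(row["id"])
    elif op == "d":  # delete: 'before' carries the last row image
        row_id = event["before"]["id"]
        state.pop(row_id, None)
        deleted_ids.add(row_id)
    else:
        raise ValueError(f"unknown op: {op!r}")


# Batch: account 42 simply disappears from the second extract.
extract_2 = [{"id": 7, "status": "active"}]
batch = snapshot_view(extract_2)
print(42 in batch)  # False, with no way to tell "deleted" from "never existed"

# CDC: the delete arrives as its own event with an explicit operation type.
state, deleted_ids = {}, set()
events = [
    {"op": "c", "before": None, "after": {"id": 7, "status": "active"}},
    {"op": "c", "before": None, "after": {"id": 42, "status": "active"}},
    {"op": "d", "before": {"id": 42, "status": "active"}, "after": None},
]
for ev in events:
    apply_change(state, deleted_ids, ev)
print(42 in state, 42 in deleted_ids)  # False True
```

The agent consuming the change stream can answer "was this account deleted?" directly, which the snapshot view cannot.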

Failure Mode 4: Schema Fragility

Mechanism: Source database schemas change over time. Columns are renamed, added, deprecated, or re-typed. Batch pipelines are configured for a specific schema at extraction time. When the source schema changes, the pipeline responds in one of two ways. It fails silently and drops the affected field from every subsequent extract. Or it fails loudly and stops processing entirely.

Silent failure is the more dangerous outcome. The pipeline continues delivering data. The consumer has no indication that a critical field is missing.

Consequence: The agent continues operating without a critical data input. It makes decisions with incomplete information. It has no awareness that its reasoning is compromised. The wrong decisions accumulate. By the time the missing field is discovered—often through an operational failure rather than a monitoring alert—the cost of remediation includes auditing and correcting every decision made during the degradation window.

Architectural Remedy: A schema registry with compatibility enforcement validates schema changes before they propagate to downstream consumers [2]. Streaming platforms can enforce backward and forward compatibility rules at the producer level. A breaking schema change is rejected before any data is published. The pipeline fails loudly and immediately. This is not a documentation standard or a code review checklist. It is a structural governance layer embedded in the streaming architecture itself, preventing silent field loss at the point of ingestion.
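A simplified sketch of compatibility enforcement at registration time is shown below. The schema representation, the rule set, and the `breaking_changes` and `register` helpers are assumptions for illustration. Real registries implement richer, format-specific rules (for example, Avro type promotion).

```python
def breaking_changes(old: dict, new: dict) -> list[str]:
    """List changes that would break consumers still reading with `old`.

    Schemas are {field_name: {"type": str, "default": optional}} dicts.
    Simplified rule (an assumption, not a registry's full semantics):
    a field the old schema requires must still exist with the same type.
    """
    problems = []
    for name, spec in old.items():
        if name not in new:
            if "default" not in spec:
                problems.append(f"required field removed: {name}")
        elif new[name]["type"] != spec["type"]:
            problems.append(
                f"type changed: {name} {spec['type']} -> {new[name]['type']}"
            )
    return problems


def register(old: dict, new: dict) -> dict:
    """Accept the new schema only if it breaks nothing; otherwise fail loudly."""
    problems = breaking_changes(old, new)
    if problems:
        raise ValueError("schema rejected: " + "; ".join(problems))
    return new


v1 = {"device": {"type": "string"}, "rx_errors": {"type": "long"}}
v2_ok = {**v1, "tx_errors": {"type": "long", "default": 0}}  # additive change
v2_bad = {"device": {"type": "string"}}  # drops rx_errors

register(v1, v2_ok)
try:
    register(v1, v2_bad)
except ValueError as exc:
    print(exc)  # schema rejected: required field removed: rx_errors
```

The point of the sketch is the placement of the check: the producer's change is rejected before any record is published, so the consumer never sees a silently missing field.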

Failure Mode 5: Coordination Failure

Mechanism: When multiple AI agents operate on batch-derived data, each agent consumes a separate, potentially inconsistent snapshot. Agent A receives data from the 10:00 AM extract. Agent B receives data from the 10:15 AM extract. The extracts differ. Each agent holds a different version of reality. There is no shared, ordered log of events that all agents consume.

Consequence: Two agents respond to the same cascading failure. Agent A identifies a BGP routing issue and begins rerouting traffic. Agent B identifies a DNS resolution failure and begins modifying name server configurations. Neither agent knows the other acted. The redundant changes compete. The conflicting configurations create new instability. The original incident expands rather than resolves. What began as a single point of failure becomes a cascade that erodes trust in autonomous operations.

Architectural Remedy: A shared, ordered event log serves as a single source of truth for all agents in the system. Every agent consumes from the same log. Actions taken by one agent are published back to the log as events, immediately visible to all others [7]. Coordination becomes native to the architecture.

Visibility alone, however, does not prevent conflicting actions. Two agents may observe the same anomaly and both initiate remediation before either’s action becomes visible on the log. In practice, this is addressed through complementary mechanisms layered on the same event-driven model: action intent events that signal an agent is about to act, giving others a window to defer; idempotency keys that prevent duplicate remediation from causing harm; and lightweight leases for resources that should only be modified by one agent at a time. These mechanisms do not require a central coordinator. They are published to the same log, consumed by the same agents, and enforced through the same ordered stream.

This is not a separate orchestration layer or message bus bolted onto the side. It is the core of the streaming platform—a unified, ordered, multi-consumer event stream that provides both the shared state and the coordination primitives that eliminate the inconsistent snapshots batch architectures produce by default.
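The three coordination mechanisms described above (intent events, idempotency keys, and leases) can be sketched on a single in-memory log. `SharedLog`, `lease_holder`, `try_act`, and the 30-second lease are hypothetical, and a real deployment would rely on the ordering guarantees of the streaming platform instead of a Python list.

```python
from dataclasses import dataclass, field


@dataclass
class SharedLog:
    """In-memory stand-in for a shared, ordered, multi-consumer log."""
    events: list = field(default_factory=list)

    def publish(self, event: dict) -> int:
        self.events.append(event)
        return len(self.events) - 1


def lease_holder(log: SharedLog, resource: str, now: float):
    """The first unexpired intent in log order holds the lease on the resource."""
    for e in log.events:
        if e["type"] == "intent" and e["resource"] == resource and e["expires_at"] > now:
            return e["agent"]
    return None


def try_act(log, agent, resource, action, incident_id, now, lease_s=30.0) -> bool:
    """Publish intent, then act only if this agent holds the lease and the action is new."""
    key = f"{incident_id}:{resource}:{action}"  # idempotency key
    if any(e["type"] == "action" and e["key"] == key for e in log.events):
        return False  # the same remediation was already applied
    log.publish({"type": "intent", "agent": agent, "resource": resource,
                 "expires_at": now + lease_s})
    if lease_holder(log, resource, now) != agent:
        return False  # another agent got there first; defer
    log.publish({"type": "action", "agent": agent, "key": key})
    return True


log = SharedLog()
print(try_act(log, "agent-a", "router-7", "reroute", "INC-1", now=100.0))   # True
print(try_act(log, "agent-b", "router-7", "edit-dns", "INC-1", now=101.0))  # False
print(try_act(log, "agent-a", "router-7", "reroute", "INC-1", now=102.0))   # False
```

Because every agent derives the lease holder from the same ordered log, both agents reach the same conclusion about who may act without consulting a central coordinator.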

Batch-to-Streaming Reference Architecture — Five Failure Modes and Their Architectural Remedies

THE UNIFIED DIAGNOSTIC FRAMEWORK

The five failure modes translate into a practical audit that I&O leaders can apply to their own infrastructure. Each question corresponds to a specific architectural requirement.

The Five-Question Audit

  1. Can the data pipeline deliver real-time context to an agent the moment it wakes up? If not, the system is vulnerable to stale data failures.
  2. Can the agent access the preceding window of telemetry to detect patterns across events? If not, the system is vulnerable to memory gap failures.
  3. Does the pipeline capture deletes as explicit events with operation types? If not, the system is vulnerable to delete blindness.
  4. Does the pipeline detect schema changes before they propagate to downstream consumers? If not, the system is vulnerable to schema fragility.
  5. Do all agents share a single, ordered view of events with visibility into each other’s actions? If not, the system is vulnerable to coordination failure.

A negative answer to any one of these questions signals a data foundation that is not ready for autonomous operations. The model is not the bottleneck. The GPUs are not the bottleneck. The telemetry pipeline is.
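The audit can be captured as a small checklist, sketched here with hypothetical question keys. A missing answer is treated as a "no", which is an assumption of this sketch.

```python
# (question key, failure mode exposed by a "no" answer)
AUDIT = [
    ("real_time_context", "stale data"),
    ("historical_replay", "memory gap"),
    ("explicit_deletes", "delete blindness"),
    ("schema_checks_before_propagation", "schema fragility"),
    ("shared_ordered_view", "coordination failure"),
]


def audit(answers: dict[str, bool]) -> list[str]:
    """Return the failure modes the pipeline is exposed to."""
    return [mode for question, mode in AUDIT if not answers.get(question, False)]


answers = {
    "real_time_context": True,
    "historical_replay": True,
    "explicit_deletes": False,
    "schema_checks_before_propagation": True,
    # "shared_ordered_view" left unanswered, so it counts as "no"
}
print(audit(answers))  # ['delete blindness', 'coordination failure']
```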

THE MIGRATION PATH: FROM BATCH TO STREAMING-FIRST

Adopting a streaming-first architecture does not require abandoning existing batch investments overnight. For most organizations, the transition follows a coexistence model: streaming pipelines are introduced alongside batch pipelines, not as an immediate replacement.

The practical starting point is to identify the highest-value agent—the one whose decisions carry the greatest operational or financial consequence—and convert its data pipeline first. This agent is typically the one where stale data, memory gaps, or coordination failures have produced measurable incidents. Converting this single pipeline to streaming telemetry with a durable event log delivers a targeted operational improvement while the rest of the batch estate continues to function.

From there, adoption expands incrementally. Each additional agent is migrated as operational experience with the streaming platform grows. Teams develop competence in offset management, schema governance through the registry, and backpressure handling while batch pipelines continue to serve lower-priority consumers. The streaming and batch estates coexist for a transition period measured in months, not days.

This incremental approach also reveals where streaming delivers the greatest marginal benefit. Not every data flow requires real-time treatment. Dashboards fed by hourly batch extracts may serve their purpose indefinitely. The streaming investment should be directed at the pipelines that feed autonomous agents—the flows where the five failure modes carry real operational consequence. The goal is not to stream everything. It is to stream the right things first.

THE BUSINESS IMPACT: FROM TECHNICAL FAILURE TO FINANCIAL CONSEQUENCE

Technical failures in the data pipeline do not remain technical. They cascade into business outcomes that appear on budget reviews, SLA reports, and board presentations. Each failure mode carries a distinct financial consequence.

Stale Data → Extended Downtime
An agent diagnosing from stale telemetry makes incorrect decisions. Remediation applied to a past state can worsen the current incident. Mean Time to Resolution increases. For revenue-generating services, every minute of extended downtime translates to lost revenue and SLA penalty accrual.

Consider an illustrative model: for a Tier-1 service provider processing $50M in customer transactions per hour, a misdiagnosis induced by 5-minute-old telemetry that extends an outage by 15 minutes represents $12.5M in direct revenue loss (one quarter of an hour at $50M per hour), not counting SLA penalties, regulatory scrutiny, or reputational harm. The cost of a single such incident can exceed the annual investment in the streaming infrastructure that would have prevented it. If even a portion of such incidents are eliminated by replacing the batch pipeline feeding the diagnostic agent with a streaming backbone, the infrastructure investment is recovered in a single avoided outage.
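The arithmetic behind the illustrative figure can be checked with a short calculation. The function name `outage_revenue_loss` is hypothetical, and the inputs are the article's illustrative numbers, not measured data.

```python
def outage_revenue_loss(revenue_per_hour: float, extra_minutes: float) -> float:
    """Direct revenue lost while an outage is extended by extra_minutes."""
    return revenue_per_hour * extra_minutes / 60.0


loss = outage_revenue_loss(revenue_per_hour=50_000_000, extra_minutes=15)
print(f"${loss:,.0f}")  # $12,500,000
```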

Memory Gap → Recurring Incidents
An agent without historical context cannot recognize chronic conditions. A flapping interface, a memory leak, or a degrading optical module triggers the same alert repeatedly. Each occurrence consumes GPU inference cycles. Each occurrence generates a ticket. Each occurrence may require human escalation. The cumulative cost of a single undiagnosed chronic issue, multiplied across an enterprise network over a year, represents operational expenditure that a stateful agent could eliminate.

Delete Blindness → Compliance and Security Exposure
An agent acting on deleted records makes authorization decisions based on invalid state. A deactivated account granted access. A revoked certificate treated as valid. In regulated industries, these errors are compliance violations with defined financial penalties and reporting obligations. The cost of a single access control error caused by ghost data can exceed the annual cost of the streaming infrastructure that would have prevented it.

Schema Fragility → Silent Decision Degradation
When a batch pipeline drops a critical field, the agent does not fail loudly. It continues operating with incomplete inputs. Decisions degrade silently. The cost includes not only the direct operational impact but the effort of auditing and correcting every decision made during the degradation window. Silent failure multiplies eventual remediation cost.

Coordination Failure → Cascading Impact
When multiple agents act on inconsistent views of reality, they create new problems. Redundant changes compete. Conflicting configurations destabilize the environment. The original incident expands. The cost includes extended resolution time, additional engineering effort, and eroded trust in autonomous operations. Organizational credibility is a balance sheet item that coordination failure depletes.

The Aggregated View
Taken together, the five failure modes represent a predictable drain on AI investment returns. An organization that deploys expensive GPU infrastructure, fine-tunes capable models, and implements event-driven orchestration [3]—but feeds all of it with a batch data pipeline—has built an autonomous operations capability on a foundation that guarantees suboptimal outcomes. The streaming backbone is not an incremental cost. It is the insurance policy that protects the returns on every other AI infrastructure investment.

CONCLUSION: STREAMING-FIRST AS THE ARCHITECTURAL PREREQUISITE

The five failure modes share a common root cause. Batch data pipelines were designed for human consumers who tolerate latency, bring context, and notice anomalies. AI agents tolerate nothing. They act on what they receive.

Each failure mode is addressable within a unified streaming data architecture. Streaming telemetry solves stale data by replacing cyclical polling with continuous event push. Durable event logs solve memory gaps by preserving the sequence of events with configurable retention, allowing agents to replay history and detect patterns across time. Change data capture—a structural component of the streaming architecture implemented through Kafka Connect and Debezium—solves delete blindness by reading database transaction logs directly, capturing inserts, updates, and deletes as discrete events with explicit operation types. A schema registry with compatibility enforcement solves schema fragility by validating schema changes before they propagate downstream, catching breaking changes at the source rather than discovering them after agent failure. A shared, ordered event log solves coordination failure by serving as a single source of truth that all agents consume, ensuring every agent operates on the same reality with visibility into every other agent’s actions—complemented by intent events, idempotency keys, and lightweight leases that prevent conflicting actions without a central coordinator.

These are not disparate tools. They are structural elements of a single streaming data architecture. Apache Kafka provides the durable, shared event log at the core. Kafka Connect provides the integration framework for change data capture, ingesting database changes as first-class events. Schema Registry provides the compatibility governance layer. Together, they form a complete data foundation where stale data, memory gaps, delete blindness, schema fragility, and coordination failure are eliminated by design—not patched after the fact.

These architectural components eliminate the data-layer failure modes. But real-time data also enables real-time action—and that speed demands an execution-layer governance framework. Policy-as-code engines ensure that agent decisions, even when based on perfect context and full state, are validated against operational guardrails before they become cluster changes. The streaming backbone delivers the context. The policy layer ensures that context is acted upon safely.
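A minimal sketch of an execution-layer guardrail follows. The `ProposedChange` fields, the three rules, and the five-device threshold are hypothetical, and a production deployment would typically express such rules in a dedicated policy engine, not inline Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedChange:
    agent: str
    target: str
    action: str
    blast_radius: int       # number of devices the change touches
    in_change_window: bool


# Each policy returns an error string, or None when the change passes.
POLICIES = [
    lambda c: None if c.blast_radius <= 5 else "blast radius exceeds 5 devices",
    lambda c: None if c.in_change_window else "outside approved change window",
    lambda c: None if c.action != "delete_config" else "action not permitted for agents",
]


def evaluate(change: ProposedChange) -> list[str]:
    """Return every policy violation; an empty list means the change may proceed."""
    return [msg for policy in POLICIES if (msg := policy(change)) is not None]


ok = ProposedChange("agent-a", "router-7", "reroute", 2, True)
bad = ProposedChange("agent-b", "core-1", "delete_config", 40, False)
print(evaluate(ok))   # []
print(evaluate(bad))
```

Evaluating every rule, instead of stopping at the first violation, gives the operator a complete picture of why a proposed change was blocked.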

This streaming architecture is not an end in itself. It is the data foundation upon which event-driven network operations can be built. While the streaming backbone eliminates the data-layer failure modes, organizations that pair it with event-driven compute unlock an additional dimension of efficiency. When a telemetry event flows through the event log and an anomaly is detected, that same stream can trigger the Kubernetes Event-driven Autoscaling (KEDA) of inference workloads [3]—spinning up the right-sized model at the right moment, on the right context. The streaming backbone delivers the context. Event-driven orchestration delivers the compute. Together, they close the loop from detection to inference, ensuring the agent has both the data and the compute it needs without the waste of always-on infrastructure.

The barrier is not technology. Each of these architectural components is proven, open-source, and deployed in production environments today. The barrier is architectural awareness. Organizations that invest in a streaming-first data architecture will deploy AI agents that deliver on their promise. Organizations that do not will discover these failure modes in production—after the wrong decision is already made.

The streaming data architecture is not a performance upgrade for Agentic AI. It is the architectural prerequisite.

REFERENCES

[1] P. Madduri and A. L. Thakur, “The Financial Trap of Autonomous Networks: Scaling Agentic AI in the Telecom Core,” IEEE ComSoc Technology Blog, March 2026. [Online]. Available: https://techblog.comsoc.org/2026/03/30/the-financial-trap-of-autonomous-networks-scaling-agentic-ai-in-the-telecom-core/

[2] Apache Software Foundation, “Apache Kafka Documentation.” [Online]. Available: https://kafka.apache.org/42/getting-started/introduction/

[3] Cloud Native Computing Foundation, “KEDA: Kubernetes Event-driven Autoscaling.” [Online]. Available: https://keda.sh/

[4] Streamkap, “Streaming ETL vs. Batch ETL: A Decision Framework.” [Online]. Available: https://streamkap.com/resources-and-guides/streaming-etl-vs-batch-etl

[5] Streamkap, “Real-Time vs Batch Data for AI Agents: Why Freshness Matters.” [Online]. Available: https://streamkap.com/resources-and-guides/real-time-vs-batch-data-for-agents

[6] Streamkap, “Why AI Agents Can’t Use Batch Data.” [Online]. Available: https://streamkap.com/resources-and-guides/why-agents-cant-use-batch-data

[7] Redpanda, “Building safe, multi-agent AI systems in Redpanda Agentic Data Plane.” [Online]. Available: https://www.redpanda.com/blog/adp-governed-multi-agent-ai-cloud

[8] Debezium Community, “Debezium: Open-Source Change Data Capture,” Debezium Documentation. [Online]. Available: https://debezium.io/

ABOUT THE AUTHOR

Shazia Hasnie, Ph.D., is VP, Product Strategy and Innovation at Cuber AI, focused on Agentic Network Operations, AI-driven automation, and streaming data architectures. Her work explores the intersection of autonomous systems, cloud-native infrastructure, and the economic models that make AI operations sustainable at scale.

linkedin.com/in/shaziahasnie/

Nvidia strategic partnership with IREN targets 5GW AI infrastructure buildout + $2.1B investment option

Nvidia has announced a strategic partnership with cloud AI data center operator IREN [Note 1] to deploy up to 5 gigawatts (5GW) of NVIDIA DSX-aligned AI infrastructure across IREN’s global pipeline, driven by a $3.4 billion services contract and a $2.1 billion investment option for Nvidia. The collaboration aims to secure critical, high-density data center capacity for AI workloads while accelerating IREN’s transition into a major AI infrastructure provider. The roadmap centers on the 2GW Sweetwater campus in Texas, positioned to be the flagship deployment of NVIDIA’s DSX factory architecture. This integrated model combines NVIDIA’s reference designs with IREN’s core competencies in utility-scale power procurement, site development, and full-stack GPU cloud operations.

Note 1. IREN’s metamorphosis from specialized mining to high-performance computing (HPC) mirrors the trajectory of Tier-1 AI Cloud providers like CoreWeave. With an operational fleet of 23,000 GPUs and a 3GW secured power portfolio in renewable-heavy regions, IREN is rapidly scaling its North American footprint. 

“AI factories are becoming foundational infrastructure for the global economy,” said Jensen Huang, founder and CEO of Nvidia. “Deploying these systems at scale requires deep integration across the full stack — compute, networking, software, power and operations. IREN brings the scale and infrastructure expertise to help accelerate the buildout of next-generation AI infrastructure globally. Together, we are building for the age of AI,” he added.

“This partnership combines NVIDIA’s AI systems and architecture leadership with IREN’s expertise across power, land, data centers, GPU deployment and infrastructure operations,” said Daniel Roberts, cofounder and co-CEO of IREN. “Together, we believe we can accelerate deployment of AI infrastructure and expand access to compute for AI-native and enterprise customers globally.”

This partnership follows a massive $9.7B agreement with Microsoft for sovereign GPU cloud services—leveraging GB300 Blackwell systems—and a $5.8B hardware procurement through Dell. Despite the scale of the Microsoft deal, leadership indicates it utilizes only ~10% of IREN’s projected capacity.
……………………………………………………………………………………………………………………………………….
Upshot:
Nvidia’s agreement with IREN introduces a unique structural alignment: Nvidia acts as both an upstream provider and an anchor tenant/stakeholder. By securing long-dated options over direct equity, Nvidia mitigates balance sheet volatility while ensuring preferential access to critical, grid-connected capacity in a supply-constrained market.
……………………………………………………………………………………………………………………………………….

References:

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Analysis: Cisco, HPE/Juniper, and Nvidia network equipment for AI data centers

Expose: AI is more than a bubble; it’s a data center debt bomb

Can the debt fueling the new wave of AI infrastructure buildouts ever be repaid?

Blaize and Winmate Forge Strategic Partnership to Accelerate Edge AI Integration in Ruggedized Systems

Bridging the Edge Connectivity Gap:

While modern AI architecture has historically favored centralized data centers, mission-critical applications require real-time inference at the edge. For defense personnel in remote locations, maritime operations, or emergency medical responders, reliance on cloud-based processing is often non-viable due to bandwidth constraints and latency requirements.

El Dorado Hills, CA-based Blaize Holdings, Inc. and Winmate Inc. (Taiwan) have announced a Strategic Partnership Agreement aimed at generating approximately $15 million in business during its first year.  The collaboration integrates Blaize’s high-performance AI accelerators into Winmate’s industrial-grade ruggedized hardware ecosystem (UAVs, handhelds, vehicle-mounted computers, and embedded systems), which is designed for mission-critical reliability in high-stress environments. Both organizations expect the agreement to be the foundation of a long-term, multi-year technology partnership.

The partnership addresses the “cloud dependency” bottleneck by leveraging Blaize’s GSP® (Graph Streaming Processor) architecture. These chips are engineered to industrial specifications, enabling sophisticated AI workloads to run locally on the device. When paired with Winmate’s ruggedized chassis—built to withstand extreme thermal fluctuations, high-velocity vibration, and dust ingress—the resulting systems provide high-compute AI capabilities in environments where traditional hardware fails.

Target applications:
  • Border security and surveillance: Real-time threat detection and perimeter monitoring
  • Mobile command and control: On-site intelligence and situational awareness for field teams
  • Drones and unmanned systems: Autonomous navigation and mission execution for UAVs and ground vehicles
  • Critical infrastructure: Continuous monitoring and predictive analytics for power, ports, and transportation
  • Maritime domain awareness: Vessel tracking and anomaly detection at sea
  • Field healthcare: Portable diagnostics and decision support in remote and disaster environments

Deal at a glance:

  • First-year revenue: The parties intend to work in good faith to close approximately $15 million in business, which is expected to scale meaningfully in subsequent years
  • Term: Three-year initial term, with automatic renewal
  • Next steps: Joint engineering, sales, and marketing execution to bring integrated systems to market, with additional opportunities to be added through follow-on programs

Blaize GSP Architecture and Winmate Ruggedization:
The core technical advantage of the Blaize and Winmate partnership lies in the shift from traditional instruction-set architectures to a graph-native processing model. This transition is critical for high-stakes environments like defense and critical infrastructure, where the “cloud round-trip” is an operational liability.
………………………………………………………………………………………………………………………
1. Blaize Graph Streaming Processor (GSP®) Architecture:
Unlike traditional CPUs or GPUs that process tasks sequentially or in massive rigid parallel blocks, the Blaize GSP is purpose-built to execute AI graphs natively in hardware.
    • Task-Level Parallelism: The architecture leverages an on-chip hardware scheduler to analyze data dependencies in real-time. It executes deeper layers of a neural network as soon as previous layers produce sufficient intermediate results, minimizing the “idle time” typical of sequential processing.
    • Performance-to-Power Ratio: The flagship Blaize 1600 SoC features 16 GSP cores delivering 16 TOPS (Tera Operations Per Second) of AI inference within a conservative 7W power envelope.
    • Memory Efficiency: By streaming data through the processor and holding intermediate results in cache, the GSP reduces external DRAM access by up to 50x, which significantly lowers latency and overall system thermal output.
    • Unified Development Platform: All hardware is supported by the Blaize Picasso SDK, which allows developers to port models from standard frameworks (like PyTorch or TensorFlow) into a streaming execution format without manual low-level hardware coding.
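
The task-level parallelism described above can be illustrated with a small software analogy. The sketch below is not Blaize's hardware scheduler and does not use the Picasso SDK; every name in it (`source`, `scale`, `window_sum`, `run_streaming`, the `log` list) is hypothetical. It uses plain Python generators to show how a downstream stage can start work as soon as an upstream stage has produced enough intermediate results, instead of waiting for the whole upstream layer to finish.

```python
from typing import Iterable, Iterator, List

# Records the order in which the stages do work (illustration only).
log: List[str] = []


def source(n: int) -> Iterator[int]:
    """Hypothetical input stage: emits n data tiles one at a time."""
    for i in range(n):
        log.append(f"source:{i}")
        yield i


def scale(tiles: Iterable[int], factor: int) -> Iterator[int]:
    """Hypothetical 'layer 1': handles each tile as soon as it arrives."""
    for t in tiles:
        log.append(f"scale:{t}")
        yield t * factor


def window_sum(tiles: Iterable[int], window: int) -> Iterator[int]:
    """Hypothetical 'layer 2': fires once `window` intermediate results exist."""
    buf: List[int] = []
    for t in tiles:
        buf.append(t)
        if len(buf) == window:
            log.append(f"window_sum:{buf}")
            yield sum(buf)
            buf = []  # a trailing partial window is dropped in this sketch


def run_streaming(n: int = 4, factor: int = 10, window: int = 2) -> List[int]:
    """Chain the stages so data streams through them tile by tile."""
    log.clear()
    return list(window_sum(scale(source(n), factor), window))


if __name__ == "__main__":
    print(run_streaming())  # [10, 50]
    print(log)
```

In the recorded log, the first `window_sum` entry appears before `source:2`, which means the second stage ran while the first was still producing data. The only intermediate state held at any time is the small `buf` list, which is a loose analogy for keeping intermediate results close to the compute units rather than writing a full layer output to external memory.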

Image Credit: Blaize Holdings

………………………………………………………………………………………………………………………

2. Winmate Rugged Integration:
Winmate embeds these high-efficiency accelerators into “sovereign edge” platforms—hardware that maintains full operational capability without external network reliance. 
  • Pathfinder P1600 SOM: This System-on-Module is the primary vehicle for integration into Winmate’s handhelds and drones. It operates as a standalone unit with dual ARM Cortex-A53 processors and integrated MIPI CSI camera interfaces for real-time sensor fusion.
  • Mission-Ready Durability: These systems are engineered to meet MIL-STD-810H and IP65+ standards, ensuring that Blaize’s AI silicon remains stable under extreme vibration, thermal shock (operating in sub-zero or high-heat field conditions), and high-velocity impacts.
  • Sovereign Edge Computing: By processing sensitive data locally on ruggedized handhelds or vehicle-mounted units, the partnership ensures data sovereignty, preventing critical telemetry or biometric data from ever leaving the device during field operations.

Quotes from the CEOs:

“Our customers can’t wait, and they often can’t rely on the cloud. They need AI that runs where the work happens. Winmate makes some of the most capable rugged systems in the industry, and our chips are designed to run AI inside exactly those kinds of devices. This partnership turns a years-long vision into a practical, deployable answer for defense and critical infrastructure operators,” said Dinakar Munagala, CEO of Blaize, Inc.

“Our platforms are deployed on naval vessels, in border outposts, on industrial sites, and in disaster zones – environments where most hardware fails. With Blaize, we can now deliver those same systems with on-device AI built in, giving customers real-time intelligence wherever they operate,” said Ken Lu, Chairman and CEO of Winmate Inc.

Market Outlook: The Shift to On-Device Intelligence:
Demand for localized, secure AI is growing rapidly. BCC Research projects that the global edge AI sector will expand from $11.8 billion in 2025 to $56.8 billion by 2030, a CAGR of 36.9%. For sectors such as defense, healthcare, and critical infrastructure, the move toward edge AI is driven by two primary imperatives:
    1. Latency: The necessity for near-zero response times in autonomous and diagnostic systems.
    2. Security: The requirement to process sensitive data locally to mitigate the risks associated with transmitting information over public or compromised networks.
By combining low-power, high-efficiency silicon with hardened mechanical engineering, Blaize and Winmate are positioning themselves at the forefront of this industrial shift toward decentralized intelligence.
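
As a quick sanity check on the market figures above, the compound annual growth rate can be recomputed from the two endpoints. This is a minimal sketch: the function names `cagr` and `project` are illustrative, and the year-by-year path assumes perfectly smooth compounding, which the source does not claim.

```python
from typing import List


def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> List[float]:
    """Value at the end of each year, assuming a constant growth rate."""
    return [start_value * (1 + rate) ** y for y in range(years + 1)]


# Figures quoted in the text: $11.8B in 2025 growing to $56.8B by 2030.
rate = cagr(11.8, 56.8, 5)
print(f"{rate:.1%}")  # 36.9%

for year, value in zip(range(2025, 2031), project(11.8, rate, 5)):
    print(year, f"${value:.1f}B")
```

The recomputed rate agrees with the 36.9% figure quoted from BCC Research.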
…………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

About Blaize, Inc.
Blaize delivers a programmable AI platform, purpose-built for AI inference workloads in real-world environments. Its Hybrid AI architecture combines the Blaize GSP (Graph Streaming Processor) with GPU-based infrastructure, enabling AI inference workloads to run across edge, cloud, and data center. Blaize solutions support computer vision, multimodal AI, and sensor-driven applications across smart cities, industrial automation, telecommunications, retail, logistics, and defense. Blaize is headquartered in El Dorado Hills, California, with a global presence across North America, Europe, the Middle East, and Asia. Visit www.blaize.com or follow us on LinkedIn @blaizeinc.

About Winmate Inc.
Winmate Inc. is a publicly traded global leader in rugged computing systems, delivering industrial-grade platforms – including handhelds, tablets, vehicle-mounted units, panel PCs, and embedded modules – for demanding environments across defense, transportation, energy, healthcare, and industrial markets.

………………………………………………………………………………………………………………………

References:

Orange, Nokia, Nvidia, and Intel debate: ASICs vs. GPUs vs. General-Purpose CPUs for RAN Baseband Processing

For Orange CTO Laurent Leboucher, the main attraction of AI today lies in its potential to improve the efficiency of 5G radio access networks (RANs). That helps explain Orange’s recent collaboration with Nokia and Nvidia. Orange already deploys Nokia’s purpose-built 5G network equipment and software at mobile sites in France and other markets. Until recently, it had little obvious need for Nvidia, the U.S. chip making king best known for the graphics processing units (GPUs) used to train large language models. But Nokia and Nvidia became closely aligned last October, when Nvidia took a 3% stake in Nokia as part of a $1 billion investment. Nokia is now developing AI RAN software designed to run on GPUs.

Leboucher’s interest is driven in part by concerns over the cost of custom silicon — the application-specific integrated circuits (ASICs) used in purpose-built 5G networks. “It creates an opportunity to bring a general-purpose chipset instead of an ASIC implementation,” he told Light Reading at last week’s FutureNet World event in London. “I think we could, at some point, benefit from the economies of scale of new chipsets. That could be Nvidia.”

The rationale is much easier to understand than arguments about 5G for autonomous vehicles. Chip manufacturing is already expensive, and both Nokia and Ericsson expect component costs to rise further this year amid relentless AI demand. At the same time, the RAN market remains relatively small and has contracted. According to market research firm Omdia, telco RAN spending fell from $45 billion in 2022 to $35 billion last year and is expected to stay at that level. In that context, it is increasingly difficult to justify designing high-cost chips with limited reuse outside telecom.

Image Credit: Orange

Last year, Nvidia spent about $18.5 billion on research and development, generated nearly $216 billion in revenue, and reported a gross margin of more than 70%. Its financial strength is not in question. If telecom operators can use its GPUs for RAN software, they may face less pressure to secure the long-term economics of 5G and 6G development. That alone could be enough to support the case for Nvidia. The counterarguments are cost and power consumption. By design, custom silicon is optimized for a specific workload and will always outperform a more general-purpose processor at that task. An Nvidia GPU in the RAN could therefore be seen as excessive — like using a crop duster to water a hanging basket.

Leboucher believes that Nokia and Nvidia are developing something far more compact than a typical data-center deployment. “It is not a Blackwell GPU,” he said, referring to Nvidia’s current hyperscaler-class product line. “I have an understanding it’s something which is a little bit smaller.” One of the first GPU-based products is expected to come on a card that Orange can insert into an existing Nokia AirScale chassis.

He is also interested in replacing traditional RAN algorithms with AI to improve spectral efficiency and overall performance. Through trials with Nokia and Nvidia, Orange wants to determine whether a GPU is actually required to capture the full benefit. “We can completely rethink the way we are doing algorithms today, using AI for the radio Layer 1,” he said, referring to the most compute-intensive part of the RAN software stack. Some of the “AI-RAN” narrative still sounds “a little bit like science fiction,” Leboucher admitted. “But I think there are some very interesting ideas behind that. We want to understand where we are.”

This is not the first time the industry has debated a shift from ASICs to general-purpose processors for RAN equipment. Alongside its purpose-built 5G portfolio, Ericsson already offers cloud RAN products based on Intel CPUs. Samsung is now focused on Intel-based virtual RAN and has recently predicted the end of purpose-built 5G. Even so, cloud and virtual RAN still account for only a small share of live 5G deployments. Huawei and Ericsson, the two largest RAN vendors, remain committed to custom silicon development.

Nvidia’s entry into the market has clearly given Leboucher and his team more to evaluate as RAN technology becomes more sophisticated. “We are introducing new requirements for radio networks, typically for beamforming, and we have to consider the need for quite powerful chipsets,” he said. “Whether the best way to keep going is using ASICs or a general-purpose architecture – I think this is a good time to ask the question. Before, it was too early.”

The answer could shape Orange’s next major RAN decisions. The operator is preparing for what Leboucher describes as a “refresh” of RAN equipment across several countries ahead of the expected 6G launch in 2030. For the first time, he said, Orange will include cloud RAN as a “major option” in its request for proposal.

The concern around Intel as an alternative to Nvidia is its still-fragile financial position. Before December, Intel had been trying to spin off its network and edge group (NEX), which develops RAN chips. Those plans were later shelved, but the company’s net loss widened to about $4.3 billion in the first quarter, from $887 million a year earlier, while revenue rose only 7% year over year to $13.6 billion. Cristina Rodriguez, who had led NEX, left this month to join Coherent, and Intel has not yet named a successor.

“The shares jumped 28% in after-hours trading, taking Intel firmly into meme-stock territory,” said Radio Free Mobile analyst Richard Windsor in a blog published after results came out on April 23. “I say meme-stock because there is no other way to describe it when the shares are on a 2026 PER [price-to-earnings ratio] of 137x, and its technology looks obsolete.”

Orange places significant value on separating hardware from software, allowing the same RAN software to run across multiple hardware platforms. Ericsson and Samsung both say the virtual RAN software they have built for Intel CPUs could, with relatively modest changes, be ported to AMD silicon using the same x86 architecture or to Arm-based CPUs.

By contrast, Layer 1 code written for Nvidia GPUs and the CUDA software stack would not be portable to other platforms, according to Ericsson. “I think the main challenge we see with that is we are trying very hard to keep our stack portable, to give hardware options,” Michael Begley, Ericsson’s head of RAN compute, told Light Reading at MWC Barcelona this year. “If you go all in on one, it’s great, but you’re all in on one, and you can’t offer those other options to the operators or the ecosystem.”

Leboucher acknowledges that risk. “The risk of lock-in exists, definitely,” he said. “We really want to stay open. At the same time, we know that benefiting from a very, very large-scale general-purpose architecture should improve the TCO [total cost of ownership]. At the end of the day, it will be a trade-off. But we would welcome an architecture where we have the capacity at some point to decide to swap if we need to swap.”

Nokia’s hope is that much of the Layer 1 software written for Nvidia GPUs will eventually be deployable on other GPU platforms. But Nvidia’s near-monopoly in that segment leaves the industry with few alternatives for now. There is also optimism inside Nokia that GPU-based code could later be adapted for capable CPUs, although Ericsson’s comments suggest that would be much harder. For telecom executives, the choices made over the next couple of years may be pivotal as 6G approaches.

………………………………………………………………………………………………………………………………………………………

References:

https://www.lightreading.com/5g/orange-weighs-nvidia-against-intel-for-5g-chips-ahead-of-new-rfp

RAN Silicon Rethink- Part II; vRAN and General-Purpose Compute

RAN silicon rethink – from purpose built products & ASICs to general purpose processors or GPUs for vRAN & AI RAN

Big Tech AI spending binge results in massive job cuts!

Executive Summary:

The tech industry is undergoing a massive structural realignment. Hyperscalers, Software as a Service (SaaS) vendors, and telecom network and equipment providers are aggressively slashing workforces to reallocate capital toward massive AI infrastructure investments.  Alphabet, Meta, Amazon, and Microsoft are projected to spend a collective $674 billion in 2026—over double their 2024 levels.  Most of that spending is AI related.

From the referenced WSJ article:

“Tech companies are in effect playing a game of chicken with each other on capital-spending plans. They are shelling out as much as they can—more than their rivals, they hope—on AI chips and data centers that could put them in the lead in a race they feel they can’t afford to lose. That in turn is heightening competition over who can use AI to help do more with a lot less, freeing up money to spend on expensive chips.”

Hyperscalers such as Microsoft and Meta Platforms (Meta) are the latest to significantly reduce their workforces to scale AI-driven operations. Meta is reportedly reducing its headcount by approximately 8,000, while Microsoft has initiated a “voluntary retirement program” (aka a buyout) targeting 7% of its U.S. workforce, a strategic move to trim payroll before resorting to involuntary layoffs.

This trend is industry-wide: Oracle and Snap have executed significant reductions, while Block announced plans to cut 40% of its staff (over 4,000 employees).  March 2026 represented a two-year peak in tech industry contraction, with Layoffs.fyi reporting 45,800 tech job reductions.

…………………………………………………………………………………………..

Source:  Layoffs.fyi
……………………………………………………………………………………………………………………

The AI Transformation Narrative vs. Financial Reality:

Executive leadership is framing these cuts as a strategic pivot toward an AI-native future where automated workflows replace legacy human-centric processes. While CEOs like Block’s Jack Dorsey insist these decisions aren’t driven by distress, a “game of chicken” is unfolding in capital planning.

Companies are locked in an escalating race to secure AI silicon (GPUs) and High Bandwidth Memory (HBM) and to expand data center footprints, creating a massive drain on liquidity.  This heightens the pressure to “do more with less”: using AI to automate internal functions and free up the capital necessary for expensive infrastructure. However, in many cases, these cuts are simply corrective measures for pandemic-era overhiring or efforts to normalize efficiency metrics:

  • Oracle: Annual revenue per employee remains significantly below industry leaders like Microsoft.
  • Snap: Headcount remains 65% above pre-COVID levels despite consistent operating losses.

Strategic Risks and “Off-Balance-Sheet” Engineering:

While slashing headcounts improves Revenue Per Employee (RPE)—a key KPI for Wall Street—it introduces significant long-term risks:

  • Talent Attrition & Brain Drain: Aggressive layoffs degrade morale and may drive elite engineering talent toward startups, potentially creating new competitors.
  • Governance & Safety: Reducing human oversight during AI deployment could lead to safety and business model integration failures.
  • Regulatory & Public Backlash: The “AI as a job killer” narrative is fueling community opposition to massive data center builds, complicating infrastructure rollouts.

The CAPEX Burden:

The financial strain is becoming evident even for “Deep Pocket” firms. Alphabet, Meta, Amazon, and Microsoft are projected to spend $674 billion in CAPEX this year—more than double their 2022 spend.

  • Amazon is projected to be cash-flow negative this year.
  • Meta’s CAPEX is set to exceed 50% of its annual revenue, with its debt-to-equity ratio climbing to 39% (up from 8% five years ago).
  • Some firms are reportedly utilizing “off-balance-sheet financial wizardry” to maintain their AI compute growth without alarming debt markets.

Verdict of the Market?

Markets are sending mixed signals. While analysts are obsessed with efficiency metrics (questions about efficiency on earnings calls have tripled in two years), they are becoming “skittish” regarding unbridled spending. Tesla (TSLA), for instance, saw a 4% stock dip after raising its spending target to $25 billion.

Ultimately, tech giants—who already average $2M in annual revenue per employee—are betting that further workforce reductions will juice efficiency and fund the AI arms race. The trade-off remains whether these “leaner” organizations can maintain the innovation and safety standards required to lead the next technological cycle.

………………………………………………………………………………………………………..

The telecom sector is particularly vulnerable, as AI-native “zero-touch” operations begin to replace legacy roles permanently.

  • Network Operators: BT has announced plans to replace up to 10,000 roles with AI by 2030, specifically targeting network management and customer service.
  • Network Equipment Vendors: Equipment giants Ericsson and Nokia have collectively shed over 36,000 roles in recent years, pivoting from traditional hardware to AI-optimized software and networking.
  • Integrators: Accenture and IBM are utilizing AI to automate junior-level coding and back-office HR tasks, signaling that AI reskilling is now a prerequisite for workforce retention.

Strategic Outlook – Monetization and the “RPE” Battle:   

For both MNOs and tech giants, the coming years are about monetization. Investors have shifted from cheering bold AI visions to demanding tangible results, with a heavy focus on Revenue Per Employee (RPE)—a metric that workforce reductions are designed to “juice.”

That “Great Realignment” is a high-stakes gamble, in this author’s opinion.  The firms that successfully bridge the gap between massive infrastructure investments and scalable, profitable AI-native services will lead the next generation of global technology. Those that fail to balance efficiency with talent retention may find themselves outpaced by leaner, AI-native startups born from the very talent they have released.

……………………………………………………………………………………………………………….

References:

https://www.wsj.com/tech/ai/the-ai-splurge-is-costing-big-tech-its-workforce-34a88e68

AI spending boom accelerates: Big tech to invest an aggregate of $400 billion in 2025; much more in 2026!

AI infrastructure spending boom: a path towards AGI or speculative bubble?

Gartner: AI spending >$2 trillion in 2026 driven by hyperscalers data center investments

AI spending is surging; companies accelerate AI adoption, but job cuts loom large

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Canalys & Gartner: AI investments drive growth in cloud infrastructure spending

Will Google Cloud’s AI and data analytics revenue +TPU IP licensing income offset huge AI CAPEX to produce a decent ROI?

An April 24th Investor’s Business Daily (IBD) article asserts that Google’s AI position is strong, but the real test will be monetization.  Specifically, can Gemini translate its technical lead and user scale into durable profits for parent company Alphabet?  The company has benefited from AI enthusiasm and Google Cloud momentum, but investors are now focused on whether heavy AI spending will generate sufficient revenues to justify the enormous capex ramp-up.  The article highlights Gemini’s growing traction, Google Cloud’s rapid expansion, and a very large backlog as signs of demand, but it also stresses that those positives must offset rising infrastructure costs.

With its Gemini family, Google continues to push its AI technology across the “stack” (see quote below), deploying it to Google Maps, enterprise Workspace productivity tools, and YouTube’s content and ad platforms. AI technology is even making Google’s autonomous vehicle company, Waymo, better and safer amid its large market expansion.

A key theme is that Google has multiple ways to earn revenue from AI, including consumer subscriptions, enterprise software, and cloud services. The article points to Gemini Advanced as an example of paid AI packaging, while also implying that the larger opportunity is converting AI usage into higher-value cloud and platform revenue rather than just user growth. However, Alphabet is planning very large AI infrastructure spending (much more below), and the article questions whether the company can turn that investment into sustainable high-margin revenue fast enough to satisfy investors.

Google has also ventured into AI semiconductors with its AI accelerator Tensor Processing Unit, known as TPU, co-developed with Broadcom and manufactured by TSMC (Taiwan Semiconductor Manufacturing Company). Google is shifting future TPU generation designs to include MediaTek for design support, with TSMC continuing as the primary fabrication partner for advanced 2nm, 3nm, and 5nm nodes.

Google has recently introduced the 7th-gen “Ironwood” TPU 7x and revealed plans for the 8th-gen TPU 8t and TPU 8i for 2027.  Longtime colleague Amin Vahdat, PhD, wrote in a blog post, “We are introducing the eighth generation of Google’s custom Tensor Processor Unit (TPU), coming soon with two distinct, purpose-built architectures for training and inference: TPU 8t and TPU 8i. These two chips are designed to power our custom-built supercomputers, to drive everything from cutting-edge model training and agent development, to massive inference workloads. TPUs have been powering leading foundation models, including Gemini, for years. These 8th generation TPUs together will deliver scale, efficiency and capabilities across training, serving and agentic workloads.”

Image credit:  Google.

Indeed, Google’s TPUs have emerged as a threat to Nvidia’s dominance in the AI chip market. Anthropic has licensed Google’s TPU accelerators for use in data centers. Broadcom will modify the TPUs for Anthropic before the customized chips are made by TSMC. Wells Fargo estimates that Google could bring in over $10 billion in high-margin intellectual property (IP) licensing fees from TPUs in 2026 and 2027.

“What stands out about Google is that they’ve been investing up and down the technology stack, from silicon to the AI models,” said Daniel Flax, managing director at investment management firm Neuberger Berman. “While competition is fierce, they’ve been able to innovate. What we’re focused on is (Google’s) ability to execute on their product road map from one generation of AI models to the next.”

………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

AI Competition from OpenAI and Anthropic:

Google faces plenty of AI competition from other hyperscalers (Amazon, Microsoft, Meta, etc.) and especially from two private AI companies:

  1. OpenAI remains a major AI player, powered by the rapid advance of ChatGPT, which launched in 2022.  In its latest funding round, OpenAI landed $122 billion in capital commitments, which values the company at $852 billion. OpenAI’s next-generation AI model, GPT-6, could arrive as soon as late 2026.  GPT-6 is expected to include new memory features that support the personalization of AI chatbots. It is also expected to offer more support for autonomous AI agents that perform tasks over the internet.
  2. Anthropic’s Claude AI model family has grabbed the spotlight this year. With Claude-based coding and other AI tools, Anthropic shook up the enterprise software market.  Anthropic is preparing a next-generation, more powerful AI model called Mythos.  Anthropic recently raised $30 billion in a funding round that valued the AI company at $380 billion.

………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

AI Cloud Competition:

Google’s cloud computing business is one area that should benefit from the company’s AI spending. The unit has excellent momentum. Cloud revenue climbed 47% to over $16 billion in the December quarter, up from 34% growth in the previous quarter. And Google’s cloud computing sales backlog grew 55% from the September quarter to $240 billion.  AWS still has the largest cloud market share, with Azure second and Google Cloud third.  Google Cloud’s edge is AI and data analytics, especially through Vertex AI, Gemini-related services, and TPU-based infrastructure. The company has developed Gemini AI models targeting specific industries, such as financial services and pharmaceuticals.  With the recent $32 billion purchase of Wiz, Google plans to offer AI-based cybersecurity threat detection tools.

Google Cloud is growing faster than AWS on an AI-driven basis, but it still trails Azure in the most AI-sensitive growth comparisons and remains third in overall cloud share. The broad pattern is: AWS leads in scale, Azure leads in AI momentum and enterprise pull, and Google Cloud is the strongest “AI-first” challenger with faster growth than AWS but a smaller base.  Recent comparisons show AWS revenue growth around 18% year over year, while Google Cloud grew about 32%, and Azure’s estimated growth was about 39% in the same period.

Microsoft’s reported Intelligent Cloud segment growth was also faster than that of AWS. The rough share split cited in recent coverage is AWS about 30%, Azure about 20%, and Google Cloud about 13%.  Azure’s edge is enterprise distribution and the Azure OpenAI ecosystem, while AWS offers the broadest infrastructure catalog and strong AI tooling but is less clearly identified as the AI growth leader.

Investor takeaway: Google Cloud looks like the fastest-improving AI cloud franchise relative to its size, but not the biggest one. The real question is whether Google’s AI-led growth can stay above AWS while also narrowing the gap with Azure’s enterprise AI momentum.

Monetization is a Major Issue:

Many analysts say it’s unclear how many consumers will pay for AI. Only about 5% of ChatGPT’s user base is paid.  “Consumer AI is becoming a distribution channel and brand builder, while enterprise agents are where the high-margin, sticky revenue is actually getting locked in,” Ben Lorica, editor of the Gradient Flow AI newsletter, told IBD in an interview. “Widespread platform promiscuity across ChatGPT, Gemini and Claude signals low switching costs and thin margins, which is not a great recipe for durable revenue.”

“Cloud, AI revenues have to scale fast enough for people to say, ‘OK, this is actually working,'” said Michael Landsberg, chief executive of Landsberg Bennett Private Wealth Management. “With Google, a lot of things are going very well, but when is it going to translate into money in the pocket? Gemini is doing really well gaining market share from ChatGPT. But there’s no money yet,” Landsberg added. “The big issue around Google search is, ‘Are they going to be able to put advertising in Gemini?'”

“I think most people want free AI because we’ve been trained that free is how we do this computer thing,” said Kimberly Forrest, Bokeh Capital Partners’ chief investment officer. “Facebook, Instagram — it’s all free now. There might be some people willing to spend $20 monthly on AI, but probably not enough to generate the income that these models need to be continually improved.”

Alphabet has historically monetized consumer products through advertising rather than subscriptions. “I think the average consumer doesn’t want to pay for AI, and if they do, they certainly don’t want to pay much for AI,” said Tim Ghriskey, senior portfolio strategist at Ingalls & Snyder.

Author’s Note:  I regularly use Gemini for Home on my Google Smart Speaker and a different Gemini on PCs and my Samsung phone.  There’s a huge difference in performance, with the former making many more mistakes and “AI Hallucinations” than the latter.   The reason is that Gemini for Home and regular Gemini run on two totally different AI systems.  For reasons neither I nor Gemini for Home can explain, the Home version is severely deficient, with many wrong answers and hallucinations that you don’t get when you use Gemini on a PC or the Gemini app on a smartphone.

One particularly bothersome Gemini for Home response to a question asked or a complaint is: “These pictures should match” or “Here are your photos” or “check out these pictures” with corresponding pics/photos displayed on the speaker’s screen.

–>THAT HAS ABSOLUTELY NOTHING TO DO WITH ANYTHING yet it happens frequently AFTER the Google speaker promises never to repeat it!  Ugggh!!!!

……………………………………………………………………………………………………………………………………….

Google/Alphabet’s Surging CAPEX and ROI:

Alphabet said its 2026 capex will be $175 billion to $185 billion, and management has framed the spending as overwhelmingly AI/infrastructure-related, intended to support revenue growth in Google Cloud, Gemini, and AI-enhanced Search.

The clearest breakdown disclosed to date is roughly 60% to servers and 40% to data centers and networking equipment. Using the company’s forward guidance ranges:

  • AI Compute Servers: about $105 billion to $111 billion.

  • Data centers and networking equipment: about $70 billion to $74 billion.

That means most of the spend is going into fast-depreciating compute hardware, with the rest funding the physical and network buildout needed to host AI workloads. Google says the investment is meant to expand AI compute, support Google Cloud demand, and scale Gemini and enterprise AI offerings.
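As a quick check on the arithmetic, the dollar ranges above follow from applying the roughly 60/40 split to the $175 billion to $185 billion guidance range. The short sketch below reproduces that calculation; the function name `split_capex` is a hypothetical helper used only for illustration, and the split itself is approximate.

```python
# Illustrative arithmetic only. The 60/40 split and the $175B-$185B guidance
# range come from the text; split_capex is a hypothetical helper name.

def split_capex(total_billion, server_share=0.60):
    """Return (servers, data_centers_and_networking) in billions of dollars."""
    servers = total_billion * server_share
    facilities = total_billion - servers
    return servers, facilities

low_servers, low_facilities = split_capex(175)
high_servers, high_facilities = split_capex(185)

print(f"Servers: ${low_servers:.0f}B to ${high_servers:.0f}B")
print(f"Data centers and networking: ${low_facilities:.0f}B to ${high_facilities:.0f}B")
```

Rounded to the nearest billion, the results match the $105 billion to $111 billion and $70 billion to $74 billion figures quoted in the breakdown.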

The company also pointed to a $240 billion cloud backlog and strong cloud revenue growth as signs that the spending is tied to real demand rather than just speculative buildout.  The key issue for investors is whether this capital intensity converts into enough cloud and AI revenue to justify the return profile.  Alphabet has not given a specific ROI number for its 2026 AI investments. What it has said, and what analysts infer, is that the return should come from faster cloud growth, higher AI-related search usage, and paid enterprise adoption rather than a near-term accounting yield.

In conclusion, 2026 is an AI scale-up year for Google, but the ROI question is still open.

………………………………………………………………………………………………………………………………………………………..

References:

Google’s AI Reckoning: Can Gemini Turn Dominance Into Dollars?

 

https://blog.google/innovation-and-ai/infrastructure-and-cloud/google-cloud/eighth-generation-tpu-agentic-era/

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Big tech spending on AI data centers and infrastructure vs the fiber optic buildout during the dot-com boom (& bust)

AI infrastructure spending boom: a path towards AGI or speculative bubble?

Expose: AI is more than a bubble; it’s a data center debt bomb

China vs U.S.: Race to Generate Power for AI Data Centers as Electricity Demand Soars

Anthropic’s Project Glasswing aims to reshape IT cybersecurity

IDC Survey of Networking Leaders: Enterprise AI progress stalls despite ambitious goals

Will “AI at the Edge” transform telecom or be yet another telco monetization failure?

Nvidia Survey Reveals How Telcos Plan to Use AI; Quantifying ROI is a Challenge

Analysis: Cisco, HPE/Juniper, and Nvidia network equipment for AI data centers

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

 

STL Partners webinar: Agentic AI needed for RAN autonomy & efficiency

Yesterday, a STL Partners webinar titled “Turning autonomy into margin: Agentic AI and the autonomous RAN,” suggested agentic AI is the missing layer that can turn RAN autonomy from a technical goal into a direct profit margin booster. It argues that operators should prioritize autonomy use cases by business impact, not just by how much automation coverage they add, and that the right roadmap can move autonomy from an engineering KPI to a commercial advantage.

The central message was that autonomy only matters if it improves economics (see poll results below). The webinar revealed that network operators need a dual-axis framework that combines the usual autonomous-network maturity view with a value-creation lens, so they can focus on the capabilities that scale into measurable business outcomes.

Agentic AI is presented as the practical enabler for moving beyond human-in-the-loop operations. In this framing, agents help orchestrate tasks, make decisions, and coordinate network actions in ways that support more closed-loop automation than traditional workflows can deliver.

The results of an “actuality” poll on RAN autonomy showed that controlling costs and improving reliability mattered most, with enabling new revenue growth through APIs and sensing cited by only 10.87% of respondents.  Similarly, results of an “aspirations” poll on RAN autonomy were fairly evenly spread between reducing costs and optimizing the customer experience, with just 13.21% citing new revenue growth.

Source: STL Partners

Terje Jensen, SVP, global business security officer and head of network and cloud technology strategy at Telenor, said that he had expected to see network operators’ aspirations shift more clearly towards improving customer experience and even revenue generation, not just efficiency.

Darwin Janz, strategic technology planner at SaskTel, also thought network operators’ ambitions would be higher, but he noted that they still struggle to identify concrete, monetizable use cases. Without that, there’s a real risk of building technical solutions in search of a problem, rather than starting from clear enterprise needs and value, Janz noted. “We really need to see those use cases and enterprise customer needs,” he added.

……………………………………………………………………………………………………………………….

The webinar was built around four practical questions:

  1. Which use cases create real commercial impact?
  2. How to shift from autonomy as an engineering metric to a margin driver?
  3. Where does agentic AI add value today?
  4. What data, orchestration, and organizational foundations are needed to scale beyond pilots?

For network operators, the implication is that autonomous RAN strategy should be tied to P&L outcomes such as lower operating cost, better resource utilization, and faster optimization cycles. The webinar’s message is that autonomy becomes strategically important only when it is deployed in a way that compounds across the network and business.

…………………………………………………………………………………………………………………..

References:

https://www.lightreading.com/network-automation/telcos-showing-limited-aspiration-for-ran-autonomy-benefit

The Financial Trap of Autonomous Networks: Scaling Agentic AI in the Telecom Core

Nokia to showcase agentic AI network slicing; Ericsson partners with Ookla to measure 5G network slicing performance

 

 

Anthropic’s Project Glasswing aims to reshape IT cybersecurity

Backgrounder:

Late last year, Anthropic said that state-sponsored Chinese hackers had used its artificial intelligence (AI) technology in an effort to infiltrate the computer systems of roughly 30 companies and government agencies around the world. The company said it was the first reported case of a cyberattack in which AI technologies had gathered sensitive information with limited help from human operators.

As Anthropic and its chief rival, OpenAI, prepare to release new and more powerful AI systems, cybersecurity experts are increasingly vocal in their warnings that AI is fundamentally changing cybersecurity.  AI technology could allow hackers to identify security holes in computer systems far faster than in the past, vastly raising the stakes in the decades-long fight between hackers and the security experts guarding computer networks.  As hackers deploy AI to break and steal, security experts are also leaning on AI to spot flaws in their systems — including some that had gone unnoticed for decades.

“This is the most change in the cyber environment, ever,” said Francis deSouza, the chief operating officer and president of security products at Google Cloud. “You have to fight AI with AI.”

Hackers have used AI chatbots to draft phishing emails and ransom notes, cybersecurity experts said. Others have used AI to parse large quantities of stolen data and determine what information might be valuable. Without help from AI, attackers could sometimes break into computer networks within minutes, Mr. deSouza said, but with the help of AI, breaches can take just seconds.  Some hackers specialize in breaking into systems and then selling off their access to other attackers. Those handoffs used to take as much as eight hours, as hackers negotiated the sales and passed along the compromised entry points, deSouza added. Now that process has accelerated to about 20 seconds, he said, with hackers sometimes using AI agents to speed up the process.

Some experts argue that the guardrails added by companies like Anthropic and OpenAI can actually provide an advantage to malicious attackers. Guardrails could cause an AI chatbot to deny help to a user trying to defend a system from an attack, they argue, but persistent hackers could be more diligent about finding vulnerabilities — and keeping those tricks to themselves.

In February, Anthropic said it had used its AI technologies to find over 500 so-called zero-day vulnerabilities (security holes that were unknown to software makers) in various pieces of commonly used open source software. The next month, a researcher at Anthropic revealed that he had used AI to find a serious security vulnerability in the core of the Linux operating system, which is software that powers much of the internet and is used in computer servers, cloud computing services, Android phones and Teslas. The bug had existed, apparently undiscovered, since 2003.

Project Glasswing Overview:

Anthropic has announced Project Glasswing – a new initiative that brings together Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks – in an effort to secure the world’s most critical software.

The fast-growing private AI company has found that AI models (like its own Claude) have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. Its Mythos Preview language model has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser.

Given the rate of AI progress, it will not be long before such capabilities proliferate, potentially beyond actors who are committed to deploying them safely. The fallout—for economies, public safety, and national security—could be severe. Project Glasswing is an urgent attempt to put these capabilities to work for defensive purposes.

The Project Glasswing partners will use Mythos Preview as part of their defensive security work. Anthropic will share what it learns so the entire IT industry can benefit. It has also extended access to a group of over 40 additional organizations that build or maintain critical software infrastructure so they can use the model to scan and secure both first-party and open-source systems.

Anthropic is committing up to $100M in usage credits for Mythos Preview across these efforts, as well as $4M in direct donations to open-source security organizations.

Project Glasswing Core Objectives:
  • Give Defenders a Head Start: The initiative aims to use Mythos’s capabilities to find and fix zero-day vulnerabilities in critical codebases before they can be discovered by malicious actors.
  • Secure Critical Infrastructure: Partners use the model to scan first-party systems and open-source software that underpin global banking, energy, and logistics networks.
  • Modernize Defense Practices: Anthropic is collaborating with partners to evolve security workflows, such as patching and disclosure processes, to match the “machine speed” of AI-driven vulnerability discovery.

Claude Mythos Capabilities:
The Glasswing initiative was formed after Anthropic researchers observed that the Mythos model had reached a threshold where its reasoning and coding skills surpassed all but the most skilled human security researchers.
  • Zero-Day Discovery: In early testing, the model autonomously found thousands of high-severity vulnerabilities, including a 27-year-old bug in OpenBSD and a 16-year-old flaw in FFmpeg code that had been scanned by automated tools millions of times without detection.
  • Performance Benchmarks: Mythos Preview scored 83% on the CyberGym cybersecurity benchmark, significantly outperforming previous models like Claude Opus.

 

References:

https://www.anthropic.com/glasswing

https://www.nytimes.com/2026/04/06/technology/ai-cybersecurity-hackers.html

Anthropic Glasswing: AI Vulnerability Detection Has Crossed a Threshold

Anthropic Claude Users Reveal AI Hallucinations as their Top Concern

Nvidia CEO Huang: AI is the largest infrastructure buildout in human history; AI Data Center CAPEX will generate new revenue streams for operators

New Linux Foundation white paper: How to integrate AI applications with telecom networks using standardized CAMARA APIs and the Model Context Protocol (MCP)

Nokia’s AI Applications Study: “Physical AI” may require RAN redesign to support high‑volume, low‑latency uplink traffic

According to Nokia, AI-generated traffic in most mobile networks is at an early stage, with application maturity and adoption by consumers and enterprises only at the start of a broader AI super cycle.  The Finland-based company analyzed more than 50 AI applications and identified three trends: higher uplink traffic, overall data growth, and increasing sensitivity to delay in conversational services such as chat and voice. Also, the mobile network industry is moving toward “AI-RAN” or “6G-native” structures that embed AI into the network, transforming radio sites into “robotic” nodes capable of edge inference and handling these new demands.

–>Do those findings require a structural change in Radio Access Network (RAN) design?  Let’s take a fresh look…..

Mobile networks traditionally support a heterogeneous mix of traffic, ranging from high-throughput video streaming to low-bandwidth, delay-tolerant messaging. Network operators typically address escalating capacity demands through infrastructure expansion and overprovisioning, relying on best-effort delivery—a model that has proven remarkably resilient. However, capacity alone is insufficient for new use cases.

The transition from circuit-switched voice to packet-switched (voice/video/data) IP traffic required a redesign to accommodate variable packet sizes instead of predictable, continuous voice patterns. The proliferation of Internet of Things (IoT) devices introduced requirements for massive machine-type communications (mMTC), driving the development of LTE-M and NB-IoT to optimize for deep indoor penetration and power efficiency.  By contrast, consumer web-based services and video streaming scaled seamlessly by adding RAN and core capacity. Existing AI applications, such as generative AI chatbots, follow this model, making current RAN architectures adequate for the present load.

A paradigm shift is emerging with Physical AI [1.], which enables machines like autonomous vehicles and robots to interact with the environment in real time. Unlike traditional video streaming, these applications cannot leverage buffering to absorb network jitter. In Physical AI, high-definition video frames and sensor data must arrive within stringent time-to-live (TTL) constraints to remain actionable. This shifts the focus from average throughput to consistent low latency. Maintaining this strict QoS, particularly in the uplink, requires abandoning best-effort, overprovisioned models in favor of guaranteed scheduling, which necessitates substantial reserved capacity or specialized AI-RAN functionalities.
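To make the time-to-live idea concrete, here is a minimal sketch of a receiver that discards frames arriving after their delay budget has expired. The 20 ms budget, the `Frame` fields, and the function name are illustrative assumptions, not values or interfaces taken from Nokia’s study.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    seq: int
    captured_ms: float  # capture time at the device (hypothetical field)
    arrived_ms: float   # arrival time at the inference endpoint (hypothetical field)

def actionable_frames(frames, ttl_ms=20.0):
    """Keep only frames whose one-way delay is within the TTL budget.

    ttl_ms is an assumed budget chosen for illustration.
    """
    return [f for f in frames if (f.arrived_ms - f.captured_ms) <= ttl_ms]

frames = [
    Frame(seq=1, captured_ms=0.0, arrived_ms=12.0),   # 12 ms delay: usable
    Frame(seq=2, captured_ms=33.0, arrived_ms=58.0),  # 25 ms delay: stale
    Frame(seq=3, captured_ms=66.0, arrived_ms=84.0),  # 18 ms delay: usable
]
fresh = actionable_frames(frames)
print([f.seq for f in fresh])  # [1, 3]
```

The average delay across these three frames is under 20 ms, yet one frame is still unusable. That is why consistent per-frame latency matters more than average throughput for this class of traffic.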

Note 1. Physical AI combines sensors, perception, decision-making, and actuators so machines can understand their environment and take physical (real world) action. Physical AI is used by robots, vehicles, drones, industrial machines, and smart infrastructure that generate and consume real-time sensor, video, and control traffic. These systems need tight coupling between low latency, high reliability, and continuous feedback loops because decisions in software immediately affect physical motion or control. Physical AI is different from typical generative AI because the output is not text or images; it is real-world action. That makes network performance critical, especially for uplink-heavy, latency-sensitive traffic where delays can affect safety, control accuracy, and operational efficiency.

“Physical AI introduces the possibility that large-volume uplink video with strict latency requirements will become a meaningful part of mobile traffic, creating both a design challenge and a monetization opportunity,” says Harish Viswanathan, Head of the Radio Systems Research Group at Nokia.

Image Credit: Techslang

Delivering uplink video with sub‑20 ms end-to-end latency can require provisioning three to four times the average uplink capacity. While this level of redundancy is manageable for low-bandwidth services such as voice or control signaling, it becomes prohibitively expensive when supporting high-throughput video streams.
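A rough back-of-the-envelope sketch shows what a three-to-four-times multiple means in practice. The stream count and per-stream bitrate below are hypothetical inputs chosen only for illustration, and the model is deliberately linear, so it does not capture any additional headroom that higher device densities may demand.

```python
def provisioned_uplink_mbps(streams, avg_mbps_per_stream, headroom_factor):
    """Uplink capacity to provision when latency targets require a multiple
    of the average load (the 3x-4x multiple is the figure cited in the text)."""
    average_load = streams * avg_mbps_per_stream
    return average_load * headroom_factor

STREAMS = 50               # hypothetical number of concurrent video uplinks
AVG_MBPS_PER_STREAM = 8.0  # hypothetical average bitrate per stream

for factor in (3, 4):
    capacity = provisioned_uplink_mbps(STREAMS, AVG_MBPS_PER_STREAM, factor)
    print(f"{factor}x headroom: {capacity:.0f} Mbps for a 400 Mbps average load")
```

Under these assumed inputs, a 400 Mbps average load would call for 1,200 to 1,600 Mbps of reserved uplink capacity.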

As device densities increase, the required headroom for reserved capacity grows disproportionately, significantly constraining network scalability and driving up cost per bit. This makes Physical AI traffic—characterized by real-time sensor and video inputs for machine analysis—fundamentally different from conventional services, and unsuited to existing best‑effort transport models.  From a Nokia blog post:

“Physical AI will rely on low latency videos to enable real-time control. While the machines or robots will perform most functions locally, there will be situations where they need to rely on more powerful models or human operators to provide remote control via the network. For example, driverless taxis may require remote assistance in unexpected scenarios; service robots may need guidance in complex environments; drones may depend on real‑time video analysis at the point of delivery; and field workers using AR may require timely visual instructions. In all these cases, the network must deliver fresh video information with low and predictable latency.”

To address these challenges, telecom operators are expected to adopt a multi‑layer approach encompassing network architecture, traffic management, and service monetization.

At the Application layer, not all traffic requires identical latency treatment. When video or sensor data is processed by AI rather than consumed by humans, only semantically relevant information may need immediate uplink transmission. This emerging paradigm, known as semantic communication, allows for significant data reduction while preserving information integrity within latency‑critical loops.
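One very simplified way to picture this idea is a device that uplinks a frame only when what it detects has changed. The sketch below is a plain change-detection filter used as a stand-in; it is not Nokia’s method or a standardized semantic communication scheme, and the label sets are assumed to come from a hypothetical on-device detector.

```python
def select_for_uplink(detections_per_frame):
    """Return indices of frames whose detected-object set differs from the
    last frame that was sent. Input is one collection of labels per frame."""
    sent = []
    last_sent = None
    for index, labels in enumerate(detections_per_frame):
        current = frozenset(labels)
        if current != last_sent:
            sent.append(index)
            last_sent = current
    return sent

# Hypothetical detector output for five consecutive frames.
frames = [
    {"road"},
    {"road"},
    {"road", "pedestrian"},
    {"road", "pedestrian"},
    {"road"},
]
print(select_for_uplink(frames))  # [0, 2, 4]
```

In this toy case only three of five frames are transmitted, while every change in the scene still reaches the remote model.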

Within the network domain, established mechanisms such as Quality of Service (QoS) and network slicing remain essential. QoS enables prioritization of specific traffic classes, while slicing supports logically isolated virtual networks with guaranteed service-level attributes—latency, jitter, bandwidth, and reliability.
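The prioritization half of that statement can be illustrated with a small strict-priority queue, in which latency-critical packets are always served before best-effort ones. This is a teaching sketch built on Python’s standard `heapq` module, not a 3GPP scheduler; the class name, priority values, and packet labels are all hypothetical.

```python
import heapq
from itertools import count

class PriorityScheduler:
    """Strict-priority queue: lower number is served first, FIFO within a class."""

    def __init__(self):
        self._heap = []
        self._arrival = count()  # tie-breaker that preserves arrival order

    def enqueue(self, priority, packet):
        heapq.heappush(self._heap, (priority, next(self._arrival), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)

scheduler = PriorityScheduler()
scheduler.enqueue(2, "web browsing")        # best-effort class
scheduler.enqueue(0, "robot video frame")   # latency-critical class
scheduler.enqueue(2, "software update")     # best-effort class
scheduler.enqueue(0, "teleoperation command")

order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(order)
```

Both latency-critical items leave the queue before any best-effort traffic, regardless of arrival order. Strict priority can starve lower classes under sustained load, which is one reason slicing with guaranteed attributes is used alongside it.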

At the service and business model level, supporting low-latency, bandwidth-intensive applications reshapes network economics. Operators must evolve beyond best‑effort pricing structures toward differentiated service tiers or performance-based charging models aligned with enterprise and industrial use cases.

For the RAN, Physical AI underscores the need for greater programmability and elasticity. Future RAN designs will depend on dynamic resource allocation, real-time traffic classification, and AI-driven orchestration to balance throughput, latency, and reliability at scale.

As Physical AI deployments expand—from autonomous mobility to precision manufacturing and tele‑robotics—managing high‑volume, low‑latency uplink traffic will become a defining capability for next‑generation network strategy and differentiation. Unlike conventional mobile data, Physical AI cannot rely on buffering to manage traffic spikes. The requirement for continuous video and sensor data to arrive within strict time limits to inform real-time actions makes traditional “best-effort” network approaches inefficient and costly.

Reasons for RAN Redesign:
  • Uplink-Centric Demand: Physical AI shifts the network requirement from downlink-heavy (human consumption) to uplink-heavy (machine-generated) traffic.
  • Strict Latency & Throughput: Maintaining consistent low latency (e.g., around 20 milliseconds) for high-volume video uploads can require 3x to 4x more capacity than average, making overprovisioning unsustainable.
  • Need for Programmable Architectures: To support this, RAN must move toward more flexible, AI-native architectures that prioritize critical data and provide deterministic, rather than best-effort, performance.
  • Semantic Communication: To reduce data volume while maintaining performance, the RAN will need to adopt semantic communication—transmitting only the essential data needed for the AI to make decisions.

………………………………………………………………………………………………………………………………………………………..

References:

https://www.nokia.com/asset/215147/

https://www.nokia.com/blog/physical-ai-redefining-ran-and-telco-monetization/

https://telcomagazine.com/news/nokia-report-points-to-ai-driven-shift-in-mobile-traffic

What Is Physical AI?

Arm Holdings unveils “Physical AI” business unit to focus on robotics and automotive

Is the “far edge” a bridge to far to cross for AI inferencing? What about “Distributed AI Grids”?

The Financial Trap of Autonomous Networks: Scaling Agentic AI in the Telecom Core

Ericsson and Intel collaborate to accelerate AI-Native 6G; other AI-Native 6G advancements at MWC 2026

NVIDIA and global telecom leaders to build 6G on open and secure AI-native platforms + Linux Foundation launches OCUDU

Comparing AI Native mode in 6G (IMT 2030) vs AI Overlay/Add-On status in 5G (IMT 2020)

AI-RAN Reality Check: hype vs hesitation, shaky business case, no specific definition, no standards?
