Nvidia – IEEE ComSoc Technology Blog

Nvidia’s networking solutions give it an edge over competitive AI chip makers

Posted on August 6, 2025 by Alan Weissberger

Nvidia’s networking equipment and module sales accounted for $12.9 billion of its $115.1 billion in data center revenue in its prior fiscal year. Composed of its NVLink, InfiniBand, and Ethernet solutions, Nvidia’s networking products (from its Mellanox acquisition) are what allow its GPU chips to communicate with each other, let servers talk to each other inside massive data centers, and ultimately ensure end users can connect to it all to run AI applications.

“The most important part in building a supercomputer is the infrastructure. The most important part is how you connect those computing engines together to form that larger unit of computing,” explained Gilad Shainer, senior vice president of networking at Nvidia.

In Q1-2025, networking made up $4.9 billion of Nvidia’s $39.1 billion in data center revenue. And it’ll continue to grow as customers continue to build out their AI capacity, whether that’s at research universities or massive data centers.

“It is the most underappreciated part of Nvidia’s business, by orders of magnitude,” Deepwater Asset Management managing partner Gene Munster told Yahoo Finance. “Basically, networking doesn’t get the attention because it’s 11% of revenue. But it’s growing like a rocket ship. “[Nvidia is a] very different business without networking,” Munster explained. “The output that the people who are buying all the Nvidia chips [are] desiring wouldn’t happen if it wasn’t for their networking.”

Nvidia senior vice president of networking Kevin Deierling says the company has to work across three different types of networks:

NVLink technology connects GPUs to each other within a server or multiple servers inside of a tall, cabinet-like server rack, allowing them to communicate and boost overall performance.
InfiniBand connects multiple server nodes across data centers to form what is essentially a massive AI computer.
Ethernet connectivity for front-end network for storage and system management.

Note: Industry groups also have their own competing networking technologies including UALink, which is meant to go head-to-head with NVLink, explained Forrester analyst Alvin Nguyen.

“Those three networks are all required to build a giant AI-scale, or even a moderately sized enterprise-scale, AI computer,” Deierling explained. Low latency is key as longer transit times for data going to/from GPUs slows the entire operation, delaying other processes and impacting the overall efficiency of an entire data center.

Nvidia CEO Jensen Huang presents a Grace Blackwell NVLink72 as he delivers a keynote address at the Consumer Electronics Show (CES) in Las Vegas, Nevada on January 6, 2025. Photo by PATRICK T. FALLON/AFP via Getty Images

As companies continue to develop larger AI models and autonomous and semi-autonomous agentic AI capabilities that can perform tasks for users, making sure those GPUs work in lockstep with each other becomes increasingly important.

The AI industry is in the midst of a broad reordering around the idea of inferencing, which requires more powerful data center systems to run AI models. “I think there’s still a misperception that inferencing is trivial and easy,” Deierling said.

“It turns out that it’s starting to look more and more like training as we get to [an] agentic workflow. So all of these networks are important. Having them together, tightly coupled to the CPU, the GPU, and the DPU [data processing unit], all of that is vitally important to make inferencing a good experience.”

Competitor AI chip makers, like AMD are looking to grab more market share from Nvidia, and cloud giants like Amazon, Google, and Microsoft continue to design and develop their own AI chips. However, none of them have the low latency, high speed connectivity solutions provided by Nvidia (again, think Mellanox).

References:

https://finance.yahoo.com/news/nvidias-most-underappreciated-business-is-taking-off-like-a-rocket-ship-183615113.html

https://www.nvidia.com/en-us/networking/

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Nvidia enters Data Center Ethernet market with its Spectrum-X networking platform

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

Does AI change the business case for cloud networking?

The case for and against AI-RAN technology using Nvidia or AMD GPUs

Telecom sessions at Nvidia’s 2025 AI developers GTC: March 17–21 in San Jose, CA

Huawei launches CloudMatrix 384 AI System to rival Nvidia’s most advanced AI system

Posted on July 27, 2025 by Alan Weissberger

On Saturday, Huawei Technologies displayed an advanced AI computing system in China, as the Chinese technology giant seeks to capture market share in the country’s growing artificial intelligence sector. Huawei’s CloudMatrix 384 system made its first public debut at the World Artificial Intelligence Conference (WAIC), a three-day event in Shanghai where companies showcase their latest AI innovations, drawing a large crowd to the company’s booth.

The Huawei CloudMatrix 384 is a high-density AI computing system featuring 384 Huawei Ascend 910C chips, designed to rival Nvidia’s GB200 NVL72 (more below). The AI system employs a “supernode” architecture with high-speed internal chip interconnects. The system is built with optical links for low-latency, high-bandwidth communication. Huawei has also integrated the CloudMatrix 384 into its cloud platform. The system has drawn close attention from the global AI community since Huawei first announced it in April.

The CloudMatrix 384 resides on the super-node Ascend platform and uses high-speed bus interconnection capability, resulting in low latency linkage between 384 Ascend NPUs. Huawei says that “compared to traditional AI clusters that often stack servers, storage, network technology, and other resources, Huawei CloudMatrix has a super-organized setup. As a result, it also reduces the chance of facing failures at times of large-scale training.

Attendees visit a Huawei booth during the World Artificial Intelligence Conference in Shanghai, China July 26, 2025.

Photo Credit: REUTERS/Go Nakamura

Huawei staff at its WAIC booth declined to comment when asked to introduce the CloudMatrix 384 system. A spokesperson for Huawei did not respond to questions. However, Huawei says that “early reports revealed that the CloudMatrix 384 can offer 300 PFLOPs of dense BF16 computing. That’s double of Nvidia GB200 NVL72 AI tech system. It also excels in terms of memory capacity (3.6x) and bandwidth (2.1x).” Indeed, industry analysts view the CloudMatrix 384 as a direct competitor to Nvidia’s GB200 NVL72, the U.S. GPU chipmaker’s most advanced system-level product currently available in the market.

One industry expert has said the CloudMatrix 384 system rivals Nvidia’s most advanced offerings. Dylan Patel, founder of semiconductor research group SemiAnalysis, said in an April article that Huawei now had AI system capabilities that could beat Nvidia’s AI system. The CloudMatrix 384 incorporates 384 of Huawei’s latest 910C chips and outperforms Nvidia’s GB200 NVL72 on some metrics, which uses 72 B200 chips, according to SemiAnalysis. The performance stems from Huawei’s system design capabilities, which compensate for weaker individual chip performance through the use of more chips and system-level innovations, SemiAnalysis said.

Huawei has become widely regarded as China’s most promising domestic supplier of chips essential for AI development, even though the company faces U.S. export restrictions. Nvidia CEO Jensen Huang told Bloomberg in May that Huawei had been “moving quite fast” and named the CloudMatrix as an example.

Huawei says the system uses “supernode” architecture that allows the chips to interconnect at super-high speeds and in June, Huawei Cloud CEO Zhang Pingan said the CloudMatrix 384 system was operational on Huawei’s cloud platform.

According to Huawei, the Ascend AI chip-based CloudMatrix 384 with three important benefits:

Ultra-large bandwidth
Ultra-Low Latency
Ultra-Strong Performance

These three perks can help enterprises achieve better AI training as well as stable reasoning performance for models. They could further retain long-term reliability.

References:

https://www.huaweicentral.com/huawei-launches-cloudmatrix-384-ai-chip-cluster-against-nvidia-nvl72/

https://www.reuters.com/world/china/huawei-shows-off-ai-computing-system-rival-nvidias-top-product-2025-07-26/

https://semianalysis.com/2025/04/16/huawei-ai-cloudmatrix-384-chinas-answer-to-nvidia-gb200-nvl72/

U.S. export controls on Nvidia H20 AI chips enables Huawei’s 910C GPU to be favored by AI tech giants in China

Huawei’s “FOUR NEW strategy” for carriers to be successful in AI era

FT: Nvidia invested $1bn in AI start-ups in 2024

Big Tech and VCs invest hundreds of billions in AI while salaries of AI experts reach the stratosphere

Posted on June 30, 2025 by Alan Weissberger

Introduction:

Two and a half years after OpenAI set off the generative artificial intelligence (AI) race with the release of the ChatGPT, big tech companies are accelerating their A.I. spending, pumping hundreds of billions of dollars into their frantic effort to create systems that can mimic or even exceed the abilities of the human brain. The areas of super huge AI spending are data centers, salaries for experts, and VC investments. Meanwhile, the UAE is building one of the world’s largest AI data centers while Softbank CEO Masayoshi Son believes that Artificial General Intelligence (AGI) will surpass human-level cognitive abilities (Artificial General Intelligence or AGI) within a few years. And that Artificial Super Intelligence (ASI) will surpass human intelligence by a factor of 10,000 within the next 10 years.

AI Data Center Build-out Boom:

Tech industry’s giants are building AI data centers that can cost more than $100 billion and will consume more electricity than a million American homes. Meta, Microsoft, Amazon and Google have told investors that they expect to spend a combined $320 billion on infrastructure costs this year. Much of that will go toward building new data centers — more than twice what they spent two years ago.

As OpenAI and its partners build a roughly $60 billion data center complex for A.I. in Texas and another in the Middle East, Meta is erecting a facility in Louisiana that will be twice as large. Amazon is going even bigger with a new campus in Indiana. Amazon’s partner, the A.I. start-up Anthropic, says it could eventually use all 30 of the data centers on this 1,200-acre campus to train a single A.I system. Even if Anthropic’s progress stops, Amazon says that it will use those 30 data centers to deliver A.I. services to customers.

Amazon is building a data center complex in New Carlisle, Ind., for its work with the A.I. company Anthropic. Photo Credit…AJ Mast for The New York Times

……………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………..

Stargate UAE:

OpenAI is partnering with United Arab Emirates firm G42 and others to build a huge artificial-intelligence data center in Abu Dhabi, UAE. The project, called Stargate UAE, is part of a broader push by the U.A.E. to become one of the world’s biggest funders of AI companies and infrastructure—and a hub for AI jobs. The Stargate project is led by G42, an AI firm controlled by Sheikh Tahnoon bin Zayed al Nahyan, the U.A.E. national-security adviser and brother of the president. As part of the deal, an enhanced version of ChatGPT would be available for free nationwide, OpenAI said.

The first 200-megawatt chunk of the data center is due to be completed by the end of 2026, while the remainder of the project hasn’t been finalized. The buildings’ construction will be funded by G42, and the data center will be operated by OpenAI and tech company Oracle, G42 said. Other partners include global tech investor, AI/GPU chip maker Nvidia and network-equipment company Cisco.

Softbank and ASI:

Not wanting to be left behind, SoftBank, led by CEO Masayoshi Son, has made massive investments in AI and has a bold vision for the future of AI development. Son has expressed a strong belief that Artificial Super Intelligence (ASI), surpassing human intelligence by a factor of 10,000, will emerge within the next 10 years. For example, Softbank has:

Significant investments in OpenAI, with planned investments reaching approximately $33.2 billion. Son considers OpenAI a key partner in realizing their ASI vision.
Acquired Ampere Computing (chip designer) for $6.5 billion to strengthen their AI computing capabilities.
Invested in the Stargate Project alongside OpenAI, Oracle, and MGX. Stargate aims to build large AI-focused data centers in the U.S., with a planned investment of up to $500 billion.

Son predicts that AI will surpass human-level cognitive abilities (Artificial General Intelligence or AGI) within a few years. He then anticipates a much more advanced form of AI, ASI, to be 10,000 times smarter than humans within a decade. He believes this progress is driven by advancements in models like OpenAI’s o1, which can “think” for longer before responding.

Super High Salaries for AI Researchers:

Salaries for A.I. experts are going through the roof and reaching the stratosphere. OpenAI, Google DeepMind, Anthropic, Meta, and NVIDIA are paying over $300,000 in base salary, plus bonuses and stock options. Other companies like Netflix, Amazon, and Tesla are also heavily invested in AI and offer competitive compensation packages.

Meta has been offering compensation packages worth as much as $100 million per person. The owner of Facebook made more than 45 offers to researchers at OpenAI alone, according to a person familiar with these approaches. Meta’s CTO Andrew Bosworth implied that only a few people for very senior leadership roles may have been offered that kind of money, but clarified “the actual terms of the offer” wasn’t a “sign-on bonus. It’s all these different things.” Tech companies typically offer the biggest chunks of their pay to senior leaders in restricted stock unit (RSU) grants, dependent on either tenure or performance metrics. A four-year total pay package worth about $100 million for a very senior leader is not inconceivable for Meta. Most of Meta’s named officers, including Bosworth, have earned total compensation of between $20 million and nearly $24 million per year for years.

Meta CEO Mark Zuckerberg on Monday announced its new artificial intelligence organization, Meta Superintelligence Labs, to its employees, according to an internal post reviewed by The Information. The organization includes Meta’s existing AI teams, including its Fundamental AI Research lab, as well as “a new lab focused on developing the next generation of our models,” Zuckerberg said in the post. Scale AI CEO Alexandr Wang has joined Meta as its Chief AI Officer and will partner with former GitHub CEO Nat Friedman to lead the organization. Friedman will lead Meta’s work on AI products and applied research.

“I’m excited about the progress we have planned for Llama 4.1 and 4.2,” Zuckerberg said in the post. “In parallel, we’re going to start research on our next generation models to get to the frontier in the next year or so,” he added.

On Thursday, researcher Lucas Beyer confirmed he was leaving OpenAI to join Meta along with the two others who led OpenAI’s Zurich office. He tweeted: “1) yes, we will be joining Meta. 2) no, we did not get 100M sign-on, that’s fake news.” (Beyer politely declined to comment further on his new role to TechCrunch.) Beyer’s expertise is in computer vision AI. That aligns with what Meta is pursuing: entertainment AI, rather than productivity AI, Bosworth reportedly said in that meeting. Meta already has a stake in the ground in that area with its Quest VR headsets and its Ray-Ban and Oakley AI glasses.

VC investments in AI are off the charts:

Venture capitalists are strongly increasing their AI spending. U.S. investment in A.I. companies rose to $65 billion in the first quarter, up 33% from the previous quarter and up 550% from the quarter before ChatGPT came out in 2022, according to data from PitchBook, which tracks the industry.

This astounding VC spending, critics argue, comes with a huge risk. A.I. is arguably more expensive than anything the tech industry has tried to build, and there is no guarantee it will live up to its potential. But the bigger risk, many executives believe, is not spending enough to keep pace with rivals.

“The thinking from the big C.E.O.s is that they can’t afford to be wrong by doing too little, but they can afford to be wrong by doing too much,” said Jordan Jacobs, a partner with the venture capital firm Radical Ventures. “Everyone is deeply afraid of being left behind,” said Chris V. Nicholson, an investor with the venture capital firm Page One Ventures who focuses on A.I. technologies.

Indeed, a significant driver of investment has been a fear of missing out on the next big thing, leading to VCs pouring billions into AI startups at “nosebleed valuations” without clear business models or immediate paths to profitability.

Conclusions:

Big tech companies and VCs acknowledge that they may be overestimating A.I.’s potential. Developing and implementing AI systems, especially large language models (LLMs), is incredibly expensive due to hardware (GPUs), software, and expertise requirements. One of the chief concerns is that revenue for many AI companies isn’t matching the pace of investment. Even major players like OpenAI reportedly face significant cash burn problems. But even if the technology falls short, many executives and investors believe, the investments they’re making now will be worth it.

References:

https://www.nytimes.com/2025/06/27/technology/ai-spending-openai-amazon-meta.html

Meta is offering multimillion-dollar pay for AI researchers, but not $100M ‘signing bonuses’

https://www.theinformation.com/briefings/meta-announces-new-superintelligence-lab

OpenAI partners with G42 to build giant data center for Stargate UAE project

AI adoption to accelerate growth in the $215 billion Data Center market

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers

OpenAI partners with G42 to build giant data center for Stargate UAE project

Posted on May 24, 2025 by Alan Weissberger

OpenAI, the maker of ChatGPT, said it was partnering with United Arab Emirates firm G42 and others to build a huge artificial-intelligence data center in Abu Dhabi, UAE. It will be the company’s first large-scale project outside the U.S. OpenAI and G42 said Thursday the data center would have a capacity of 1 gigawatt (1 GW) [1], putting it among the most powerful in the world. OpenAI and G42 didn’t disclose a cost for the huge data center, although similar projects planned in the U.S. run well over $10 billion.

………………………………………………………………………………………………………………………………………………………………………..

Note 1. 1 GW of continuous power is enough to run roughly one million top‑end Nvidia GPUs once cooling and power‑conversion overheads are included. That’s roughly the annual electricity used by a city the size of San Francisco or Washington.

“Think of 1MW as the backbone for a mid‑sized national‑language model serving an entire country,” Mohammed Soliman, director of the strategic technologies and cybersecurity programme at the Washington-based Middle East Institute think tank, told The National.

………………………………………………………………………………………………………………………………………………………………………..

The project, called Stargate UAE, is part of a broader push by the U.A.E. to become one of the world’s biggest funders of AI companies and infrastructure—and a hub for AI jobs. The Stargate project is led by G42, an AI firm controlled by Sheikh Tahnoon bin Zayed al Nahyan, the U.A.E. national-security adviser and brother of the president. As part of the deal, an enhanced version of ChatGPT would be available for free nationwide, OpenAI said.

Data centers are grouped into three sizes: small, measuring up to about 1,000 square feet (93 square metres), medium, around 10,000 sqft to 50,000 sqft, and large, which are more than 50,000 sqft, according to Data Centre World. On a monthly basis, they are estimated to consume as much as 36,000kWh, 2,000MW and 10MW, respectively.

UAE has at least 17 data centers, according to data compiled by industry tracker DataCentres.com. AFP

The data-center project is the fruit of months of negotiations between the Gulf petrostate and the Trump administration that culminated in a deal last week to allow the U.A.E. to import up to 500,000 AI chips a year, people familiar with the deal have said.

That accord overturned Biden administration restrictions that limited access to cutting-edge AI chips to only the closest of U.S. allies, given concerns that the technology could fall into the hands of adversaries, particularly China.

To convince the Trump administration it was a reliable partner, the U.A.E. embarked on a multipronged charm offensive. Officials from the country publicly committed to investing more than $1.4 trillion in the U.S., used $2 billion of cryptocurrency from Trump’s World Liberty Financial to invest in a crypto company, and hosted the CEOs of the top U.S. tech companies for chats in a royal palace in Abu Dhabi.

As part of the U.S.-U.A.E agreement, the Gulf state “will fund the build-out of AI infrastructure in the U.S. at least as large and powerful as that in the UAE,” David Sacks, the Trump administration’s AI czar, said earlier this week on social media.

U.A.E. fund MGX is already an investor in Stargate, the planned $100 billion network of U.S. data centers being pushed by OpenAI and SoftBank.

Similar accords with other U.S. tech companies are expected in the future, as U.A.E. leaders seek to find other tenants for their planned 5-gigawatt data-center cluster. The project was revealed last week during Trump’s visit to the region, where local leaders showed the U.S. president a large model of the project.

The U.A.E. is betting U.S. tech giants will want servers running near users in Africa and India, slightly shaving off the time it takes to transmit data there.

Stargate U.A.E. comes amid a busy week for OpenAI. On Wednesday, a developer said it secured $11.6 billion in funding to push ahead with an expansion of a data center planned for OpenAI in Texas. OpenAI also announced it was purchasing former Apple designer Jony Ive’s startup for $6.5 billion.

References:

https://www.wsj.com/tech/open-ai-abu-dhabi-data-center-1c3e384d?mod=ai_lead_pos6

https://www.thenationalnews.com/future/technology/2025/05/24/stargate-uae-ai-g42/

Wedbush: Middle East (Saudi Arabia and UAE) to be next center of AI infrastructure boom

Cisco to join Stargate UAE consortium as a preferred tech partner

U.S. export controls on Nvidia H20 AI chips enables Huawei’s 910C GPU to be favored by AI tech giants in China

Posted on April 21, 2025 by Alan Weissberger

Damage of U.S. Export Controls and Trade War with China:

The U.S. big tech sector, especially needs to know what the rules of the trade game will be looking ahead instead of the on-again/off-again Trump tariffs and trade war with China which includes 145% tariffs and export controls on AI chips from Nvidia, AMD, and other U.S. semiconductor companies.

The latest export restriction on Nvidia’s H20 AI chips are a case in point. Nvidia said it would record a $5.5 billion charge on its quarterly earnings after it disclosed that the U.S. will now require a license for exporting the company’s H20 processors to China and other countries. The U.S. government told the chip maker on April 14th that the new license requirement would be in place “indefinitely.”

Nvidia designed the H20 chip to comply with existing U.S. export controls that limit sales of advanced AI processors to Chinese customers. That meant the chip’s capabilities were significantly degraded; Morgan Stanley analyst Joe Moore estimates the H20’s performance is about 75% below that of Nvidia’s H100 family. The Commerce Department said it was issuing new export-licensing requirements covering H20 chips and AMD’s MI308 AI processors.

Big Chinese cloud companies like Tencent, ByteDance (TikTok’s parent), Alibaba, Baidu, and iFlytek have been left scrambling for domestic alternatives to the H20, the primary AI chip that Nvidia had until recently been allowed to sell freely into the Chinese market. Some analysts suggest that H20 bulk orders to build a stockpile were a response to concerns about future U.S. export restrictions and a race to secure limited supplies of Nvidia chips. The estimate is that there’s a 90 days supply of H20 chips, but it’s uncertain what China big tech companies will use when that runs out.

The inability to sell even a low-performance chip into the Chinese market shows how the trade war will hurt Nvidia’s business. The AI chip king is now caught between the world’s two superpowers as they jockey to take the lead in AI development.

Nvidia CEO Jensen Huang “flew to China to do damage control and make sure China/Xi knows Nvidia wants/needs China to maintain its global ironclad grip on the AI Revolution,” the analysts note. The markets and tech world are tired of “deal progress” talks from the White House and want deals starting to be inked so they can plan their future strategy. The analysts think this is a critical week ahead to get some trade deals on the board, because Wall Street has stopped caring about words and comments around “deal progress.”

Raa | Nurphoto | Getty Images

………………………………………………………………………………………………………………………………………………………………………………………………………

Domestic (China) Alternatives to Nvidia’s H20:

“There are several local Chinese companies that produce chips to compete with Nvidia,” said Brady Wang, associate director at Counterpoint Research. In particular:

Baidu is developing its own AI chips called Kunlun. It recently placed an order for 1,600 of Huawei’s Ascend 910B AI chips for 200 servers. This order was made in anticipation of further U.S. export restrictions on AI chips.
Alibaba (T-Head) has developed AI chips like the Hanguang 800 inference chip, used to accelerate its e-commerce platform and other services.
Cambricon Technologies: Designs various types of semiconductors, including those for training AI models and running AI applications on devices.
Biren Technology: Designs general-purpose GPUs and software development platforms for AI training and inference, with products like the BR100 series.
Moore Threads: Develops GPUs designed for training large AI models, with data center products like the MTT KUAE.
Horizon Robotics: Focuses on AI chips for smart driving, including the Sunrise and Journey series, collaborating with automotive companies.
Enflame Technology: Designs chips for data centers, specializing in AI training and inference.

“With Nvidia’s H20 and other advanced GPUs restricted, domestic alternatives like Huawei’s Ascend series are gaining traction,” said Doug O’Laughlin, an industry analyst at independent semiconductor research company SemiAnalysis. “While there are still gaps in software maturity and overall ecosystem readiness, hardware performance is closing in fast,” O’Laughlin added. According to the SemiAnalysis report, Huawei’s Ascend chip shows how China’s export controls have failed to stop firms like Huawei from accessing critical foreign tools and sub-components needed for advanced GPUs. “While Huawei’s Ascend chip can be fabricated at SMIC, this is a global chip that has HBM from Korea, primary wafer production from TSMC, and is fabricated by 10s of billions of wafer fabrication equipment from the US, Netherlands, and Japan,” the report stated.

Huawei’s New AI Chip May Dominate in China:

Huawei Technologies plans to begin mass shipments of its advanced 910C artificial intelligence chip to Chinese customers as early as next month, according to Reuters. Some shipments have already been made, people familiar with the matter said. Huawei’s 910C, a graphics processing unit (GPU), represents an architectural evolution rather than a technological breakthrough, according to one of the two people and a third source familiar with its design. It achieves performance comparable to Nvidia’s H100 chip by combining two 910B processors into a single package through advanced integration techniques, they said. That means it has double the computing power and memory capacity of the 910B and it also has incremental improvements, including enhanced support for diverse AI workload data.

China’s SMIC is manufacturing some main components of the GPUs using its N+2 7nm process technology although its chip yield rates are low, a source has previously said. At least some of Huawei’s 910C GPUs use semiconductors that were made by TSMC for China-based Sophgo, according to one of the sources and a fourth person. SMIC is under its own export controls, which prevents it from accessing some of the world’s most advanced chipmaking equipment.

Huawei reiterated that it has not used TSMC-made Sophgo [1.] chips. TSMC said it complies with regulatory requirements and it has not supplied Huawei since mid-September 2020. Of course, the 910C GPU will be used in Huawei’s cloud service offering, which has ~ 20% of China’s cloud market

………………………………………………………………………………………………………………………………………………………………………………………………………………..

Note 1. The U.S. Commerce Department has been investigating work done by the Taiwanese contract chip manufacturing giant for Sophgo after one of its TSMC-made chips was found in a 910B processor.

TSMC made nearly three million chips in recent years that matched the design ordered by Sophgo, according to Lennart Heim, a researcher at RAND’s Technology and Security and Policy Center in Arlington, Virginia, who is tracking Chinese developments in AI.

Conclusions:

The U.S. Commerce Department’s latest export curbs on Nvidia’s H20 “will mean that Huawei’s Ascend 910C GPU will now become the hardware of choice for (Chinese) AI model developers and for deploying inference capacity,” said Paul Triolo, a partner at consulting firm Albright Stonebridge Group.

The markets, tech world and the global economy urgently need U.S. – China trade negotiations in some form to start as soon as possible, Wedbush analysts say in a research note today. The analysts expect minimal or no guidance from tech companies during this earnings season as they are “playing darts blindfolded.”

References:

https://www.reuters.com/world/china/huawei-readies-new-ai-chip-mass-shipment-china-seeks-nvidia-alternatives-sources-2025-04-21/

https://qz.com/china-six-tigers-ai-startup-zhipu-moonshot-minimax-01ai-1851768509#

https://www.cnbc.com/2025/04/21/us-chip-controls-boon-for-china-nvidia-rivals-like-huawei-analysts-.html

https://www.huaweicloud.com/intl/en-us/

Goldman Sachs: Big 3 China telecom operators are the biggest beneficiaries of China’s AI boom via DeepSeek models; China Mobile’s ‘AI+NETWORK’ strategy

Telecom sessions at Nvidia’s 2025 AI developers GTC: March 17–21 in San Jose, CA

Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?

FT: Nvidia invested $1bn in AI start-ups in 2024

Omdia: Huawei increases global RAN market share due to China hegemony

Huawei’s “FOUR NEW strategy” for carriers to be successful in AI era

Telecom sessions at Nvidia’s 2025 AI developers GTC: March 17–21 in San Jose, CA

Posted on March 17, 2025 by Alan Weissberger

Nvidia’s annual AI developers conference (GTC) used to be a relatively modest affair, drawing about 9,000 people in its last year before the Covid outbreak. But the event now unofficially dubbed “AI Woodstock” is expected to bring more than 25,000 in-person attendees!

Nvidia’s Blackwell AI chips, the main showcase of last year’s GTC (GPU Technology Conference), have only recently started shipping in high volume following delays related to the mass production of their complicated design. Blackwell is expected to be the main anchor of Nvidia’s AI business through next year. Analysts expect Nvidia Chief Executive Jensen Huang to showcase a revved-up version of that family called Blackwell Ultra at his keynote address on Tuesday.

March 18th Update: The next Blackwell Ultra NVL72 chips, which have one-and-a-half times more memory and two times more bandwidth, will be used to accelerate building AI agents, physical AI, and reasoning models, Huang said. Blackwell Ultra will be available in the second half of this year. The Rubin AI chip, is expected to launch in late 2026. Rubin Ultra will take the stage in 2027.

Nvidia watchers are especially eager to hear more about the next generation of AI chips called Rubin, which Nvidia has only teased at in prior events. Ross Seymore of Deutsche Bank expects the Rubin family to show “very impressive performance improvements” over Blackwell. Atif Malik of Citigroup notes that Blackwell provided 30 times faster performance than the company’s previous generation on AI inferencing, which is when trained AI models generate output. “We don’t rule out Rubin seeing similar improvement,” Malik wrote in a note to clients this month.

Rubin products aren’t expected to start shipping until next year. But much is already expected of the lineup; analysts forecast Nvidia’s data-center business will hit about $237 billion in revenue for the fiscal year ending in January of 2027, more than double its current size. The same segment is expected to eclipse $300 billion in annual revenue two years later, according to consensus estimates from Visible Alpha. That would imply an average annual growth rate of 30% over the next four years, for a business that has already exploded more than sevenfold over the last two.

Nvidia has also been haunted by worries about competition with in-house chips designed by its biggest customers like Amazon and Google. Another concern has been the efficiency breakthroughs claimed by Chinese AI startup DeepSeek, which would seemingly lessen the need for the types of AI chip clusters that Nvidia sells for top dollar.

…………………………………………………………………………………………………………………………………………………………………………………………………………………….

Telecom Sessions of Interest:

Wednesday Mar 19 | 2:00 PM – 2:40 PM

Delivering Real Business Outcomes With AI in Telecom [S73438]

Chris Penrose | VP & Head of Business Development Telco | NVIDIA

Andy Markus | SVP and Chief Data and Artificial Intelligence Officer | AT&T

Kaniz Mahdi | Director Technology, AWS Industries | Amazon Web Services

Anil Kumar | VP, Head of AI Center | Verizon

Hans Bendik Jahren | VP Network and Infrastructure | Telenor

In this session, executives from three leading telcos will share their unique journeys of embedding AI into their organizations. They’ll discuss how AI is driving measurable value across critical areas such as network optimization, customer experience, operational efficiency, and revenue growth. Gain insights into the challenges and lessons learned, key strategies for successful AI implementation, and the transformative potential of AI in addressing evolving industry demands.

Thursday Mar 20 | 11:00 AM – 11:40 AM PDT

AI-RAN in Action [S72987]

Soma Velayutham | VP, AI and Telecoms | NVIDIA

Aji Ed | VP & Head of CloudRAN, Mobile Networks | Nokia

Ryuji Wakikawa | VP, Research Institute of Advanced Technology | Softbank

Freddie Södergren | VP and Head of Technology & Strategy | Ericsson

Karri Kuoppamaki | SVP, Advanced and Emerging Technologies | T-Mobile

Two AI trends are driving the need for an AI-enabled infrastructure at the 5G/6G radio access network (RAN) edge. Inferencing for generative AI and AI agents requires AI compute infrastructure to be distributed from edge to central clouds. At the same time, it’s clear that the RAN is evolving to an AI-native infrastructure. The AI-RAN brings these trends to fruition, providing an accelerated computing infrastructure that accelerates both radio signal processing and AI workloads. In the past six months, both Softbank in Japan and T-Mobile in the United States have taken leading roles toward transforming their networks to AI-RAN. Our panel will explore the motivations and practicalities of delivering a commercial grade AI-RAN network to transform commercial return on investment and increase spectral efficiency, network capacity, and network utilization.

Thursday Mar 20 | 9:00 AM – 9:40 AM PDTHow Indonesia Delivered a Telco-led Sovereign AI Platform for 270M Users [S73440]

Anissh Pandey | Sr. Director NCP Asia Pacific | NVIDIA

Lilach Ilan | Global Head of Business Development -Telco Operations | NVIDIA

Munjal Shah | Co-Founder and CEO | Hippocratic AI

Vikram Sinha | CEO | Indosat Ooredoo Hutchison

Senthil Ramani | Global Lead – Data & AI | Accenture

Sovereign AI enables nations to create their own AI and participate in the new global economy. Nations are partnering up with telcos to enable the foundational AI platform and ecosystem required for their universities, startups, enterprises, and government agencies to create AI. Indosat, a major telecoms provider with over 100 million customers, working with Lintasarta, GoTo, Accenture, and Hippocratic, has deployed a Sovereign AI Factory and a collection of LLMs for Indonesia leveraging the NVIDIA Cloud Partner program. We’ll explore how Indosat set up the sovereign cloud and launched its own foundational large language model, Sahabat.ai, for the 277 million native Bahasa speakers with NVIDIA NEMO and NIMs. We’ll also discuss how partners like Accenture and Hippocratic AI are accelerating AI use-case deployment in Indonesia for critical sectors including banking and healthcare using NVIDIA AI Platform.

Thursday Mar 20 | 3:00 PM – 3:40 PM PDT

Driving 6G Development With Advanced Simulation Tools [S72994]

CC Chong | Senior Director, Aerial Product Management | NVIDIA

Balaji Raghothaman | Chief Technologist, 6G | Keysight Technologies

Arien Sligar | Senior Principal Product Specialist | Ansys, Inc.

Tommaso Melodia | William Lincoln Smith Professor | Northeastern University

The NVIDIA Aerial Omniverse Digital Twin (AODT) is a platform that leverages the power of NVIDIA accelerated computing, NVIDIA Omniverse, and NVIDIA Aerial to provide a comprehensive, scalable, and flexible solution for 6G research and development. The platform enables researchers and developers to customize, program, and test 6G networks in near-real time, with AI/ML in the loop, and to simulate and optimize the network performance and quality of service based on site-specific data and system-level simulation. The platform has been adopted by Keysight and Ansys with advanced simulation tools. One is leveraging our state-of-the-art 3D electromagnetic ray tracing model and radio access network (RAN) modules for advanced 5G/6G end-to-end simulations. The other has integrated their PerceiveEM Solver into the AODT to leverage NVIDIA AI/ML with full-stack RAN in the loop to accelerate and simplify 6G research and development. The NVIDIA Aerial Omniverse Digital Twin has also empowered academia to unleash the potential of cloud and AI-native networks for 6G system research, design, and development.

Thursday Mar 20 | 2:00 PM – 2:40 PM PDT

Defining AI-Native RAN for 6G [S72985]

Chris Dick | Senior Distinguished Engineer, 5G/6G | NVIDIA

Ardavan Tehrani | Director Engineering | Samsung Research

Moe Win | Professor | MIT

Jim Shea | CEO | DeepSig, Inc.

Kai Mao | Head of AI-RAN Strategy for Fujitsu Network Business | Fujitsu

The telecoms industry is working to integrate AI into the design, operation, and optimization of 6G networks to enable unprecedented levels of automation, efficiency, and adaptability. The AI-native 6G radio access network (RAN) embodies this expectation and makes AI a foundational element of the 6G stack. In this session, leading telecoms industry stakeholders will outline how they’re advancing 6G research, with AI integration, to revolutionize connectivity, performance, and societal transformation. They’ll explore how the AI-native 6G RAN can deliver extreme levels of radio access performance, integrating technologies like deep learning and edge computing, promoting sustainability, enhancing public safety, and driving economic growth through advanced applications like smart cities, digital twins, and immersive communication.

Thursday Mar 20 | 4:00 PM – 4:40 PM PDT

Pushing Spectral Efficiency Limits on CUDA-accelerated 5G/6G RAN [S72990]

Vikrama Ditya | Senior Director of Software, Aerial | NVIDIA

Tommaso Balercia | Principal Engineer, Aerial | NVIDIA

Yuan Gao | Senior Software Engineer, 5G – Algorithms | NVIDIA

The demand for mobile broadband continues to grow, and correspondingly the radio access networks (RAN) delivering it are asked to support an extraordinary rate of innovation. RAN-based on software-defined platforms are of key importance in enabling such a rate of innovation sustainably. Among all software-defined platforms, the one based on the GPU allows for the introduction of complex algorithms to pursue new levels of spectral efficiency. This is both to the ease with which the GPU can be programmed and their computational efficiency. Moreover, GPU-based 5G and 6G RAN stacks can run coherently with the code of a new class of simulation tools: digital twins. The efficiency with which the GPU can run both physics simulation chains as well as the 5G/6G RAN stack enable simulations to reach new scales and levels of accuracy. Such simulation environments, in turn, enable the creation and the detailed characterization of new algorithms. In this talk, we will demonstrate how a series of key algorithms that make use of the GPU computational power allow us to push the boundary of system spectral efficiency. The characterization is performed in a digital twin environment, where the radio environment is closely resembling the field.

Thursday Mar 20 | 4:00 PM – 4:40 PM PDT

Enable AI-Native Networking for Telcos with Kubernetes [S72993]

Erwan Gallen | Senior Principal Product Manager | Red Hat

Elad Blatt | Global Head Business Development Telco Networking | NVIDIA

Ahmed Guetari | VP and GM, Service Providers | F5

As telcos transform their networks for the age of AI, utilizing the Bluefield 3 (BF3) DPU engines has been challenging for developers and clients alike. Yet, BF3 is a critical part in deploying and securing an accelerated compute cluster to enable this networking-for-AI infrastructure. The DOCA Platform Framework (DPF) simplifies this task, providing a framework for LCM (life cycle management) and provisioning of both the BF3, as a platform, and the services running on it as K8 containers. DPF is deployed via two network operators in the K8 environment that allow you to deploy and service chain NVIDIA and third-party services. With this, for the first time, independent software vendors (ISVs), operating systems (OS) vendors, and developers can deploy and orchestrate services with ease, as well as onboarding these tools to their environment. In this session, OS and ISV partner companies who’ve adopted the DPF will share their experience, what they are able to achieve, and what comes next.

Monday Mar 17 | 3:00 PM – 4:45 PM PDT

Automate 5G Network Configurations With NVIDIA AI LLM Agents and Kinetica Accelerated Database [DLIT72350]

Maria Amparo Canaveras Galdon | Senior Solutions Architect Generative AI | NVIDIA

Swastika Dutta | Solutions Architect Generative AI | NVIDIA

Shibani Likhite | Solutions Architect | NVIDIA

Nick Reamaroon | Solutions Architect | NVIDIA

Anna Daccache | Solutions Architect Generative AI | NVIDIA

Learn how to create AI agents using LangGraph and NVIDIA NIM to automate 5G network configurations. You’ll deploy LLM agents to monitor real-time network quality of service (QoS) and dynamically respond to congestion by creating new network slices. LLM agents will process logs to detect when QoS falls below a threshold, then automatically trigger a new slice for the affected user equipment. Using graph-based models, the agents understand the network configuration, identifying impacted elements. This ensures efficient, AI-driven adjustments that consider the overall network architecture.

We’ll use the Open Air Interface 5G lab to simulate the 5G network, demonstrating how AI can be integrated into real-world telecom environments. You’ll also gain practical knowledge on using Python with LangGraph and NVIDIA AI endpoints to develop and deploy LLM agents that automate complex network tasks.

Prerequisite: Python programming.

………………………………………………………………………………………………………………………………………………………………………………………………………..

References:

https://www.wsj.com/tech/ai/nvidia-growth-ai-gtc-conference-f9f7881f

https://register.nvidia.com/flow/nvidia/gtcs25/vap/page/explore

Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?

The case for and against AI-RAN technology using Nvidia or AMD GPUs

FT: Nvidia invested $1bn in AI start-ups in 2024

Nvidia AI-RAN survey results; AI inferencing as a reinvention of edge computing?

Posted on February 21, 2025 by Alan Weissberger

An increasing focus on deploying AI into radio access networks (RANs) was among the key findings of NVIDIA’s third annual “State of AI in Telecommunications” survey of 450 telecom professionals, as more than a third of respondents indicated they’re investing or planning to invest in AI-RAN. The survey polled more than 450 telecommunications professionals worldwide, revealing continued momentum for AI adoption — including growth in generative AI use cases — and how the technology is helping optimize customer experiences and increase employee productivity. The percentage of network operators planning to use open source tools increased from 28% in 2023 to 40% in 2025. AvidThink Founder and Principal Roy Chua said one of the biggest challenges network operators will have when using open source models is vetting the outputs they get during training.

Of the telecommunications professionals surveyed, almost all stated that their company is actively deploying or assessing AI projects. Here are some top insights on impact and use cases:

84% said AI is helping to increase their company’s annual revenue
77% said AI helped reduce annual operating costs
60% said increased employee productivity was their biggest benefit from AI
44% said they’re investing in AI for customer experience optimization, which is the No. 1 area of investment for AI in telecommunications
40% said they’re deploying AI into their network planning and operations, including RAN

The percentage of respondents who indicated they will build AI solutions in-house rose from 27% in 2024 to 37% this year. “Telcos are really looking to do more of this work themselves,” Nvidia’s Global Head of Business Development for Telco Chris Penrose [1.] said. “They’re seeing the importance of them taking control and ownership of becoming an AI center of excellence, of doing more of the training of their own resources.”

With respect to using AI inferencing, Chris said, “”We’ve got 14 publicly announced telcos that are doing this today, and we’ve got an equally big funnel.” Penrose noted that the AI skills gap remains the biggest hurdle for operators. Why? Because, as he put it, just because someone is an AI scientist doesn’t mean they are also necessarily a generative AI or agentic AI scientist specifically. And in order to attract the right talent, operators need to demonstrate that they have the infrastructure that will allow top-tier employees to do amazing work. See also: GPUs, data center infrastructure, etc.

Note 1. Penrose represented AT&T’s IoT business for years at various industry trade shows and events before leaving the company in 2020.

Rather than the large data centers processing AI Large Language Models (LLMs), AI inferencing could be done more quickly at smaller “edge” facilities that are closer to end users. That’s where telecom operators might step in. “Telcos are in a unique position,” Penrose told Light Reading. He explained that many countries want to ensure that their AI data and operations remain inside the boundaries of that country. Thus, telcos can be “the trusted providers of [AI] infrastructure in their nations.”

“We’ll call it AI RAN-ready infrastructure. You can make money on it today. You can use it for your own operations. You can use it to go drive some services into the market. … Ultimately your network itself becomes a key anchor workload,” Penrose said.

Source: Skorzewiak/Alamy Stock Photo

Nvidia proposes that network operators can not only run their own AI workloads on Nvidia GPUs, they can also sell those inferencing services to third parties and make a profit by doing so. “We’ve got lots of indications that many [telcos] are having success, and have not only deployed their first [AI compute] clusters, but are making reinvestments to deploy additional compute in their markets,” Penrose added.

Nvidia specifically pointed to AI inferencing announcements by Singtel, Swisscom, Telenor, Indosat and SoftBank.

Other vendors are hoping for similar sales. “I think this vision of edge computing becoming AI inferencing at the end of the network is massive for us,” HPE boss Antonio Neri said last year, in discussing HPE’s $14 billion bid for Juniper Networks.

That comes after multi-access edge computing (MEC) has not lived up to its potential, partially because a 5G SA core network is needed for that and few have been commercially deployed. Edge computing disillusionment is clear among hyperscalers and also network operators. For example, Cox folded its edge computing business into its private networks operation. AT&T no longer discusses the edge computing locations it was building with Microsoft and Google. And Verizon has admitted to edge computing “miscalculations.”

Will AI inferencing be the savior for MEC? The jury is out on that topic. However, Nvidia said that 40% of its revenues already come from AI inferencing. Presumably that inferencing is happening in larger data centers and then delivered to nearby users. Meaning, a significant amount of inferencing is being done today without additional facilities, distributed at a network’s edge, that could enable speedier, low-latency AI services.

“The idea that AI inferencing is going to be all about low-latency connections, and hence stuff like AI RAN and and MEC and assorted other edge computing concepts, doesn’t seem to be a really good fit with the current main direction of AI applications and models,” argued Disruptive Wireless analyst Dean Bubley in a Linked In post.

References:

https://blogs.nvidia.com/blog/ai-telcos-survey-2025/

State of AI in Telecommunications

https://www.lightreading.com/ai-machine-learning/telcos-profiting-from-ai-inferencing-we-ve-been-here-before

https://www.fierce-network.com/premium/whitepaper/edge-computing-powered-global-ai-inference

https://www.fierce-network.com/cloud/are-ai-services-telcos-magic-revenue-bullet

The case for and against AI-RAN technology using Nvidia or AMD GPUs

Ericsson’s sales rose for the first time in 8 quarters; mobile networks need an AI boost

AI RAN Alliance selects Alex Choi as Chairman

Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029

AI sparks huge increase in U.S. energy consumption and is straining the power grid; transmission/distribution as a major problem

Tata Consultancy Services: Critical role of Gen AI in 5G; 5G private networks and enterprise use cases

CES 2025: Intel announces edge compute processors with AI inferencing capabilities

Posted on January 6, 2025 by Alan Weissberger

At CES 2025 today, Intel unveiled the new Intel® Core™ Ultra (Series 2) processors, designed to revolutionize mobile computing for businesses, creators and enthusiast gamers. Intel said “the new processors feature cutting-edge AI enhancements, increased efficiency and performance improvements.”

“Intel Core Ultra processors are setting new benchmarks for mobile AI and graphics, once again demonstrating the superior performance and efficiency of the x86 architecture as we shape the future of personal computing,” said Michelle Johnston Holthaus, interim co-CEO of Intel and CEO of Intel Products. “The strength of our AI PC product innovation, combined with the breadth and scale of our hardware and software ecosystem across all segments of the market, is empowering users with a better experience in the traditional ways we use PCs for productivity, creation and communication, while opening up completely new capabilities with over 400 AI features. And Intel is only going to continue bolstering its AI PC product portfolio in 2025 and beyond as we sample our lead Intel 18A product to customers now ahead of volume production in the second half of 2025.”

Intel also announced new edge computing processors, designed to provide scalability and superior performance across diverse use cases. Intel Core Ultra processors were said to deliver remarkable power efficiency, making them ideal for AI workloads at the edge, with performance gains that surpass competing products in critical metrics like media processing and AI analytics. Those edge processors are targeted at compute servers running in hospitals, retail stores, factory floors and other “edge” locations that sit between big data centers and end-user devices. Such locations are becoming increasingly important to telecom network operators hoping to sell AI capabilities, private wireless networks, security offerings and other services to those enterprise locations.

Intel edge products launching today at CES include:

Intel® Core™ Ultra 200S/H/U series processors (code-named Arrow Lake).
Intel® Core™ 200S/H series processors (code-named Bartlett Lake S and Raptor Lake H Refresh).
Intel® Core™ 100U series processors (code-named Raptor Lake U Refresh).
Intel® Core™ 3 processor and Intel® Processor (code-named Twin Lake).

“Intel has been powering the edge for decades,” said Michael Masci, VP of product management in Intel’s edge computing group, during a media presentation last week. According to Masci, AI is beginning to expand the edge opportunity through inferencing [1.]. “Companies want more local compute. AI inference at the edge is the next major hotbed for AI innovation and implementation,” he added.

Note 1. Inferencing in AI refers to the process where a trained AI model makes predictions or decisions based on new data, rather than previously stored “training models.” It’s essentially AI’s ability to apply learned knowledge on fresh inputs in real-time. Edge computing plays a critical role in inferencing, because it brings it closer to users. That lowers latency (much faster AI responses) and can also reduce bandwidth costs and ensure privacy and security as well.

Editor’s Note: Intel’s edge compute business – the one pursuing AI inferencing – is in in its Client Computing Group (CCG) business unit. Intel’s chips for telecom operators reside inside its NEX business unit.

Intel’s Masci specifically called out Nvidia’s GPU chips, claiming Intel’s new silicon lineup supports up to 5.8x faster performance and better usage per watt. Indeed, Intel claims their “Core™ Ultra 7 processor uses about one-third fewer TOPS (Trillions Operations Per Second) than Nvidia’s Jetson AGX Orin, but beats its competitor with media performance that is up to 5.6 times faster, video analytics performance that is up to 3.4x faster and performance per watt per dollar up to 8.2x better.”

………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

However, Nvidia has been using inference in its AI chips for quite some time. Company officials last month confirmed that 40% of Nvidia’s revenues come from AI inference, rather than AI training efforts in big data centers. Colette Kress, Nvidia Executive Vice President and Chief Financial Officer, said, “Our architectures allows an end-to-end scaling approach for them to do whatever they need to in the world of accelerated computing and Ai. And we’re a very strong candidate to help them, not only with that infrastructure, but also with the software.”

“Inference is super hard. And the reason why inference is super hard is because you need the accuracy to be high on the one hand. You need the throughput to be high so that the cost could be as low as possible, but you also need the latency to be low,” explained Nvidia CEO Jensen Huang during his company’s recent quarterly conference call.

“Our hopes and dreams is that someday, the world does a ton of inference. And that’s when AI has really succeeded, right? It’s when every single company is doing inference inside their companies for the marketing department and forecasting department and supply chain group and their legal department and engineering, and coding, of course. And so we hope that every company is doing inference 24/7.”

……………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….

Sadly for its many fans (including this author), Intel continues to struggle in both data center processors and AI/ GPU chips. The Wall Street Journal recently reported that “Intel’s perennial also-ran, AMD, actually eclipsed Intel’s revenue for chips that go into data centers. This is a stunning reversal: In 2022, Intel’s data-center revenue was three times that of AMD.”

Even worse for Intel, more and more of the chips that go into data centers are GPUs and Intel has minuscule market share of these high-end chips. GPUs are used for training and delivering AI. The WSJ notes that many of the companies spending the most on building out new data centers are switching to chips that have nothing to do with Intel’s proprietary architecture, known as x86, and are instead using a combination of a competing architecture from ARM and their own custom chip designs. For example, more than half of the CPUs Amazon has installed in its data centers over the past two years were its own custom chips based on ARM’s architecture, Dave Brown, Amazon vice president of compute and networking services, said recently.

This displacement of Intel is being repeated all across the big providers and users of cloud computing services. Microsoft and Google have also built their own custom, ARM-based CPUs for their respective clouds. In every case, companies are moving in this direction because of the kind of customization, speed and efficiency that custom silicon supports.

References:

https://www.intel.com/content/www/us/en/newsroom/news/2025-ces-client-computing-news.html#gs.j0qbu4

https://www.intel.com/content/www/us/en/newsroom/news/2025-ces-client-computing-news.html#gs.j0qdhd

https://seekingalpha.com/article/4741811-nvidia-corporation-nvda-ubs-global-technology-conference-transcript

https://www.wsj.com/tech/intel-microchip-competitors-challenges-562a42e3

https://www.lightreading.com/the-edge-network/intel-desperate-for-an-edge-over-nvidia-with-ai-inferencing

Massive layoffs and cost cutting will decimate Intel’s already tiny 5G network business

WSJ: China’s Telecom Carriers to Phase Out Foreign Chips; Intel & AMD will lose out

The case for and against AI-RAN technology using Nvidia or AMD GPUs

Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers

FT: Nvidia invested $1bn in AI start-ups in 2024

AI winner Nvidia faces competition with new super chip delayed

AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions

Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections

Posted on January 4, 2025 by Alan Weissberger

A growing portion of the billions of dollars being spent on AI data centers will go to the suppliers of networking chips, lasers, and switches that integrate thousands of GPUs and conventional micro-processors into a single AI computer cluster. AI can’t advance without advanced networks, says Nvidia’s networking chief Gilad Shainer. “The network is the most important element because it determines the way the data center will behave.”

Networking chips now account for just 5% to 10% of all AI chip spending, said Broadcom CEO Hock Tan. As the size of AI server clusters hits 500,000 or a million processors, Tan expects that networking will become 15% to 20% of a data center’s chip budget. A data center with a million or more processors will cost $100 billion to build.

The firms building the biggest AI clusters are the hyperscalers, led by Alphabet’s Google, Amazon.com, Facebook parent Meta Platforms, and Microsoft. Not far behind are Oracle, xAI, Alibaba Group Holding, and ByteDance. Earlier this month, Bloomberg reported that capex for those four hyperscalers would exceed $200 billion this year, making the year-over-year increase as much as 50%. Goldman Sachs estimates that AI data center spending will rise another 35% to 40% in 2025. Morgan Stanley expects Amazon and Microsoft to lead the pack with $96.4bn and $89.9bn of capex respectively, while Google and Meta will follow at $62.6bn and $52.3bn.

AI compute server architectures began scaling in recent years for two reasons.

1.] High end processor chips from Intel neared the end of speed gains made possible by shrinking a chip’s transistors.

2.] Computer scientists at companies such as Google and OpenAI built AI models that performed amazing feats by finding connections within large volumes of training material.

As the components of these “Large Language Models” (LLMs) grew to millions, billions, and then trillions, they began translating languages, doing college homework, handling customer support, and designing cancer drugs. But training an AI LLM is a huge task, as it calculates across billions of data points, rolls those results into new calculations, then repeats. Even with Nvidia accelerator chips to speed up those calculations, the workload has to be distributed across thousands of Nvidia processors and run for weeks.

To keep up with the distributed computing challenge, AI data centers all have two networks:

The “front end” network which sends and receives data to/from external users —like the networks of every enterprise data center or cloud-computing center. It’s placed on the network’s outward-facing front end or boundary and typically includes equipment like high end routers, web servers, DNS servers, application servers, load balancers, firewalls, and other devices which connect to the public internet, IP-MPLS VPNs and private lines.
A “back end” network that connects every AI processor (GPUs and conventional MPUs) and memory chip with every other processor within the AI data center. “It’s just a supercomputer made of many small processors,” says Ram Velaga, Broadcom’s chief of core switching silicon. “All of these processors have to talk to each other as if they are directly connected.” AI’s back-end networks need high bandwidth switches and network connections. Delays and congestion are expensive when each Nvidia compute node costs as much as $400,000. Idle processors waste money. Back-end networks carry huge volumes of data. When thousands of processors are exchanging results, the data crossing one of these networks in a second can equal all of the internet traffic in America.

Nvidia became one of today’s largest vendors of network gear via its acquisition of Israel based Mellanox in 2020 for $6.9 billion. CEO Jensen Huang and his colleagues realized early on that AI workloads would exceed a single box. They started using InfiniBand—a network designed for scientific supercomputers—supplied by Mellanox. InfiniBand became the standard for AI back-end networks.

While most AI dollars still go to Nvidia GPU accelerator chips, back-end networks are important enough that Nvidia has large networking sales. In the September quarter, those network sales grew 20%, to $3.1 billion. However, Ethernet is now challenging InfiniBand’s lock on AI networks. Fortunately for Nvidia, its Mellanox subsidiary also makes high speed Ethernet hardware modules. For example, xAI uses Nvidia Ethernet products in its record-size Colossus system.

While current versions of Ethernet lack InfiniBand’s tools for memory and traffic management, those are now being added in a version called Ultra Ethernet [1.]. Many hyperscalers think Ethernet will outperform InfiniBand, as clusters scale to hundreds of thousands of processors. Another attraction is that Ethernet has many competing suppliers. “All the largest guys—with an exception of Microsoft—have moved over to Ethernet,” says an anonymous network industry executive. “And even Microsoft has said that by summer of next year, they’ll move over to Ethernet, too.”

Note 1. Primary goals and mission of Ultra Ethernet Consortium (UEC): Deliver a complete architecture that optimizes Ethernet for high performance AI and HPC networking, exceeding the performance of today’s specialized technologies. UEC specifically focuses on functionality, performance, TCO, and developer and end-user friendliness, while minimizing changes to only those required and maintaining Ethernet interoperability. Additional goals: Improved bandwidth, latency, tail latency, and scale, matching tomorrow’s workloads and compute architectures. Backwards compatibility to widely-deployed APIs and definition of new APIs that are better optimized to future workloads and compute architectures.

……………………………………………………………………………………………………………………………………………………………………………………………………………………………….

Ethernet back-end networks offer a big opportunity for Arista Networks, which builds switches using Broadcom chips. In the past two years, AI data centers became an important business for Arista. AI provides sales to Arista switch rivals Cisco and Juniper Networks (soon to be a part of Hewlett Packard Enterprise), but those companies aren’t as established among hyperscalers. Analysts expect Arista to get more than $1 billion from AI sales next year and predict that the total market for back-end switches could reach $15 billion in a few years. Three of the five big hyperscale operators are using Arista Ethernet switches in back-end networks, and the other two are testing them. Arista CEO Jayshree Ullal (a former SCU EECS grad student of this author/x-adjunct Professor) says that back-end network sales seem to pull along more orders for front-end gear, too.

The network chips used for AI switching are feats of engineering that rival AI processor chips. Cisco makes its own custom Ethernet switching chips, but some 80% of the chips used in other Ethernet switches comes from Broadcom, with the rest supplied mainly by Marvell. These switch chips now move 51 terabits of data a second; it’s the same amount of data that a person would consume by watching videos for 200 days straight. Next year, switching speeds will double.

The other important parts of a network are connections between computing nodes and cables. As the processor count rises, connections increase at a faster rate. A 25,000-processor cluster needs 75,000 interconnects. A million processors will need 10 million interconnects. More of those connections will be fiber optic, instead of copper or coax. As networks speed up, copper’s reach shrinks. So, expanding clusters have to “scale-out” by linking their racks with optics. “Once you move beyond a few tens of thousand, or 100,000, processors, you cannot connect anything with copper—you have to connect them with optics,” Velaga says.

AI processing chips (GPUs) exchange data at about 10 times the rate of a general-purpose processor chip. Copper has been the preferred conduit because it’s reliable and requires no extra power. At current network speeds, copper works well at lengths of up to five meters. So, hyperscalers have tried to “scale-up” within copper’s reach by packing as many processors as they can within each shelf, and rack of shelves.

Back-end connections now run at 400 gigabits per second, which is equal to a day and half of video viewing. Broadcom’s Velaga says network speeds will rise to 800 gigabits in 2025, and 1.6 terabits in 2026.

Nvidia, Broadcom, and Marvell sell optical interface products, with Marvell enjoying a strong lead in 800-gigabit interconnects. A number of companies supply lasers for optical interconnects, including Coherent, Lumentum Holdings, Applied Optoelectronics, and Chinese vendors Innolight and Eoptolink. They will all battle for the AI data center over the next few years.

A 500,000-processor cluster needs at least 750 megawatts, enough to power 500,000 homes. When AI models scale to a million or more processors, they will require gigawatts of power and have to span more than one physical data center, says Velaga.

The opportunity for optical connections reaches beyond the AI data center. That’s because there isn’t enough power. In September, Marvell, Lumentum, and Coherent demonstrated optical links for data centers as far apart as 300 miles. Nvidia’s next-generation networks will be ready to run a single AI workload across remote locations.

Some worry that AI performance will stop improving as processor counts scale. Nvidia’s Jensen Huang dismissed those concerns on his last conference call, saying that clusters of 100,000 processors or more will just be table stakes with Nvidia’s next generation of chips. Broadcom’s Velaga says he is grateful: “Jensen (Nvidia CEO) has created this massive opportunity for all of us.”

References:

https://www.barrons.com/articles/ai-networking-nvidia-cisco-broadcom-arista-bce88c76?mod=hp_WIND_B_1_1 (PAYWALL)

https://www.msn.com/en-us/news/technology/networking-companies-ride-the-ai-wave-it-isn-t-just-nvidia/ar-AA1wJXGa?ocid=BingNewsSerp

https://www.datacenterdynamics.com/en/news/morgan-stanley-hyperscaler-capex-to-reach-300bn-in-2025/

https://ultraethernet.org/ultra-ethernet-specification-update/

Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!

Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?

Canalys & Gartner: AI investments drive growth in cloud infrastructure spending

AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms

AI wave stimulates big tech spending and strong profits, but for how long?

Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029

Using a distributed synchronized fabric for parallel computing workloads- Part I

Using a distributed synchronized fabric for parallel computing workloads- Part II

FT: Nvidia invested $1bn in AI start-ups in 2024

Posted on January 1, 2025 by Alan Weissberger

Nvidia invested $1bn in artificial intelligence companies in 2024, as it emerged as a crucial backer of start-ups using the company’s graphics processing units (GPUs). The king of AI semiconductors, which surpassed a $3tn market capitalization in June due to huge demand for its high-performing GPUs, has significantly invested into some of its own customers.

According to corporate filings and Dealroom research, Nvidia spent a total of $1bn across 50 start-up funding rounds and several corporate deals in 2024, compared with 2023, which saw 39 start-up rounds and $872mn in spending. The vast majority of deals were with “core AI” companies with high computing infrastructure demands, and so in some cases also buyers of its own chips. Tech companies have spent tens of billions of dollars on Nvidia’s chips over the past year since the debut of ChatGPT two years ago kick-started an unprecedented surge of investment in AI. Nvidia’s uptick in deals comes after it amassed a $9bn war chest of cash with its GPUs becoming one of the world’s hottest commodities.

The company’s shares rose more than 170% in 2024, as it and other tech giants helped power the S&P 500 index to its best two-year run this century. Nvidia’s $1bn worth of investments in “non-affiliated entities” in the first nine months last year includes both its venture and corporate investment arms.

According to company filings, that sum was 15% more than in 2023 and more than 10 times as much as it invested in 2022. Some of Nvidia’s largest customers, such as Microsoft, Amazon and Google, are actively working to reduce their reliance on its GPUs by developing their own custom chips. Such a development could make smaller AI companies a more important generator of revenues for Nvidia in the future.

“Right now Nvidia wants there to be more competition and it makes sense for them to have these new players in the mix,” said a fund manager with a stake in a number of companies it had invested in.

In 2024, Nvidia struck more deals than Microsoft and Amazon, although Google remains far more active, according to Dealroom. Such prolific dealmaking has raised concerns about Nvidia’s grip over the AI industry, at a time when it is facing heightened antitrust scrutiny in the US, Europe and China. Bill Kovacic, former chair of the US Federal Trade Commission, said competition watchdogs were “keen” to investigate a “dominant enterprise making these big investments” to see if buying company stakes was aimed at “achieving exclusivity”, although he said investments in a customer base could prove beneficial. Nvidia strongly rejects the idea that it connects funding with any requirement to use its technology.

The company said it was “working to grow our ecosystem, support great companies and enhance our platform for everyone. We compete and win on merit, independent of any investments we make.” It added: “Every company should be free to make independent technological choices that best suit their needs and strategies.”

The Santa Clara based company’s most recent start-up deal was a strategic investment in Elon Musk’s xAI. Other significant 2024 investments included its participation in funding rounds for OpenAI, Cohere, Mistral and Perplexity, some of the most prominent AI model providers.

Nvidia also has a start-up incubator, Inception, which separately has helped the early evolution of thousands of fledgling companies. The Inception program offers start-ups “preferred pricing” on hardware, as well as cloud credits from Nvidia’s partners.

There has been an uptick in Nvidia’s acquisitions, including a takeover of Run:ai, an Israeli AI workload management platform. The deal closed this week after coming under scrutiny from the EU’s antitrust regulator, which ultimately cleared the transaction. The US Department of Justice was also looking at the deal, according to Politico. Nvidia also bought AI software groups Nebulon, OctoAI, Brev.dev, Shoreline.io and Deci. Collectively it has made more acquisitions in 2024 than the previous four years combined, according to Dealroom. Recommended News in-depthArtificial intelligence Wall Street frenzy creates $11bn debt market for AI groups buying Nvidia chips.

The company is investing widely, pouring millions of dollars into AI groups involved in medical technology, search engines, gaming, drones, chips, traffic management, logistics, data storage and generation, natural language processing and humanoid robots. Its portfolio includes a number of start-ups whose valuations have soared to billions of dollars. CoreWeave, an AI cloud computing service provider and significant purchaser of Nvidia chips, is preparing to float early this year at a valuation as high as $35bn — increasing from about $7bn a year ago.

Nvidia invested $100mn in CoreWeave in early 2023, and participated in a $1bn equity fundraising round by the company in May. Another start-up, Applied Digital, was facing a plunging share price in 2024, with revenue misses and considerable debt obligations, before a group of investors led by Nvidia provided $160mn of equity capital in September, prompting a 65 per cent surge in its share price.

“Nvidia is using their massive market cap and huge cash flow to keep purchasers alive,” said Nate Koppikar, a short seller at Orso Partners. “If Applied Digital had died, that’s [a large volume] of sales that would have died with it.”

Neocloud groups such as CoreWeave, Crusoe and Lambda Labs have acquired tens of thousands of Nvidia’s high-performance GPUs, that are crucial for developing generative AI models. Those Nvidia AI chips are now also being used as collateral for huge loans. The frenzied dealmaking has shone a light on a rampant GPU economy in Silicon Valley that is increasingly being supported by deep-pocketed financiers in New York. However, its rapid growth has raised concerns about the potential for more risky lending, circular financing and Nvidia’s chokehold on the AI market.

References:

https://www.ft.com/content/f8acce90-9c4d-4433-b189-e79cad29f74e

https://www.ft.com/content/41bfacb8-4d1e-4f25-bc60-75bf557f1f21

AI cloud start-up Vultr valued at $3.5B; Hyperscalers gorge on Nvidia GPUs while AI semiconductor market booms

Page 1 of 2

1 2 Next »

Telecom Sessions of Interest:

Wednesday Mar 19 | 2:00 PM – 2:40 PM

Delivering Real Business Outcomes With AI in Telecom [S73438]

Thursday Mar 20 | 11:00 AM – 11:40 AM PDT

AI-RAN in Action [S72987]

Thursday Mar 20 | 9:00 AM – 9:40 AM PDTHow Indonesia Delivered a Telco-led Sovereign AI Platform for 270M Users [S73440]

Thursday Mar 20 | 3:00 PM – 3:40 PM PDT

Driving 6G Development With Advanced Simulation Tools [S72994]

Thursday Mar 20 | 2:00 PM – 2:40 PM PDT

Thursday Mar 20 | 4:00 PM – 4:40 PM PDT

Pushing Spectral Efficiency Limits on CUDA-accelerated 5G/6G RAN [S72990]

Thursday Mar 20 | 4:00 PM – 4:40 PM PDT

Enable AI-Native Networking for Telcos with Kubernetes [S72993]

Monday Mar 17 | 3:00 PM – 4:45 PM PDT

Automate 5G Network Configurations With NVIDIA AI LLM Agents and Kinetica Accelerated Database [DLIT72350]

References:

Archives

Archives

Recent Posts