AI in Networks Market
CES 2025: Intel announces edge compute processors with AI inferencing capabilities
At CES 2025 today, Intel unveiled the new Intel® Core™ Ultra (Series 2) processors, designed to revolutionize mobile computing for businesses, creators and enthusiast gamers. Intel said “the new processors feature cutting-edge AI enhancements, increased efficiency and performance improvements.”
“Intel Core Ultra processors are setting new benchmarks for mobile AI and graphics, once again demonstrating the superior performance and efficiency of the x86 architecture as we shape the future of personal computing,” said Michelle Johnston Holthaus, interim co-CEO of Intel and CEO of Intel Products. “The strength of our AI PC product innovation, combined with the breadth and scale of our hardware and software ecosystem across all segments of the market, is empowering users with a better experience in the traditional ways we use PCs for productivity, creation and communication, while opening up completely new capabilities with over 400 AI features. And Intel is only going to continue bolstering its AI PC product portfolio in 2025 and beyond as we sample our lead Intel 18A product to customers now ahead of volume production in the second half of 2025.”
Intel also announced new edge computing processors, designed to provide scalability and superior performance across diverse use cases. Intel Core Ultra processors were said to deliver remarkable power efficiency, making them ideal for AI workloads at the edge, with performance gains that surpass competing products in critical metrics like media processing and AI analytics. Those edge processors are targeted at compute servers running in hospitals, retail stores, factory floors and other “edge” locations that sit between big data centers and end-user devices. Such locations are becoming increasingly important to telecom network operators hoping to sell AI capabilities, private wireless networks, security offerings and other services to those enterprise locations.
Intel edge products launching today at CES include:
- Intel® Core™ Ultra 200S/H/U series processors (code-named Arrow Lake).
- Intel® Core™ 200S/H series processors (code-named Bartlett Lake S and Raptor Lake H Refresh).
- Intel® Core™ 100U series processors (code-named Raptor Lake U Refresh).
- Intel® Core™ 3 processor and Intel® Processor (code-named Twin Lake).
“Intel has been powering the edge for decades,” said Michael Masci, VP of product management in Intel’s edge computing group, during a media presentation last week. According to Masci, AI is beginning to expand the edge opportunity through inferencing [1.]. “Companies want more local compute. AI inference at the edge is the next major hotbed for AI innovation and implementation,” he added.
Note 1. Inferencing in AI refers to the process in which a trained AI model makes predictions or decisions based on new data, rather than on the data it was trained on. It’s essentially AI’s ability to apply learned knowledge to fresh inputs in real time. Edge computing plays a critical role in inferencing because it brings the computation closer to users. That lowers latency (much faster AI responses) and can also reduce bandwidth costs and improve privacy and security.
Editor’s Note: Intel’s edge compute business – the one pursuing AI inferencing – is in its Client Computing Group (CCG) business unit. Intel’s chips for telecom operators reside inside its NEX business unit.
Intel’s Masci specifically called out Nvidia’s GPU chips, claiming Intel’s new silicon lineup supports up to 5.8x faster performance and better usage per watt. Indeed, Intel claims its “Core™ Ultra 7 processor uses about one-third fewer TOPS (Trillions of Operations Per Second) than Nvidia’s Jetson AGX Orin, but beats its competitor with media performance that is up to 5.6 times faster, video analytics performance that is up to 3.4x faster and performance per watt per dollar up to 8.2x better.”
………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….
However, Nvidia has been using inference in its AI chips for quite some time. Company officials last month confirmed that 40% of Nvidia’s revenues come from AI inference, rather than AI training efforts in big data centers. Colette Kress, Nvidia Executive Vice President and Chief Financial Officer, said, “Our architecture allows an end-to-end scaling approach for them to do whatever they need to in the world of accelerated computing and AI. And we’re a very strong candidate to help them, not only with that infrastructure, but also with the software.”
“Inference is super hard. And the reason why inference is super hard is because you need the accuracy to be high on the one hand. You need the throughput to be high so that the cost could be as low as possible, but you also need the latency to be low,” explained Nvidia CEO Jensen Huang during his company’s recent quarterly conference call.
“Our hopes and dreams is that someday, the world does a ton of inference. And that’s when AI has really succeeded, right? It’s when every single company is doing inference inside their companies for the marketing department and forecasting department and supply chain group and their legal department and engineering, and coding, of course. And so we hope that every company is doing inference 24/7.”
……………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………….
Sadly for its many fans (including this author), Intel continues to struggle in both data center processors and AI/GPU chips. The Wall Street Journal recently reported that “Intel’s perennial also-ran, AMD, actually eclipsed Intel’s revenue for chips that go into data centers. This is a stunning reversal: In 2022, Intel’s data-center revenue was three times that of AMD.”
Even worse for Intel, more and more of the chips that go into data centers are GPUs and Intel has minuscule market share of these high-end chips. GPUs are used for training and delivering AI. The WSJ notes that many of the companies spending the most on building out new data centers are switching to chips that have nothing to do with Intel’s proprietary architecture, known as x86, and are instead using a combination of a competing architecture from ARM and their own custom chip designs. For example, more than half of the CPUs Amazon has installed in its data centers over the past two years were its own custom chips based on ARM’s architecture, Dave Brown, Amazon vice president of compute and networking services, said recently.
This displacement of Intel is being repeated all across the big providers and users of cloud computing services. Microsoft and Google have also built their own custom, ARM-based CPUs for their respective clouds. In every case, companies are moving in this direction because of the kind of customization, speed and efficiency that custom silicon supports.
References:
https://www.intel.com/content/www/us/en/newsroom/news/2025-ces-client-computing-news.html#gs.j0qbu4
https://www.wsj.com/tech/intel-microchip-competitors-challenges-562a42e3
Massive layoffs and cost cutting will decimate Intel’s already tiny 5G network business
WSJ: China’s Telecom Carriers to Phase Out Foreign Chips; Intel & AMD will lose out
The case for and against AI-RAN technology using Nvidia or AMD GPUs
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
FT: Nvidia invested $1bn in AI start-ups in 2024
AI winner Nvidia faces competition with new super chip delayed
AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions
Networking chips and modules for AI data centers: Infiniband, Ultra Ethernet, Optical Connections
A growing portion of the billions of dollars being spent on AI data centers will go to the suppliers of networking chips, lasers, and switches that integrate thousands of GPUs and conventional micro-processors into a single AI computer cluster. AI can’t advance without advanced networks, says Nvidia’s networking chief Gilad Shainer. “The network is the most important element because it determines the way the data center will behave.”
Networking chips now account for just 5% to 10% of all AI chip spending, said Broadcom CEO Hock Tan. As the size of AI server clusters hits 500,000 or a million processors, Tan expects that networking will become 15% to 20% of a data center’s chip budget. A data center with a million or more processors will cost $100 billion to build.
The firms building the biggest AI clusters are the hyperscalers, led by Alphabet’s Google, Amazon.com, Facebook parent Meta Platforms, and Microsoft. Not far behind are Oracle, xAI, Alibaba Group Holding, and ByteDance. Earlier this month, Bloomberg reported that capex for those four hyperscalers would exceed $200 billion this year, making the year-over-year increase as much as 50%. Goldman Sachs estimates that AI data center spending will rise another 35% to 40% in 2025. Morgan Stanley expects Amazon and Microsoft to lead the pack with $96.4bn and $89.9bn of capex respectively, while Google and Meta will follow at $62.6bn and $52.3bn.
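The four Morgan Stanley estimates quoted above can be added up to see how they relate to the headline totals. The short Python sketch below only uses the figures from the paragraph; the variable names are illustrative, and the share calculation is a simple ratio rather than anything taken from the analysts' reports.

```python
# Morgan Stanley 2025 capex estimates quoted above, in billions of US dollars.
capex_2025_bn = {
    "Amazon": 96.4,
    "Microsoft": 89.9,
    "Google": 62.6,
    "Meta": 52.3,
}

# Combined spending of the four hyperscalers, rounded to one decimal place.
total_bn = round(sum(capex_2025_bn.values()), 1)

# Each company's share of the four-company total, as a percentage.
share_pct = {
    name: round(100 * value / total_bn, 1)
    for name, value in capex_2025_bn.items()
}

print(f"Combined capex: ${total_bn}bn")
for name, pct in share_pct.items():
    print(f"{name}: {pct}% of the total")
```

The combined figure comes to roughly $300 billion, which is in line with the Morgan Stanley headline cited in the references below.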
AI compute server architectures began scaling in recent years for two reasons.
1.] High end processor chips from Intel neared the end of speed gains made possible by shrinking a chip’s transistors.
2.] Computer scientists at companies such as Google and OpenAI built AI models that performed amazing feats by finding connections within large volumes of training material.
As the components of these “Large Language Models” (LLMs) grew to millions, billions, and then trillions, they began translating languages, doing college homework, handling customer support, and designing cancer drugs. But training an AI LLM is a huge task, as it calculates across billions of data points, rolls those results into new calculations, then repeats. Even with Nvidia accelerator chips to speed up those calculations, the workload has to be distributed across thousands of Nvidia processors and run for weeks.
To keep up with the distributed computing challenge, AI data centers all have two networks:
- A “front end” network that sends and receives data to/from external users, like the networks of every enterprise data center or cloud-computing center. It’s placed on the network’s outward-facing front end or boundary and typically includes equipment like high end routers, web servers, DNS servers, application servers, load balancers, firewalls, and other devices which connect to the public internet, IP-MPLS VPNs and private lines.
- A “back end” network that connects every AI processor (GPUs and conventional MPUs) and memory chip with every other processor within the AI data center. “It’s just a supercomputer made of many small processors,” says Ram Velaga, Broadcom’s chief of core switching silicon. “All of these processors have to talk to each other as if they are directly connected.” AI’s back-end networks need high bandwidth switches and network connections. Delays and congestion are expensive when each Nvidia compute node costs as much as $400,000. Idle processors waste money. Back-end networks carry huge volumes of data. When thousands of processors are exchanging results, the data crossing one of these networks in a second can equal all of the internet traffic in America.
Nvidia became one of today’s largest vendors of network gear via its acquisition of Israel-based Mellanox in 2020 for $6.9 billion. CEO Jensen Huang and his colleagues realized early on that AI workloads would exceed a single box. They started using InfiniBand, a network designed for scientific supercomputers, supplied by Mellanox. InfiniBand became the standard for AI back-end networks.
While most AI dollars still go to Nvidia GPU accelerator chips, back-end networks are important enough that Nvidia has large networking sales. In the September quarter, those network sales grew 20%, to $3.1 billion. However, Ethernet is now challenging InfiniBand’s lock on AI networks. Fortunately for Nvidia, its Mellanox subsidiary also makes high speed Ethernet hardware modules. For example, xAI uses Nvidia Ethernet products in its record-size Colossus system.
While current versions of Ethernet lack InfiniBand’s tools for memory and traffic management, those are now being added in a version called Ultra Ethernet [1.]. Many hyperscalers think Ethernet will outperform InfiniBand, as clusters scale to hundreds of thousands of processors. Another attraction is that Ethernet has many competing suppliers. “All the largest guys—with an exception of Microsoft—have moved over to Ethernet,” says an anonymous network industry executive. “And even Microsoft has said that by summer of next year, they’ll move over to Ethernet, too.”
Note 1. Primary goals and mission of Ultra Ethernet Consortium (UEC): Deliver a complete architecture that optimizes Ethernet for high performance AI and HPC networking, exceeding the performance of today’s specialized technologies. UEC specifically focuses on functionality, performance, TCO, and developer and end-user friendliness, while minimizing changes to only those required and maintaining Ethernet interoperability. Additional goals: Improved bandwidth, latency, tail latency, and scale, matching tomorrow’s workloads and compute architectures. Backwards compatibility to widely-deployed APIs and definition of new APIs that are better optimized to future workloads and compute architectures.
……………………………………………………………………………………………………………………………………………………………………………………………………………………………….
Ethernet back-end networks offer a big opportunity for Arista Networks, which builds switches using Broadcom chips. In the past two years, AI data centers became an important business for Arista. AI also generates sales for Arista’s switch rivals Cisco and Juniper Networks (soon to be a part of Hewlett Packard Enterprise), but those companies aren’t as established among hyperscalers. Analysts expect Arista to get more than $1 billion from AI sales next year and predict that the total market for back-end switches could reach $15 billion in a few years. Three of the five big hyperscale operators are using Arista Ethernet switches in back-end networks, and the other two are testing them. Arista CEO Jayshree Ullal (a former SCU EECS grad student of this author/ex-adjunct professor) says that back-end network sales seem to pull along more orders for front-end gear, too.
The network chips used for AI switching are feats of engineering that rival AI processor chips. Cisco makes its own custom Ethernet switching chips, but some 80% of the chips used in other Ethernet switches come from Broadcom, with the rest supplied mainly by Marvell. These switch chips now move 51 terabits of data a second; it’s the same amount of data that a person would consume by watching videos for 200 days straight. Next year, switching speeds will double.
The other important parts of a network are connections between computing nodes and cables. As the processor count rises, connections increase at a faster rate. A 25,000-processor cluster needs 75,000 interconnects. A million processors will need 10 million interconnects. More of those connections will be fiber optic, instead of copper or coax. As networks speed up, copper’s reach shrinks. So, expanding clusters have to “scale-out” by linking their racks with optics. “Once you move beyond a few tens of thousand, or 100,000, processors, you cannot connect anything with copper—you have to connect them with optics,” Velaga says.
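The two data points in the paragraph above show that interconnects grow faster than processors. A minimal Python sketch of that arithmetic follows; it uses only the quoted cluster sizes, and the function and variable names are made up for illustration.

```python
def interconnects_per_processor(processors: int, interconnects: int) -> float:
    """Average number of interconnects needed per processor in a cluster."""
    return interconnects / processors


# (processors, interconnects) pairs quoted in the article.
clusters = {
    "25,000-processor cluster": (25_000, 75_000),
    "1,000,000-processor cluster": (1_000_000, 10_000_000),
}

ratios = {
    name: interconnects_per_processor(procs, links)
    for name, (procs, links) in clusters.items()
}

# Growth factors when moving from the smaller cluster to the larger one.
processor_growth = 1_000_000 / 25_000
interconnect_growth = 10_000_000 / 75_000

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f} interconnects per processor")
print(f"Processors grow {processor_growth:.0f}x, "
      f"interconnects grow about {interconnect_growth:.0f}x")
```

On these figures the smaller cluster needs 3 interconnects per processor and the larger one needs 10, so a 40-fold increase in processors brings a roughly 133-fold increase in connections.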
AI processing chips (GPUs) exchange data at about 10 times the rate of a general-purpose processor chip. Copper has been the preferred conduit because it’s reliable and requires no extra power. At current network speeds, copper works well at lengths of up to five meters. So, hyperscalers have tried to “scale-up” within copper’s reach by packing as many processors as they can within each shelf, and rack of shelves.
Back-end connections now run at 400 gigabits per second, which is equal to a day and a half of video viewing. Broadcom’s Velaga says network speeds will rise to 800 gigabits in 2025, and 1.6 terabits in 2026.
Nvidia, Broadcom, and Marvell sell optical interface products, with Marvell enjoying a strong lead in 800-gigabit interconnects. A number of companies supply lasers for optical interconnects, including Coherent, Lumentum Holdings, Applied Optoelectronics, and Chinese vendors Innolight and Eoptolink. They will all battle for the AI data center over the next few years.
A 500,000-processor cluster needs at least 750 megawatts, enough to power 500,000 homes. When AI models scale to a million or more processors, they will require gigawatts of power and have to span more than one physical data center, says Velaga.
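The power figures above imply a per-processor budget that can be worked out directly. The Python sketch below does that and then extrapolates to a million processors, assuming (this is an assumption, not a claim from the article) that power scales linearly with processor count. Since the article says "at least 750 megawatts", the results are lower bounds.

```python
CLUSTER_POWER_WATTS = 750e6   # "at least 750 megawatts"
PROCESSORS = 500_000
HOMES = 500_000

# Implied average power per processor and per home, in watts.
watts_per_processor = CLUSTER_POWER_WATTS / PROCESSORS
watts_per_home = CLUSTER_POWER_WATTS / HOMES


def cluster_power_gw(processors: int, watts_each: float) -> float:
    """Cluster power in gigawatts, assuming linear scaling with processor count."""
    return processors * watts_each / 1e9


million_processor_gw = cluster_power_gw(1_000_000, watts_per_processor)

print(f"Per processor: {watts_per_processor / 1000:.1f} kW")
print(f"Per home: {watts_per_home / 1000:.1f} kW")
print(f"One million processors: at least {million_processor_gw:.1f} GW")
```

The per-processor figure covers everything the cluster draws (cooling, networking, and so on), not just the chip itself, which is why it is useful as a planning number rather than a chip specification.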
The opportunity for optical connections reaches beyond the AI data center, because a single site often cannot supply enough power for the largest clusters. In September, Marvell, Lumentum, and Coherent demonstrated optical links for data centers as far apart as 300 miles. Nvidia’s next-generation networks will be ready to run a single AI workload across remote locations.
Some worry that AI performance will stop improving as processor counts scale. Nvidia’s Jensen Huang dismissed those concerns on his last conference call, saying that clusters of 100,000 processors or more will just be table stakes with Nvidia’s next generation of chips. Broadcom’s Velaga says he is grateful: “Jensen (Nvidia CEO) has created this massive opportunity for all of us.”
References:
https://www.datacenterdynamics.com/en/news/morgan-stanley-hyperscaler-capex-to-reach-300bn-in-2025/
https://ultraethernet.org/ultra-ethernet-specification-update/
Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!
Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?
Canalys & Gartner: AI investments drive growth in cloud infrastructure spending
AI Echo Chamber: “Upstream AI” companies huge spending fuels profit growth for “Downstream AI” firms
AI wave stimulates big tech spending and strong profits, but for how long?
Markets and Markets: Global AI in Networks market worth $10.9 billion in 2024; projected to reach $46.8 billion by 2029
Using a distributed synchronized fabric for parallel computing workloads- Part I
Using a distributed synchronized fabric for parallel computing workloads- Part II
FT: Nvidia invested $1bn in AI start-ups in 2024
Nvidia invested $1bn in artificial intelligence companies in 2024, as it emerged as a crucial backer of start-ups using the company’s graphics processing units (GPUs). The king of AI semiconductors, which surpassed a $3tn market capitalization in June due to huge demand for its high-performing GPUs, has invested significantly in some of its own customers.
According to corporate filings and Dealroom research, Nvidia spent a total of $1bn across 50 start-up funding rounds and several corporate deals in 2024, compared with 2023, which saw 39 start-up rounds and $872mn in spending. The vast majority of deals were with “core AI” companies with high computing infrastructure demands, and so in some cases also buyers of its own chips. Tech companies have spent tens of billions of dollars on Nvidia’s chips over the past year since the debut of ChatGPT two years ago kick-started an unprecedented surge of investment in AI. Nvidia’s uptick in deals comes after it amassed a $9bn war chest of cash with its GPUs becoming one of the world’s hottest commodities.
The company’s shares rose more than 170% in 2024, as it and other tech giants helped power the S&P 500 index to its best two-year run this century. Nvidia’s $1bn worth of investments in “non-affiliated entities” in the first nine months of last year includes both its venture and corporate investment arms.
According to company filings, that sum was 15% more than in 2023 and more than 10 times as much as it invested in 2022. Some of Nvidia’s largest customers, such as Microsoft, Amazon and Google, are actively working to reduce their reliance on its GPUs by developing their own custom chips. Such a development could make smaller AI companies a more important generator of revenues for Nvidia in the future.
“Right now Nvidia wants there to be more competition and it makes sense for them to have these new players in the mix,” said a fund manager with a stake in a number of companies it had invested in.
In 2024, Nvidia struck more deals than Microsoft and Amazon, although Google remains far more active, according to Dealroom. Such prolific dealmaking has raised concerns about Nvidia’s grip over the AI industry, at a time when it is facing heightened antitrust scrutiny in the US, Europe and China. Bill Kovacic, former chair of the US Federal Trade Commission, said competition watchdogs were “keen” to investigate a “dominant enterprise making these big investments” to see if buying company stakes was aimed at “achieving exclusivity”, although he said investments in a customer base could prove beneficial. Nvidia strongly rejects the idea that it connects funding with any requirement to use its technology.
The company said it was “working to grow our ecosystem, support great companies and enhance our platform for everyone. We compete and win on merit, independent of any investments we make.” It added: “Every company should be free to make independent technological choices that best suit their needs and strategies.”
The Santa Clara-based company’s most recent start-up deal was a strategic investment in Elon Musk’s xAI. Other significant 2024 investments included its participation in funding rounds for OpenAI, Cohere, Mistral and Perplexity, some of the most prominent AI model providers.
Nvidia also has a start-up incubator, Inception, which separately has helped the early evolution of thousands of fledgling companies. The Inception program offers start-ups “preferred pricing” on hardware, as well as cloud credits from Nvidia’s partners.
There has been an uptick in Nvidia’s acquisitions, including a takeover of Run:ai, an Israeli AI workload management platform. The deal closed this week after coming under scrutiny from the EU’s antitrust regulator, which ultimately cleared the transaction. The US Department of Justice was also looking at the deal, according to Politico. Nvidia also bought AI software groups Nebulon, OctoAI, Brev.dev, Shoreline.io and Deci. Collectively it has made more acquisitions in 2024 than the previous four years combined, according to Dealroom.
The company is investing widely, pouring millions of dollars into AI groups involved in medical technology, search engines, gaming, drones, chips, traffic management, logistics, data storage and generation, natural language processing and humanoid robots. Its portfolio includes a number of start-ups whose valuations have soared to billions of dollars. CoreWeave, an AI cloud computing service provider and significant purchaser of Nvidia chips, is preparing to float early this year at a valuation as high as $35bn — increasing from about $7bn a year ago.
Nvidia invested $100mn in CoreWeave in early 2023, and participated in a $1bn equity fundraising round by the company in May. Another start-up, Applied Digital, was facing a plunging share price in 2024, with revenue misses and considerable debt obligations, before a group of investors led by Nvidia provided $160mn of equity capital in September, prompting a 65% surge in its share price.
“Nvidia is using their massive market cap and huge cash flow to keep purchasers alive,” said Nate Koppikar, a short seller at Orso Partners. “If Applied Digital had died, that’s [a large volume] of sales that would have died with it.”
Neocloud groups such as CoreWeave, Crusoe and Lambda Labs have acquired tens of thousands of Nvidia’s high-performance GPUs that are crucial for developing generative AI models. Those Nvidia AI chips are now also being used as collateral for huge loans. The frenzied dealmaking has shone a light on a rampant GPU economy in Silicon Valley that is increasingly being supported by deep-pocketed financiers in New York. However, its rapid growth has raised concerns about the potential for more risky lending, circular financing and Nvidia’s chokehold on the AI market.
References:
https://www.ft.com/content/f8acce90-9c4d-4433-b189-e79cad29f74e
https://www.ft.com/content/41bfacb8-4d1e-4f25-bc60-75bf557f1f21
The case for and against AI-RAN technology using Nvidia or AMD GPUs
Nvidia is proposing a new approach to telco networks dubbed “AI radio access network (AI-RAN).” The GPU king says: “Traditional CPU or ASIC-based RAN systems are designed only for RAN use and cannot process AI traffic today. AI-RAN enables a common GPU-based infrastructure that can run both wireless and AI workloads concurrently, turning networks from single-purpose to multi-purpose infrastructures and turning sites from cost-centers to revenue sources. With a strategic investment in the right kind of technology, telcos can leap forward to become the AI grid that facilitates the creation, distribution, and consumption of AI across industries, consumers, and enterprises. This moment in time presents a massive opportunity for telcos to build a fabric for AI training (creation) and AI inferencing (distribution) by repurposing their central and distributed infrastructures.”
One of the first principles of AI-RAN technology is to be able to run RAN and AI workloads concurrently and without compromising carrier-grade performance. This multi-tenancy can be either in time or space: dividing the resources based on time of day or based on percentage of compute. This also implies the need for an orchestrator that can provision, de-provision, or shift workloads seamlessly based on available capacity.
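The two multi-tenancy modes described above (sharing by time of day or by percentage of compute) can be sketched in a few lines of Python. This is a toy illustration only: the class, function names, peak hours, headroom, and the 20% overnight RAN floor are all hypothetical choices, not part of Nvidia's or any operator's actual orchestrator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Percentage of one node's compute given to each workload type."""
    ran_pct: int
    ai_pct: int


def split_by_space(ran_demand_pct: int, headroom_pct: int = 10) -> Allocation:
    """Share by percentage of compute: RAN gets its demand plus headroom, AI the rest."""
    ran_pct = min(100, ran_demand_pct + headroom_pct)
    return Allocation(ran_pct, 100 - ran_pct)


def split_by_time(hour: int, peak_hours: range = range(7, 23)) -> Allocation:
    """Share by time of day: RAN owns the node in peak hours, AI gets most of it off-peak."""
    if hour in peak_hours:
        return Allocation(100, 0)
    return Allocation(20, 80)  # hypothetical off-peak floor for RAN


def plan_shift(current: Allocation, target: Allocation) -> str:
    """Describe the provisioning action needed to move between two allocations."""
    delta = target.ai_pct - current.ai_pct
    if delta > 0:
        return f"provision {delta}% more compute for AI"
    if delta < 0:
        return f"de-provision {-delta}% of AI compute for RAN"
    return "no change"


print(split_by_space(30))
print(split_by_time(3))
print(plan_shift(split_by_time(12), split_by_time(3)))
```

A real orchestrator would also have to protect carrier-grade RAN performance while workloads move, which is the hard part this sketch leaves out.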
ARC-1, an appliance Nvidia showed off earlier this year, comes with a Grace Blackwell “superchip” that would replace either a traditional vendor’s application-specific integrated circuit (ASIC) or an Intel processor. Ericsson and Nokia are exploring the possibilities with Nvidia. Developing RAN software for use with Nvidia’s chips means acquiring competency in compute unified device architecture (CUDA), Nvidia’s parallel computing platform and programming model. “They do have to reprofile into CUDA,” said Soma Velayutham, the general manager of Nvidia’s AI and telecom business, during a recent interview with Light Reading. “That is an effort.”
Proof of Concept:
SoftBank has turned the AI-RAN vision into reality, with its successful outdoor field trial in Fujisawa City, Kanagawa, Japan, where NVIDIA-accelerated hardware and NVIDIA Aerial software served as the technical foundation. That achievement marks multiple steps forward for AI-RAN commercialization and provides real proof points addressing industry requirements on technology feasibility, performance, and monetization:
- World’s first outdoor 5G AI-RAN field trial running on an NVIDIA-accelerated computing platform. This is an end-to-end solution based on a full-stack, virtual 5G RAN software integrated with 5G core.
- Carrier-grade virtual RAN performance achieved.
- AI and RAN multi-tenancy and orchestration achieved.
- Energy efficiency and economic benefits validated compared to existing benchmarks.
- A new solution to unlock AI marketplace integrated on an AI-RAN infrastructure.
- Real-world AI applications showcased, running on an AI-RAN network.
Above all, SoftBank aims to commercially release their own AI-RAN product for worldwide deployment in 2026. To help other mobile network operators get started on their AI-RAN journey now, SoftBank is also planning to offer a reference kit comprising the hardware and software elements required to trial AI-RAN in a fast and easy way.
SoftBank developed their AI-RAN solution by integrating hardware and software components from NVIDIA and ecosystem partners and hardening them to meet carrier-grade requirements. Together, the solution enables a full 5G vRAN stack that is 100% software-defined, running on NVIDIA GH200 (CPU+GPU), NVIDIA BlueField-3 (NIC/DPU), and Spectrum-X for fronthaul and backhaul networking. It integrates with 20 radio units and a 5G core network and connects 100 mobile UEs.
The core software stack includes the following components:
- SoftBank-developed and optimized 5G RAN Layer 1 functions such as channel mapping, channel estimation, modulation, and forward-error-correction, using NVIDIA Aerial CUDA-Accelerated-RAN libraries
- Fujitsu software for Layer 2 functions
- Red Hat’s OpenShift Container Platform (OCP) as the container virtualization layer, enabling different types of applications to run on the same underlying GPU computing infrastructure
- A SoftBank-developed E2E AI and RAN orchestrator, to enable seamless provisioning of RAN and AI workloads based on demand and available capacity
AI marketplace solution integrated with SoftBank AI-RAN. Image Credit: Nvidia
The underlying hardware is the NVIDIA GH200 Grace Hopper Superchip, which can be used in various configurations from distributed to centralized RAN scenarios. This implementation uses multiple GH200 servers in a single rack, serving AI and RAN workloads concurrently, for an aggregated-RAN scenario. This is comparable to deploying multiple traditional RAN base stations.
In this pilot, each GH200 server was able to process 20 5G cells using 100-MHz bandwidth, when used in RAN-only mode. For each cell, 1.3 Gbps of peak downlink performance was achieved in ideal conditions, and 816 Mbps was demonstrated with carrier-grade availability in the outdoor deployment.
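The per-cell figures above can be turned into a rough per-server number. The Python sketch below simply multiplies the per-cell rates by the 20 cells per server, which assumes all cells hit the quoted rate at the same time; the article does not claim that, so treat the aggregates as upper bounds. Variable names are illustrative.

```python
CELLS_PER_SERVER = 20
PEAK_MBPS_PER_CELL = 1300      # 1.3 Gbps in ideal conditions
OUTDOOR_MBPS_PER_CELL = 816    # demonstrated outdoors with carrier-grade availability

# Upper-bound aggregate downlink per GH200 server, in Gbps.
peak_aggregate_gbps = CELLS_PER_SERVER * PEAK_MBPS_PER_CELL / 1000
outdoor_aggregate_gbps = CELLS_PER_SERVER * OUTDOOR_MBPS_PER_CELL / 1000

# Fraction of the ideal per-cell peak that was reached outdoors.
outdoor_vs_peak = OUTDOOR_MBPS_PER_CELL / PEAK_MBPS_PER_CELL

print(f"Peak aggregate: {peak_aggregate_gbps:.2f} Gbps per server")
print(f"Outdoor aggregate: {outdoor_aggregate_gbps:.2f} Gbps per server")
print(f"Outdoor rate is {outdoor_vs_peak:.0%} of the ideal peak")
```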
……………………………………………………………………………………………………………………………………..
Could AMD GPUs be an alternative to Nvidia AI-RAN?
AMD is certainly valued by NScale, a UK business with a GPU-as-a-service offer, as an AI alternative to Nvidia. “AMD’s approach is quite interesting,” said David Power, NScale’s chief technology officer. “They have a very open software ecosystem. They integrate very well with common frameworks.” So far, though, AMD has said nothing publicly about any AI-RAN strategy.
The other telco concern is about those promised revenues. Nvidia insists it was conservative when estimating that a telco could realize $5 in inferencing revenues for every $1 invested in AI-RAN. But the numbers met with a fair degree of skepticism in the wider market. Nvidia says the advantage of doing AI inferencing at the edge is that latency, the time a signal takes to travel around the network, would be much lower compared with inferencing in the cloud. But the same case was previously made for hosting other applications at the edge, and they have not taken off.
Even if AI changes that, it is unclear telcos would stand to benefit. Sales generated by the applications available on the mobile Internet have gone largely to hyperscalers and other software developers, leaving telcos with a dwindling stream of connectivity revenues. Expect AI-RAN to be a big topic for 2025 as operators carefully weigh their options. Many telcos are unconvinced there is a valid economic case for AI-RAN, especially since GPUs consume a lot of power (they are perceived as “energy hogs”).
References:
AI-RAN Goes Live and Unlocks a New AI Opportunity for Telcos
https://www.lightreading.com/ai-machine-learning/2025-preview-ai-ran-would-be-a-paradigm-shift
Nvidia bid to reshape 5G needs Ericsson and Nokia buy-in
Softbank goes radio gaga about Nvidia in nervy days for Ericsson
T-Mobile emerging as Nvidia’s big AI cheerleader
AI cloud start-up Vultr valued at $3.5B; Hyperscalers gorge on Nvidia GPUs while AI semiconductor market booms
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
Nvidia enters Data Center Ethernet market with its Spectrum-X networking platform
FT: New benchmarks for Gen AI models; Neocloud groups leverage Nvidia chips to borrow >$11B
Will AI clusters be interconnected via Infiniband or Ethernet: NVIDIA doesn’t care, but Broadcom sure does!
AI cloud start-up Vultr valued at $3.5B; Hyperscalers gorge on Nvidia GPUs while AI semiconductor market booms
Over the past two years, AI model builders OpenAI, Anthropic and Elon Musk’s xAI have raised nearly $40bn between them. Other sizeable investment rounds this week alone included $500mn for Perplexity, an AI-powered search engine, and $333mn for Vultr, part of a new band of companies running specialized cloud data centers to support AI.
Cloud AI startup Vultr raised $333 million in a financing round this week from Advanced Micro Devices (AMD) and hedge fund LuminArx Capital Management. That’s a sign of the super hot demand for AI infrastructure. West Palm Beach, Fla.-based Vultr said it is now valued at $3.5 billion and plans to use the financing to acquire more graphics processing units (GPUs), which process AI models. The funding is Vultr’s first injection of outside capital, and the valuation is unusually high for a company that had not previously raised external equity capital: the average valuation for companies receiving first-time financing is $51mn, according to PitchBook.
Vultr said its AI cloud service, in which it leases GPU access to customers, will soon become the biggest part of its business. Earlier this month, Vultr announced plans to build its first “super-compute” cluster with thousands of AMD GPUs at its Chicago-area data center. Vultr said its cloud platform serves hundreds of thousands of businesses, including Activision Blizzard, the Microsoft-owned videogame company, and Indian telecommunications giant Bharti Airtel. Vultr’s customers also use its decade-old cloud platform for their core IT systems, said Chief Executive J.J. Kardwell. Like most cloud platform providers, Vultr isn’t using just one GPU supplier. It offers Nvidia and AMD GPUs to customers, and plans to keep doing so, Kardwell said. “There are different parts of the market that value each of them,” he added.
Vultr’s plan to expand its network of data centers, currently in 32 locations, is a bet that customers will seek greater proximity to their computing infrastructure as they move from training to “inference” — industry parlance for using models to perform calculations and make decisions.
Vultr runs a cloud computing platform on which customers can run applications and store data remotely © Vultr
……………………………………………………………………………………………………………………………………………………………………………………………..
The 10 biggest cloud companies — dubbed hyperscalers — are on track to allocate $326bn to capital expenditure in 2025, according to analysts at Morgan Stanley. While most depend heavily on chips made by Nvidia, large companies including Google, Amazon and Facebook are designing their own customized silicon to perform specialized tasks. Away from the tech mega-caps, emerging “neo-cloud” companies such as Vultr, CoreWeave, Lambda Labs and Nebius have raised billions of dollars of debt and equity in the past year in a bet on the expanding power and computing needs of AI models.
AI chip market leader Nvidia, alongside other investors, provided more than $400 million to AI cloud provider CoreWeave [1.] in 2023. CoreWeave last year also secured $2.3 billion in debt financing by using its Nvidia GPUs as collateral.
Note 1. CoreWeave is a New Jersey-based company that got its start in cryptocurrency mining.
The race to train sophisticated AI models has inspired the commissioning of increasingly large “supercomputers” (aka AI Clusters) that link up hundreds of thousands of high-performance GPU chips. Elon Musk’s start-up xAI built its Colossus supercomputer in just three months and has pledged to increase it tenfold. Meanwhile, Amazon is building a GPU cluster alongside Anthropic, developer of the Claude AI models. The ecommerce group has invested $8bn in Anthropic.
Hyperscalers are big buyers of Nvidia GPUs:
Analysts at market research firm Omdia (an Informa company) estimate that Microsoft bought 485,000 of Nvidia’s “Hopper” chips this year. With demand outstripping supply of Nvidia’s most advanced graphics processing units for much of the past two years, Microsoft’s chip hoard has given it an edge in the race to build the next generation of AI systems.
This year, Big Tech companies have spent tens of billions of dollars on data centers running Nvidia’s latest GPU chips, which have become the hottest commodity in Silicon Valley since the debut of ChatGPT two years ago kick-started an unprecedented surge of investment in AI.
- Microsoft’s Azure cloud infrastructure was used to train OpenAI’s latest o1 model, as they race against a resurgent Google, start-ups such as Anthropic and Elon Musk’s xAI, and rivals in China for dominance of the next generation of computing.
- Omdia estimates that ByteDance and Tencent each ordered about 230,000 of Nvidia’s chips this year, including the H20 model, a less powerful version of Hopper that was modified to meet U.S. export controls for Chinese customers.
- Meta bought 224,000 Hopper chips.
- Amazon and Google, which along with Meta are stepping up deployment of their own custom AI chips as an alternative to Nvidia’s, bought 196,000 and 169,000 Hopper chips, respectively, the analysts said. Omdia analyses companies’ publicly disclosed capital spending, server shipments and supply chain intelligence to calculate its estimates.
The top 10 buyers of data center infrastructure, which now include relative newcomers xAI and CoreWeave, make up 60% of global investment in computing power. Vlad Galabov, director of cloud and data center research at Omdia, said some 43% of spending on compute servers went to Nvidia in 2024. “Nvidia GPUs claimed a tremendously high share of the server capex,” he said.
What’s telling is that the biggest buyers of Nvidia GPUs are the hyperscalers who design their own compute servers and outsource the detailed implementation and manufacturing to Taiwan and China ODMs! U.S. compute server makers Dell and HPE are not even in the ball park!
What about #2 GPU maker AMD?
“For AMD to be able to get good billing with an up-and-coming cloud provider like Vultr will help them get more visibility in the market,” said Dave McCarthy, a research vice president in cloud and edge services at research firm International Data Corp (IDC). AMD has also invested in cloud providers such as TensorWave, which also offers an AI cloud service. In August, AMD bought the data-center equipment designer ZT Systems for nearly $5 billion. Microsoft, Meta Platforms and Oracle have said they use AMD’s GPUs. A spokesperson for Amazon’s cloud unit said the company works closely with AMD and is “actively looking at offering AMD’s AI chips.”
Promising AI Chip Startups:
Nuvia: Founded by former Apple engineers, Nuvia is focused on creating high-performance processors tailored for AI workloads. Their chips are designed to deliver superior performance while maintaining energy efficiency, making them ideal for data centers and edge computing.
SambaNova Systems: This startup is revolutionizing AI with its DataScale platform, which integrates hardware and software to optimize AI workloads. Their unique architecture allows for faster training and inference, catering to enterprises looking to leverage AI for business intelligence.
Graphcore: Known for its Intelligence Processing Unit (IPU), Graphcore is making waves in the AI chip market. The IPU is designed specifically for machine learning tasks, providing significant speed and efficiency improvements over traditional GPUs.
Market for AI semiconductors:
- IDC estimates it will reach $193.3 billion by the end of 2027 from an estimated $117.5 billion this year. Nvidia commands about 95% of the market for AI chips, according to IDC.
- Bank of America analysts forecast the market for AI chips will be worth $276 billion by 2027.
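To put the IDC forecast in perspective, the two figures can be turned into an implied compound annual growth rate. The sketch below is only illustrative: it assumes "this year" means 2024, so the forecast spans three years, and the function name `cagr` is ours rather than anything IDC publishes. The Bank of America figure is left out because no base-year value is quoted for it.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over `years` years."""
    return (end_value / start_value) ** (1 / years) - 1


# IDC figures quoted above, in billions of US dollars.
# Assumption: "this year" is 2024, so 2024 -> 2027 is a three-year span.
idc_2024 = 117.5
idc_2027 = 193.3

rate = cagr(idc_2024, idc_2027, years=3)
print(f"Implied IDC growth rate: {rate:.1%} per year")
```

Under that assumption the IDC numbers imply growth of roughly 18% a year.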
References:
https://www.ft.com/content/946069f6-e03b-44ff-816a-5e2c778c67db
https://www.restack.io/p/ai-chips-answer-top-ai-chip-startups-2024-cat-ai
Lumen Technologies to connect Prometheus Hyperscale’s energy efficient AI data centers
The need for more cloud computing capacity and AI applications has been driving huge investments in data centers. Those investments have led to a steady demand for fiber capacity between data centers and more optical networking innovation inside data centers. Here’s the latest example of that:
Prometheus Hyperscale has chosen Lumen Technologies to connect its energy-efficient data centers to meet growing AI data demands. Lumen network services will help Prometheus with the rapid growth in AI, big data, and cloud computing as they address the critical environmental challenges faced by the AI industry.
Rendering of Prometheus Hyperscale flagship Data Center in Evanston, Wyoming:
……………………………………………………………………………….
Prometheus Hyperscale, known for pioneering sustainability in the hyperscale data center industry, is deploying a Lumen Private Connectivity Fabric℠ solution, including new network routes built with Lumen next generation wavelength services and Dedicated Internet Access (DIA) [1.] services with Distributed Denial of Service (DDoS) protection layered on top.
Note 1. Dedicated Internet Access (DIA) is a premium internet service that provides a business with a private, high-speed connection to the internet.
This expanded network will enable high-density compute in Prometheus facilities to deliver scalable and efficient data center solutions while maintaining their commitment to renewable energy and carbon neutrality. Lumen networking technology will provide the low-latency, high-performance infrastructure critical to meet the demands of AI workloads, from training to inference, across Prometheus’ flagship facility in Wyoming and four future data centers in the western U.S.
“What Prometheus Hyperscale is doing in the data center industry is unique and innovative, and we want to innovate alongside of them,” said Ashley Haynes-Gaspar, Lumen EVP and chief revenue officer. “We’re proud to partner with Prometheus Hyperscale in supporting the next generation of sustainable AI infrastructure. Our Private Connectivity Fabric solution was designed with scalability and security to drive AI innovation while aligning with Prometheus’ ambitious sustainability goals.”
Prometheus, founded as Wyoming Hyperscale in 2020, turned to Lumen networking solutions prior to the launch of its first development site in Aspen, WY. This facility integrates renewable energy sources, sustainable cooling systems, and AI-driven energy optimization, allowing for minimal environmental impact while delivering the computational power AI-driven enterprises demand. The partnership with Lumen reinforces Prometheus’ dedication to both technological innovation and environmental responsibility.
“AI is reshaping industries, but it must be done responsibly,” said Trevor Neilson, president of Prometheus Hyperscale. “By joining forces with Lumen, we’re able to offer our customers best-in-class connectivity to AI workloads while staying true to our mission of building the most sustainable data centers on the planet. Lumen’s network expertise is the perfect complement to our vision.”
Prometheus’ data center campus in Evanston, Wyoming will be one of the biggest data centers in the world with facilities expected to come online in late 2026. Future data centers in Pueblo, Colorado; Fort Morgan, Colorado; Phoenix, Arizona; and Tucson, Arizona, will follow and be strategically designed to leverage clean energy resources and innovative technology.
About Prometheus Hyperscale:
Prometheus Hyperscale, founded by Trenton Thornock, is revolutionizing data center infrastructure by developing sustainable, energy-efficient hyperscale data centers. Leveraging unique, cutting-edge technology and working alongside strategic partners, Prometheus is building next-generation, liquid-cooled hyperscale data centers powered by cleaner energy. With a focus on innovation, scalability, and environmental stewardship, Prometheus Hyperscale is redefining the data center industry for a sustainable future. This announcement follows recent news of Bernard Looney, former CEO of bp, being appointed Chairman of the Board.
To learn more visit: www.prometheushyperscale.com
About Lumen Technologies:
Lumen uses the scale of its network to help companies realize AI’s full potential. From metro connectivity to long-haul data transport to edge cloud, security, managed service, and digital platform capabilities, Lumen meets its customers’ needs today and is ready for tomorrow’s requirements.
In October, Lumen CTO Dave Ward told Light Reading that a “fundamentally different order of magnitude” of compute power, graphics processing units (GPUs) and bandwidth is required to support AI workloads. “It is the largest expansion of the Internet in our lifetime,” Ward said.
Lumen is constructing 130,000 fiber route miles to support Meta and other customers seeking to interconnect AI-enabled data centers. According to a story by Kelsey Ziser, the fiber conduits in this buildout would contain anywhere from 144 to more than 500 fibers to connect multi-gigawatt data centers.
REFERENCES:
https://www.lightreading.com/data-centers/2024-in-review-data-center-shifts
Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
Initiatives and Analysis: Nokia focuses on data centers as its top growth market
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
Deutsche Telekom with AWS and VMware demonstrate a global enterprise network for seamless connectivity across geographically distributed data centers
Will billions of dollars big tech is spending on Gen AI data centers produce a decent ROI?
One of the big tech themes in 2024 was the buildout of data center infrastructure to support generative (Gen) artificial intelligence (AI) compute servers. Gen AI requires massive computational power, which only huge, powerful data centers can provide. Big tech companies like Amazon (AWS), Microsoft (Azure), Google (Google Cloud), Meta (Facebook) and others are building or upgrading their data centers to provide the infrastructure necessary for training and deploying AI models. These investments include high-performance GPUs, specialized hardware, and cutting-edge network infrastructure.
- Barron’s reports that big tech companies are spending billions on that initiative. In the first nine months of 2024, Amazon, Microsoft, and Alphabet spent a combined $133 billion building AI capacity, up 57% from the previous year, according to Barron’s. Much of the spending accrued to Nvidia, whose data center revenue reached $80 billion over the past three quarters, up 174%. The infrastructure buildout will surely continue in 2025, but tough questions from investors about return on investment (ROI) and productivity gains will take center stage from here.
- Amazon, Google, Meta and Microsoft expanded such investments by 81% year over year during the third quarter of 2024, according to an analysis by the Dell’Oro Group, and are on track to have spent $180 billion on data centers and related costs by the end of the year. The three largest public cloud providers, Amazon Web Services (AWS), Azure and Google Cloud, each had a spike in their investment in AI during the third quarter of this year. Baron Fung, a senior director at Dell’Oro Group, told Newsweek: “We think spending on AI infrastructure will remain elevated compared to other areas over the long-term. These cloud providers are spending many billions to build larger and more numerous AI clusters. The larger the AI cluster, the more complex and sophisticated AI models that can be trained. Applications such as Copilot, chatbots, search, will be more targeted to each user and application, ultimately delivering more value to users and how much end-users will pay for such a service,” Fung added.
- Efficient and scalable data centers can lower operational costs over time. Big tech companies could offer AI cloud services at scale, which might result in recurring revenue streams. For example, AI infrastructure-as-a-service (IaaS) could be a substantial revenue driver in the future, but no one really knows when that might be.
Microsoft has a long history of pushing new software and services products to its large customer base. In fact, that greatly contributed to the success of its Azure cloud computing and storage services. The centerpiece of Microsoft’s AI strategy is getting many of those customers to pay for Microsoft 365 Copilot, an AI assistant for its popular apps like Word, Excel, and PowerPoint. Copilot costs $360 a year per user, and that’s on top of all the other software, which costs anywhere from $72 to $657 a year. Microsoft’s AI doesn’t come cheap. Alistair Speirs, senior director of Microsoft Azure Global Infrastructure, told Newsweek: “Microsoft’s datacenter construction has been accelerating for the past few years, and that growth is guided by the growing demand signals that we are seeing from customers for our cloud and AI offerings. As we grow our infrastructure to meet the increasing demand for our cloud and AI services, we do so with a holistic approach, grounded in the principle of being a good neighbor in the communities in which we operate.”
Venture capitalist David Cahn of Sequoia Capital estimates that for AI to be profitable, every dollar invested in infrastructure needs four dollars in revenue. Those profits aren’t likely to come in 2025, but the companies involved (and their investors) will no doubt want to see signs of progress. One issue they will have to deal with is the popularity of free AI, which doesn’t generate any revenue by itself.
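As a rough illustration of what Cahn's rule of thumb implies, the sketch below applies the four-to-one ratio to the Dell'Oro spending estimate cited earlier. This pairing is our assumption, not Cahn's or Dell'Oro's: the $180 billion covers data centers and related costs for four companies, not AI alone. The Copilot comparison is a hypothetical yardstick for scale, not a forecast.

```python
# Figures quoted in the article, in billions of US dollars.
capex_2024_bn = 180.0        # Dell'Oro estimate: Amazon, Google, Meta and Microsoft
revenue_per_dollar = 4.0     # Cahn's estimate of revenue needed per dollar invested

required_revenue_bn = capex_2024_bn * revenue_per_dollar

# Hypothetical yardstick: annual subscriptions at the quoted Microsoft 365
# Copilot price that would add up to the same revenue.
copilot_price_per_year = 360.0  # US dollars per user per year
seats_needed = required_revenue_bn * 1e9 / copilot_price_per_year

print(f"Revenue implied by the 4:1 rule: ${required_revenue_bn:,.0f} billion")
print(f"Equivalent Copilot-priced subscriptions: {seats_needed / 1e9:.1f} billion user-years")
```

On these assumptions the rule implies about $720 billion in revenue, which helps explain why investors are pressing for evidence of returns.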
An August 2024 survey of over 4,600 adult Americans from researchers at the Federal Reserve Bank of St. Louis, Vanderbilt University, and Harvard University showed that 32% of respondents had used AI in the previous week, a faster adoption rate than either the PC or the internet. When asked what services they used, free options like OpenAI’s ChatGPT, Google’s Gemini, Meta Platforms’ Meta AI, and Microsoft’s Windows Copilot were cited most often. Unlike Microsoft 365 Copilot, the versions of Copilot built into Windows and Bing are free.
The unsurprising popularity of free AI services creates a dilemma for tech firms. It’s expensive to run AI in the cloud at scale, and as of now there’s no revenue behind it. The history of the internet suggests that these free services will be monetized through advertising, an arena where Google, Meta, and Microsoft have a great deal of experience. Investors should expect at least one of these services to begin serving ads in 2025, with the others following suit. The better AI gets—and the more utility it provides—the more likely consumers will go along with those ads.
Productivity Check:
We’re at the point in AI’s rollout where novelty needs to be replaced by usefulness—and investors will soon be looking for signs that AI is delivering productivity gains to business. Here we can turn to macroeconomic data for answers. According to the U.S. Bureau of Labor Statistics, since the release of ChatGPT in November 2022, labor productivity has risen at an annualized rate of 2.3% versus the historical median of 2.0%. It’s too soon to credit AI for those gains, but if above-median productivity growth continues into 2025, the conversation gets more interesting.
There’s also the continued question of AI and jobs, a fraught conversation that isn’t going to get any easier. There may already be AI-related job loss happening in the information sector, home to media, software, and IT. Since the release of ChatGPT, employment is down 3.9% in the sector, even as U.S. payrolls overall have grown by 3.3%. The other jobs most at risk are in professional and business services and in the financial sector. To be sure, the history of technological change is always complicated. AI might take away jobs, but it’s sure to add some, too.
“Some jobs will likely be automated. But at the same time, we could see new opportunities in areas requiring creativity, judgment, or decision-making,” economists Alexander Bick of the Federal Reserve Bank of St. Louis and Adam Blandin of Vanderbilt University tell Barron’s. “Historically, every big tech shift has created new types of work we couldn’t have imagined before.”
Closing Quote:
“Generative AI (GenAI) is being felt across all technology segments and subsegments, but not to everyone’s benefit,” said John-David Lovelock, Distinguished VP Analyst at Gartner. “Some software spending increases are attributable to GenAI, but to a software company, GenAI most closely resembles a tax. Revenue gains from the sale of GenAI add-ons or tokens flow back to their AI model provider partner.”
References:
AI Stocks Face a New Test. Here Are the 3 Big Questions Hanging Over Tech in 2025
Big Tech Increases Spending on Infrastructure Amid AI Boom – Newsweek
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
Ciena CEO sees huge increase in AI generated network traffic growth while others expect a slowdown
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
SK Telecom unveils plans for AI Infrastructure at SK AI Summit 2024
Huawei’s “FOUR NEW strategy” for carriers to be successful in AI era
Initiatives and Analysis: Nokia focuses on data centers as its top growth market
India Mobile Congress 2024 dominated by AI with over 750 use cases
Reuters & Bloomberg: OpenAI to design “inference AI” chip with Broadcom and TSMC
Ciena CEO sees huge increase in AI generated network traffic growth while others expect a slowdown
Today, Ciena reported fourth-quarter revenue of $1.12 billion, above analyst expectations of around $1.103 billion. Orders were once again ahead of revenue, even though the company had expected orders to be below revenue just a few months ago. A closer look at key metrics reveals mixed results, with some segments like Software and Services showing strong growth (+20.6% year-over-year) and others like Routing and Switching experiencing significant declines (-38.4% year-over-year).
Ciena CEO Gary Smith pointed to increased demand for the company’s Reconfigurable Line Systems (RLS), primarily from large cloud providers. He said the company was also doing well selling its WaveLogic coherent optical pluggables, which optimize performance in data centers as they support traffic from AI and machine learning.
Ciena’s Managed Optical Fiber Networks (MOFN) technology is designed for global service providers that are building dedicated private optical networks for cloud providers. MOFN came about a few years ago when cloud providers wanted to enter countries where they weren’t allowed to build their own fiber networks. “They had to go with the incumbent carrier, but they wanted to have control of their network within country. It was sort of a niche-type play. But we’ve seen more recently, over the last 6-9 months, that model being more widely adopted,” Smith said. MOFN is becoming more widely utilized, and the good news for Ciena is that cloud providers often request that Ciena equipment be used so that it matches with the rest of their network, according to Smith.
Image Credit: Midjourney for Fierce Network
…………………………………………………………………………………………………………………………..
The company also said it now expects average annual revenue growth of approximately 8% to 11% over the next three years. “Our business is linked heavily into the growth of bandwidth around the world,” CEO Gary Smith said after Ciena’s earnings call. “Traffic growth has been between 20% and 40% per year very consistently for the last two decades,” Smith told Light Reading.
Ciena believes huge investments in data centers with AI compute servers will ultimately result in more traffic traveling over U.S. and international broadband networks. “It has to come out of the data center and onto the network,” Smith said of AI data. “Now, quite where it ends up being, who can know. As an exact percentage, a lot of people are working through that, including the cloud guys,” he said about the data traffic growth rate over the next few years. “But one would expect [AI data] to layer on top of that 30% growth, is the point I’m making,” he added.
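The growth rates Smith cites compound quickly, which is easy to underestimate. The short sketch below shows how long traffic takes to double at 20%, 30% and 40% annual growth, and how much it multiplies over two decades. It is a back-of-the-envelope illustration that assumes a constant growth rate every year; the function names are ours.

```python
import math


def doubling_time_years(annual_growth: float) -> float:
    """Years needed for traffic to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth)


def growth_multiple(annual_growth: float, years: int) -> float:
    """Total multiplication of traffic after `years` of constant growth."""
    return (1 + annual_growth) ** years


# The 20% to 40% range and the 30% midpoint come from Smith's remarks.
for rate in (0.20, 0.30, 0.40):
    print(
        f"{rate:.0%} per year: doubles every {doubling_time_years(rate):.1f} years, "
        f"x{growth_multiple(rate, 20):,.0f} over 20 years"
    )
```

At the 30% midpoint, traffic doubles roughly every two and a half years, so any AI traffic layered on top shortens that interval further.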
AI comes at a fortuitous time for Ciena. “You’re having to connect these GPU clusters over greater distances. We’re beginning to see general, broader traffic growth in things like inference and training. And that’s going to obviously drive our business, which is why we’re forecasting greater than normal growth,” Smith said.
Smith’s positive comments on AI traffic are noteworthy in light of some data points showing a slowdown in the rate of growth in data traffic on global networks. For example:
- OpenVault recently reported that monthly average broadband data consumption in the third quarter inched up 7.2%, the lowest rate of growth seen since the company began reporting these trends in 2012.
- In Ericsson’s newest report, Fredrik Jejdling, EVP and head of business area networks, said: “We see continued mobile network traffic growth but at a slower rate.”
- Some of the nation’s biggest Content Delivery Network (CDN) providers, including Akamai, Fastly and Edgio, are struggling to come to terms with a historic slowdown in Internet traffic growth. Such companies operate the networks that convey video and other digital content online.
- “In terms of traffic growth, it is growing very slowly – at rates that we haven’t seen in the 25-plus years we’ve been in this business. So it’s growing very, very slow,” Akamai CFO Ed McGowan said recently. “It’s just been a weak traffic environment.”
“The cloud providers themselves are building bigger infrastructure and networks, and laying track for even greater growth in the future as more and more of that AI traffic comes out of the data center,” Smith said. “So that’s why we’re predicting greater growth than normal over the next three years. It’s early days for that traffic coming out of the data center, but I think we’re seeing clear evidence around it. So you’re looking at an enormous step function in traffic flows over the next few years,” he concluded.
References:
https://www.lightreading.com/data-centers/ciena-ceo-prepare-for-the-ai-traffic-wave
https://www.fierce-network.com/broadband/cienas-ceo-says-companys-growth-linked-ai
Summit Broadband deploys 400G using Ciena’s WaveLogic 5 Extreme
DriveNets and Ciena Complete Joint Testing of 400G ZR/ZR+ optics for Network Cloud Platform
Telco spending on RAN infrastructure continues to decline as does mobile traffic growth
Analysys Mason & Light Reading: cellular data traffic growth rates are decreasing
TechCrunch: Meta to build $10 billion Subsea Cable to manage its global data traffic
Initiatives and Analysis: Nokia focuses on data centers as its top growth market
China Telecom’s 2025 priorities: cloud based AI smartphones (?), 5G new calling (GSMA), and satellite-to-phone services
At the 2024 Digital Technology Ecosystem Conference last week, China Telecom executives identified AI, 5G new calling and satellite-to-phone services as its handset priorities for 2025. The state-owned network operator, like other China telcos, is working with local manufacturers to build the devices it wants to sell through its channels.
China Telecom’s smartphone priorities align with its major corporate objectives. As China Telecom vice president Li Jun explained, devices are critical right across the business. “Terminals are an extension of the cloud network, a carrier of services, and a user interface,” he said.
China Telecom Vice President Li Jun
………………………………………………………………………………………………………………………………………………………………………………………………………………
China Telecom Deputy General Manager Tang Ke introduced the progress of China Telecom and its partners in AI and emerging terminal ecosystem cooperation. He stated that in 2024, China Telecom will achieve large-scale development of basic 5G services, with over 120 million new self-registered users annually and more than 140 models of phones supporting 5G messaging.
In terms of emerging businesses, leading domestic smartphone brands fully support direct satellite connection, with 20 models available and over 4 million units activated. Leading PC brands fully integrate Tianyi Cloud Computer, further enriching applications in work, education, life, and entertainment scenarios. Domestic phones fully support quantum secure calls, with over 50 million new self-registered users. Terminals fully support the upgrade to 800M, reaching over 100 million users. Besides phones that support direct-to-cell calling, China Telecom also hopes to develop low-cost positioning technology using Beidou and 5G location capabilities.
China Telecom continues to promote comprehensive AI upgrades of terminals, collaborating with partners to expand AI terminal categories and provide users with more diverse choices and higher-quality experiences. Tang Ke revealed that, at the main forum of the “2024 Digital Technology Ecosystem Conference,” China Telecom will release its first operator-customized AI phone.
Tang Ke emphasized that in the AI era, jointly building a collaborative and mutually promoting AI terminal ecosystem has become the inevitable path of industry development. Ecosystem participants must closely coordinate in technology, industry, and business to offer users the best AI experience. China Telecom will comprehensively advance technical collaboration, accelerating coordination from levels such as chips, large models, and intelligent agents, and promoting the construction of AI technology frameworks from both the device and cloud sides. The company will comprehensively push terminal AI upgrades, accelerating the AI development of wearables, healthcare, education, innovation, and industry terminals, based on key categories such as smartphones, cameras, cloud computers, and smart speakers.
Deputy Marketing Director Shao Yantao laid out the company’s device strategy for the year ahead. He said China Telecom’s business was based on networks, cloud-network integration and quantum security, with a focus on three technology directions: AI, 5G and satellites. With AI, it aims to carry out joint product development with OEM partners to build device-cloud capabilities and establish AI models. The state-owned telco will pursue “domestic and foreign” projects in cloud-based AI mobile phones.
Besides smartphones, other AI-powered products next year would likely include door locks, speakers, glasses and watches, Shao said. The other big focus area is 5G new calling, based on new IMS DC (data channel) capabilities, with the aim of opening up new use cases like screen sharing and interactive games during a voice call.
China Telecom would develop its own open-source IMS DC SDK to support AI, payments, XR and other new functionality, Shao said. But he acknowledged the need to build cooperation across the industry ecosystem. The network operator and its partners would also collaborate on Voice over WiFi and 3CC carrier aggregation for 5G-Advanced devices, he added.
……………………………………………………………………………………………………………………………………………………………………………………………..
China’s Ministry of Industry and Information Technology (MIIT) claims that China has built and activated over 4.1 million 5G base stations, with the 5G network continuously extending into rural areas, achieving “5G coverage in every township.” 5G has been integrated into 80 major categories of the national economy, with over 100,000 application cases accumulated. The breadth and depth of applications continue to expand, profoundly transforming lifestyles, production methods, and governance models.
The meeting emphasized the need to leverage the implementation of the “Sailing” Action Upgrade Plan for Large-scale 5G Applications as a means to vigorously promote the large-scale development of 5G applications, supporting new types of industrialization and the modernization of the information and communications industry, thereby laying a solid foundation for building a strong network nation and advancing Chinese-style modernization.
References:
https://www.c114.com.cn/news/22/c23811.html
https://en.c114.com.cn/583/a1279613.html
https://en.c114.com.cn/583/a1279469.html
China Telecom and China Mobile invest in LEO satellite companies
China Telecom with ZTE demo single-wavelength 1.2T bps hollow-core fiber transmission system over 100T bps
ZTE and China Telecom unveil 5G-Advanced solution for B2B and B2C services
Superclusters of Nvidia GPU/AI chips combined with end-to-end network platforms to create next generation data centers
Meta Platforms and Elon Musk’s xAI start-up are among companies building clusters of computer servers with as many as 100,000 of Nvidia’s most advanced GPU chips as the race for artificial-intelligence (AI) supremacy accelerates.
- Meta Chief Executive Mark Zuckerberg said last month that his company was already training its most advanced AI models with a conglomeration of chips he called “bigger than anything I’ve seen reported for what others are doing.”
- xAI built a supercomputer called Colossus—with 100,000 of Nvidia’s Hopper GPU/AI chips—in Memphis, TN in a matter of months.
- OpenAI and Microsoft have been working to build up significant new computing facilities for AI. Google is building massive data centers to house chips that drive its AI strategy.
xAI built a supercomputer in Memphis that it calls Colossus, with 100,000 Nvidia AI chips. Photo: Karen Pulfer Focht/Reuters
A year ago, clusters of tens of thousands of GPU chips were seen as very large. OpenAI used around 10,000 of Nvidia’s chips to train the version of ChatGPT it launched in late 2022, UBS analysts estimate. Installing many GPUs in one location, linked together by superfast networking equipment and cables, has so far produced larger AI models at faster rates. But there are questions about whether ever-bigger super clusters will continue to translate into smarter chatbots and more convincing image-generation tools.
Nvidia Chief Executive Jensen Huang said that while the biggest clusters for training giant AI models now top out at around 100,000 of Nvidia’s current chips, “the next generation starts at around 100,000 Blackwells. And so that gives you a sense of where the industry is moving. Do we think that we need millions of GPUs? No doubt. That is a certainty now. And the question is how do we architect it from a data center perspective,” Huang added.
“There is no evidence that this will scale to a million chips and a $100 billion system, but there is the observation that they have scaled extremely well all the way from just dozens of chips to 100,000,” said Dylan Patel, the chief analyst at SemiAnalysis, a market research firm.
Giant super clusters are already being built. Musk posted last month on his social-media platform X that his 100,000-chip Colossus super cluster was “soon to become” a 200,000-chip cluster in a single building. He also posted in June that the next step would probably be a 300,000-chip cluster of Nvidia’s newest GPU chips next summer. The rise of super clusters comes as their operators prepare for Nvidia’s next-generation Blackwell chips, which are set to start shipping in the next couple of months. Blackwell chips are estimated to cost around $30,000 each, meaning a cluster of 100,000 would cost $3 billion, not counting the price of the power-generation infrastructure and IT equipment around the chips.
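The chip-cost arithmetic above can be checked with a short Python sketch. It uses only the roughly $30,000 per-chip estimate quoted in the article; the function name `cluster_chip_cost` is hypothetical, and applying the same price to the 200,000- and 300,000-chip cluster sizes Musk mentioned is an assumption made purely for illustration, since those clusters may mix chip generations and prices.

```python
# Illustrative arithmetic only. The per-chip price is the article's estimate
# for a Blackwell chip, not a confirmed list price.
CHIP_PRICE_USD = 30_000


def cluster_chip_cost(num_chips: int, price_per_chip: int = CHIP_PRICE_USD) -> int:
    """Return the chip-only cost in US dollars for a cluster of num_chips GPUs.

    Power-generation infrastructure and surrounding IT equipment are excluded,
    as in the article's estimate.
    """
    return num_chips * price_per_chip


# 100,000 is the cluster size priced in the article; the two larger sizes
# reuse the same price as an assumption.
for chips in (100_000, 200_000, 300_000):
    cost = cluster_chip_cost(chips)
    print(f"{chips:,} chips -> ${cost / 1e9:.1f} billion (chips only)")
```

Even under this simplified assumption, chip spending alone grows linearly with cluster size, which is why the surrounding power and networking costs matter so much to the return-on-investment question.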
Those dollar figures make building up super clusters with ever more chips something of a gamble, industry insiders say, given that it isn’t clear that they will improve AI models to a degree that justifies their cost. Indeed, new engineering challenges also often arise with larger clusters:
- Meta researchers said in a July paper that a cluster of more than 16,000 of Nvidia’s GPUs routinely suffered unexpected failures of chips and other components as the company trained an advanced version of its Llama model over 54 days.
- Keeping Nvidia’s chips cool is a major challenge as clusters of power-hungry chips become packed more closely together, industry executives say, part of the reason there is a shift toward liquid cooling where refrigerant is piped directly to chips to keep them from overheating.
- The sheer size of the super clusters requires a stepped-up level of management of those chips when they fail. Mark Adams, chief executive of Penguin Solutions, a company that helps set up and operate computing infrastructure, said elevated complexity in running large clusters of chips inevitably throws up problems.
The continuation of the AI boom for Nvidia largely depends on whether the largest clusters of GPU chips deliver a return on investment for its customers. The trend also fosters demand for Nvidia’s networking equipment, which is fast becoming a significant business. Nvidia’s networking equipment revenue in 2024 was $3.13 billion, which was a 51.8% increase from the previous year. Nvidia offers these networking platforms, most of which stem from its Mellanox acquisition:
- Accelerated Ethernet Switching for AI and the Cloud
- Quantum InfiniBand for AI and Scientific Computing
- BlueField® Network Accelerators
………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………..
Nvidia forecasts total fiscal fourth-quarter sales of about $37.5bn, up 70%. That was above average analyst projections of $37.1bn, compiled by Bloomberg, but below some projections that were as high as $41bn. “Demand for Hopper and anticipation for Blackwell – in full production – are incredible as foundation model makers scale pretraining, post-training and inference,” Huang said. “Both Hopper and Blackwell systems have certain supply constraints, and the demand for Blackwell is expected to exceed supply for several quarters in fiscal 2026,” CFO Colette Kress said.
References:
https://www.wsj.com/tech/ai/nvidia-chips-ai-race-96d21d09?mod=tech_lead_pos5
https://www.nvidia.com/en-us/networking/
https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-third-quarter-fiscal-2025