Generative AI
AI winner Nvidia faces competition with new super chip delayed
The Clear AI Winner Is: Nvidia!
Strong AI spending should help Nvidia make its own ambitious numbers when it reports earnings at the end of the month (its 2Q-2024 ended July 31st). Analysts are expecting nearly $25 billion in data center revenue for the July quarter, about what that business was generating annually a year ago. But the latest results won’t quell the growing concern investors have about the pace of AI spending among the world’s largest tech giants, and how it will eventually pay off.
In March, Nvidia unveiled its Blackwell chip series, succeeding its earlier flagship AI chip, the GH200 Grace Hopper Superchip, which was designed to speed generative AI applications. The NVIDIA GH200 NVL2 fully connects two GH200 Superchips with NVLink, delivering up to 288GB of high-bandwidth memory, 10 terabytes per second (TB/s) of memory bandwidth, and 1.2TB of fast memory. The GH200 NVL2 offers up to 3.5X more GPU memory capacity and 3X more bandwidth than the NVIDIA H100 Tensor Core GPU in a single server for compute- and memory-intensive workloads. The GH200 meanwhile combines an H100 chip [1.] with an Arm CPU and more memory.
Photo Credit: Nvidia
Note 1. The Nvidia H100 sits in a 10.5-inch graphics card, which is then bundled into a server rack alongside dozens of other H100 cards to create one massive data center computer.
This week, Nvidia informed Microsoft and another major cloud service provider of a delay in the production of its most advanced AI chip in the Blackwell series, the Information website said, citing a Microsoft employee and another person with knowledge of the matter.
…………………………………………………………………………………………………………………………………………
Nvidia Competitors Emerge – but are their chips ONLY for internal use?
In addition to AMD, Nvidia has several big tech competitors that are currently not in the merchant market semiconductor business. These include:
- Huawei has developed the Ascend series of chips to rival Nvidia’s AI chips, with the Ascend 910B chip as its main competitor to Nvidia’s A100 GPU chip. Huawei is the second largest cloud services provider in China, just behind Alibaba and ahead of Tencent.
- Microsoft has unveiled an AI chip called the Azure Maia AI Accelerator, optimized for artificial intelligence (AI) tasks and generative AI, as well as the Azure Cobalt CPU, an Arm-based processor tailored to run general purpose compute workloads on the Microsoft Cloud.
- Last year, Meta announced it was developing its own AI hardware. This past April, Meta announced its next generation of custom-made processor chips designed for its AI workloads. The latest version significantly improves performance compared to the last generation and helps power the company’s ranking and recommendation ads models on Facebook and Instagram.
- Also in April, Google revealed the details of a new version of its data center AI chips and announced an Arm-based central processor. Google’s 10-year-old Tensor Processing Units (TPUs) are one of the few viable alternatives to the advanced AI chips made by Nvidia, though developers can only access them through Google’s Cloud Platform and cannot buy them directly.
As demand for generative AI services continues to grow, it’s evident that GPU chips will be the next big battleground for AI supremacy.
References:
AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions
https://www.nvidia.com/en-us/data-center/grace-hopper-superchip/
https://www.theverge.com/2024/2/1/24058186/ai-chips-meta-microsoft-google-nvidia/archives/2
https://news.microsoft.com/source/features/ai/in-house-chips-silicon-to-service-to-meet-ai-demand/
Vodafone: GenAI overhyped, will spend $151M to enhance its chatbot with AI
GenAI is probably the most “overhyped” technology for many years in the telecom industry, said Vodafone Group’s chief technology officer (CTO) Scott Petty at a press briefing this week. “Hopefully, we are reaching the peak of those inflated expectations, because we are about to drop into a trough of disillusionment,” he said.
“This industry is moving too quickly,” Petty explained. “The evolution of particularly GPUs and the infrastructure means that by the time you’d actually bought them and got them installed you’d be N minus one or N minus two in terms of the technology, and you’d be spending a lot of effort and resource just trying to run the infrastructure and the LLMs that sit around that.”
Partnerships with hyper-scalers remain Vodafone’s preference, he said. Earlier this year, Vodafone and Microsoft signed a 10-year strategic agreement to use Microsoft GenAI in Vodafone’s network.
Vodafone is planning to invest some €140 million ($151 million) in artificial intelligence (AI) systems this year to improve the handling of customer inquiries, the company said on July 4th. Vodafone said it is investing in advanced AI from Microsoft and OpenAI to improve its chatbot, dubbed TOBi, so that it can respond faster and resolve customer issues more effectively.
The chatbot was introduced into Vodafone’s customer service five years ago and is equipped with the real voice of a Vodafone employee.
The new system, which is called SuperTOBi in many countries, has already been introduced in Italy and Portugal and will be rolled out in Germany and Turkey later this month with other markets to follow later in the year, Vodafone said in a press release.
According to the company, SuperTOBi “can understand and respond faster to complex customer enquiries better than traditional chatbots.” The new bot will assist customers with various tasks, such as troubleshooting hardware issues and setting up fixed-line routers, the company said.
Vodafone is not about to expose its data to publicly available models like ChatGPT. Nor will the UK-based telco create large language models (LLMs) on its own. Instead, a team of 50 data scientists is working on fine-tuning LLMs from providers and platforms such as Anthropic and Google’s Vertex AI. Vodafone can expose information to those LLMs by dipping into its 24-petabyte data “ocean,” created with Google. Secure containers within public clouds ensure private information is cordoned off and unavailable to others.
According to Petty’s estimates, the performance speed of LLMs has improved by a factor of 12 in the last nine months alone, while operational costs have decreased by a factor of six. A telco that invested nine months ago would already have outdated and expensive technology. Petty, moreover, is not the only telco CTO wary of plunging into Nvidia’s GPU chips.
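To put Petty’s estimates on a common footing, the sketch below converts a cumulative improvement factor over nine months into an implied average monthly factor. It assumes smooth compound change, which the source does not claim, and `monthly_factor` is a hypothetical helper introduced here purely for illustration.

```python
def monthly_factor(total_factor: float, months: int) -> float:
    """Average per-month multiplier that compounds to total_factor over `months`.

    Assumes smooth compounding, which is an illustrative simplification.
    """
    if total_factor <= 0 or months <= 0:
        raise ValueError("total_factor and months must be positive")
    return total_factor ** (1.0 / months)


# Figures quoted from Petty: 12x faster and 6x cheaper over nine months.
speed_per_month = monthly_factor(12, 9)
cost_cut_per_month = monthly_factor(6, 9)

print(f"Implied speed gain per month: {(speed_per_month - 1) * 100:.1f}%")
print(f"Implied cost relative to the prior month: {100 / cost_cut_per_month:.1f}%")
```

Under that assumption, hardware or models bought at any point would fall behind by a noticeable margin every month, which is the core of Petty’s argument for renting capacity from hyper-scalers rather than owning it.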
“This is a very weird moment in time where power is very expensive, natural resources are scarce and GPUs are extremely expensive,” said Bruno Zerbib, the CTO of France’s Orange, at the 2024 Mobile World Congress in Barcelona, Spain. “You have to be very careful with your investment because you might buy a GPU product from a famous company right now that has a monopolistic position.”
Petty thinks LLM processing may eventually need to take place outside hyper-scalers’ facilities. “To really create the performance that we want, we are going to need to push those capabilities further toward the edge of the network,” he said. “It is not going to be the hype cycle of the back end of 2024. But in 2025 and 2026, you’ll start to see those applications and capabilities being deployed at speed.”
“The time it takes for that data to get up and back will dictate whether you’re happy as a consumer to use that interface as your primary interface, and the investment in latency is going to be critically important,” said Petty. “We’re fortunate that 5G standalone drives low latency capability, but it’s not deployed at scale. We don’t have ubiquitous coverage. We need to make sure that those things are available to enable those applications.”
Data from Ericsson supports that view, showing that 5G population coverage is just 70% across Europe, compared with 90% in North America and 95% in China. The figure for midband spectrum – considered a 5G sweet spot that combines decent coverage with high-speed service – is as low as 30% in Europe, against 85% in North America and 95% in China.
Non-standalone (NSA) 5G, which connects a 5G radio access network (RAN) to a 4G core (EPC), is “dominating the market,” said Ericsson.
Vodafone has pledged to spend £11 billion (US$14 billion) on the rollout of a nationwide standalone 5G network in the UK if authorities bless its proposed merger with Three. With more customers, additional spectrum and a bigger footprint, the combined company would be able to generate healthier returns and invest in network improvements, the company said. But a UK merger would not aid the operator in Europe’s four-player markets.
Petty believes a “pay for search” economic model may emerge using GenAI virtual assistants. “This will see an evolution of a two-sided economic model that probably didn’t get in the growth of the Internet in the last 20 years,” he said, adding that it would not be unlike today’s market for content delivery networks (CDNs).
“Most CDNs are actually paid for by the content distribution companies – the Netflixes, the TV sports – because they want a great experience for their users for the paid content they’ve bought. When it’s free content, maybe the owner of that content is less willing to invest to build out the capabilities in the network.”
Like other industry executives, Petty must hope the debates about net neutrality and fair contribution do not plunge telcos into a long disillusionment trough.
References:
Vodafone CTO: AI will overhaul 5G networks and Internet economics (lightreading.com)
Vodafone UK report touts benefits of 5G SA for Small Biz; cover for proposed merger with Three UK?
Data infrastructure software: picks and shovels for AI; Hyperscaler CAPEX
For many years, data volumes have been accelerating. By 2025, global data volumes are expected to reach 180 zettabytes (1 zettabyte=1 sextillion bytes), up from 120 zettabytes in 2023.
In the age of AI, data is viewed as the currency for large language models (LLMs) and AI-enabled offerings. Tools to integrate, store and process data are therefore a growing priority among enterprises.
The median size of datasets required to train AI models increased from 5.9 million data points in 2010 to 750 billion in 2023, according to BofA Global Research. As demand rises for AI-enabled offerings, companies are prioritizing tools to integrate, store, and process data.
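The scale of that jump is easier to grasp as a doubling time. The sketch below is a rough illustration that assumes steady exponential growth between the two quoted data points, which BofA does not state; `doubling_time_years` is a hypothetical helper name.

```python
import math


def doubling_time_years(start: float, end: float, years: float) -> float:
    """Years needed to double, assuming steady exponential growth from start to end."""
    if start <= 0 or end <= start or years <= 0:
        raise ValueError("expects 0 < start < end and years > 0")
    return years * math.log(2) / math.log(end / start)


start_points = 5.9e6   # median training dataset size in 2010 (from the text)
end_points = 750e9     # median training dataset size in 2023 (from the text)
span_years = 2023 - 2010

growth_multiple = end_points / start_points
doubling = doubling_time_years(start_points, end_points, span_years)

print(f"Growth multiple over {span_years} years: {growth_multiple:,.0f}x")
print(f"Implied doubling time: {doubling * 12:.1f} months")
```

On these figures, the median training dataset would have doubled in size in well under a year, on average, throughout the period.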
In BofA’s survey, data streaming/stream processing and data science/ML were selected as key use cases in regard to AI, with 44% and 37% of respondents citing usage, respectively. Further, AI enablement is accelerating the migration of data to the cloud. Gartner estimates that 74% of the data management market will be deployed in the cloud by 2027, up from 60% in 2023.
Data infrastructure software [1.] represents a top spending priority for the IT department. Survey respondents report that data infrastructure represents 35% of total IT spending, with budgets expected to grow 9% over the next 12 months. Unsurprisingly, the public cloud hyperscaler platforms were cited as the top three vendors: Amazon AWS data warehouse/data lake offerings, Microsoft Azure database offerings, and Google BigQuery were chosen by 45%, 39% and 35% of respondents, respectively.
Note 1. Data infrastructure software refers to databases, data warehouses/lakes, data pipelines, data analytics and other software that facilitate data management, processing and analysis.
………………………………………………………………………………………………………………..
The top three factors for evaluating data infrastructure software vendors are security, enterprise capabilities (e.g., architecture scalability and reliability) and depth of technology.
BofA’s Software team estimates that the data infrastructure industry (e.g., data warehouses, data lakes, unstructured databases, etc.) is currently a $96bn market that could reach $153bn in 2028. The team’s proprietary survey revealed that data infrastructure is 35% of total IT spending with budgets expected to grow 9% over the next 12 months. Hyperscalers including Amazon and Google are among the top recipients of dollars and, in turn, those companies spend big on hardware.
Key takeaways:
- Data infrastructure is the largest and fastest growing segment of software ($96bn per our bottom-up analysis, 17% CAGR).
- AI/cloud represent enduring growth drivers. Data is the currency for LLMs, positioning data vendors well in this new cycle.
- BofA survey (150 IT professionals) suggests best-of-breed vendors (MongoDB [MDB], Snowflake [SNOW] and Databricks) are seeing the highest expected growth in spend.
………………………………………………………………………………………………………….
BofA analyst Justin Post expects server and equipment capex for mega-cap internet companies (Amazon, Alphabet/Google, Meta/Facebook) to rise 43% y/y in 2024 to $145bn, which represents $27bn of the $37bn y/y total capex growth. Despite the spending surge, Mr. Post thinks these companies will keep free cash flow margins stable at 22% y/y before they increase in 2025. The technical infrastructure related capex spend at these three companies is expected to see a steep rise in 2024, with the majority of the increase going to servers and equipment.
Notes:
- Alphabet categorizes its technical infrastructure assets under the line item ‘Information Technology Assets’.
- Amazon takes a much broader categorization and includes servers, networking equipment, retail-related heavy equipment and fulfillment equipment under ‘Equipment’.
- Meta gives more details and separately reports Server & Networking, and Equipment assets.
In 2024, BofA estimates CAPEX for the three hyperscalers as follows:
- Alphabet: capex for IT assets will increase by $12bn y/y to $28bn.
- Meta: following a big ramp in 2023, server, network and equipment asset spend is expected to increase $7bn y/y to $22bn.
- Amazon: equipment spend is expected to increase $8bn y/y to $41bn (driven by AWS, with retail flattish). Amazon will see less relative growth due to retail equipment capex leverage in this line.
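The three line items above can be cross-checked against the headline figures with a few lines of arithmetic. The sketch below is illustrative only: the dictionary name `capex_2024` and the derived 2023 levels are assumptions introduced here (the prior-year level is simply the 2024 estimate minus the stated increase), not figures published by BofA.

```python
# (y/y increase, 2024 level) in $bn, as listed in the text above.
capex_2024 = {
    "Alphabet": (12, 28),
    "Meta": (7, 22),
    "Amazon": (8, 41),
}

total_increase = sum(inc for inc, _ in capex_2024.values())
total_2024 = sum(level for _, level in capex_2024.values())
total_2023 = total_2024 - total_increase   # implied prior-year level (assumption)
growth_pct = 100 * total_increase / total_2023

print(f"Combined increase: ${total_increase}bn")
print(f"Combined 2024 level: ${total_2024}bn (implied 2023: ${total_2023}bn)")
print(f"Implied y/y growth: {growth_pct:.1f}%")
```

The combined increase equals the $27bn quoted earlier for server and equipment growth, and the implied growth rate of roughly 42% is close to the 43% cited, with the small gap plausibly due to rounding in the per-company figures.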
On a relative scale, Meta capex spend (% of revenue) remains highest in the group and the company has materially stepped up its AI-related capex investments since 2022 (in-house supercomputer, LLM, leading computing power, etc.). We think it’s interesting that Meta is spending almost as much as the hyperscalers on capex, which should likely lead to some interesting internal AI capabilities, and potential to build a “marketing cloud” for its advertisers.
From 2016-22, the sector headcount grew 26% on average. In 2023, headcount decreased by 9%. BofA expects just 3% average annual job growth from 2022-2026. Moreover, AI tools will likely drive higher employee efficiency, helping offset higher depreciation.
…………………………………………………………………………………………………………
Source for all of the above information: BofA Global Research
NTT & Yomiuri: ‘Social Order Could Collapse’ in AI Era
From the Wall Street Journal:
Japan’s largest telecommunications company and the country’s biggest newspaper called for speedy legislation to restrain generative artificial intelligence, saying democracy and social order could collapse if AI is left unchecked.
Nippon Telegraph and Telephone, or NTT, and Yomiuri Shimbun Group Holdings made the proposal in an AI manifesto to be released Monday. Combined with a law passed in March by the European Parliament restricting some uses of AI, the manifesto points to rising concern among American allies about the AI programs U.S.-based companies have been at the forefront of developing.
The Japanese companies’ manifesto, while pointing to the potential benefits of generative AI in improving productivity, took a generally skeptical view of the technology. Without giving specifics, it said AI tools have already begun to damage human dignity because the tools are sometimes designed to seize users’ attention without regard to morals or accuracy.
Unless AI is restrained, “in the worst-case scenario, democracy and social order could collapse, resulting in wars,” the manifesto said.
It said Japan should take measures immediately in response, including laws to protect elections and national security from abuse of generative AI.
A global push is under way to regulate AI, with the European Union at the forefront. The EU’s new law calls on makers of the most powerful AI models to put them through safety evaluations and notify regulators of serious incidents. It also is set to ban the use of emotion-recognition AI in schools and workplaces.
The Biden administration is also stepping up oversight, invoking emergency federal powers last October to compel major AI companies to notify the government when developing systems that pose a serious risk to national security. The U.S., U.K. and Japan have each set up government-led AI safety institutes to help develop AI guidelines.
Still, governments of democratic nations are struggling to figure out how to regulate AI-powered speech, such as social-media activity, given constitutional and other protections for free speech.
NTT and Yomiuri said their manifesto was motivated by concern over public discourse. The two companies are among Japan’s most influential in policy. The government still owns about one-third of NTT, formerly the state-controlled phone monopoly.
Yomiuri Shimbun, which has a morning circulation of about six million copies according to industry figures, is Japan’s most widely-read newspaper. Under the late Prime Minister Shinzo Abe and his successors, the newspaper’s conservative editorial line has been influential in pushing the ruling Liberal Democratic Party to expand military spending and deepen the nation’s alliance with the U.S.
The two companies said their executives have been examining the impact of generative AI since last year in a study group guided by Keio University researchers.
The Yomiuri’s news pages and editorials frequently highlight concerns about artificial intelligence. An editorial in December, noting the rush of new AI products coming from U.S. tech companies, said “AI models could teach people how to make weapons or spread discriminatory ideas.” It cited risks from sophisticated fake videos purporting to show politicians speaking.
NTT is active in AI research, and its units offer generative AI products to business customers. In March, it started offering these customers a large-language model it calls “tsuzumi” which is akin to OpenAI’s ChatGPT but is designed to use less computing power and work better in Japanese-language contexts.
An NTT spokesman said the company works with U.S. tech giants and believes generative AI has valuable uses, but he said the company believes the technology has particular risks if it is used maliciously to manipulate public opinion.
…………………………………………………………………………………………………………….
From the Japan News (Yomiuri Shimbun):
Challenges: Humans cannot fully control Generative AI technology
・ While the accuracy of results cannot be fully guaranteed, it is easy for people to use the technology and understand its output. This often leads to situations in which generative AI “lies with confidence” and people are “easily fooled.”
・ Challenges include hallucinations, bias and toxicity, retraining through input data, infringement of rights through data scraping and the difficulty of judging created products.
・ Journalism, research in academia and other sources have provided accurate and valuable information by thoroughly examining what information is correct, allowing them to receive some form of compensation or reward. Such incentives for providing and distributing information, which have ensured authenticity and trustworthiness, may collapse.
A need to respond: Generative AI must be controlled both technologically and legally
・ If generative AI is allowed to go unchecked, trust in society as a whole may be damaged as people grow distrustful of one another and incentives are lost for guaranteeing authenticity and trustworthiness. There is a concern that, in the worst-case scenario, democracy and social order could collapse, resulting in wars.
・ Meanwhile, AI technology itself is already indispensable to society. If AI technology is dismissed as a whole as untrustworthy due to out-of-control generative AI, humanity’s productivity may decline.
・ Based on the points laid out in the following sections, measures must be realized to balance the control and use of generative AI from both technological and institutional perspectives, and to make the technology a suitable tool for society.
Point 1: Confronting the out-of-control relationship between AI and the attention economy
・ Any computer’s basic structure, or architecture, including that of generative AI, positions the individual as the basic unit of user. However, due to computers’ tendency to be overly conscious of individuals, there are such problems as unsound information spaces and damage to individual dignity due to the rise of the attention economy.
・ There are concerns that the unstable nature of generative AI is likely to amplify the above-mentioned problems further. In other words, it cannot be denied that there is a risk of worsening social unrest due to a combination of AI and the attention economy, with the attention economy accelerated by generative AI. To understand such issues properly, it is important to review our views on humanity and society and critically consider what form desirable technology should take.
・ Meanwhile, the out-of-control relationship between AI and the attention economy has already damaged autonomy and dignity, which are essential values that allow individuals in our society to be free. These values must be restored quickly. In doing so, autonomous liberty should not be abandoned, but rather an optimal solution should be sought based on human liberty and dignity, verifying their rationality. In the process, concepts such as information health are expected to be established.
Point 2: Legal restraints to ensure discussion spaces to protect liberty and dignity, and the introduction of technology to cope with related issues
・ Ensuring spaces for discussion in which human liberty and dignity are maintained has not only superficial economic value, but also a special value in terms of supporting social stability. The out-of-control relationship between AI and the attention economy is a threat to these values. If generative AI develops further and is left unchecked like it is currently, there is no denying that the distribution of malicious information could drive out good things and cause social unrest.
・ If we continue to be unable to sufficiently regulate generative AI — or if we at least allow the unconditional application of such technology to elections and security — it could cause enormous and irreversible damage as the effects of the technology will not be controllable in society. This implies a need for rigid restrictions by law (hard laws that are enforceable) on the usage of generative AI in these areas.
・ In the area of education, especially compulsory education for those age groups in which students’ ability to make appropriate decisions has not fully matured, careful measures should be taken after considering both the advantages and disadvantages of AI usage.
・ The protection of intellectual property rights — especially copyrights — should be adapted to the times in both institutional and technological aspects to maintain incentives for providing and distributing sound information. In doing so, the protections should be made enforceable in practice, without excessive restrictions to developing and using generative AI.
・ These solutions cannot be maintained by laws alone, but rather, they also require measures such as Originator Profile (OP), which is secured by technology.
Point 3: Establishment of effective governance, including legislation
・ The European Union has been developing data-related laws such as the General Data Protection Regulation, the Digital Services Act and the Digital Markets Act. It has been developing regulations through strategic laws with awareness of the need to both control and promote AI, positioning the Artificial Intelligence Act as part of such efforts.
・ Japan does not have such a strategic and systematic data policy. It is expected to require a long time and involve many obstacles to develop such a policy. Therefore, in the long term, it is necessary to develop a robust, strategic and systematic data policy and, in the short term, individual regulations and effective measures aimed at dealing with AI and attention economy-related problems in the era of generative AI.
・ However, it would be difficult to immediately introduce legislation, including individual regulations, for such issues. Without excluding consideration of future legislation, the handling of AI must be strengthened by soft laws (both for data (basic) and generative AI (applied)) that offer a co-regulatory approach that identifies stakeholders. Given the speed of technological innovation and the complexity of value chains, it is expected that an agile framework such as agile governance, rather than governance based on static structures, will be introduced.
・ In risk areas that require special caution (see Point 2), hard laws should be introduced without hesitation.
・ In designing a system, attention should be paid to how effectively it protects the people’s liberty and dignity, as well as to national interests such as industry, based on the impact on Japan of extraterritorial enforcement to the required extent and other countries’ systems.
・ As a possible measure to balance AI use and regulation, a framework should be considered in which the businesses that interact directly with users in the value chain, the middle B in “B2B2X,” where X is the user, reduce and absorb risks when generative AI is used.
・ To create an environment that ensures discussion spaces in which human liberty and dignity are maintained, it is necessary to ensure that there are multiple AIs of various kinds and of equal rank, that they keep each other in check, and that users can refer to them autonomously, so that users do not have to depend on a specific AI. Such moves should be promoted from both institutional and technological perspectives.
Outlook for the Future:
・ Generative AI is a technology that cannot be fully controlled by humanity. However, it is set to enter an innovation phase (changes accompanying social diffusion).
・ In particular, measures to ensure a healthy space for discussion, which constitutes the basis of human and social security (democratic order), must be taken immediately. Legislation (hard laws) is needed, mainly for creating zones of generative AI use (strong restrictions for elections and security).
・ In addition, from the viewpoint of ecosystem maintenance (including the dissemination of personal information), it is necessary to consider optimizing copyright law in line with the times, in a manner compatible with using generative AI itself, from both institutional and technological perspectives.
・ However, as it takes time to revise the law, the following steps must be taken: the introduction of rules and joint regulations mainly by the media and various industries, the establishment and dissemination of effective technologies, and making efforts to revise the law.
・ In this process, the most important thing is to protect the dignity and liberty of individuals in order to achieve individual autonomy. Those involved will study the situation, taking into account critical assessments based on the value of community.
References:
‘Joint Proposal on Shaping Generative AI’ by The Yomiuri Shimbun Holdings and NTT Corp.
Major technology companies form AI-Enabled Information and Communication Technology (ICT) Workforce Consortium
MTN Consulting: Generative AI hype grips telecom industry; telco CAPEX decreases while vendor revenue plummets
Cloud Service Providers struggle with Generative AI; Users face vendor lock-in; “The hype is here, the revenue is not”
Amdocs and NVIDIA to Accelerate Adoption of Generative AI for $1.7 Trillion Telecom Industry
AI sparks huge increase in U.S. energy consumption and is straining the power grid; transmission/distribution as a major problem
The AI boom is changing how data centers are built and where they’re located, and it’s already sparking a reshaping of U.S. energy infrastructure, according to Barron’s. Energy companies increasingly cite AI power consumption as a leading contributor to new demand. That is because AI compute servers in data centers require a tremendous amount of power to process large language models (LLMs). That was explained in detail in this recent IEEE Techblog post.
Fast Company reports that “The surge in AI is straining the U.S. power grid.” AI is pushing demand for energy significantly higher than anyone was anticipating. “The U.S. electric grid is not prepared for significant load growth,” Grid Strategies warned. AI is a major part of the problem when it comes to increased demand. Not only are industry leaders such as OpenAI, Amazon, Microsoft, and Google either building or looking for locations on which to build enormous data centers to house the infrastructure required to power large language models, but smaller companies in the space are also making huge energy demands, as the Washington Post reports.
Georgia Power, the chief energy provider for that state, recently had to increase its projected winter megawatt demand by as much as 38%. That’s due in part to the state’s incentive policy for computer operations, something officials are now rethinking. Meanwhile, Portland General Electric in Oregon recently doubled its five-year forecast for new electricity demand.
Electricity demand was so great in Virginia that Dominion Energy was forced to halt connections to new data centers for about three months in 2022. Dominion says it expects demand in its service territory to grow by nearly 5% annually over the next 15 years, which would almost double the total amount of electricity it generates and sells. To prepare, the company is building the biggest offshore wind farm in the U.S. some 25 miles off Virginia Beach and is adding solar energy and battery storage. It has also proposed investing in new gas generation and is weighing whether to delay retiring some natural gas plants and one large coal plant.
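The link between roughly 5% annual growth and a near doubling over 15 years follows from compounding. The sketch below is a minimal illustration that assumes a constant 5% rate applied to a normalized starting level of 1.0; `project_demand` is a hypothetical helper, and Dominion’s actual forecast is “nearly” 5%, so the real multiple would be slightly lower.

```python
def project_demand(annual_growth: float, years: int, base: float = 1.0) -> list[float]:
    """Demand level at the end of each year (index 0 is the starting level)."""
    return [base * (1 + annual_growth) ** y for y in range(years + 1)]


# Assumption: exactly 5% per year, starting from a normalized level of 1.0.
path = project_demand(0.05, 15)
multiple_after_15 = path[-1]

# First year in which the projected level reaches at least twice the base.
first_doubling_year = next(y for y, level in enumerate(path) if level >= 2.0)

print(f"Demand multiple after 15 years: {multiple_after_15:.2f}x")
print(f"First year at or above 2x: year {first_doubling_year}")
```

At exactly 5% the projection crosses the 2x mark only in the final year, which is consistent with the description of demand “almost” doubling at a rate just under 5%.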
Also in 2022, the CEO of data center giant Digital Realty said on an earnings call that Dominion had warned its big customers about a “pinch point” that could prevent it from supplying new projects until 2026.
AES, another Virginia-based utility, recently told investors that data centers could comprise up to 7.5% of total U.S. electricity consumption by 2030, citing data from Boston Consulting Group. The company is largely betting its growth on the ability to deliver renewable power to data centers in the coming years.
New data centers coming on line in its regions “represent the potential for thousands of megawatts of new electric load—often hundreds of megawatts for just one project,” Sempra Energy told investors on its earnings call last month. The company operates public utilities in California and Texas and has cited AI as a major factor in its growth.
There are also environmental concerns. While there is a push to move to cleaner energy production methods, such as solar, due to large federal subsidies, many are not yet online. And utility companies are lobbying to delay the shutdown of fossil fuel plants (and some are hoping to bring more online) to meet the surge in demand.
“Annual peak demand growth forecasts appear headed for growth rates that are double or even triple those in recent years,” Grid Strategies wrote. “Transmission planners need long-term forecasts of both electricity demand and sources of electricity supply to ensure sufficient transmission will be available when and where it’s needed. Such a failure of planning could have real consequences for investments, jobs, and system reliability for all electric customers.”
According to Boston Consulting Group, U.S. data-center electricity consumption is expected to triple from 126 terawatt hours in 2022 to 390 terawatt hours by 2030. That’s the equivalent usage of 40 million U.S. homes, the firm says. Much of the data-center growth is being driven by new applications of generative AI. As AI dominates the conversation, it’s likely to bring renewed focus on the nation’s energy grid. Siemens Energy CEO Christian Bruch told shareholders at the company’s recent annual meeting that electricity needs will soar with the growing use of AI. “That means one thing: no power, no AI. Or to put it more clearly: no electricity, no progress.”
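The Boston Consulting Group figures quoted above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch: the variable names are ours, and the per-home figure is simply what the quoted numbers imply, not a number the firm published.

```python
TWH_TO_KWH = 1e9  # 1 terawatt-hour = 1 billion kilowatt-hours

usage_2022_twh = 126           # quoted data-center consumption, 2022
usage_2030_twh = 390           # quoted projection for 2030
homes_equivalent = 40_000_000  # quoted household equivalent

growth_multiple = usage_2030_twh / usage_2022_twh
kwh_per_home = usage_2030_twh * TWH_TO_KWH / homes_equivalent

print(f"Growth multiple 2022 -> 2030: {growth_multiple:.2f}x")
print(f"Implied annual use per home: {kwh_per_home:,.0f} kWh")
```

The multiple comes out slightly above 3x, and the implied household figure is just under 10,000 kWh per year.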
The technology sector has already shown how quickly AI can recast long-held assumptions. Chips, for instance, driven by Nvidia, have replaced software as tech’s hottest commodity. Nvidia has said that the trillion dollars invested in global data-center infrastructure will eventually shift from traditional servers with central processing units, or CPUs, to AI servers with graphics processing units, or GPUs. GPUs are better able to power the parallel computations needed for AI.
For AI workloads, Nvidia says that two GPU servers can do the work of a thousand CPU servers at a fraction of the cost and energy. Still, the better performance capabilities of GPUs are leading to more aggregate power usage as developers find innovative new ways to use AI.
The overall power consumption increase will come on two fronts: an increase in the number of GPUs sold per year and a higher power draw from each GPU. Research firm 650 Group expects AI server shipments will rise from one million units last year to six million units in 2028. According to Gartner, most AI GPUs will draw 1,000 watts of electricity by 2026, up from the roughly 650 watts on average today.
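The "two fronts" multiply rather than add. The sketch below combines the 650 Group shipment forecast with the Gartner per-GPU power figures. It rests on loud assumptions: it treats AI server shipments as a proxy for GPU volume, and it mixes a 2028 shipment forecast with a 2026 power figure, so the result is an order-of-magnitude illustration and not a forecast.

```python
shipments_last_year = 1_000_000  # AI servers shipped last year (650 Group)
shipments_2028 = 6_000_000       # forecast AI server shipments in 2028 (650 Group)
watts_today = 650                # approximate average AI GPU draw today (Gartner)
watts_2026 = 1_000               # expected AI GPU draw by 2026 (Gartner)

volume_factor = shipments_2028 / shipments_last_year  # more units sold per year
per_unit_factor = watts_2026 / watts_today            # more power per unit
combined_factor = volume_factor * per_unit_factor     # the two fronts compound

print(f"Volume factor:   {volume_factor:.2f}x")
print(f"Per-unit factor: {per_unit_factor:.2f}x")
print(f"Combined factor: {combined_factor:.2f}x")
```

Under these assumptions, the power drawn by a year's worth of newly shipped hardware would be roughly nine times today's level.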
Ironically, data-center operators will use AI technology to address the power demands. “AI can be used to improve efficiency, where you’re modeling temperature, humidity, and cooling,” says Christopher Wellise, vice president of sustainability for Equinix, one of the nation’s largest data-center companies. “It can also be used for predictive maintenance.” Equinix states that using AI modeling at one of its data centers has already improved energy efficiency by 9%.
Data centers will also install more-effective cooling systems. A leading provider of power and cooling infrastructure equipment says that AI servers generate five times more heat than traditional CPU servers and require ten times more cooling per square foot. AI server maker Super Micro estimates that switching to liquid cooling from traditional air-based cooling can reduce operating expenses by more than 40%.
But cooling, AI efficiency, and other technologies won’t fully solve the problem of satisfying AI’s energy demands. Certain regions could face issues with their local grid. Historically, the two most popular areas to build data centers were Northern Virginia and Silicon Valley. The regions’ proximity to major internet backbones enabled quicker response times for applications, which is also helpful for AI. (Northern Virginia was home to AOL in the 1990s. A decade later, Silicon Valley was hosting most of the country’s online platforms.)
Today, each region faces challenges around power capacity and data-center availability. Both areas are years away from making the grid upgrades that would be needed to run more data centers, according to DigitalBridge, an asset manager that invests in digital infrastructure. DigitalBridge CEO Marc Ganzi says the tightness in Northern Virginia and Northern California is driving data-center construction into other markets, including Atlanta; Columbus, Ohio; and Reno, Nev. All three areas offer better power availability than Silicon Valley and Northern Virginia, though the network quality is slightly inferior as of now. Reno also offers better access to renewable energy sources such as solar and wind.
Ultimately, Ganzi says the obstacle facing the energy sector—and future AI applications—is the country’s decades-old electric transmission grid. “It isn’t so much that we have a power issue. We have a transmission infrastructure issue,” he says. “Power is abundant in the United States, but it’s not efficiently transmitted or efficiently distributed.”
Yet that was one of the prime objectives of the Smart Grid initiative which apparently is a total failure! Do you think IEEE can revive that initiative with a focus on power consumption and cooling in AI data centers?
References:
https://www.barrons.com/articles/ai-chips-electricity-usage-2f92b0f3
https://www.supermicro.com/en/solutions/liquid-cooling
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions
Backgrounder:
Artificial intelligence (AI) continues both to astound and confound. AI finds patterns in data and then uses a technique called “reinforcement learning from human feedback.” Humans help train and fine-tune large language models (LLMs). Some humans, like “ethics & compliance” folks, have a heavier hand than others in tuning models to their liking.
Generative Artificial Intelligence (generative AI) is a type of AI that can create new content and ideas, including conversations, stories, images, videos, and music. AI technologies attempt to mimic human intelligence in nontraditional computing tasks like image recognition, natural language processing (NLP), and translation. Generative AI is the next step in artificial intelligence. You can train it to learn human language, programming languages, art, chemistry, biology, or any complex subject matter. It reuses training data to solve new problems. For example, it can learn English vocabulary and create a poem from the words it processes. Your organization can use generative AI for various purposes, like chatbots, media creation, and product development and design.
Review of Leading AI Company Products and Services:
1. AI poster child Nvidia’s (NVDA) market cap is about $2.3 trillion, due mainly to momentum-obsessed investors who have driven up the stock price. Nvidia currently enjoys 75% gross profit margins and has an estimated 80% share of the Graphics Processing Unit (GPU) chip market. Microsoft and Facebook are reportedly Nvidia’s biggest customers, buying its GPUs last year in a frenzy.
Nvidia CEO Jensen Huang talks of computing going from retrieval to generative, which investors believe will require a long-run overhaul of data centers to handle AI. All true, but a similar premise about an overhaul also was true for Cisco in 1999.
During the dot-com explosion in the late 1990s, investors believed a long-run rebuild of telecom infrastructure was imminent. WorldCom executives claimed that internet traffic doubled every 100 days, or about 3.3 months. The thinking at that time was that the whole internet would run on Cisco routers at 50% gross margins.
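To get a feel for how aggressive the "doubling every 100 days" claim was, it helps to annualize it. The following is a minimal sketch; the function name is hypothetical and the calculation assumes a constant doubling period across a 365-day year.

```python
def annual_multiple_from_doubling(doubling_days: float, days_per_year: float = 365.0) -> float:
    """Return the yearly growth multiple implied by a fixed doubling period."""
    doublings_per_year = days_per_year / doubling_days
    return 2.0 ** doublings_per_year


multiple = annual_multiple_from_doubling(100)
print(f"Doubling every 100 days implies about {multiple:.1f}x growth per year")
```

That works out to more than a twelvefold increase every year, which helps explain why investors expected a wholesale infrastructure rebuild.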
Cisco’s valuation at its peak of the “Dot.com” mania was at 33x sales. CSCO investors lost 85% of their money when the stock price troughed in October 2002. Over the next 16 years, as investors waited to break even, the company grew revenues by 172% and earnings per share by a staggering 681%. Over the last 24 years, CSCO buy and hold investors earned only 0.67% per year!
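Two pieces of arithmetic sit behind the Cisco numbers above: how large a gain is needed to recover from an 85% loss, and what 0.67% per year compounds to over 24 years. The sketch below works through both; the variable names are ours, and it ignores dividends, taxes and inflation unless they are already folded into the quoted return.

```python
peak_to_trough_loss = 0.85  # quoted loss from peak to the October 2002 trough
annual_return = 0.0067      # quoted buy-and-hold return per year
years = 24

# After losing 85%, $1 becomes $0.15; getting back to $1 needs 1 / 0.15 - 1.
breakeven_gain = 1 / (1 - peak_to_trough_loss) - 1

# Compounding 0.67% a year over 24 years.
total_multiple = (1 + annual_return) ** years

print(f"Gain needed to break even after an 85% loss: {breakeven_gain:.0%}")
print(f"Value of $1 after {years} years at 0.67%/yr: ${total_multiple:.2f}")
```

In other words, the stock had to rise more than fivefold from its trough just to get back to even, and the 24-year holder ended up with only about 17% more than they started with.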
2. Microsoft is now a cloud computing/data-center company, more utility than innovator. Microsoft invested $13 billion in OpenAI for just under 50% of the company to help develop and roll out ChatGPT. But much of that was funny money — investment not in cash but in credits for Microsoft’s Azure data centers. Microsoft leveraged those investments into superpowering its own search engine, Bing, with generative AI, which is now called “Copilot.” Microsoft spends a tremendous amount of money on Nvidia H100 processors to speed up its AI calculations. It has also designed its own AI chips.
3. Amazon masquerades as an online retailer, but is actually the world’s largest cloud computing/data-center company. The company offers several generative AI products and services which include:
- Amazon CodeWhisperer, an AI-powered coding companion.
- Amazon Bedrock, a fully managed service that makes foundational models (FMs) from AI21 Labs, Anthropic, and Stability AI, along with Amazon’s own family of FMs, Amazon Titan, accessible via an API.
- A generative AI tool for sellers to help them generate copy for product titles and listings.
- Generative AI capabilities that simplify how Amazon sellers create more thorough and captivating product descriptions, titles, and listing details.
Amazon CEO Jassy recently said the company’s generative AI services have the potential to generate tens of billions of dollars over the next few years. CFO Brian Olsavsky told analysts that interest in Amazon Web Services’ (AWS) generative AI products, such as Amazon Q, an AI chatbot for businesses, had accelerated during the quarter. In September 2023, Amazon said it plans to invest up to $4 billion in startup chatbot-maker Anthropic to take on its AI-based cloud rivals (i.e. Microsoft and Google). Its security teams are currently using generative AI to increase productivity.
4. Google, with 190,000 employees, controls 90% of search. Google’s recent launch of its new Gemini AI tools was a disaster, producing images of the U.S. Founding Fathers and Nazi soldiers as people of color. When asked if Elon Musk or Adolf Hitler had a more negative effect on society, Gemini responded that it was “difficult to say.” Google pulled the product over “inaccuracies.” Yet Google is still promoting its AI product: “Gemini, a multimodal model from Google DeepMind, is capable of understanding virtually any input, combining different types of information, and generating almost any output.”
5. Facebook/Meta controls social media but has lost $42 billion investing in the still-nascent metaverse. Meta is rolling out three AI features for advertisers: background generation, image cropping and copy variation. Meta also unveiled a generative AI system called Make-A-Scene that allows artists to create scenes from text prompts. Meta’s CTO Andrew Bosworth said the company aims to use generative AI to help companies reach different audiences with tailored ads.
Conclusions:
Voracious demand has outpaced production and spurred competitors to develop rival chips. The ability to secure GPUs governs how quickly companies can develop new artificial-intelligence systems. Tech CEOs are under pressure to invest in AI, or risk investors thinking their company is falling behind the competition.
As we noted in a recent IEEE Techblog post, researchers in South Korea have developed the world’s first AI semiconductor chip that operates at ultra-high speeds with minimal power consumption for processing large language models (LLMs), based on principles that mimic the structure and function of the human brain. The research team was from the Korea Advanced Institute of Science and Technology.
While it’s impossible to predict how fast additional fabricating capacity will come on line, there certainly will be many more AI chips from cloud giants and merchant semiconductor companies like AMD and Intel. The fat profit margins Nvidia is now enjoying will surely attract many competitors.
……………………………………………………………………………………………………………………………
References:
https://www.zdnet.com/article/how-to-use-the-new-bing-and-how-its-different-from-chatgpt/
https://cloud.google.com/ai/generative-ai
https://aws.amazon.com/what-is/generative-ai/
https://www.wsj.com/articles/amazon-is-going-super-aggressive-on-generative-ai-7681587f
Curmudgeon: 2024 AI Fueled Stock Market Bubble vs 1999 Internet Mania? (03/11)
Korea’s KAIST develops next-gen ultra-low power Gen AI LLM accelerator
Telco and IT vendors pursue AI integrated cloud native solutions, while Nokia sells point products
MTN Consulting: Generative AI hype grips telecom industry; telco CAPEX decreases while vendor revenue plummets
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
Amdocs and NVIDIA to Accelerate Adoption of Generative AI for $1.7 Trillion Telecom Industry
Cloud Service Providers struggle with Generative AI; Users face vendor lock-in; “The hype is here, the revenue is not”
Global Telco AI Alliance to progress generative AI for telcos
Bain & Co, McKinsey & Co, AWS suggest how telcos can use and adapt Generative AI
Generative AI Unicorns Rule the Startup Roost; OpenAI in the Spotlight
Generative AI in telecom; ChatGPT as a manager? ChatGPT vs Google Search
Generative AI could put telecom jobs in jeopardy; compelling AI in telecom use cases
Impact of Generative AI on Jobs and Workers
“SK Wonderland at CES 2024;” SK Group Chairman: AI-led revolution poses challenges to companies
On Tuesday at CES 2024, SK Group [1.] displayed world-leading Artificial Intelligence (AI) and carbon reduction technologies under an amusement park concept called “SK Wonderland.” It provided CES attendees a view of a world that uses the latest AI and clean technologies from SK companies and their business partners to create a smarter, greener world. Highlights of the booth included:
- Magic Carpet Ride in a flying vehicle embedded with an AI processor that helps it navigate dense, urban areas – reducing pollution, congestion and commuting frustrations
- AI Fortune Teller powered by next-generation memory technologies that can help computers analyze and learn from massive amounts of data to predict the future
- Dancing Car that’s fully electric, able to recharge in 20 minutes or less and built to travel hundreds of miles between charges
- Clean Energy Train that’s capable of being powered by hydrogen, whose only emission is water
- Rainbow Tube that shows how plastics are finding a new life through a technology that turns waste into fuel
Note 1. SK Group is South Korea’s second-largest conglomerate, with Samsung at number one.
SK’s CES 2024 displays include participation from seven SK companies — SK Inc., SK Innovation, SK Hynix, SK Telecom, SK E&S, SK Ecoplant and SKC. While the displays are futuristic, they’re based on technologies that SK companies and their global partners have already developed and are bringing to market.
SK Group Chairman Chey Tae-won said that companies are facing challenges in navigating the transformative era led by artificial intelligence (AI) due to its unpredictable impact and speed. He said AI technology and devices with AI are the talk of the town at this year’s annual trade show and companies are showcasing their AI innovations achieved through early investment.
“We are on the starting line of the new era, and no one can predict the impact and speed of the AI revolution across the industries,” Chey told Korean reporters after touring corporate booths on the opening day of CES 2024 at the Las Vegas Convention Center in Las Vegas. Reflecting on the rapid evolution of AI technologies, he highlighted the breakthrough made by ChatGPT, a language model launched about a year ago, which has significantly influenced how AI is perceived and utilized globally. “Until ChatGPT, no one has thought of how AI would change the world. ChatGPT made a breakthrough, and everybody is trying to ride on the wave.”
SK Group Chairman Chey Tae-won speaks during a brief meeting with Korean media on the sidelines of CES 2024 at the Las Vegas Convention Center in Las Vegas on Jan. 9, 2024
…………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………
SK Hynix Inc., SK Group’s chipmaking unit, is one of the prominent companies at CES 2024, boasting its high-performance AI chips like high bandwidth memory (HBM). The latest addition is the HBM3E chips, recognized as the world’s best-performing memory product. Mass production of HBM3E is scheduled to begin in the first half of 2024.
SK Telecom Co. is also working on AI, having Sapeon, an AI chip startup under its wing. Chey stressed the importance of integrating AI services and solutions across SK Group’s diverse business sectors, ranging from energy to telecommunications and semiconductors. “It’s crucial for each company to collaborate and present a unified package or solution rather than developing them separately,” Chey said. “But I don’t think it is necessary to set up a new unit for that. I think we should come up with an integrated channel for customers.”
SK Telecom and Deutsche Telekom are jointly developing Large Language Models for generative AI to be used by telecom network providers.
…………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………………
References:
https://en.yna.co.kr/view/AEN20240110001900320#
https://eng.sk.com/news/ces-2024-sk-to-showcase-world-class-carbon-reduction-and-ai-technologies
SK Telecom inspects cell towers for safety using drones and AI
SK Telecom and Deutsche Telekom to Jointly Develop Telco-specific Large Language Models (LLMs)
SK Telecom and Thales Trial Post-quantum Cryptography to Enhance Users’ Protection on 5G SA Network
Google announces Gemini: its most powerful AI model, powered by TPU chips
Google claims it has developed a new Generative Artificial Intelligence (GenAI) system and Large Language Model (LLM) more powerful than any currently on the market, including technology developed by ChatGPT creator OpenAI. Gemini can summarize text, create images and answer questions. Gemini was trained on Google’s Tensor Processing Units v4 and v5e.
Google’s Bard is a generative AI based on the PaLM large language model. Starting today, Gemini will be used to give Bard “more advanced reasoning, planning, understanding and more,” according to a Google blog post.
While global users of Google Bard and the Pixel 8 Pro will be able to run Gemini now, an enterprise product, Gemini Pro, is coming on Dec. 13th. Developers can sign up now for an early preview in Android AICore.
Gemini comes in three model sizes: Ultra, Pro and Nano. Ultra is the most capable, Nano is the smallest and most efficient, and Pro sits in the middle for general tasks. The Nano version is what Google is using on the Pixel, while Bard gets Pro. Google says it plans to run “extensive trust and safety checks” before releasing Gemini Ultra to select groups.
Gemini can code in Python, Java, C++, Go and other popular programming languages. Google used Gemini to upgrade Google’s AI-powered code generation system, AlphaCode. Next, Google plans to bring Gemini to Ads, Chrome and Duet AI. In the future, Gemini will be used in Google Search as well.
……………………………………………………………………………………………………………………………………………………
Market Impact:
Gemini’s release and use will present a litmus test for Google’s technology following a push to move faster in developing and releasing AI products. It coincides with a period of turmoil at OpenAI that has sent tremors through the tight-knit AI community, suggesting the industry’s leadership is far from settled.
The announcement of the new GenAI software is the latest attempt by Google to display its AI portfolio after the launch of ChatGPT about a year ago shook up the tech industry. Google wanted outside customers to perform testing on the most advanced version of Gemini before releasing it more widely, said Demis Hassabis, chief executive officer of Google DeepMind.
“We’ve been pushing forward with a lot of focus and intensity,” Hassabis said, adding that Gemini likely represented the company’s most ambitious combined science and engineering project to date.
Google said Wednesday it would offer a range of AI programs to customers under the Gemini umbrella. It touted the software’s ability to process various media, from audio to video, an important development as users turn to chatbots for a wider range of needs.
The most powerful Gemini Ultra version outperformed OpenAI’s technology, GPT-4, on a range of industry benchmarks, according to Google. That version is expected to become widely available for software developers early next year following testing with a select group of customers.
………………………………………………………………………………………………………………………………………………………………………………………………
Role of TPUs:
While most GenAI software and LLMs are processed using NVIDIA’s GPUs, Google’s tensor processing units (TPUs) will power Gemini. TPUs are custom-designed AI accelerators, which are optimized for training and inference of large AI models. Cloud TPUs are optimized for training large and complex deep learning models that feature many matrix calculations, for instance building large language models (LLMs). Cloud TPUs also have SparseCores, which are dataflow processors that accelerate models relying on embeddings found in recommendation models. Other use cases include healthcare, like protein folding modeling and drug discovery.
Google’s custom AI chips, known as tensor processing units, are embedded in compute servers at the company’s data center. Photo Credit: GOOGLE
…………………………………………………………………………………………………………………………………
Competitors:
Gemini and the products built with it, such as chatbots, will compete with OpenAI’s GPT-4, Microsoft’s Copilot (which is based on OpenAI’s GPT-4), Anthropic’s Claude AI, Meta’s Llama 2 and more. Google claims Gemini Ultra outperforms GPT-4 in several benchmarks, including the massive multitask language understanding general knowledge test and in Python code generation.
…………………………………………………………………………………………………………………………………
References:
Everything to know about Gemini, Google’s new AI model (blog.google)
Google Reveals Gemini, Its Much-Anticipated Large Language Model (techrepublic.com)
MTN Consulting: Generative AI hype grips telecom industry; telco CAPEX decreases while vendor revenue plummets
Ever since Generative (Gen) AI burst into the mainstream through public-facing platforms (e.g. ChatGPT) late last year, its promising capabilities have caught the attention of many. Not surprisingly, telecom industry execs are among the curious observers wanting to try Gen AI even as it continues to evolve at a rapid pace.
MTN Consulting says the telecom industry’s bond with AI is not new though. Many telcos have deployed conventional AI tools and applications in the past several years, but Gen AI presents opportunities for telcos to deliver significant incremental value over existing AI. A few large telcos have kickstarted their quest for Gen AI by focusing on “localization.” Through localization of processes using Gen AI, telcos vow to eliminate language barriers and improve customer engagement in their respective operating markets, especially where English as a spoken language is not dominant.
Telcos can harness the power of Gen AI across a wide range of different functions, but the two vital telco domains likely to witness transformative potential of Gen AI are networks and customer service. Both these domains are crucial: network demands are rising at an unprecedented pace with increased complexity, and delivering differentiated customer experiences remains an unrealized ambition for telcos.
Several Gen AI use cases are emerging within these two telco domains to address these challenges. In the network domain, these include topology optimization, network capacity planning, and predictive maintenance, for example. In the customer support domain, they include localized virtual assistants, personalized support, and contact center documentation.
Most of the use cases leveraging Gen AI applications involve dealing with sensitive data, be it network-related or customer-related. This will have major implications from the regulatory point of view, and regulatory concerns will constrain telcos’ Gen AI adoption and deployment strategies. The big challenge is the mosaic of complex and strict regulations prevalent in different markets that telcos will have to understand and adhere to when implementing Gen AI use cases in such markets. This is an area where third-party vendors will try to cash in by offering Gen AI solutions that are compliant with regulations in the respective markets.
Vendors will also play a key role for small- and medium-sized telcos in Gen AI implementation, by easing constraints that stem from limited technical expertise, HW/SW resources, and skilled manpower, as well as the opex burden. Key vendors to watch out for in the Gen AI space are webscale providers, who combine the cloud computing resources required to train large language models (LLMs) with Gen AI expertise offered through pre-trained models.
Other key points from MTN Consulting on Gen AI in the telecom industry:
- Network operations and customer support will be key transformative areas.
- Telco workforce will become leaner but smarter in the Gen AI era.
- Strict regulations will be a major barrier for telcos.
- Vendors key to Gen AI integration; webscale providers set for more telco gains.
- Lock-in risks and rising software costs are key considerations in choosing vendors.
………………………………………………………………………………………………………………………………
Separately, MTN Consulting’s latest forecast called for $320B of telco capex in 2023, down only slightly from the $328B recorded in 2022. Early 3Q23 revenue reports from vendors selling into the telco market call this forecast into question. The dip in the Americas is worse than expected, and Asia’s expected 2023 growth has not materialized.
Key vendors are reporting significant YoY drops in revenue, pointing to inventory corrections, macroeconomic uncertainty (interest rates, in particular), and weaker telco spending. Network infrastructure sales to telcos (Telco NI) for key vendors Ericsson and Nokia dropped 11% and 16% YoY in 3Q23, respectively, measured in US dollars. By the same metric, NEC, Fujitsu and Samsung saw +1%, -52%, and -41% YoY growth; Adtran, Casa, and Juniper declined 29%, 7%, and 20%; fiber-centric vendors Clearfield, Corning, CommScope, and Prysmian all saw double digit declines.
MTN Consulting will update its operator forecast formally next month. In advance, this comment flags a weaker spending outlook than expected. Telco capex for 2023 is likely to come in around $300-$310B.
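The gap between the original forecast and the revised outlook is easier to see as year-over-year percentages. This is a quick sketch using only the dollar figures quoted above; the helper name is hypothetical.

```python
def yoy_change(current: float, previous: float) -> float:
    """Return the year-over-year change as a fraction of the previous value."""
    return (current - previous) / previous


capex_2022 = 328              # $B, recorded telco capex in 2022
original_forecast_2023 = 320  # $B, MTN Consulting's latest formal forecast
revised_low, revised_high = 300, 310  # $B, the weaker range flagged above

print(f"Original forecast: {yoy_change(original_forecast_2023, capex_2022):+.1%}")
print(f"Revised range:     {yoy_change(revised_low, capex_2022):+.1%} "
      f"to {yoy_change(revised_high, capex_2022):+.1%}")
```

The original forecast implied a decline of a little over 2%, while the revised range implies a drop of roughly 5% to 9%.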
MTN Consulting’s Network Operator Forecast Through 2027: “Telecom is essentially a zero-growth industry”
MTN Consulting: Top Telco Network Infrastructure (equipment) vendors + revenue growth changes favor cloud service providers
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
Cloud Service Providers struggle with Generative AI; Users face vendor lock-in; “The hype is here, the revenue is not”
Global Telco AI Alliance to progress generative AI for telcos
Amdocs and NVIDIA to Accelerate Adoption of Generative AI for $1.7 Trillion Telecom Industry
Bain & Co, McKinsey & Co, AWS suggest how telcos can use and adapt Generative AI
Generative AI Unicorns Rule the Startup Roost; OpenAI in the Spotlight
Generative AI in telecom; ChatGPT as a manager? ChatGPT vs Google Search
Generative AI could put telecom jobs in jeopardy; compelling AI in telecom use cases
MTN Consulting: Satellite network operators to focus on Direct-to-device (D2D), Internet of Things (IoT), and cloud-based services
MTN Consulting on Telco Network Infrastructure: Cisco, Samsung, and ZTE benefit (but only slightly)
MTN Consulting: 4Q2021 review of Telco & Webscale Network Operators Capex
Proposed solutions to high energy consumption of Generative AI LLMs: optimized hardware, new algorithms, green data centers
Introduction:
Many generative AI tools rely on a type of natural-language processing called large language models (LLMs) to first learn and then make inferences about languages and linguistic structures (like code or legal-case prediction) used throughout the world. Some companies that use LLMs include: Anthropic (now collaborating with Amazon), Microsoft, OpenAI, Google, Amazon/AWS, Meta (FB), SAP, IQVIA. Here are some examples of LLMs: Google’s BERT, Amazon’s Bedrock, Falcon 40B, Meta’s Galactica, OpenAI’s GPT-3 and GPT-4, Google’s LaMDA, Hugging Face’s BLOOM, and Nvidia’s NeMo LLM.
The training process of the Large Language Models (LLMs) used in generative artificial intelligence (AI) is a cause for concern. LLMs can consume many terabytes of data and use over 1,000 megawatt-hours of electricity.
Alex de Vries, a Ph.D. candidate at VU Amsterdam and founder of the digital-sustainability blog Digiconomist, published a report in Joule which predicts that current AI technology could be on track to annually consume as much electricity as the entire country of Ireland (29.3 terawatt-hours per year).
“As an already massive cloud market keeps on growing, the year-on-year growth rate almost inevitably declines,” John Dinsdale, chief analyst and managing director at Synergy, told CRN via email. “But we are now starting to see a stabilization of growth rates, as cloud provider investments in generative AI technology help to further boost enterprise spending on cloud services.”
Hardware vs Algorithmic Solutions to Reduce Energy Consumption:
Roberto Verdecchia is an assistant professor at the University of Florence and the first author of a paper published on developing green AI solutions. He says that de Vries’s predictions may even be conservative when it comes to the true cost of AI, especially when considering the non-standardized regulation surrounding this technology. AI’s energy problem has historically been approached through optimizing hardware, says Verdecchia. However, continuing to make microelectronics smaller and more efficient is becoming “physically impossible,” he added.
In his paper, published in the journal WIREs Data Mining and Knowledge Discovery, Verdecchia and colleagues highlight several algorithmic approaches that experts are taking instead. These include improving data-collection and processing techniques, choosing more-efficient libraries, and improving the efficiency of training algorithms. “The solutions report impressive energy savings, often at a negligible or even null deterioration of the AI algorithms’ precision,” Verdecchia says.
……………………………………………………………………………………………………………………………………………………………………………………………………………………
Another Solution – Data Centers Powered by Alternative Energy Sources:
The immense amount of energy needed to power LLMs, like the one behind ChatGPT, is creating a new market for data centers that run on alternative energy sources such as geothermal, nuclear and flared gas, a byproduct of oil production. Grid electricity, which currently powers the vast majority of data centers, is already strained by existing demand on the country’s electric grids. AI could consume up to 3.5% of the world’s electricity by 2030, according to an estimate from IT research and consulting firm Gartner.
Amazon, Microsoft, and Google were among the first to explore wind and solar-powered data centers for their cloud businesses, and are now among the companies exploring new ways to power the next wave of AI-related computing. But experts warn that given their high risk, cost, and difficulty scaling, many nontraditional sources aren’t capable of solving near-term power shortages.
Exafunction, maker of the Codeium generative AI-based coding assistant, sought out energy startup Crusoe Energy Systems for training its large language models because it offered better prices and availability of graphics processing units, the advanced AI chips primarily produced by Nvidia, said Exafunction’s chief executive, Varun Mohan.
AI startups are typically looking for five to 25 megawatts of data center power, or as much as they can get in the near term, according to Pat Lynch, executive managing director for commercial real-estate services firm CBRE’s data center business. Crusoe will have about 200 megawatts by year’s end, said Chase Lochmiller, the company’s chief executive. Training one AI model like OpenAI’s GPT-3 can use up to 10 gigawatt-hours, roughly equivalent to the amount of electricity 1,000 U.S. homes use in a year, University of Washington research estimates.
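To put the figures above on a common scale, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted in the paragraph (10 gigawatt-hours per training run, 1,000 U.S. homes, and 5 to 25 megawatts of data center power). The function names are hypothetical, and the second calculation assumes, purely for illustration, that a facility draws its full rated power continuously and devotes all of it to a single training run.

```python
# Unit conversion factors
GWH_TO_MWH = 1_000
MWH_TO_KWH = 1_000


def implied_home_usage_kwh(training_gwh: float, homes: int) -> float:
    """Annual kWh per home implied by equating one training run to `homes` households."""
    return training_gwh * GWH_TO_MWH * MWH_TO_KWH / homes


def facility_hours(training_gwh: float, facility_mw: float) -> float:
    """Hours a facility drawing `facility_mw` continuously needs to deliver `training_gwh`.

    Illustrative assumption: the full facility power goes to one training run.
    """
    return training_gwh * GWH_TO_MWH / facility_mw


if __name__ == "__main__":
    # 10 GWh spread over 1,000 homes implies about 10,000 kWh per home per year
    print(implied_home_usage_kwh(10, 1_000))
    # Time to deliver 10 GWh at the high and low ends of the 5-25 MW range
    print(facility_hours(10, 25))  # hours at 25 MW
    print(facility_hours(10, 5))   # hours at 5 MW
```

Under these assumptions, the 10 gigawatt-hour figure implies roughly 10,000 kilowatt-hours per home per year, and a 25-megawatt facility running flat out would take about 400 hours (around 17 days) to deliver that much energy.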
Major cloud providers capable of providing multiple gigawatts of power are also continuing to invest in renewable and alternative energy sources to power their data centers, and use less water to cool them down. By some estimates, data centers account for 1% to 3% of global electricity use.
An Amazon Web Services spokesperson said the scale of its massive data centers means it can make better use of resources and be more efficient than smaller, privately operated data centers. Amazon says it has been the world’s largest corporate buyer of renewable energy for the past three years.
Jen Bennett, a Google Cloud leader in technology strategy for sustainability, said the cloud giant is exploring “advanced nuclear” energy and has partnered with Fervo Energy, a startup beginning to offer geothermal power for Google’s Nevada data center. Geothermal, which taps heat under the earth’s surface, is available around the clock and not dependent on weather, but comes with high risk and cost.
“Similar to what we did in the early days of wind and solar, where we did these large power purchase agreements to guarantee the tenure and to drive costs down, we think we can do the same with some of the newer energy sources,” Bennett said.
References:
https://aws.amazon.com/what-is/large-language-model/
https://spectrum.ieee.org/ai-energy-consumption
https://www.crn.com/news/cloud/microsoft-aws-google-cloud-market-share-q3-2023-results/6
Amdocs and NVIDIA to Accelerate Adoption of Generative AI for $1.7 Trillion Telecom Industry
SK Telecom and Deutsche Telekom to Jointly Develop Telco-specific Large Language Models (LLMs)
AI Frenzy Backgrounder; Review of AI Products and Services from Nvidia, Microsoft, Amazon, Google and Meta; Conclusions