AI gets to work: the winners of the next era for the theme
Semiconductor stocks have been surging ever since ChatGPT’s launch in 2022 sparked the AI theme into full flame. While names like Nvidia have been prominent, the list of AI beneficiaries is getting longer and now encompasses CPUs, the brains handling logic in computers and laptops.
Nvidia was the standout beneficiary in the race to develop large language models because its GPU chips, which were originally designed for gaming, proved to be faster and more adaptable for training generative AI.
Semiconductor shares are benefiting from huge demand for high bandwidth memory chips, which are required to run alongside Nvidia’s AI chips, notwithstanding a recent correction which has taken some heat out of the market.
As the industry pivots from training and learning towards inference and AI applications, demand for CPUs now appears to be exploding. We discuss these trends and the outlook for CPU makers later in the article.
Why have semiconductor stocks boomed?
The benchmark PHLX Semiconductor index, commonly known as the SOX, has been on a tear since 2022, rising almost six-fold.
The SOX is a market value-weighted index comprising the 30 largest US-traded companies involved in the design, production and sale of semiconductors.
Surging demand for high-end AI data storage chips has tipped the notoriously cyclical semiconductor industry into a severe shortage that analysts believe will last for at least the next two years.
That has created a huge tailwind for the likes of Korea-based SK Hynix, with the company on track to more than quadruple profits this year, based on consensus forecasts.
Profit growth reflects strength in memory chip prices, which have surged as capacity has been diverted towards AI data centres amid rapidly growing demand from hyperscalers like Microsoft and Alphabet.
This means the wider electronics industry has been starved of capacity, which has had a knock-on effect on RAM (Random Access Memory) prices.
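RAM is an umbrella term for temporary system memory, which loses its contents when the power is switched off. It should not be confused with flash memory, which retains data without power and is used for storage.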
The price of pre-packaged 32GB computer RAM has jumped from $100 in late 2025 to more than $350 in early 2026.
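To put that move in proportion, the short Python sketch below converts the two quoted prices into a multiple and a percentage rise. The function name is illustrative only, and because the article says "more than $350", the results should be read as lower bounds.

```python
def price_change(old_price: float, new_price: float) -> tuple[float, float]:
    """Return (multiple, percentage increase) between two prices."""
    multiple = new_price / old_price
    pct_increase = (new_price - old_price) / old_price * 100
    return multiple, pct_increase

# Prices quoted in the article; $350 is a floor ("more than $350").
multiple, pct_increase = price_change(100.0, 350.0)
print(f"At least {multiple:.1f}x, or a rise of at least {pct_increase:.0f}%")
```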
Can’t chip makers just increase capacity?
There is little doubt that capacity will rise to meet demand eventually, but the question of when is harder to pin down for a couple of reasons.
Firstly, AI systems built around Nvidia GPUs to train large language models like ChatGPT need hundreds of gigabytes of ultra-fast memory.
This is a special type of memory known as HBM (High Bandwidth Memory) and very few companies can make it at scale to satisfy the huge demand coming from the AI data centre build-out.
To put that growth into context, Morgan Stanley recently increased its forecast for AI-related capital expenditures to $850 billion in 2026, nearly doubling 2025 levels, and this total is expected to reach more than a trillion dollars in 2027.
Effectively, demand for AI chips has transformed HBM from a niche product into one of the most important components in AI technology systems.
HBMs are harder to manufacture than regular chips because they use advanced packaging and are part of a complex supply chain.
The second reason clouding the capacity question is that the industry has shrunk to just three major players in recent years who collectively control around 95% of production.
This structure means the remaining chip makers have much greater pricing power and revenue visibility. In summary, only a few firms can manufacture HBM at scale, while supply chain complexity makes it harder to roll out new capacity.
The key players in HBM are Korean companies SK Hynix and Samsung Electronics, whose shares are up 16-fold and three-fold respectively over the last five years, and US chip maker Micron Technology, which is up nine-fold, and recently became the 13th US company to reach $1 trillion in market value.
What is the difference between a CPU and GPU?
It might be useful to use a football analogy. Think of GPUs, or graphics processing units, as the players doing all the work on the pitch: they are the engine room. The CPUs, or central processing units, are like managers and coaches, directing play, issuing instructions and orchestrating the set pieces and strategy. CPUs are responsible for retrieving data from storage, cleaning and formatting it, and feeding it to the GPUs at exactly the right moment.
Could the next shortage be in CPUs?
In its first-quarter results call on 23 April, Intel said the ratio of CPUs to GPUs in AI data centres has already moved from 1:8 to 1:4 in agentic scenarios where inference is dominant.
The company believes the ratio could converge towards parity or go even further as data centre workloads pivot towards inference.
Reaching parity would represent a structural increase in CPU demand per unit of AI capacity deployed of between four times (from today’s 1:4) and eight times (from the earlier 1:8), which is significant.
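The arithmetic behind that range can be checked with a short Python sketch. It assumes the number of GPUs is held fixed as the unit of AI capacity, and the function names are purely illustrative; the ratios are the ones Intel quoted.

```python
from fractions import Fraction

def cpus_per_gpu(cpus: int, gpus: int) -> Fraction:
    """CPUs deployed per GPU for a given CPU:GPU ratio."""
    return Fraction(cpus, gpus)

def demand_multiple(start: Fraction, end: Fraction) -> Fraction:
    """How many times more CPUs are needed per GPU when the ratio shifts."""
    return end / start

earlier = cpus_per_gpu(1, 8)   # 1:8, the earlier ratio
agentic = cpus_per_gpu(1, 4)   # 1:4, agentic scenarios today
parity = cpus_per_gpu(1, 1)    # 1:1, the level Intel thinks is possible

shift_so_far = demand_multiple(earlier, agentic)
to_parity_from_today = demand_multiple(agentic, parity)
to_parity_from_earlier = demand_multiple(earlier, parity)

print(shift_so_far, to_parity_from_today, to_parity_from_earlier)
```

On these assumptions the move already seen doubles CPU demand per GPU, while reaching parity would multiply it by four from today’s level and by eight from the earlier one.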
Inference is driving CPU demand
As enterprises develop more use cases for AI and integrate them into their businesses, analysts expect the volume of inference to explode.
In early 2026 spending on inference overtook spending on training and analysts project it will account for 70% to 80% of all AI infrastructure budgets by the end of the year.
Because inference can run on a wider variety of hardware than training, including cheaper, specialised custom chips and even consumer devices, it has the potential to break Nvidia’s stranglehold on the AI hardware supply chain.
That said, Nvidia is not sitting on its hands, with CEO Jensen Huang recently saying its new Vera CPU, designed for AI agents, gives it access to a new $200 billion market.
“This (Vera CPU) is going to be our new major growth driver,” said Huang while visiting Taiwan.
Why agentic AI is a key driver
While a company might spend $100 million training an AI model, that is dwarfed by the billions spent running it for global customers over the life cycle of a data centre.
As the industry moves towards autonomous AI agents, often referred to as agentic AI, the number of prompts increases exponentially as feedback loops kick in.
AI agents operate in continuous autonomous loops to generate code, execute tests, query databases and interact with software. This shifts the demand driver from single users to AI agents.
In short, the shift to inference positions CPUs higher up the value chain while they are also cost-effective for small language models and edge computing (where the processing is completed close to its source).
Is CPU capacity as constrained as HBM capacity?
There is mounting evidence of capacity constraints, with Intel saying it is “absolutely constrained” and prioritising capacity for data centres at the expense of low-end consumer electronics.
Meanwhile, AMD’s CEO Lisa Su has stated that global demand for CPUs has far outstripped the company’s forecasts. In response the company has been working with partners like TSMC (Taiwan Semiconductor Manufacturing Company) to increase capacity.
Despite these supply pressures, the constraint in CPUs is distributed among several players like Intel, AMD and ARM, as well as companies like Alphabet, which has developed custom-made AI chips with Broadcom.
This means the market is more competitive and less concentrated than the markets for GPUs and memory storage.
With AI moving so fast, it is difficult to know how long the CPU shortage will last, but the shift towards autonomous agents looks durable and remains a key driver for inference and therefore CPU demand.
