
Wanted: GPUs, but is there a shortage?


Everyone wants to get in on the AI hype, but not everyone has the necessary hardware to train or use models. Dealing with the GPU shortage will be the subject of much discussion during KubeCon.

“People always talk about a GPU shortage, but is it true?” asks Lachlan Evenson, Principal Program Manager at Microsoft. “Today, billions of people walk around with a CPU and a GPU in their pockets. Instead of always wanting the newest, most powerful chips, shouldn’t we be thinking about how we can get more out of the devices we already use?”

There is currently no hardware more in demand in the IT industry than the GPU. Nvidia only has to announce a new chip and stock sells out, even though each one costs tens of thousands of euros. Hyperscalers and AI specialists are ordering tens of thousands of chips per shipment, while everyone else hunts for what is left over. AI and GPUs also dominated KubeCon.

Not a new problem

The call for GPUs in the IT industry is louder than ever, but the problem is not new. Graphics cards have been a scarce commodity since 2020. During the coronavirus pandemic, demand for all kinds of IT hardware and components rose disproportionately, and manufacturers simply could not keep up.

Billions of people today walk around with a CPU and a GPU in their pockets. Instead of always wanting the newest, most powerful chips, shouldn’t we be thinking about how we can get more out of the devices we already use?

Lachlan Evenson, Principal Program Manager, Microsoft

Initially it was mainly crypto miners, gamers and hobbyists snapping up GPUs, but the current hype around (generative) AI has made them sought after by a far wider audience. The shortage is shifting from lower-priced consumer GPUs to high-end graphics chips.

While the CPU segment has gradually stabilized, the gap between supply and demand for GPUs appears to be widening. Jonathan Bryce, Executive Director of the OpenInfra Foundation, outlines the current situation. “Users of public cloud services generally don’t care about the underlying hardware. That is changing with AI, and especially with generative AI, because it requires specific chips. The demand for GPUs will therefore not fall again anytime soon.”

Ups and downs

GPUs are the foundation for both training and running AI models. Originally, graphics cards were primarily intended to take graphics operations off the CPU’s hands. While a CPU consists of a handful of powerful cores that work through tasks sequentially, a GPU has up to thousands of smaller cores that all work in parallel at the same time. This makes GPUs more suitable for tasks that consist of a large number of smaller operations.

Think of it this way: four very strong construction workers can move heavy beams with ease, but five hundred small children move five hundred small stones faster. AI training and inference consist of a lot of work that needs to be done in parallel, but where each individual task is not that complex.
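To make that contrast concrete, here is a minimal sketch in Python with NumPy rather than actual GPU code: the same element-wise addition written once as a sequential loop and once as a single data-parallel operation, which is the pattern a GPU spreads across its thousands of cores.

```python
import time
import numpy as np

n = 10_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# "Construction worker" style: work through the elements one at a time.
start = time.perf_counter()
out_loop = [a[i] + b[i] for i in range(n)]
print(f"sequential loop: {time.perf_counter() - start:.2f}s")

# "Five hundred children" style: the same work as one data-parallel
# operation. NumPy runs it as a single vectorized pass; on a GPU, each
# element would map onto one of thousands of small cores.
start = time.perf_counter()
out_vec = a + b
print(f"vectorized add:  {time.perf_counter() - start:.2f}s")
```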

AI workloads, on the other hand, cause large spikes in GPU consumption. Training and developing AI models is an intensive process, but technically speaking it is mostly inference, the day-to-day use of models, that causes these spikes. Still, many companies struggle with efficiency issues during these peak times, meaning the GPUs in their arsenal do not perform to their full potential, AI company ClearML found in a recent study.

The solution to the GPU shortage lies not only in producing more GPUs, but above all in getting the most out of each chip. Bryce agrees: “An increase in efficiency of just a few percent per unit equals more GPUs.”

Sharing is caring

The red carpet has been rolled out for Nvidia: of course, the GPU specialist par excellence cannot be missed at KubeCon. Company representatives appear in several sessions to talk about how companies can get more out of their (Nvidia) GPUs. Nvidia engineer Kevin Klues points, among other things, to the concept of GPU sharing. This can be done in different ways, but the principle is the same: a physical GPU is split into multiple virtual GPUs, allowing different workloads to run on the chip at the same time. Nvidia introduced this capability with MIG technology on its Ampere architecture.
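As a rough illustration of what GPU sharing looks like on a Kubernetes cluster, the sketch below uses the Python Kubernetes client to schedule a pod onto a MIG slice instead of a whole GPU. It assumes the cluster’s NVIDIA device plugin exposes MIG profiles as extended resources; the resource name, image and pod name are examples, not something taken from the KubeCon sessions.

```python
# Requires: pip install kubernetes, plus a cluster whose NVIDIA device
# plugin advertises MIG profiles (resource names below are examples).
from kubernetes import client, config

config.load_kube_config()  # use the local kubeconfig

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="cuda-job",
                image="nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04",
                command=["nvidia-smi", "-L"],
                resources=client.V1ResourceRequirements(
                    # Ask for one MIG slice instead of a whole GPU; the
                    # profile name depends on how the cluster is set up.
                    limits={"nvidia.com/mig-1g.5gb": "1"},
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```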

In the spirit of openness and togetherness that KubeCon aims to convey, the GPU theme unites even sworn rivals. Intel and Nvidia actually appear side by side in a session to talk about dynamic resource allocation (DRA). A remarkable combination, as each would like nothing more than to break the AI dominance of the other.

Arun Gupta, who leads open source activities at Intel, briefly and succinctly explains what DRA means. “DRA is a broader concept, and together with Nvidia we are exploring how we can apply it to GPUs in Kubernetes. This is not a standalone hardware management system, but it can be used to determine how workloads handle resources. One of the possibilities of DRA is time-slicing: distributing GPUs across multiple containers.” Gupta is keen to emphasize that Intel helped devise this.
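DRA itself is still an evolving Kubernetes API, but time-slicing of GPUs across containers can already be tried with the existing NVIDIA device plugin rather than DRA. The sketch below is a hedged example of that simpler route: it creates a ConfigMap in the time-slicing format the device plugin documents, telling it to advertise each physical GPU as four schedulable replicas. Names and the namespace are illustrative, and the device plugin would still need to be pointed at this config.

```python
# Sketch: GPU time-slicing via the NVIDIA device plugin (not DRA itself).
import yaml
from kubernetes import client, config

config.load_kube_config()

# Advertise each physical GPU as four replicas of nvidia.com/gpu, so
# four pods can share one card through time-slicing.
time_slicing = {
    "version": "v1",
    "sharing": {
        "timeSlicing": {
            "resources": [{"name": "nvidia.com/gpu", "replicas": 4}],
        },
    },
}

cm = client.V1ConfigMap(
    metadata=client.V1ObjectMeta(
        name="nvidia-device-plugin-config", namespace="kube-system"
    ),
    data={"config.yaml": yaml.safe_dump(time_slicing)},
)

client.CoreV1Api().create_namespaced_config_map(namespace="kube-system", body=cm)
```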

Don’t forget the CPU

Sudha Raghavan, Vice President of Developer Platform at Oracle, looks for the solution in a different direction: “In many situations, AI workloads can run more cost-effectively on a CPU than on a GPU.” A statement that sounds like music to Gupta’s ears. “The GPU explosion we are currently experiencing comes with a large footprint. But should we use a GPU for everything? Shouldn’t we simply get more out of the CPU to free up the GPU?” says Gupta.

It sounds as if AI is turning the hierarchy between processors on its head. The original purpose of the GPU was to support the CPU, but now the question is how the CPU can support the GPU. Intel is also doing its best to convince the IT world of the usefulness of the NPU, a chiplet that brings AI to the device level. “LLMs have a complex structure. Cloud-native is a good starting point for running models, but once you’re tied to the hardware, it’s difficult to break away from it,” says Gupta, taking a swipe at Nvidia between the lines.

Do we have to use a GPU for everything? Once you get stuck with the hardware, it’s hard to break free.

Arun Gupta, GM Open Ecosystem Intel

However, Intel and Gupta will also have to accept that they cannot currently ignore Nvidia in the AI sector. AI is rarely discussed during KubeCon without the GPU specialist spontaneously coming to mind. Nvidia is the one and only conductor of the AI orchestra.

Source: IT Daily
