
Supercomputer Frontier trains ChatGPT-scale LLM with a fraction of its computing power

January 8, 2024


Cray Frontier

US researchers have trained a giant LLM using the Frontier supercomputer. In doing so, they used only eight percent of the available accelerator power.

The world’s most powerful supercomputer and first exascale system, Frontier, can train large language models (LLMs). This emerges from a study by Oak Ridge National Laboratory in the US, where the supercomputer is located. The researchers trained two LLMs, one with one trillion parameters and one with 175 billion parameters, using only a fraction of the computing power available on Frontier.

Memory

Frontier features 9,472 AMD Epyc 7A53 processors supported by 37,888 AMD Instinct MI250X accelerators. Memory is the biggest bottleneck when training an LLM: the researchers needed about 14 TB, while each MI250X offers 128 GB of HBM2e (64 GB per graphics compute die). Combining GPUs is the solution, but such massive parallelization of training workloads brings its own complications, because the GPUs must exchange data with each other quickly and efficiently.
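A back-of-the-envelope calculation in Python shows why memory dominates. The figure of roughly 14 bytes of training state per parameter (mixed-precision weights and gradients plus fp32 optimizer state) is an assumption chosen to match the article's 14 TB total, not a breakdown given in the study:

# Rough memory estimate for training a trillion-parameter model.
PARAMS = 1_000_000_000_000      # 1 trillion parameters
BYTES_PER_PARAM = 14            # assumed training state per parameter
GCD_MEMORY_GB = 64              # HBM per MI250X graphics compute die

total_tb = PARAMS * BYTES_PER_PARAM / 1e12
min_dies = PARAMS * BYTES_PER_PARAM / (GCD_MEMORY_GB * 1e9)

print(f"Total training state: ~{total_tb:.0f} TB")           # ~14 TB
print(f"64 GB dies needed just to hold it: {min_dies:.0f}")  # ~219

Holding the state is only the floor; activations, communication buffers and batch-size scaling push the practical GPU count far higher, which is why the team ended up using thousands of GPUs.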

In the end, the researchers used 3,072 of the 37,888 GPUs to train the models. Scaling was a challenge because such work is typically done on Nvidia hardware within the CUDA ecosystem; the researchers found AMD's competing ROCm platform somewhat more spartan by comparison. Through their work, the scientists are giving the AMD ecosystem a welcome boost.
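The article does not name the researchers' training stack, but as a flavor of what multi-GPU training looks like in practice, here is a minimal data-parallel sketch in PyTorch. On AMD hardware, the ROCm build of PyTorch exposes the same torch.cuda and "nccl" interfaces, with collectives handled by RCCL; the model and hyperparameters below are placeholders:

import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # torchrun sets RANK, LOCAL_RANK and WORLD_SIZE for each process.
    dist.init_process_group(backend="nccl")  # maps to RCCL on ROCm
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(4096, 4096).cuda()   # stand-in for an LLM shard
    model = DDP(model, device_ids=[local_rank])
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    x = torch.randn(8, 4096, device="cuda")
    loss = model(x).pow(2).mean()   # dummy objective
    loss.backward()                 # gradients are all-reduced across GPUs here
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()

Launched with, for example, torchrun --nproc_per_node=8 train.py, each process drives one GPU. At the scale of a trillion parameters, data parallelism alone is not enough and is typically combined with tensor and pipeline parallelism so the model itself fits across GPUs.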

Source: IT Daily
