April 24, 2025
Trending News

AWS showcases Project Rainer: custom HPC cluster from Anthropic

  • December 4, 2024
  • 0

AWS is showing Project Rainer at Re:Invent. This is an HPC cluster for AI workloads that is built on the basis of its own chips. The system is

AWS showcases Project Rainer: custom HPC cluster from Anthropic

The smartphone battery runs out more quickly in winter
Energy meter
Discover HPE
Lancom made in Germany

AWS is showing Project Rainer at Re:Invent. This is an HPC cluster for AI workloads that is built on the basis of its own chips. The system is intended to support OpenAI competitor Anthropic in the development of models.

At Re:Invent in Las Vegas, AWS will demonstrate Project Rainer to the general public. Project Rainer is an HPC supercluster built on the basis of hundreds of thousands of self-developed Trainium 2 chips. The original name of these components suggests that they are intended for AI training workloads.

The Rainer project is divided into Trn2 Ultraserver. These are servers consisting of 64 Trainium 2 chips. Each chip has 96 gigabytes of HBM memory and eight NeuronCores. Together they ensure that an Ultraserver brings 332 petaflops of FP8 computing power to the battlefield.

AWS virtually links the Ultraservers together. The hardware for the Project Rainer supercluster is distributed across data centers at various locations. In this way, AWS wants to ensure that there is enough power available to power the entire system.

Higher latency

On the other hand, Project Rainer doesn’t have spectacularly low latency. The cloud provider developed its own network technology under the name Elastic fabric adapter to compensate for this disadvantage to some extent. The Elastic Fabric Adapter ensures that traffic does not have to go through the operating system, improving the overall communication speed in the cluster.

Project Rainer is not yet completed. AWS expects to complete the cluster next year. When this happens, the HPC cluster will be the world’s largest for training AI models. OpenAI competitor Anthropic can then get started. This means that the company has five times more computing power available to develop its models than it does today.

AWS is investing heavily in Anthropic and is trying to create a counterbalance to the Microsoft-OpenAI tandem with this collaboration. Microsoft is also supporting OpenAI with money and raw computing power in Azure.

Source: IT Daily

Leave a Reply

Your email address will not be published. Required fields are marked *