
Google introduces Trillium TPU v6: improved performance for AI models

  • November 6, 2024

Google announces Trillium, the latest generation of its Tensor Processing Units (TPUs), for Google Cloud customers. Trillium delivers higher performance for both training and inference while improving energy efficiency and lowering costs.

At the App Dev & Infrastructure Summit last week, Google announced Trillium, its sixth-generation TPU, which marks a leap forward in performance. Compared with the previous TPU v5e, Trillium delivers more than four times the training performance and up to three times the inference throughput. It is also 67 percent more energy-efficient and doubles both the capacity of the High Bandwidth Memory (HBM) and the bandwidth of the Interchip Interconnect (ICI). These gains make the sixth generation well suited to large, compute-intensive AI models. Trillium is available in preview for Google Cloud customers.

Language models

The enhancements allow larger AI models, such as large language models (LLMs) and computationally intensive diffusion models, to be trained and deployed more efficiently. Google specifically names Gemma 2, Llama and Stable Diffusion XL as models that benefit from the new TPU architecture.

With doubled HBM capacity, Trillium can hold larger models with more complex networks and larger key-value caches on each chip, which contributes to more efficient resource utilization. Per-chip performance also rises significantly: peak performance is 4.7 times that of the previous generation.
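
To make the memory point concrete, the sketch below estimates the size of a key-value cache using the standard formula for a decoder-only transformer. The model shape, batch size and sequence length are hypothetical values chosen for illustration; they are not taken from Google's announcement, and the article does not state Trillium's HBM size.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch_size,
                   bytes_per_value=2):
    """Approximate key-value cache size for a decoder-only transformer.

    The leading factor of 2 accounts for storing both keys and values
    at every layer. bytes_per_value=2 assumes a 16-bit number format.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical model shape, for illustration only (not a real model's config).
cache = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                       seq_len=8192, batch_size=16)
print(f"KV cache: {cache / 2**30:.1f} GiB")  # KV cache: 16.0 GiB

# The cache grows linearly with sequence length and batch size, so doubling
# either one doubles the memory that must fit alongside the model weights.
doubled = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                         seq_len=16384, batch_size=16)
print(doubled / cache)  # 2.0
```

Because the cache scales linearly with context length and batch size, doubling the available memory roughly doubles the context or batch that fits per chip, all else being equal.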

Scalability and cost advantages

Trillium is designed for high scalability. Up to 256 chips can be connected in a single pod, and deployments can then scale to hundreds of pods. The result is a building-scale supercomputer interconnected by the Jupiter data center network at 13 petabits per second. The Multislice software provides near-linear scaling under heavy workloads, which makes the TPU usable for complex and intensive training scenarios.
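
As a rough illustration of these scaling figures, the sketch below counts the chips in a multi-pod deployment and estimates the speedup over a single pod. Only the 256-chips-per-pod figure comes from the article; the pod count and the scaling efficiency are hypothetical values, and the simple linear model is an assumption, not Google's methodology.

```python
CHIPS_PER_POD = 256  # figure quoted in the article


def total_chips(num_pods):
    """Number of chips across num_pods fully populated pods."""
    return CHIPS_PER_POD * num_pods


def speedup_over_one_pod(num_pods, scaling_efficiency):
    """Speedup relative to one pod under a simple linear model.

    scaling_efficiency is a hypothetical value in (0, 1]; 1.0 would
    mean perfectly linear scaling.
    """
    return num_pods * scaling_efficiency


pods = 100  # hypothetical deployment size ("hundreds of pods" per the article)
print(total_chips(pods))                          # 25600
print(f"{speedup_over_one_pod(pods, 0.95):.0f}x")  # 95x
```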

In addition to the performance improvements, Google also emphasizes Trillium’s cost-effectiveness. The new TPU offers almost 1.8 times the performance per dollar of the TPU v5e and almost double that of the TPU v5p. This makes Trillium a cost-effective choice for customers who need high-performance, scalable infrastructure for large-scale AI training and inference.
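
The performance-per-dollar figures can be turned into an approximate cost comparison. The sketch below assumes a fixed workload and that the quoted ratios apply uniformly to it, which is a simplification; the function name is illustrative.

```python
def relative_cost(perf_per_dollar_gain):
    """Cost of a fixed workload relative to the baseline chip.

    If the new chip delivers `perf_per_dollar_gain` times the performance
    per dollar, the same work costs 1 / gain of the baseline price.
    """
    return 1.0 / perf_per_dollar_gain


vs_v5e = relative_cost(1.8)  # "almost 1.8x" per the article
vs_v5p = relative_cost(2.0)  # "almost double" per the article
print(f"vs v5e: {vs_v5e:.0%} of the cost")  # vs v5e: 56% of the cost
print(f"vs v5p: {vs_v5p:.0%} of the cost")  # vs v5p: 50% of the cost
```

Under these assumptions, the same job would cost a little over half as much on Trillium as on either earlier chip.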

Google hopes these innovations will usher in a new era for applications that require large-scale AI models. Trillium is now available in preview for Google Cloud users.

Source: IT Daily
