Amazon Web Services (AWS) has announced the general availability of its Trainium2 AI chips, designed to accelerate the training and deployment of large language models (LLMs). AWS says the chips offer 30 to 40 percent better price performance than the current generation of GPU-based EC2 instances, with a single EC2 Trn2 instance of 16 Trainium2 chips delivering up to 20.8 petaflops of compute for dense models at FP8 precision. AWS also highlights that serving models on Trainium2 through Amazon Bedrock, its managed LLM platform, yields three times higher token-generation throughput than competing offerings from other cloud providers. Alongside the instances, the company unveiled EC2 Trn2 UltraServers, each combining 64 interconnected Trainium2 chips and scaling to 83.2 peak petaflops at FP8 for sparse models.
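As a quick sanity check on those figures, the arithmetic below derives the implied per-chip throughput and extrapolates linearly to the 64-chip UltraServer. The linear-scaling assumption is ours, not AWS's, and the instance figure is quoted for dense models while the UltraServer figure is quoted for sparse models, so the comparison is indicative only.

```python
# Back-of-the-envelope check of the quoted figures. Linear per-chip
# scaling is an assumption for illustration, not an AWS statement.
INSTANCE_CHIPS = 16        # Trainium2 chips per EC2 Trn2 instance
INSTANCE_PFLOPS = 20.8     # quoted dense FP8 petaflops per instance
ULTRASERVER_CHIPS = 64     # Trainium2 chips per Trn2 UltraServer

per_chip = INSTANCE_PFLOPS / INSTANCE_CHIPS
print(f"Implied per-chip throughput: {per_chip:.2f} petaflops")  # -> 1.30

extrapolated = per_chip * ULTRASERVER_CHIPS
print(f"Linear extrapolation to 64 chips: {extrapolated:.1f} petaflops")
# -> 83.2, matching the quoted UltraServer peak (which is quoted for
#    sparse models, so the two figures are not strictly comparable).
```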
AWS is collaborating with Anthropic to build a massive cluster of these UltraServers containing hundreds of thousands of Trainium2 chips. The cluster is projected to be five times as powerful as the one Anthropic used to train its current generation of models, which would make it the largest AI compute cluster reported to date. That scaling rests on AWS's NeuronLink interconnect, which ties the Trainium2 chips in an UltraServer together for fast, low-latency chip-to-chip communication. The advance positions AWS as a strong competitor in the rapidly evolving AI chip market, offering a credible alternative to the currently dominant GPU vendors.
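The collective that dominates multi-chip training is all-reduce, which sums gradients across chips at every step. The sketch below simulates a generic ring all-reduce in NumPy to illustrate the communication pattern that interconnects like NeuronLink are built to accelerate; it is a conceptual illustration of the algorithm, not AWS's implementation, and all names in it are ours.

```python
import numpy as np

def ring_all_reduce(grads):
    """Simulate a ring all-reduce over n workers.

    Each worker starts with its own gradient vector and ends with the
    element-wise sum across all workers. Per-worker traffic is about
    2*(n-1)/n times the vector size regardless of n, which is why the
    speed of the chip-to-chip link dominates scaling efficiency.
    """
    n = len(grads)
    # chunks[i][c] is worker i's current copy of chunk c.
    chunks = [list(np.array_split(g.astype(float), n)) for g in grads]

    # Reduce-scatter: after n - 1 ring steps, worker i holds the fully
    # summed chunk (i + 1) % n.
    for step in range(n - 1):
        for i in range(n):
            c = (i - step) % n
            chunks[(i + 1) % n][c] = chunks[(i + 1) % n][c] + chunks[i][c]

    # All-gather: circulate the summed chunks until every worker holds
    # every fully reduced chunk.
    for step in range(n - 1):
        for i in range(n):
            c = (i + 1 - step) % n
            chunks[(i + 1) % n][c] = chunks[i][c]

    return [np.concatenate(c) for c in chunks]

# Four simulated workers, each holding a random "gradient".
rng = np.random.default_rng(0)
grads = [rng.standard_normal(8) for _ in range(4)]
reduced = ring_all_reduce(grads)
assert all(np.allclose(r, sum(grads)) for r in reduced)
```

Because each worker's traffic in a ring all-reduce is nearly constant as the ring grows, throughput at scale is set almost entirely by the per-link bandwidth and latency, which is exactly what a dedicated interconnect is meant to improve.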
Looking to the future, AWS also previewed Trainium3, which it projects will deliver a four-fold performance increase over Trainium2 and is slated for release in late 2025. Reports indicate that Apple already uses AWS AI chips for search applications and is evaluating Trainium2 for pretraining models for its Apple Intelligence features, underscoring early customer interest in the new silicon. The release of Trainium2 and the upcoming Trainium3 mark significant steps in AWS's ongoing effort to provide leading-edge AI infrastructure for the rapidly growing demand for LLM development and deployment.
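Taking the article's numbers at face value, a short extrapolation shows what the projected four-fold gain would imply at the UltraServer level; the multiplication is ours, and the article does not specify Trainium3 chip counts or system configurations.

```python
# Speculative arithmetic from the article's figures only.
TRN2_ULTRASERVER_PEAK_PFLOPS = 83.2  # quoted peak FP8 petaflops, 64 chips
TRAINIUM3_PROJECTED_GAIN = 4.0       # AWS's stated four-fold target

implied_peak = TRN2_ULTRASERVER_PEAK_PFLOPS * TRAINIUM3_PROJECTED_GAIN
print(f"Implied Trainium3-era UltraServer peak: {implied_peak:.1f} petaflops")
# -> 332.8 petaflops, assuming the 4x gain applies at the same system
#    scale (an assumption; configurations are not given in the article).
```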