Elon Musk’s AI startup, xAI, reached a major milestone over the weekend by bringing its new AI training system, ‘Colossus,’ online. The system represents a significant leap forward in large-scale AI training infrastructure.
In a tweet, Musk announced that the xAI team successfully brought the Colossus 100K H100 training cluster online, completing the project in a remarkably fast 122 days. He highlighted that Colossus is currently the most powerful AI training system in the world.
Even more impressive is the planned expansion of Colossus. Within the next few months, it will be doubled in size, incorporating 50,000 of NVIDIA’s advanced H200 AI GPUs. These GPUs feature faster HBM3E memory with greater capacity than the H100’s HBM3.
Colossus houses an impressive 100,000 of NVIDIA’s cutting-edge Hopper H100 AI GPUs. This massive computing power positions xAI to develop next-generation large language models with capabilities far surpassing those of its current flagship LLM, Grok 2, which was trained on 15,000 AI GPUs.
The potential of Colossus is immense. Musk had previously indicated that training Grok 3 would require 100,000 NVIDIA H100 AI GPUs. Now, with Colossus operational, the xAI team is equipped to push the boundaries of AI development.
NVIDIA’s new Hopper H200 AI GPUs offer up to 141GB of faster HBM3E memory, compared with the H100’s 80GB of HBM3. This upgrade, combined with Colossus’ massive scale, gives xAI an unparalleled advantage in AI training.
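To put those per-GPU figures in perspective, a quick back-of-the-envelope calculation (a sketch using only the numbers quoted above, with decimal units where 1 TB = 1000 GB) shows the aggregate HBM capacity involved:

```python
# Rough aggregate HBM capacity for Colossus, based on the figures
# quoted in this article. Decimal units: 1 TB = 1000 GB.

H100_MEM_GB = 80    # HBM3 per H100, per the article
H200_MEM_GB = 141   # HBM3E per H200, per the article

current_gpus = 100_000   # H100s online today
planned_h200s = 50_000   # H200s in the planned expansion

current_hbm_tb = current_gpus * H100_MEM_GB / 1000
added_hbm_tb = planned_h200s * H200_MEM_GB / 1000

print(f"Current aggregate HBM: {current_hbm_tb:,.0f} TB")  # 8,000 TB
print(f"H200 expansion adds:   {added_hbm_tb:,.0f} TB")    # 7,050 TB
```

In other words, the 50,000 planned H200s would nearly double the cluster’s total high-bandwidth memory on their own, despite being half the GPU count.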
The successful launch of Colossus marks a significant moment for xAI and the future of artificial intelligence. With this immense computational power, Elon Musk and his team are poised to revolutionize AI development, potentially leading to breakthroughs in language models and other AI applications.