Elon Musk’s artificial intelligence startup xAI has unveiled Grok 3, its latest AI model, which the company claims outperforms leading competitors across key technical benchmarks. The announcement marks a significant escalation in the race to develop more powerful AI systems.
The launch comes just days after Musk’s failed $97.4 billion bid to acquire OpenAI, the company he co-founded with Sam Altman in 2015. During a livestreamed demonstration on X, Musk characterized Grok 3 as “an order of magnitude more capable than Grok 2” and emphasized its ability to reason through complex problems.
Early testing appears to support some of xAI’s claims. The model topped the influential Chatbot Arena leaderboard, scoring higher than OpenAI’s GPT-4o, Google’s Gemini and DeepSeek’s V3 model in blind user testing. Published benchmarks show Grok 3 achieving superior scores in mathematics (AIME ’24), scientific reasoning (GPQA) and coding tasks.

Inside Grok 3’s massive computing infrastructure: 200,000 GPUs and a new data center
“Grok 3 clearly has around state-of-the-art thinking capabilities,” wrote former OpenAI researcher Andrej Karpathy in an X post after early-access testing. “Few models get this right reliably. The top OpenAI thinking models get it too, but all of DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude don’t.”
The model’s development required massive computational resources. xAI doubled its GPU cluster to 200,000 Nvidia chips for training, housed in a new Memphis data center. This infrastructure investment highlights the growing computational demands of advanced AI development, as companies race to build more capable systems.
I was given early access to Grok 3 earlier today, making me I think one of the first few who could run a quick vibe check.
Thinking
— Andrej Karpathy (@karpathy) February 18, 2025
✅ First, Grok 3 clearly has an around state-of-the-art thinking model (“Think” button) and did great out of the box on my Settler’s of Catan… pic.twitter.com/qIrUAN1IfD
DeepSearch and advanced reasoning: how Grok 3 aims to outsmart ChatGPT and Google Gemini
A key innovation is Grok 3’s “DeepSearch” feature, which combines web search with reasoning capabilities to analyze information from multiple sources. The system also includes specialized modes for complex problem-solving, including a “Think” function that shows its reasoning process and a “Big Brain” mode that allocates more computing power to difficult tasks.
“The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other,” posted tech industry veteran Robert Scoble, citing a conversation with Apple Siri cofounder Tom Gruber.
Grok 3 benchmarks.
The thing to really pay attention to in AI is learning speed. And @xai is learning way faster than any other.
Who said that?
Apple Siri cofounder Tom Gruber. He told me at dinner a decade ago that that’s the most important thing to pay attention to. pic.twitter.com/yWCiJsN9pU
— Robert Scoble (@Scobleizer) February 18, 2025
However, some limitations emerged during testing. Karpathy noted that the model sometimes fabricates citations and struggles with certain types of humor and ethical reasoning tasks. These challenges are common across current AI systems and highlight the ongoing difficulties in creating truly human-like artificial intelligence.
Scale.ai CEO Alexandr Wang praised the release, tweeting: “Grok 3 is a new best model in the world from the @xai team!” He noted its superior performance on various benchmarks and expressed enthusiasm for future collaboration.
Grok 3 is a new best model in the world from the @xai team!
Grok 3 ranks #1 on Chatbot Arena w/a big gap, and scores impressively on pretraining and reasoning evals.
congrats to @elonmusk @ibab @jimmybajimmyba @Yuhu_ai_
looking forward to more partnership on grok4 & beyond pic.twitter.com/BrPGz17P51
— Alexandr Wang (@alexandr_wang) February 18, 2025
AI industry competition heats up: what Grok 3’s launch means for OpenAI, DeepSeek and the future of artificial intelligence
The model will be available through X’s Premium+ subscription ($40/month) and a new standalone “SuperGrok” service ($30/month). Enterprise API access is planned for the coming weeks.
This launch intensifies competition in the AI industry, particularly as Chinese startup DeepSeek recently demonstrated comparable performance with reportedly lower computational requirements. The development also raises questions about the sustainability of the computational arms race in AI, as companies invest billions in increasingly powerful hardware infrastructure.

Musk emphasized that Grok 3 remains in beta, with improvements expected “almost every day.” The company plans to add voice interaction capabilities within weeks and will open-source its previous model, Grok 2, once the new version stabilizes.
Yet perhaps the most telling aspect of Grok 3’s debut isn’t its technical specifications or benchmark scores, but what it represents: the mounting tension between Musk and his former colleagues at OpenAI. Just days after his failed $97.4 billion bid to acquire OpenAI, Musk has unveiled a model that challenges its supremacy, suggesting that in the high-stakes race for AI dominance, even a rejected suitor can become a formidable rival.