OpenAI launches preview of Codex AI SWE agent for builders

Be a part of our day by day and weekly newsletters for the most recent updates and unique content material on industry-leading AI protection. Study Extra

Shock! Simply days after studies emerged suggesting OpenAI was shopping for white-hot coding startup Windsurf, the previous firm seems to be launching its personal competitor service as a analysis preview beneath its model identify Codex, going head-to-head in opposition to Windsurf, Cursor, and the rising record of AI coding instruments supplied by startups and enormous tech corporations together with Microsoft and Amazon.

In contrast to OpenAI’s earlier Codex code completion AI mannequin, the brand new model is a full cloud-based AI software program engineering (SWE) agent constructed atop a fine-tuned model of OpenAI’s o3 reasoning mannequin that may execute a number of improvement duties in parallel.

Beginning in the present day it will likely be accessible for ChatGPT Professional, Enterprise, and Group customers, with assist for Plus and Edu customers anticipated quickly.

Codex’s evolution: from mannequin to autonomous AI coding agent

This launch marks a major step ahead in Codex’s improvement. The authentic Codex debuted in 2021 as a mannequin for translating pure language into code accessible by OpenAI’s nascent software programming interface.

It was the engine behind GitHub Copilot, the favored autocomplete-style coding assistant designed to work inside IDEs like Visible Studio Code.

That preliminary iteration targeted on code era and completion, skilled on billions of strains of public supply code.

Nonetheless, the early model got here with limitations. It was susceptible to syntactic errors, insecure code options, and biases embedded in its coaching knowledge. Codex often proposed superficially appropriate code that failed functionally, and in some instances, made problematic associations based mostly on prompts.

Regardless of these flaws, it confirmed sufficient promise to determine AI coding instruments as a quickly rising product class. That authentic mannequin has since been deprecated and was the identify of a brand new suite of merchandise, in accordance with an OpenAI spokesperson.

GitHub Copilot formally transitioned off OpenAI’s Codex mannequin in March 2023, adopting GPT-4 as a part of its Copilot X improve to allow deeper IDE integration, chat capabilities, and extra context-aware code options.

Agentic visions

The brand new Codex goes far past its predecessor. Now constructed to behave autonomously over longer durations, Codex can write options, repair bugs, reply codebase-specific questions, run exams, and suggest pull requests—every process operating in a safe, remoted cloud sandbox.

The design displays OpenAI’s broader ambition to maneuver past fast solutions and into collaborative work.

Josh Tobin, who leads the Brokers Analysis Group at OpenAI, mentioned throughout a latest briefing: “We consider brokers as AI methods that may function in your behalf for an extended time period to perform large chunks of labor by interacting with the true world.” Codex matches squarely into this definition. “Our imaginative and prescient is that ChatGPT will change into virtually like a digital coworker—not simply answering fast questions, however collaborating on substantial work throughout a spread of duties,” he added.

Figures launched by OpenAI present that the brand new Codex-1 SWE agent outperforms all of OpenAI’s newest reasoning fashions on inside SWE duties.

New capabilities, new interface, new workflows

Codex duties are initiated by a sidebar interface in ChatGPT, permitting customers to immediate the agent with duties or questions.

The agent processes every request in an air-gapped surroundings loaded with the consumer’s repository and configured to reflect the event setup. It logs its actions, cites check outputs, and summarizes modifications—making its work traceable and reviewable.

Alexander Embiricos, head of OpenAI’s Desktop & Brokers crew (and the previous CEO and co-founder of screenshare collaboration startup Multi that OpenAI acquired for an undisclosed sum final 12 months) mentioned in a briefing with journalists that “the Codex agent is a cloud-based software program engineering agent that may work on many duties in parallel, with its personal pc to run safely and independently.”

Internally, he mentioned, engineers already use it “like a morning to-do record—hearth off duties to Codex and return to a batch of draft options able to evaluate or merge.”

Codex additionally helps configuration by AGENTS.md recordsdata—project-level guides that train the agent tips on how to navigate a codebase, run particular exams, and comply with home coding kinds.

“We skilled our mannequin to learn code and infer model—like whether or not or to not use an Oxford comma—as a result of code model issues as a lot as correctness,” Embiricos mentioned.

Safety and sensible use

Codex executes duties with out web entry, drawing solely on user-provided code and dependencies. This design ensures safe operation and minimizes potential misuse.

“That is greater than only a mannequin API,” mentioned Embiricos. “As a result of it runs in an air-gapped surroundings with human evaluate, we can provide the mannequin much more freedom safely.”

OpenAI additionally studies early exterior use instances. Cisco is evaluating Codex for accelerating engineering work throughout its product strains. Temporal makes use of it to run background duties like debugging and check writing. Superhuman leverages Codex to enhance check protection and allow non-engineers to recommend light-weight code modifications. Kodiak, an autonomous automobile agency, applies it to enhance code reliability and acquire insights into unfamiliar stack parts.

OpenAI can be rolling out updates to Codex CLI, its light-weight terminal agent for native improvement. The CLI now makes use of a smaller mannequin—codex-mini-latest—optimized for low-latency modifying and Q&A.

The pricing is about at $1.50 per million enter tokens and $6 per million output tokens, with a 75% caching low cost. Codex is at present free to make use of through the rollout interval, with charge limits and on-demand pricing choices deliberate.

Does this imply OpenAI IS NOT shopping for Windsurf? Pondering face emoji

The discharge of Codex comes amid elevated competitors within the AI coding instruments area—and alerts that OpenAI is intent on constructing, reasonably than shopping for, its subsequent part of merchandise.

In accordance with latest knowledge from SimilarWeb, visitors to developer-focused AI instruments has surged by 75% over the previous 12 weeks, underscoring the rising demand for coding assistants as important infrastructure reasonably than experimental add-ons.

Stories from TechCrunch and Bloomberg recommend OpenAI held acquisition talks with fast-growing AI dev instrument startups Cursor and Windsurf. Cursor allegedly walked away from the desk; Windsurf reportedly agreed in precept to be acquired by OpenAI for a worth of $3 billion, although no deal has been formally confirmed by both OpenAI or Windsurf.

Simply yesterday, the truth is, Windsurf debuted its circle of relatives of coding-focused basis fashions, SWE-1, purpose-built to assist the complete software program engineering lifecycle, from debugging to long-running mission upkeep. SWE-1 fashions had been reported customized made, skilled solely in-house utilizing a brand new sequential knowledge mannequin tailor-made to real-world improvement workflows.

Many issues could also be occurring behind the scenes between the 2 corporations, however to me, the timing of Windsurf launching its personal coding basis mannequin — as a substitute of its technique to-date of utilizing Llama variants and giving customers the choice to fit in OpenAI and Anthropic fashions — adopted someday later by OpenAI releasing its personal Windsurf competitor, appears to recommend the 2 usually are not aligning quickly.

However however, the truth that this new Codex AI SWE agent is in “analysis preview” to begin could also be a type of OpenAI pressuring Windsurf or Cursor or anybody else to return to the bargaining desk and strike a deal. Requested concerning the potential for a Windsurf acquisition and studies of 1 thereof, an OpenAI spokesperson informed VentureBeat they’d nothing to share on that entrance.

In both case, Embiricos frames Codex as way over a mere code instrument or assistant.

“We’re about to endure a seismic shift in how builders work with brokers—not simply pairing with them in actual time, however absolutely delegating duties,” he mentioned. “The primary experiments had been simply reasoning fashions with terminal entry. The expertise was magical—they began doing issues for us.”

Constructed for dev groups, not merely solo devs

Codex is designed with skilled builders in thoughts, however Embiricos famous that even product managers have discovered it useful for suggesting or validating modifications earlier than pulling in human SWEs. This versatility displays OpenAI’s technique of constructing instruments that increase productiveness throughout technical groups.

Trini, an engineering lead on the mission, summarized the broader ambition behind Codex: “This can be a transformative change in how software program engineers interface with AI and computer systems on the whole. It amplifies every particular person’s potential.”

OpenAI envisions Codex because the centerpiece of a brand new improvement workflow the place engineers assign high-level duties to brokers and collaborate with them asynchronously. The corporate is constructing towards deeper integrations throughout GitHub, ChatGPT Desktop, difficulty trackers, and CI methods. The long-term purpose is to mix real-time pairing and long-horizon process delegation right into a seamless improvement expertise.

As Josh Tobin put it, “Coding underpins so many helpful issues throughout the economic system. Accelerating coding is a very high-leverage solution to distribute the advantages of AI to humanity, together with ourselves.”

Whether or not or not OpenAI closes offers for opponents, the message is obvious: Codex is right here, and OpenAI is betting by itself brokers to steer the following chapter in developer productiveness.

Every day insights on enterprise use instances with VB Every day

If you wish to impress your boss, VB Every day has you lined. We provide the inside scoop on what corporations are doing with generative AI, from regulatory shifts to sensible deployments, so you possibly can share insights for optimum ROI.

Learn our Privateness Coverage

Thanks for subscribing. Try extra VB newsletters right here.

An error occured.