Be part of the occasion trusted by enterprise leaders for practically 20 years. VB Rework brings collectively the folks constructing actual enterprise AI technique. Study extra
At VentureBeat’s Rework 2025 convention, Olivier Godement, Head of Product for OpenAI’s API platform, supplied a behind-the-scenes have a look at how enterprise groups are adopting and deploying AI brokers at scale.
In a 20-minute panel dialogue I hosted completely with Godement, the previous Stripe researcher and present OpenAI API boss unpacked OpenAI’s newest developer instruments—the Responses API and Brokers SDK—whereas highlighting real-world patterns, safety issues, and cost-return examples from early adopters like Stripe and Field.
For enterprise leaders unable to attend the session dwell, listed below are prime 8 most essential takeaways:
Brokers Are Quickly Transferring From Prototype to Manufacturing
In accordance with Godement, 2025 marks an actual shift in how AI is being deployed at scale. With over one million month-to-month energetic builders now utilizing OpenAI’s API platform globally, and token utilization up 700% 12 months over 12 months, AI is transferring past experimentation.
“It’s been 5 years since we launched primarily GPT-3… and man, the previous 5 years has been fairly wild.”
Godement emphasised that present demand isn’t nearly chatbots anymore. “AI use instances are transferring from easy Q&A to really use instances the place the applying, the agent, can do stuff for you.”
This shift prompted OpenAI to launch two main developer-facing instruments in March: the Responses API and the Brokers SDK.
When to Use Single Brokers vs. Sub-Agent Architectures
A significant theme was architectural alternative. Godement famous that single-agent loops, which encapsulate full device entry and context in a single mannequin, are conceptually elegant however usually impractical at scale.
“Constructing correct and dependable single brokers is tough. Like, it’s actually onerous.”
As complexity will increase—extra instruments, extra potential consumer inputs, extra logic—groups usually transfer towards modular architectures with specialised sub-agents.
“A follow which has emerged is to primarily break down the brokers into a number of sub-agents… You’d do separation of considerations like in software program.”
These sub-agents perform like roles in a small group: a triage agent classifies intent, tier-one brokers deal with routine points, and others escalate or resolve edge instances.
Why the Responses API Is a Step Change
Godement positioned the Responses API as a foundational evolution in developer tooling. Beforehand, builders manually orchestrated sequences of mannequin calls. Now, that orchestration is dealt with internally.
“The Responses API might be the largest new layer of abstraction we launched since just about GPT-3.”
It permits builders to specific intent, not simply configure mannequin flows. “You care about returning a very good response to the shopper… the Response API primarily handles that loop.”
It additionally contains built-in capabilities for data retrieval, internet search, and performance calling—instruments that enterprises want for real-world agent workflows.
Observability and Safety Are Constructed In
Safety and compliance had been prime of thoughts. Godement cited key guardrails that make OpenAI’s stack viable for regulated sectors like finance and healthcare:
- Coverage-based refusals
- SOC-2 logging
- Knowledge residency help
Analysis is the place Godement sees the largest hole between demo and manufacturing.
“My sizzling take is that mannequin analysis might be the largest bottleneck to huge AI adoption.”
OpenAI now contains tracing and eval instruments with the API stack to assist groups outline what success seems to be like and monitor how brokers carry out over time.
“Except you spend money on analysis… it’s actually onerous to construct that belief, that confidence that the mannequin is being correct, dependable.”
Early ROI Is Seen in Particular Features
Some enterprise use instances are already delivering measurable features. Godement shared examples from:
- Stripe, which makes use of brokers to speed up bill dealing with, reporting “35% quicker bill decision”
- Field, which launched data assistants that allow “zero-touch ticket triage”
Different high-value use instances embody buyer help (together with voice), inside governance, and data assistants for navigating dense documentation.
What It Takes to Launch in Manufacturing
Godement emphasised the human think about profitable deployments.
“There’s a small fraction of very high-end individuals who, each time they see an issue and see a know-how, they run at it.”
These inside champions don’t all the time come from engineering. What unites them is persistence.
“Their first response is, OK, how can I make it work?”
OpenAI sees many preliminary deployments pushed by this group — individuals who pushed early ChatGPT use within the enterprise and at the moment are experimenting with full agent methods.
He additionally identified a spot many overlook: area experience. “The data in an enterprise… doesn’t lie with engineers. It lies with the ops groups.”
Making agent-building instruments accessible to non-developers is a problem OpenAI goals to deal with.
What’s Subsequent for Enterprise Brokers
Godement provided a glimpse into the roadmap. OpenAI is actively engaged on:
- Multimodal brokers that may work together by way of textual content, voice, photographs, and structured knowledge
- Lengthy-term reminiscence for retaining data throughout classes
- Cross-cloud orchestration to help complicated, distributed IT environments
These aren’t radical adjustments, however iterative layers that develop what’s already potential. “As soon as we’ve got fashions that may suppose not just for a couple of seconds however for minutes, for hours… that’s going to allow some fairly mind-blowing use instances.”
Closing Phrase: Reasoning Fashions Are Underhyped
Godement closed the session by reaffirming his perception that reasoning-capable fashions—these that may mirror earlier than responding—would be the true enablers of long-term transformation.
“I nonetheless have conviction that we’re just about on the GPT-2 or GPT-3 degree of maturity of these fashions….We’re nonetheless scratching the floor on what reasoning fashions can do.”
For enterprise resolution makers, the message is obvious: the infrastructure for agentic automation is right here. What issues now could be constructing a targeted use case, empowering cross-functional groups, and being able to iterate. The following section of worth creation lies not in novel demos—however in sturdy methods, formed by real-world wants and the operational self-discipline to make them dependable.