
OpenAI DevDay 2025 Introduces GPT-5 Pro API, Agent Kit, and More

Oct 10, 2025 4 min read


On October 6, 2025, OpenAI hosted DevDay 2025, where the company introduced AgentKit, a new toolkit for building agents, and shipped fresh model options in the API: GPT-5 Pro and Sora 2. The theme of the day was turning the chatbot into a place where software runs, collaborates, and sells inside the conversation itself.

The biggest shift was “apps inside ChatGPT.” With a preview of the Apps SDK, third-party software can render interactive UI directly in the chat and share context through the Model Context Protocol. In demos, a user designed a poster with Canva, then pivoted into a Zillow map without leaving the thread. OpenAI said an app directory and a submission review process are coming, with monetization guidance to follow later this year.
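At the protocol level, MCP tools are advertised to a host through JSON descriptors that pair a name and description with a JSON Schema for the tool's input. The sketch below shows the general shape of such a descriptor and a minimal argument check; the tool name and fields are hypothetical, not from the Apps SDK itself.

```python
# Hypothetical MCP-style tool descriptor: a name, a human-readable
# description, and a JSON Schema for the tool's input, following the
# Model Context Protocol's tool-listing shape.
poster_tool = {
    "name": "create_poster",
    "description": "Render a poster design inside the chat surface.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "size": {"type": "string", "enum": ["A4", "A3"]},
        },
        "required": ["title"],
    },
}

def validate_call(tool: dict, args: dict) -> bool:
    """Small sanity check: required fields present, no unknown keys.
    A real host would run full JSON Schema validation instead."""
    schema = tool["inputSchema"]
    props = schema.get("properties", {})
    if any(k not in props for k in args):
        return False
    return all(r in args for r in schema.get("required", []))

print(validate_call(poster_tool, {"title": "DevDay"}))  # True
print(validate_call(poster_tool, {"size": "A4"}))       # False: missing title
```

The descriptor is what lets a host like ChatGPT discover and render the tool; the same schema doubles as the contract for input validation.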

The Apps SDK doesn’t yet plug into the community-driven MCP-UI project that adds declarative UI components to MCP servers. But both sides have publicly signaled plans to collaborate post-launch, and OpenAI says it intends to “publish an open implementation of the host environment, so anyone can become an apps host,” hinting at a broader ecosystem where multiple MCP clients, not just ChatGPT, can run these apps.

Developer reactions at DevDay were mixed, with some noting the event felt like "doubling down on existing opportunities" rather than pushing frontiers. "DevDay was a little boring for developers—but extremely exciting if you’re an AI ops pro," Dan Shipper wrote. OpenAI is betting on ecosystem integration and cloud-based autonomous agents, while Anthropic maintains its edge in code quality and terminal-native workflows. Anthropic still commands 32% of enterprise LLM usage versus OpenAI's 25%, and Claude dominates coding applications with 42% market share compared to OpenAI's 21%.

OpenAI stacked the launch with recognizable partners. Booking.com, Canva, Coursera, Expedia, Figma, Spotify, and Zillow all showed live experiences running in chat. Financial press noted minimal stock reactions overall, though some partners saw pops around the keynote. The broader takeaway is that OpenAI is courting mainstream services to anchor a credible software surface in ChatGPT, while the market waits to see if this becomes habitual end-user behavior.

On the “do work” axis, OpenAI formalized agents. AgentKit includes a visual Agent Builder, a Connector Registry for governing data sources, ChatKit for embeddable agent UIs, and integrated evaluation and tracing, enabling teams to observe and improve workflows. This is the plumbing most orgs end up writing themselves: orchestration, guardrails, metrics, and versioning. OpenAI also highlighted reinforcement fine-tuning that is GA on o4-mini and in private beta on GPT-5.

For teams that need portability, the company emphasized a self-hosted route. The open-source Agents SDK in Python and TypeScript exposes a small set of primitives for agents, handoffs, guardrails, and sessions, with built-in tracing. That makes it plausible to design flows in OpenAI’s UI and operate them on your own stack with the same semantics and observability, which is often the compliance requirement in regulated environments.
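The primitives the Agents SDK exposes (agents, handoffs, guardrails, sessions with tracing) can be approximated in a few lines of plain Python. The sketch below illustrates the shape of those primitives only; it is not the SDK's actual API, and the model call is stubbed so it runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    name: str
    instructions: str
    # A guardrail inspects input and may veto the run.
    guardrails: list = field(default_factory=list)
    # Handoff targets this agent may delegate to, keyed by topic keyword.
    handoffs: dict = field(default_factory=dict)

def run(agent: Agent, user_input: str, trace: list) -> str:
    """Run one agent turn, appending every step to a trace list so
    behavior can be observed and scored later."""
    trace.append(f"{agent.name}: received {user_input!r}")
    if not all(g(user_input) for g in agent.guardrails):
        trace.append(f"{agent.name}: guardrail blocked input")
        return "blocked"
    for topic, target in agent.handoffs.items():
        if topic in user_input.lower():
            trace.append(f"{agent.name}: handoff -> {target.name}")
            return run(target, user_input, trace)
    # Stubbed model call; a real agent would invoke an LLM here.
    return f"{agent.name} handled: {user_input}"

billing = Agent("billing", "Handle billing questions.")
triage = Agent(
    "triage",
    "Route requests to the right specialist.",
    guardrails=[lambda s: "password" not in s.lower()],
    handoffs={"invoice": billing},
)

trace = []
print(run(triage, "Where is my invoice?", trace))  # billing handled: ...
```

The point of the exercise is that orchestration, guardrails, and tracing are a small, portable core; the SDK's value is shipping that core with consistent semantics across hosted and self-hosted deployments.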

I actually think agent evals is still a work in progress… [we’re] allowing you to use [traces] in the evals product and be able to grade it… our roadmap… is to… break down the different parts of the trace and allow you to eval… and optimize each of those. - Sherwin Wu

Codex moved to general availability. The keynote narrative positioned it less as code-completion and more as a building block for longer-running “do it for me” workflows that can reason across project context, call tools, and return working artifacts. Model options expanded at the top and the edges.

GPT-5 Pro arrived in the API, positioned alongside cheaper real-time models aimed at voice and latency-sensitive use cases. Coverage praised better reasoning and accuracy relative to earlier releases, but communities also flagged uneven latency and consistency. If you are building agents that replace whole task pipelines, not just assistants, those latency tails matter more than median performance and will drive how you route between high-effort and fast-path models.
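One way to act on that is to track tail latency per model explicitly and route on task difficulty. The sketch below does this with hypothetical model names and a p95 window; thresholds and escalation policy would be yours to tune.

```python
import math
from collections import defaultdict, deque

class LatencyRouter:
    """Route requests to a fast model by default, escalating to the
    high-effort model only when the task calls for it, while tracking
    p95 latency per model rather than the median."""

    def __init__(self, window: int = 100):
        self.samples = defaultdict(lambda: deque(maxlen=window))

    def record(self, model: str, latency_s: float) -> None:
        self.samples[model].append(latency_s)

    def p95(self, model: str) -> float:
        xs = sorted(self.samples[model])
        if not xs:
            return 0.0
        return xs[max(0, math.ceil(0.95 * len(xs)) - 1)]

    def choose(self, needs_deep_reasoning: bool) -> str:
        # Hypothetical model names, for illustration only.
        return "gpt-5-pro" if needs_deep_reasoning else "gpt-realtime"

router = LatencyRouter()
for t in (0.4, 0.5, 0.6, 3.2):  # one slow outlier dominates the tail
    router.record("gpt-5-pro", t)
print(router.choose(needs_deep_reasoning=True))  # gpt-5-pro
print(router.p95("gpt-5-pro"))                   # 3.2
```

The median of those samples is 0.55 s, but the p95 is 3.2 s; routing and timeout decisions built on the median would hide exactly the tails the communities flagged.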

OpenAI formalized the Sora app and opened the new model to developers, while the news cycle focused on copyright, deepfake concerns, and watermark handling. If you intend to pipe Sora 2 into commercial pipelines, plan for provenance, consent, and content review from the start rather than bolting them on later. Hardware remained in the rumor zone, with a fireside chat featuring Jony Ive teasing the philosophy behind new devices, while reports about the event suggested delays and unresolved design tradeoffs.

Altman and Ive… spoke in vague terms about the ‘family of devices’… finer details… remain under wraps. - Wired

MCP-style connectors expand the attack surface and raise exfiltration risk when agents move freely between tools. Researchers have already demonstrated connector exploits, while OpenAI’s guidance stresses least privilege, explicit consent, and defense in depth. Treat connectors as production integrations, keep audit trails, and assume prompt injection will reach your edges. The upside of the new tracing and evals is that you can see and score behavior rather than relying on anecdotes.
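In practice, "least privilege plus audit trails" means wrapping every connector behind an explicit action allow-list and logging each decision. The sketch below is illustrative plumbing under those assumptions, not OpenAI's Connector Registry API.

```python
import datetime

class Connector:
    """Wrap a tool connector with an explicit allow-list and an audit
    log, treating it like any other production integration."""

    def __init__(self, name, allowed_actions, handler):
        self.name = name
        self.allowed = set(allowed_actions)   # least privilege: opt-in actions
        self.handler = handler
        self.audit_log = []                   # every call, allowed or denied

    def call(self, action, **kwargs):
        ts = datetime.datetime.now(datetime.timezone.utc).isoformat()
        if action not in self.allowed:
            self.audit_log.append((ts, action, "DENIED"))
            raise PermissionError(f"{self.name}: {action!r} not permitted")
        self.audit_log.append((ts, action, "ALLOWED"))
        return self.handler(action, **kwargs)

# A CRM connector that may read contacts but nothing else.
crm = Connector("crm", {"read_contact"}, lambda a, **kw: {"id": kw.get("id")})
print(crm.call("read_contact", id=42))  # allowed and logged
try:
    crm.call("delete_contact", id=42)   # denied: not on the allow-list
except PermissionError as e:
    print(e)
```

Denying by default means a prompt-injected agent that tries an unexpected action fails loudly and leaves a record, rather than silently exfiltrating data.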

Sessions on interactive evaluations and ARC-AGI-3 emphasized action efficiency and first-run generalization over static paper benchmarks. For engineering managers, this means building eval sets that resemble your real workflows, grading full traces, and optimizing the slowest agents in the chain rather than the average.
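A minimal version of "grade the full trace" looks like the sketch below: score whether the run reached its goal and how many actions it spent relative to a known-good reference run. The metric definition is an assumption for illustration, not ARC-AGI-3's actual scoring.

```python
def grade_trace(trace, reference_steps):
    """Grade a full agent trace rather than a single answer: did the
    run reach the goal, and how action-efficient was it? Efficiency is
    reference length / actual length, capped at 1.0."""
    reached_goal = bool(trace) and trace[-1] == "goal"
    efficiency = min(1.0, len(reference_steps) / len(trace)) if trace else 0.0
    return {"success": reached_goal, "action_efficiency": round(efficiency, 2)}

reference = ["search", "open", "goal"]            # known-good run
wandering = ["search", "search", "open", "retry", "open", "goal"]
print(grade_trace(wandering, reference))
# {'success': True, 'action_efficiency': 0.5}
```

Grading whole traces this way surfaces the slow, wandering agents in a chain that a success-rate average would hide, which is the behavior the sessions argued for optimizing.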

Developers looking to learn more can start with the official DevDay recap and product posts. Then, they can work through the Apps SDK, AgentKit, and Agents SDK docs and session recordings referenced above to replicate the demos and adapt them to real workloads.

About the Author

Andrew Hoblitzell

