OpenAI's first economic paper on Codex: agents have eaten the company's chatbot use

A June 25 arXiv preprint and matching blog post put numbers on the chatbot-to-agent shift — and show OpenAI itself as the leading indicator.

As of June 11, Codex generated 99.8% of the output tokens produced by OpenAI’s own workers. That number, buried in an arXiv preprint published June 25 alongside a companion blog post, is the load-bearing fact of OpenAI’s first attempt at an economic study of agentic AI, and it carries an obvious tell: the company studying the chatbot-to-agent transition is also the company that has most thoroughly completed it internally.

The paper, “The Shift to Agentic AI: Evidence from Codex,” tracks a workforce that has effectively stopped using ChatGPT for production work. Weekly active Codex users at OpenAI grew more than fivefold between January 1 and June 1. By May, 80.6% of sampled individuals had filed at least one request estimated to exceed thirty minutes of human work, 70.2% had crossed one hour, and 25.6% had crossed eight hours. The heaviest users were orchestrating more than sixty hours of agent turns per day across parallel sessions by June.

Adoption escaped Engineering early. Legal, Finance, and Recruiting all crossed into majority Codex use around April. Measured against a November 2025 baseline, median Research use grew 56x, Customer Support 32x, Engineering 27x, and Legal 13x. Non-developer organizational users are up 189-fold since August 2025; non-developer individuals, 137-fold.

Outside OpenAI’s walls, the picture compresses. Codex accounts for 63.3% of output tokens among external organizational users and just 16.5% among individuals. Codex work is also far more tool-heavy: 60.3% of Codex turns invoked at least one external tool in the week to June 11, versus 21.9% of ChatGPT turns.

OpenAI chief economist Ronnie Chatterji called the work “really opening up for the first time an economic study of agents.” The Deep View noted the obvious: OpenAI benefits directly from Codex winning against Anthropic’s Claude Code, and Codex burns more tokens than ChatGPT. The paper’s authors concede the point, describing OpenAI as “an unusually favorable environment” rather than a representative one.

That is the structural read: the leading indicator and the vendor are the same entity.
