Why humans are AI’s biggest bottleneck (and what’s coming in 2026) | Alexander Embiricos (OpenAI Codex Product Lead)

Dec 14, 2025
Overview

Alexander Embiricos, Product Lead for OpenAI's Codex, discusses building product at OpenAI, Codex's explosive growth (20x), how it accelerated the Sora app's 18-day build, and the vision for AI agents as proactive, code-writing teammates.

At a Glance
28 insights · 1h 25m duration · 16 topics · 6 concepts

Deep Dive Analysis

OpenAI's Speed and Ambition in Product Development

Introduction to Codex: OpenAI's Coding Agent

Factors Driving Codex's Explosive Growth

Vision for AI as a Proactive Software Engineering Teammate

Coding as a Core Competency for Any AI Agent

Impact of AI on the Engineering Field

How Codex Accelerates Product Managers and Designers

Building the Sora Android App with Codex

Developing the Atlas Browser with AI Acceleration

Measuring Progress and User Feedback for Codex

Rationale for OpenAI Building a Web Browser

Non-Engineering and Coding-Adjacent Use Cases for Codex

Optimal Approach for Trying Codex's Capabilities

Essential Skills for the AI Age

AGI Timelines and Productivity Bottlenecks

Career Opportunities within the Codex Team

Codex

OpenAI's coding agent, available as an IDE extension or terminal tool, that pairs with developers to answer questions about code and to write and execute code within the software development lifecycle. It is envisioned as a software engineering teammate that participates across the entire development process, not just code writing.

Proactivity in AI Agents

The goal for AI agents to be helpful by default, anticipating user needs and taking action without explicit prompting. This contrasts with current AI products that require users to constantly think about when and how to invoke AI, aiming for a 'super assistant' that just knows how to be helpful.

Coding Agent as Foundation

The idea that for AI models to effectively 'do stuff' and use computers, writing code is the best method. Therefore, building any general AI agent might inherently involve building a coding agent, even if the end-user isn't aware of the underlying code generation, making coding a core competency for any agent.

Compaction

A feature in Codex that allows models to work continuously for long periods (e.g., 24 hours) by managing context windows. It requires a model that understands when to prepare for a new context, an API layer to handle this, and a harness to prepare the payload, enabling extended, uninterrupted operation.
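Codex's actual compaction internals are not public, but the mechanism described above can be illustrated with a toy sketch: track how much of the context window is used, and when it overflows, replace the full history with a model-written summary so work can continue in a fresh context. All names here (`count_tokens`, `summarize`, `Agent`) are hypothetical stand-ins, not the real API.

```python
from dataclasses import dataclass, field

CONTEXT_LIMIT = 100  # tokens; real context windows are far larger


def count_tokens(text: str) -> int:
    # Crude stand-in: one token per whitespace-separated word.
    return len(text.split())


def summarize(messages: list[str]) -> str:
    # Stand-in for asking the model to compact its own history.
    return "SUMMARY: " + " | ".join(m[:20] for m in messages)


@dataclass
class Agent:
    messages: list[str] = field(default_factory=list)
    compactions: int = 0

    def add(self, msg: str) -> None:
        self.messages.append(msg)
        used = sum(count_tokens(m) for m in self.messages)
        if used > CONTEXT_LIMIT:
            # Compaction: collapse the full history into a short summary
            # so the agent keeps working in a fresh context window.
            self.messages = [summarize(self.messages)]
            self.compactions += 1


agent = Agent()
for step in range(50):
    agent.add(f"step {step}: ran tests and inspected module {step % 7}")
print(agent.compactions, len(agent.messages))
```

The interview's framing maps onto the three pieces: the reasoning model decides what belongs in the summary, the API layer exposes the compaction call, and the harness (here, `Agent.add`) decides when to trigger it.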

Chatter Driven Development

A concept where code gets written and deployed as a result of ongoing conversations and signals within team communication tools or social media, rather than formal specifications. It implies a highly self-driven team where tasks emerge and are addressed fluidly, reducing the need for explicit spec writing.

Contextual Assistant

An AI assistant that understands what a user is attempting to do based on their current environment and actions, allowing it to provide maximally relevant and timely help without requiring explicit context provision from the user. This approach aims to keep users in flow and enable agents to take action on many more things.

What is Codex and how does it help engineers?

Codex is OpenAI's coding agent, available as an IDE extension or terminal tool, that helps engineers answer questions about code and write and execute code. It's envisioned as a software engineering teammate that assists across the entire development lifecycle.

Why is OpenAI able to move so quickly in AI product development?

OpenAI's speed is attributed to the transformative technology itself, a 'ready, fire, aim' approach to product development, and an incredibly bottoms-up organizational structure that empowers highly driven and autonomous individuals.

What was the key factor that unlocked Codex's explosive growth?

The key unlock was shifting from an initial 'too far in the future' cloud-based agent to integrating Codex directly into engineers' existing workflows via IDE extensions and CLI tools, making it more intuitive and trivial to get immediate value.

How does OpenAI ensure the safety and security of Codex when it uses the shell?

Codex operates within a sandbox environment, which allows it to use the shell for commands safely and securely. If a command doesn't work in the sandbox, it can ask the user for guidance.
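Codex's real sandbox relies on OS-level isolation, which isn't reproduced here, but the fail-safe behavior described above (run what's permitted, surface what isn't so the user can decide) can be sketched with a simple command allowlist. The allowlist and function name are hypothetical, purely for illustration.

```python
import shlex
import subprocess

# Hypothetical allowlist; a real sandbox uses OS-level isolation,
# not command-name filtering.
ALLOWED = {"ls", "cat", "echo", "grep"}


def run_sandboxed(command: str) -> str:
    argv = shlex.split(command)
    if not argv or argv[0] not in ALLOWED:
        # Surface the failure so the agent can ask the user for guidance
        # instead of silently executing something risky.
        return f"blocked: '{argv[0] if argv else ''}' is not permitted in the sandbox"
    result = subprocess.run(argv, capture_output=True, text=True, timeout=10)
    return result.stdout or result.stderr


print(run_sandboxed("echo hello"))
print(run_sandboxed("rm -rf /"))
```

The second call never reaches the shell; the agent gets back a structured refusal it can relay to the user, mirroring the "ask for guidance" loop described in the answer.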

How does Codex handle tasks that exceed its context window?

Codex uses a feature called 'compaction,' which involves a smart reasoning model, an API layer, and a harness to manage and prepare the model for new context windows, allowing it to work continuously for extended periods.

What is the vision for AI agents beyond just coding?

The vision is for AI to become a proactive 'super assistant' or 'teammate' that can 'do things' by using computers, primarily by writing code. This agent would eventually handle tasks beyond coding, such as scheduling or responding to market signals, without constant explicit prompting.

What is the biggest bottleneck to achieving AGI-level productivity today?

A significant, underappreciated bottleneck is human typing speed and multitasking speed, specifically in writing prompts and manually validating AI-generated work. Unblocking these human-centric productivity loops is crucial for unlocking exponential AI progress.

How has Codex impacted the roles of Product Managers and Designers at OpenAI?

Codex empowers PMs and designers to be more technical and efficient, allowing them to answer questions, understand changes, and prototype faster. Designers, for example, can 'vibe code' prototypes directly into PRs, blurring traditional role boundaries.

What is the most effective way to measure progress for a tool like Codex?

Key metrics include early retention stats (e.g., D7 retention) to understand initial user adoption, and closely monitoring user feedback and 'vibes' on social media platforms like Reddit and Twitter for specific complaints and emergent behaviors.
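The D7 retention metric mentioned above is straightforward to compute: the share of a signup cohort that comes back exactly seven days after signing up. A minimal sketch with made-up cohort data (the user records here are purely illustrative):

```python
from datetime import date, timedelta

# Toy cohort: signup date and the set of dates each user was active.
users = {
    "a": (date(2025, 1, 1), {date(2025, 1, 1), date(2025, 1, 8)}),
    "b": (date(2025, 1, 1), {date(2025, 1, 1), date(2025, 1, 2)}),
    "c": (date(2025, 1, 1), {date(2025, 1, 1), date(2025, 1, 8)}),
}


def d7_retention(users) -> float:
    # A user counts as D7-retained if active exactly 7 days after signup.
    retained = sum(
        1 for signup, active in users.values()
        if signup + timedelta(days=7) in active
    )
    return retained / len(users)


print(d7_retention(users))  # 2 of 3 users returned on day 7
```

Teams vary in whether "D7" means activity on exactly day 7 or any time through day 7; the strict same-day definition is used here.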

Why did OpenAI decide to build a web browser (Atlas)?

Building a browser allows OpenAI to create a contextual assistant that understands user intent in a first-class way, directly within the rendering engine of the web. This enables surfacing contextual actions at the moment they are helpful, providing users with clear control over what the AI can assist with.

What skills should individuals lean into for a career in the AI age?

Individuals should focus on being 'doers' who leverage the latest AI tools to maximize productivity; develop strong systems engineering, communication, and collaboration skills; and pursue knowledge at the frontier of specific domains.

How far away is AGI (Artificial General Intelligence) according to Alexander Embiricos?

Alexander believes that starting next year, early adopters (like startups) will begin to see hockey-stick growth in productivity due to more self-sufficient agents. Over subsequent years, larger companies will follow, and when this hockey-sticking flows back into AI labs, that's when AGI will be reached.

1. Unblock AI Productivity Loops

Rebuild systems to reduce human reliance on constant prompting and manual validation, allowing AI agents to be “default useful” and unlock significant productivity gains, which is currently an underappreciated limiting factor for AGI.

2. Maximize Human Acceleration

When building tools, focus on how they maximally accelerate people rather than making human tasks unclear, to ensure users feel empowered and productive.

3. Deeply Understand Customer Problems

Prioritize developing a deep, meaningful understanding of specific customer problems, as this is the most critical competency for building successful products, especially with AI tools.

4. Build Coding Agents

To enable AI models to “do stuff” effectively, build them as coding agents, as writing code is the best way for models to use computers.

5. Integrate AI for Proactivity

Aim to integrate AI tools seamlessly into existing workflows so they proactively assist without constant prompting, acting as a “teammate” that is helpful by default.

6. Be Humble, Learn Empirically

In rapidly evolving fields, prioritize humility, empirical learning, and quick experimentation over rigid planning, as capabilities and user adoption are unpredictable.

7. Ruthlessly Prioritize Impact

Given the high potential impact of work at companies like OpenAI, be ruthless in prioritizing how you spend your time to ensure you are focusing on the most impactful work.

8. Prioritize Intuitive User Onboarding

For new technologies, ensure the initial user experience is intuitive and provides trivial immediate value, even if the long-term vision is more complex, to achieve broad user adoption.

9. Configure Agents Collaboratively

Work side-by-side with AI agents to configure their environment and provide necessary access (e.g., passwords, permissions), enabling them to perform tasks autonomously for extended periods.

10. Balance Dogfooding with Market Needs

While internal dogfooding provides valuable signal, remain cognizant that internal users (e.g., AI experts) may differ from the general market, requiring adjustments for broader product adoption.

11. Plan with AI for Long Tasks

For long or complex tasks, collaborate with AI to first create a detailed plan (e.g., in a markdown file) with verifiable steps, then delegate the execution to the AI, which helps it work for much longer.

12. Improve Agent Self-Validation

Focus on making AI agents better at validating their own work, reducing the human burden of verification and increasing trust in AI-generated output.

13. Enhance AI Code Review

Develop features that specifically aid in reviewing AI-generated code to build human confidence and make the less enjoyable task of code review more efficient.

14. Prioritize Visual Previews in AI Tools

When an AI agent performs visual work (e.g., UI changes), prioritize showing an image preview before the code diff to empower the human and accelerate the review process.

15. Value Execution Over Ideas

Recognize that while AI accelerates building, strong execution remains crucial for success, meaning ideas alone are not as valuable as effective implementation.

16. Focus on Vertical AI

Consider investing in or building vertical AI startups that deeply understand and solve specific problems for a niche customer base, as this approach is currently promising.

17. Monitor Early Retention & User Experience

Regularly check early retention metrics (e.g., D7 retention) and experience the product as a new user by signing up from scratch to understand initial adoption and identify pain points.

18. Monitor Social Media for Real Feedback

Actively monitor social media (e.g., Reddit for real, often negative but valuable feedback; Twitter/X for hype) to gauge user sentiment and identify specific issues that need improvement.

19. Provide Clear AI Control & Boundaries

Offer users clear control and boundaries for AI interaction, such as choosing to use an “AI browser” for AI assistance versus a regular browser for privacy or non-AI tasks, to build trust.

20. Test AI with Hardest Tasks

When evaluating a professional AI coding tool like Codex, give it your most challenging, real-world problems rather than trivial ones to truly assess its capabilities.

21. Apply AI to Real, Complex Problems

Use AI tools on genuine, complex problems like hard-to-diagnose bugs or implementing fixes, rather than simplifying tasks, to leverage their full potential.

22. Build Trust with AI Teammates

Approach AI tools like a new teammate: start by helping them understand the codebase, align on a plan, and then delegate tasks incrementally to build trust and learn effective prompting.

23. Be a Doer, Not Just a Learner

Focus on actively “doing things” and building, leveraging AI tools to increase productivity, rather than solely fulfilling academic assignments, especially for early-career individuals.

24. Master Systems & Collaboration Skills

Develop strong systems engineering, communication, and collaboration skills, as these remain crucial for building effective software systems and teams, even with advanced AI.

25. Advance Knowledge at the Frontier

Pursue knowledge at the frontier of a specific domain, as this area is less accessible to current AI agents and forces you to leverage AI tools to accelerate your own workflow.

26. Automate Monitoring with AI

Use AI agents to continuously monitor critical metrics (e.g., training run graphs) on a loop, enabling proactive identification and potential resolution of issues, such as being “on call for its own training.”
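The "monitor on a loop" pattern above reduces to a simple poll-check-alert cycle. A generic sketch, where `fetch`, `is_anomalous`, and `alert` are caller-supplied callables (hypothetical names; this is not Codex's API):

```python
import time


def watch_metric(fetch, is_anomalous, alert, interval_s=60.0, max_checks=None):
    """Poll a metric on a loop and fire an alert when it looks anomalous."""
    checks = 0
    while True:
        value = fetch()
        if is_anomalous(value):
            # In the interview's framing, this is where an agent would
            # investigate or page a human ("on call for its own training").
            alert(value)
        checks += 1
        if max_checks is not None and checks >= max_checks:
            break
        time.sleep(interval_s)


# Example: a training-loss feed where the last reading spikes.
readings = iter([0.9, 0.8, 4.2])
alerts = []
watch_metric(lambda: next(readings), lambda v: v > 2.0, alerts.append,
             interval_s=0.0, max_checks=3)
print(alerts)  # the spiked reading is flagged
```

In practice the alert step is where an agent adds value over a plain dashboard: instead of just paging, it can immediately start diagnosing the anomaly.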

27. Adopt Bottoms-Up Approach

Implement a truly bottoms-up organizational structure to foster rapid experimentation and leverage individual drive, especially in fast-moving tech environments, though it requires high talent caliber.

28. Be Kind and Candid

Practice kindness alongside candor in communication and leadership, recognizing that true kindness sometimes requires difficult but honest conversations.

It turns out the best way for models to use computers is simply to write code. And so we're kind of getting to this idea where if you want to build any agent, maybe you should be building a coding agent.

Alexander Embiricos

The current underappreciated limiting factor is literally human typing speed or human multitasking speed.

Alexander Embiricos

I think auto-completion in IDEs is one of the most successful AI products today, and part of what's so magical about it is that it can surface ideas for helping you really rapidly. When it's right, you're accelerated; when it's wrong, it's not that annoying.

Alexander Embiricos

What we can do as a product team building in this space is always think about how we're building a tool so that it feels like we're maximally accelerating people, rather than building a tool that makes it more unclear what you should do as the human.

Alexander Embiricos

The best way to try Codex is to give it your hardest tasks.

Alexander Embiricos

If you don't believe it, you can't will it into existence, so you need a balance.

Alexander Embiricos

I still think execution is really hard, right? You can build something fast, but you still need to execute well on it. It still needs to make sense and be a coherent thing overall.

Alexander Embiricos

How to run Codex on a really long task (Plan-driven Development)

Alexander Embiricos
  1. Collaborate with Codex to write a plan in a markdown file (e.g., plan.md).
  2. Ensure the plan has verifiable steps.
  3. Once satisfied with the plan, ask Codex to go off and do the work.

How to get started with Codex (Building Trust with a Teammate)

Alexander Embiricos
  1. Try a few different tasks in parallel.
  2. Ask Codex to understand the codebase.
  3. Formulate a plan with Codex around an idea you have.
  4. Build your way up from there, giving it tasks bit by bit.

How to configure a coding agent to be effective (Human-in-the-loop configuration)

Alexander Embiricos
  1. Identify a specific problem or area where the agent is not verifying its work effectively (e.g., verification on the Atlas project).
  2. Prompt the agent (e.g., Codex) with a command like "Hey, why can't you verify your work? Fix it."
  3. Repeat this process in a loop, with human guidance, until the agent is configured to verify its own work.
18 days: Sora Android app build time (internal launch). Time from zero to launch to employees.
28 days total: Sora Android app build time (public launch). Time from zero to public launch (18 days to internal, 10 more to public).
two or three: Sora Android app engineers. Number of engineers who built the Sora Android app.
20x: Codex model growth since August. Growth in scale since the launch of GPT-5.
many trillions: Codex tokens served weekly. Number of tokens served by Codex models per week.
roughly 30% faster: GPT-5.1 Codex Max speed improvement. Speed increase for accomplishing tasks compared to previous versions.
one engineer, one week: Atlas project acceleration. Equivalent to what previously took "two to three weeks for two to three engineers" for a similar task.
thousands of times per day: AI product usage frequency (potential). Potential benefit from an intelligent entity, compared to "tens of times" for average users prompting AI today.
200 bucks a month: OpenAI's ChatGPT Pro accounts. Cost of Alexander's multiple ChatGPT Pro accounts for dogfooding.