Why humans are AI’s biggest bottleneck (and what’s coming in 2026) | Alexander Embiricos (OpenAI Codex Product Lead)
Alexander Embiricos, Product Lead for OpenAI's Codex, discusses building product at OpenAI, Codex's explosive growth (20x), how it accelerated the Sora app's 18-day build, and the vision for AI agents as proactive, code-writing teammates.
Deep Dive Analysis
16-Topic Outline
OpenAI's Speed and Ambition in Product Development
Introduction to Codex: OpenAI's Coding Agent
Factors Driving Codex's Explosive Growth
Vision for AI as a Proactive Software Engineering Teammate
Coding as a Core Competency for Any AI Agent
Impact of AI on the Engineering Field
How Codex Accelerates Product Managers and Designers
Building the Sora Android App with Codex
Developing the Atlas Browser with AI Acceleration
Measuring Progress and User Feedback for Codex
Rationale for OpenAI Building a Web Browser
Non-Engineering and Coding-Adjacent Use Cases for Codex
Optimal Approach for Trying Codex's Capabilities
Essential Skills for the AI Age
AGI Timelines and Productivity Bottlenecks
Career Opportunities within the Codex Team
6 Key Concepts
Codex
OpenAI's coding agent, available as an IDE extension or terminal tool, that pairs with developers to answer questions about code and to write and run it within the software development lifecycle. It is envisioned as a software engineering teammate that participates across the entire development process, not just code writing.
Proactivity in AI Agents
The goal for AI agents to be helpful by default, anticipating user needs and taking action without explicit prompting. This contrasts with current AI products that require users to constantly think about when and how to invoke AI, aiming for a 'super assistant' that just knows how to be helpful.
Coding Agent as Foundation
The idea that for AI models to effectively 'do stuff' and use computers, writing code is the best method. Therefore, building any general AI agent might inherently involve building a coding agent, even if the end-user isn't aware of the underlying code generation, making coding a core competency for any agent.
Compaction
A feature in Codex that allows models to work continuously for long periods (e.g., 24 hours) by managing context windows. It requires a model that understands when to prepare for a new context, an API layer to handle this, and a harness to prepare the payload, enabling extended, uninterrupted operation.
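The compaction idea above can be sketched as a loop: when the transcript nears the context limit, the harness asks the model to distill its own state, then continues with that summary as the new context. This is a minimal illustration, not Codex's actual API; names like `ask_model`, the token threshold, and the `"DONE"` sentinel are all assumptions.

```python
# Hypothetical sketch of a compaction loop. `ask_model` stands in for
# whatever function sends messages to a model and returns its reply.

CONTEXT_LIMIT = 200_000  # assumed context size, in tokens
COMPACT_AT = 0.8         # compact when the window is ~80% full

def count_tokens(messages):
    # Crude stand-in for a real tokenizer: roughly 4 characters per token.
    return sum(len(m) for m in messages) // 4

def compact(messages, ask_model):
    # Ask the model to distill everything it needs to keep working.
    summary = ask_model(
        ["Summarize your progress, open questions, and next steps:"] + messages
    )
    return [summary]

def run_long_task(task, ask_model):
    messages = [task]
    while True:
        if count_tokens(messages) > COMPACT_AT * CONTEXT_LIMIT:
            messages = compact(messages, ask_model)
        reply = ask_model(messages)
        messages.append(reply)
        if reply == "DONE":
            return messages
```

The key design point from the episode is that this takes cooperation across three layers: a model smart enough to write a useful summary of its own state, an API that exposes the operation, and a harness that decides when to trigger it.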
Chatter Driven Development
A concept where code gets written and deployed as a result of ongoing conversations and signals within team communication tools or social media, rather than formal specifications. It implies a highly self-driven team where tasks emerge and are addressed fluidly, reducing the need for explicit spec writing.
Contextual Assistant
An AI assistant that understands what a user is attempting to do based on their current environment and actions, allowing it to provide maximally relevant and timely help without requiring explicit context provision from the user. This approach aims to keep users in flow and enable agents to take action on many more things.
12 Questions Answered
Codex is OpenAI's coding agent, available as an IDE extension or terminal tool, that helps engineers answer questions about a codebase and write and run code. It's envisioned as a software engineering teammate that assists across the entire development lifecycle.
OpenAI's speed is attributed to the transformative technology itself, a 'ready, fire, aim' approach to product development, and an incredibly bottom-up organizational structure that empowers highly driven and autonomous individuals.
The key unlock was shifting from an initial 'too far in the future' cloud-based agent to integrating Codex directly into engineers' existing workflows via IDE extensions and CLI tools, making it more intuitive and trivial to get immediate value.
Codex operates within a sandbox environment, which allows it to use the shell for commands safely and securely. If a command doesn't work in the sandbox, it can ask the user for guidance.
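The run-then-ask pattern described above can be illustrated with a toy wrapper: execute a command under limits, and on failure return a message for the human instead of pressing on. This is only a sketch of the interaction pattern; real Codex sandboxing involves much stricter filesystem and network isolation than a timeout and a working directory.

```python
import subprocess

# Illustrative sandbox-then-ask pattern (not Codex's implementation):
# run a shell command with limits; if it fails, surface the failure so
# the user can decide how to proceed.
def run_sandboxed(command, workdir=".", timeout=30):
    try:
        result = subprocess.run(
            command, shell=True, cwd=workdir,
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return (False, "command timed out; ask the user how to proceed")
    if result.returncode != 0:
        return (False, f"command failed: {result.stderr.strip()}")
    return (True, result.stdout)
```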
Codex uses a feature called 'compaction,' which involves a smart reasoning model, an API layer, and a harness to manage and prepare the model for new context windows, allowing it to work continuously for extended periods.
The vision is for AI to become a proactive 'super assistant' or 'teammate' that can 'do things' by using computers, primarily by writing code. This agent would eventually handle tasks beyond coding, such as scheduling or responding to market signals, without constant explicit prompting.
A significant, underappreciated bottleneck is human typing speed and multitasking speed, specifically in writing prompts and manually validating AI-generated work. Unblocking these human-centric productivity loops is crucial for unlocking exponential AI progress.
Codex empowers PMs and designers to be more technical and efficient, allowing them to answer questions, understand changes, and prototype faster. Designers, for example, can 'vibe code' prototypes directly into PRs, blurring traditional role boundaries.
Key metrics include early retention stats (e.g., D7 retention) to understand initial user adoption, and closely monitoring user feedback and 'vibes' on social media platforms like Reddit and Twitter for specific complaints and emergent behaviors.
Building a browser allows OpenAI to create a contextual assistant that understands user intent in a first-class way, directly within the rendering engine of the web. This enables surfacing contextual actions at the moment they are helpful, providing users with clear control over what the AI can assist with.
Individuals should focus on being 'doers' and leveraging the latest AI tools to maximize productivity, develop strong systems engineering skills, effective communication and collaboration, and pursue knowledge at the frontier of specific domains.
Alexander believes that starting next year, early adopters (like startups) will begin to see hockey-stick growth in productivity due to more self-sufficient agents. Over subsequent years, larger companies will follow, and when this hockey-sticking flows back into AI labs, that's when AGI will be reached.
28 Actionable Insights
1. Unblock AI Productivity Loops
Rebuild systems to reduce human reliance on constant prompting and manual validation, allowing AI agents to be “default useful” and unlock significant productivity gains, which is currently an underappreciated limiting factor for AGI.
2. Maximize Human Acceleration
When building tools, focus on how they maximally accelerate people rather than making human tasks unclear, to ensure users feel empowered and productive.
3. Deeply Understand Customer Problems
Prioritize developing a deep, meaningful understanding of specific customer problems, as this is the most critical competency for building successful products, especially with AI tools.
4. Build Coding Agents
To enable AI models to “do stuff” effectively, build them as coding agents, as writing code is the best way for models to use computers.
5. Integrate AI for Proactivity
Aim to integrate AI tools seamlessly into existing workflows so they proactively assist without constant prompting, acting as a “teammate” that is helpful by default.
6. Be Humble, Learn Empirically
In rapidly evolving fields, prioritize humility, empirical learning, and quick experimentation over rigid planning, as capabilities and user adoption are unpredictable.
7. Ruthlessly Prioritize Impact
Given the high potential impact of work at companies like OpenAI, be ruthless in prioritizing how you spend your time to ensure you are focusing on the most impactful work.
8. Prioritize Intuitive User Onboarding
For new technologies, ensure the initial user experience is intuitive and provides trivial immediate value, even if the long-term vision is more complex, to achieve broad user adoption.
9. Configure Agents Collaboratively
Work side-by-side with AI agents to configure their environment and provide necessary access (e.g., passwords, permissions), enabling them to perform tasks autonomously for extended periods.
10. Balance Dogfooding with Market Needs
While internal dogfooding provides valuable signal, remain cognizant that internal users (e.g., AI experts) may differ from the general market, requiring adjustments for broader product adoption.
11. Plan with AI for Long Tasks
For long or complex tasks, collaborate with AI to first create a detailed plan (e.g., in a markdown file) with verifiable steps, then delegate the execution to the AI, which helps it work for much longer.
12. Improve Agent Self-Validation
Focus on making AI agents better at validating their own work, reducing the human burden of verification and increasing trust in AI-generated output.
13. Enhance AI Code Review
Develop features that specifically aid in reviewing AI-generated code to build human confidence and make the less enjoyable task of code review more efficient.
14. Prioritize Visual Previews in AI Tools
When an AI agent performs visual work (e.g., UI changes), prioritize showing an image preview before the code diff to empower the human and accelerate the review process.
15. Value Execution Over Ideas
Recognize that while AI accelerates building, strong execution remains crucial for success, meaning ideas alone are not as valuable as effective implementation.
16. Focus on Vertical AI
Consider investing in or building vertical AI startups that deeply understand and solve specific problems for a niche customer base, as this approach is currently promising.
17. Monitor Early Retention & User Experience
Regularly check early retention metrics (e.g., D7 retention) and experience the product as a new user by signing up from scratch to understand initial adoption and identify pain points.
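As a concrete reading of the D7 metric mentioned above: D7 retention is the share of a signup cohort that is still active exactly seven days after signing up. A toy computation (data structures here are illustrative, not from the episode):

```python
from datetime import date, timedelta

# Toy D7 retention: fraction of a signup cohort active 7 days after signup.
# signups: {user: signup_date}; activity: {user: set of dates active}
def d7_retention(signups, activity):
    cohort = list(signups)
    retained = sum(
        1 for u in cohort
        if signups[u] + timedelta(days=7) in activity.get(u, set())
    )
    return retained / len(cohort) if cohort else 0.0
```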
18. Monitor Social Media for Real Feedback
Actively monitor social media (e.g., Reddit for real, often negative but valuable feedback; Twitter/X for hype) to gauge user sentiment and identify specific issues that need improvement.
19. Provide Clear AI Control & Boundaries
Offer users clear control and boundaries for AI interaction, such as choosing to use an “AI browser” for AI assistance versus a regular browser for privacy or non-AI tasks, to build trust.
20. Test AI with Hardest Tasks
When evaluating a professional AI coding tool like Codex, give it your most challenging, real-world problems rather than trivial ones to truly assess its capabilities.
21. Apply AI to Real, Complex Problems
Use AI tools on genuine, complex problems like hard-to-diagnose bugs or implementing fixes, rather than simplifying tasks, to leverage their full potential.
22. Build Trust with AI Teammates
Approach AI tools like a new teammate: start by helping them understand the codebase, align on a plan, and then delegate tasks incrementally to build trust and learn effective prompting.
23. Be a Doer, Not Just a Learner
Focus on actively “doing things” and building, leveraging AI tools to increase productivity, rather than solely fulfilling academic assignments, especially for early-career individuals.
24. Master Systems & Collaboration Skills
Develop strong systems engineering, communication, and collaboration skills, as these remain crucial for building effective software systems and teams, even with advanced AI.
25. Advance Knowledge at the Frontier
Pursue knowledge at the frontier of a specific domain, as this area is less accessible to current AI agents and forces you to leverage AI tools to accelerate your own workflow.
26. Automate Monitoring with AI
Use AI agents to continuously monitor critical metrics (e.g., training run graphs) on a loop, enabling proactive identification and potential resolution of issues, such as being “on call for its own training.”
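The "on call for its own training" idea above amounts to a polling loop: read a metric, flag anomalies, and hand them to an agent to investigate. A minimal sketch, assuming placeholder hooks (`read_loss`, `dispatch_to_agent`) for whatever your stack provides:

```python
import time

# Hypothetical monitoring loop: poll a training metric and dispatch
# anomalies to an agent. The spike heuristic is deliberately simple.
def is_anomalous(history, latest, window=10, tolerance=2.0):
    recent = history[-window:]
    if not recent:
        return False
    return latest > tolerance * (sum(recent) / len(recent))

def watch(read_loss, dispatch_to_agent, poll_seconds=60, steps=None):
    history = []
    seen = 0
    while steps is None or seen < steps:
        latest = read_loss()
        if is_anomalous(history, latest):
            dispatch_to_agent(f"Loss spiked to {latest}; investigate the run.")
        history.append(latest)
        seen += 1
        if steps is None:
            time.sleep(poll_seconds)
```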
27. Adopt a Bottom-Up Approach
Implement a truly bottom-up organizational structure to foster rapid experimentation and leverage individual drive, especially in fast-moving tech environments, though it requires a high caliber of talent.
28. Be Kind and Candid
Practice kindness alongside candor in communication and leadership, recognizing that true kindness sometimes requires difficult but honest conversations.
7 Key Quotes
It turns out the best way for models to use computers is simply to write code. And so we're kind of getting to this idea where if you want to build any agent, maybe you should be building a coding agent.
Alexander Embiricos
The current underappreciated limiting factor is literally human typing speed or human multitasking speed.
Alexander Embiricos
I think auto-completion in IDEs is one of the most successful AI products today, and part of what's so magical about it is that it can surface ideas for helping you really rapidly. When it's right, you're accelerated; when it's wrong, it's not that annoying.
Alexander Embiricos
What we can do as a product team building in this space is always think about how to build a tool that feels like it's maximally accelerating people, rather than one that makes it less clear what you should do as the human.
Alexander Embiricos
The best way to try Codex is to give it your hardest tasks.
Alexander Embiricos
If you don't believe it, you can't will it into existence, so you need a balance.
Alexander Embiricos
I still think execution is really hard. You can build something fast, but you still need to execute well on it; it still needs to make sense and be a coherent thing overall.
Alexander Embiricos
3 Protocols
How to run Codex on a really long task (Plan-driven Development)
Alexander Embiricos
- Collaborate with Codex to write a plan in a markdown file (e.g., plan.md).
- Ensure the plan has verifiable steps.
- Once satisfied with the plan, ask Codex to go off and do the work.
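The kind of plan file described above might look like the following. The project and commands are invented for illustration; the point is that each step carries its own verification criterion so the agent can check its work as it goes.

```markdown
# plan.md — migrate the settings page to the new design system (example)

1. Inventory every component used on the settings page.
   Verify: the list matches what actually renders at /settings.
2. Swap each component for its design-system equivalent, one group per PR.
   Verify: the test suite passes and screenshots match the mocks.
3. Delete the old components once nothing imports them.
   Verify: a repo-wide search for the old component names returns nothing.
```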
How to get started with Codex (Building Trust with a Teammate)
Alexander Embiricos
- Try a few different tasks in parallel.
- Ask Codex to understand the codebase.
- Formulate a plan with Codex around an idea you have.
- Build your way up from there, giving it tasks bit by bit.
How to configure a coding agent to be effective (Human-in-the-loop configuration)
Alexander Embiricos
- Identify a specific problem or area where the agent is not verifying its work effectively (e.g., the Atlas project's verification).
- Prompt the agent (e.g., Codex) with something like 'hey, why can't you verify your work? Fix it.'
- Repeat this process in a loop, with human guidance, until the agent is configured to verify its own work.