An AI state of the union: We’ve passed the inflection point, dark factories are coming, and automation timelines | Simon Willison

19 Topic Outline

Introduction to Simon Willison and AI's Impact

The November 2025 AI Coding Inflection Point

Current Capabilities and Future of AI in Software Development

Vibe Coding vs. Agentic Engineering

The 'Dark Factory' Pattern and Code Review

Shifting Bottlenecks and Human Value in AI Development

Impact of AI on Different Engineering Seniority Levels

Advice for Mid-Career Engineers

The Intensity of Working with AI Agents

The Impact of 'Cheap Code' and Prototyping

Simon Willison's Current AI Stack and Model Preferences

The Pelican Riding a Bicycle AI Benchmark

Hoarding Knowledge for Agentic Engineering

Red/Green Test-Driven Development with AI Agents

Starting Projects with Good Code Templates

The Lethal Trifecta and Prompt Injection Security Risks

The Normalization of Deviance and AI Disaster Prediction

The OpenClaw Phenomenon and Demand for Personal AI Assistants

Simon Willison's Current Work and Future Focus

8 Key Concepts

Reasoning Models

AI models that can "think" through problems, exhibiting a process similar to human reasoning, which significantly improved their ability to generate and debug code.

Vibe Coding

A hands-off approach to coding where a user tells an AI agent what to build, plays with the result, and iterates without directly looking at or understanding the underlying code.

Agentic Engineering

The professional practice of using AI coding agents to write, debug, and test production-ready software, requiring deep experience in software and agent interaction to achieve high-quality results.

Dark Factory Pattern

An advanced concept in software development where code is generated and quality-assured by AI agents without human developers writing or reviewing the code directly, akin to an automated factory operating in darkness.

Normalization of Deviance

A sociological term describing how people or organizations gradually accept increasingly unsafe practices as normal, especially when repeated failures do not immediately result in catastrophe, leading to a false sense of security.

Prompt Injection

A class of security vulnerabilities in applications built on top of LLMs, where malicious instructions embedded in user input can override the system's original instructions, potentially leading to unintended actions or data breaches.

Lethal Trifecta

A specific subset of prompt injection vulnerabilities characterized by three conditions: an AI agent has access to private information, is exposed to malicious instructions (e.g., via email), and has a mechanism to exfiltrate that data back to an attacker.

Pelican Riding a Bicycle Benchmark

A humorous yet surprisingly effective benchmark created by Simon Willison to assess the quality of text-to-SVG code generation by LLMs, where the ability to accurately draw a pelican on a bicycle correlates with overall model performance.

10 Questions Answered

?

What was the "November 2025 inflection point" in AI coding?

In November 2025, models like GPT 5.1 and Claude Opus 4.5 crossed a threshold where coding agents became reliably effective, consistently producing functional code rather than buggy prototypes, leading to widespread realization among engineers.

?

How has AI changed the role of a software engineer?

AI has massively accelerated code generation, shifting bottlenecks from writing code to other areas like idea validation and process redesign, and requiring engineers to leverage their experience to guide agents effectively.

?

Are junior or senior engineers more at risk from AI?

Mid-career engineers are considered most at risk, as AI amplifies the skills of experienced engineers and significantly aids the onboarding of new engineers, but offers less unique benefit to those in the middle.

?

Why are people working harder with AI, despite its productivity benefits?

The intense cognitive load of managing multiple AI agents and the constant need for rapid decision-making can be mentally exhausting, leading to increased work intensity for those at the leading edge of AI adoption.

?

Why is "vibe coding" not always suitable for professional software?

Vibe coding, where users don't look at the code, is great for personal prototypes, but becomes irresponsible for production code used by others, as understanding potential damage and ensuring quality requires expert oversight.

?

What is the "dark factory" pattern in software development?

The "dark factory" pattern involves AI agents generating and testing software without human developers writing or reviewing the code, relying instead on automated QA swarms and robust testing to ensure quality.

?

How can engineers avoid becoming obsolete due to AI?

Engineers should lean into AI to amplify their skills, take on more ambitious projects, and focus on developing their "agency" – the ability to decide what problems to tackle and how to leverage technology for self-improvement.

?

Why is prompt injection an unsolved security problem?

Prompt injection is difficult to solve because LLMs cannot fundamentally distinguish between user instructions and system instructions within text, making it nearly impossible to create filters that are 100% effective against all possible malicious inputs.

?

What is the "lethal trifecta" in AI security?

The lethal trifecta describes a critical security vulnerability where an AI agent has access to private information, is exposed to malicious instructions (e.g., from an attacker), and has a mechanism to exfiltrate that data back to the attacker.

?

Why is the "normalization of deviance" relevant to AI security?

Repeatedly using AI systems in unsafe ways without immediate catastrophic failure can lead to a false sense of security, causing organizations to normalize risky practices, which Simon Willison predicts will eventually lead to a major AI disaster.

9 Actionable Insights

1. Embrace AI for Ambitious Projects

Use AI to amplify existing skills and take on much more ambitious projects, as the initial learning curve for new technologies is significantly reduced.

2. Prototype Multiple Ideas Rapidly

Leverage AI’s ability to quickly build functional prototypes to explore three or more different approaches for any feature or design, enabling faster experimentation and better decision-making.

3. Brainstorm with AI for Breadth

Utilize AI as a brainstorming companion to generate a wide range of initial ideas, including obvious ones, and then push for more unusual combinations to spark truly innovative directions.

4. Hoard Knowledge with AI Assistance

Actively build a backlog of tried techniques and new technologies by using AI to quickly prototype and document new tools or research findings, storing them in a trusted system like GitHub for future reference.

5. Start Projects with Good Templates

Begin new coding projects with a minimal, well-structured template that includes a single test and preferred formatting, as coding agents are exceptionally good at adhering to existing patterns and styles.

6. Limit AI Agent Blast Radius

When building systems with AI agents, assume malicious instructions are possible and limit the agent’s “blast radius” by restricting its ability to exfiltrate private data or perform high-damage actions without human approval.

7. Treat AI as an Unreliable Source

For tasks requiring high accuracy, like journalism, treat AI as another unreliable source; double-check all AI-generated details, as models can hallucinate even with search integration.

8. Leverage AI for Research & Learning

Use AI models with search integration for research, asking complex questions and observing how they fire off parallel searches to gather data, but always verify critical information.

9. Prioritize Human-in-the-Loop for High Risk

For high-risk AI agent activities, design systems where human approval is required only for critical, high-risk actions, filtering out trivial requests to avoid “click fatigue.”

7 Key Quotes

Today, probably 95% of the code that I produce, I didn't type it myself.
Simon Willison

Using coding agents well is taking every inch of my 25 years of experience as a software engineer.
Simon Willison

I can fire up four agents in parallel and have them work on four different problems, by 11 a.m. I am wiped out.
Simon Willison

The problem is the people in the middle. Like if you're mid career, if you haven't made it to sort of super senior engineer yet, but you're not sort of new either. That's the, that's the group which ThoughtWorks Resolved were probably in the most trouble right now.
Simon Willison

I think agents have no agency at all. Like I would argue that the one thing AI can never have is agency because it doesn't have human motivations.
Simon Willison

I think the biggest opportunity in AI right now, if you can build safe OpenClaw, if you can deploy a version of OpenClaw that does all the things people love about it and won't randomly leak people's data and delete their files, that's a huge opportunity.
Simon Willison

I think something people often miss is that this space is inherently funny. Like it is ridiculous.
Simon Willison

1 Protocols

Red/Green Test-Driven Development with AI Agents

Simon Willison

Instruct the AI agent to write the automated test for the desired functionality.
Run the test and observe it fail (red), confirming the test is correctly designed to detect the absence of the feature.
Instruct the AI agent to implement the code necessary to make the test pass.
Run the test again and observe it pass (green), confirming the functionality is correctly implemented.

14 Key Numbers

95%

Percentage of code Simon Willison produces that he didn't type himself As of the podcast recording

25 years

Simon Willison's experience as a software engineer Used to guide AI agents effectively

10,000 lines

Amount of code an engineer can churn out in a day with AI Most of it works

1,000 interns

Number of interns Cloudflare and Shopify were hiring Over the course of 2025, aided by AI onboarding

$10,000

Daily cost for Strong DM to simulate end-users for QA with AI agents On tokens for simulation

100

Number of potential vulnerabilities Anthropic discovered for Firefox Responsibly reported to Mozilla

2022

Year before which data labeling companies are buying human-written code Pre-ChatGPT emergence

97%

Effectiveness rate of filters against prompt injection that Simon Willison considers a 'failing grade' Meaning 3 out of 100 attacks could succeed

250

Number of Kakapo parrots left in the world Flightless nocturnal parrots in New Zealand

4 years

Duration since the last good breeding season for Kakapo parrots before 2026 They breed when Rimu trees have a mass fruiting season

3.5 months

Time it took for OpenClaw to go from its first line of code to a Superbowl ad First line of code on November 25th

193

Number of small HTML/JavaScript tools in Simon Willison's 'SimonW/tools' GitHub repository Captures ideas or possible things to do

50

Number of additional research projects in Simon Willison's private GitHub repository Things that didn't fit public sharing

Over 100

Number of tests many of Simon Willison's small libraries now have Normally considered over-testing, but acceptable with AI agents

Deep Dive Analysis

Introduction to Simon Willison and AI's Impact

The November 2025 AI Coding Inflection Point

Current Capabilities and Future of AI in Software Development

Vibe Coding vs. Agentic Engineering

The 'Dark Factory' Pattern and Code Review

Shifting Bottlenecks and Human Value in AI Development

Impact of AI on Different Engineering Seniority Levels

Advice for Mid-Career Engineers

The Intensity of Working with AI Agents

The Impact of 'Cheap Code' and Prototyping

Simon Willison's Current AI Stack and Model Preferences

The Pelican Riding a Bicycle AI Benchmark

Hoarding Knowledge for Agentic Engineering

Red/Green Test-Driven Development with AI Agents

Starting Projects with Good Code Templates

The Lethal Trifecta and Prompt Injection Security Risks

The Normalization of Deviance and AI Disaster Prediction

The OpenClaw Phenomenon and Demand for Personal AI Assistants

Simon Willison's Current Work and Future Focus

Reasoning Models

Vibe Coding

Agentic Engineering

Dark Factory Pattern

Normalization of Deviance

Prompt Injection

Lethal Trifecta

Pelican Riding a Bicycle Benchmark

1. Embrace AI for Ambitious Projects

2. Prototype Multiple Ideas Rapidly

3. Brainstorm with AI for Breadth

4. Hoard Knowledge with AI Assistance

5. Start Projects with Good Templates

6. Limit AI Agent Blast Radius

7. Treat AI as an Unreliable Source

8. Leverage AI for Research & Learning

9. Prioritize Human-in-the-Loop for High Risk

Red/Green Test-Driven Development with AI Agents