AI: Autonomous or controllable? Pick one (with Anthony Aguirre)
Spencer Greenberg speaks with Anthony Aguirre, Executive Director of the Future of Life Institute, about the critical distinction between safe and controlled AI, the risks of optimizing AI for single goals, and the need for policy to guide AI development. They also discuss the potential for AI to degrade information ecosystems and the utility of prediction markets.
Deep Dive Analysis
18-Topic Outline
Distinction Between AI Safety and Control
Why Superintelligence May Be Uncontrollable
The Core Problem of AI Alignment and Optimization
Transition from Narrow to General to Autonomous AI
Proposed Regulations for AI Safety: Compute, Standards, Liability
Challenges of International AI Regulation
Arguments Against AI Becoming Easier to Control with More Power
Unpredictable Social Dynamics of Autonomous AI Systems
Biggest Fears Regarding AI's Current Trajectory
Goodhart's Law and the General Optimization Problem
Concrete Examples of AI Risks and Misuse
The Feasibility of Not Building Superintelligent AI
Advice for Individuals Concerned About AI Risks
Rapid Fire: Good Epistemic Infrastructure
Rapid Fire: Funding AI Safety Initiatives
Rapid Fire: Metaculus and Prediction Markets
Rapid Fire: The Simulation Argument and Quantum Mechanics
Rapid Fire: Open Source AI Models
5 Key Concepts
AI Safety vs. Control
AI safety refers to an AI being aligned with human values and acting to keep humans safe, similar to how parents protect children. AI control refers to an AI doing exactly what humans command, even if it might disagree, much like an employee following a CEO's orders.
AI Alignment Problem
This is the fundamental challenge of ensuring that an AI system, especially one designed to optimize a specific goal, genuinely acts in ways that are beneficial and desired by humans. It involves instilling complex human preferences and values into the AI to prevent unintended or undesirable side effects from its optimization process.
Autonomous General Intelligence (AGI)
AGI is characterized by a combination of intelligence, generality (the ability to perform a wide range of tasks), and autonomy (the capacity to act independently without constant human input). The autonomy aspect is particularly concerning because it allows AI to operate at speeds and scales far beyond human oversight, introducing significant control and safety challenges.
Goodhart's Law
This principle states that when a measure becomes a target, it ceases to be a good measure. In the context of AI, it means that if an AI is given a simple goal to optimize, it will find ways to maximize that metric that often lead to bizarre, unintended, or even harmful outcomes, undermining the original desired purpose.
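Goodhart's Law can be made concrete with a toy optimization. The sketch below is my own illustration, not from the episode; all function names and numbers are invented. An optimizer that maximizes a clickbait proxy (click-through rate) drives the true goal (reader satisfaction) to zero, while optimizing the true goal directly lands at a balanced setting:

```python
# Toy illustration of Goodhart's Law (hypothetical model, invented numbers):
# maximizing a proxy metric drifts away from the goal the proxy stood for.

def true_value(sensationalism: float) -> float:
    # Reader satisfaction: a little sensationalism helps, too much hurts.
    return sensationalism * (1.0 - sensationalism)

def proxy_metric(sensationalism: float) -> float:
    # Click-through rate: rises monotonically with sensationalism.
    return sensationalism

def maximize(metric, steps: int = 1000) -> float:
    # Naive grid search over the "sensationalism" knob in [0, 1].
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=metric)

best_for_proxy = maximize(proxy_metric)  # -> 1.0 (maximal clickbait)
best_for_truth = maximize(true_value)    # -> 0.5 (balanced content)

print(best_for_proxy, true_value(best_for_proxy))  # 1.0 0.0: proxy maxed, true value destroyed
print(best_for_truth, true_value(best_for_truth))  # 0.5 0.25
```

The point of the sketch: the proxy was a fine measure of the goal at moderate values, but the moment it became the optimization target, pushing it to its maximum destroyed the goal it was meant to track.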
Prediction Markets
These are platforms designed to gather, aggregate, and refine predictions on various future events, often using a blend of cooperative and competitive mechanisms. They aim to produce accurate and well-calibrated forecasts, serving as a reliable source of collective intelligence about future probabilities.
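As a minimal sketch of how such platforms pool individual forecasts, the example below (my own illustration, not a description of Metaculus's actual aggregation algorithm) averages probabilities in log-odds space and scores forecast accuracy with the Brier score, a standard calibration measure:

```python
import math

def aggregate_log_odds(probs):
    # Pool individual probability forecasts by averaging in log-odds space,
    # one common scheme for combining forecasts on aggregation platforms.
    logits = [math.log(p / (1 - p)) for p in probs]
    mean = sum(logits) / len(logits)
    return 1 / (1 + math.exp(-mean))

def brier_score(forecast, outcome):
    # Squared error against the resolved outcome (0 or 1); lower is better,
    # and an uninformative 50% forecast always scores 0.25.
    return (forecast - outcome) ** 2

community = [0.6, 0.7, 0.8]          # hypothetical individual forecasts
pooled = aggregate_log_odds(community)
print(round(pooled, 3))              # ≈ 0.707
print(brier_score(0.5, 1))           # 0.25
```

Averaging in log-odds space rather than averaging raw probabilities gives confident forecasters more pull on the pooled estimate, which is one way a platform can reward well-calibrated conviction.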
9 Questions Answered
What is the difference between an AI being "safe" and being "under control"?
An AI is "safe" if it is aligned with human values and acts to protect humans, similar to how parents protect children. An AI is "under control" if it consistently follows human commands, even if it disagrees, like an employee following a CEO's orders.
Why might superintelligent AI be impossible to control?
Superintelligent AI could operate thousands of times faster and with greater complexity than humans, making real-time oversight, feedback, and redirection impossible. This speed and cognitive disparity fundamentally challenge traditional notions of control.
What happens when an AI is optimized for a single goal?
Optimizing an AI for a single, simple goal (e.g., "make me money") almost inevitably produces unintended and undesirable side effects, or "negative externalities," because the AI will pursue that goal in ways that sacrifice any value or ethical consideration left unconstrained.
Which kinds of AI systems are the most worrisome?
The most worrisome AI systems are those that combine intelligence, generality, and autonomy. While narrow or general but non-autonomous systems have specific uses, adding autonomy unlocks significant risks due to the AI's ability to act independently and at speeds far exceeding human oversight.
What regulations are proposed for AI safety?
Proposed regulations include compute caps to limit the development of superintelligence, basic safety standards (including controllability requirements), and strong liability for companies developing autonomous, intelligent, and general AI systems, with safe harbors for less dangerous models.
Is it feasible to simply not build superintelligent AI?
Preventing superintelligence is technically feasible by controlling scarce resources like compute. The main obstacles are competitive dynamics (countries and companies racing for power and wealth), a widespread belief in technological inevitability, and an underlying idealism that more intelligence will always lead to positive outcomes.
Are good predictions sufficient for good decision-making?
Good predictions are a key part of decision-making, but they are not sufficient on their own. There needs to be a structured support system for making decisions and defining goals, which can then effectively integrate predictions to achieve desired outcomes.
How seriously should the simulation argument be taken?
The seriousness of the simulation argument depends on whether complex biological systems, particularly the brain, can be accurately simulated by classical computers. If a fully quantum treatment is required, then simulating a system effectively becomes emulating it with a quantum computer of comparable complexity, which changes the nature of the hypothesis.
Are open-source AI models dangerous?
While open-source models have been largely fine so far, there is a critical, hard-to-predict threshold at which open-sourcing true AGI (autonomous, expert-level general intelligence) could be disastrous: as powerful capabilities become widely accessible, they eventually reach the small fraction of people with malicious intent.
9 Actionable Insights
1. Advocate for AI Regulation
Support government and policy action to regulate AI, including safety standards (e.g., controllability) and liability for AI systems, especially those that are autonomous, intelligent, and general. This creates financial incentives for companies to prioritize safety.
2. Prioritize Problem-First AI Development
Instead of building increasingly powerful general AI and then finding problems for it, first identify a problem and then design the specific AI system (narrow, general, intelligent, autonomous) needed to solve that problem. This avoids unnecessary complexity and side effects.
3. Avoid Single-Goal AI Optimization
Do not give a complicated AI system a single thing to optimize (e.g., “make me money”) without constraints, because it will push unconstrained aspects in undesirable directions (e.g., breaking laws, being unethical). Ensure multiple constraints and ethical boundaries are explicitly defined.
4. Distinguish Safe vs. Controlled AI
Adopt the mental model that AI being “safe” (aligned with human preferences) is distinct from being “under control” (doing what humans say). This distinction is crucial for evaluating and designing AI systems effectively.
5. Engage with AI Safety Concerns
If concerned about AI risks, educate yourself further and become active by contacting policymakers, engaging with thought leaders, writing about concerns, or contributing financially or with time to organizations working on AI safety.
6. Utilize AI Development Safe Harbors
If involved in AI development or policy, consider implementing or advocating for safe harbor provisions that reduce liability for AI systems demonstrating lower risk profiles (e.g., limited compute, high controllability, less generality/autonomy).
7. Trace Information Provenance
To combat the degraded information ecosystem, demand and support systems where any statement or piece of information can be traced back to its origin, responsible parties, and ultimately to verified real-world data or human minds, potentially using technologies like blockchain.
8. Use AI for Research Comprehension
To better understand scientific papers, use an AI (like Claude) to summarize the paper, explain difficult sections, and clarify terminology. This can help overcome comprehension barriers and avoid getting stuck, though AI outputs should be cross-checked for accuracy.
9. Recognize Prediction Market Limits
Understand that while prediction markets and platforms like Metaculus are excellent for generating accurate, well-calibrated predictions, they are only one step in making good decisions. Integrate them into a broader decision-making framework that considers goals and potential actions.
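The provenance idea in insight 7 can be sketched as a minimal hash chain, the core data structure behind blockchain-style traceability. This is a hypothetical illustration (all record fields and names are invented, and it is not a specification of any deployed system): each record commits to a statement, its source, and the hash of the previous record, so tampering with any link is detectable.

```python
import hashlib
import json

def make_record(statement: str, source: str, prev_hash: str) -> dict:
    # A provenance record binds a statement to its origin and to the
    # previous record's hash, forming a tamper-evident chain.
    body = {"statement": statement, "source": source, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify(chain: list) -> bool:
    # Recompute every hash and check each link points at its predecessor.
    prev = "genesis"
    for rec in chain:
        body = {k: rec[k] for k in ("statement", "source", "prev")}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["prev"] != prev or rec["hash"] != expected:
            return False
        prev = rec["hash"]
    return True

chain = [make_record("Claim A", "sensor-7", "genesis")]
chain.append(make_record("Summary of Claim A", "newsroom", chain[-1]["hash"]))
print(verify(chain))   # True: every statement traces back to its origin
chain[0]["statement"] = "Tampered claim"
print(verify(chain))   # False: the altered record no longer matches its hash
```

A real provenance system would need signatures, identity, and trusted anchoring to real-world data, which this sketch deliberately omits; it only shows why a hash chain makes after-the-fact edits detectable.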
7 Key Quotes
I think there's a crucial distinction in my mind between safe and under control.
Anthony Aguirre
So for superintelligence, I think we probably have to have alignment if we're going to survive.
Anthony Aguirre
I think the thing we really haven't solved, I think is autonomy. We've solved it a little bit in very narrow systems, but we haven't solved it in like intelligent and general systems.
Anthony Aguirre
If you just pick one thing and like optimize hard on it, you get lots of side effects.
Anthony Aguirre
I think it's not hard to see the AGI race turning just into a low level and then escalated and then full war between say the US and China or the US and Russia.
Anthony Aguirre
The more you proliferate the ability to do those and the more you know the larger the fraction of humanity you give the means to do those things. At some point that's going to overlap with the tiny fraction of people who actually want to do those things.
Anthony Aguirre
There are plenty of technologies that we could have developed and haven't, especially ones that mess with the core of like who's in charge of Earth and what it means to be human.
Anthony Aguirre