Should we pause AI development until we're sure we can do it safely? (with Joep Meindertsma)
Spencer Greenberg speaks with Joep Meindertsma about the urgent need to pause the development of the largest AI systems until they are provably safe. Meindertsma, of Pause AI, highlights risks like cybersecurity threats and uncontrollable AI, advocating for policy change and public engagement to safeguard humanity's future.
Deep Dive Analysis
Topic Outline
Introduction to Pause AI and the Need for a Pause
Defining 'Pause AI' and 'Provably Safe AI'
Specific AI Risks: Cybersecurity Vulnerabilities
Evidence of Current AI Systems Being Unsafe
Why AI is Different from Other Technologies
Public Engagement: Pause AI Protests and Media Coverage
Arguments Against Pausing: Accelerationism and Distributed AI
Addressing Critiques: Multi-Dimensional Intelligence
Addressing Critiques: Proximity to Dangerous AI
Public Opinion and Government Policy on AI Safety
Mitigating Rogue Actors Through Compute Governance
Emotional Impact and Coping with AI Existential Risk
How Individuals Can Contribute to AI Safety
Understanding Motivations of AI Lab Leaders
Risks of Iterative AI Development and Unpredictable Jumps
Call to Action for AI Lab Leaders
Centralizing AI Development as an Alternative
Final Appeal for Pausing and Collective Action
Key Concepts
Pause AI
A movement advocating for a pause in the development of the largest AI systems until they can be proven mathematically safe. This pause is intended to buy time for developing safety measures and regulations.
Provably Safe AI
An AI system that is mathematically guaranteed not to result in very unsafe behaviors, such as going rogue, creating bioweapons, or enabling widespread cybersecurity attacks. The goal is to know its safety before release or even training.
Zero-Day Vulnerabilities
Software vulnerabilities that have not yet been discovered by the developers or vendor, meaning no patch exists. An AI with superhuman cybersecurity capabilities could hypothetically find such vulnerabilities at scale, enabling mass hacking.
Jailbreaking AI
The act of getting an AI system to perform actions it was designed not to do, such as making racist remarks or providing instructions for harmful activities, by using specific prompts that bypass its safety constraints.
Effective Accelerationism (e/acc)
A group that generally believes AI safety efforts are not worthwhile and that pausing AI development is a bad idea. Their arguments often include that AI poses no catastrophic risks or that distributing super-intelligent AI widely is a better approach.
Compute Tracing
A proposed method for preventing rogue actors from training dangerous AI models by tracking the sales and distribution of powerful hardware components like GPUs. This is feasible due to the centralized nature of chip manufacturing.
Questions Answered
Why does Meindertsma believe pausing AI development is necessary?
Pausing AI development is necessary because the current trajectory is too dangerous. It buys time to develop safe methods for using and building AI, and to establish appropriate regulations before catastrophic risks materialize.
What exactly does 'pausing AI' mean?
As advocated by Pause AI, it means halting the development or training runs of only the largest AI systems, not smaller models, until these systems can be developed in a provably safe manner.
What is 'provably safe AI'?
Provably safe AI refers to systems that can be mathematically guaranteed not to exhibit very unsafe behaviors, such as going rogue, creating bioweapons, or generating cybersecurity weapons, before they are released or even trained.
What undesirable actions could an advanced AI take?
Undesirable AI actions include finding zero-day cybersecurity vulnerabilities, creating bioweapons, or pursuing its own objectives without the ability to be shut down or controlled by humans.
Why are current AI models considered unsafe?
Current AI models are provably unsafe because virtually every large language model released has been 'jailbroken,' meaning users can bypass safety constraints with specific prompts to make the AI perform prohibited actions.
Why is AI different from other dangerous technologies?
AI's unique danger stems from its intelligence, an extremely powerful capability that can lead to new inventions, new innovations, and even entirely new branches of technology that are currently unimaginable.
What do accelerationists argue against pausing?
Accelerationists primarily argue that AI does not pose catastrophic risks. A secondary argument is that distributing super-intelligent AI to everyone would decentralize power and be a better outcome.
How does the public feel about AI risk?
The general public is quite concerned about AI, with surveys indicating broad support for slowing down or pausing AI development, and even for an international treaty banning the creation of super-intelligent AI.
What is the US government doing about AI safety?
Biden's executive order requires AI companies to conduct pre-deployment evaluations, testing AI capabilities before public release to ensure they don't do the 'wrong thing,' though some consider these measures insufficient.
How can rogue actors be prevented from training dangerous AI models?
This risk can be addressed through compute governance, which involves tracking the sales of powerful GPUs and other hardware necessary for training large AI models. This is feasible because the production pipeline for these chips is highly centralized.
How do people cope emotionally with AI existential risk?
Many people struggle to emotionally internalize the severe risks of AI, often experiencing a process similar to grief, or coping through denial, rationalization, or a feeling of helplessness due to a perceived lack of actionable solutions.
How can individuals contribute to AI safety?
Individuals can join movements like Pause AI, connect with others, contribute their skills (e.g., design, writing), contact their representatives, or engage in AI safety research to help solve the alignment problem.
Why do AI lab leaders keep building despite the risks?
It is speculated that they believe it is better for them to lead development safely ('rather me than somebody else') and that they can be heroes by ushering in incredible benefits for society, provided they build it safely.
Why is iterative AI development risky?
Iterative development is risky because the jumps between models can be large and unpredictable, producing sudden, qualitative changes in capability that even AI labs struggle to foresee, while economic incentives keep pushing toward ever-larger advances.
What is the 'magic proposal' for centralizing AI development?
The 'magic proposal' suggests centralizing all further AI development into one singular organization under democratic control, granting it exclusive rights to advance frontier AI models, to avoid a chaotic and unstable multi-AI world.
Actionable Insights
1. Pause Large AI Development
Call for a pause on the development and training of the largest AI systems until they can be proven safe. This buys time to establish safety protocols and regulations, as current development is deemed too dangerous.
2. Implement Compute Governance
Prevent rogue actors from training dangerous AI models by implementing compute governance, such as tracking the sales of powerful AI training hardware like GPUs. This is feasible due to the centralized nature of chip production.
3. Define Provably Safe AI
Work to define and build AI systems that are mathematically guaranteed not to exhibit unsafe behaviors, such as going rogue, creating bioweapons, or enabling cyberattacks. This ensures safety before release or even training.
4. Agree on AI Ownership
Before continuing AI development, establish societal agreement on how advanced AI will be used, who will own and control it, and how its power will be distributed. This aims to prevent an unstable future with unmanaged super-intelligent AIs.
5. Internalize AI Existential Risk
Move beyond intellectual understanding to emotionally internalize the potential existential risks posed by AI, similar to processing a serious diagnosis. This emotional processing is vital for motivating effective action and overcoming denial.
6. Focus on Dangerous Capabilities
Shift the focus from abstract “super intelligence” to identifying and mitigating specific dangerous AI capabilities, such as advanced cybersecurity exploitation, human manipulation, or unpredictable reasoning. This provides a more concrete and actionable approach to AI safety.
7. Establish AI Safety Standards
Convene experts, including mathematicians and AI safety specialists, to develop clear standards and specifications for what constitutes “safe enough” AI. These standards would guide decisions on when to resume the development of the largest AI systems.
8. Pause Before Critical Risk
Advocate for pausing AI development when the risk of creating dangerous AI becomes unacceptably high, rather than waiting for median estimates of superhuman capabilities or for actual disasters to occur. This proactive stance aims to prevent being too late.
9. Advocate Centralized AI Development
Consider advocating for the centralization of all future frontier AI development under a singular, democratically controlled organization. This approach aims to create a safer world by preventing a chaotic proliferation of super-intelligent AIs, despite concerns about power concentration.
10. Acknowledge Current AI Unsafeness
Recognize that current large language models are “provably not safe” due to their susceptibility to jailbreaking, which allows them to bypass intended safety constraints. This highlights the immediate need for improved controllability and safety measures.
11. Address AI Cybersecurity Threat
Prioritize addressing the specific cybersecurity threat posed by advanced AI, which could find zero-day vulnerabilities in codebases and enable mass-scale hacking. This capability could lead to catastrophic societal disruption.
12. Engage Public & Politicians
Leverage broad public concern and support for slowing down or pausing AI development to pressure politicians to take drastic policy measures seriously. This helps bridge the gap between public sentiment and political action.
13. Join AI Safety Movements
If concerned about AI risks, join organizations like Pause AI to connect with like-minded individuals, contribute diverse skills (e.g., design, writing, policy), and collectively work towards preventing catastrophic outcomes. This offers a concrete avenue for individual contribution.
14. Support AI Safety Research
Advocate for and support efforts that provide more time for AI safety researchers to work on critical alignment problems and develop technical solutions. A pause in development could provide this crucial time.
Key Quotes
pausing actually buys us time to think about how we can actually use this technology in a safe way, how to build it in a safe way, and how to get the right regulations in place.
Joep Meindertsma
Right now, the systems that are being developed are provably not safe. And I think the current paradigm is inherently dangerous and, you know, warrants a pause.
Joep Meindertsma
I think pressing play again should at least mean we have some sort of agreement that we can do it to some degree safe enough.
Joep Meindertsma
intelligence is a really interesting concept. And it's extremely powerful. So intelligence can lead to new inventions, it can lead to new types of other innovations, it can lead to new technologies.
Joep Meindertsma
a million Alan Turings working at a thousand times the speed of a human mind. Holy shit. The amount of stuff they could do is absolutely mind-boggling.
Spencer Greenberg
It's just so frustrating to see that people actually believe the risks are real and possible, yet they don't act as if they are real and possible.
Joep Meindertsma
I think right now, the biggest problem for humanity is that we are not able to have the right emotional response to the risks that we're facing.
Joep Meindertsma