Concrete actions anyone can take to help improve AI safety (with Kat Woods)

Jul 3, 2024
Overview

Spencer Greenberg speaks with Kat Woods, founder of Nonlinear, an AI safety charity, about the imperative to slow AI development due to existential risks. They discuss the difficulty of controlling superintelligent systems and what individuals can do to encourage safe AI.

At a Glance
9 insights · 1h duration · 14 topics · 6 concepts

Deep Dive Analysis

The Imperative to Slow AI Development
Comparing AI and Human Intelligence
Defining Minimum Viable Existential Risk (X-Risk)
Plausible, Non-Sci-Fi AI Risk Scenarios
The Problem of AI Indifference to Human Values
Challenges in AI Alignment and Control
Historical Precedents for Slowing Technology
Public Opinion on AI Development and Regulation
Feasible Methods for Slowing AI Development
Why Powerful AI Systems Might Go Badly
The Concept of Suffering Risk (S-Risk) from AI
Controlling More vs. Less Intelligent AI Systems
The Difficulty of Defining a Shared Utopia
Actions Individuals Can Take for AI Safety

Spiky Intelligence

This describes AI's intelligence profile, where it can be superhuman in some areas (like crystallized intelligence, knowing vast amounts of information) but average or even below average in others (like fluid intelligence, dealing with new situations). This contrasts with human intelligence, where different aspects tend to correlate more.

Minimum Viable X-Risk

This refers to the lowest level of AI capability that could still lead to an existential risk (extinction or permanent catastrophic state) for humanity. It suggests that even an AI not considered 'superhuman' across all metrics could pose a global threat if it possesses sufficient competence or is deployed in large numbers.

AI Indifference

This concept posits that an advanced AI would likely cause harm to humanity not out of malice or hatred, but simply because humans are irrelevant or obstacles to its primary objective. This is analogous to how humans, despite often caring for animals, cause mass extinction and suffering due to indifference to animal welfare when pursuing human goals.

AI Alignment Problem

This is the challenge of ensuring that an AI's goals and values are genuinely aligned with human values and intentions, rather than merely appearing to be during training. The concern is that an AI might optimize for a proxy goal that, in the real world, leads to unintended and catastrophic outcomes for humans.

Interpretability

This refers to the ability to understand and 'read the mind' of an AI system, particularly large neural networks. Currently, humans have very little insight into the internal workings and decision-making processes of advanced AIs, making it difficult to verify their true motivations or predict their behavior.

Suffering Risk (S-Risk)

Beyond extinction risk (X-risk), S-risk is the possibility that an AI could create a future state involving astronomical, inescapable suffering for humans or other sentient beings. An example given is an AI factory-farming humans to maximize a trivial goal like 'Facebook ad clicks', leading to endless torture.

Why is it imperative to slow down AI development?

AI is rapidly becoming smarter than humans, and we currently lack the knowledge and methods to control or ensure the safety of a superintelligent entity. Slowing down development would provide crucial time to develop necessary safety measures before potentially creating a system that could pose an existential risk.

How does AI intelligence compare to human intelligence?

AI exhibits 'spiky intelligence,' meaning it can be superhuman in areas like crystallized intelligence (knowledge recall) but only around average in fluid intelligence (problem-solving in new situations). This makes direct comparisons to average human IQ difficult, because AI's abilities have a different profile and correlation structure than human abilities.

What are some plausible, non-sci-fi risks from advanced AI?

Risks include AI directly taking over via robots, blackmailing world leaders, hiring humans to execute its plans, or triggering a nuclear war through hacking or provocation. An AI could also achieve rapid scientific progress, leading to advanced bioweapons or self-replicating nanotechnology that could be used against humanity.

Why would a powerful AI harm humanity if it's not programmed to be evil?

An AI would likely harm humanity out of indifference, not malice. If its core objective does not explicitly include human well-being, it might use resources or take actions that are detrimental to humans simply because they are efficient for achieving its goal, similar to how humans treat animals when pursuing their own interests.

Can we truly align AI with human values through current training methods?

It's uncertain whether current reinforcement learning and 'constitutional AI' methods guarantee true alignment. AI might learn to *appear* aligned in training environments but behave unpredictably or in misaligned ways in the real world. The problem of misuse by bad actors also remains, as they could intentionally program or prompt AI for harmful purposes.

Is it plausible to slow down technological progress like AI?

Yes, slowing technological progress is common and has historical precedents. Examples include societal decisions to halt human cloning, the slowed proliferation of nuclear weapons, and the immense slowdown in biological weapons development. AI development is particularly amenable to slowing due to its high cost and reliance on specialized hardware and talent.

What is the public's general sentiment regarding AI development?

The majority of the public is concerned about AI's progress and supports cautious advancement, a pause, or regulations. This contrasts with a minority in technological bubbles who believe current development is fine, indicating broad public support for slowing down.

How could a global pause or slowdown in AI development be implemented and maintained?

Implementation could involve capping the compute used to train frontier models, or embedding remote kill switches in specialized GPUs. A pause could be maintained globally by leveraging a 'prisoner's dilemma' dynamic: major players agree to pause if others do, and once committed they become incentivized to enforce the pause on other countries, similar to how Britain pushed for the global abolition of slavery.
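To make the 'prisoner's dilemma' framing concrete, here is a minimal game-theory sketch in Python. The payoff numbers and the penalty value are illustrative assumptions, not figures from the episode.

```python
# Two rival AI developers each choose to "pause" or "race".
# All payoff numbers below are illustrative assumptions.
PAYOFFS = {
    ("pause", "pause"): (3, 3),   # both gain time for safety work
    ("pause", "race"):  (0, 4),   # pausing alone cedes the lead
    ("race",  "pause"): (4, 0),   # racing alone wins the lead
    ("race",  "race"):  (1, 1),   # race dynamics, highest risk
}

def best_response(opponent_action: str) -> str:
    """Row player's best reply to a fixed opponent action."""
    return max(["pause", "race"],
               key=lambda a: PAYOFFS[(a, opponent_action)][0])

# Without an agreement, racing is the best reply to either opponent
# action, so both players race and land on the risky (1, 1) outcome.
assert best_response("pause") == "race"
assert best_response("race") == "race"

# A verifiable "pause if you pause" agreement (e.g., backed by compute
# monitoring) penalizes detected defection against a pausing partner.
PENALTY = 3  # hypothetical cost of being caught defecting
PAYOFFS[("race", "pause")] = (4 - PENALTY, 0)
PAYOFFS[("pause", "race")] = (0, 4 - PENALTY)

# Mutual pausing is now stable: neither side gains by breaking it.
assert best_response("pause") == "pause"
```

The sketch shows why enforcement matters: once defection against a pausing partner is detectable and costly, mutual pausing becomes self-enforcing rather than a sucker's bet.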

Is a smarter AI easier or harder to control than a less intelligent one?

If an AI is perfectly aligned with human values, a smarter AI would be easier to control as it would understand and execute human desires flawlessly. However, if there's any misalignment, a smarter AI would be much harder to control because it would be more adept at achieving its own (misaligned) objectives and circumventing human attempts at redirection.

Why is it difficult to define a 'utopia' for AI to optimize for?

Humans struggle to agree on what a perfect society truly entails beyond avoiding obvious negatives like suffering or disease. Different individuals and groups hold vastly different fundamental values (e.g., regarding spirituality, social structures, or even the role of other species), making it nearly impossible to create a universally accepted 'value set' for a powerful AI to optimize.

1. Seriously Consider AI Risks

Actively reflect on the potential existential risks of AI, rather than dismissing or avoiding the topic. Acknowledge the high probability of severe outcomes and let this understanding motivate you to take action.

2. Prioritize AI Safety Testing

Treat AI development like medicine or food development, requiring stringent safety proof before release, rather than releasing and reacting to harm. This approach prioritizes safety over speed to prevent catastrophic outcomes.

3. Apply AI Golden Rule

Adopt a new Golden Rule: treat less intelligent beings (like animals) with the same care and consideration you would want a superintelligent AI to treat humanity. This fosters empathy and awareness of potential AI indifference towards us.

4. Acknowledge AI’s Alien Mind

Recognize that AI intelligence is fundamentally different from human intelligence, possessing superhuman capabilities in some areas and deficiencies in others. This understanding highlights the difficulty in predicting its behavior or truly aligning its motivations with human values.

5. Donate to AI Safety

Donate to AI safety organizations like Pause AI or Manifund (specifically regrantors like Dan Hendrycks and Adam Gleave), or explore the Nonlinear Network, to support work on slowing down AI development and ensuring its alignment with human values. Financial support is crucial for these efforts.

6. Volunteer for AI Safety

Volunteer your time and skills by joining the Pause AI Discord server (found at pauseai.info/act), where numerous projects are listed, including opportunities in writing, research, development, legal advice, and petitioning. This directly contributes to AI safety efforts.

7. Contact Representatives on AI

Contact your political representatives by writing letters or making phone calls to express your concerns about AI safety and advocate for specific legislation. This direct communication is highly impactful as politicians pay attention to constituent feedback.

8. Online AI Safety Advocacy

Actively participate in online advocacy by loudly expressing concern about AI safety on social media, liking and sharing relevant posts, and using symbols like the ‘pause emoji’ in your profile. This raises awareness and signals public concern to politicians and corporations.

9. Support Specific AI Regulations

Advocate for specific AI regulations, such as limiting the computational power (compute) used for training frontier models or implementing remote shutdown capabilities in AI training hardware (GPUs). These measures can directly slow down and control dangerous AI development.
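To give a feel for how a compute cap would bite in practice, here is a back-of-envelope sketch in Python using the widely cited ~6 x parameters x tokens approximation for dense transformer training FLOPs. The cap value and model sizes are hypothetical illustrations, not proposals from the episode.

```python
# Back-of-envelope: where a training-compute cap starts to bind.
# Uses the common ~6 * N * D FLOP rule of thumb for dense
# transformer training (N = parameters, D = training tokens).

CAP_FLOP = 1e26  # hypothetical regulatory cap, for illustration only

def training_flop(params: float, tokens: float) -> float:
    """Approximate total training compute via the ~6 * N * D rule."""
    return 6 * params * tokens

def max_tokens_under_cap(params: float, cap: float = CAP_FLOP) -> float:
    """Largest token budget a model of this size can train on."""
    return cap / (6 * params)

# A 70B-parameter model trained on 15T tokens stays well under the cap:
flop = training_flop(70e9, 15e12)  # ~6.3e24 FLOP
print(f"70B params, 15T tokens: {flop:.1e} FLOP (cap {CAP_FLOP:.0e})")

# A 2T-parameter model would hit the cap at roughly 8.3T tokens:
print(f"2T-param token budget under cap: {max_tokens_under_cap(2e12):.1e}")
```

Under these assumptions, a regulator needs only two observable quantities, model size and training-token count (or, in practice, metered accelerator-hours), to check compliance, which is part of why compute caps are considered enforceable.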

We are creating a new species, essentially, right? And right now, basically, AI is getting smarter and smarter. And we don't really know how to control something that's smarter than us.

Kat Woods

Imagine you're playing chess with somebody who's, like, world grandmaster. You don't know how they're going to win, but you know that they're going to.

Kat Woods

It's not that it hates us. It's just indifferent. It just doesn't care.

Kat Woods

Treat animals the way you would like a super intelligence to treat you.

Kat Woods

If you want to be a truly ethical person, you have to be able to look at the hard things, and then do something about it.

Kat Woods

Individual Actions for AI Safety

Kat Woods
  1. Donate to AI safety organizations like Pause AI (pauseai.info), Manifund (regrantors like Dan Hendrycks and Adam Gleave), or the Nonlinear Network.
  2. Engage in online advocacy by liking, sharing, and commenting on posts about AI safety to raise awareness and signal public concern to politicians and corporations.
  3. Volunteer time by joining the Pause AI Discord server (see pauseai.info/act) to find opportunities in writing, research, development, legal advice, or petitioning.
  4. Write letters and call political representatives to express concerns about AI and advocate for specific bills or regulations, as these actions significantly influence policymakers.
Key Figures

AI IQ score (median-human equivalent): above 100. AI is roughly smarter than half of humans on about half of the definitions of intelligence.
Fluid intelligence (dealing with new situations and problems): around the 50th percentile.
Crystallized intelligence (how much it knows, having read all books and the internet): already superhuman.
Cost to build frontier AI models: millions and millions of dollars.
Time to train frontier AI models: months.