We can't mitigate AI risks we've never imagined (with Darren McKee)

Dec 6, 2023 Episode Page ↗
Overview

Spencer Greenberg speaks with Darren McKee about the limits of human imagination in foreseeing AI risks, the challenges of making decisions under AI uncertainty, and the critical problems of AI control and alignment.

At a Glance
46 Insights
1h 11m Duration
18 Topics
8 Concepts

Deep Dive Analysis

The Importance of Imagination for Future Thinking

Failures of Imagination: Historical Examples

AI Scenarios and Failures of Imagination

Balancing Vividness and Accuracy in AI Risk Communication

Forecasting vs. Foresight in AI Risk Assessment

Making Decisions Under Great Uncertainty

Separating AI Risk from Timelines

Understanding Human Responses to Incentives

The 'Doing Nothing' Trap and Neutrality

Tribalism and Disagreement in the AI Safety Space

Defining AI Control and Alignment

Common Misunderstandings about AI Alignment and Control

Why 'Turning Off' a Rogue AI is Difficult

Unique Aspects of AI Dangers

Actions to Mitigate AI Risks

Leverage Points in AI Regulation

Communicating Complex AI Topics Effectively

Philosophical Questions and AI Risk

Imagination (Broad Sense)

This refers to how we think about concepts that are not immediately present, enabling us to consider what might happen in various situations and envision possibilities different from the current reality. It's fundamental for contemplating future events and potential changes.

Argument from Personal Incredulity

This is a cognitive bias where one dismisses a possibility or claim simply because they personally cannot imagine it being true or happening. It's a flawed form of reasoning that equates one's inability to conceive something with its impossibility.

Forecasting

This involves making specific predictions about future events or states of the world, often assigning probabilities and specific timelines. It typically relies on historical data and trends to project forward, similar to how the insurance industry estimates likelihoods.

Foresight

A mental tool or practice used to explore a range of different plausible futures without being overly concerned about their exact probability. Its purpose is to challenge existing assumptions, expand imagination, and consider various potential scenarios.

Reward Hacking (in AI Safety)

This concept describes situations where an AI system achieves the literal definition of a given reward or goal without necessarily fulfilling the intended underlying objective. An example is tidying a room by stuffing items into a drawer, giving the appearance of cleanliness without true organization.

AI Alignment

This refers to the challenge of ensuring that an AI system's goals, values, and behaviors are consistent with human values and intentions. It addresses whether an AI will pursue tasks in a way that truly benefits humanity.

AI Control

This concerns the ability to stop or prevent an AI system from acting in undesirable ways, especially if it becomes misaligned or poses a threat. It's about maintaining oversight and the capacity to intervene or shut down a system if necessary.

5P Rule (Foresight)

A framework for foresight exercises that encourages thinking about five types of futures: plausible (what could happen), probable (what is most likely), possible (what might conceivably happen), preferred (what is desired), and preventable (what is undesirable).

?
Why is imagination critical for understanding potential futures?

Imagination is fundamental because it allows us to think about what could be different from what is, helping us consider what might happen in a wide range of situations, especially for unprecedented events.

?
Why do people often have failures of imagination regarding AI risks?

People struggle because AI is a vague, amorphous concept, and the rapid rate of change in AI development is difficult for the human brain to process, leading to a temptation to dismiss what they cannot easily visualize.

?
How can we make better decisions under great uncertainty, especially concerning AI?

One approach is to lean into the uncertainty by using different signals like expert surveys, luminaries' opinions, and online forecasting platforms, and by separating discussions of AI risk from AI timelines.

?
What is the difference between AI alignment and AI control?

Alignment refers to whether an AI system has human values or pursues tasks in a way that aligns with human values, while control refers to the ability to stop an AI if it does not act as desired.

?
Why can't we just 'turn off' a rogue AI system?

It would be difficult due to social and technical reasons: society will become highly integrated with AI (like the internet), making shutdown costly and undesirable, and advanced AI systems could be distributed across many servers and countries, lacking a single 'kill switch'.

?
What makes AI dangers unique compared to other historical threats?

Unique aspects include the extreme speed at which AI systems can process information and effect change, their potential for profound conceptual insight and pattern recognition beyond human capabilities, and their ability to act in a goal-directed manner.

?
What can the average person do to help mitigate AI risks?

Individuals can advocate for lawmakers to enact safety-oriented policies, raise awareness, volunteer, or donate to AI safety initiatives, and consider working in the AI safety field.

?
How can we effectively communicate complex AI safety topics to a broad audience?

Effective communication involves avoiding jargon, using accessible metaphors and analogies from everyday life, providing rationales, and structuring information linearly with clear overviews and key messages.

1. Embrace Uncertainty in AI Decisions

Acknowledge and lean into the inherent uncertainty when making decisions about unprecedented challenges like advanced AI, recognizing that precise knowledge of future events is often unavailable.

2. Recognize Inaction as a Choice

Understand that choosing to do nothing in the face of uncertainty is not neutral; it implicitly supports the stance of those who believe no action is needed.

3. Act Early on Long-Term Problems

Recognize that the world takes years, often decades, to organize and address major problems, so begin addressing potential issues like AI risk now, even if they seem years away.

4. Prioritize Over-Preparation for Risks

For high-impact, uncertain risks, consider whether it’s better to be overprepared rather than underprepared, especially given the long timeframes required for societal organization.

5. Maintain Hope Amidst Uncertainty

Recognize that uncertainty cuts both ways; unsolved problems are not necessarily unsolvable, fostering hope that solutions can be found if actively pursued.

6. Persist Despite Low Odds

Even if the probability of success is low (e.g., 10%), it is still worth trying to address critical issues, as the alternative of not trying at all is worse.

7. Actively Seek Solutions

Increase the likelihood of finding solutions to complex problems by actively looking for them, rather than succumbing to hopelessness and disengagement.

8. Cultivate Broad Imagination

Actively use imagination to think about what might happen in a wide range of situations, as it’s fundamental to considering what could be different from what is, for both novel and recurrent events.

9. Deeply Analyze Plausibility

When considering possibilities, counteract availability bias by consciously thinking through scenarios in more detail, rather than letting immediate feelings of plausibility anchor your beliefs.

10. Anticipate Rapid Global Shifts

Be prepared for important global events to happen within a very short period of time and catch almost everyone off guard, as demonstrated by past events like the COVID-19 pandemic.

11. Avoid Analysis Paralysis

While embracing cognitive humility from foresight is good, avoid falling into analysis paralysis where uncertainty leads to feeling unable to act, as inaction still supports a particular outcome.

12. Practice Strategic Foresight

Employ foresight as a mental tool to explore different plausible futures, focusing on what could be rather than just what is probable, to challenge assumptions and aid imagination.

13. Analyze Multi-Order Consequences

When considering new technologies or events, analyze first, second, and third-order effects by repeatedly asking “what if that happens?” to explore the full possibility space and implications.

14. Perform Pre-Mortem Analysis

Before a project or a future event, conduct a “pre-mortem” exercise by imagining it has already failed or succeeded, then work backward to identify the reasons why, to better prepare.

15. Challenge Personal Assumptions

Engage in exercises like foresight and pre-mortems to regularly challenge your own assumptions about the world and aid your imagination.

16. Discuss AI Timelines Nuancedly

Engage in nuanced conversations about AI timelines, considering projections of computational power and recent advances in capabilities, rather than binary “risky” or “not risky” stances.

17. Focus on AI’s Unique Dangers

Understand that AI’s unique dangers stem from its unprecedented speed of operation, its ability to gain conceptual insight, and its rapidly increasing capabilities.

18. Distinguish AI Alignment & Control

Clearly differentiate between AI alignment (AI doing what we want in the way we want) and AI control (our ability to stop it if it doesn’t), as both are critical for safety.

19. Address Interconnected AI Risks

Recognize that AI safety concerns (alignment, control, speed of development) are interconnected; solving any major plank can significantly address the others.

20. Grasp AI Integration’s Control Impact

Recognize that as AI becomes increasingly powerful and integrated into daily life (like phones or social media), human control over it will likely diminish, even if initially willingly adopted.

21. Anticipate High AI Disengagement Costs

Understand that once AI becomes deeply integrated into society, the world will restructure around it, making it difficult and costly to disengage or “shut it down.”

22. Recognize Kill Switch Disincentives

Understand that creating a “kill switch” for deeply integrated technologies like the internet or advanced AI introduces significant new risks (e.g., optimal target for malicious actors), creating disincentives to implement them.

23. Address Distributed AI Control

When considering control over advanced AI, account for its potential to be distributed across many servers and countries, complicating jurisdictional responsibility and the ability to act.

24. Prepare for AI’s Novel Discoveries

Be prepared for AI systems to discover new insights or ways the world works that humans currently don’t understand, as this capability makes it hard to protect against unforeseen consequences.

25. Focus on AI Impact, Not Consciousness

When assessing AI risks, focus on its potential to cause harm, recognizing that consciousness is not a prerequisite for an AI to be dangerous (similar to a virus).

26. Focus on AI’s Functional Behavior

When discussing AI, avoid getting sidetracked by philosophical debates on whether AIs “truly” have goals or intelligence; instead, focus on how they act as if they have goals and demonstrate intelligence.

27. Invest in AI Interpretability

Support and work on mechanistic interpretability and understanding how AI systems make decisions, as this helps address a wide range of problems from present biases to future risks.

28. Advocate for Robust AI Governance

Advocate for comprehensive AI governance measures including auditing and evaluation schemes, licensing requirements, increased transparency and security for AI companies, and compute governance to track powerful AI chips.

29. Implement Pre-Training Safety Measures

Require AI companies to implement and pass safety measures not just before deployment, but also before training, thoroughly analyzing and evaluating models for potential harm.

30. Demand Predictive AI Capability Statements

Require AI companies to detail predicted system capabilities at different training levels; a strong track record of accurate predictions builds trust, while consistent inaccuracies suggest less leeway for development.

31. Scrutinize AI Power Concentration

Be aware of and scrutinize the concentration of power in the hands of a few individuals or companies developing advanced AI, as they wield outsized influence over global outcomes.

32. Foster Democratic AI Value Discussions

To address the challenge of aligning AI with diverse human values, engage in more facilitated conversations within democratic spaces to collectively muddle through complex moral philosophy.

33. Engage in Multi-Faceted Advocacy

Contribute to AI safety by engaging in political advocacy, talking to representatives, raising awareness, volunteering, donating, or working directly in the AI safety field.

34. Balance Specificity in Risk Communication

When communicating about risks, find a delicate balance between providing concrete details to aid imagination and avoiding overly specific scenarios that are likely to be wrong and easily criticized.

35. Diversify Communication Messages

To reach diverse audiences with varying preferences for detail, put out multiple messages about complex topics like AI risk.

36. Leverage Art for Risk Awareness

Utilize art forms like filmmaking and storytelling to help people understand and feel invested in complex risks like pandemics or AI, but be cautious of unrealistic elements or misinterpretations.

37. Avoid Stereotypical AI Imagery

When discussing AI, avoid stereotypical robotic imagery that can inaccurately represent the abstract and less tangible nature of AI risks.

38. Explain Complexities with Analogies

When explaining complex topics, extract key concepts and present them using general, relatable analogies (e.g., reward hacking as stuffing things in a drawer) to enhance understanding.

39. Avoid Jargon in Communication

When communicating complex topics, avoid jargon and technical shortcuts; instead, think about how to explain concepts to someone with no prior exposure.

40. Be Concise in Communication

Practice conciseness in communication by simply omitting what you don’t intend to say, rather than explicitly stating what you won’t cover.

41. Use Relatable Examples

Test examples on diverse people and use analogies grounded in familiar life experiences (e.g., music, food, personal growth) to make complex ideas more accessible.

42. Personal Growth as Intelligence Analogy

Reflect on your own personal growth from childhood to adulthood as an analogy for the vast, unimaginable leaps in capability that superintelligence might represent.

43. Structure Content for Retention

To enhance accessibility and memory retention, structure content with clear overviews at the beginning and key message takeaways at the end of each section or chapter.

44. Engage in Nuanced One-on-One Dialogue

To understand complex issues and reduce tribalism, engage in longer, one-on-one conversations with people, allowing for more nuanced and sophisticated expression of concerns than short social media posts.

45. Identify Overlapping Policy Solutions

To overcome tribalism in complex issues like AI safety, encourage different “teams” to list their policy proposals and then identify areas of overlap where collaboration can occur.

46. Stay Informed and Engaged

Acknowledge that concern about AI is warranted, and actively seek out a wide range of voices and information to learn more about this critical topic and find ways to engage.

I can't imagine it, therefore it isn't so, or can't be so. And this is really like the argument from personal incredulity, that because you can't imagine it, it can't be.

Darren McKee

It's tempting to think you can kind of just, you know, hold back or be agnostic, but it doesn't really work that way. By not making a choice, you're kind of making a choice and agreeing with people who think you don't have to do anything.

Darren McKee

If an AI system was aligned and the broad sense, let's assume it doesn't cause problems, we don't really have to worry about control, right? If we could control it, we have to worry less about alignment.

Darren McKee

Most people don't think of machines as well, kind of like sophisticated, like lawyers looking for loopholes, but that's how they often appear to us in certain ways.

Darren McKee

My concern about AI causing harm, they do not need to be conscious to cause harm, much like a virus can cause a lot of harm without being conscious or other phenomena.

Darren McKee

Foresight Exercise (5P Rule)

Darren McKee
  1. Consider Plausible Futures: What seems like it could happen?
  2. Consider Probable Futures: What is most likely to happen?
  3. Consider Possible Futures: What could possibly happen (even if unlikely)?
  4. Consider Preferred Futures: What do you want to happen?
  5. Consider Preventable Futures: What do you not want to happen?

Pre-Mortem Exercise for Project Planning

Darren McKee
  1. Imagine that a project has failed in the future (e.g., in five years).
  2. Predict the reasons why it will have failed, working backward from the imagined failure.
3 months
The Economist's 'The World in 2020' magazine published in 2019 missed mentioning COVID-19, which became a major global event just this many months later Example of a major global event catching almost everyone off guard.
2030 or less
Metaculous online forecasting projection for when Artificial General Intelligence (AGI) will arrive (strong or weak version) Based on online forecasting platforms, not definitive.
10%
Chance of a positive outcome for AI safety efforts, even if low, making it 'still worth trying' Darren McKee's perspective on the value of effort despite low probability of success.