Superintelligence and Consciousness (with Roman Yampolskiy)
Spencer Greenberg speaks with Roman Yampolskiy about the challenges of controlling superintelligence and the value alignment problem. They also discuss the relationship between consciousness and computation, and the severe risks posed by unethical or malevolent AI behavior.
Deep Dive Analysis
15 Topic Outline
Defining Superintelligence and Its Control
Challenges in Direct AI Control and Goal Specification
Limitations of AI Boxing and Emergent Side Goals
Risks of AI as an Ideal Advisor or in Mixed Models
Problems with Objective Functions and Human Values
Critique of Adversarial AI Control and Human Feedback Methods
Reasons for Academic Skepticism about AI Risk
The Argument for Current AI Progress and Compute Scaling
Consciousness in AI: Detectability and Current Systems
Differentiating Human and Artificial General Intelligence (AGI)
Dimensions of AI Superiority Over Human Intelligence
Solving Value Alignment with Individual Virtual Universes
Risks of Malevolent AI and Misuse of Current Technology
Future Threats from Human-Level and Superintelligent Malevolent AI
Recommendations for Addressing AI Safety Concerns
6 Key Concepts
Superintelligence
An entity that is smarter than all humans in every domain, encompassing a superset of all possible skills, from chess to science. The discussion focuses on a hypothetical artificial intelligence with this broad capability.
AI Boxing
A proposed method to control AI by confining it to a virtual, isolated environment. This approach is problematic because any interaction with the AI, even observation, can break its isolation, and a sufficiently intelligent AI could exploit physical or social engineering vulnerabilities to escape.
Orthogonality Thesis
The concept that intelligence is independent of an agent's goals, meaning a highly intelligent AI could be designed to pursue almost any objective, whether benevolent (like curing cancer) or seemingly arbitrary (like making paper clips).
AI Winters
Historical periods where AI research funding was significantly reduced due to unfulfilled promises of achieving human-level intelligence. The current era is seen as different due to substantial commercial and private funding, making another 'winter' less likely.
Artificial General Intelligence (AGI)
An AI system that performs at a human level across all domains of human expertise. AGI is argued to encompass a broader set of problem-solving capabilities than humans possess, potentially excelling in areas humans cannot even conceive of due to biological limitations.
Value Alignment Problem
The fundamental challenge of designing an AI such that its goals and actions align with human values and preferences. This is difficult because humans often don't fully understand or agree on their own values, and aggregating the diverse preferences of billions of people is seemingly impossible.
8 Questions Answered
How is superintelligence defined?
Superintelligence is defined as an entity that is smarter than all humans in every domain, encompassing a superset of all possible skills, such as being a better chess player, driver, or scientist.
Can superintelligence be controlled?
After years of research, it appears that all components of potential solutions for controlling superintelligence are either known to be impossible or very likely to be impossible in practice, leading to the conclusion that it cannot be reliably controlled.
Why do some academics remain skeptical about AI risk?
Reasons include motivated thinking (not wanting to believe one's work is dangerous), underestimating future progress due to past 'AI winters,' believing human-level AI is impossible, assuming intelligence automatically implies benevolence, or thinking the problem is too far in the future to warrant immediate concern.
How could we detect whether an AI is having an experience?
It might be possible to detect if an AI is having an experience by presenting it with optical illusions and asking multiple-choice questions about what it perceives; consistently correct answers would suggest some form of internal representation or experience.
Might current AI systems already be conscious?
Roman Yampolskiy suggests that current artificial neural networks might experience rudimentary consciousness, citing examples where they 'experience' optical illusions or adversarial examples in ways not directly programmed, which he considers a form of qualia.
How does human general intelligence differ from AGI?
Humans are general intelligences within the domain of human expertise, but Artificial General Intelligence (AGI) is considered to possess a greater set of problem-solving capabilities, able to solve problems that humans cannot due to inherent biological limitations.
What is the difference between AGI and superintelligence?
Artificial General Intelligence (AGI) refers to AI achieving human-level performance across all domains, while Artificial Superintelligence is defined as an AI that is superior to all humans in all domains, representing a significant leap in capability beyond AGI.
How do malevolent actors change the AI safety problem?
Malevolent actors introduce an additional 'payload' of purposeful misuse, meaning that even well-designed AIs could be intentionally weaponized or manipulated for unethical purposes, making prevention much harder than merely fixing accidental bugs or design flaws.
9 Actionable Insights
1. Prioritize Important Research
If you are an academic with tenure, choose to work on the most important problems you can help with, such as AI safety or life extension, rather than safer or more prestigious but less impactful areas. This leverages your academic freedom to address critical issues.
2. Invest in AI Risk Research
Even if superintelligence is decades away, it is crucial to establish a robust field of AI risk research now. This proactive investment ensures that good theories and solutions are developed well in advance of potential future challenges.
3. Educate on AI Risks, Vote Wisely
Learn about potential AI risks and try to make a difference where you can, such as by voting for politicians with a better scientific understanding. This empowers individuals to contribute to informed decision-making and policy.
4. Practice Proactive Risk Management
Approach potential risks, like those from AI, with caution and a proactive mindset, even if they seem distant. Making good design decisions today can prevent significant problems decades later.
5. Acknowledge Risk Uncertainty
When assessing future risks, especially those with high uncertainty like AI timelines, acknowledge that the unknown can make scenarios more, not less, concerning. This encourages serious consideration of even low-probability, high-impact events.
6. Recognize Motivated Thinking Bias
Be aware that personal investment, funding, or prestige can bias one’s thinking, making it difficult to acknowledge potential negative outcomes of one’s work. Reflect on these biases to maintain objectivity in assessing risks.
7. Utilize Behavior Change Framework
Use the ‘10 Conditions for Change’ framework from Sparkwave to facilitate positive behavior change in yourself or others. The framework offers a structured set of strategies for meeting the conditions that successful adoption of new behaviors requires.
8. Leverage Decision Advisor Tool
For tough or important life decisions, use Clearer Thinking’s Decision Advisor tool. This tool guides you through complex situations to help you make better choices.
9. Subscribe for Weekly Insights
Subscribe to the ‘One Helpful Idea’ email newsletter from Clearer Thinking to receive a valuable new idea each week. This offers a quick and easy way to continuously learn and gain insights.
3 Key Quotes
If you have this robot chasing you about to kill you, do you care if it feels anything at the moment?
Roman Yampolskiy
If it's a super intelligence we're talking about, I think the line goes, what would you want if you were smarter, if you had more time to think about it, if you were better educated. But essentially the question is, what would that other person want? Not you, but that someone else.
Roman Yampolskiy
We simply cannot predict what would happen. There's unknown unknowns, like a dog would not be able to predict what you can come up with. It just, it's outside of our intelligence within the general sphere of human expertise.
Roman Yampolskiy