Taking pleasure in being wrong (with Buck Shlegeris)
Spencer Greenberg speaks with Buck Shlegeris about maintaining rational beliefs, asking the right questions, and machine learning and AI alignment. Buck shares insights on intellectual humility, effective learning strategies, and his organization, Redwood Research, which tackles AI safety by solving analogous problems.
Deep Dive Analysis
16 Topic Outline
Distinction: Other People Being Wrong vs. You Being Right
The Difficulty of Forming Accurate, Specific Beliefs
The Problem of Winning Arguments Easily
Practicing Humility by Engaging with Experts
The Role of Pain Tolerance in Rationality
Writing for Falsification vs. Defensibility
Using Oversimplified Frames for Understanding
Fields Requiring Factual Knowledge vs. Intuition
Learning Through Case Studies and Examples
Drilling Small Skills for Faster Learning
Seeking Out the Smallest Questions in a Field
Distinguishing True Understanding from Cargo Culting
Redwood Research: Applied AI Alignment
Evolution of Core Problems in Machine Learning
Variance Reduction Techniques in Machine Learning
AI Alignment: Scaling Capacities for Future Deployment
7 Key Concepts
Other People Being Wrong vs. You Being Right
This distinction highlights that it's relatively easy to identify widespread false beliefs, but significantly harder to develop confident, specific, and correct beliefs of one's own. People often mistakenly assume their understanding is accurate simply because they can spot errors in mainstream opinions.
Writing for Falsification
An approach to writing where the primary goal is to state claims as bluntly and directly as possible, maximizing the contrast with common beliefs. This method aims to make it easy for others to identify and critique potential mistakes, rather than hedging or trying to make claims unassailable.
Using Oversimplified Frames
A cognitive strategy for understanding complex topics by temporarily adopting an extremely simplified or exaggerated framework. The idea is to interpret all relevant information through this frame to test its limits and identify where it needs refinement or concessions, rather than immediately incorporating all nuances.
Drilling Small Skills
A learning methodology that advocates for becoming highly proficient and fast at very small, minute-long sub-skills within a larger skill set. This capitalizes on humans' ability to learn efficiently from rapid feedback loops and high repetition, leading to increased overall productivity and better retention of mental state.
Seeking the Smallest Question
A learning strategy that involves actively searching for the most basic-sounding question in a given field that one cannot answer. This practice helps uncover and solidify foundational gaps in one's knowledge, aiming to build a robust understanding from the ground up.
Cargo Culting
A term describing the act of imitating the superficial actions or outward forms of a practice without truly understanding the underlying principles, reasons, or internal mechanisms that make it effective. This results in an appearance of competence without actual functional benefit.
Variance Reduction
A fundamental principle in machine learning, often seen in techniques like importance sampling, where the goal is to decrease the noise or variability in the estimation or learning process. By reducing variance, models can learn faster and more accurately from the same amount of data, even though the expected value of the estimate stays the same.
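As a rough illustration of variance reduction through importance sampling, here is a minimal sketch that is not taken from the episode. It estimates a rare tail probability, P(X > 4) for a standard normal X, in two ways. The function names, the threshold, the sample size, and the choice of a shifted normal as the sampling distribution are all assumptions made for this example.

```python
import math
import random


def plain_estimate(threshold, n, rng):
    """Plain Monte Carlo: fraction of N(0, 1) draws above `threshold`."""
    hits = sum(1 for _ in range(n) if rng.gauss(0.0, 1.0) > threshold)
    return hits / n


def importance_estimate(threshold, n, rng):
    """Draw from N(threshold, 1) and reweight each hit by p(y) / q(y)."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(threshold, 1.0)
        if y > threshold:
            # Density ratio of N(0, 1) to N(threshold, 1), evaluated at y.
            total += math.exp(-threshold * y + 0.5 * threshold * threshold)
    return total / n


# Hypothetical settings chosen only for illustration.
threshold = 4.0
n = 20_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Exact tail probability of a standard normal, for comparison.
true_value = 0.5 * math.erfc(threshold / math.sqrt(2.0))
plain_value = plain_estimate(threshold, n, rng)
weighted_value = importance_estimate(threshold, n, rng)

print(f"exact      {true_value:.3e}")
print(f"plain      {plain_value:.3e}")
print(f"reweighted {weighted_value:.3e}")
```

Both estimators have the same expected value. With this few draws the plain one typically sees zero or a handful of hits, so its answer is very noisy, while the reweighted one concentrates its samples where the event actually happens and lands much closer to the exact value.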
7 Questions Answered
It's not very hard to notice when widespread beliefs are false or dumb, but it's much harder to form confident, specific, and correct beliefs of your own, as identifying others' errors doesn't automatically imply your own understanding is accurate.
Practicing being wrong and experiencing it in undeniable situations is a crucial skill because a lack of this ability can lead to accumulating false beliefs that become painful to update away from later in life.
One effective method is to study subjects in which you are an amateur and then engage in discussions with experts in those fields, providing valuable practice in making claims in a context where errors are easily and immediately corrected, thus fostering humility.
Instead of hedging or trying to make claims unassailable, it is useful to state strong opinions bluntly and directly, aiming for easy falsification. This approach clearly highlights points of potential disagreement, making it easier for others to provide targeted criticism.
The core problem in machine learning is primarily an engineering challenge, involving difficult execution of experiments, slow feedback loops, high computational expense, and the inherent difficulty of determining if code changes are correct or if a new model architecture is truly effective.
Ensuring AI safety involves building capacities for interpretability (probing models to understand their internal workings) and red-teaming (aggressively searching for scenarios where models exhibit undesirable behaviors), practicing these skills on smaller, analogous problems today.
When a fixed sampling budget is split across populations, allocating more samples to the population with higher variance gives a more accurate overall estimate. Each extra sample has diminishing marginal returns, so samples reduce the overall uncertainty most effectively where the data is more spread out.
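The sampling point above can be checked with a short calculation. The following is a minimal sketch, assuming two equally weighted populations with known standard deviations of 1 and 9 and a budget of 100 samples. These numbers and the function names are hypothetical. The allocation rule used here (samples in proportion to weight times standard deviation) is the standard Neyman allocation from stratified sampling, swapped in for illustration rather than quoted from the episode.

```python
def estimator_variance(weights, sigmas, counts):
    """Variance of the stratified mean: sum of W_h^2 * sigma_h^2 / n_h."""
    return sum(w * w * s * s / n for w, s, n in zip(weights, sigmas, counts))


def neyman_allocation(weights, sigmas, budget):
    """Split `budget` in proportion to W_h * sigma_h (counts left unrounded)."""
    scores = [w * s for w, s in zip(weights, sigmas)]
    total = sum(scores)
    return [budget * score / total for score in scores]


# Hypothetical populations: equal weight, very different spread.
weights = [0.5, 0.5]
sigmas = [1.0, 9.0]
budget = 100

equal_counts = [budget / len(weights)] * len(weights)
neyman_counts = neyman_allocation(weights, sigmas, budget)

equal_var = estimator_variance(weights, sigmas, equal_counts)
neyman_var = estimator_variance(weights, sigmas, neyman_counts)

print("equal split :", equal_counts, equal_var)
print("by variance :", neyman_counts, neyman_var)
```

Under these assumptions, an even 50/50 split gives an estimator variance of about 0.41, while sending 90 of the 100 samples to the high-variance population brings it down to about 0.25. The 1/n term in the formula is where the diminishing returns come from: the 51st sample from the low-variance population removes far less uncertainty than the 51st sample from the high-variance one.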
13 Actionable Insights
1. Exaggerate New Belief Frames
When developing new beliefs or theories, oversimplify and exaggerate them internally, temporarily behaving as if they are 100% accurate. This helps test the theory’s correctness by seeing how often it needs concessions when interpreting new information.
2. Seek Foundational Knowledge Gaps
Actively seek out the simplest-sounding questions in a field that you don’t know the answer to, even if they seem basic. This helps uncover holes in your foundational knowledge and build a more solid understanding of the subject.
3. Identify True vs. Illusory Understanding
Be vigilant about whether you truly understand a concept or are merely ‘cargo culting’ techniques without internalizing the underlying principles. Actively try to explain concepts, draw diagrams, or apply methods to reveal gaps in your comprehension.
4. Seek Expert Feedback as Amateur
To maintain intellectual humility and calibrate your confidence, intentionally study subjects where you are an amateur and engage with experts in those fields. This provides practice in being wrong in situations where your errors are undeniable.
5. Write for Easy Falsification
When writing about complex topics, state your opinions as bluntly and directly as possible, aiming for easy falsification rather than defensibility. This maximizes clarity and makes it easier for others to identify and critique potential mistakes.
6. Separate Others’ Errors from Your Truth
Recognize that noticing widespread false beliefs is easier than forming confident, specific, and correct beliefs of your own. Avoid the jump from ‘mainstream opinion is wrong’ to ‘my specific understanding is accurate,’ and instead commit to deeper investigation.
7. Cultivate Pain Tolerance for Truth
Understand that achieving accurate beliefs often requires enduring the discomfort of being wrong or receiving criticism. Maintain constant vigilance against becoming too afraid of this pain, as it can lead to accumulating unexamined false beliefs.
8. Begin Analysis with Rational Models
When analyzing complex systems, especially in fields like economics, start by modeling everything as if all agents are perfectly rational. This helps identify what truly needs explanation by less predictive theories, preventing premature jumps to behavioral economics.
9. Master Fast, Small-Scale Skills
When learning a new skill, prioritize getting very fast at sub-skills that take less than a minute to perform. This strategy leverages faster feedback loops for more repetitions and better retention, significantly boosting overall productivity.
10. Solve Analogous Future Problems
To address grave future risks (e.g., from AI), identify analogous problems that exist today and solve them with techniques that resemble the ones expected to be needed later. This provides concrete, iterative feedback and builds solutions that are more likely to scale.
11. Practice Future Critical Capacities
Identify critical capacities needed when deploying powerful future systems (e.g., interpretability, red teaming) and conduct ‘fire drills’ by practicing them on current, less critical problems. This builds necessary skills and processes in advance.
12. Prioritize ML Execution & Tools
Recognize that machine learning’s primary challenge is often difficult engineering and execution rather than just intuition. Dedicate more time to building tools, implementing diagnostics, and ensuring efficient, well-tested code for basic tasks.
13. Post Work for Public Criticism
Share your work publicly (e.g., essays on social media) to receive diverse and challenging criticism. This helps calibrate your confidence, identify flaws in your arguments, and build more robust beliefs over time.
6 Key Quotes
It's not very hard to notice ways in which almost everyone is being pretty dumb about things, or it's not that hard to notice ways in which widespread beliefs are quite false. But it's a lot harder to come to confident and specific, correct beliefs about the world.
Buck Shlegeris
It turns out that making more sense than other people is not the bar for being accurate about things. And you need to hold yourself to a higher standard.
Buck Shlegeris
I think that to some extent, it's easy to wade into psychology, because psychology is kind of a shallow field, or I wildly hypothesize this.
Buck Shlegeris
I think that people find it easier to remember something for one minute than for two minutes. And so if you're in the middle of solving a problem and then you have to write these five lines of code, I think you lose a lot more of your state if it takes you two minutes rather than one minute to write those five lines of code.
Buck Shlegeris
I think that it's a lot more productive to really seek out questions that are as simple sounding as possible while still being really hard to answer or still like demonstrating that there's something you don't understand about this subject.
Buck Shlegeris
My sense with machine learning now is that the core problem is that it's just a very difficult engineering challenge.
Buck Shlegeris
3 Protocols
Learning by Drilling Small Skills
Buck Shlegeris
- Identify a larger skill you want to learn (e.g., programming, machine learning).
- Break the larger skill down into very small sub-skills that can be performed in less than a minute.
- Practice these small sub-skills repeatedly and rapidly to achieve high speed and proficiency.
- Leverage the fast feedback loops from these quick repetitions to accelerate your learning and improve retention of mental state.
- For machine learning, this might involve building the code for a full model (e.g., GPT-2) and comparing it to a reference implementation, rather than solely focusing on the complex and slow process of training.
Learning by Seeking the Smallest Question
Buck Shlegeris
- When studying a subject, actively search for the simplest-sounding question within that field that you cannot answer.
- Use these simple questions that you cannot yet answer to pinpoint and address gaps in your foundational understanding.
- Aspire to define classes of problems for which you can consistently develop decision procedures, thereby building conceptual clarity and a solid knowledge base.
Redwood Research's Applied AI Alignment Strategy
Buck Shlegeris
- Identify potential future problems where powerful AI could pose grave risks to humanity.
- Find the most analogous and concrete problems that are occurring in AI development today.
- Identify the technical difficulties in current AI that are most analogous to those expected to cause existential risk later.
- Develop solutions for these current, analogous problems using techniques that are most likely to scale and be effective for future, more powerful systems.
- Focus on building practical capacities, such as interpretability (understanding what models are doing) and red-teaming (stress-testing models for bad behaviors), which will be crucial when deploying highly advanced AI systems.