Taking pleasure in being wrong (with Buck Shlegeris)

Jun 8, 2022
Overview

Spencer Greenberg speaks with Buck Shlegeris about maintaining rational beliefs, asking the right questions, and machine learning and AI alignment. Buck shares insights on intellectual humility, effective learning strategies, and his organization, Redwood Research, which tackles AI safety by solving analogous problems that exist today.

At a Glance
13 Insights
1h 16m Duration
16 Topics
7 Concepts

Deep Dive Analysis

Distinction: Other People Being Wrong vs. You Being Right

The Difficulty of Forming Accurate, Specific Beliefs

The Problem of Winning Arguments Easily

Practicing Humility by Engaging with Experts

The Role of Pain Tolerance in Rationality

Writing for Falsification vs. Defensibility

Using Oversimplified Frames for Understanding

Fields Requiring Factual Knowledge vs. Intuition

Learning Through Case Studies and Examples

Drilling Small Skills for Faster Learning

Seeking Out the Smallest Questions in a Field

Distinguishing True Understanding from Cargo Culting

Redwood Research: Applied AI Alignment

Evolution of Core Problems in Machine Learning

Variance Reduction Techniques in Machine Learning

AI Alignment: Scaling Capacities for Future Deployment

Other People Being Wrong vs. You Being Right

This distinction highlights that it's relatively easy to identify widespread false beliefs, but significantly harder to develop confident, specific, and correct beliefs of one's own. People often mistakenly assume their understanding is accurate simply because they can spot errors in mainstream opinions.

Writing for Falsification

An approach to writing where the primary goal is to state claims as bluntly and directly as possible, maximizing the contrast with common beliefs. This method aims to make it easy for others to identify and critique potential mistakes, rather than hedging or trying to make claims defensible.

Using Oversimplified Frames

A cognitive strategy for understanding complex topics by temporarily adopting an extremely simplified or exaggerated framework. The idea is to interpret all relevant information through this frame to test its limits and identify where it needs refinement or concessions, rather than immediately incorporating all nuances.

Drilling Small Skills

A learning methodology that advocates becoming highly proficient and fast at very small sub-skills, each taking under a minute to perform, within a larger skill set. This capitalizes on humans' ability to learn efficiently from rapid feedback loops and high repetition, leading to increased overall productivity and better retention of mental state.

Seeking the Smallest Question

A learning strategy that involves actively searching for the most basic-sounding question in a given field that one cannot answer. This practice helps uncover and solidify foundational gaps in one's knowledge, aiming to build a robust understanding from the ground up.

Cargo Culting

A term describing the act of imitating the superficial actions or outward forms of a practice without truly understanding the underlying principles, reasons, or internal mechanisms that make it effective. This results in an appearance of competence without actual functional benefit.

Variance Reduction

A fundamental principle in machine learning, often seen in techniques like importance sampling, where the goal is to decrease the noise or variability in the estimation or learning process. By reducing variance, models can learn faster and more accurately from data, even if the expected outcome remains the same.
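The importance-sampling idea mentioned above can be illustrated with a minimal sketch (not code from the episode): a rare, large payoff is estimated first by plain Monte Carlo, then by sampling from a proposal concentrated on the important region and reweighting so the estimate stays unbiased.

```python
import random
import statistics

random.seed(0)

# Goal: estimate E[f(X)] for X ~ Uniform(0, 1), where f is zero except
# on a small region. Plain Monte Carlo wastes most of its samples.
def f(x):
    return 100.0 if 0.99 <= x <= 1.0 else 0.0  # rare, large payoff

N = 10_000

# Plain Monte Carlo estimate.
plain = [f(random.random()) for _ in range(N)]

# Importance sampling: draw from a proposal q = Uniform(0.9, 1.0) that
# concentrates on the important region, then reweight each sample by
# p(x)/q(x) = 1/10 on q's support so the estimator remains unbiased.
def sample_q():
    return 0.9 + 0.1 * random.random()

weighted = [f(sample_q()) * (1.0 / 10.0) for _ in range(N)]

# Both estimators target the true value 1.0, but the importance-sampled
# estimate fluctuates far less from run to run.
print(statistics.mean(plain), statistics.stdev(plain))
print(statistics.mean(weighted), statistics.stdev(weighted))
```

The expected value is identical under both schemes; only the spread of the estimator changes, which is exactly the "same expected outcome, lower variance" point above.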

How hard is it to arrive at true beliefs about the world?

It's not very hard to notice when widespread beliefs are false or dumb, but it's much harder to form confident, specific, and correct beliefs of your own, as identifying others' errors doesn't automatically imply your own understanding is accurate.

Why is it important to practice being wrong?

Practicing being wrong and experiencing it in undeniable situations is a crucial skill because a lack of this ability can lead to accumulating false beliefs that become painful to update away from later in life.

How can one improve their ability to argue and reason effectively?

One effective method is to study subjects in which you are an amateur and then discuss them with experts in those fields. This gives valuable practice at making claims in a context where your errors are quickly and undeniably corrected, which fosters humility.

What is a useful approach for writing about complex topics?

Instead of hedging or trying to make claims defensible, it is useful to state strong opinions bluntly and directly, aiming for easy falsification. This approach clearly highlights points of potential disagreement, making it easier for others to provide targeted criticism.

What is the core problem in machine learning research today?

The core problem in machine learning is primarily an engineering challenge, involving difficult execution of experiments, slow feedback loops, high computational expense, and the inherent difficulty of determining if code changes are correct or if a new model architecture is truly effective.

How can one ensure an AI model is safe to deploy?

Ensuring AI safety involves building capacities for interpretability (probing models to understand their internal workings) and red-teaming (aggressively searching for scenarios where models exhibit undesirable behaviors), practicing these skills on smaller, analogous problems today.

Why is it beneficial to take more samples from a population with higher variance when estimating an average?

Taking more samples from a population with higher variance leads to a more accurate overall estimate because there are diminishing marginal returns to sampling. By focusing more samples where the data is more spread out, you can reduce the overall uncertainty in your estimate more effectively.
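The "sample more where the spread is larger" intuition above is, in survey-statistics terms, Neyman allocation. A rough simulation (invented numbers, not from the episode) compares splitting a sample budget equally between two subpopulations versus in proportion to their standard deviations:

```python
import random
import statistics

random.seed(1)

# Two equally sized subpopulations: A is tightly clustered, B is spread out.
def draw_a():
    return random.gauss(10.0, 1.0)   # standard deviation 1

def draw_b():
    return random.gauss(50.0, 20.0)  # standard deviation 20

TOTAL = 1_000  # total sample budget per estimate

def estimate(n_a, n_b):
    # Stratified estimate of the overall mean: average the per-stratum
    # means, since each stratum is half the population.
    mean_a = statistics.mean(draw_a() for _ in range(n_a))
    mean_b = statistics.mean(draw_b() for _ in range(n_b))
    return 0.5 * (mean_a + mean_b)

# Equal allocation vs. Neyman allocation (samples proportional to each
# stratum's standard deviation, here 1 : 20).
equal = [estimate(500, 500) for _ in range(200)]
neyman = [estimate(TOTAL * 1 // 21, TOTAL * 20 // 21) for _ in range(200)]

# Same budget, same target (true mean 30), but the Neyman-allocated
# estimator has visibly lower spread across repeated estimates.
print(statistics.stdev(equal), statistics.stdev(neyman))
```

Both allocations are unbiased; spending more of the fixed budget on the noisy stratum is what shrinks the overall uncertainty.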

1. Exaggerate New Belief Frames

When developing new beliefs or theories, oversimplify and exaggerate them internally, temporarily behaving as if they are 100% accurate. This helps test the theory’s correctness by seeing how often it needs concessions when interpreting new information.

2. Seek Foundational Knowledge Gaps

Actively seek out the simplest-sounding questions in a field that you don’t know the answer to, even if they seem basic. This helps uncover holes in your foundational knowledge and build a more solid understanding of the subject.

3. Identify True vs. Illusory Understanding

Be vigilant about whether you truly understand a concept or are merely ‘cargo culting’ techniques without internalizing the underlying principles. Actively try to explain concepts, draw diagrams, or apply methods to reveal gaps in your comprehension.

4. Seek Expert Feedback as Amateur

To maintain intellectual humility and calibrate your confidence, intentionally study subjects where you are an amateur and engage with experts in those fields. This provides practice in being wrong in situations where your errors are undeniable.

5. Write for Easy Falsification

When writing about complex topics, state your opinions as bluntly and directly as possible, aiming for easy falsification rather than defensibility. This maximizes clarity and makes it easier for others to identify and critique potential mistakes.

6. Separate Others’ Errors from Your Truth

Recognize that noticing widespread false beliefs is easier than forming confident, specific, and correct beliefs of your own. Avoid the jump from ‘mainstream opinion is wrong’ to ‘my specific understanding is accurate,’ and instead commit to deeper investigation.

7. Cultivate Pain Tolerance for Truth

Understand that achieving accurate beliefs often requires enduring the discomfort of being wrong or receiving criticism. Maintain constant vigilance against becoming too afraid of this pain, as it can lead to accumulating unexamined false beliefs.

8. Begin Analysis with Rational Models

When analyzing complex systems, especially in fields like economics, start by modeling everything as if all agents are perfectly rational. This helps identify what truly needs explanation by less predictive theories, preventing premature jumps to behavioral economics.

9. Master Fast, Small-Scale Skills

When learning a new skill, prioritize getting very fast at sub-skills that take less than a minute to perform. This strategy leverages faster feedback loops for more repetitions and better retention, significantly boosting overall productivity.

10. Solve Analogous Future Problems

To address future grave risks (e.g., from AI), identify analogous problems that exist today and solve them using current solutions that resemble future ones. This provides concrete, iterative feedback and builds scalable solutions.

11. Practice Future Critical Capacities

Identify critical capacities needed when deploying powerful future systems (e.g., interpretability, red teaming) and conduct ‘fire drills’ by practicing them on current, less critical problems. This builds necessary skills and processes in advance.

12. Prioritize ML Execution & Tools

Recognize that machine learning’s primary challenge is often difficult engineering and execution rather than just intuition. Dedicate more time to building tools, implementing diagnostics, and ensuring efficient, well-tested code for basic tasks.

13. Post Work for Public Criticism

Share your work publicly (e.g., essays on social media) to receive diverse and challenging criticism. This helps calibrate your confidence, identify flaws in your arguments, and build more robust beliefs over time.

It's not very hard to notice ways in which almost everyone is being pretty dumb about things, or it's not that hard to notice ways in which widespread beliefs are quite false. But it's a lot harder to come to confident and specific, correct beliefs about the world.

Buck Shlegeris

It turns out that making more sense than other people is not the bar for being accurate about things. And you need to hold yourself to a higher standard.

Buck Shlegeris

I think that to some extent, it's easy to wade into psychology, because psychology is kind of a shallow field, or so I wildly hypothesize.

Buck Shlegeris

I think that people find it easier to remember something for one minute than for two minutes. And so if you're in the middle of solving a problem and then you have to write these five lines of code, I think you lose a lot more of your state if it takes you two minutes rather than one minute to write those five lines of code.

Buck Shlegeris

I think that it's a lot more productive to really seek out questions that are as simple sounding as possible while still being really hard to answer or still like demonstrating that there's something you don't understand about this subject.

Buck Shlegeris

My sense with machine learning now is that the core problem is that it's just a very difficult engineering challenge.

Buck Shlegeris

Learning by Drilling Small Skills

Buck Shlegeris
  1. Identify a larger skill you want to learn (e.g., programming, machine learning).
  2. Break the larger skill down into very small sub-skills that can be performed in less than a minute.
  3. Practice these small sub-skills repeatedly and rapidly to achieve high speed and proficiency.
  4. Leverage the fast feedback loops from these quick repetitions to accelerate your learning and improve retention of mental state.
  5. For machine learning, this might involve building the code for a full model (e.g., GPT-2) and comparing it to a reference implementation, rather than solely focusing on the complex and slow process of training.
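The "compare against a reference implementation" drill in step 5 can be sketched with a much smaller building block than GPT-2. Here a from-scratch softmax is checked numerically against a simpler oracle version on fixed inputs; both functions are invented for illustration, not taken from the episode:

```python
import math

def softmax_mine(xs):
    # The implementation being drilled: subtracts the max for numerical
    # stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def softmax_reference(xs):
    # Naive oracle: fine for small inputs, used only as the reference.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Fast feedback loop: run a handful of fixed cases and compare outputs
# element by element, instead of waiting on a slow training run.
cases = [[0.0, 1.0, 2.0], [-3.0, 0.0, 3.0], [5.0, 5.0, 5.0]]
for xs in cases:
    mine, ref = softmax_mine(xs), softmax_reference(xs)
    assert all(abs(a - b) < 1e-9 for a, b in zip(mine, ref))
print("all cases match")
```

The point is the loop structure: each check takes seconds, so you get many repetitions per hour of practice rather than one slow end-to-end signal.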

Learning by Seeking the Smallest Question

Buck Shlegeris
  1. When studying a subject, actively search for the simplest-sounding question within that field that you cannot answer.
  2. Use these simple, unanswerable questions to pinpoint and address gaps in your foundational understanding of the subject.
  3. Aspire to define classes of problems for which you can consistently develop decision procedures, thereby building conceptual clarity and a solid knowledge base.

Redwood Research's Applied AI Alignment Strategy

Buck Shlegeris
  1. Identify potential future problems where powerful AI could pose grave risks to humanity.
  2. Find the most analogous and concrete problems that are occurring in AI development today.
  3. Identify the technical difficulties in current AI that are most analogous to those expected to cause existential risk later.
  4. Develop solutions for these current, analogous problems using techniques that are most likely to scale and be effective for future, more powerful systems.
  5. Focus on building practical capacities, such as interpretability (understanding what models are doing) and red-teaming (stress-testing models for bad behaviors), which will be crucial when deploying highly advanced AI systems.
150 lines
Lines of code for a full GPT-2 model. Refers to the code required for the model's structure, not its training.

60%
Target false positive rate for the injury classifier. This is the acceptable rate of false alarms while aiming for zero false negatives (missed injuries).

2017
Year Buck Shlegeris started seriously learning deep learning. From Buck's personal experience.