Aligning society with our deepest values and sources of meaning (with Joe Edelman)
Spencer Greenberg speaks with Joe Edelman about his claim that people disagree far less on core personal values than expected. They discuss how to identify these "sources of meaning" and how a democratic process, aided by AI, could find agreement and build a shared vision for society that goes beyond shallow preferences.
Deep Dive Analysis
16 Topic Outline
Introduction to Value Alignment and Disagreement
Defining 'Values': Political, Normative, and Personal
Intrinsic Values vs. Instrumental Values
Three Strategies for Uncovering Value Convergence
Context's Influence on Value Importance
Democratic Process for AI Value Alignment (Moral Graph)
Applying Moral Graphs to Large Language Models
Example: Parenting Wisdom in Value Alignment
Quantifying Agreement in Value Alignment Research
Broader Applications of Value Alignment: Politics and Markets
Interpersonal Value Alignment and Mediation
The Relationship Between Meaning and Values
Critique of Market Failures in Meeting Deep Values
'Replica Theory' and Inarticulacy of Desire
Societal Gridlock from Lack of Shared Vision
AI's Role in Building a Values-Aligned Future
7 Key Concepts
Values (Political Banner Terms)
These are terms used in political fights (e.g., anti-racism, equality) that serve as markers of affiliation and are often contentious. People temporarily adopt them to push forward their stance in political battles.
Values (Social Norms)
These are expectations that people believe should apply to everyone (e.g., honesty, picking up trash). They represent a broader push towards changing common societal expectations and can be a source of political battle if not universally agreed upon.
Values (Sources of Meaning/Personal Values)
These are things individuals personally find meaningful or ways they want to live (e.g., honesty in relationships, intellectual exploration, courage). They are considered a sub-component that sits below political battles and social norms, providing the underlying reason for the other two types of values.
Intrinsic Values
Something valued for its own sake, not merely as a means to other ends. These are considered psychological facts, forming a dense core of mutually reinforcing ways of life that are an actual part of one's conception of a good life.
Moral Graph
A data structure developed with OpenAI that captures consensus on which values are considered 'wiser' than others in specific contexts. It is a visual representation of articulated values and agreed-upon transitions towards greater wisdom, used to shape language model behavior.
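As a rough illustration only (not the actual schema from the OpenAI collaboration), a moral graph can be modeled as values-card nodes connected by endorsement-weighted "wiser-than" edges, with the wisest value in a context found by a simple net-inflow heuristic; the class and method names here are invented:

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class ValuesCard:
    """One articulated personal value, tied to the context it applies to."""
    card_id: str
    title: str
    context: str

class MoralGraph:
    """Toy model: nodes are values cards; a directed edge a -> b counts
    participants who judged moving from value a to value b a gain in wisdom."""

    def __init__(self):
        self.cards = {}
        self.endorsements = defaultdict(int)  # (from_id, to_id) -> vote count

    def add_card(self, card):
        self.cards[card.card_id] = card

    def endorse_transition(self, from_id, to_id):
        self.endorsements[(from_id, to_id)] += 1

    def wisest(self, context):
        """Pick the card with the highest net inflow of endorsed transitions
        within a context (a simple stand-in for the real aggregation)."""
        score = defaultdict(int)
        for (a, b), n in self.endorsements.items():
            if self.cards[a].context == context and self.cards[b].context == context:
                score[b] += n  # transitions into b count toward b
                score[a] -= n  # and away from a count against a
        return max(score, key=score.get) if score else None
```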
Reward Model
An extra model used in the fine-tuning of Large Language Models (LLMs), such as in RLHF and Constitutional AI. It allocates reward to the base model to guide it towards particular directions, and can be informed by data structures like the moral graph to align LLMs with shared values.
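As a hedged sketch of where moral-graph output could plug into that reward step: real reward models are learned networks, not the keyword rule below, and `wisest_values` is an invented parameter standing in for the graph's output for a context:

```python
def pick_response(candidates: list[str], wisest_values: list[str]) -> str:
    """Select the candidate completion earning the most 'value reward':
    here, simply how many of the context's wisest values it explicitly
    states. A trained reward model would replace this rule of thumb."""
    def reward(text: str) -> int:
        # Count wisest values the response states verbatim (case-insensitive).
        return sum(v.lower() in text.lower() for v in wisest_values)
    return max(candidates, key=reward)
```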
Replica Theory
This theory suggests that when people can be rewarded more for an easier, 'fake' version of a valuable thing than for the real, valuable version, most activity will gravitate towards the easy fake. This often occurs in areas where people are inarticulate about what they truly want, making it difficult to verify genuine satisfaction.
11 Questions Answered
What different kinds of "values" does Joe Edelman distinguish?
He distinguishes three types: political banner terms (contentious, affiliation markers), social norms (expectations for everyone), and personal sources of meaning (ways individuals want to live, personally endorsed). His work primarily focuses on the third type.
When do people agree on values more than expected?
People agree more on underlying personal values/sources of meaning when contentious political framing is removed, values are considered in specific contexts, and people are open to discovering 'wiser' candidate values.
How can you uncover convergence on underlying values?
Focus on the context in which a value is important, rephrase politically charged values to remove their contentious conclusions, and be open to considering alternative, potentially wiser values.
How does the democratic process for articulating values work?
Users interact with a GPT-4 chatbot that presents ambiguous situations, interrogates their responses to find underlying values, and then asks them to compare their values with others' and identify 'wiser' value transitions, building a data structure of consensual value wisdom.
How can a moral graph shape the behavior of an LLM?
The Moral Graph can be used to create a reward model that fine-tunes LLMs by rewarding responses that align with the wisest values for a given context, potentially by training the LLM to explicitly state the values it's responding with.
How much agreement did the study find?
In their study with over 500 representative Americans (Republicans and Democrats), they found approximately 97% agreement on the 'wiser' direction of value transitions, after applying specific thresholds for consensus.
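The thresholding step can be sketched as a computation over vote tallies; the `min_votes` and `min_share` cutoffs below are invented stand-ins, since the study's exact consensus thresholds aren't given here:

```python
def agreement_rate(votes, min_votes=5, min_share=0.6):
    """votes maps a candidate transition (from_value, to_value) to a
    (n_endorse, n_reject) tally. Transitions with too few votes or no
    clear majority are dropped; agreement is then the share of the
    remaining votes that side with each transition's majority direction."""
    majority = total = 0
    for yes, no in votes.values():
        n = yes + no
        if n < min_votes or max(yes, no) / n < min_share:
            continue  # below the consensus thresholds: excluded
        majority += max(yes, no)
        total += n
    return majority / total if total else 0.0
```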
How can this approach ease political disagreements?
By shifting discussions from contentious policy positions to underlying personal sources of meaning, and then brainstorming how to help everyone live according to those values, as demonstrated in a COVID policy discussion experiment.
Why do markets underserve deep values?
Markets are biased towards serving shallow, individual desires over deep, collective ones because businesses are transactional and struggle with the high customization and coordination costs required to address complex, deeply meaningful needs.
What is Replica Theory?
Replica Theory suggests that when an easier, 'fake' version of a valuable thing can be rewarded more than the real thing, the fake version will dominate. This often occurs in areas where people are inarticulate about what they truly want, making it hard to verify if a product or service genuinely satisfies deeper needs.
What happens when society lacks a shared vision?
It leads to cynicism, apathy, and gridlock, making it harder to address collective challenges like climate change or AI. Without a compelling vision, warring factions emerge, and policy proposals lack widespread inspiration and support.
What role can AI play in building a values-aligned future?
AI can act as a scalable means to help individuals clarify their deeper desires and make their satisfaction more verifiable. By outsourcing some market and political management to AI that deeply understands and advocates for collective and profound preferences, society can organize around meaningful ways of living.
10 Actionable Insights
1. Identify Core Personal Values
When facing disagreements, especially political ones, distinguish between “banner terms” (political affiliations), “norms” (expectations on everyone), and “personal values” or “sources of meaning” (what individuals personally find meaningful). Focus on the third type, as people agree much more on these underlying personal values.
2. Rephrase Politically Divisive Values
When discussing values, rephrase politically contentious slogans or terms into more universally relatable concepts that explain why the value is important, removing the specific conclusion. This helps uncover common ground and reduce divisiveness.
3. Consider Context for Value Importance
Recognize that different values rise in importance in different contexts (e.g., leadership, parenting). When evaluating a value, consider the specific situation to understand its relevance and potential for broader agreement.
4. Seek Wiser, Comprehensive Values
Actively seek out and be open to considering other candidate values that might be wiser or more comprehensive than your initial articulation. People are often willing to find other values more insightful, leading to greater convergence.
5. Mediate by Surfacing Core Values
In interpersonal disagreements, focus on uncovering the underlying personal values or “sources of meaning” for each person involved, rather than immediately debating policy or preferences. Present these values to each other to foster understanding and collaboration.
6. Articulate Personal Values for Life
Take time to get clear on your personal values or “sources of meaning” to help structure and plan your own life. This self-reflection can guide choices and lead to a more fulfilling existence.
7. Cultivate Aesthetic Appreciation
Practice finding things you find beautiful and capturing what kind of aesthetic value you are tuned into. This can be done during walks, alone or with a friend, to deepen your connection to your environment and shared experiences.
8. Recognize Market Value Bias
Understand that markets are often better at serving shallow, individual desires than deep, collective ones. This awareness can help you make more informed choices and recognize when market solutions might not fully address your intrinsic needs.
9. Avoid Easy “Fake” Solutions
Be aware of “replica theory,” where easier, fake versions of valuable things (e.g., dating apps optimized for swipes, not deep matches) are often rewarded more than the real thing. This occurs when people are inarticulate about what they want or when verification is difficult.
10. Leverage AI for Value Clarity
Consider using AI tools to help make your deeper desires clearer to yourself and to make their satisfaction more verifiable. This can help move beyond superficial consumerism and political affiliations towards a more meaningful life.
8 Key Quotes
When we get down into people's values, they actually disagree a lot less than you'd expect, given how much disagreement we see out in the world.
Joe Edelman
Part of the reason why we want people to be honest is for coordination reasons, stuff like that. But part of it is because there's a way of living honestly and a way of relating honestly that we like better, right? That we personally endorse in our own lives.
Joe Edelman
The pro-choice movement has, you know, "my body, my choice," right? This is a slogan designed to be contentious, right? To have people kind of up in arms, you know, on one side or the other. An even stronger one would be "defund the police," right? This is designed to upset half of the people that hear the slogan. But there's different ways of saying the same thing where people are like, oh, of course.
Joe Edelman
I think that we experience meaning when we're at the edge of our notion of the good life, like when we're encountering something maybe for the first time or it's been a while that is really important to us.
Joe Edelman
Markets are better at serving shallow desires than deep ones.
Joe Edelman
I think we've entered into a kind of a death spiral in a way where people are constantly promised community, connection, authenticity of different kinds, self-expression of different kinds... And then they get the non-customized product. They're still unfulfilled.
Joe Edelman
There's a bunch of areas where we're inarticulate about what we want and whether something satisfied it or not. And these are the areas where the fakes will occur.
Joe Edelman
AI and AGI and LLMs even specifically right now are both a big threat to society, but they're also a very scalable means to getting to what's really meaningful to us.
Joe Edelman
1 Protocol
Democratic Process for AI Value Alignment (Moral Graph)
Joe Edelman
- Users choose one of three ambiguous situations where the right LLM response is unclear (e.g., a young woman considering abortion, a stressed parent, a request to build a bomb).
- Users answer how the LLM should respond, and a GPT-4 chatbot interrogates their answer to find underlying values and personal concerns.
- The chatbot summarizes the user's concerns into one or more 'values cards,' which the user confirms capture their considerations.
- Users review other people's values cards and rate whether they are as wise as their own.
- Users are shown two values cards and a manufactured story of someone gaining wisdom by transitioning from one to the other, then asked if this transition seems plausible or wiser.
- All collected information is compiled into a 'moral graph,' a visual data structure showing consensual 'wiser' value transitions for each context.
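The steps above can be sketched as a small compilation pass; the per-session record shape here is invented for illustration, not the project's actual data format:

```python
def build_moral_graph(sessions):
    """Compile chatbot sessions into a moral graph. Each session holds
    the participant's confirmed values cards plus their votes on
    candidate 'wiser' transitions between cards."""
    cards: dict = {}
    edges: dict = {}
    for session in sessions:
        # Nodes: every values card the participant confirmed.
        for card in session["cards"]:
            cards[card["id"]] = card
        # Edges: count each endorsed (from_id, to_id) wisdom transition.
        for transition, endorsed in session["transition_votes"].items():
            if endorsed:
                edges[transition] = edges.get(transition, 0) + 1
    return {"cards": cards, "edges": edges}
```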