Highs and lows on the road out of the replication crisis (with Brian Nosek)

Nov 8, 2024
Overview

Spencer Greenberg speaks with Brian Nosek about progress in open science and psychology research. They discuss the value of replication, the importance of communicating scientific uncertainty, and how to improve research practices and incentives, including the "registered reports" model and the innovative "Lifecycle Journal Project."

At a Glance
18 Insights
1h 38m Duration
15 Topics
8 Concepts

Deep Dive Analysis

Progress and Challenges in Open Science

Interpreting Replication Failures in Social Science

The Role of Perspectivism in Scientific Progress

Psychology's Progress and Public Trust in Science

Ecological Validity in Behavioral Research

Communicating Scientific Uncertainty to the Public

Replicability Challenges Across Scientific Disciplines

Impact of Open Science Practices on Research Credibility

Registered Reports: Aligning Incentives in Publishing

The Messy Reality of Scientific Discovery

Understanding the Implicit Association Test (IAT)

IAT Interpretation, Reliability, and Use Cases

Retraction of a High Replicability Study: Lessons Learned

The Lifecycle Journal Project: Future of Scholarly Communication

Addressing 'Importance Hacking' in Research Claims

Replication Crisis

This refers to the phenomenon, particularly in the social sciences, in which many published experimental results cannot be reproduced when the studies are repeated. It calls into question the robustness and reliability of previously published scientific findings.

Perspectivism (Science)

This philosophy suggests that every scientific claim is true under some specific conditions, and scientific progress involves identifying those conditions. It encourages exploring the boundaries and specific contexts of findings rather than seeking universal truths.

Ecological Validity

This concept refers to the extent to which research findings can be generalized to real-world settings and behaviors outside of a controlled laboratory environment. There is often a trade-off between experimental control and ecological validity.

Uncertainty in Science Communication

This is the practice of explicitly including the provisional and conditional nature of scientific findings when communicating them to the public, rather than presenting them as definitive answers. It aims to help the public understand the scientific process and make informed decisions.

Open Science

This is a movement promoting practices like open data, open code, open materials, and pre-registration to increase the transparency, rigor, integrity, and reproducibility of scientific research. Its goal is to facilitate evaluation and self-correction within the scientific system.

Registered Reports

This is a publishing model where peer review occurs before the research outcomes are known. Reviewers evaluate the research question and methodology, and the journal commits to publishing the results regardless of the outcome, aligning incentives with rigorous methods rather than novel findings.

Implicit Association Test (IAT)

The IAT is a response time task designed to measure the strength of automatic associations between categories (e.g., young/old, good/bad, black/white) without requiring direct introspection. It assesses how quickly a person can categorize concepts when they share a response key (e.g., young+good) compared with the opposite pairing (e.g., young+bad).
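To make the mechanics concrete, here is a simplified sketch of how such response-time data are commonly scored (a stripped-down version of the D-score family of measures; the latencies below are hypothetical, and real scoring involves additional steps such as error and outlier handling):

```python
import statistics

def iat_d_score(congruent_ms, incongruent_ms):
    """Simplified IAT score: how much slower responses are in the
    'incongruent' pairing, scaled by the pooled SD of all latencies."""
    pooled_sd = statistics.stdev(congruent_ms + incongruent_ms)
    return (statistics.mean(incongruent_ms)
            - statistics.mean(congruent_ms)) / pooled_sd

# Hypothetical latencies (milliseconds) for one participant:
congruent = [612, 580, 655, 601, 634]    # e.g., young+good share a response key
incongruent = [745, 810, 702, 768, 731]  # e.g., young+bad share a response key
print(iat_d_score(congruent, incongruent))  # positive => faster on the congruent pairing
```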

Importance Hacking

This refers to a research problem where the claims made in a paper about the significance, novelty, or meaning of findings do not accurately match what was statistically demonstrated by the evidence. It can lead to the publication of work that appears more impactful than it truly is.

How robust are social sciences now compared to the beginning of the replication crisis?

While early indicators of new practices are positive, there isn't enough evidence yet to fully evaluate the impact of reforms, and much work remains to be done.

What does a replication failure truly indicate about an original study?

A replication failure doesn't automatically mean the original result was a false positive; it could be due to flaws in the replication study, or meaningful differences in conditions that affect the phenomenon.
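One way to see why: even a true effect will often fail to replicate if the replication study is underpowered. The following is a minimal simulation sketch (illustrative only; the effect size, sample sizes, and significance threshold are assumed, not figures from the episode):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def replication_success_rate(true_d=0.3, n_per_group=30, alpha=0.05, runs=10_000):
    """Fraction of replications of a TRUE effect that reach p < alpha."""
    successes = 0
    for _ in range(runs):
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(true_d, 1.0, n_per_group)  # the effect is real
        _, p = stats.ttest_ind(treatment, control)
        successes += p < alpha
    return successes / runs

# With d = 0.3 and n = 30 per group, power is only about 20%:
# most replications of this real effect will "fail" by the p < .05 criterion.
print(replication_success_rate())
```

A failed replication under these conditions says as much about the replication's statistical power as it does about the original claim.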

How should the public engage with scientific findings, especially given declining trust in science?

The public should expect, and communicators should provide, more explicit statements of uncertainty in science communication, recognizing that scientific findings are provisional and conditional rather than definitive answers.

Why do surprising scientific findings often go viral but might be less likely to be true?

Newsworthiness often stems from violating intuitions or expectations, so surprising findings start with a lower prior probability of being true, which makes them less likely to be robust.
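This intuition can be made precise with Bayes' rule: the probability that a statistically significant finding is true depends on the prior probability of the hypothesis. A minimal worked sketch (the power, alpha, and prior values are assumed for illustration, not from the episode):

```python
def ppv(prior, power=0.8, alpha=0.05):
    """P(hypothesis is true | significant result), via Bayes' rule."""
    true_positives = prior * power          # true hypotheses that reach significance
    false_positives = (1 - prior) * alpha   # false hypotheses that reach significance
    return true_positives / (true_positives + false_positives)

# A plausible hypothesis vs. a surprising, intuition-violating one:
print(ppv(prior=0.50))  # ~0.94: significance is strong evidence
print(ppv(prior=0.05))  # ~0.46: a "significant" surprise is close to a coin flip
```

The more a finding violates well-grounded expectations, the lower its prior, and the less a single significant result should move us.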

Are replicability issues common in scientific fields beyond social sciences?

While the extent is unknown, factors like pressure to publish and publication bias are present across virtually all scholarly fields, suggesting that issues could exist broadly, though their manifestation varies by discipline.

What is the main challenge for scientific research in general?

The main challenge is that the reward system is not aligned with the messy, fumbling, and exploratory nature of scientific discovery, instead favoring the production of polished papers that appear to have 'figured it out.'

What is the Implicit Association Test (IAT) designed to measure?

The IAT is a response time task that measures the strength of automatic associations between different categories, assessing how easily one can categorize concepts together without requiring direct introspection.

Is the IAT a reliable measure for individual self-understanding of bias?

The IAT is not a highly reliable measure at the individual level, and it is debated whether what it assesses is a stable trait or a fluctuating state, which makes it problematic for individual diagnostic use.

What is the difference between an 'implicit bias' and an 'association' as measured by the IAT?

This is an active debate, but generally an association refers to connections formed in the mind through experience, while a 'true bias' might imply secretly held prejudice or an impact on judgment and behavior; the IAT alone does not resolve which is present.

What are appropriate and inappropriate uses of the Implicit Association Test (IAT)?

The IAT is a productive research tool for understanding how people's minds work without direct questioning, but it lacks the validity and reliability for diagnostic contexts like selecting jurors or making employment decisions about individuals.

What was the core issue leading to the retraction of the 'high replicability' paper Brian Nosek co-authored?

The paper contained a definitively false statement: it claimed that all analyses, including the meta-project, were pre-registered, which was not true for the meta-analyses. This undermined the credibility of some of the paper's interpretations.

What is the Lifecycle Journal Project?

It's a pilot project aiming to connect research producers and consumers by running transparent peer review and various other evaluation services across the entire research lifecycle, treating all research outputs as first-class scholarly contributions.

1. Embrace Perspectivism in Science

Adopt the philosophy that every claim is true under some conditions, and progress involves identifying those conditions. This helps navigate complex fields where causation is difficult to pinpoint.

2. Take Replication Failures Seriously

When a study fails to replicate and there is no strong theoretical reason to expect the failure, treat it as a serious challenge to the general claim. Use it to narrow the claim's scope of applicability or to question whether the claim is productive to keep pursuing.

3. Test Hypotheses for Failures

Instead of retrospectively explaining away replication failures with post-hoc reasons, formulate specific hypotheses about potential influencing factors and empirically test them. This moves beyond speculation to scientific inquiry.

4. Communicate Scientific Uncertainty Clearly

Always include the inherent uncertainty when communicating scientific findings to the public. This helps recipients understand the provisional nature of science and make informed decisions based on the current evidence.

5. Prioritize Uncertainty for Surprising Findings

When encountering surprising or newsworthy scientific results, give as much importance to the uncertainty surrounding the finding as to the finding itself, as highly surprising results are often less robust.

6. Seek Long-Form & Systematic Reviews

For a deeper and more comprehensive understanding of a scientific field, consult long-form pieces and systematic reviews that synthesize decades of research, rather than relying on newsworthy individual findings.

7. Adopt Open Science Practices

Implement practices like sharing data, code, materials, and protocols openly. This transparency enhances the evaluability of research, facilitates self-correction, and allows others to build on work more effectively.

8. Utilize Registered Reports for Publishing

Researchers should use the “registered reports” publishing model, where peer review occurs before results are known. This aligns incentives with rigorous methodology and important research questions, improving research quality.

9. Reward “Done Well” Over “Figured Out”

The scientific community and journals should shift the reward system to prioritize the quality of research execution over the completeness or “figured out” nature of findings. This encourages publishing well-conducted exploratory or messy work.

10. Surface and Reward Discovery Phase

Create mechanisms to make visible and reward the “messy” exploratory and discovery phases of scientific research. This acknowledges their crucial role in pushing the boundaries of knowledge.

11. Publish Unexpected Phenomena

Be willing to publish research that uncovers unexpected phenomena or raises more questions than answers, even without a complete explanation. Such findings are valuable for understanding the conditions under which phenomena occur.

12. Embrace Openness for Error Detection

Researchers should embrace maximum openness and transparency, recognizing that errors are inevitable. Open sharing allows dedicated critics to scrutinize work, identify errors, and collectively reduce uncertainty, leading to progress.

13. Engage with Lifecycle Journal Models

Researchers and the scholarly community should engage with or develop “Lifecycle Journal” models that integrate diverse evaluation services across the entire research process. This treats all research outputs (data, code, plans) as first-class contributions.

14. Utilize Diverse Evaluation Services

Explore and use a variety of evaluation services beyond traditional peer review, such as prediction markets, data sharing quality assessments, and independent reproducibility checks. These modular services provide comprehensive insights into research quality.

15. Shift from Volume to Quality

Researchers and institutions should re-align incentives to prioritize the quality of research and the evaluations it receives, rather than solely focusing on the volume of publications. This can lead to more rigorous and impactful work.

16. Guard Against Importance Hacking

Researchers and reviewers must be vigilant against “importance hacking” by critically comparing stated research claims with the actual statistical evidence. Ensure that reported significance, novelty, and importance are truly justified by the data.

17. Develop Claim-Evidence Matching Services

For evaluation services, develop tools or protocols specifically designed to assess how well research claims align with the supporting evidence. This helps identify and mitigate “importance hacking” in published work.

18. Conduct Cross-Disciplinary Replicability Evaluations

Actively conduct evaluations of replicability, reproducibility, and generalizability across different scientific disciplines. This helps identify common challenges and areas of strength, informing targeted improvements.

The uncertainty is a key part of the story and should always be part of the story when we're communicating scientific findings.

Brian Nosek

Every claim is true under some conditions. And the way we make progress in science is identifying those conditions under which you can observe evidence for that claim.

Brian Nosek

The more surprising it is, the more likely it is probably to be wrong a priori.

Spencer Greenberg

Making things evaluable is the first step towards then providing, at the broader scale, more rigor, more reproducibility, and most importantly, better self-correction.

Brian Nosek

The reward system is really rooted in the scholarly research context, in papers as the currency of advancement. And it's rare that that exploration and discovery process happens at the cadence or the pace of the production of papers.

Brian Nosek

No one is perfect. Error happens everywhere. Even big errors happen everywhere.

Brian Nosek

Registered Reports Publishing Model

Brian Nosek
  1. Researchers submit their research question, design, and methodology to a journal.
  2. Peer reviewers evaluate the quality of the question and methods before data collection.
  3. The journal commits to publishing the paper regardless of the research outcomes, provided the researchers follow their approved plan.
  4. Researchers conduct the study and submit the results.
  5. The paper is published, with the decision based on the quality of the initial design, not the novelty or significance of the results.

Lifecycle Journal Project (Proposed Evaluation Process)

Brian Nosek
  1. **Planning Stage:** Authors submit their project at the beginning, undergoing registered report review by peer reviewers.
  2. **Commitment & Research:** Once the plan is committed, a prediction market might open for people to make predictions about research outcomes.
  3. **Completion & Data Sharing:** After research, a service evaluates the quality of data sharing and provides improvement reports.
  4. **AI & Replication Assessment:** AI tools might assess the likelihood of findings replicating, and independent groups might reproduce findings using provided data and code.
  5. **Publication Decision:** Authors decide whether to make their contribution a 'version of record' in the Lifecycle Journal or submit it to another journal, with all evaluations remaining public.
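The prediction market in step 2 yields probability forecasts that can be scored once a finding's replication outcome is known. One standard way to score such forecasts (an illustration, not a detail specified in the episode) is the Brier score:

```python
def brier_score(forecasts, outcomes):
    """Mean squared error between predicted probabilities and binary
    outcomes (1 = finding replicated, 0 = it did not). Lower is better;
    0.25 is what always guessing 50% would earn."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)

# Hypothetical market prices (probability of replication) vs. outcomes:
print(brier_score([0.8, 0.3, 0.6], [1, 0, 1]))  # ~0.10, better than chance
```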
40-50%
Average replication rate of social science papers from 10-15 years ago, according to Brian Nosek's understanding; in other words, roughly half of those findings did not replicate.

Around 0.7
Correlation between IAT responses and explicit preferences for Democrats vs. Republicans, indicating a strong correspondence.

Much less than 0.7
Correlation between IAT responses and explicit preferences for young vs. old, indicating a weak correspondence.

80
Number of individual experiments in the retracted 'high replicability' paper: 16 new findings, each with one confirmatory study and four replication studies.

2013
Year the multi-year 'high replicability' project started.

2019
Year data collection finished for the 'high replicability' project; data collection spanned several years.

2023
Year the 'high replicability' paper was published, several years after data collection concluded.

Almost 30 years
Duration of Brian Nosek's involvement in implicit bias research: he started grad school in 1996 and began implicit bias research in 1998.