#269 - Good vs. bad science: how to read and understand scientific studies
This episode, a rebroadcast of AMA #30, features Peter Attia and Bob Kaplan discussing how to critically read and interpret scientific studies. They cover study types, clinical trials, common biases, statistical concepts, and Peter's personal method for analyzing research papers.
Deep Dive Analysis
13 Topic Outline
Process for a Study: From Idea to Design to Execution
Types of Studies: Observational vs. Experimental
Understanding Observational Studies: Case Reports, Case Series, and Cohort Studies
Phases of Human Clinical Trials: Safety, Efficacy, and Approval
Biases in Observational Studies: Healthy User, Recall, and Performance
Rigorous Experimental Studies: Randomization, Blinding, and Outcomes
Statistical Concepts: Power, P-values, and Significance
Measuring Effect Size: Relative vs. Absolute Risk, Hazard Ratios, and NNT
Interpreting Confidence Intervals
Reasons for Stopping a Study Prematurely: Safety, Benefit, or Futility
Publication Bias and Strategies to Combat It
Journal Prestige and the Impact Factor
Peter's Process for Reading Scientific Papers
18 Key Concepts
Null Hypothesis
The default position in science, stating there is no relationship or difference between two phenomena being studied. The goal of an experiment is often to try to falsify this hypothesis.
Power Analysis
A crucial step in experimental design to determine the minimum number of subjects needed in a study to detect a statistically significant difference, if one truly exists, with a specified level of certainty.
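The sample-size arithmetic behind a power analysis can be sketched with the standard normal-approximation formula for comparing two event rates; the event rates below are hypothetical, chosen only to illustrate the calculation.

```python
from statistics import NormalDist
import math

def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Minimum subjects per group to detect a difference between two
    event rates p1 and p2 (normal approximation, two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Hypothetical design: detect a drop in event rate from 5% to 3%
# with 80% power at alpha = 0.05.
print(sample_size_per_group(0.05, 0.03))  # roughly 1,500 subjects per group
```

Note how shrinking the expected difference between groups inflates the required sample size, which is why studies chasing small effects must be large.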
Institutional Review Board (IRB)
An ethics committee that must approve studies involving human or animal subjects to ensure the ethical conduct and safety of the participants. This approval is required before a study can begin.
Observational Studies
Studies where researchers observe subjects and measure variables of interest without intervening or manipulating any variables. They can identify associations but cannot establish causality.
Experimental Studies
Studies where researchers actively intervene by manipulating one or more variables (treatments) and observing the effect on an outcome, allowing for the establishment of causality.
Meta-analysis
A statistical technique that combines data from multiple independent studies addressing the same question to derive a single, more precise estimate of an effect. Its quality depends entirely on the quality of the included studies, following the 'garbage in, garbage out' principle.
Healthy User Bias
A common bias in observational studies where individuals who engage in one healthy behavior (e.g., not eating meat) are also more likely to engage in other healthy behaviors (e.g., exercise, not smoking), making it difficult to isolate the effect of a single variable.
Recall Bias (Information Bias)
A bias in studies, particularly nutritional epidemiology, where subjects' ability to accurately remember past behaviors (e.g., food consumption) is poor, leading to inaccurate data. This makes it challenging to draw reliable conclusions.
Performance Bias (Hawthorne Effect)
A bias where subjects change their behavior simply because they know they are being observed or are part of a study, or when investigators' knowledge of treatment assignment influences their interaction with subjects. This can subtly alter study outcomes.
P-value (Alpha)
The probability of observing a result at least as extreme as the one seen, assuming the null hypothesis is true. A p-value of 0.05 or less is typically the threshold (alpha) for statistical significance, capping the false-positive rate at 5%.
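The "5% false positive" idea can be checked by simulation: when two groups are drawn from the same distribution (the null is true), roughly 5% of comparisons still come out "significant" by chance. This sketch uses a pooled two-proportion z-test; the 30% event rate and group sizes are arbitrary.

```python
import random
import math

def two_prop_pvalue(x1, n1, x2, n2):
    """Two-sided z-test p-value for a difference in proportions (pooled)."""
    p = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))

random.seed(0)
trials = 2000
false_positives = 0
for _ in range(trials):
    # Both "arms" drawn from the same 30% event rate: any significant
    # difference found here is, by construction, a false positive.
    a = sum(random.random() < 0.3 for _ in range(200))
    b = sum(random.random() < 0.3 for _ in range(200))
    if two_prop_pvalue(a, 200, b, 200) <= 0.05:
        false_positives += 1

fp_rate = false_positives / trials
print(fp_rate)  # hovers around 0.05
```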
Statistical Power
The probability of correctly detecting a true effect if it exists (1 minus the false negative rate). Typically, studies aim for 80-90% power, meaning they have an 80-90% chance of finding a real effect if it's there.
Absolute Risk
The actual risk of an event occurring in a population or group (e.g., 5 heart attacks per 1,000 people). This provides a direct measure of how common an event is.
Relative Risk
The ratio of the risk of an event in an exposed group compared to an unexposed group, often expressed as a percentage increase or decrease. It can be misleading without knowing the absolute risk, as a large relative risk can still represent a small absolute change.
Hazard Ratio
A measure of the relative risk of an event occurring at any point in time during a study, capturing the temporal aspect of risk. A hazard ratio of 1 means no difference, less than 1 means reduced risk, and greater than 1 means increased risk.
Number Needed to Treat (NNT)
The average number of patients who need to be treated to prevent one additional adverse event. It is calculated as 1 divided by the absolute risk reduction and helps assess the clinical significance of an intervention.
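The relationship between relative risk, absolute risk, and NNT described above is a few lines of arithmetic; the event rates here are made up purely to show why a headline relative-risk reduction can overstate clinical impact.

```python
# Hypothetical trial: 20 events per 1,000 in the control arm,
# 15 per 1,000 in the treated arm.
control_risk = 20 / 1000
treated_risk = 15 / 1000

relative_risk_reduction = (control_risk - treated_risk) / control_risk  # "25% lower risk"
absolute_risk_reduction = control_risk - treated_risk                   # 5 fewer events per 1,000
nnt = 1 / absolute_risk_reduction                                       # patients treated per event prevented

print(f"RRR: {relative_risk_reduction:.0%}, "
      f"ARR: {absolute_risk_reduction:.1%}, NNT: {nnt:.0f}")
```

A "25% risk reduction" sounds dramatic, yet 200 people must be treated to prevent one event, which is why the absolute numbers are essential.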
Confidence Interval
A range of values within which the true population parameter (e.g., hazard ratio) is likely to lie. If the interval for a ratio (like hazard ratio) includes 1, or for a difference includes 0, the result is not statistically significant. A tighter interval indicates less uncertainty.
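The "does the interval cross the null value?" check is mechanical enough to write down; the interval endpoints below are illustrative, not taken from any study.

```python
def is_significant(ci_low, ci_high, null_value=1.0):
    """A ratio (hazard or odds ratio) whose confidence interval spans 1
    is not statistically significant; for a difference, use null_value=0."""
    return not (ci_low <= null_value <= ci_high)

print(is_significant(0.70, 0.95))  # True: the whole interval sits below 1
print(is_significant(0.82, 1.10))  # False: the interval crosses 1
```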
Publication Bias
The tendency for studies with positive or statistically significant results to be more likely to be published than those with negative or null results, leading to a skewed representation of scientific evidence and an incomplete body of knowledge.
Impact Factor
A metric used to gauge the relative importance or influence of a scientific journal, calculated as the average number of citations received by articles published in that journal over a specific period (typically one year). Higher impact factors generally indicate more prestigious and selective journals.
11 Questions Answered
How does a study go from idea to design to execution?
It starts with a hypothesis (often a null hypothesis), followed by experimental design, power analysis to determine subject numbers, Institutional Review Board (IRB) approval, defining primary/secondary outcomes, developing a statistical plan, and pre-registering the study, all while securing funding.
What are the main types of studies?
Studies broadly fall into observational (case reports, case series, cohort studies) and experimental (randomized controlled trials, non-randomized trials), with meta-analyses and systematic reviews summarizing these. Observational studies identify associations, while experimental studies can establish causality through intervention.
What happens in each phase of a human clinical trial?
Phase 1 focuses on dose escalation and safety in a small group; Phase 2 continues safety evaluation and looks for efficacy in a larger, often open-label group; Phase 3 is a large, rigorous, often randomized, blinded, placebo-controlled trial to confirm efficacy and safety for approval; Phase 4 involves post-marketing surveillance and exploring new indications.
What are the common pitfalls of observational studies?
Key pitfalls include selection bias (e.g., healthy user bias), information or recall bias (especially in nutritional epidemiology), and confounding variables that can create spurious associations.
What makes an experimental study rigorous?
Rigor is enhanced by proper randomization, blinding (single or double), having a control group, adequate sample size (power), clear primary and secondary outcomes, appropriate duration, generalizability to the target population, and transparent funding/conflict of interest declarations.
What does statistical significance actually mean?
Statistical significance means that the observed result is unlikely to have occurred by random chance, typically indicated by a p-value of 0.05 or less, leading to the rejection of the null hypothesis. However, it does not necessarily mean the effect is clinically meaningful.
How is effect size measured?
Effect size can be measured using relative risk, absolute risk, hazard ratios, and number needed to treat (NNT). Relative risk describes the proportional change in risk, while absolute risk is the actual difference in event rates between groups. Absolute risk is crucial for understanding clinical impact.
How should confidence intervals be interpreted?
A confidence interval provides a range within which the true effect size (e.g., hazard ratio) is likely to lie. If the interval for a ratio (like hazard ratio) includes 1, or for a difference includes 0, the result is not statistically significant. A tighter interval indicates less uncertainty.
Why might a study be stopped prematurely?
Studies can be stopped prematurely for three main reasons: safety concerns (the treatment is causing harm), overwhelming benefit (the treatment is so effective it's unethical to withhold it from the control group), or futility (it's clear no significant benefit will be found even if the study continues).
Why are so many studies never published, and what can be done about it?
Many studies, especially those with negative or null results, are not published due to publication bias, leading to an incomplete body of evidence. This can be combated through pre-registration of trials on public databases (like clinicaltrials.gov) and publishing formats like 'registered reports' where protocols are peer-reviewed and provisionally accepted before data collection.
How is journal prestige determined?
Journal prestige is often determined by its 'impact factor,' which is a measure of how frequently its articles are cited by other researchers. Journals with higher impact factors are generally considered more influential and selective.
56 Actionable Insights
1. Develop Scientific Literacy
Emphasize scientific literacy to better understand research, distinguish signal from noise, and critically evaluate studies for rigor and potential misrepresentation.
2. Commit to Rigorous Paper Reading
If you choose to engage with science, commit to rigorously reading scientific papers, understanding that doing so requires effort and attention to detail but improves with practice.
3. Critically Evaluate Media Science Reports
Exercise caution and critical thinking when consuming science information from social media or news, as reporters may lack the necessary analytical skills to accurately interpret studies.
4. Distinguish Primary from Secondary Outcomes
Pay close attention to a study’s pre-registered primary and secondary outcomes, understanding that failing the primary outcome typically renders a study null, regardless of secondary findings.
5. Clinical vs. Statistical Significance
Always differentiate between statistical significance (a study’s success in rejecting the null hypothesis) and clinical significance (whether the observed effect size is practically relevant or meaningful).
6. Demand Absolute Risk Data
When evaluating study results, always demand to know the absolute risk alongside the relative risk, as relative risk alone can be misleading and insufficient for understanding true impact.
7. Utilize Number Needed to Treat (NNT)
Calculate the Number Needed to Treat (NNT) by dividing one by the absolute risk reduction to understand how many people must be treated to prevent one event, informing the practical value of an intervention.
8. Prioritize Low NNT Interventions
When evaluating interventions, prioritize those with a low Number Needed to Treat (NNT), generally below 100, as this indicates a more impactful and efficient treatment.
9. Scrutinize Meta-Analysis Components
Do not accept a meta-analysis as gospel without examining each of its constituent studies, as a meta-analysis of poor-quality studies will yield poor results.
10. Evaluate Study Generalizability
Consider the size, duration, and patient population of a study to determine if its results are generalizable and relevant to your specific interests or patient context.
11. Check Funding & Conflicts of Interest
Always examine who funded a trial and the declared conflicts of interest of the authors, as these can subtly influence study design, reporting, or interpretation.
12. Recognize Publication Bias
Understand that publication bias exists, where studies with negative or null results are less likely to be published, potentially skewing the available scientific literature.
13. Value Negative Research Findings
Recognize that negative or null research findings are just as important as positive ones for the advancement of knowledge, as they prevent wasted effort and inform future research directions.
14. Support Study Pre-Registration
Advocate for and prioritize studies that are pre-registered on platforms like clinicaltrials.gov, as this practice makes it harder for investigators to withhold negative results and combats publication bias.
15. Use Registered Reports for Unbiased Publication
Consider publishing or seeking out “registered reports” where the study protocol is peer-reviewed and provisionally accepted before data collection, ensuring publication regardless of the outcome and combating bias.
16. Prioritize Peer-Reviewed Research
When seeking scientific information, prioritize peer-reviewed publications as they represent the highest standard of vetting by experts in the field, unlike non-peer-reviewed content.
17. Confirm Adequate Study Power
When a study, especially one with a null outcome, is presented, always question if it was adequately powered to detect a meaningful difference, as underpowered studies can miss real effects.
18. Understand P-Value as False Positive Rate
Understand that a p-value represents the probability of observing a result as extreme as, or more extreme than, the one observed, assuming the null hypothesis is true, essentially a false positive rate.
19. Understand Confidence Intervals as Uncertainty
View confidence intervals as “uncertainty intervals” that indicate the range where the true population statistic likely lies, rather than a strict probability of containing the true mean.
20. Check Confidence Interval for Unity
To quickly assess statistical significance, check if the confidence interval for a hazard or odds ratio crosses one; if it does, the result is not statistically significant.
21. Value Tighter Confidence Intervals
Recognize that tighter confidence intervals indicate less uncertainty and more precision in the estimated effect, increasing confidence in the study’s findings.
22. Interpret Hazard Ratios
Familiarize yourself with how to calculate and interpret hazard ratios (e.g., 0.82 means an 18% reduction, 2.2 means a 120% increase) to understand the temporal risk of events in clinical trials.
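The hazard-ratio-to-percent-change conversion used in the examples above is simple arithmetic, sketched here for quick reference.

```python
def hr_to_percent_change(hr):
    """Express a hazard ratio as a percent change relative to the comparator:
    HR 0.82 -> 18% reduction; HR 2.2 -> 120% increase."""
    change = (hr - 1) * 100
    direction = "increase" if change > 0 else "reduction"
    return f"{abs(change):.0f}% {direction}"

print(hr_to_percent_change(0.82))  # 18% reduction
print(hr_to_percent_change(2.2))   # 120% increase
```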
23. Beware Healthy User Bias
When evaluating observational studies, especially in health, be aware of the “healthy user bias” where people making one health-conscious choice often make many others, confounding results.
24. Distrust Food Frequency Questionnaires
Be highly skeptical of nutritional epidemiology studies that rely on food frequency questionnaires, given their imprecision, susceptibility to recall bias, and the near impossibility of accurately remembering past food intake.
25. Be Cautious with Causality
When interpreting observational studies, recognize that while patterns will be seen, establishing causality from these patterns is difficult and requires careful consideration.
26. Acknowledge Hawthorne Effect
Understand that observation itself can change behavior (the Hawthorne effect), meaning people may alter their actions simply because they know they are being watched or recorded.
27. Account for Confounding Variables
When interpreting studies, especially observational ones, identify potential confounding variables (e.g., age, sex, smoking) that can affect results and obscure true causal relationships.
28. Caution with Homogeneous Extrapolation
Be cautious when extrapolating results from studies conducted in homogeneous populations (e.g., men only) to broader, more heterogeneous populations (e.g., women), as utility may differ.
29. Understand Multi-Site Study Trade-offs
Recognize that while multi-site studies offer heterogeneity and generalizability, they are harder to control and can introduce bias if sites are not run consistently.
30. Analyze Adverse Events Thoroughly
When evaluating a trial, pay close attention to the frequency, severity, and distribution of adverse events in all groups, not just the primary outcomes.
31. Know Reasons for Early Study Stops
Be aware that clinical trials can be stopped prematurely for three main reasons: safety concerns, overwhelming benefit, or futility (no chance of finding a significant effect).
32. Understand Journal Impact Factor
Recognize that journal prestige is often indicated by its impact factor, a yearly metric reflecting the average number of citations per article published in that journal over a given period.
33. Read Abstract First
When reading a scientific paper, start with the abstract to quickly determine if the paper’s content is relevant and interesting enough to warrant further reading.
34. Tailor Reading to Familiarity
Adjust your reading approach based on your familiarity with the subject matter; read the introduction if unfamiliar, but skip it if you already have a good grasp of the background.
35. Scrutinize Methods Section
After the abstract (and introduction if needed), go directly to the methods section to understand the study’s design, randomization, interventions, subject numbers, and specific procedures.
36. Start Results with Figures/Legends
When reviewing the results, begin by examining the figures and tables along with their legends, as well-designed figures should be standalone and convey key findings concisely.
37. Read Discussion Last
Read the discussion section last, after forming your own opinions on the study’s strengths, weaknesses, and remaining questions, to compare your thoughts with the authors’ interpretations.
38. Practice Regular Paper Reading
Improve your ability to understand scientific literature through consistent repetition, such as reading a scientific paper every week.
39. Start with Null Hypothesis
When approaching scientific inquiry, begin by assuming no relationship between two phenomena (the null hypothesis) to frame your investigation cleanly.
40. Use Randomized Controlled Experiments
To elegantly test a hypothesis, design your experiment as a randomized controlled trial, and blind it if possible, to minimize bias.
41. Implement Double Blinding for Rigor
To enhance study rigor and minimize bias, implement double blinding where neither subjects nor investigators know who is receiving treatment or placebo; single blinding (subjects don’t know) is a minimum.
42. Ensure Rigorous Randomization
For experimental studies, prioritize rigorous randomization, as it is crucial for making sense of results and minimizing bias, even if non-randomized studies aren’t useless.
43. Mitigate Performance Bias in RCTs
In lifestyle-based randomized controlled trials, ensure both treatment and control groups receive the exact same amount of attention, coaching, and advice to eliminate performance bias.
44. Aim to Falsify Hypotheses
When designing experiments, adopt a rigorous approach aimed at falsifying your hypothesis, rather than solely seeking to confirm it, to ensure robust scientific inquiry.
45. Perform Power Analysis
Before conducting a study, determine the necessary number of subjects by performing a power analysis to ensure the experiment is adequately powered to detect a true difference.
46. Pre-Register Study Protocols
Before conducting a study, define primary and secondary outcomes, get the protocol approved, develop a statistical plan, and pre-register the study to enhance transparency and rigor.
47. Secure IRB Approval
For any study involving human or animal subjects, secure Institutional Review Board (IRB) approval to ensure the ethical conduct of the study.
48. Secure Research Funding
Ensure adequate funding is secured in parallel with study design and approval processes, as research requires financial resources.
49. Value Case Reports for Hypotheses
Understand that individual case reports, while not generalizable, serve as valuable hypothesis-generating observations that can kickstart larger trials or research careers.
50. Avoid Underpowered Studies
Do not conduct underpowered experiments, as they lack sufficient subjects to detect a real difference, often leading to null results and wasted effort.
51. Avoid Overpowered Studies
Be cautious of overpowered studies, which may enroll more subjects than necessary and detect statistically significant but clinically irrelevant effects.
52. Target 80-90% Study Power
In study design, aim for 80% to 90% power (meaning a 10-20% false negative rate) to ensure the study has a high probability of detecting a true effect if one exists.
53. Larger Effects Need Fewer Subjects
Recognize that studies designed to detect larger effect sizes (bigger differences between groups) require fewer subjects to achieve adequate statistical power.
54. Prioritize Figures in Scientific Writing
When writing a scientific paper, start by creating the figures, tables, and their legends first, as this helps clarify the core findings and structure the rest of the manuscript.
55. Support Ad-Free Content
If you value content provided without paid ads, consider becoming a member to support the creators, as their work is often made possible by members.
56. Deepen Knowledge with Premium Membership
To take your knowledge of health and wellness to the next level, consider a premium membership, which aims to provide members with much more value than the subscription price.
4 Key Quotes
A thousand sow's ears make not a pearl necklace.
James Yang (quoted by Peter Attia)
Garbage in, garbage out.
Peter Attia
We're number two. We try harder.
Bob Kaplan
The study didn't turn out the way we wanted it to.
Ivan Frantz (quoted by Peter Attia)
1 Protocol
Peter Attia's Process for Reading a Scientific Paper
- Read the abstract first to determine interest in the paper.
- If unfamiliar with the subject matter, read the introduction; otherwise, skip it.
- Go directly to the methods section to understand the experimental details (subjects, randomization, interventions, measurements, etc.).
- Examine the results section, starting with figures and tables and their legends, then read the prose for additional context.
- Finally, read the discussion section to compare personal thoughts on the study's strengths and weaknesses with the authors' perspectives.