Definition
A p-value is the probability of observing results at least as extreme as yours, assuming the null hypothesis is true.
Common misunderstandings
- Not: "probability H₀ is true"
- Not: "clinical importance"
- Not: "replication guarantee"
p-value from a z-score
For a given z-score, the two-tailed p-value is the total shaded area in both tails of the standard normal curve beyond ±|z|. The larger |z| is, the smaller the shaded area.
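The link between a z-score and its two-tailed p-value can be sketched in a few lines of Python. This is a minimal illustration using the standard library's statistics.NormalDist; the function name two_tailed_p is a hypothetical helper introduced here, not something from the original page.

```python
from statistics import NormalDist

def two_tailed_p(z: float) -> float:
    """Two-tailed p-value for a z-score under a standard normal null (hypothetical helper)."""
    # Area in both tails beyond +/-|z| equals 2 * P(Z <= -|z|) for Z ~ N(0, 1)
    return 2 * NormalDist().cdf(-abs(z))

# Moving z away from 0 shrinks the combined tail area
for z in (0.0, 1.0, 1.96, 2.58, 3.2):
    print(f"z = {z:4.2f}  two-tailed p = {two_tailed_p(z):.4f}")
```

Because the tails are symmetric, the sign of z does not matter for the two-tailed value; only its magnitude does.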
Clinical note
A tiny p-value can happen with a trivial effect if n is huge. Always pair p-values with effect sizes and CIs.
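To see how sample size alone can drive the p-value down, the sketch below holds a small effect fixed and varies n. The numbers are purely illustrative assumptions (a 0.05-unit mean difference with a standard deviation of 1.0, tested with a one-sample z-test); they do not come from any study, and z_test_summary is a hypothetical helper.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative assumptions, not study data
EFFECT = 0.05   # observed mean difference
SD = 1.0        # standard deviation, treated as known

def z_test_summary(effect: float, sd: float, n: int):
    """Return (two-tailed p, 95% CI) for a one-sample z-test of 'mean difference = 0'."""
    se = sd / sqrt(n)                               # standard error shrinks as n grows
    z = effect / se
    p = 2 * NormalDist().cdf(-abs(z))               # two-tailed p-value
    half_width = NormalDist().inv_cdf(0.975) * se   # 95% CI half-width
    return p, (effect - half_width, effect + half_width)

for n in (25, 400, 10_000, 1_000_000):
    p, (lo, hi) = z_test_summary(EFFECT, SD, n)
    print(f"n = {n:>9,}  p = {p:.2e}  95% CI = ({lo:.3f}, {hi:.3f})")
```

The effect estimate is identical in every row; only the p-value and the width of the confidence interval change, which is why the effect size and CI need to be reported alongside p.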
Figure: Standard normal distribution under H₀. The shaded tails correspond to the two-tailed p-value.
Dental Scenario: Whitening Gel Effectiveness
Testing whether a new whitening gel significantly improves shade scores
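The study: A dental researcher tests a new whitening gel on n = 36 patients. The known population mean shade improvement with the standard gel is μ₀ = 2.0 shades. The new gel produces a sample mean of x̄ = 2.8 shades with a sample standard deviation of s = 1.5 shades. Does the new gel perform significantly better? (Because the population standard deviation is unknown, s is used in its place and the test statistic is treated as approximately standard normal.)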
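State the Hypotheses
H₀: μ = 2.0 (the new gel performs no better than the standard gel) versus H₁: μ > 2.0 (one-sided, because the question is whether the new gel performs better).
Compute the z-Score
z = (x̄ − μ₀) / (s / √n) = (2.8 − 2.0) / (1.5 / √36) = 0.8 / 0.25 = 3.20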
Locate z = 3.20 on the Normal Curve & Shade the Tail
The shaded right tail represents the probability of observing z ≥ 3.20 under H₀.
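The calculation behind these steps can be checked with a short script. This is a sketch under two assumptions that the scenario implies but does not state outright: the alternative is one-sided (H₁: μ > 2.0, matching the right-tail shading), and α = 0.05 is used as the significance level. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values from the study description
n, mu0, xbar, s = 36, 2.0, 2.8, 1.5
alpha = 0.05  # assumed significance level (not stated in the scenario)

se = s / sqrt(n)                 # standard error of the mean
z = (xbar - mu0) / se            # test statistic
p_right = NormalDist().cdf(-z)   # right-tail area P(Z >= z), by symmetry

print(f"SE = {se:.2f}, z = {z:.2f}, one-tailed p = {p_right:.4f}")
print("Reject H0" if p_right < alpha else "Fail to reject H0")
```

The script reproduces the z = 3.20 and p ≈ 0.0007 used in the scenario. A two-sided test would double the tail area.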
Interpret: p-value vs α
Conclusion: There is strong statistical evidence that the new whitening gel produces greater shade improvement than the standard gel (z = 3.20, one-tailed p = 0.0007). However, clinicians should also consider whether a 0.8-shade difference is clinically meaningful to patients.
Real Dental Scenario
Interpreting p = 0.03 in a Toothpaste Comparison
The study: Researchers compared Brand A vs Brand B toothpaste on plaque reduction in 200 patients over 6 months. The study found a statistically significant difference with p = 0.03.
Correct: Evidence against H₀
If H₀ were true (no difference), there is only a 3% chance of seeing a difference this large or larger. This is moderate evidence against H₀.
Correct: Statistically significant at α = 0.05
Since p = 0.03 < 0.05, we reject H₀ at the conventional 5% significance level.
Wrong: "There is a 97% probability Brand A is better"
p-values do NOT give the probability that one treatment is better. They are computed under the assumption that H₀ is true and measure how extreme the observed data are under that assumption.
Wrong: "The difference is clinically meaningful"
Statistical significance does NOT equal clinical significance. A tiny plaque reduction may be statistically significant with a large sample but clinically trivial.
Wrong: "If we repeat the study, 97% of the time we'll get the same result"
The p-value is NOT a replication probability. Replication depends on effect size, sample size, and other factors.
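A rough calculation shows how far the "97%" reading can be from an actual replication probability. The sketch below rests on a strong assumption that is labeled in the code: the true effect is exactly equal to the effect observed in the original study, and the repeat study has the same design and sample size. The function replication_probability is a hypothetical helper.

```python
from statistics import NormalDist

std_normal = NormalDist()

def replication_probability(p_observed: float, alpha: float = 0.05) -> float:
    """Chance that an identical repeat study reaches two-tailed p < alpha,
    ASSUMING the true effect equals the originally observed effect."""
    z_obs = std_normal.inv_cdf(1 - p_observed / 2)   # z-score behind the observed p
    z_crit = std_normal.inv_cdf(1 - alpha / 2)       # two-tailed critical value
    # Under the assumption, the repeat study's z-statistic is ~ N(z_obs, 1)
    upper = 1 - std_normal.cdf(z_crit - z_obs)       # significant, same direction
    lower = std_normal.cdf(-z_crit - z_obs)          # significant, opposite direction
    return upper + lower

print(f"Replication probability for p = 0.03: {replication_probability(0.03):.2f}")
```

Even under this favorable assumption, the result for p = 0.03 is much closer to 60% than to 97%. If the true effect is smaller than the observed one, the figure is lower still.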
Wrong: "There is only a 3% chance H₀ is true"
The p-value is NOT the probability that H₀ is true. It is the probability of the data (or more extreme) given H₀. These are fundamentally different.
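The gap between P(data | H₀) and P(H₀ | data) can be made concrete with Bayes' rule. The sketch below is only illustrative: the prior probabilities of H₀ and the 80% power are assumed values, and it conditions on "the result is significant at α" rather than on the exact p = 0.03. The function name is hypothetical.

```python
def prob_null_given_significant(prior_null: float, alpha: float, power: float) -> float:
    """P(H0 true | significant result), by Bayes' rule.

    prior_null: assumed prior probability that H0 is true
    alpha:      P(significant | H0 true), the false-positive rate
    power:      P(significant | H0 false), an assumed value
    """
    false_positives = alpha * prior_null
    true_positives = power * (1 - prior_null)
    return false_positives / (false_positives + true_positives)

# Same alpha and power, different assumed priors
for prior_null in (0.5, 0.9):
    posterior = prob_null_given_significant(prior_null, alpha=0.05, power=0.8)
    print(f"prior P(H0) = {prior_null:.1f}  ->  P(H0 | significant) = {posterior:.3f}")
```

The posterior probability of H₀ changes with the prior and the power, neither of which enters the p-value calculation, so the p-value alone cannot be read as the probability that H₀ is true.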
Dental example
Suppose a new sealant reduces the mean caries score by 0.2 units with p = 0.01. The evidence against H₀ is strong, but you still need to judge whether a 0.2-unit reduction is clinically meaningful and whether the result generalizes beyond the patients studied.