Hypothesis Testing

Decision-making under uncertainty (H₀ vs H₁, α, β, power)

Framework

Null hypothesis (H₀)

"No effect / no difference." Example: mean plaque index is equal between treatments.

H₀: μ₁ = μ₂

Alternative (H₁)

"Effect exists." Example: plaque index differs between treatments.

H₁: μ₁ ≠ μ₂

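α (Type I error)

Probability of a false positive: rejecting H₀ when H₀ is true.

β (Type II error)

Probability of a false negative: failing to reject H₀ when H₁ is true.

Power = 1 − β

Probability of detecting an effect that truly exists, i.e., of rejecting H₀ when H₁ is true.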

Interactive: visualize α and β

We draw two normal curves: one under H₀ and one under H₁ (shifted by effect size). Move the decision threshold and watch α (false positive area) and β (false negative area) change.

[Interactive controls: sliders for effect size and decision threshold c, with live readouts of α, β, and power]

Decision regions

Red: H₀ distribution, Blue: H₁ distribution. Vertical line: threshold c.
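
The areas described above can also be computed directly. The following is a minimal sketch, assuming both curves are unit-variance normals (H₀ centered at 0, H₁ centered at the effect size) and that H₀ is rejected whenever the observation exceeds the threshold c. The function name error_rates and the example values (effect size 1.0, threshold 1.645) are illustrative assumptions, not values taken from the page.

```python
from statistics import NormalDist


def error_rates(effect_size, threshold):
    """Return (alpha, beta, power) for a one-sided rule: reject H0 when x > threshold.

    Assumes H0 ~ N(0, 1) and H1 ~ N(effect_size, 1); both are illustrative choices.
    """
    h0 = NormalDist(mu=0.0, sigma=1.0)
    h1 = NormalDist(mu=effect_size, sigma=1.0)
    alpha = 1.0 - h0.cdf(threshold)  # area of the H0 curve to the right of c
    beta = h1.cdf(threshold)         # area of the H1 curve to the left of c
    return alpha, beta, 1.0 - beta


# Hypothetical slider settings, chosen only for illustration.
alpha, beta, power = error_rates(effect_size=1.0, threshold=1.645)
print(f"alpha={alpha:.3f}  beta={beta:.3f}  power={power:.3f}")

# Moving the threshold to the right lowers alpha but raises beta.
for c in (0.5, 1.0, 1.5, 2.0):
    a, b, p = error_rates(effect_size=1.0, threshold=c)
    print(f"c={c:.1f}: alpha={a:.3f}  beta={b:.3f}  power={p:.3f}")
```

The loop makes the trade-off visible: for a fixed effect size, any threshold that reduces α increases β, so only a larger effect or less noise (for example, a larger sample) improves both.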

Real Dental Scenario

Fluoride Varnish Clinical Trial

Study design: A dental researcher wants to test whether a new fluoride varnish reduces mean DMFT (Decayed, Missing, Filled Teeth) compared to a placebo.

Control Group (Placebo)
n = 50, Mean DMFT = 4.2, SD = 1.8
Treatment Group (Varnish)
n = 50, Mean DMFT = 3.5, SD = 1.6

Step 1: State the Hypotheses

H₀ (Null): μ_treatment = μ_control (No difference in mean DMFT)
H₁ (Alternative): μ_treatment < μ_control (Varnish reduces mean DMFT)

Significance level: α = 0.05 (one-tailed test)

Step 2: Compute the Test Statistic

Standard error of the difference (unpooled, as used in Welch's t-test):
SE = √(SD_control²/n_control + SD_treatment²/n_treatment) = √(1.8²/50 + 1.6²/50)
SE = √(0.0648 + 0.0512) = √0.1160 = 0.3406
t-statistic:
t = (x̄_treatment − x̄_control) / SE = (3.5 − 4.2) / 0.3406
t = −0.7 / 0.3406 = −2.055

Degrees of freedom: df ≈ 96 (Welch's approximation)
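
The arithmetic in this step can be checked with a few lines of Python. This is a sketch using only the summary statistics given above; the variable names are illustrative, and the degrees-of-freedom line uses the Welch-Satterthwaite formula, which is an assumption about how the page obtained df ≈ 96.

```python
import math

# Summary statistics from the study description.
n_c, mean_c, sd_c = 50, 4.2, 1.8   # control (placebo)
n_t, mean_t, sd_t = 50, 3.5, 1.6   # treatment (varnish)

# Per-group variance of the sample mean.
var_c = sd_c**2 / n_c
var_t = sd_t**2 / n_t

# Unpooled standard error of the difference in means.
se = math.sqrt(var_c + var_t)

# Test statistic: treatment minus control, so a reduction gives a negative t.
t_stat = (mean_t - mean_c) / se

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (var_c + var_t) ** 2 / (var_c**2 / (n_c - 1) + var_t**2 / (n_t - 1))

print(f"SE = {se:.4f}, t = {t_stat:.3f}, df = {df:.1f}")
```

The formula yields a non-integer df slightly below 97, which is consistent with the rounded-down value of 96 used in the text.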

Step 3: Make a Decision

Critical value (one-tailed, α = 0.05): t_crit = −1.661
Our t = −2.055 < −1.661 (falls in rejection region)
p-value ≈ 0.021 < 0.05

Reject H₀

There is statistically significant evidence (p = 0.021) that the fluoride varnish reduces mean DMFT compared to placebo.
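
The one-tailed p-value and the critical value quoted above can be reproduced without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule. The helper names t_pdf and t_cdf are hypothetical, and the inputs (t = −2.055, df = 96) are the rounded values from the text.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry; steps must be even."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3  # integral of the density from 0 to |x|
    return 0.5 + half_area if x >= 0 else 0.5 - half_area


t_stat, df, alpha = -2.055, 96, 0.05

# Lower-tail p-value, matching H1: mu_treatment < mu_control.
p_value = t_cdf(t_stat, df)
print(f"one-tailed p = {p_value:.3f}")
print("reject H0" if p_value < alpha else "fail to reject H0")

# Sanity check: the stated critical value should leave about 5% in the lower tail.
print(f"P(T <= -1.661) = {t_cdf(-1.661, df):.3f}")
```

Comparing p with α and comparing t with the critical value are two views of the same rule, so both must lead to the same decision.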

Dental example

If you test whether fluoride varnish reduces mean DMFT versus control, choosing α = 0.05 caps the false-positive rate at 5%. Designing for high power (e.g., 0.80 to 0.90) reduces the chance of missing a real benefit because the sample is too small or the outcome too noisy.
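
To make the link between power and sample size concrete, here is a rough sketch using a normal approximation to the one-sided two-sample test. It assumes, purely for illustration, that the true effect equals the observed difference from the trial above (0.7 DMFT) and that the true SDs are 1.8 and 1.6. The function names are hypothetical, and a real design would take the effect size from prior evidence or clinical relevance.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal


def power_one_sided(delta, sd1, sd2, n_per_group, alpha=0.05):
    """Approximate power of a one-sided two-sample z-test with equal group sizes."""
    se = math.sqrt((sd1**2 + sd2**2) / n_per_group)
    return Z.cdf(delta / se - Z.inv_cdf(1 - alpha))


def n_per_group_for_power(delta, sd1, sd2, power=0.80, alpha=0.05):
    """Smallest equal group size reaching the target power (normal approximation)."""
    z_sum = Z.inv_cdf(1 - alpha) + Z.inv_cdf(power)
    return math.ceil(z_sum**2 * (sd1**2 + sd2**2) / delta**2)


# Assumed true effect and SDs, borrowed from the trial summary for illustration only.
achieved = power_one_sided(delta=0.7, sd1=1.8, sd2=1.6, n_per_group=50)
needed = n_per_group_for_power(delta=0.7, sd1=1.8, sd2=1.6, power=0.80)

print(f"approximate power with n = 50 per group: {achieved:.2f}")
print(f"approximate n per group for 80% power: {needed}")
```

Under these assumptions, 50 participants per group falls short of the 0.80 target, which shows how a study can still reach significance while having been underpowered for the effect it was looking for.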