Hypothesis Testing II

STAT 240 - Fall 2025

Robert Sholl

Difference between two means

Case: Independence

  • Assume the two samples come from independent “data-generating processes”

    • Simplest case = two separate populations

  • Always check whether each sample size satisfies \(n > 30\)

Case: Independence

\[ \begin{aligned} & H_0: \mu_1 = \mu_2 \\ \\ & H_a: \mu_1 < \mu_2 \Rightarrow \text{Left-tailed} \\ \\ & H_a: \mu_1 > \mu_2 \Rightarrow \text{Right-tailed} \\ \\ & H_a: \mu_1 \neq \mu_2 \Rightarrow \text{Two-tailed} \\ \end{aligned} \]

Case: Independence

\[ t^* = \frac{(\bar{x}_1 - \bar{x}_2)- (\mu_1 - \mu_2)}{\sqrt{(s_1^2/n_1) + (s_2^2/n_2)}} \]

  • Under \(H_0\): \(\mu_1 = \mu_2\), so \(\mu_1 - \mu_2 = 0\)

  • \(df = \text{Min}(n_1 - 1, n_2 - 1)\)

  • Ceteris paribus
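
The statistic above can be sketched directly from summary statistics. This Python helper is illustrative only (the course examples use R); it takes the sample variances \(s_1^2, s_2^2\) and applies the conservative degrees-of-freedom rule:

```python
from math import sqrt

def two_sample_t(xbar1, var1, n1, xbar2, var2, n2):
    """t* for H0: mu1 = mu2 with independent samples.

    var1, var2 are the sample variances s^2; uses the
    conservative df = min(n1 - 1, n2 - 1)."""
    se = sqrt(var1 / n1 + var2 / n2)   # standard error of xbar1 - xbar2
    t_star = (xbar1 - xbar2) / se      # (mu1 - mu2) = 0 under H0
    df = min(n1 - 1, n2 - 1)
    return t_star, df
```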

Practical Example

  • Population 1 is fundamentally different from Population 2

    • Pop. 1 should have a genuinely lower mean than Pop. 2

\[ \begin{aligned} H_0: \mu_1 = \mu_2\\ \\ H_a: \mu_1 < \mu_2 \\ \\ \alpha = 0.05 \end{aligned} \]

Practical Example

\[ t^* = \frac{(43.7 - 82.24)}{\sqrt{(220.731/13) + (145.743/29)}} \]

\[ t^* = -8.22 \]

\[ p = 8.3 \times 10^{-7} < \alpha \]
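
These numbers can be verified numerically. The sketch below assumes 220.731 and 145.743 are the sample variances \(s_1^2\) and \(s_2^2\) (consistent with the formula's \(s^2/n\) terms); `scipy` stands in for the R workflow used in class:

```python
from math import sqrt
from scipy.stats import t

# Summary statistics from the slide (assumed: 220.731, 145.743 are variances)
xbar1, var1, n1 = 43.7, 220.731, 13
xbar2, var2, n2 = 82.24, 145.743, 29

t_star = (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)
df = min(n1 - 1, n2 - 1)        # conservative df = 12
p = t.cdf(t_star, df)           # left-tailed, since H_a: mu1 < mu2

print(round(t_star, 2))         # -8.22, matching the slide
print(p < 0.05)                 # True -> reject H_0
```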

Case: Dependence

  • Dependence makes everything complicated

  • The observations come in pairs, so we work with the differences rather than two separate means

    • Simplest case = “before and after” comparisons

Case: Dependence

\[ \begin{aligned} & H_0: \mu_d = \mu_0 \\ \\ & H_a: \mu_d < \mu_0 \Rightarrow \text{Left-tailed} \\ \\ & H_a: \mu_d > \mu_0 \Rightarrow \text{Right-tailed} \\ \\ & H_a: \mu_d \neq \mu_0 \Rightarrow \text{Two-tailed} \\ \end{aligned} \]

Case: Dependence

\[ \begin{aligned} & \bar{d} = \frac{1}{n}\sum_{i=1}^n ({x_1}_i - {x_2}_i) \\ \\ & s_d = \sqrt{\frac{1}{n-1}\sum_{i=1}^n (d_i - \bar{d})^2} \\ \\ & t^* = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}} \end{aligned} \]
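
The three definitions above translate directly into code. A minimal Python sketch (hypothetical helper, not course code):

```python
from math import sqrt

def paired_t(x1, x2, mu0=0.0):
    """Paired-samples t statistic for H0: mu_d = mu0,
    where d_i = x1_i - x2_i. Returns (t*, df)."""
    n = len(x1)
    d = [a - b for a, b in zip(x1, x2)]
    dbar = sum(d) / n                                        # mean difference
    s_d = sqrt(sum((di - dbar) ** 2 for di in d) / (n - 1))  # sd of differences
    return (dbar - mu0) / (s_d / sqrt(n)), n - 1
```

The result agrees with `scipy.stats.ttest_rel`, which implements the same test.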

Practical Example

  • Each population should see an increase in score from Exam 1 to Exam 2

    • This would imply improvement in comprehension, a.k.a. I’m converting you into statisticians

\[ \begin{aligned} H_0: \mu_d = 0\\ \\ H_a: \mu_d > 0 \\ \\ \alpha = 0.05 \end{aligned} \]

Practical Example

\[ \begin{aligned} & \bar{d}_1 = -8.81\\ \\ & s_{d_1} = 19.734 \\ \\ & t^*_1 = \frac{-8.81}{19.734 / \sqrt{13}} = -1.61 \\ \\ & p = 0.93 \end{aligned} \]

Practical Example

\[ \begin{aligned} & \bar{d}_2 = 12.76\\ \\ & s_{d_2} = 10.4 \\ \\ & t^*_2 = \frac{12.76}{10.4 / \sqrt{29}} = 6.6\\ \\ & p = 1.85 \times 10^{-7} \end{aligned} \]
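
Both groups' statistics can be reproduced from the summary values above, using a right-tailed p-value since \(H_a: \mu_d > 0\). A sketch (again in Python rather than the R used in class):

```python
from math import sqrt
from scipy.stats import t

# (dbar, s_d, n) for each group, taken from the slides
groups = {"group 1": (-8.81, 19.734, 13), "group 2": (12.76, 10.4, 29)}

for name, (dbar, s_d, n) in groups.items():
    t_star = dbar / (s_d / sqrt(n))     # mu_0 = 0 under H_0
    p = 1 - t.cdf(t_star, n - 1)        # right-tailed p-value
    print(name, round(t_star, 2), p < 0.05)
```

Group 1's negative statistic puts nearly all the probability in the right tail (p ≈ 0.93, fail to reject), while group 2's large positive statistic rejects decisively.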

Bonus

  • I subscribe to the perspective that the DGP (data-generating process) is all-important

    • See: The Likelihood Principle

  • My preferred test would ask whether these two groups arose from the same DGP


    Exact two-sample Kolmogorov-Smirnov test

data:  dgp2$TWB and dgp1$TWB
D = 1, p-value = 7.837e-11
alternative hypothesis: two-sided
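
The output above comes from R's `ks.test`. The course data aren't reproduced here, but the same test is available in Python as `scipy.stats.ks_2samp`; a sketch with made-up, clearly separated samples (NOT the `dgp1`/`dgp2` data):

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
# Illustrative samples only -- two obviously different DGPs
a = rng.normal(loc=0, scale=1, size=30)
b = rng.normal(loc=10, scale=1, size=30)

res = ks_2samp(a, b)               # two-sided by default
print(res.statistic, res.pvalue)   # non-overlapping samples give D = 1
```

A D statistic of 1, as on the slide, means the two empirical CDFs never overlap: every observation in one sample is below every observation in the other.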

Conclusion and next steps

  • When we return:

    • Practice and wrap up NHST

    • Introduce TOS + N/P test

  • Over break:

    • Problem set + Question bank

Go away