Unit 3 Review

STAT 240 - Fall 2025

Robert Sholl

Confidence Intervals

Algorithm

  1. Compute or identify the point estimate

  2. Compute or identify standard deviation and sample size

    • Check CLT
  3. Compute \(t_{\alpha/2}\) or \(z_{\alpha/2}\) depending on CI type

  4. Compute margin of error

  5. Add the margin of error to, and subtract it from, the point estimate

Mean CI

\[ \begin{aligned} \text{Point Estimate} = \bar{x} \\ \\ \text{Margin of Error} = t_{\alpha/2}\frac{s}{\sqrt{n}} \text{ or } z_{\alpha/2}\frac{s}{\sqrt{n}} \\ \\ \text{CI} = \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} \text{ or } z_{\alpha/2}\frac{s}{\sqrt{n}} \end{aligned} \]
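The mean CI formula can be sketched in a few lines of Python. This is a minimal illustration, not course code: the function name `mean_ci` and its parameters are made up here, and the numbers in the usage lines are hypothetical. The standard library has no \(t\)-table, so the sketch uses \(z_{\alpha/2}\) unless a \(t_{\alpha/2}\) value read from a table is passed in as `t_crit`.

```python
import math
from statistics import NormalDist


def mean_ci(x_bar, s, n, confidence=0.95, t_crit=None):
    """Return (lower, upper) for a confidence interval around a sample mean.

    t_crit: optional t_{alpha/2} looked up in a t-table (assumed input).
    When it is omitted, z_{alpha/2} from the standard normal is used.
    """
    alpha = 1 - confidence
    if t_crit is None:
        critical = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    else:
        critical = t_crit
    margin = critical * s / math.sqrt(n)  # margin of error
    return x_bar - margin, x_bar + margin


# Hypothetical data: x_bar = 50, s = 8, n = 64, 95% confidence
lower, upper = mean_ci(50, 8, 64)
print(round(lower, 2), round(upper, 2))
```

Passing `t_crit=2.0` with the same inputs gives a margin of exactly \(2.0 \cdot 8/\sqrt{64} = 2\), which makes the arithmetic easy to check by hand.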

Proportion CI

\[ \begin{aligned} \text{Point Estimate} = \hat{p} \\ \\ \text{Margin of Error} = z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\ \\ \text{CI} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \end{aligned} \]
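The proportion CI follows the same pattern. In the sketch below, `proportion_ci` and the parameter `p_se` are hypothetical names: `p_se` is the value that goes under the square root (the \(p\) in the formula above). Falling back to \(\hat{p}\) when no such value is supplied is an assumption of this sketch, and the numbers are made up.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence=0.95, p_se=None):
    """Return (lower, upper) for a confidence interval around a proportion.

    p_se: the proportion used inside the standard error (assumed input).
    If it is not given, p_hat is plugged in instead (an assumption).
    """
    if p_se is None:
        p_se = p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)        # z_{alpha/2}
    margin = z * math.sqrt(p_se * (1 - p_se) / n)  # margin of error
    return p_hat - margin, p_hat + margin


# Hypothetical data: p_hat = 0.40 from n = 150 observations
lower, upper = proportion_ci(0.40, 150)
print(round(lower, 4), round(upper, 4))
```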

Practice 1

The lower bound of a certain \(95\%\) confidence interval is equal to \(10.78\), with a sample standard deviation of \(s=16.34\) and a sample size of \(n=40\). Calculate the upper bound of the interval.

Practice 2

Given a sample proportion of \(\hat{p} = 0.129\), a sample size of \(n=984\), and a population proportion estimate of \(p = 0.118\), write out the full formula for calculating the confidence interval of this proportion using all of the numeric values. You do not have to finish the calculation.

Null Hypothesis Significance Testing

Vocab

Null Hypothesis \(H_0\): the statement we are holding as known and established information

  • e.g., The average body weight of an adult cat is \(10\) lbs.

\[H_0:\mu=10\]

Alternate Hypothesis \(H_a\) or \(H_1\): the statement whose accuracy we are testing

Practical Example

I believe that the cats I interact with regularly have a different average body weight than the population

\[H_0:\mu=10\]

\[H_a:\mu \neq 10\]

Practical Example

  • Test Statistic \(t^*\)

A value calculated as part of the hypothesis testing process. We look it up in a \(t\)-table (or a \(z\)-table, depending on the test) to get a \(p\)-value.

\[t^* = \frac{\bar{x} - \mu_0}{{s}/{\sqrt{n}}}\]

Practical Example

  • I weighed \(4\) of my friends' cats and my own cat and found that their average body weight was \(8\) pounds, with a standard deviation of \(2.49\)

\[t^* = \frac{8 - 10}{{2.49}/{\sqrt{5}}}\]

\[t^*=-1.796039\]

Practical Example

Significance level (\(\alpha\)): The probability of Type 1 Error (rejecting a true null hypothesis) that we are willing to incur in our hypothesis testing process

  • I want to test my cat weight hypothesis at \(\alpha=0.05\)

Practical Example

p-value: The final statistic calculated in a hypothesis test, used to determine if we reject or fail to reject the null hypothesis

\[2 \cdot P(T>|t^*|) \approx 0.15\]

\[0.15>\alpha \quad \text{Fail to Reject} \ H_0\]
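The cat example can be traced end to end in a short Python sketch. The standard library has no \(t\)-distribution, so the tail area is approximated here by Simpson's rule on the \(t\) density, where a table or statistical software would normally be used. The names `t_pdf`, `t_upper_tail`, and `one_sample_t` are illustrative, not course functions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 with Simpson's rule on [0, t]."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the mass lies above 0; subtract the area between 0 and t.
    return 0.5 - total * h / 3


def one_sample_t(x_bar, mu_0, s, n):
    """Test statistic t* = (x_bar - mu_0) / (s / sqrt(n))."""
    return (x_bar - mu_0) / (s / math.sqrt(n))


# Values from the cat example: x_bar = 8, mu_0 = 10, s = 2.49, n = 5
n = 5
t_star = one_sample_t(8, 10, 2.49, n)
p_value = 2 * t_upper_tail(abs(t_star), df=n - 1)  # two-sided test

alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(round(t_star, 3), round(p_value, 3), decision)
```

Taking the absolute value of \(t^*\) before computing the tail area matters here because \(t^*\) is negative and the test is two-sided.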

Statistical Significance

We refer to a result as statistically significant if we tested it against a null hypothesis and proceeded to reject the null hypothesis

“There is insufficient evidence to suggest that the cats I interact with regularly have a statistically significant difference in average body weight from the population”

But remember

  • p-values aren’t law

  • Statistical significance isn’t all important

    • scientific significance is
  • If one of my students throws away scientifically relevant results because of \(p = 0.051\)

Phoebe will be upset

Practice 1

Identify whether the following statements are true or false.

  1. If \(P = 0.03\), the result is statistically significant at the \(\alpha = 0.05\) level

  2. If \(P = 0.03\), the null hypothesis is rejected at the \(\alpha = 0.05\) level

  3. If \(P = 0.03\), the result is statistically significant at the \(\alpha = 0.01\) level

  4. If \(P = 0.03\), the null hypothesis is rejected at the \(\alpha = 0.01\) level

Practice 2

The average uptake of oxygen in the general adult population is 38.2 ml/kg. A sample of 40 joggers gave a sample mean of 40.5 ml/kg with a standard deviation of 6.0 ml/kg for oxygen uptake. A physician would like to know whether or not joggers have a significantly higher average oxygen uptake than the general population.

Practice 2 (Still)

Select the correct interval for the P-value.

  1. P-value \(\leq0.01\)
  2. \(0.01<\) P-value \(\leq0.025\)
  3. \(0.025<\) P-value \(\leq0.05\)
  4. P-value \(>0.05\)

Two mean NHST

Practical Example

\[ \begin{array}{|c|c|c|c|c|} \hline \text{Automobile} & 1 & 2 & 3 & 4\\ \hline \text{After Tune-up} & 35.44 & 35.17 & 31.07 & 31.57 \\ \hline \text{Before Tune-up} & 33.76 & 34.30 & 29.55 & 30.90 \\ \hline \text{Automobile} & 5 & 6 & 7 & 8 \\ \hline \text{After Tune-up} & 26.48 & 23.11 & 25.18 & 32.39 \\ \hline \text{Before Tune-up} & 24.92 & 21.78 & 24.30 & 31.25 \\ \hline \end{array} \]

Practical Example

\[H_0: \mu_d - \mu_0 = 0\]

\[H_a: \mu_d - \mu_0 \neq 0\]

Practical Example

\[ \begin{aligned} \bar{d} = & \sum_{i=1}^n \frac{x_1{_i}-x_2{_i}}{n}\\ \bar{d} = & \ \frac{1.68 + 0.87 + \ldots + 1.14}{8} \approx 1.2063 \\ \end{aligned} \]

\[ \begin{aligned} s_d = & \ \sqrt{\sum_{i=1}^n\frac{(d_i - \bar{d})^2}{n-1}}\\ s_d = & \ \sqrt{\frac{(1.68 - 1.206)^2 + \ldots + (1.14 - 1.206)^2}{7}} \approx 0.3732 \end{aligned} \]
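The two summary statistics above can be recomputed directly from the tune-up table. This is a small sketch using Python's `statistics` module; the variable names are illustrative.

```python
import statistics

# Tune-up table values (automobiles 1 through 8)
after = [35.44, 35.17, 31.07, 31.57, 26.48, 23.11, 25.18, 32.39]
before = [33.76, 34.30, 29.55, 30.90, 24.92, 21.78, 24.30, 31.25]

# Paired differences d_i = after_i - before_i
diffs = [a - b for a, b in zip(after, before)]

d_bar = statistics.mean(diffs)  # mean of the differences
s_d = statistics.stdev(diffs)   # sample standard deviation (divides by n - 1)

print(round(d_bar, 4), round(s_d, 4))
```

`statistics.stdev` uses the \(n-1\) denominator, which matches the \(s_d\) formula; `statistics.pstdev` would divide by \(n\) instead.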

Practical Example

\[ \begin{aligned} t^* = & \ \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}} \\ t^* = & \ \frac{1.2063 - 0}{0.3732 / \sqrt{8}} \approx 9.142 \\ p < & \ 0.001 \end{aligned} \]

Fisher’s Exact

Algorithm

  1. Define null hypothesis

    • Natural language
  2. Set up contingency table

  3. Perform test

Very straightforward

Fisher’s assumptions

  • No reject/fail to reject/accept conditions

    • Significant or not significant
  • \(p\)-values are interpreted as the probability of observing results at least as extreme as ours if the null is true

    • They can show magnitude
  • Data should be counts; the test assumes a hypergeometric distribution is appropriate
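To make the hypergeometric assumption concrete, here is a minimal sketch of a two-sided Fisher's exact test on a \(2 \times 2\) table of counts. The function name `fisher_exact_two_sided` and the example table are hypothetical. The two-sided rule used here (summing the probabilities of every table that is no more likely than the observed one) is one common convention and may not be the one used in class.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    With the row and column totals held fixed, the top-left count
    follows a hypergeometric distribution under the null hypothesis.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(k):
        # P(top-left cell = k) given the fixed margins
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    p_observed = prob(a)
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    # Sum over all tables at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(k)
        for k in range(k_min, k_max + 1)
        if prob(k) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: rows are groups, columns are outcomes
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```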

Odds Ratios

  • Describe the strength of association between two events

  • Built from contingency table

  • “(Group) has (Odds ratio) times the odds of (Effect) compared to (Inverse group)”

\[ OR = \frac{a / b}{c / d} \]
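As a quick illustration of the formula, the sketch below computes an odds ratio from four cell counts. It assumes the contingency table is laid out with \(a, b\) in the first group's row and \(c, d\) in the second group's row; the function name and the counts are hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a / b) / (c / d) for the table [[a, b], [c, d]].

    a, b: counts with and without the effect in the first group
    c, d: counts with and without the effect in the second group
    """
    return (a / b) / (c / d)


# Hypothetical counts: 20 of 100 in group 1 show the effect, 10 of 100 in group 2
result = odds_ratio(20, 80, 10, 90)
print(round(result, 2))
```

Here the odds are \(20/80 = 0.25\) in the first group and \(10/90 \approx 0.111\) in the second, so the first group has \(2.25\) times the odds of the effect.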

Neyman-Pearson

Algorithm

  1. Establish effect size

  2. Identify the optimal (uniformly most powerful) test

  3. Propose the main and alternate hypotheses

  4. Determine rejection region (\(\alpha\))

  5. Calculate sample size to achieve good power

  6. Perform test and compute critical value
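Step 5 can be sketched for one simple case. The example below assumes a one-sided \(z\)-test on a mean with known standard deviation, where the usual sample size formula is \(n = \left((z_{\alpha} + z_{\beta})\,\sigma/\delta\right)^2\). The function name, the \(80\%\) power target, and the numbers are all assumptions made for illustration, not values from the course.

```python
import math
from statistics import NormalDist


def sample_size_one_sided_z(delta, sigma, alpha=0.05, power=0.80):
    """Smallest n for a one-sided z-test to detect a shift of delta.

    delta: effect size fixed a priori (difference in means)
    sigma: population standard deviation, assumed known
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)  # sets the rejection region
    z_beta = std_normal.inv_cdf(power)       # sets the target power
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Hypothetical design: detect a shift of 5 units when sigma = 10
n_required = sample_size_one_sided_z(delta=5, sigma=10)
print(n_required)
```

Every input is chosen before any data are observed, which matches the a priori requirement of the Neyman-Pearson approach.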

N-P Assumptions

  • a priori: everything is set prior to observing data

    • Ideally prior to performing the experiment
  • Main and alternate represent two separate populations

    • To reject one is to accept the other
  • p-values are just another representation of critical value

    • They aren’t appropriate measurements of effect magnitude

Difference Table

\[ \begin{array}{|c|c|c|c|} \hline \text{Method} & \text{Hypotheses} & \text{Test} & \text{Results} \\ \hline \text{NHST} & H_0 \text{ & } H_a & \text{Convention} & \text{Reject or fail}\\ \hline \text{Fisher} & H_0 & \text{Convention} & \text{Significant or not}\\ \hline \text{Neyman-Pearson} & H_M \text{ & } H_A& \text{UMP} & \text{Accept/Reject} \\ \hline \end{array} \]

Questions?

Practice

Go Away