Hypothesis Testing I

STAT 240 - Fall 2025

Robert Sholl

Review

Confidence Intervals for Proportions

Point estimate for population proportions

\[\hat{p}\]

MOE (Margin of Error)

\[ z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]

Confidence Intervals for Proportions

CLT for Proportions

\[ n\hat{p} \geq 10 \quad \text{and} \quad n(1 - \hat{p}) \geq 10 \]

CI for population proportions

\[ \hat{p} \pm \underbrace{z_{\alpha/2} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}}_{\text{Margin of Error}} \]

Confidence Interval Algorithm

Find your point estimate
Determine your confidence level
Compute the margin of error
Construct the confidence interval via \(\text{PE} \pm \text{MOE}\)
Interpret the interval in terms of the original research question
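The algorithm above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `proportion_ci` and the sample counts (120 successes out of 200) are hypothetical and not taken from the course material.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (point_estimate, margin_of_error, lower, upper) for a proportion."""
    p_hat = successes / n                        # Step 1: point estimate
    alpha = 1 - confidence                       # Step 2: confidence level -> alpha
    # CLT check for proportions: n*p_hat >= 10 and n*(1 - p_hat) >= 10
    if n * p_hat < 10 or n * (1 - p_hat) < 10:
        raise ValueError("CLT conditions for proportions are not met")
    z = NormalDist().inv_cdf(1 - alpha / 2)      # critical value z_{alpha/2}
    moe = z * sqrt(p_hat * (1 - p_hat) / n)      # Step 3: margin of error
    return p_hat, moe, p_hat - moe, p_hat + moe  # Step 4: PE +/- MOE


# Hypothetical data: 120 successes in a sample of 200
p_hat, moe, lower, upper = proportion_ci(120, 200, confidence=0.95)
print(f"p_hat = {p_hat:.3f}, MOE = {moe:.3f}, CI = ({lower:.3f}, {upper:.3f})")
```

The last step of the algorithm, interpretation, stays with the analyst: the code only produces the interval.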

Word Problems

What do I have?

What relevant variables/information does the problem provide me outright?

What am I looking for?

What is the problem asking me to provide as a final answer?

Word Problems

What tools do I have to find what I’m looking for?

What equations do I know that have the final answer as a component of them?

What do I need to use the tools I have?

Of the relevant variables/information that I have, which ones fit into the equations I’m looking at? Am I missing any information? Do I have excess information?

Word Problems

Does it all make sense?

Does my final answer match the context of the original problem? If not, what would need to change for it to make sense?

In Practice

The daily sales of a local coffee shop are normally distributed with a population mean of \(\mu = 300\) dollars and a standard deviation of \(\sigma = 75\) dollars. If a random sample of \(n = 49\) days is taken:

What is the probability that we will observe a sample mean over \(312\)?

What do I have?

\(\mu=300\) dollars, \(\sigma=75\) dollars, \(n=49\) days
Population is normally distributed

In Practice

What am I looking for?

\[P(\bar{x}>312)\]

In Practice

What tools do I have to find what I’m looking for?

\(z\)-score formula for a sample mean:

\[P(\bar{X}>\bar{x})=P\left(Z>{\bar{x}-\mu \over \sigma/\sqrt{n}}\right)=P(Z>z)\]

\(z\)-table
Empirical rule

In Practice

What do I need to use the tools I have?

\(\bar{x}\) or some single value for the sample mean
\(\mu\) or some mean
\(\sigma\) or some standard deviation
\(n\) or some sample size
Normally distributed data

In Practice

These all fit very well into our z-score formula. Because we are asking about a sample mean, the spread is the standard error \(\sigma/\sqrt{n}\), not \(\sigma\)

\[P(\bar{x}>312)=P\left(Z>{312-300 \over 75/\sqrt{49}}\right)=P(Z>1.12)\]

\[P(Z>1.12)=1-P(Z<1.12)=1-0.8686=0.1314\]

In Practice

Does it all make sense?

In Practice

Let’s think about this

\[\approx 68\% \text{ of sample means} \Rightarrow \pm 1 \ \text{standard error}\]

\[\approx 16\% \text{ of sample means} \Rightarrow \text{above } +1 \ \text{standard error}\]

\[312 = \mu + 1.12 \ \text{standard errors}, \quad \text{where one standard error is } 75/\sqrt{49} \approx 10.71\]

Since \(312\) sits a little beyond one standard error above the mean, we should expect our value to be something a little less than \(16\%\)
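As a quick numerical check of the coffee shop problem, the sketch below uses `statistics.NormalDist` from the Python standard library. Because the question asks about a sample mean of \(n = 49\) days, it models the sample mean with standard deviation \(\sigma/\sqrt{n}\). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 300, 75, 49          # values given in the problem
standard_error = sigma / sqrt(n)    # spread of the sample mean: 75 / 7

# Sampling distribution of the sample mean under the stated assumptions
x_bar_dist = NormalDist(mu=mu, sigma=standard_error)

z = (312 - mu) / standard_error
p_over_312 = 1 - x_bar_dist.cdf(312)
print(f"z = {z:.2f}, P(sample mean > 312) = {p_over_312:.4f}")
```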

Null Hypothesis Significance Testing

Hypotheses

In hypothesis testing, there are two competing statements about population parameters:

\[H_0\equiv \text{null hypothesis} \quad \text{vs} \quad H_1 \equiv \text{alternate hypothesis}\]

Null Hypotheses

The null hypothesis, \(H_0\), states that the parameter is equal to a specific value

\[H_0 : \mu = 35\]

Alternate Hypotheses

The alternate hypothesis, \(H_1\), states that the value of the parameter differs from the value specified by the null hypothesis

\[H_1 : \mu < 35\]

\[H_1 : \mu > 35\]

\[H_1 : \mu \neq 35\]

Alternate Hypotheses

There are three types of alternate hypothesis

Consider \(H_0 : \mu = 35\)

\(H_1 : \mu < 35\) \(\Rightarrow\) called left-tailed alternate hypothesis
\(H_1 : \mu > 35\) \(\Rightarrow\) called right-tailed alternate hypothesis
\(H_1 : \mu \neq 35\) \(\Rightarrow\) called two-tailed alternate hypothesis

Hypotheses

Left-tailed and right-tailed hypotheses are called one-tailed hypotheses
A null hypothesis is generally thought of as a default state of nature (e.g. existing knowledge)
An alternate hypothesis, on the other hand, contradicts the default state (e.g. new knowledge)
In most cases, whatever we wish to establish is placed in the alternate hypothesis

Decisions

After developing \(H_0\) and \(H_1\), we collect a set of data

Based on the data, we construct a test statistic to reach one of the following decisions:
- Reject \(H_0\)
- Fail to reject \(H_0\)

Decisions

If we reject \(H_0\)

We conclude that the data provide enough evidence to support \(H_1\)

If we fail to reject \(H_0\)

We conclude that the data do not provide enough evidence to reject \(H_0\)

Errors

Type I error: \(H_0\) is true in reality, but we reject \(H_0\)

Type II error: \(H_1\) is true in reality, but we do not reject \(H_0\)

\[ \begin{array}{|c|c|c|} \hline \text{Decision} & H_0 \ \text{True} & H_0 \ \text{False} \\ \hline \text{Reject} \ H_0 & \text{Type I error} & \text{Correct decision} \\ \hline \text{Don’t reject} \ H_0 & \text{Correct decision} & \text{Type II error} \\ \hline \end{array} \]

Errors

The probability of making a Type I error is denoted by \(\alpha\)

The probability of making a Type II error is denoted by \(\beta\)
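One way to build intuition for \(\alpha\) is a small simulation: generate many samples for which \(H_0\) really is true, test each one, and count how often \(H_0\) gets rejected. The sketch below is a toy setup, not part of the course material. It assumes a normal population with known \(\sigma = 1\) so that a \(z\) statistic can be used, and the sample size, number of trials, and seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(alpha=0.05, n=30, trials=10000, seed=240):
    """Estimate how often a true H0: mu = 0 is rejected by a two-tailed z-test."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the data really come from a normal with mean 0, sd 1
        sample = [rng.gauss(0, 1) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - 0) / (1 / sqrt(n))          # sigma = 1 is known in this toy setup
        p_value = 2 * std_normal.cdf(-abs(z))    # two-tailed P-value
        if p_value <= alpha:
            rejections += 1                      # rejecting a true H0 is a Type I error
    return rejections / trials


rate = simulate_type_i_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials, the estimated rate should settle near the chosen \(\alpha\).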

NHST for Population Means

The Central Limit Theorem must hold for us to proceed

State the null and alternate hypotheses
Choose a significance level \(\alpha\) (the allowed probability of a Type I error)
Compute the test statistic:

\[t = \frac{\bar{x} - \mu_0}{{s}/{\sqrt{n}}}\]

NHST for Population Means

Since \(\sigma\) is unknown, we replace it with the sample standard deviation \(s\)

We use the \(t\) statistic, which comes from the \(t\) distribution with \(\text{df} = n - 1\)

NHST for Population Means

Compute the P-value of the test statistic \(t\)

Left-tailed test: \(P\)-value = area under the \(t\) distribution to the left of \(t\), i.e., \(P(T < t)\)

Right-tailed test: \(P\)-value = area under the \(t\) distribution to the right of \(t\), i.e., \(P(T > t)\)

Two-tailed test: \(P\)-value = sum of the areas under the \(t\) distribution to the left of \(-|t|\) and right of \(|t|\), i.e., \(2 \cdot P(T < -|t|)\)

NHST for Population Means

Determine whether to reject \(H_0\)

Reject \(H_0\) if \(P\)-value \(\leq \alpha\)
Do not reject \(H_0\) if \(P\)-value \(> \alpha\)

State a conclusion

In Practice

In a recent medical study, 76 subjects were placed on a low-fat diet. After 12 months, their sample mean weight loss was \(\bar{x} = 2.2\) kilograms, with a sample standard deviation of \(s = 6.1\) kilograms. Use the \(\alpha = 0.05\) level of significance to test the claim that the mean weight loss is greater than 0.

In Practice

Step 1: State the null and alternate hypotheses

\[ H_0 : \mu = 0 \\ H_1 : \mu > 0 \\ (\text{right-tailed test}) \]

In Practice

Step 2: Choose a significance level \(\alpha\)

\[ \alpha = 0.05 \]

In Practice

Step 3: Compute the value of the test statistic

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{2.2 - 0}{6.1 / \sqrt{76}} \approx 3.144 \]

In Practice

Step 4: Use the t-table to compute the P-value

Since \(H_1 : \mu > 0\) is right-tailed, the P-value is \(P(T > 3.144)\)
The degrees of freedom are \(df = n - 1 = 75\), which does not appear in the t-table, so we round down to the nearest value that does appear, \(df = 60\)
In the t-table with \(df = 60\), we find that \(P(T > 3.144)\) is between \(P(T > 3.232)\) and \(P(T > 2.915)\), so the P-value is between \(0.001\) and \(0.0025\)

In Practice

Step 5: Determine whether to reject \(H_0\)

Our P-value is between \(0.001\) and \(0.0025\)
Since the P-value is less than \(\alpha = 0.05\), we reject \(H_0\)

In Practice

Step 6: State your conclusion

We conclude that the mean weight loss of people who were placed on a low-fat diet for \(12\) months is greater than \(0\)
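The six steps can also be checked numerically. The Python standard library has no \(t\) distribution, so the sketch below approximates the tail area by integrating the \(t\) density with Simpson's rule. The helper names (`t_pdf`, `t_upper_tail`, `one_sample_t_test`) are illustrative, and this is one possible approach rather than a reference implementation.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) using Simpson's rule on the density over [0, |t|]."""
    a = abs(t)
    h = a / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3            # P(0 < T < |t|)
    upper = 0.5 - central              # P(T > |t|) by symmetry
    return upper if t >= 0 else 1 - upper


def one_sample_t_test(x_bar, s, n, mu_0, tail):
    """Return (t, df, p_value); tail is 'left', 'right', or 'two'."""
    t = (x_bar - mu_0) / (s / sqrt(n))
    df = n - 1
    if tail == "right":
        p = t_upper_tail(t, df)
    elif tail == "left":
        p = t_upper_tail(-t, df)       # P(T < t) = P(T > -t) by symmetry
    elif tail == "two":
        p = 2 * t_upper_tail(abs(t), df)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return t, df, p


# Low-fat diet example: x_bar = 2.2, s = 6.1, n = 76, H1: mu > 0
t, df, p_value = one_sample_t_test(2.2, 6.1, 76, mu_0=0, tail="right")
print(f"t = {t:.3f}, df = {df}, P-value = {p_value:.4f}")
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```

Because this uses \(df = 75\) directly instead of rounding down to \(df = 60\), the computed P-value will not match a table entry exactly, but it should fall inside the same bracket of \(0.001\) to \(0.0025\).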

Your Turn

A type of steel used by a manufacturing company is supposed to have an average hardness of 62 on the Rockwell hardness index. If the steel is too hard or too soft, defects can appear in the final product. A random sample of 10 specimens from a new steel supplier had a mean hardness of 64 with a standard deviation of 4. Test at the 5% significance level whether the mean hardness of the new supplier’s steel is different from the desired hardness of 62. (Assume that the population is normally distributed).

Solution

Step 1: State the null and alternate hypotheses

\[ H_0 : \mu = 62 \\ H_1 : \mu \neq 62 \\ (\text{two-tailed test}) \]

Solution

Step 2: Choose a significance level

\[ \alpha = 0.05 \]

Solution

Step 3: Compute the test statistic

Given:

Sample mean \(\bar{x} = 64\), Hypothesized population mean \(\mu_0 = 62\)
Sample standard deviation \(s = 4\), Sample size \(n = 10\)

\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{64 - 62}{4 / \sqrt{10}} \approx 1.581 \]

Solution

Steps 4 & 5: Determine the P-value (two-tailed test) and whether to reject \(H_0\)

Using the t-table with \(df = 9\), we find:

\(P(T > 1.581)\) is between \(P(T > 1.833)\) and \(P(T > 1.383)\), so \(P(T > 1.581)\) is between \(0.05\) and \(0.10\)

\[ \text{P-value} = 2 \cdot P(T > 1.581), \quad \text{which is between } 0.10 \text{ and } 0.20 \]

Solution

Since the P-value is greater than \(\alpha = 0.05\), we fail to reject \(H_0\)

Step 6: State the conclusion

There is not enough evidence to conclude that the mean hardness of the new supplier’s steel is different from 62
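The table lookup used in this solution can be mimicked in code. The sketch below hard-codes a few entries from the \(df = 9\) row of a standard t-table (upper-tail area mapped to critical value) and brackets the tail area the same way we did by hand. The dictionary and function names are illustrative, and only part of the row is included.

```python
from math import sqrt

# A few entries from the df = 9 row of a standard t-table:
# upper-tail area -> critical value
T_ROW_DF9 = {0.10: 1.383, 0.05: 1.833, 0.025: 2.262, 0.01: 2.821, 0.005: 3.250}


def bracket_upper_tail(t_abs, row):
    """Return (low, high) bounds on P(T > t_abs) from one t-table row."""
    low, high = 0.0, 0.5   # starting bounds for a non-negative statistic
    for area, critical in row.items():
        if t_abs >= critical:
            high = min(high, area)   # statistic is at or beyond this critical value
        else:
            low = max(low, area)     # statistic falls short of this critical value
    return low, high


# Steel hardness example: x_bar = 64, mu_0 = 62, s = 4, n = 10
t = (64 - 62) / (4 / sqrt(10))
low, high = bracket_upper_tail(abs(t), T_ROW_DF9)
two_tailed = (2 * low, 2 * high)     # two-tailed test doubles the one-tail area
print(f"t = {t:.3f}")
print(f"One-tail area is between {low} and {high}")
print(f"Two-tailed P-value is between {two_tailed[0]} and {two_tailed[1]}")
```

Since the whole bracket lies above \(\alpha = 0.05\), the decision to fail to reject \(H_0\) does not depend on knowing the exact P-value.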

NHST for Proportions

Now, we want to test a hypothesis for population proportion \(p\)

We still need the Central Limit Theorem for proportions to hold:

\[np_0 \geq 10 \quad \text{and} \quad n(1 - p_0) \geq 10\]

Where \(p_0\) is the population proportion specified by \(H_0\)

NHST for Proportions

Step 1: State the null and alternate hypotheses

The null hypothesis is of the form:

\[ H_0 : p = p_0 \]

The alternate hypothesis is in one of the three forms:

Left-tailed: \(H_1 : p < p_0\)
Right-tailed: \(H_1 : p > p_0\)
Two-tailed: \(H_1 : p \neq p_0\)

NHST for Proportions

Step 2: Choose a significance level \(\alpha\)

Step 3: Compute the test statistic:

\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} \]

NHST for Proportions

Step 4: Compute the P-value of the test statistic \(z\)

Left-tailed: P-value = area under the standard normal distribution to the left of \(z\)

i.e., \(P(Z < z)\)

Right-tailed: P-value = area under the standard normal distribution to the right of \(z\)

i.e., \(P(Z > z)\)

Two-tailed: P-value = sum of the areas under the standard normal distribution to the left of \(-|z|\) and right of \(|z|\)

i.e., \(2 \cdot P(Z < -|z|)\)

NHST for Proportions

Step 5: Determine whether to reject \(H_0\):

Reject \(H_0\) if P-value \(\leq \alpha\)

Do not reject \(H_0\) if P-value \(> \alpha\)

Step 6: State a conclusion

In Practice

Suppose that 67% of all auto damage insurance claims in the US are made by singles under 25 years old. Also suppose that in a random sample of 53 auto damage claims in Manhattan, KS, there were 42 made by singles under 25.

Test at the 5% significance level whether the proportion of auto damage claims made by singles under 25 in Manhattan is different from the proportion for the entire US.

In Practice

State the null and alternate hypotheses

\[ H_0 : p = 0.67 \\ H_1 : p \neq 0.67 \\ \quad (\text{two-tailed test}) \]

In Practice

Compute the value of the test statistic

Given:

Sample proportion \(\hat{p} = \frac{42}{53} \approx 0.7925\), Population proportion \(p_0 = 0.67\)

The test statistic is:

\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.7925 - 0.67}{\sqrt{\frac{0.67(1 - 0.67)}{53}}} \approx 1.90 \]

In Practice

Determine whether to reject \(H_0\)

Using the z-table for a two-tailed test:

\[ \text{P-value} = 2 \cdot P(Z < -1.90) = 2(0.0287) = 0.0574 \]

Since the P-value \(> \alpha (= 0.05)\), we fail to reject \(H_0\)

In Practice

State your conclusion

There is not enough evidence to conclude that the proportion of auto damage claims made in Manhattan by singles under 25 is different from the national proportion
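The same test can be reproduced with `statistics.NormalDist` from the Python standard library. This is a minimal sketch; the function name `one_proportion_z_test` and its `tail` argument are illustrative choices, not part of any library.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes, n, p_0, tail):
    """Return (z, p_value); tail is 'left', 'right', or 'two'."""
    # The CLT check uses the hypothesized proportion p_0
    if n * p_0 < 10 or n * (1 - p_0) < 10:
        raise ValueError("CLT conditions for proportions are not met")
    p_hat = successes / n
    z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
    std_normal = NormalDist()
    if tail == "left":
        p = std_normal.cdf(z)                # P(Z < z)
    elif tail == "right":
        p = 1 - std_normal.cdf(z)            # P(Z > z)
    elif tail == "two":
        p = 2 * std_normal.cdf(-abs(z))      # 2 * P(Z < -|z|)
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p


# Auto damage claims example: 42 of 53 claims, H0: p = 0.67, two-tailed
z, p_value = one_proportion_z_test(42, 53, p_0=0.67, tail="two")
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print("Reject H0" if p_value <= 0.05 else "Fail to reject H0")
```

The code keeps \(z\) unrounded, so its P-value differs slightly from the table-based \(0.0574\), but it stays above \(\alpha = 0.05\) and the decision is the same.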

Your Turn

An educational technology specialist is studying attitudes of teachers about the use of virtual reality in the classroom. She samples 500 teachers and finds that 471 of them believe that virtual reality would have a positive effect. Can she conclude that the proportion of teachers who believe that virtual reality would have a positive effect is greater than 0.90? Use the \(\alpha = 0.05\) level of significance.

Solution

Step 1: State the null and alternate hypotheses

\[ H_0 : p = 0.90 \\ H_1 : p > 0.90 \\ \quad (\text{right-tailed test}) \]

Solution

Step 2: Choose a significance level

\[ \alpha = 0.05 \]

Solution

Step 3: Compute the test statistic

Given:

Sample proportion \(\hat{p} = \frac{471}{500} = 0.942\), Hypothesized population proportion \(p_0 = 0.90\)

The test statistic is:

\[ z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.942 - 0.90}{\sqrt{\frac{0.90(1 - 0.90)}{500}}} \approx 3.13 \]

Solution

Steps 4 & 5: Determine the P-value and whether to reject \(H_0\)

\[ \text{P-value} = P(Z > 3.13) = 0.0009 \]

Since the P-value \(< \alpha\), we reject \(H_0\)

Solution

Step 6: State the conclusion

We conclude that more than 90% of teachers believe that virtual reality would have a positive effect on education
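A short numerical check of this solution, again using only the Python standard library, is sketched below. It includes the CLT check with the hypothesized proportion, which the written solution takes for granted. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p_0, alpha = 500, 471, 0.90, 0.05

# CLT check with the hypothesized proportion: both counts must be at least 10
clt_ok = n * p_0 >= 10 and n * (1 - p_0) >= 10

p_hat = successes / n
z = (p_hat - p_0) / sqrt(p_0 * (1 - p_0) / n)
p_value = 1 - NormalDist().cdf(z)            # right-tailed: P(Z > z)
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

print(f"CLT conditions met: {clt_ok}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}, decision: {decision}")
```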
