STATISTICS · HL

Hypothesis Testing

Testing claims against the evidence.

Section 1 of 6

Margin of error

A sample almost never gives the population value exactly. The margin of error tells us how far the sample result is likely to be from the true value (at $95\%$ confidence).

Let $n$ = sample size.

Margin of error

$\text{Margin of error} = \dfrac{1}{\sqrt{n}}$

(i) Worked example — $n = 1001$

Find the margin of error if the sample size is $1001$.

$\dfrac{1}{\sqrt{1001}} \approx 0.032$

$= 3.2\%$

(ii) Worked example — $n = 97$

Find the margin of error if the sample size is $97$.

$\dfrac{1}{\sqrt{97}} \approx 0.10$

$= 10\%$

Small samples → big margin of error. Big samples → small margin of error.

(iii) Working backwards

What sample size $n$ will give a margin of error of $2\%$?

$\dfrac{1}{\sqrt{n}} = 0.02$

$\dfrac{1}{0.02} = \sqrt{n}$

$n = \dfrac{1}{0.0004}$

$n = 2500$
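The quick rule and its inverse can be sketched in Python. This is a minimal sketch: the function names `quick_margin_of_error` and `sample_size_for` are made up for illustration, and margins are written as decimals ($0.02$ means $2\%$).

```python
import math

def quick_margin_of_error(n: int) -> float:
    """Quick-rule margin of error 1/sqrt(n), as a decimal (0.05 means 5%)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1 / math.sqrt(n)

def sample_size_for(margin: float) -> int:
    """Smallest whole n whose quick-rule margin of error is at most `margin`."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    # Round first so tiny floating-point noise does not push 2500.0000000001 up to 2501.
    return math.ceil(round(1 / margin ** 2, 9))

print(quick_margin_of_error(400))  # 0.05
print(sample_size_for(0.02))       # 2500
```

Rounding up in `sample_size_for` is a design choice: a sample size must be a whole number, and rounding down would give a margin slightly larger than the target.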

YOU TRY · 1

A sample of $400$ people is taken. What is the margin of error?

Use the quick rule: $\dfrac{1}{\sqrt{n}}$.

$\dfrac{1}{\sqrt{400}} = \dfrac{1}{20} = 0.05$

$5\%$

Section 2 of 6

Confidence interval — proportion

$\hat{p}$ = sample proportion (what we measured).

$p$ = population proportion (the unknown truth).

A $95\%$ confidence interval gives a range that we are $95\%$ confident contains $p$.

(i) Worked example — quick

Sample says $\hat{p} = 35\%$. Margin of error $= 3\%$.

We are $95\%$ confident that:

$32\% \leq p \leq 38\%$

Centre on $\hat{p}$, add and subtract the margin of error.

(ii) Standard error formula

For a proportion, the standard error is:

Standard error (proportion)

$\text{S.E.} = \sqrt{\dfrac{\hat{p}\,(1-\hat{p})}{n}}$

$\text{Margin of error} = 1.96\,\sqrt{\dfrac{\hat{p}\,(1-\hat{p})}{n}}$ (at $95\%$)

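Since $\hat{p}(1-\hat{p}) \leq \dfrac{1}{4}$, we have $1.96\,\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \leq \dfrac{0.98}{\sqrt{n}} \approx \dfrac{1}{\sqrt{n}}$. The quick rule from §1 is a shortcut for this: it is a slightly rounded-up version of the worst case, $\hat{p} = 0.5$.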
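To see how the quick rule compares with the full formula, the sketch below tabulates both for a few values of $\hat{p}$. The function names and the choice $n = 1001$ are illustrative assumptions only.

```python
import math

def exact_margin(p_hat: float, n: int, z: float = 1.96) -> float:
    """Margin of error from the standard error formula: z * sqrt(p(1-p)/n)."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def quick_margin(n: int) -> float:
    """Quick rule from Section 1: 1/sqrt(n)."""
    return 1 / math.sqrt(n)

n = 1001  # illustrative sample size
for p_hat in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p_hat={p_hat:.2f}  exact={exact_margin(p_hat, n):.4f}  quick={quick_margin(n):.4f}")
```

The exact margin is largest at $\hat{p} = 0.5$ and never exceeds the quick rule, so the quick rule is a safe (slightly generous) estimate.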

(iii) Worked example — looking younger

$78$ out of $95$ people agree they look younger. Set up a $95\%$ confidence interval.

$\hat{p} = \dfrac{78}{95} \approx 0.821$

$\text{S.E.} = \sqrt{\dfrac{\frac{78}{95}\left(\frac{17}{95}\right)}{95}}$

$\text{S.E.} = 0.039$

$\text{M.E.} = 1.96\,(0.039)$

$\text{M.E.} = 0.077$

$0.821 - 0.077 \leq p \leq 0.821 + 0.077$

$0.744 \leq p \leq 0.898$

(iv) Worked example — Fine Gael vote

$253$ out of $1001$ people say they will vote Fine Gael. Construct a $95\%$ confidence interval.

$\hat{p} = \dfrac{253}{1001} \approx 0.25$

$\text{S.E.} = \sqrt{\dfrac{0.25\,(0.75)}{1001}}$

$\text{S.E.} \approx 0.0137$

$\text{M.E.} = 1.96\,(0.0137) \approx 0.03$

$0.25 - 0.03 \leq p \leq 0.25 + 0.03$

$0.22 \leq p \leq 0.28$

Confidence interval — proportion

$\hat{p} - \text{M.E.} \leq p \leq \hat{p} + \text{M.E.}$
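The interval can be computed in one step. This is a minimal sketch with a hypothetical helper `proportion_ci`; because it does not round intermediate values, its endpoints can differ in the last digit from the hand-rounded working above.

```python
import math

def proportion_ci(successes: int, n: int, z: float = 1.96):
    """Return (low, high) for the confidence interval p_hat ± z * S.E."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)  # standard error of a proportion
    me = z * se                              # margin of error
    return p_hat - me, p_hat + me

# The two worked examples: 78 out of 95, and 253 out of 1001.
for successes, n in ((78, 95), (253, 1001)):
    low, high = proportion_ci(successes, n)
    print(f"{successes}/{n}: {low:.3f} <= p <= {high:.3f}")
```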

Section 3 of 6

Hypothesis testing — proportion

Null hypothesis $H_0$ = the statement being made (the claim).

Alternative hypothesis $H_A$ = the opposite.

Method — confidence interval test

1. Build the $95\%$ confidence interval from the sample.

2. If the claim is inside the interval → fail to reject $H_0$.

3. If the claim is outside the interval → reject $H_0$.
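The three steps above can be sketched as a single function. The name `ci_test_proportion` and its return format are assumptions made for illustration, not a standard library routine.

```python
import math

def ci_test_proportion(claimed_p: float, successes: int, n: int, z: float = 1.96):
    """Confidence-interval test of a claimed proportion.

    Returns (low, high, reject), where reject is True when the claimed
    value lies outside the interval built from the sample.
    """
    p_hat = successes / n
    me = z * math.sqrt(p_hat * (1 - p_hat) / n)
    low, high = p_hat - me, p_hat + me
    reject = not (low <= claimed_p <= high)
    return low, high, reject

# Example: a claim of 85% when 253 out of 354 in the sample agree.
low, high, reject = ci_test_proportion(0.85, 253, 354)
print(f"[{low:.2f}, {high:.2f}]", "reject H0" if reject else "fail to reject H0")
```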

(i) Worked example — happy customers

A company claims $85\%$ of customers are happy. $253$ out of $354$ agree. Test the claim.

$H_0:$ $85\%$ happy

$H_A:$ not $85\%$ happy

$\hat{p} = \dfrac{253}{354} = 0.71$

$\text{M.E.} = 1.96\,\sqrt{\dfrac{\hat{p}\,(1-\hat{p})}{n}} = 0.05$

$0.71 - 0.05 \leq p \leq 0.71 + 0.05$

$0.66 \leq p \leq 0.76$

Is $0.85$ inside $[0.66,\,0.76]$? No.

Reject $H_0$

Conclusion: the sample does not support the claim that $85\%$ of customers are happy.

(ii) Exercise 3.17 — drug for migraines

A drugs company claims its new migraine drug has an $80\%$ success rate. $1{,}232$ out of $1{,}600$ patients said symptoms were relieved. Test the claim at $95\%$.

$H_0:$ works for $80\%$

$H_A:$ does not work for $80\%$

$n = 1600,$ $\hat{p} = \dfrac{1232}{1600} = 0.77$

$\text{M.E.} = 1.96\,\sqrt{\dfrac{0.77\,(0.23)}{1600}}$

$\text{M.E.} = 0.02$

$0.77 - 0.02 \leq p \leq 0.77 + 0.02$

$0.75 \leq p \leq 0.79$

Is $0.80$ inside $[0.75,\,0.79]$? No.

Reject $H_0$

Conclusion: the sample does not support the claim of an $80\%$ success rate.

Section 4 of 6

Distribution of sample means

Now we look at sample means instead of proportions.

Population has mean $\mu$ and standard deviation $\sigma$.

Take many samples of size $n$. Each one has its own sample mean $\bar{x}$.

Central Limit Theorem

1. Mean of the sample means $= \mu$

2. Standard deviation of the sample means $= \sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

The sample means are tighter around $\mu$ than the raw data — by a factor of $\sqrt{n}$.
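A quick simulation can illustrate both results. This sketch assumes a normally distributed population with $\mu = 35$, $\sigma = 2$ and samples of size $n = 56$ (numbers chosen only for illustration); the seed is fixed so the run is repeatable.

```python
import math
import random
import statistics

def simulate_sample_means(mu, sigma, n, trials, seed=0):
    """Draw `trials` samples of size n and return the list of their sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

means = simulate_sample_means(mu=35, sigma=2, n=56, trials=2000)
print("mean of sample means:", round(statistics.fmean(means), 3))
print("s.d. of sample means:", round(statistics.stdev(means), 3))
print("sigma / sqrt(n):     ", round(2 / math.sqrt(56), 3))
```

The first printed value should sit close to $\mu$, and the second close to $\dfrac{\sigma}{\sqrt{n}}$, with small differences due to random sampling.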

Z-score — two versions

Z formulas

1. Raw data: $z = \dfrac{x - \mu}{\sigma}$

2. Sample means: $z = \dfrac{\bar{x} - \mu}{\left(\dfrac{\sigma}{\sqrt{n}}\right)}$

Use the second one whenever the question gives you a sample mean and a sample size.

Section 5 of 6

Hypothesis testing — means

Same idea as Section 3 — but now we test a claim about the mean.

Method — z-test (at $95\%$)

1. State $H_0$ and $H_A$.

2. Compute $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.

3. If $|z| > 1.96$ → reject $H_0$.

4. If $|z| \leq 1.96$ → fail to reject $H_0$.
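The method above can be sketched in a few lines. The helper names `z_statistic` and `z_test` are hypothetical, and the numbers in the example ($\mu = 35$, $\sigma = 2$, $n = 56$, $\bar{x} = 36$) are used purely as an illustration.

```python
import math

def z_statistic(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """z-score of a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / math.sqrt(n))

def z_test(x_bar: float, mu: float, sigma: float, n: int, critical: float = 1.96):
    """Two-tailed z-test. Returns (z, reject), with reject True when |z| > critical."""
    z = z_statistic(x_bar, mu, sigma, n)
    return z, abs(z) > critical

z, reject = z_test(x_bar=36, mu=35, sigma=2, n=56)
print(round(z, 2), "reject H0" if reject else "fail to reject H0")
```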

(i) Worked example — bag of chips

Bags of chips have a mean of $35\,\text{g}$ and standard deviation of $2\,\text{g}$. A sample of $56$ bags has a sample mean of $36\,\text{g}$. The company claims bag weight has increased. Test the claim.

$H_0:$ $\mu = 35$

$H_A:$ $\mu \neq 35$

$\mu = 35,\ \sigma = 2,\ n = 56,\ \bar{x} = 36$

$z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \dfrac{36 - 35}{2 / \sqrt{56}}$

$z = 3.7$

$3.7 > 1.96$ → outside the $95\%$ band.

Reject $H_0$

Conclusion: mean has increased.

(ii) Worked example — Leaving Cert maths

Nationally, LC maths has mean $\mu = 68\%$ and standard deviation $\sigma = 4$. In one school, $n = 73$ students have a mean of $\bar{x} = 72\%$. Test the claim that this school's mean is higher than the national average.

$\mu = 68,\ \sigma = 4,\ \bar{x} = 72,\ n = 73$

$H_0:$ $\mu = 68$

$H_A:$ $\mu \neq 68$

$z = \dfrac{72 - 68}{4 / \sqrt{73}}$

$z = 8.5$

$8.5 > 1.96$.

Reject $H_0$

Conclusion: mean mark is higher.

(iii) Worked example — batteries

A company produces batteries with a mean life of $700$ hours and standard deviation $12$ hours. A test on $86$ batteries gives a sample mean of $708$ hours. Test the claim that battery life has increased.

$H_0:$ $\mu = 700$

$H_A:$ $\mu \neq 700$

$\mu = 700,\ n = 86,\ \bar{x} = 708,\ \sigma = 12$

$z = \dfrac{708 - 700}{12 / \sqrt{86}}$

$z = 6.18$

$6.18 > 1.96$.

Reject $H_0$

Conclusion: mean battery life has increased.

(iv) Worked example — pens (CI method)

A company manufactures pens with mean writing life $600$ hours, $\sigma = 12$. A retailer tests a sample of $98$ pens; their mean is $597$ hours. At $5\%$ significance, are these genuine?

$H_0:$ $\mu = 600$

$H_A:$ $\mu \neq 600$

$\mu = 600,\ \sigma = 12,\ n = 98,\ \bar{x} = 597$

$z = \dfrac{597 - 600}{12 / \sqrt{98}}$

$z = -2.47$

$|-2.47| > 1.96$.

Reject $H_0.$ Not genuine.

Alternative — confidence interval on the mean

Same question, CI method.

Confidence interval — mean

$\bar{x} - 1.96\,\dfrac{\sigma}{\sqrt{n}} \leq \mu \leq \bar{x} + 1.96\,\dfrac{\sigma}{\sqrt{n}}$

or equivalently: $\mu - \text{M.E.} \leq \bar{x} \leq \mu + \text{M.E.}$

$\text{M.E.} = 1.96\,\dfrac{\sigma}{\sqrt{n}} = 1.96\,\left(\dfrac{12}{\sqrt{98}}\right)$

$\text{M.E.} = 2.38$

$600 - 2.38 \leq \bar{x} \leq 600 + 2.38$

$597.62 \leq \bar{x} \leq 602.38$

Sample mean of $597$ is not in this interval.

Reject $H_0$
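The two forms of the interval (centred on $\bar{x}$, or centred on $\mu$) always give the same decision, because both ask whether $\bar{x}$ and $\mu$ are more than one margin of error apart. A minimal sketch, with hypothetical helper names, using the pens numbers:

```python
import math

def mean_margin_of_error(sigma: float, n: int, z: float = 1.96) -> float:
    """Margin of error for a mean when the population sigma is known."""
    return z * sigma / math.sqrt(n)

def band(centre: float, me: float):
    """Interval centre ± me, returned as (low, high)."""
    return centre - me, centre + me

def inside(value: float, interval) -> bool:
    low, high = interval
    return low <= value <= high

mu, sigma, n, x_bar = 600, 12, 98, 597
me = mean_margin_of_error(sigma, n)

ci_for_mu = band(x_bar, me)      # x_bar ± M.E.: does it contain the claimed mu?
band_for_x_bar = band(mu, me)    # mu ± M.E.: does it contain the sample mean?

print(round(me, 2))                                           # 2.38
print(inside(mu, ci_for_mu), inside(x_bar, band_for_x_bar))   # False False
```

Both checks come out `False`, so either form leads to rejecting $H_0$.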

Section 6 of 6

The p-value method

A third way to do the same test — using a p-value.

Method — p-value (two-tailed at $5\%$)

1. Compute the test statistic $z_1 = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$.

2. If $z_1 = +k$: $p = 2\,P(z \geq k)$

3. If $z_1 = -k$: $p = 2\,P(z \leq -k)$

4. If $p < 0.05$ → reject $H_0$.

5. If $p \geq 0.05$ → fail to reject $H_0$.
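The p-value can also be computed without tables. This sketch uses `statistics.NormalDist` from the Python standard library (3.8+); the function name `two_tailed_p_value` is an assumption for illustration. Because it uses the exact normal curve rather than four-figure tables, results can differ from table working in the fourth decimal place.

```python
import math
from statistics import NormalDist

def two_tailed_p_value(x_bar: float, mu: float, sigma: float, n: int) -> float:
    """p = 2 * P(Z >= |z1|) for the test statistic z1 of a sample mean."""
    z1 = (x_bar - mu) / (sigma / math.sqrt(n))
    upper_tail = 1 - NormalDist().cdf(abs(z1))  # P(Z >= |z1|) for the standard normal
    return 2 * upper_tail

# Illustrative numbers: mu = 400, sigma = 12, n = 64, x_bar = 403 (so z1 = 2).
p = two_tailed_p_value(x_bar=403, mu=400, sigma=12, n=64)
print(round(p, 4), "reject H0" if p < 0.05 else "fail to reject H0")
```

Taking `abs(z1)` covers steps 2 and 3 in one line, since the standard normal curve is symmetric.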

(i) Worked example — porridge (p208 Q9)

A company claims packets of porridge have a mean weight of $400\,\text{g}$ with $\sigma = 12\,\text{g}$. A sample of $64$ packets has $\bar{x} = 403\,\text{g}$. At $5\%$ significance, is the mean weight not $400\,\text{g}$?

$\mu = 400,\ \sigma = 12,\ n = 64,\ \bar{x} = 403$

$H_0:$ $\mu = 400$

$H_A:$ $\mu \neq 400$

$z_1 = \dfrac{403 - 400}{12 / \sqrt{64}}$

$z_1 = 2$

$P(z \geq 2) = 1 - P(z \leq 2)$

$= 1 - 0.9772$

$= 0.0228$

$p = 2\,(0.0228)$

$p = 0.0456$

$0.0456 < 0.05$.

Reject $H_0$

(ii) Worked example — metal rods (Q10)

A machine produces metal rods with mean length $600\,\text{cm}$ and $\sigma = 4\,\text{cm}$. After a service, the company claims rods are not equal to $600\,\text{cm}$. A sample of $100$ rods has $\bar{x} = 600.6\,\text{cm}$. Test at $5\%$.

$\mu = 600,\ \sigma = 4,\ n = 100,\ \bar{x} = 600.6$

$H_0:$ $\mu = 600$

$H_A:$ $\mu \neq 600$

$z = \dfrac{600.6 - 600}{4 / \sqrt{100}}$

$z = 1.5$

$P(z \geq 1.5) = 1 - 0.9332 = 0.0668$

$p = 2\,(0.0668)$

$p = 0.1336$

$0.1336 > 0.05$.

Fail to reject $H_0$

Cross-check with CI

$\text{M.E.} = 1.96\,\dfrac{4}{\sqrt{100}} = 0.78$

$600.6 - 0.78 \leq \mu \leq 600.6 + 0.78$

$599.82 \leq \mu \leq 601.38$

$600$ is inside the interval → same conclusion: fail to reject $H_0$.

SUM

The lot in one box

Hypothesis testing toolkit

1. Quick margin of error: $\dfrac{1}{\sqrt{n}}$

2. Proportion CI: $\hat{p} \pm 1.96\,\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$

3. Mean CI: $\bar{x} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$

4. Test statistic: $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

5. Reject $H_0$ if $|z| > 1.96$, or if $p < 0.05$, or if claim is outside CI.

6. P-value: $p = 2\,P(z \geq |z_1|)$


Hypothesis Testing — HL · Mathslive.ie