Energy efficiency of chemical reactions | |
---|---|
Energy efficiency (%) | Chemical Reaction Identification Number |
91.60 | 1 |
88.75 | 2 |
90.80 | 3 |
89.95 | 4 |
91.30 | 5 |
Hypothesis Testing
Exercise 1
We want to study the energy efficiency of a chemical reaction that is documented as having a nominal energy efficiency of \(90\%\). Based on previous experiments on the same reaction, we know that the energy efficiency is a Gaussian random variable with unknown mean \(\mu\) and variance equal to \(2\). In the last \(5\) days, the plant has recorded the energy efficiencies (in percent) listed in the table above.
1. Is the data in accordance with the specifications?
Let us first propose a mathematical formulation of the problem.
- Let \(X\) be a random variable which represents the energy efficiency of one such chemical reaction taken at random. The specifications lead us to assume that \(X \sim N(\mu, \sigma^2)\), with unknown mean energy efficiency \(\mu\) and known variance \(\sigma^2 = 2\).
The plant's specification states that the nominal mean energy efficiency is \(\mu_0 = 90\%\), which leads us to test:
\[ H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu \ne \mu_0. \]
This suggests that an appropriate test statistic is
\[ Z_0 = \sqrt{n} \frac{\overline{X} - \mu_0}{\sigma} \stackrel{H_0}{\sim} N(0,1). \]
- Next, an experiment has been carried out to measure \(n = 5\) energy efficiencies corresponding to five random chemical reactions produced by the plant. Hence, an \(n\)-sample \(X_1, \dots, X_5 \sim X\) has been collected with corresponding observed values \(x_1, \dots, x_5\).
Let us now compute the observed value of the statistic \(Z_0\):
# n-sample
x <- c(91.6, 88.75, 90.8, 89.95, 91.3)
# sample size
n <- length(x)
# nominal value of energy efficiency
mu0 <- 90
# square root of variance (known)
sigma <- sqrt(2)
# observed value of test stat
z0 <- sqrt(n) * (mean(x) - mu0) / sigma
z0
[1] 0.7589466
Let’s say that we want a significance level \(\alpha = 5\%\). We need to compute the quantile of order \(1 - \alpha/2\) of the standard normal distribution:
alpha <- 0.05
zs <- qnorm(1 - alpha / 2)
zs
[1] 1.959964
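Since \(|z_0| \approx 0.76\) is smaller than the critical value \(z_{1-\alpha/2} \approx 1.96\), the value \(z_0\) does not belong to the critical region of the test. Hence, we lack statistical evidence to reject \(H_0\) at the \(5\%\) level.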
We could also use the p-value:
pval <- 2 * min(pnorm(z0), 1 - pnorm(z0))
pval
[1] 0.4478845
We reject \(H_0\) when the p-value is smaller than \(\alpha\). Here the p-value is about \(0.45\), which exceeds any commonly used significance level, so we again lack statistical evidence to reject \(H_0\).
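Base R does not ship a z-test function, so the computations above can be gathered in a small helper. The following is a minimal sketch under the assumptions of this exercise (Gaussian data, known standard deviation); the name `z_test()` and its arguments are hypothetical and introduced only for illustration.

```r
# Hypothetical helper (not part of base R): z-test for the mean of a
# Gaussian sample whose standard deviation `sigma` is assumed known.
z_test <- function(x, mu0, sigma,
                   alternative = c("two.sided", "less", "greater")) {
  alternative <- match.arg(alternative)
  # Observed value of Z_0 = sqrt(n) * (mean - mu0) / sigma
  z0 <- sqrt(length(x)) * (mean(x) - mu0) / sigma
  # p-value under H_0, where Z_0 ~ N(0, 1)
  p_value <- switch(
    alternative,
    two.sided = 2 * pnorm(-abs(z0)),
    less      = pnorm(z0),
    greater   = pnorm(z0, lower.tail = FALSE)
  )
  list(statistic = z0, p.value = p_value, alternative = alternative)
}

x <- c(91.6, 88.75, 90.8, 89.95, 91.3)
res <- z_test(x, mu0 = 90, sigma = sqrt(2))
res$statistic
res$p.value
```

The two-sided call should reproduce the values of \(z_0\) and of the p-value computed step by step above; changing `alternative` gives the corresponding one-sided p-values.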
2. What is a point estimate of the energy efficiency?
This is simply the sample mean of the observed energy efficiencies, which the mean() function computes as follows:
mean(x)
[1] 90.48
3. Does that mean that the data significantly prove that the energy efficiency is larger than the expected nominal value?
Here we want to test the following hypotheses:
\[ H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu > \mu_0. \]
Let’s compute the p-value:
# P_{H_0}(Z_0 > z_0)
pval2 <- 1 - pnorm(z0)
pval2
[1] 0.2239422
The p-value is greater than any commonly used significance level. Hence, we do not have enough statistical evidence to reject \(H_0\). We therefore conclude that, despite the point estimate of \(90.48\%\), the data do not allow us to claim that the mean energy efficiency is larger than the nominal \(90\%\).
Exercise 2
A study about air pollution done by a research station measured, on \(8\) different air samples, the following values of a pollutant (in \(\mu\)g/m\(^2\)):
Concentration of a pollutant in air samples | |
---|---|
Pollutant Concentration (μg/m2) | Air Sample Identification Number |
2.2 | 1 |
1.8 | 2 |
3.1 | 3 |
2.0 | 4 |
2.4 | 5 |
2.0 | 6 |
2.1 | 7 |
1.2 | 8 |
Assuming that the sampled population is normal,
1. Can we say that the pollutant is present with less than \(2.5 \mu\)g/m\(^2\)?
Let \(X\) be a random variable which represents the pollutant concentration of an air sample taken at random. We assume that \(X \sim N(\mu, \sigma^2)\).
For both this question and the next one, we aim at testing the following hypotheses:
\[ H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu < \mu_0, \]
with \(\mu_0 = 2.5\) here and \(\mu_0 = 2.4\) in the second question.
The variance \(\sigma^2\) is not provided which means that we will need to estimate it from the data using the sample variance. Hence, a good test statistic to look at is Student’s t-statistic:
\[ T_0 = \sqrt{n} \frac{\overline X - \mu_0}{s} \stackrel{H_0}{\sim} t(n - 1). \]
Only large negative values of \(T_0\) will be in favor of \(H_a\).
This leads us to apply Student’s t-test, which is implemented in R via the function t.test() and can be used as follows:
x2 <- c(2.2, 1.8, 3.1, 2.0, 2.4, 2.0, 2.1, 1.2)
alpha <- 0.05
out <- t.test(
  x = x2,
  alternative = "less",
  mu = 2.5,
  conf.level = 1 - alpha
)
out |>
  broom::tidy() |>
  gt::gt()
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
---|---|---|---|---|---|---|---|
2.1 | -2.106097 | 0.03660458 | 7 | -Inf | 2.459827 | One Sample t-test | less |
With the probability of committing a type I error bounded by \(\alpha = 5\%\), the p-value (about \(0.037\)) is below \(\alpha\), so I have enough statistical evidence to reject \(H_0\) in favor of \(H_a\). I can therefore conclude that the average amount of pollutant is indeed less than \(2.5\) \(\mu\)g/m\(^2\).
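As a cross-check, the quantities reported by t.test() can be recomputed by hand from the definition of \(T_0\). The sketch below uses only base R; the variable names (`n2`, `t0`, `p_less`, `upper`) are chosen here for illustration.

```r
x2  <- c(2.2, 1.8, 3.1, 2.0, 2.4, 2.0, 2.1, 1.2)
mu0 <- 2.5
n2  <- length(x2)

# Observed value of T_0 = sqrt(n) * (mean - mu0) / s
t0 <- sqrt(n2) * (mean(x2) - mu0) / sd(x2)

# One-sided p-value P(T <= t0) with T ~ t(n - 1), since only large
# negative values of T_0 support H_a
p_less <- pt(t0, df = n2 - 1)

# Upper bound of the one-sided 95% confidence interval for mu
upper <- mean(x2) + qt(0.95, df = n2 - 1) * sd(x2) / sqrt(n2)

c(statistic = t0, p.value = p_less, conf.high = upper)
```

These three numbers should agree with the statistic, p.value and conf.high columns of the table above.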
2. Can we say that the pollutant is present with less than \(2.4 \mu\)g/m\(^2\)?
We answer this question in the same way as the previous but considering \(\mu_0 = 2.4\), which leads to:
out <- t.test(x = x2, alternative = "less", mu = 2.4)
out |>
  broom::tidy() |>
  gt::gt()
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
---|---|---|---|---|---|---|---|
2.1 | -1.579573 | 0.0791076 | 7 | -Inf | 2.459827 | One Sample t-test | less |
If I still consider \(\alpha = 5\%\), I lack evidence for rejecting \(H_0\) and cannot claim that the average amount of pollutant is less than \(2.4 \mu\)g/m\(^2\).
3. Is the normality hypothesis essential to justify the method used?
The normality assumption is essential to lead to an exact test.
We could invoke the CLT in case of large sample size but this is not the case here.
Let’s check the normality assumption using the Shapiro-Wilk test which is valid here because the sample size is between \(3\) and \(5000\):
out <- shapiro.test(x = x2)
out |>
  broom::tidy() |>
  gt::gt()
statistic | p.value | method |
---|---|---|
0.9428648 | 0.639469 | Shapiro-Wilk normality test |
The null hypothesis in the Shapiro-Wilk test is that the sample has been drawn from a normal distribution. The resulting p-value is 0.64, which is larger than any commonly used significance level. Hence, I lack statistical evidence to reject the hypothesis that the sample has been drawn from a normal distribution. This lends support to the analysis conducted in the previous two questions, which required normality of the sample.
Exercise 3
A medical inspection in an elementary school during a measles epidemic led to the examination of \(30\) children to assess whether they were affected. The results are in a tibble exam which contains the following:
Medical inspection on a sample of children | |
---|---|
Child Status | Child Identification Number |
Healthy | 1 |
Healthy | 2 |
Healthy | 3 |
Healthy | 4 |
Healthy | 5 |
Healthy | 6 |
Healthy | 7 |
Healthy | 8 |
Healthy | 9 |
Healthy | 10 |
Healthy | 11 |
Healthy | 12 |
Healthy | 13 |
Sick | 14 |
Healthy | 15 |
Healthy | 16 |
Healthy | 17 |
Healthy | 18 |
Healthy | 19 |
Healthy | 20 |
Healthy | 21 |
Healthy | 22 |
Healthy | 23 |
Healthy | 24 |
Healthy | 25 |
Healthy | 26 |
Healthy | 27 |
Sick | 28 |
Healthy | 29 |
Healthy | 30 |
Let \(p\) be the probability that a child from the same school is sick.
1. Determine a point estimate \(\widehat{p}\) for \(p\).
Let \(X\) be a random variable which represents whether a child taken at random in the population is sick. Let \(X = 1\) if the child is sick and \(X = 0\) if not. Then, by definition, \(X \sim Be(p)\).
Let now \(X_1, \dots, X_n \sim X\) be an \(n\)-sample of children who were checked for the disease. The total number of sick children in the sample is given by \(\sum_{i=1}^n X_i\). Hence, a point estimate of the probability that a child is sick is given by the sample proportion:
\[ \widehat{p} = \overline X. \]
Numerical application. We can use the data and R to help us with the calculation:
# first compute the values x_1, ..., x_n from the Status variable
exam <- dplyr::mutate(exam, x_values = child_status == "Sick")

# next, compute the point estimate of p as the sample mean of the variable
# x_values
mean(exam$x_values)
[1] 0.06666667
2. The school will be closed if more than 5% of the children are sick. Can you conclude that, statistically, this is the case? Use a significance level of 5%.
Despite a point estimate that exceeds \(5\%\), since we only assess the presence of the disease in a subset of the total population, this might not be statistically significant given the variability. To provide insight into this, we can provide the p-value of an appropriate hypothesis test. We were asked to use a significance level \(\alpha = 5\%\) for such a test. The hypotheses that we want to test here are:
\[ H_0: p = p_0 \quad \text{vs.} \quad H_a: p > p_0, \] with \(p_0 = 5\%\).
Since \(X_1, \dots, X_n \stackrel{iid}{\sim} Be(p)\), then:
\[
S_n = \sum_{i=1}^n X_i \sim Binom(n, p),
\] which can be used as a test statistic because, under \(H_0\), its distribution is binomial with parameters \(n\) and \(p_0\), both of which are known. We can use the binom.test() function to help us with the calculations:
out <- binom.test(
  x = sum(exam$x_values), # Number of sick children in the sample
  n = 30, # Total number of children in the sample
  p = 0.05, # Value of p_0
  alternative = "greater" # Type of alternative hypothesis
)
out |>
  broom::tidy() |>
  gt::gt()
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
---|---|---|---|---|---|---|---|
0.06666667 | 2 | 0.4464579 | 30 | 0.0119758 | 1 | Exact binomial test | greater |
The p-value of the test is 0.4465, which exceeds any commonly used significance level. Hence, I lack statistical evidence to reject \(H_0\). There is thus no statistically supported reason to close the school.
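Because the test statistic \(S_n\) is binomial under \(H_0\), the p-value can also be obtained directly from the binomial distribution functions of base R. The sketch below hard-codes the \(2\) sick children out of \(30\) from the table; the variable names are illustrative.

```r
n_children <- 30    # sample size
p0         <- 0.05  # value of p under H_0
s_obs      <- 2     # observed number of sick children

# p-value = P(S_n >= s_obs) under H_0, i.e. 1 - P(S_n <= s_obs - 1)
p_value <- pbinom(s_obs - 1, size = n_children, prob = p0,
                  lower.tail = FALSE)

# Same quantity obtained by summing the probability mass function
p_value_sum <- sum(dbinom(s_obs:n_children, size = n_children, prob = p0))

c(p_value, p_value_sum)
```

Both values should match the p.value column returned by binom.test().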
Exercise 4
The capacities (in ampere-hours) of \(10\) batteries were recorded as follows:
Batteries | |
---|---|
Capacity (ampere-hours) | Identification Number |
140 | 1 |
136 | 2 |
150 | 3 |
144 | 4 |
148 | 5 |
152 | 6 |
138 | 7 |
141 | 8 |
143 | 9 |
151 | 10 |
- Estimate the population variance \(\sigma^2\).
- Can we claim that the mean capacity of a battery is greater than 142 ampere-hours?
- Can we claim that the mean capacity of a battery is greater than 140 ampere-hours?
- Can we claim that the standard deviation of the capacity is less than 6 ampere-hours?
Exercise 5
A company produces barbed wire in skeins of \(100\)m each, nominally. The real length of the skeins is a random variable \(X\) distributed as a \(\mathcal{N}(\mu, 4)\). Measuring \(10\) skeins, we get the following lengths:
Skeins | |
---|---|
Length (m) | Identification Number |
98.683 | 1 |
96.599 | 2 |
99.617 | 3 |
102.544 | 4 |
100.110 | 5 |
102.000 | 6 |
98.394 | 7 |
100.324 | 8 |
98.743 | 9 |
103.247 | 10 |
- Perform a conformity test at significance level \(\alpha = 5\%\).
- Determine, on the basis of the observed values, the p-value of the test.
Exercise 6
In an atmospheric study the researchers registered, over \(8\) different samples of air, the following concentration of COG (in micrograms over cubic meter):
Concentration of COG in different air samples | |
---|---|
Concentration of COG (μg/m3) | Air sample (Identification number) |
2.3 | 1 |
1.7 | 2 |
3.2 | 3 |
2.1 | 4 |
2.3 | 5 |
2.0 | 6 |
2.2 | 7 |
1.2 | 8 |
- Using unbiased estimators, determine a point estimate of the mean and variance of COG concentration.
Assume now that the COG concentration is normally distributed.
- Using a suitable statistical tool, establish whether the measured data allow us to conclude that the mean concentration of COG is greater than \(1.8\) \(\mu\)g/m\(^3\).
Exercise 7
On a total of \(2350\) interviewed citizens, \(1890\) approve the construction of a new movie theater.
- Perform a hypothesis test at level \(5\%\), with the null hypothesis that the percentage of citizens who approve the construction is at least \(81\%\), versus the alternative hypothesis that the percentage is less than \(81\%\).
- Compute the \(p\)-value of the test.
- [difficult] Determine the minimum sample size such that the power of the test with significance level \(\alpha = 0.05\) when the real proportion \(p\) is \(0.8\) is at least \(50\%\).
Questions 1 & 2.
Let \(X\) be a random variable that represents the opinion about the construction of the movie theater of a citizen taken at random in the population. By definition, \(X \sim Be(p)\), where \(p\) is the proportion of citizens that approve the construction. The hypotheses that we want to test here are:
\[ H_0: p \geq p_0 \quad \text{vs.} \quad H_a: p < p_0, \] with \(p_0 = 81\%\). Since \(X_1, \dots, X_n \stackrel{iid}{\sim} Be(p)\), then:
\[ S_n = \sum_{i=1}^n X_i \sim Binom(n, p), \] which can be used as a test statistic because, at the boundary \(p = p_0\) of \(H_0\), its distribution is binomial with parameters \(n\) and \(p_0\), both of which are known.
Now, \(S_n / n\) is an unbiased estimator of \(p\). Hence, data in favor of the alternative hypothesis are those for which \(S_n / n\) falls well below \(p_0\). We can therefore define a rejection region of the form \(S_n / n - p_0 \leq c\) for some \(c \in \mathbb{R}\). With such a rejection region, the probability of making a type I error reads:
\[ \mathbb{P}[\mathrm{type\ I\ error}] = \mathbb{P}_{H_0}[S_n / n - p_0 \leq c]. \]
We want to find \(c\) such that the probability of making a type I error is upper-bounded by a predefined significance level \(\alpha\) of the test. This is equivalent to finding \(c\) such that:
\[ \begin{aligned} & \mathbb{P}_{H_0}[S_n / n - p_0 \leq c] = \alpha \\ \Longleftrightarrow \quad & \mathbb{P}_{H_0}[S_n - n p_0 \leq n c] = \alpha \\ \Longleftrightarrow \quad & \mathbb{P}_{H_0}[S_n \leq n c + n p_0] = \alpha \\ \Longleftrightarrow \quad & n c + n p_0 = q_{Bi(n, p_0)}(\alpha) \\ \Longleftrightarrow \quad & c = \frac{q_{Bi(n, p_0)}(\alpha) - n p_0}{n}, \end{aligned} \] where \(q_{Bi(n, p_0)}(\alpha)\) is the \(\alpha\)-quantile of the binomial distribution. Hence, the critical region of level \(\alpha = 5\%\) is:
\[ \begin{aligned} \mathcal{C}_\alpha(x_1, \dots, x_n) &= \left\{ (x_1, \dots, x_n) \in \mathbb{R}^n \mid \overline x_n - p_0 \leq \frac{q_{Bi(n, p_0)}(\alpha) - n p_0}{n} \right\}, \\ &= \left\{ (x_1, \dots, x_n) \in \mathbb{R}^n \mid n \overline x_n \leq q_{Bi(n, p_0)}(\alpha) \right\}. \end{aligned} \]
We can use the binom.test() function to help us with the calculations:
out <- binom.test(
  x = 1890, # Number of citizens that approve the construction in the sample
  n = 2350, # Total number of citizens in the sample
  p = 0.81, # Value of p_0
  alternative = "less", # Type of alternative hypothesis
  conf.level = 0.95 # Confidence level
)
The output of the function is:
out |>
  broom::tidy() |>
  gt::gt()
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
---|---|---|---|---|---|---|---|
0.8042553 | 1890 | 0.2462091 | 2350 | 0 | 0.8176451 | Exact binomial test | less |
The p-value of the test is 0.2462 which exceeds any reasonable significance level. Hence, I lack statistical evidence to reject \(H_0\).
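The critical region derived above can also be evaluated directly with the binomial quantile and distribution functions. The sketch below is one way to do it in base R, with illustrative variable names. Since the binomial distribution is discrete, the probability \(\mathbb{P}_{H_0}[S_n \leq q_{Bi(n, p_0)}(\alpha)]\) generally cannot equal \(\alpha\) exactly, so the last line reports the size actually attained by this region.

```r
n_cit <- 2350   # number of interviewed citizens
p0    <- 0.81   # boundary value of p under H_0
s_obs <- 1890   # observed number of approvals
alpha <- 0.05

# alpha-quantile of Bi(n, p0): smallest q with P(S_n <= q) >= alpha
q_alpha <- qbinom(alpha, size = n_cit, prob = p0)

# Decision rule from the critical region: reject H_0 if S_n <= q_alpha
reject <- s_obs <= q_alpha

# p-value = P(S_n <= s_obs) when p = p0
p_value <- pbinom(s_obs, size = n_cit, prob = p0)

# Size actually attained by the region {S_n <= q_alpha}
actual_size <- pbinom(q_alpha, size = n_cit, prob = p0)

list(q_alpha = q_alpha, reject = reject,
     p_value = p_value, actual_size = actual_size)
```

The decision and the p-value should agree with the binom.test() output; if the attained size is judged too far above \(\alpha\), a more conservative option is to reject only when \(S_n \leq q_{Bi(n, p_0)}(\alpha) - 1\).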
Question 3.
The power of a test is the probability of rejecting \(H_0\) when \(H_a\) is true. In our case, the power of the test is:
\[ \begin{aligned} \mathbb{P}_{H_a}[\mathrm{reject\ } H_0] &= \mathbb{P}_{H_a} \left[ S_n / n - p_0 \leq \frac{q_{Bi(n, p_0)}(\alpha) - n p_0}{n} \right] \\ &= \mathbb{P}_{H_a}[S_n \leq q_{Bi(n, p_0)}(\alpha)] \\ &= F_{Bi(n, p)}(q_{Bi(n, p_0)}(\alpha)), \end{aligned} \] where \(F_{Bi(n, p)}\) is the cumulative distribution function of the binomial distribution with parameters \(n\) and \(p\), the latter being the real proportion of citizens that approve the construction. Assuming that \(p = 0.8\), we want to find \(n\) such that:
\[ F_{Bi(n, 0.8)}(q_{Bi(n, 0.81)}(0.05)) \geq 0.5. \]
This equation is not easy to solve analytically. We can use the uniroot() function to find a root of the function \(f(n) = F_{Bi(n, 0.8)}(q_{Bi(n, 0.81)}(0.05)) - 0.5\), which can be implemented as follows:
f <- function(n) {
  n <- round(n)
  x <- qbinom(p = 0.05, size = n, prob = 0.81)
  pbinom(q = x, size = n, prob = 0.8) - 0.5
}
The root of the function is:
round(uniroot(f = f, interval = c(1, 100000))$root)
[1] 4208
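Because \(f\) is piecewise constant in \(n\) and need not be monotone, uniroot() locates a sign change rather than a guaranteed smallest \(n\), so it is worth checking the order of magnitude independently. The sketch below uses the usual normal-approximation formula for the sample size of a one-sided test on a proportion; this is a swapped-in approximation, not the exact binomial computation above, and the variable names are illustrative.

```r
p0           <- 0.81  # proportion under H_0
p1           <- 0.80  # assumed real proportion
alpha        <- 0.05
target_power <- 0.50

z_alpha <- qnorm(1 - alpha)      # critical quantile of the test
z_beta  <- qnorm(target_power)   # equals 0 for a power of 50%

# Normal approximation:
# n = ((z_alpha * sqrt(p0 q0) + z_beta * sqrt(p1 q1)) / (p0 - p1))^2
n_approx <- ((z_alpha * sqrt(p0 * (1 - p0)) +
                z_beta * sqrt(p1 * (1 - p1))) / (p0 - p1))^2
ceiling(n_approx)
```

The approximation gives a value of roughly 4160, the same order of magnitude as the root returned by uniroot(), which suggests that the exact computation is in the right range.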
Exercise 8
A computer chip manufacturer claims that no more than \(1\%\) of the chips it sends out are defective. An electronics company, impressed with this claim, has purchased a large quantity of such chips. To determine if the manufacturer’s claim can be taken literally, the company has decided to test a sample of \(300\) of these chips. If \(5\) of these \(300\) chips are found to be defective, should the manufacturer’s claim be rejected?
Exercise 9
To determine the impurity level in alloys of steel, two different tests can be used. \(8\) specimens are tested, with both procedures, and the results are written in the following table:
Impurity level in a sample of alloys of steel | ||
---|---|---|
Impurity level (Test 1) | Impurity level (Test 2) | Specimen (Identification number) |
1.2 | 1.4 | 1 |
1.3 | 1.7 | 2 |
1.7 | 2.0 | 3 |
1.8 | 2.1 | 4 |
1.5 | 1.5 | 5 |
1.4 | 1.3 | 6 |
1.4 | 1.7 | 7 |
1.3 | 1.6 | 8 |
Assume that the data are normal.
- Based on the data in the table, can we state at significance level \(\alpha=5\%\) that Tests 1 and 2 give different average levels of impurity?
- Based on the data in the table, can we state at significance level \(\alpha=1\%\) that Test 2 gives a greater average level of impurity than Test 1?
Exercise 10
A sample of \(300\) voters from region A and \(200\) voters from region B showed that \(56\%\) and \(48\%\), respectively, prefer a certain candidate. Can we say at a significance level of \(5\%\) that there is a difference between the two regions?
Exercise 11
In a sample of \(100\) measurements of the boiling temperature of a certain liquid, we obtain a sample mean \(\overline{x} = 100^{\circ}C\) with a sample variance \(s^2 = 0.0098^{\circ}C^2\). Assuming that the observations come from a normal population:
- What is the smallest level of significance that would lead to rejecting the null hypothesis that the variance is \(\leq 0.015\)?
- On the basis of the previous answer, what decision do we take if we fix the level of the test equal to \(0.01\)?