# Probability woman exceeds 180cm in height
# P(X > 180) = 1 - P(X <= 180)
1 - pnorm(180, mean = 163, sd = 8)
[1] 0.01679331
Lecture 4
The one-sample variance test uses the chi-squared distribution
Recall: Chi-squared distribution with p degrees of freedom is \chi_p^2 = Z_1^2 + \ldots + Z_p^2 where Z_1, \ldots, Z_p are iid N(0, 1)
Assumption: Suppose we are given a sample X_1,\ldots, X_n iid from N(\mu,\sigma^2)
Goal: Estimate variance \sigma^2 of population
Test:
Suppose \sigma_0 is guess for \sigma
The one-sided hypothesis test for \sigma is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0
Consider the sample variance S^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1}
Under H_0, the true standard deviation is \sigma = \sigma_0
S^2 cannot be too far from the true variance \sigma^2
Therefore we cannot have that S^2 \gg \sigma^2 = \sigma_0^2
If we observe S^2 \gg \sigma_0^2 then our guess \sigma_0 is probably wrong
Therefore we reject H_0 if S^2 \gg \sigma_0^2
The rejection condition S^2 \gg \sigma_0^2 is equivalent to \frac{(n-1)S^2}{\sigma_0^2} \gg 1 where n is the sample size
We define our test statistic as \chi^2 := \frac{(n-1)S^2}{\sigma_0^2}
The rejection condition is hence \chi^2 \gg 1
In Lecture 2, we have proven that \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2
Assuming \sigma=\sigma_0, we therefore have \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2
We reject H_0 if \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} \gg 1
This means that, under H_0, we do not expect \chi^2 to be too extreme to the right
As \chi^2 \sim \chi_{n-1}^2, we decide to reject H_0 if \chi^2 > \chi_{n-1}^2(0.05)
By definition, the critical value \chi_{n-1}^2(0.05) is such that P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05
x^* := \chi_{n-1}^2(0.05) is point on x-axis such that P(\chi_{n-1}^2 > x^* ) = 0.05
Therefore, the interval (x^*,+\infty) is the rejection region
In the picture we have n = 12 and \chi_{11}^2(0.05) = 19.68
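The critical value can be checked with the R command qchisq (introduced later in this lecture):
# 0.95 quantile of the chi-squared distribution with 11 degrees of freedom
qchisq(0.95, df = 11)    # approximately 19.68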
Given the test statistic \chi^2, the p-value is defined as p := P( \chi_{n-1}^2 > \chi^2 )
Notice that p < 0.05 \qquad \iff \qquad \chi^2 > \chi_{n-1}^2(0.05)
This is because \chi^2 > \chi_{n-1}^2(0.05) iff p = P(\chi_{n-1}^2 > \chi^2) < P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05
Suppose we are given a sample X_1, \ldots, X_n iid from N(\mu,\sigma^2)
The one-sided hypothesis test is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0
The variance ratio test consists of 3 steps: compute the statistic \chi^2 = \frac{(n-1)S^2}{\sigma_0^2}, compute the critical value \chi_{n-1}^2(0.05) (or the p-value p), and reject H_0 if \chi^2 > \chi_{n-1}^2(0.05) (equivalently, if p < 0.05)
Month | J | F | M | A | M | J | J | A | S | O | N | D |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Cons. Expectation | 66 | 53 | 62 | 61 | 78 | 72 | 65 | 64 | 61 | 50 | 55 | 51 |
Cons. Spending | 72 | 55 | 69 | 65 | 82 | 77 | 72 | 78 | 77 | 75 | 77 | 77 |
Difference | -6 | -2 | -7 | -4 | -4 | -5 | -7 | -14 | -16 | -25 | -22 | -26 |
If X \sim N(\mu,\sigma^2) then P( \mu - 2 \sigma \leq X \leq \mu + 2\sigma ) \approx 0.95
Recall: \quad Difference = (CE - CS) \sim N(\mu,\sigma^2)
Hence if \sigma = 1, then P( \mu - 2 \leq {\rm CE} - {\rm CS} \leq \mu + 2 ) \approx 0.95
Meaning of variance ratio test: \sigma=1 \quad \implies \quad \text{CS index is within } \pm{2} \text{ of CE index with probability } 0.95
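This is just the two-sigma rule for the normal distribution, which can be checked in R:
# P(mu - 2 sigma <= X <= mu + 2 sigma) for a standard normal variable
pnorm(2) - pnorm(-2)
[1] 0.9544997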
Month | J | F | M | A | M | J | J | A | S | O | N | D |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Difference | -6 | -2 | -7 | -4 | -4 | -5 | -7 | -14 | -16 | -25 | -22 | -26 |
Goal: Perform chi-squared variance ratio test in R
For this, we need to compute p-value p = P(\chi_{n-1}^2 > \chi^2)
Thus, we need to compute probabilities for chi-squared distribution in R
R command | Computes |
---|---|
pnorm(x, mean = mu, sd = sig) | P(X \leq x) |
qnorm(p, mean = mu, sd = sig) | q such that P(X \leq q) = p |
dnorm(x, mean = mu, sd = sig) | f(x), where f is pdf of X |
rnorm(n, mean = mu, sd = sig) | n random samples from distr. of X |
Note: Syntax of commands: norm = normal, p = probability, q = quantile, d = density, r = random
# The upper 10th percentile for women height, that is,
# height q such that P(X <= q) = 0.9
qnorm(0.90, mean = 163, sd = 8)
[1] 173.2524
Question: What is the height of the tallest woman, according to the model?
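A plausible reconstruction of the computation behind the answer below: assuming the model takes the tallest woman to be the 1-in-3.5-billion quantile (an assumed figure for the world's female population), the command would be
# Height q exceeded by roughly one woman in 3.5 billion (assumed figure),
# i.e. q such that P(X <= q) = 1 - 1/3.5e9
qnorm(1 - 1/3.5e9, mean = 163, sd = 8)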
[1] 212.5849
Note: df = n denotes n degrees of freedom

R command | Computes |
---|---|
pchisq(x, df = n) | P(X \leq x) |
qchisq(p, df = n) | q such that P(X \leq q) = p |
dchisq(x, df = n) | f(x), where f is pdf of X |
rchisq(m, df = n) | m random samples from distr. of X |
The \chi^2 statistic for variance ratio test has distribution \chi_{n-1}^2
Question: Compute the p-value p := P(\chi_{n-1}^2 > \chi^2)
Observe that p := P(\chi_{n-1}^2 > \chi^2) = 1 - P(\chi_{n-1}^2 \leq \chi^2)
The code is therefore
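# p-value of the chi-squared variance test
# (chi_squared is the observed test statistic, n the sample size)
1 - pchisq(chi_squared, df = n - 1)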
Back to the Worked Example: Monthly data on CE and CS
Question: Test the following hypothesis: H_0 \colon \sigma = 1 \qquad H_1 \colon \sigma > 1
Month | J | F | M | A | M | J | J | A | S | O | N | D |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Cons. Expectation | 66 | 53 | 62 | 61 | 78 | 72 | 65 | 64 | 61 | 50 | 55 | 51 |
Cons. Spending | 72 | 55 | 69 | 65 | 82 | 77 | 72 | 78 | 77 | 75 | 77 | 77 |
Difference | -6 | -2 | -7 | -4 | -4 | -5 | -7 | -14 | -16 | -25 | -22 | -26 |
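First enter the monthly differences and compute the test statistic (a short sketch; the names chi_squared and n are the ones used in the code below):
# Enter the differences CE - CS
differences <- c(-6, -2, -7, -4, -4, -5, -7, -14, -16, -25, -22, -26)
n <- length(differences)    # Sample size: 12 months
sigma0 <- 1                 # Hypothesised standard deviation under H_0
# Test statistic chi^2 = (n - 1) * S^2 / sigma_0^2
chi_squared <- (n - 1) * var(differences) / sigma0^2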
# Compute p-value
p_value <- 1 - pchisq(chi_squared, df = n - 1)
# Print p-value
cat("The p-value for one-sided variance test is", p_value)
The p-value for one-sided variance test is 0
In Lecture 3:
Goal: compare mean and variance of 2 independent normal samples
Tests available: the two-sample t-test (for means) and the F-test (for variances)
Hypothesis testing starts to get interesting with 2 or more samples
t-test and F-test show the normal distribution family in action
This is also the maths behind regression
Want to compare the variance of two independent samples
This can be done by studying the ratio of the sample variances S^2_X/S^2_Y
We have already shown that \frac{(n - 1) S^2_X}{\sigma^2_X} \sim \chi^2_{n - 1} \qquad \frac{(m - 1) S^2_Y}{\sigma^2_Y} \sim \chi^2_{m - 1}
Hence we can study statistic F = \frac{S^2_X / \sigma_X^2}{S^2_Y / \sigma_Y^2}
We will see that F has F-distribution (next week)
Assumptions: Suppose we are given samples from 2 normal populations: X_1, \ldots, X_n iid from N(\mu_X,\sigma_X^2) and Y_1, \ldots, Y_m iid from N(\mu_Y,\sigma_Y^2)
Further assumptions: the two samples are independent, and the populations have the same variance \sigma_X^2 = \sigma_Y^2 = \sigma^2
Note: Assuming the same variance is a simplification. Removing it leads to the Welch t-test
Goal: Compare means \mu_X and \mu_Y
Hypothesis set: We test for a difference in means H_0 \colon \mu_X = \mu_Y \qquad H_1 \colon \mu_X \neq \mu_Y
t-statistic: The general form is T = \frac{\text{Estimate}-\text{Hypothesised value}}{\text{e.s.e.}}
Define the sample means \overline{X} = \frac{1}{n} \sum_{i=1}^n X_i \qquad \qquad \overline{Y} = \frac{1}{m} \sum_{i=1}^m Y_i
Notice that {\rm I\kern-.3em E}[ \overline{X} ] = \mu_X \qquad \qquad {\rm I\kern-.3em E}[ \overline{Y} ] = \mu_Y
Therefore we can estimate \mu_X - \mu_Y with the sample means, that is, \text{Estimate} = \overline{X} - \overline{Y}
Since we are testing for difference in mean, we have \text{Hypothesised value} = \mu_X - \mu_Y
The Estimated Standard Error is the standard deviation of estimator \text{e.s.e.} = \text{Standard Deviation of } \overline{X} -\overline{Y}
Therefore the two-sample t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}}
Under the Null Hypothesis that \mu_X = \mu_Y, the t-statistic becomes T = \frac{\overline{X} - \overline{Y} }{\text{e.s.e.}}
The general rule is \text{df} = \text{Sample size} - \text{No. of estimated parameters}
Sample size in two-sample t-test: n + m
No. of estimated parameters is 2: namely \mu_X and \mu_Y
Hence the degrees of freedom in the two-sample t-test are {\rm df} = n + m - 2 (more on this later)
Recall: We are assuming populations have same variance \sigma^2_X = \sigma^2_Y = \sigma^2
We need to compute the estimated standard error \text{e.s.e.} = \text{Standard Deviation of } \ \overline{X} -\overline{Y}
Variance of sample mean was computed in the Lemma in Slide 72 Lecture 2
Since \overline{X} \sim N(\mu_X,\sigma^2) and \overline{Y} \sim N(\mu_Y,\sigma^2), by the Lemma we get {\rm Var}[\overline{X}] = \frac{\sigma^2}{n} \,, \qquad \quad {\rm Var}[\overline{Y}] = \frac{\sigma^2}{m}
Since X_i and Y_i are independent we get {\rm Cov}(X_i,Y_j)=0
By bilinearity of covariance we infer {\rm Cov}( \overline{X} , \overline{Y} ) = \frac{1}{n \cdot m} \sum_{i=1}^n \sum_{j=1}^m {\rm Cov}(X_i,Y_j) = 0
We can then compute \begin{align*} {\rm Var}[ \overline{X} - \overline{Y} ] & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] - 2 {\rm Cov}( \overline{X} , \overline{Y} ) \\ & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] \\ & = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right) \end{align*}
Taking the square root gives \text{S.D.}(\overline{X} - \overline{Y} )= \sigma \ \sqrt{\frac{1}{n}+\frac{1}{m}}
Therefore, the t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}} = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
The t-statistic is currently T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
Variance \sigma^2 is unknown: we need to estimate it!
Define the sample variances
S_X^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1} \qquad \qquad S_Y^2 = \frac{ \sum_{i=1}^m Y_i^2 - m \overline{Y}^2 }{m-1}
Recall that X_1, \ldots , X_n \sim N(\mu_X, \sigma^2) \qquad \qquad Y_1, \ldots , Y_m \sim N(\mu_Y, \sigma^2)
From Lecture 2, we know that S_X^2 and S_Y^2 are unbiased estimators of \sigma^2, i.e. {\rm I\kern-.3em E}[ S_X^2 ] = {\rm I\kern-.3em E}[ S_Y^2 ] = \sigma^2
Therefore, both S_X^2 and S_Y^2 can be used to estimate \sigma^2
We can improve the estimate of \sigma^2 by combining S_X^2 and S_Y^2
We will consider a (convex) linear combination S^2 := \lambda_X S_X^2 + \lambda_Y S_Y^2 \,, \qquad \lambda_X + \lambda_Y = 1
S^2 is still an unbiased estimator of \sigma^2, since \begin{align*} {\rm I\kern-.3em E}[S^2] & = {\rm I\kern-.3em E}[ \lambda_X S_X^2 + \lambda_Y S_Y^2 ] \\ & = \lambda_X {\rm I\kern-.3em E}[S_X^2] + \lambda_Y {\rm I\kern-.3em E}[S_Y^2] \\ & = (\lambda_X + \lambda_Y) \sigma^2 \\ & = \sigma^2 \end{align*}
We choose coefficients \lambda_X and \lambda_Y which reflect sample sizes \lambda_X := \frac{n - 1}{n + m - 2} \qquad \qquad \lambda_Y := \frac{m - 1}{n + m - 2}
Notes:
We have \lambda_X + \lambda_Y = 1
Denominators in \lambda_X and \lambda_Y are degrees of freedom {\rm df } = n + m - 2
This choice is made so that S^2 has chi-squared distribution (more on this later)
Note: S^2 is called the pooled variance and is denoted by S_p^2; its square root S_p is the pooled standard deviation
The t-statistic has currently the form T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
We replace \sigma with the pooled estimator S_p
Note: Under the Null Hypothesis that \mu_X = \mu_Y this becomes T = \frac{\overline{X} - \overline{Y}}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} = \frac{\overline{X} - \overline{Y}}{ \sqrt{ \dfrac{ (n-1) S_X^2 + (m-1) S_Y^2 }{n + m - 2} } \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
We have already seen that \overline{X} - \overline{Y} is normal with {\rm I\kern-.3em E}[\overline{X} - \overline{Y}] = \mu_X - \mu_Y \qquad \qquad {\rm Var}[\overline{X} - \overline{Y}] = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right)
Therefore we can rescale \overline{X} - \overline{Y} to get U := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ \sigma \sqrt{ \dfrac{1}{n} + \dfrac{1}{m}}} \sim N(0,1)
We are assuming X_1, \ldots, X_n iid N(\mu_X,\sigma^2)
Therefore, as already shown, we have \frac{ (n-1) S_X^2 }{ \sigma^2 } \sim \chi_{n-1}^2
Similarly, since Y_1, \ldots, Y_m iid N(\mu_Y,\sigma^2), we get \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{m-1}^2
Since X_i and Y_j are independent, we also have that \frac{ (n-1) S_X^2 }{ \sigma^2 } \quad \text{ and } \quad \frac{ (m-1) S_Y^2 }{ \sigma^2 } \quad \text{ are independent}
In particular we obtain \frac{ (n-1) S_X^2 }{ \sigma^2 } + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n-1}^2 + \chi_{m-1}^2 \sim \chi_{m + n- 2}^2
Recall the definition of S_p^2 S_p^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{ n + m - 2 }
Therefore V := \frac{ (n+m-2) S_p^2 }{ \sigma^2 } = \frac{ (n - 1) S_X^2}{ \sigma^2} + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n + m - 2}^2
By construction \overline{X}- \overline{Y} is independent of S_X^2 and S_Y^2
Therefore \overline{X}- \overline{Y} is independent of S_p^2
We conclude that U and V are independent
In conclusion, we have shown that T = \frac{U}{\sqrt{V/(n+m-2)}} \,, \qquad U \sim N(0,1) \,, \qquad V \sim \chi_{n + m - 2}^2
By the Theorem in Slide 118 of Lecture 2, we conclude that T \sim t_{n+m-2}
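A quick Monte Carlo sketch (with illustrative parameter values, not taken from the lecture) confirming that the pooled t-statistic follows the t_{n+m-2} distribution under H_0:
# Simulate the pooled two-sample t-statistic under H_0 (mu_X = mu_Y)
set.seed(1)
n <- 10; m <- 13; sigma <- 2     # Illustrative choices
T_sim <- replicate(10000, {
  x <- rnorm(n, mean = 5, sd = sigma)
  y <- rnorm(m, mean = 5, sd = sigma)
  sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  (mean(x) - mean(y)) / sqrt(sp2 * (1 / n + 1 / m))
})
# Simulated quantiles should be close to the t_{n+m-2} quantiles
quantile(T_sim, c(0.025, 0.975))
qt(c(0.025, 0.975), df = n + m - 2)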
Suppose we are given two independent samples X_1, \ldots, X_n from N(\mu_X,\sigma^2) and Y_1, \ldots, Y_m from N(\mu_Y,\sigma^2)
The two-sided hypothesis for difference in means is H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y
The one-sided alternative hypotheses are H_1 \colon \mu_X < \mu_Y \quad \text{ or } \quad H_1 \colon \mu_X > \mu_Y
Store the two samples in vectors using c()
x_sample <- c(x1, ..., xn)
y_sample <- c(y1, ..., ym)
The two-sample t-test on x_sample and y_sample is then performed with
t.test(x_sample, y_sample, var.equal = TRUE)
Comments on the command t.test(x, y):
The option alternative = "greater" tests H_1 \colon \mu_X - \mu_Y > \mu_0
The option alternative = "less" tests H_1 \colon \mu_X - \mu_Y < \mu_0
The option var.equal = TRUE tells R to assume that the populations have the same variance \sigma_X^2 = \sigma^2_Y
In this case R computes the t-statistic with the formula discussed earlier t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}
Warning: If var.equal = TRUE is not specified, then R assumes that the populations have different variances \sigma_X^2 \neq \sigma^2_Y
In this case the t-statistic t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }} is NOT t-distributed
R performs the Welch t-test instead of the classic t-test
(more on this later)
Mathematicians | x_1 | x_2 | x_3 | x_4 | x_5 | x_6 | x_7 | x_8 | x_9 | x_{10} |
---|---|---|---|---|---|---|---|---|---|---|
Wages | 36 | 40 | 46 | 54 | 57 | 58 | 59 | 60 | 62 | 63 |
Accountants | y_1 | y_2 | y_3 | y_4 | y_5 | y_6 | y_7 | y_8 | y_9 | y_{10} | y_{11} | y_{12} | y_{13} |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Wages | 37 | 37 | 42 | 44 | 46 | 48 | 54 | 56 | 59 | 60 | 60 | 64 | 64 |
Samples: Wage data on 10 Mathematicians and 13 Accountants
Question: Is there evidence of differences in average pay?
Answer: Two-sample two-sided t-test for the hypothesis H_0 \colon \mu_X = \mu_Y \,,\qquad H_1 \colon \mu_X \neq \mu_Y
Sample size: \ n = No. of Mathematicians = 10
Mean: \bar{x} = \frac{\sum_{i=1}^n x_i}{n} = \frac{36+40+46+ \ldots +62+63}{10}=\frac{535}{10}=53.5
Variance: \begin{align*} s^2_X & = \frac{\sum_{i=1}^n x_i^2 - n \bar{x}^2}{n -1 } \\ \sum_{i=1}^n x_i^2 & = 36^2+40^2+46^2+ \ldots +62^2+63^2 = 29435 \\ s^2_X & = \frac{29435-10(53.5)^2}{9} = 90.2778 \end{align*}
Sample size: \ m = No. of Accountants = 13
Mean: \bar{y} = \frac{37+37+42+ \dots +64+64}{13} = \frac{671}{13} = 51.6154
Variance: \begin{align*} s^2_Y & = \frac{\sum_{i=1}^m y_i^2 - m \bar{y}^2}{m - 1} \\ \sum_{i=1}^m y_i^2 & = 37^2+37^2+42^2+ \ldots +64^2+64^2 = 35783 \\ s^2_Y & = \frac{35783-13(51.6154)^2}{12} = 95.7547 \end{align*}
Pooled variance: \begin{align*} s_p^2 & = \frac{(n-1) s_X^2 + (m-1) s_Y^2}{ n + m - 2} \\ & = \frac{(9) 90.2778 + (12) 95.7547 }{ 10 + 13 - 2} \\ & = 93.40746 \end{align*}
Pooled standard deviation: s_p = \sqrt{93.40746} = 9.6648
\begin{align*} t & = \frac{\bar{x} - \bar{y} }{s_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \\ & = \frac{53.5 - 51.6154}{9.6648 \times \sqrt{\dfrac{1}{10}+\dfrac{1}{13}}} \\ & = \frac{1.8846}{9.6648{\times}0.4206} \\ & = 0.464 \,\, (3\ \text{d.p.}) \end{align*}
We have that | t | = 0.464 < 2.08 = t_{21}(0.025)
Therefore the p-value satisfies p>0.05
There is no evidence (p>0.05) in favor of H_1
Hence we accept that \mu_X = \mu_Y
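As a quick check, the hand computation can be reproduced in R (a sketch; the same wage data is entered again just below for t.test):
# Pooled two-sample t-statistic computed directly from the formula
x <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)                  # Mathematicians
y <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)      # Accountants
n <- length(x); m <- length(y)
sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)) # Pooled sd, approx 9.6648
(mean(x) - mean(y)) / (sp * sqrt(1 / n + 1 / m))                # Approximately 0.464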
# Enter Wages data in 2 vectors using function c()
mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)
# Perform two-sample t-test with null hypothesis mu_X = mu_Y
# Specify that populations have same variance
# Store result of t.test in answer
answer <- t.test(mathematicians, accountants, var.equal = TRUE)
# Print answer
print(answer)
Two Sample t-test
data: mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.569496 10.338727
sample estimates:
mean of x mean of y
53.50000 51.61538
Comments on output:
The data vectors used are mathematicians and accountants
The t-statistic is t = 0.46359 with df = 21 = n + m - 2, matching the hand computation
The sample mean of mathematicians is 53.5 and the sample mean of accountants is 51.61538
Conclusion: The p-value is p = 0.6477 > 0.05, hence we do not reject H_0 \colon \mu_X = \mu_Y
The previous two-sample t-test was conducted under the following assumptions: the two populations are normal, the two samples are independent, and the variances are equal (\sigma_X^2 = \sigma_Y^2)
Using R, we can plot the data to see if these are reasonable (graphical exploration)
Warnings: Even if the assumptions hold
Suppose we are given a data sample stored in a vector z
If the sample is large, we can check normality by plotting the histogram of z
Example: z is a sample of size 1000 from N(0,1) – Its histogram resembles N(0,1)
Drawback: Small samples \implies hard to check normality from histogram
Example: z below is a sample of size 9 from N(0,1) – But its histogram does not look normal
Solution: Suppose we are given an iid sample z from a distribution f
The command density(z) estimates the population distribution f (the estimate is based on the empirical distribution of z and smoothing – not an easy task)
Example: with z as in the previous slide, the plot of density(z) shows normal behavior
The R object density(z) models a 1D function (the estimated distribution of z):
density(z)$x contains the grid of x-values
density(z)$y contains the corresponding estimated density values
dz <- density(z)
plot(dz, # Plot dz
xlim = range(dz$x), # Set x-axis range
ylim = range(dz$y)) # Set y-axis range
Axes range set as the min and max values of components of dz
# Compute the estimated distributions
d.math <- density(mathematicians)
d.acc <- density(accountants)
# Plot the estimated distributions
plot(d.math, # Plot d.math
xlim = range(c(d.math$x, d.acc$x)), # Set x-axis range
ylim = range(c(d.math$y, d.acc$y)), # Set y-axis range
main = "Estimated Distributions of Wages") # Add title to plot
lines(d.acc, # Layer plot of d.acc
lty = 2) # Use different line style
legend("topleft", # Add legend at top-left
legend = c("Mathematicians", # Labels for legend
"Accountants"),
lty = c(1, 2)) # Assign curves to legend
Axes ranges are set to the min and max values of the components of d.math and d.acc
Wages data looks approximately normally distributed (roughly bell-shaped)
The two populations have similar variance (spreads look similar)
Conclusion: Two-sample t-test with equal variance is appropriate \implies accept H_0
We just examined the two-sample t-tests
This assumes independent normal populations with equal variance
\sigma_X^2 = \sigma_Y^2
Question: What happens if variances are different?
Answer: Use the Welch Two-sample t-test
R command: t.test(x, y) with var.equal = FALSE (the default, i.e. when var.equal = TRUE is not specified) performs the Welch t-test
Welch t-test consists in computing the Welch statistic w = \frac{\overline{x} - \overline{y}}{ \sqrt{ \dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m} } }
If sample sizes m,n > 5, then w is approximately t-distributed
If variances are similar, the Welch statistic is comparable to the t-statistic
w \approx t := \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}
If variances are similar: the Welch t-test and the classic two-sample t-test give almost the same answer (see the sketch below for the wage data)
If variances are very different: the classic two-sample t-test is not reliable, and the Welch t-test should be used
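For the wage data the two statistics are indeed close (a sketch reusing the mathematicians and accountants vectors entered earlier):
# Welch statistic vs pooled t-statistic for the wage data
x <- mathematicians; y <- accountants
n <- length(x); m <- length(y)
w <- (mean(x) - mean(y)) / sqrt(var(x) / n + var(y) / m)          # approx 0.465
sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))
t_stat <- (mean(x) - mean(y)) / (sp * sqrt(1 / n + 1 / m))        # approx 0.464
c(welch = w, pooled = t_stat)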
# Enter Wages data
mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)
# Perform Welch two-sample t-test with null hypothesis mu_X = mu_Y
# Store result of t.test in answer
answer <- t.test(mathematicians, accountants)
# Print answer
print(answer)
Note: var.equal = TRUE was not specified in t.test, so R performs the Welch two-sample t-test
Welch Two Sample t-test
data: mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-6.566879 10.336109
sample estimates:
mean of x mean of y
53.50000 51.61538
Comments on output: R performed the Welch t-test, since var.equal = TRUE was not specified; the degrees of freedom df = 19.795 are estimated from the data and need not be an integer
Conclusion: The p-values obtained with the 2 tests are almost the same
Welch t-test: p-value = 0.6467 \qquad Classic t-test: p-value = 0.6477
Both tests: p > 0.05, and therefore we do not reject H_0
Note: This was expected, since the two sample variances (s_X^2 = 90.28 and s_Y^2 = 95.75) are similar
We compare the Effect of Two Treatments on Blood Pressure Change
Both treatments are given to a group of patients
Measurements of changes in blood pressure are taken after 4 weeks of treatment
Note that changes represent both positive and negative shifts in blood pressure
Treat. A | -1.9 | -2.5 | -2.1 | -2.4 | -2.6 | -1.9 | ||||||
Treat. B | -1.1 | -0.9 | -1.4 | 0.2 | 0.3 | 0.6 | -5 | -2.4 | -1.5 | 2.3 | -2.8 | 2.1 |
# Enter changes in Blood pressure data
trA <- c(-1.9, -2.5, -2.1, -2.4, -2.6, -1.9)
trB <- c(-1.1, -0.9, -1.4, 0.2, 0.3, 0.6, -5,
-2.4, -1.5, 2.3, -2.8, 2.1)
cat("Mean of Treatment A:", mean(trA), "Mean of Treatment B:", mean(trB))
Mean of Treatment A: -2.233333 Mean of Treatment B: -0.8
Sample means show both Treatments are effective in decreasing blood pressure
However Treatment A seems slightly better
Question: Perform a t-test to see if Treatment A is better H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B
Note: The two samples have very different spreads, suggesting \sigma_A^2 \neq \sigma_B^2, so the Welch t-test is the appropriate choice
Conclusion: The Welch t-test and the classic two-sample t-test give different answers, as shown below
We are testing the one-sided hypothesis
H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B
# Perform Welch t-test and retrieve p-value
ans <- t.test(trA, trB, alternative = "less", var.equal = F)
ans$p.value
[1] 0.01866013
The p-value is p < 0.05
We reject H_0 \implies Treatment A is more effective
We are testing the one-sided hypothesis
H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B
# Perform two-sample t-test and retrieve p-value
ans <- t.test(trA, trB, alternative = "less", var.equal = T)
ans$p.value
[1] 0.05836482
The p-value is p > 0.05
H_0 cannot be rejected \implies There is no evidence that Treatment A is better
Wrong conclusion, because the classic two-sample t-test does not apply here (the variances are not equal)
The previous data was synthetic, and the background story was made up!
Nonetheless, the example is still valid
To construct the data, I sampled Treatment A from N(-2, 1) (6 values) and Treatment B from N(-1.5, 3^2) (12 values), as in the code below
We see that \mu_A < \mu_B \,, \qquad \sigma_A^2 \neq \sigma_B^2
This tells us that: the alternative H_1 \colon \mu_A < \mu_B is actually true, and the equal-variance assumption of the classic two-sample t-test is violated
# Set seed for random generation
# This way you always get the same random numbers when
# you run this code
set.seed(21)
repeat {
# Generate random samples
x <- rnorm(6, mean = -2, sd = 1)
y <- rnorm(12, mean = -1.5, sd = 3)
# Round x and y to 1 decimal point
x <- round(x, 1)
y <- round(y, 1)
# Perform one-sided t-tests for alternative hypothesis mu_x < mu_y
ans_welch <- t.test(x, y, alternative = "less", var.equal = F)
ans_t_test <- t.test(x, y, alternative = "less", var.equal = T)
# Check that Welch test succeeds and two-sample test fails
if (ans_welch$p.value < 0.05 && ans_t_test$p.value > 0.05) {
cat("Data successfully generated!!!\n\n")
cat("Synthetic Data TrA:", x, "\n")
cat("Synthetic Data TrB:", y, "\n\n")
cat("Welch t-test p-value:", ans_welch$p.value, "\n")
cat("Two-sample t-test p-value:", ans_t_test$p.value)
break
}
}
Data successfully generated!!!
Synthetic Data TrA: -1.9 -2.5 -2.1 -2.4 -2.6 -1.9
Synthetic Data TrB: -1.1 -0.9 -1.4 0.2 0.3 0.6 -5 -2.4 -1.5 2.3 -2.8 2.1
Welch t-test p-value: 0.01866013
Two-sample t-test p-value: 0.05836482
Method:
Sample the data as in previous slide (round to 1 d.p. for cleaner looking data)
Repeat until Welch test succeeds, and two-sample t-test fails \text{p-value of Welch test } \, < 0.05 < \, \text{p-value of Two-sample t-test}
Assume we have two samples of the same size
Sometimes the two samples depend on each other in some way
Suppose we are given two samples
Sample x_1, \ldots, x_n from N(\mu_X,\sigma^2_X)
Sample y_1, \ldots, y_n from N(\mu_Y,\sigma^2_Y)
The hypotheses for difference in means are H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y \,, \quad \mu_X < \mu_Y \,, \quad \text{ or } \quad \mu_X > \mu_Y
Assumption: The data is paired, meaning that the differences
d_i = x_i - y_i \,\, \text{ are iid} \,\, N(\mu,\sigma^2) \quad \text{where} \quad \mu := \mu_X - \mu_Y
The hypotheses for the difference in means are equivalent to
H_0 \colon \mu = 0 \,, \quad \qquad H_1 \colon \mu \neq 0 \,, \quad \mu < 0 \,, \quad \text{ or } \quad \mu > 0
These can be tested with a one-sample t-test
R commands: The paired t-test can be called with two equivalent commands:
t.test(x, y, paired = TRUE), which tests H_0 \colon \mu_X = \mu_Y
t.test(x - y), which tests H_0 \colon \mu = 0
Month | J | F | M | A | M | J | J | A | S | O | N | D |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CCI 2007 | 86 | 86 | 88 | 90 | 99 | 97 | 97 | 96 | 99 | 97 | 90 | 90 |
CCI 2009 | 24 | 22 | 21 | 21 | 19 | 18 | 17 | 18 | 21 | 23 | 22 | 21 |
Data: Monthly Consumer Confidence Index (CCI) in 2007 and 2009
Question: Did the crash of 2008 have lasting impact upon CCI?
Observations: the data is paired by month, and the 2009 CCI values are dramatically lower than the 2007 values
Method: Use paired t-test to investigate drop in mean CCI
H_0 \colon \mu_{2007} = \mu_{2009} \,, \quad H_1 \colon \mu_{2007} > \mu_{2009}
# Enter CCI data
score_2007 <- c(86, 86, 88, 90, 99, 97, 97, 96, 99, 97, 90, 90)
score_2009 <- c(24, 22, 21, 21, 19, 18, 17, 18, 21, 23, 22, 21)
# Perform paired t-test and print p-value
ans <- t.test(score_2007, score_2009, paired = T, alternative = "greater")
ans$p.value
[1] 2.430343e-13
The p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Drop in CCI
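As noted earlier, the paired t-test is equivalent to a one-sample t-test on the differences, so the same p-value is obtained with
# Equivalent one-sample t-test on the monthly differences
t.test(score_2007 - score_2009, alternative = "greater")$p.value
[1] 2.430343e-13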
It would be wrong to use a two-sample t-test
This is because the samples are paired, and hence dependent
This is further supported by computing the correlation
High correlation implies dependence
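The correlation was presumably computed with cor():
# Sample correlation between the 2007 and 2009 CCI values
cor(score_2007, score_2009)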
[1] -0.6076749
Researchers wish to measure water quality
There are two possible tests, one less expensive than the other
10 water samples were taken, and each was measured both ways
method1 | 45.9 | 57.6 | 54.9 | 38.7 | 35.7 | 39.2 | 45.9 | 43.2 | 45.4 | 54.8 |
method2 | 48.2 | 64.2 | 56.8 | 47.2 | 43.7 | 45.7 | 53.0 | 52.0 | 45.1 | 57.5 |
Question: Do the tests give the same results?
Observation: The data is paired (twin study situation)
Method: Use paired t-test to investigate equality of results
H_0 \colon \mu_1 = \mu_2 \,, \quad H_1 \colon \mu_1 \neq \mu_2
# Enter tests data
method1 <- c(45.9, 57.6, 54.9, 38.7, 35.7, 39.2, 45.9, 43.2, 45.4, 54.8)
method2 <- c(48.2, 64.2, 56.8, 47.2, 43.7, 45.7, 53.0, 52.0, 45.1, 57.5)
# Perform paired t-test and print p-value
ans <- t.test(method1, method2, paired = T)
ans$p.value
[1] 0.0006648526
p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Methods perform differently
It would be wrong to use a two-sample t-test
This is because the samples are paired, and hence dependent
This is also supported by the high sample correlation between the two methods
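Presumably computed with cor():
# Sample correlation between the two measurement methods
cor(method1, method2)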
[1] 0.9015147
In this example, performing a two-sample t-test would lead to the wrong decision
# Perform Welch t-test and print p-value
ans <- t.test(method1, method2, paired = F) # paired = F is default
ans$p.val
[1] 0.1165538
Wrong conclusion: \,\, p > 0.05 \, \implies \, cannot reject H_0 \, \implies \, Methods perform similarly
Bottom line: The data is paired, therefore a paired t-test must be used
Comments on the command t.test(x, y):
R will perform a two-sample t-test on the populations of x and y
R implicitly assumes the null hypothesis is H_0 \colon \mu_X - \mu_Y = 0
The option mu = mu0 tells R to test the null hypothesis H_0 \colon \mu_X - \mu_Y = \mu_0