Statistical Models

Lecture 4

Lecture 4:
The variance ratio and
two-sample t-tests

Outline of Lecture 4

  1. One-sample variance ratio test
  2. Worked example
  3. One-sample variance ratio test in R
  4. Two-sample hypothesis tests
  5. Two-sample t-test
  6. Two-sample t-test: Example
  7. The Welch t-test
  8. The t-test for paired samples

Part 1:
One-sample variance
ratio test

Task: Estimating mean and variance

  • Assume the population has normal distribution N(\mu,\sigma^2)
    • Mean \mu and variance \sigma^2 are unknown
  • Questions about \mu and \sigma^2
    1. What is my best guess of the value?
    2. How far away from the true value am I likely to be?
  • Answers:
    • The one-sample t-test answers questions about \mu (seen in Lecture 3)
    • The one-sample variance ratio test answers questions about \sigma^2

Reminder

  • The one-sample variance test uses the chi-squared distribution

  • Recall: Chi-squared distribution with p degrees of freedom is \chi_p^2 = Z_1^2 + \ldots + Z_p^2 where Z_1, \ldots, Z_p are iid N(0, 1)

One-sample one-sided variance ratio test

Assumption: Suppose given sample X_1,\ldots, X_n iid from N(\mu,\sigma^2)

Goal: Estimate variance \sigma^2 of population

Test:

  • Suppose \sigma_0 is guess for \sigma

  • The one-sided hypothesis test for \sigma is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0

What to do?

  • Consider the sample variance S^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1}

  • Since we believe H_0, the standard deviation is \sigma = \sigma_0

  • S^2 cannot be too far from the true variance \sigma^2

  • Therefore we cannot have that S^2 \gg \sigma^2 = \sigma_0^2

What to do?

  • If we observe S^2 \gg \sigma_0^2 then our guess \sigma_0 is probably wrong

  • Therefore we reject H_0 if S^2 \gg \sigma_0^2

  • The rejection condition S^2 \gg \sigma_0^2 is equivalent to \frac{(n-1)S^2}{\sigma_0^2} \gg 1 where n is the sample size

What to do?

  • We define our test statistic as \chi^2 := \frac{(n-1)S^2}{\sigma_0^2}

  • The rejection condition is hence \chi^2 \gg 1

What to do?

  • In Lecture 2, we have proven that \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2

  • Assuming \sigma=\sigma_0, we therefore have \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2

Summary: Rejection condition

  • We reject H_0 if \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} \gg 1

  • This means we do not want \chi^2 to be too extreme to the right

  • As \chi^2 \sim \chi_{n-1}^2, we decide to reject H_0 if \chi^2 > \chi_{n-1}^2(0.05)

  • By definition, the critical value \chi_{n-1}^2(0.05) is such that P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05

Critical values of chi-squared

  • x^* := \chi_{n-1}^2(0.05) is point on x-axis such that P(\chi_{n-1}^2 > x^* ) = 0.05

  • Therefore, the interval (x^*,+\infty) is the rejection region

  • Example: for n = 12 the critical value is \chi_{11}^2(0.05) = 19.68 (see the sketch below)
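
The density picture from the original slide is not reproduced here; a minimal R sketch to recreate it (the R commands used are introduced in Part 3):

# Plot the chi-squared density with df = 11 and mark the critical value
curve(dchisq(x, df = 11), from = 0, to = 35,
      xlab = "x", ylab = "density",
      main = "Chi-squared density, df = 11")
abline(v = qchisq(0.95, df = 11), lty = 2)   # critical value, approx 19.68
# The rejection region is the area to the right of the dashed line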

Critical values of chi-squared – Tables

  • Find Table 13.5 in this file
  • Look at the row with Degree of Freedom n-1 (or its closest value)
  • Find critical value \chi^2_{n-1}(0.05) in column \alpha = 0.05
  • Example: n=12, DF =11, \chi^2_{11}(0.05) = 19.68

The p-value

  • Given the test statistic \chi^2, the p-value is defined as p := P( \chi_{n-1}^2 > \chi^2 )

  • Notice that p < 0.05 \qquad \iff \qquad \chi^2 > \chi_{n-1}^2(0.05)

This is because \chi^2 > \chi_{n-1}^2(0.05) iff p = P(\chi_{n-1}^2 > \chi^2) < P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05
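
A quick numerical check of this equivalence (the R commands pchisq and qchisq are introduced in Part 3), again for n = 12:

# Critical value for df = 11
crit <- qchisq(0.95, df = 11)   # approx 19.68

1 - pchisq(crit, df = 11)       # p-value at the critical value is exactly 0.05
1 - pchisq(25, df = 11)         # statistic above the critical value: p < 0.05
1 - pchisq(15, df = 11)         # statistic below the critical value: p > 0.05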

One-sample one-sided variance ratio test

The procedure

Suppose given

  • Sample x_1, \ldots, x_n of size n from N(\mu,\sigma^2)
  • Guess \sigma_0 for \sigma

The one-sided hypothesis test is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0 The variance ratio test consists of 3 steps

One-sample one-sided variance ratio test

The procedure

  1. Calculation: Compute the chi-squared statistic \chi^2 = \frac{(n-1) s^2}{\sigma_0^2} where sample mean and variance are \overline{x} = \frac{1}{n} \sum_{i=1}^n x_i \,, \qquad s^2 = \frac{\sum_{i=1}^n x_i^2 - n \overline{x}^2}{n-1}

One-sample one-sided variance ratio test

The procedure

  2. Statistical Tables or R: Find either
    • Critical value in Table 13.5 \chi_{n-1}^2(0.05)
    • p-value in R p := P( \chi_{n-1}^2 > \chi^2 ) (more on this later)

One-sample one-sided variance ratio test

The procedure

  3. Interpretation:
    • Reject H_0 if \chi^2 > \chi_{n-1}^2(0.05) \qquad \text{ or } \qquad p < 0.05
    • Do not reject H_0 if \chi^2 \leq \chi_{n-1}^2(0.05) \qquad \text{ or } \qquad p \geq 0.05

Part 2:
Worked example

One-sample variance ratio test: Example

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Data: Consumer Expectation (CE) and Consumer Spending (CS) in 2011
  • Assumption: CE and CS are normally distributed

One-sample variance ratio test: Example

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Remark: Monthly data on CE and CS can be matched
    • Hence consider: \quad Difference = CE - CS
    • CE and CS normal \quad \implies \quad Difference \sim N(\mu,\sigma^2)
  • Question: Test the following hypothesis: H_0 \colon \sigma = 1 \qquad H_1 \colon \sigma > 1

Motivation of test

  • If X \sim N(\mu,\sigma^2) then P( \mu - 2 \sigma \leq X \leq \mu + 2\sigma ) \approx 0.95

  • Recall: \quad Difference = (CE - CS) \sim N(\mu,\sigma^2)

  • Hence if \sigma = 1 P( \mu - 2 \leq {\rm CE} - {\rm CS} \leq \mu + 2 ) \approx 0.95

  • Meaning of variance ratio test: \sigma=1 \quad \implies \quad \text{CS index is within } \pm{2} \text{ of CE index with probability } 0.95
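
A one-line R check of the two-sigma rule used in the first bullet above:

# P(mu - 2*sigma <= X <= mu + 2*sigma) = P(-2 <= Z <= 2) for Z standard normal
pnorm(2) - pnorm(-2)
[1] 0.9544997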

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Sample mean: \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{-6-2-7-4- \ldots -22-26}{12} = -\frac{138}{12} = -11.5

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Sample variance: \begin{align*} \sum_{i=1}^{n} x_i^2 & = (-6)^2 + (-2)^2 + (-7)^2 + \ldots + (-22)^2 + (-26)^2 = 2432 \\ s^2 & = \frac{\sum_{i=1}^n x^2_i- n \bar{x}^2}{n-1} = \frac{2432-12(-11.5)^2}{11} = \frac{845}{11} = 76.8182 \end{align*}

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Chi-squared statistic: \chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{11 \left(\frac{845}{11}\right) }{1} = 845

The variance ratio test by hand

  2. Statistical Tables:
    • Sample size is n = 12
    • Degrees of freedom are {\rm df} = n-1 = 11
    • In Table 13.5 find \chi_{11}^2(0.05) = 19.68
  3. Interpretation:
    • Test statistic is \chi^2 = 845
    • We reject H_0 because \chi^2 = 845 > 19.68 = \chi_{11}^2(0.05)

The variance ratio test by hand

  4. Conclusion:
    • We accept H_1: The standard deviation satisfies \sigma > 1
    • A better estimate for \sigma could be sample standard deviation s=\sqrt{\frac{845}{11}}=8.765
    • This suggests: With probability 0.95 \text{CS index is within } \pm 2 \times 8.765 = \pm 17.53 \text{ of CE index}
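
A quick R check of the hand calculations above, using the explicit formulas (Part 3 obtains the same \chi^2 with the built-in sd() function):

# Verify the hand calculations for the Difference data
x <- c(-6, -2, -7, -4, -4, -5, -7, -14, -16, -25, -22, -26)
n <- length(x)

sum(x) / n                                   # sample mean: -11.5
sum(x^2)                                     # sum of squares: 2432
s2 <- (sum(x^2) - n * mean(x)^2) / (n - 1)   # sample variance: 845/11
(n - 1) * s2 / 1^2                           # chi-squared statistic: 845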

Part 3:
One-sample variance
ratio test in R

The variance ratio test in R

Goal: Perform chi-squared variance ratio test in R

  • For this, we need to compute p-value p = P(\chi_{n-1}^2 > \chi^2)

  • Thus, we need to compute probabilities for chi-squared distribution in R

Probability Distributions in R

  • R can natively do calculations with known probability distributions
  • Example: Let X be r.v. with N(\mu,\sigma^2) distribution
R command Computes
pnorm(x, mean = mu, sd = sig) P(X \leq x)
qnorm(p, mean = mu, sd = sig) q such that P(X \leq q) = p
dnorm(x, mean = mu, sd = sig) f(x), where f is pdf of X
rnorm(n, mean = mu, sd = sig) n random samples from distr. of X

Note: Syntax of commands

norm = normal \qquad p = probability \qquad q = quantile \qquad d = density \qquad r = random

Example

  • Suppose average height of women is normally distributed N(\mu,\sigma^2)
  • Assume mean \mu = 163 cm and standard deviation \sigma = 8 cm


# Probability woman exceeds 180cm in height
# P(X > 180) = 1 - P(X <= 180)

1 - pnorm(180, mean = 163, sd = 8)
[1] 0.01679331

# The upper 10th percentile for women height, that is,
# height q such that P(X <= q) = 0.9

qnorm(0.90, mean = 163, sd = 8)
[1] 173.2524


# Value of pdf at 163

dnorm(163, mean = 163, sd = 8)
[1] 0.04986779


# Generate random sample of size 5

rnorm(5, mean = 163, sd = 8)
[1] 158.4961 172.1274 165.9390 182.5685 154.1374

Question: What is the height of the tallest woman, according to the model?

  • The tallest woman could be found using quantiles
  • There are roughly 3.5 billion women
  • The tallest would be in the top 1/(3.5 billion) quantile
# Find the top 1/(3.5 billion) quantile

p = 1 - 1 / (3.5e9)
qnorm(p, mean = 163, sd = 8)
[1] 212.5849
  • The current (living) tallest woman is Rumeysa Gelgi at 215 cm (Wikipedia Page)

Probability Distributions in R

Chi-squared distribution

  • Commands for the chi-squared distribution are similar
  • df = n denotes n degrees of freedom
R command Computes
pchisq(x, df = n) P(X \leq x)
qchisq(p, df = n) q such that P(X \leq q) = p
dchisq(x, df = n) f(x), where f is pdf of X
rchisq(m, df = n) m random samples from distr. of X

Example 1

  • From Tables we found quantile \chi_{11}^2 (0.05) = 19.68
  • Question: Compute such quantile in R

Example 1 – Solution

  • From Tables we found quantile \chi_{11}^2 (0.05) = 19.68
  • Question: Compute such quantile in R


# Compute 0.95 quantile for chi-squared with 11 degrees of freedom

quantile <- qchisq(0.95, df = 11)

cat("The 0.95 quantile for chi-squared with df = 11 is", quantile)
The 0.95 quantile for chi-squared with df = 11 is 19.67514

Example 2

  • The \chi^2 statistic for variance ratio test has distribution \chi_{n-1}^2

  • Question: Compute the p-value p := P(\chi_{n-1}^2 > \chi^2)

Example 2 – Solution

  • The \chi^2 statistic for variance ratio test has distribution \chi_{n-1}^2

  • Question: Compute the p-value p := P(\chi_{n-1}^2 > \chi^2)

  • Observe that p := P(\chi_{n-1}^2 > \chi^2) = 1 - P(\chi_{n-1}^2 \leq \chi^2)

  • The code is therefore

# Compute p-value for chi^2 = chi_squared and df = n

p_value <- 1 - pchisq(chi_squared, df = n)

The variance ratio test in R

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Back to the Worked Example: Monthly data on CE and CS

  • Question: Test the following hypothesis: H_0 \colon \sigma = 1 \qquad H_1 \colon \sigma > 1

The variance ratio test in R

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Start by entering data into R
# Enter Consumer Expectation and Consumer Spending data
CE <- c(66, 53, 62, 61, 78, 72, 65, 64, 61, 50, 55, 51)
CS <- c(72, 55, 69, 65, 82, 77, 72, 78, 77, 75, 77, 77)

# Compute difference
difference <- CE - CS

The variance ratio test in R

  • Compute chi-squared statistic \chi^2 = \frac{(n-1) s^2}{\sigma^2_0}
# Compute sample size
n <- length(difference)

# Enter null hypothesis
sigma_0 <- 1

# Compute sample standard deviation
s <- sd(difference)

# Compute chi-squared statistic
chi_squared <- (n - 1) * s ^ 2 / sigma_0 ^ 2

The variance ratio test in R

  • Compute the p-value, and print to screen p = P(\chi_{n-1}^2 > \chi^2) = 1 - P(\chi_{n-1}^2 \leq \chi^2)
# Compute p-value
p_value <- 1 - pchisq(chi_squared, df = n - 1)

# Print p-value
cat("The p-value for one-sided variance test is", p_value)


Running the code

The p-value for one-sided variance test is 0


  • Since p = 0 < 0.05 we reject H_0
  • Therefore the true variance seems to be \sigma^2 > 1

Part 4:
Two-sample
hypothesis tests

Overview

In Lecture 3:

  • We looked at data on CCI before and after the 2008 crash
  • In this case data for each month is directly comparable
  • Can then construct the difference between the 2007 and 2009 values
  • Analysis reduces from a two-sample to a one-sample problem

Question
How do we analyze two samples that cannot be paired?

Problem statement

Goal: compare mean and variance of 2 independent normal samples

  • First sample:
    • X_1, \ldots, X_n from normal population N(\mu_X,\sigma_X^2)
  • Second sample:
    • Y_1, \ldots, Y_m from normal population N(\mu_Y,\sigma_Y^2)
  • We may have n \neq m
    • Samples cannot be paired due to different size!

Tests available:

  • Two-sample t-test to test for difference in means
  • Two-sample F-test to test for difference in variances (next week)

Why is this important?

  • Hypothesis testing starts to get interesting with 2 or more samples

  • t-test and F-test show the normal distribution family in action

  • This is also the maths behind regression

    • Same methods apply to seemingly unrelated problems
    • Regression is a big subject in statistics

Normal distribution family in action

Two-sample t-test

  • Want to compare the means of two independent samples
  • At the same time population variances are unknown
  • Therefore both variances are estimated with sample variances
  • Test statistic is t_k-distributed with k linked to the total number of observations

Normal distribution family in action

Two-sample F-test

  • Want to compare the variance of two independent samples

  • This can be done by studying the ratio of the sample variances S^2_X/S^2_Y

  • We have already shown that \frac{(n - 1) S^2_X}{\sigma^2_X} \sim \chi^2_{n - 1} \qquad \frac{(m - 1) S^2_Y}{\sigma^2_Y} \sim \chi^2_{m - 1}

Normal distribution family in action

Two-sample F-test

  • Hence we can study statistic F = \frac{S^2_X / \sigma_X^2}{S^2_Y / \sigma_Y^2}

  • We will see that F has F-distribution (next week)

Part 5:
Two-sample t-test

The two-sample t-test

Assumptions: Suppose given samples from 2 normal populations

  • X_1, \ldots ,X_n iid with distribution N(\mu_X,\sigma_X^2)
  • Y_1, \ldots ,Y_m iid with distribution N(\mu_Y,\sigma_Y^2)

Further assumptions:

  • In general n \neq m, so that one-sample t-test cannot be applied
  • The two populations have same variance \sigma^2_X = \sigma^2_Y = \sigma^2

Note: Assuming same variance is simplification. Removing it leads to Welch t-test

The two-sample t-test

Goal: Compare means \mu_X and \mu_Y

Hypothesis set: We test for a difference in means H_0 \colon \mu_X = \mu_Y \qquad H_1 \colon \mu_X \neq \mu_Y

t-statistic: The general form is T = \frac{\text{Estimate}-\text{Hypothesised value}}{\text{e.s.e.}}

The two-sample t-statistic

  • Define the sample means \overline{X} = \frac{1}{n} \sum_{i=1}^n X_i \qquad \qquad \overline{Y} = \frac{1}{m} \sum_{i=1}^m Y_i

  • Notice that {\rm I\kern-.3em E}[ \overline{X} ] = \mu_X \qquad \qquad {\rm I\kern-.3em E}[ \overline{Y} ] = \mu_Y

  • Therefore we can estimate \mu_X - \mu_Y with the sample means, that is, \text{Estimate} = \overline{X} - \overline{Y}

The two-sample t-statistic

  • Since we are testing for difference in mean, we have \text{Hypothesised value} = \mu_X - \mu_Y

  • The Estimated Standard Error is the standard deviation of estimator \text{e.s.e.} = \text{Standard Deviation of } \overline{X} -\overline{Y}

The two-sample t-statistic

  • Therefore the two-sample t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}}

  • Under the Null Hypothesis that \mu_X = \mu_Y, the t-statistic becomes T = \frac{\overline{X} - \overline{Y} }{\text{e.s.e.}}

A note on the degrees of freedom (df)

  • The general rule is \text{df} = \text{Sample size} - \text{No. of estimated parameters}

  • Sample size in two-sample t-test:

    • n in the first sample
    • m in the second sample
    • Hence total number of observations is n + m
  • No. of estimated parameters is 2: Namely \mu_X and \mu_Y

  • Hence the degrees of freedom in the two-sample t-test are {\rm df} = n + m - 2 (more on this later)

The estimated standard error

  • Recall: We are assuming populations have same variance \sigma^2_X = \sigma^2_Y = \sigma^2

  • We need to compute the estimated standard error \text{e.s.e.} = \text{Standard Deviation of } \ \overline{X} -\overline{Y}

  • Variance of sample mean was computed in the Lemma in Slide 72 Lecture 2

  • Since \overline{X} \sim N(\mu_X,\sigma^2) and \overline{Y} \sim N(\mu_Y,\sigma^2), by the Lemma we get {\rm Var}[\overline{X}] = \frac{\sigma^2}{n} \,, \qquad \quad {\rm Var}[\overline{Y}] = \frac{\sigma^2}{m}

The estimated standard error

  • Since X_i and Y_j are independent we get {\rm Cov}(X_i,Y_j)=0

  • By bilinearity of covariance we infer {\rm Cov}( \overline{X} , \overline{Y} ) = \frac{1}{n \cdot m} \sum_{i=1}^n \sum_{j=1}^m {\rm Cov}(X_i,Y_j) = 0

  • We can then compute \begin{align*} {\rm Var}[ \overline{X} - \overline{Y} ] & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] - 2 {\rm Cov}( \overline{X} , \overline{Y} ) \\ & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] \\ & = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right) \end{align*}

The estimated standard error

  • Taking the square root gives \text{S.D.}(\overline{X} - \overline{Y} )= \sigma \ \sqrt{\frac{1}{n}+\frac{1}{m}}

  • Therefore, the t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}} = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
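
As a sanity check of the formula {\rm Var}[\overline{X} - \overline{Y}] = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right), here is a small Monte Carlo simulation; the sample sizes and \sigma below are illustrative choices, not part of the lecture:

# Simulate many values of Xbar - Ybar and compare variances
set.seed(1)
n <- 10; m <- 13; sigma <- 2
diffs <- replicate(10000, mean(rnorm(n, sd = sigma)) - mean(rnorm(m, sd = sigma)))

var(diffs)                # simulated variance of Xbar - Ybar
sigma^2 * (1/n + 1/m)     # theoretical variance, approx 0.708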

Estimating the variance

The t-statistic is currently T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

  • Variance \sigma^2 is unknown: we need to estimate it!

  • Define the sample variances

S_X^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1} \qquad \qquad S_Y^2 = \frac{ \sum_{i=1}^m Y_i^2 - m \overline{Y}^2 }{m-1}

Estimating the variance

  • Recall that X_1, \ldots , X_n \sim N(\mu_X, \sigma^2) \qquad \qquad Y_1, \ldots , Y_m \sim N(\mu_Y, \sigma^2)

  • From Lecture 2, we know that S_X^2 and S_Y^2 are unbiased estimators of \sigma^2, i.e. {\rm I\kern-.3em E}[ S_X^2 ] = {\rm I\kern-.3em E}[ S_Y^2 ] = \sigma^2

  • Therefore, both S_X^2 and S_Y^2 can be used to estimate \sigma^2

Estimating the variance

  • We can improve the estimate of \sigma^2 by combining S_X^2 and S_Y^2

  • We will consider a (convex) linear combination S^2 := \lambda_X S_X^2 + \lambda_Y S_Y^2 \,, \qquad \lambda_X + \lambda_Y = 1

  • S^2 is still an unbiased estimator of \sigma^2, since \begin{align*} {\rm I\kern-.3em E}[S^2] & = {\rm I\kern-.3em E}[ \lambda_X S_X^2 + \lambda_Y S_Y^2 ] \\ & = \lambda_X {\rm I\kern-.3em E}[S_X^2] + \lambda_Y {\rm I\kern-.3em E}[S_Y^2] \\ & = (\lambda_X + \lambda_Y) \sigma^2 \\ & = \sigma^2 \end{align*}

Estimating the variance

We choose coefficients \lambda_X and \lambda_Y which reflect sample sizes \lambda_X := \frac{n - 1}{n + m - 2} \qquad \qquad \lambda_Y := \frac{m - 1}{n + m - 2}

Notes:

  • We have \lambda_X + \lambda_Y = 1

  • Denominators in \lambda_X and \lambda_Y are degrees of freedom {\rm df } = n + m - 2

  • This choice is made so that S^2 has chi-squared distribution (more on this later)

Pooled estimator of variance

Definition
The pooled estimator of \sigma^2 is defined as S_p^2 := \lambda_X S_X^2 + \lambda_Y S_Y^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{n + m - 2}

Note:

  • n=m implies \lambda_X = \lambda_Y
  • In this case S_X^2 and S_Y^2 have same weight in S_p^2

The two-sample t-statistic

  • The t-statistic has currently the form T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

  • We replace \sigma with the pooled estimator S_p

The two-sample t-statistic

Definition
The two sample t-statistic is defined as T := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

Note: Under the Null Hypothesis that \mu_X = \mu_Y this becomes T = \frac{\overline{X} - \overline{Y}}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} = \frac{\overline{X} - \overline{Y}}{ \sqrt{ \dfrac{ (n-1) S_X^2 + (m-1) S_Y^2 }{n + m - 2} } \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
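
A minimal R sketch of these formulas; the helper function two_sample_t is a hypothetical name, and R's built-in t.test (used later with var.equal = TRUE) performs the same computation:

# Pooled two-sample t-statistic under H0: mu_X = mu_Y
# x and y are numeric vectors holding the two samples
two_sample_t <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # Pooled variance: var() already uses the (n-1) divisor
  sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  # t-statistic
  (mean(x) - mean(y)) / (sqrt(sp2) * sqrt(1/n + 1/m))
}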

Distribution of two-sample t-statistic

Theorem
The two sample t-statistic has t_{n+m-2} distribution T := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \sim t_{n + m - 2}

Distribution of two-sample t-statistic

Proof

  • We have already seen that \overline{X} - \overline{Y} is normal with {\rm I\kern-.3em E}[\overline{X} - \overline{Y}] = \mu_X - \mu_Y \qquad \qquad {\rm Var}[\overline{X} - \overline{Y}] = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right)

  • Therefore we can rescale \overline{X} - \overline{Y} to get U := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ \sigma \sqrt{ \dfrac{1}{n} + \dfrac{1}{m}}} \sim N(0,1)

Distribution of two-sample t-statistic

Proof

  • We are assuming X_1, \ldots, X_n iid N(\mu_X,\sigma^2)

  • Therefore, as already shown, we have \frac{ (n-1) S_X^2 }{ \sigma^2 } \sim \chi_{n-1}^2

  • Similarly, since Y_1, \ldots, Y_m iid N(\mu_Y,\sigma^2), we get \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{m-1}^2

Distribution of two-sample t-statistic

Proof

  • Since X_i and Y_j are independent, we also have that \frac{ (n-1) S_X^2 }{ \sigma^2 } \quad \text{ and } \quad \frac{ (m-1) S_Y^2 }{ \sigma^2 } \quad \text{ are independent}

  • In particular we obtain \frac{ (n-1) S_X^2 }{ \sigma^2 } + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n-1}^2 + \chi_{m-1}^2 \sim \chi_{m + n- 2}^2

Distribution of two-sample t-statistic

Proof

  • Recall the definition of S_p^2 S_p^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{ n + m - 2 }

  • Therefore V := \frac{ (n+m-2) S_p^2 }{ \sigma^2 } = \frac{ (n - 1) S_X^2}{ \sigma^2} + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n + m - 2}^2

Distribution of two-sample t-statistic

Proof

  • Rewrite T as \begin{align*} T & = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \\ & = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ \sigma \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} } } \Bigg/ \sqrt{ \frac{ (n + m - 2) S_p^2 \big/ \sigma^2}{ (n+ m - 2) } } \\ & = \frac{U}{\sqrt{V/(n+m-2)}} \end{align*}

Distribution of two-sample t-statistic

Proof

  • By construction \overline{X}- \overline{Y} is independent of S_X^2 and S_Y^2

  • Therefore \overline{X}- \overline{Y} is independent of S_p^2

  • We conclude that U and V are independent

  • In conclusion, we have shown that T = \frac{U}{\sqrt{V/(n+m-2)}} \,, \qquad U \sim N(0,1) \,, \qquad V \sim \chi_{n + m - 2}^2

  • By the Theorem in Slide 118 of Lecture 2, we conclude that T \sim t_{n+m-2}
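
The Theorem can also be checked by simulation: under the assumptions, simulated values of T should behave like a t_{n+m-2} sample. A sketch with illustrative parameter choices:

# Simulate T under H0 (mu_X = mu_Y) and compare quantiles with t_{n+m-2}
set.seed(2)
n <- 6; m <- 8; sigma <- 2
T_sim <- replicate(10000, {
  x <- rnorm(n, mean = 0, sd = sigma)
  y <- rnorm(m, mean = 0, sd = sigma)
  sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))
  (mean(x) - mean(y)) / (sp * sqrt(1/n + 1/m))
})

quantile(T_sim, c(0.025, 0.975))       # simulated 2.5% and 97.5% quantiles
qt(c(0.025, 0.975), df = n + m - 2)    # theoretical t quantiles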

The two-sample t-test

Suppose given two independent samples

  • Sample x_1, \ldots, x_n from N(\mu_X,\sigma^2) of size n
  • Sample y_1, \ldots, y_m from N(\mu_Y,\sigma^2) of size m

The two-sided hypothesis for difference in means is H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y

The one-sided alternative hypotheses are H_1 \colon \mu_X < \mu_Y \quad \text{ or } \quad H_1 \colon \mu_X > \mu_Y

Procedure: 3 Steps

  1. Calculation: Compute the two-sample t-statistic t = \frac{ \overline{x} - \overline{y}}{ s_p \ \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }} where sample means and pooled variance estimator are \overline{x} = \frac{1}{n} \sum_{i=1}^n x_i \qquad \overline{y} = \frac{1}{m} \sum_{i=1}^m y_i \qquad s_p^2 = \frac{ (n-1) s_X^2 + (m - 1) s_Y^2 }{ m + n - 2} s_X^2 = \frac{\sum_{i=1}^n x_i^2 - n \overline{x}^2}{n-1} \qquad s_Y^2 = \frac{\sum_{i=1}^m y_i^2 - m \overline{y}^2}{m-1}

Procedure: 3 Steps

  2. Statistical Tables or R: Find either
    • Critical value in Table 13.1 t^* = \begin{cases} t_{n + m - 2} (0.025) & \quad \text{ if two-sided test}\\ t_{n + m - 2} (0.05) & \quad \text{ if one-sided test} \end{cases}
    • p-value in R p = \begin{cases} 2 P( t_{n + m -2} > |t| ) & \quad \text{ if } \,\, H_1 \colon \mu_X \neq \mu_Y \\ P( t_{n + m -2} < t ) & \quad \text{ if } \,\, H_1 \colon \mu_X < \mu_Y \\ P( t_{n + m -2} > t ) & \quad \text{ if } \,\, H_1 \colon \mu_X > \mu_Y \end{cases}

Procedure: 3 Steps

  3. Interpretation:
    • For two-sided t-test we reject H_0 if |t| > t_{n + m - 2} (0.025) \quad \text{ or } \quad p < 0.05
    • For one-sided t-test we reject H_0 if t < -t_{n + m - 2} (0.05) \quad \text{ or } \quad p < 0.05 \quad \text{when } \, H_1 \colon \mu_X < \mu_Y \,, \quad \text{ and if } \quad t > t_{n + m - 2} (0.05) \quad \text{ or } \quad p < 0.05 \quad \text{when } \,\, H_1 \colon \mu_X > \mu_Y

The two-sample t-test in R

General commands

  1. Store the samples x_1,\ldots,x_n and y_1,\ldots,y_m in two R vectors
    • x_sample <- c(x1, ..., xn)
    • y_sample <- c(y1, ..., ym)
  2. Perform a two-sided two-sample t-test on x_sample and y_sample
    • t.test(x_sample, y_sample, var.equal = TRUE)
  3. Read output
    • Output is similar to one-sample t-test
    • The main quantity of interest is p-value

Comments on command t.test(x, y)

  1. R will perform a two-sample t-test on populations x and y

  2. R implicitly assumes the null hypothesis is H_0 \colon \mu_X - \mu_Y = 0

  3. mu = mu0 tells R to test null hypothesis: H_0 \colon \mu_X - \mu_Y = \mu_0

Comments on command t.test(x, y)

  1. One-sided two sample t-test can be performed by specifying
    • alternative = "greater" which tests H_1 \colon \mu_X - \mu_Y > \mu_0
    • alternative = "less" which tests H_1 \colon \mu_X - \mu_Y < \mu_0

Comments on command t.test(x, y)

  1. var.equal = TRUE tells R to assume that populations have same variance \sigma_X^2 = \sigma^2_Y

  2. In this case R computes the t-statistic with formula discussed earlier t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}

Comments on command t.test(x, y)

Warning: If var.equal = TRUE is not specified then

  • R assumes that populations have different variance \sigma_X^2 \neq \sigma^2_Y

  • In this case the t-statistic t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }} is NOT t-distributed

  • R performs the Welch t-test instead of the classic t-test
    (more on this later)

Part 6:
Two-sample t-test
Example

Mathematicians x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 x_9 x_{10}
Wages 36 40 46 54 57 58 59 60 62 63
Accountants y_1 y_2 y_3 y_4 y_5 y_6 y_7 y_8 y_9 y_{10} y_{11} y_{12} y_{13}
Wages 37 37 42 44 46 48 54 56 59 60 60 64 64


  • Samples: Wage data on 10 Mathematicians and 13 Accountants

    • Wages are independent and normally distributed
    • Populations have equal variance
  • Question: Is there evidence of differences in average pay?

  • Answer: Two-sample two-sided t-test for the hypothesis H_0 \colon \mu_X = \mu_Y \,,\qquad H_1 \colon \mu_X \neq \mu_Y

Calculations: First sample

  • Sample size: \ n = No. of Mathematicians = 10

  • Mean: \bar{x} = \frac{\sum_{i=1}^n x_i}{n} = \frac{36+40+46+ \ldots +62+63}{10}=\frac{535}{10}=53.5

  • Variance: \begin{align*} s^2_X & = \frac{\sum_{i=1}^n x_i^2 - n \bar{x}^2}{n -1 } \\ \sum_{i=1}^n x_i^2 & = 36^2+40^2+46^2+ \ldots +62^2+63^2 = 29435 \\ s^2_X & = \frac{29435-10(53.5)^2}{9} = 90.2778 \end{align*}

Calculations: Second sample

  • Sample size: \ m = No. of Accountants = 13

  • Mean: \bar{y} = \frac{37+37+42+ \dots +64+64}{13} = \frac{671}{13} = 51.6154

  • Variance: \begin{align*} s^2_Y & = \frac{\sum_{i=1}^m y_i^2 - m \bar{y}^2}{m - 1} \\ \sum_{i=1}^m y_i^2 & = 37^2+37^2+42^2+ \ldots +64^2+64^2 = 35783 \\ s^2_Y & = \frac{35783-13(51.6154)^2}{12} = 95.7547 \end{align*}

Calculations: Pooled Variance

  • Pooled variance: \begin{align*} s_p^2 & = \frac{(n-1) s_X^2 + (m-1) s_Y^2}{ n + m - 2} \\ & = \frac{(9) 90.2778 + (12) 95.7547 }{ 10 + 13 - 2} \\ & = 93.40746 \end{align*}

  • Pooled standard deviation: s_p = \sqrt{93.40746} = 9.6648

Calculations: t-statistic

  1. Calculation: Compute the two-sample t-statistic

\begin{align*} t & = \frac{\bar{x} - \bar{y} }{s_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \\ & = \frac{53.5 - 51.6154}{9.6648 \times \sqrt{\dfrac{1}{10}+\dfrac{1}{13}}} \\ & = \frac{1.8846}{9.6648{\times}0.4206} \\ & = 0.464 \,\, (3\ \text{d.p.}) \end{align*}
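
These hand calculations can be checked with a few lines of R (the same wage data is entered again in the next section to run t.test):

# Check the hand computation of s_p and t
x <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)              # mathematicians
y <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)  # accountants
n <- length(x); m <- length(y)

sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))   # approx 9.6648
(mean(x) - mean(y)) / (sp * sqrt(1/n + 1/m))                      # approx 0.464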

Completing the t-test

  2. Referencing Tables:
    • Degrees of freedom are {\rm df} = n + m - 2 = 10 + 13 - 2 = 21
    • Find corresponding critical value in Table 13.1 t_{21}(0.025) = 2.08

Completing the t-test

  3. Interpretation:
    • We have that | t | = 0.464 < 2.08 = t_{21}(0.025)

    • Therefore the p-value satisfies p>0.05

    • There is no evidence (p>0.05) in favor of H_1

    • Hence we accept that \mu_X = \mu_Y

  4. Conclusion: Average pay levels seem to be the same for both professions

The two-sample t-test in R: Code

# Enter Wages data in 2 vectors using function c()

mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)


# Perform two-sample t-test with null hypothesis mu_X = mu_Y
# Specify that populations have same variance
# Store result of t.test in answer

answer <- t.test(mathematicians, accountants, var.equal = TRUE)


# Print answer
print(answer)


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. First line: R tells us that a Two-Sample t-test is performed
  2. Second line: Data for t-test is mathematicians and accountants


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Third line:
    • The t-statistic computed is t = 0.46359
    • Note: This coincides with the one computed by hand!
    • There are 21 degrees of freedom
    • The p-value is p = 0.6477


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Fourth line: The alternative hypothesis is that the difference in means is not zero
    • This translates to H_1 \colon \mu_X \neq \mu_Y
    • Warning: This is not saying to reject H_0 – R is just stating H_1


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Fifth line: R computes a 95 \% confidence interval for \mu_X - \mu_Y (\mu_X - \mu_Y) \in [-6.569496, 10.338727]
    • Interpretation: If you repeat the experiment (on new data) over and over, the interval [a,b] will contain \mu_X - \mu_Y about 95\% of the times


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Seventh line: R computes sample mean for the two populations
    • Sample mean for mathematicians is 53.5
    • Sample mean for accountants is 51.61538


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Conclusion: The p-value is p = 0.6477

  • Since p > 0.05 we do not reject H_0
  • Hence \mu_X and \mu_Y appear to be similar
  • Average pay levels seem to be the same for both professions

Comment on Assumptions

The previous two-sample t-test was conducted under the following assumptions:

  1. Wages data is normally distributed
  2. The two populations have equal variance

Using R, we can plot the data to see if these are reasonable (graphical exploration)

Warnings: Even if the assumptions hold

  • we cannot expect the samples to be exactly normal (bell-shaped)
    • rather, look for approximate normality
  • we cannot expect the sample variances to match
    • rather, look for a similar spread in the data

Estimating the sample distribution in R

Suppose given a data sample stored in a vector z

  • If the sample is large, we can check normality by plotting the histogram of z

  • Example: z sample of size 1000 from N(0,1) – Its histogram resembles N(0,1)

z <- rnorm(1000)                # Sample 1000 times from N(0,1) 
hist(z, probability = TRUE)     # Plot histogram, with area scaled to 1

Drawback: Small samples \implies hard to check normality from histogram

  • This is true even if the data is normal
  • Example: z below is sample of size 9 from N(0,1) – But histogram not normal
z <- c(-0.78, -1.67, -0.38,  0.92, -0.58,  
       0.61, -1.62, -0.06, 0.52)           # Random from N(0,1)
hist(z, probability = TRUE)                # Histogram not normal

Solution: Suppose given iid sample z from a distribution f

  • The command density(z) estimates the population distribution f
    (Estimate based on the sample z and smoothing - not an easy task)

  • Example: z as in previous slide. The plot of density(z) shows normal behavior

z <- c(-0.78, -1.67, -0.38,  0.92, -0.58,  
       0.61, -1.62, -0.06, 0.52)           # Random from N(0,1)
dz <- density(z)                           # Estimate the density of z
plot(dz)                                   # Plot the estimated density

The R object density(z) models a 1D function (the estimated distribution of z)

  • As such, it contains a grid of x values, with associated y values
    • x values are stored in vector density(z)$x
    • y values are stored in vector density(z)$y
  • These values are useful to set the axis range in a plot
dz <- density(z)

plot(dz,                        # Plot dz
     xlim = range(dz$x),        # Set x-axis range
     ylim = range(dz$y))        # Set y-axis range

Axes range set as the min and max values of components of dz

Checking the Assumptions on our Example

# Compute the estimated distributions
d.math <- density(mathematicians)
d.acc <- density(accountants)

# Plot the estimated distributions

plot(d.math,                                    # Plot d.math
     xlim = range(c(d.math$x, d.acc$x)),        # Set x-axis range
     ylim = range(c(d.math$y, d.acc$y)),        # Set y-axis range
     main = "Estimated Distributions of Wages") # Add title to plot
lines(d.acc,                                    # Layer plot of d.acc
      lty = 2)                                  # Use different line style
         
legend("topleft",                               # Add legend at top-left
       legend = c("Mathematicians",             # Labels for legend
                  "Accountants"), 
       lty = c(1, 2))                           # Assign curves to legend

Axes range set as the min and max values of components of d.math and d.acc

  1. Wages data looks approximately normally distributed (roughly bell-shaped)

  2. The two populations have similar variance (spreads look similar)

Conclusion: Two-sample t-test with equal variance is appropriate \implies accept H_0

Part 7:
The Welch t-test

Samples with different variance

  • We just examined the two-sample t-tests

  • This assumes independent normal populations with equal variance

\sigma_X^2 = \sigma_Y^2

  • Question: What happens if variances are different?

  • Answer: Use the Welch Two-sample t-test

    • This is a generalization of the two-sample t-test to the case \sigma_X^2 \neq \sigma_Y^2
    • In R it is performed with t.test(x, y)
    • Note that we are just omitting the option var.equal = TRUE
    • Equivalently, you may specify var.equal = FALSE

The Welch two-sample t-test

  • Welch t-test consists in computing the Welch statistic w = \frac{\overline{x} - \overline{y}}{ \sqrt{ \dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m} } }

  • If sample sizes m,n > 5, then w is approximately t-distributed

    • Degrees of freedom are not integer, and depend on S_X, S_Y, n, m
  • If variances are similar, the Welch statistic is comparable to the t-statistic

w \approx t : = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}
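
The exact degrees-of-freedom formula is not given in these slides; the sketch below uses the standard Welch-Satterthwaite approximation, applied to the wage data of Part 6, and the resulting values agree with the R output shown later:

# Welch statistic and approximate degrees of freedom (Welch-Satterthwaite)
x <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)              # mathematicians
y <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)  # accountants
n <- length(x); m <- length(y)

se2 <- var(x) / n + var(y) / m
(mean(x) - mean(y)) / sqrt(se2)                               # w, approx 0.46546
se2^2 / ((var(x)/n)^2 / (n - 1) + (var(y)/m)^2 / (m - 1))     # df, approx 19.795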

Welch t-test Vs two-sample t-test

If variances are similar:

  • Welch statistic and t-statistic are similar
  • p-value from Welch t-test is similar to p-value from two-sample t-test
  • Since p-values are similar, the two tests usually yield the same decision
  • The tests can be used interchangeably

If variances are very different:

  • Welch statistic and t-statistic are different
  • p-values from the two tests can differ a lot
  • The two tests might give different decision
  • Wrong to apply two-sample t-test, as variances are different

The Welch two-sample t-test in R

# Enter Wages data

mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)


# Perform Welch two-sample t-test with null hypothesis mu_X = mu_Y
# Store result of t.test in answer

answer <- t.test(mathematicians, accountants)


# Print answer
print(answer)
  • Note:
    • This is almost the same code as in Slide 86
    • Only difference: we are omitting the option var.equal = TRUE in t.test


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. First line: R tells us that a Welch Two-Sample t-test is performed
    • The rest of the output is similar to classic t-test
    • Main difference is that p-value and t-statistic differ from classic t-test


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Third line:
    • The Welch t-statistic is w = 0.46546 (standard t-test gave t = 0.46359)
    • Degrees of freedom are fractional \rm{df} = 19.795 (standard t-test \rm{df} = 21)
    • The Welch t-statistic is approximately t-distributed with w \approx t_{19.795}
  2. Fifth line: The confidence interval for \mu_X - \mu_Y is also different


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Conclusion: The p-values obtained with the 2 tests are almost the same

  • Welch t-test: p-value = 0.6467 \qquad Classic t-test: p-value = 0.6477

  • Both tests: p > 0.05, and therefore we do not reject H_0

  • Note: This was expected

    • The spread of the two populations is similar \implies \, \sigma_X^2 \approx \sigma_Y^2
    • Hence, Welch t-statistic approximates t-statistic \implies p-values are similar

Exercise

  • We compare the Effect of Two Treatments on Blood Pressure Change

  • Both treatments are given to a group of patients

  • Measurements of changes in blood pressure are taken after 4 weeks of treatment

  • Note that changes represent both positive and negative shifts in blood pressure


Treat. A -1.9 -2.5 -2.1 -2.4 -2.6 -1.9
Treat. B -1.1 -0.9 -1.4 0.2 0.3 0.6 -5 -2.4 -1.5 2.3 -2.8 2.1

# Enter changes in Blood pressure data

trA <- c(-1.9, -2.5, -2.1, -2.4, -2.6, -1.9)
trB <- c(-1.1, -0.9, -1.4, 0.2, 0.3, 0.6, -5,
         -2.4, -1.5, 2.3, -2.8, 2.1)

cat("Mean of Treatment A:", mean(trA), "Mean of Treatment B:", mean(trB))
Mean of Treatment A: -2.233333 Mean of Treatment B: -0.8


  • Sample means show both Treatments are effective in decreasing blood pressure

  • However Treatment A seems slightly better

Question: Perform a t-test to see if Treatment A is better H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

Solution: Estimated density of Treatment A

plot(density(trA))    
  • Estimated density looks bell-shaped \implies First population is normal
  • Sample seems concentrated between -3 and -1.5

Estimated density of Treatment B

plot(density(trB))    
  • Estimated density looks bell-shaped \implies Second population is normal
  • Sample seems concentrated between -7 and 4

Findings

  • Both populations are normal \implies t-test is appropriate
  • First sample seems concentrated between -3 and -1.5
  • Second sample seems concentrated between -7 and 4
  • Treatment B has larger spread
  • Therefore we suspect that populations have different variance

\sigma_A^2 \neq \sigma_B^2

Conclusion:

  • The Welch t-test is appropriate
  • Two sample t-test would not be appropriate (as it assumes equal variance)

Apply the Welch t-test

We are testing the one-sided hypothesis

H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

# Perform Welch t-test and retrieve p-value

ans <- t.test(trA, trB, alternative = "less", var.equal = F) 
ans$p.value
[1] 0.01866013


  • The p-value is p < 0.05

  • We reject H_0 \implies Treatment A is more effective

Two-sample t-test gives different decision

We are testing the one-sided hypothesis

H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

# Perform two-sample t-test and retrieve p-value

ans <- t.test(trA, trB, alternative = "less", var.equal = T) 
ans$p.value
[1] 0.05836482


  • The p-value is p > 0.05

  • H_0 cannot be rejected \implies There is no evidence that Treatment A is better

  • Wrong conclusion, because two-sample t-test does not apply

Disclaimer

  • The previous data was synthetic, and the background story was made up!

  • Nonetheless, the example is still valid

  • To construct the data, I sampled as follows

    • Treatment A: Sample of size 6 from N(-2,1)
    • Treatment B: Sample of size 12 from N(-1.5,9)
  • We see that \mu_A < \mu_B \,, \qquad \sigma_A^2 \neq \sigma_B^2

  • This tells us that:

    • We can expect that some samples will support that \mu_A < \mu_B
    • Two-sample t-test is inappropriate because \sigma_A^2 \neq \sigma_B^2

Generating the Data

Below is the code I used to generate the data
# Set seed for random generation
# This way you always get the same random numbers when
# you run this code
set.seed(21) 

repeat {
  # Generate random samples
  x <- rnorm(6, mean = -2, sd = 1)
  y <- rnorm(12, mean = -1.5, sd = 3)

  # Round x and y to 1 decimal point
  x <- round(x, 1)
  y <- round(y, 1)
  
  # Perform one-sided t-tests for alternative hypothesis mu_x < mu_y 
  ans_welch <- t.test(x, y, alternative = "less", var.equal = F)
  ans_t_test <- t.test(x, y, alternative = "less", var.equal = T)
   
  # Check that Welch test succeeds and two-sample test fails
  if (ans_welch$p.value < 0.05 && ans_t_test$p.value > 0.05) {
    cat("Data successfully generated!!!\n\n")
    cat("Synthetic Data TrA:", x, "\n")
    cat("Synthetic Data TrB:", y, "\n\n")
    cat("Welch t-test p-value:", ans_welch$p.value, "\n")
    cat("Two-sample t-test p-value:", ans_t_test$p.value)
    break
    }
}
Data successfully generated!!!

Synthetic Data TrA: -1.9 -2.5 -2.1 -2.4 -2.6 -1.9 
Synthetic Data TrB: -1.1 -0.9 -1.4 0.2 0.3 0.6 -5 -2.4 -1.5 2.3 -2.8 2.1 

Welch t-test p-value: 0.01866013 
Two-sample t-test p-value: 0.05836482

Method:

  • Sample the data as in previous slide (round to 1 d.p. for cleaner looking data)

  • Repeat until Welch test succeeds, and two-sample t-test fails \text{p-value of Welch test } \, < 0.05 < \, \text{p-value of Two-sample t-test}

Part 8:
The t-test for
paired samples

Paired samples

  • Assume we have two samples of the same size

  • Sometimes the two samples depend on each other in some way

  1. Twin studies:
    • Twins are used as pairs (to control genetic or environmental factors)
    • Example: test effectiveness of a medical treatment against placebo
    • The two samples are clearly dependent
      (think of twins as the same person)
    • As such, the usual two-sample t-test is not applicable
      (because it assumes independence)

Paired samples

  • Assume we have two samples of the same size

  • Sometimes the two samples depend on each other in some way

  1. Pre-test and Post-test
    • Measure the outcome of a certain action
    • Example: does this module work for teaching R?
    • We can assess the effectiveness of something with a pre-test and a post-test
    • The two samples are clearly dependent
      (each individual takes a test twice)
    • As such, the usual two-sample t-test is not applicable
      (because it assumes independence)

The paired t-test

Suppose given two samples

  • Sample x_1, \ldots, x_n from N(\mu_X,\sigma^2_X)

  • Sample y_1, \ldots, y_n from N(\mu_Y,\sigma^2_Y)

The hypotheses for difference in means are H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y \,, \quad \mu_X < \mu_Y \,, \quad \text{ or } \quad \mu_X > \mu_Y

The paired t-test

Assumption: The data is paired, meaning that the differences

d_i = x_i - y_i \,\, \text{ are iid} \,\, N(\mu,\sigma^2) \quad \text{where} \quad \mu := \mu_X - \mu_Y

The hypotheses for the difference in means are equivalent to

H_0 \colon \mu = 0 \,, \quad \qquad H_1 \colon \mu \neq 0 \,, \quad \mu < 0 \,, \quad \text{ or } \quad \mu > 0

These can be tested with a one-sample t-test


R commands: The paired t-test can be called with the equivalent commands

  • t.test(x, y, paired = TRUE), which tests H_0 \colon \mu_X = \mu_Y

  • t.test(x - y), which tests H_0 \colon \mu = 0

Example 1: The 2008 crisis (again!)

Month J F M A M J J A S O N D
CCI 2007 86 86 88 90 99 97 97 96 99 97 90 90
CCI 2009 24 22 21 21 19 18 17 18 21 23 22 21
  • Data: Monthly Consumer Confidence Index (CCI) in 2007 and 2009

  • Question: Did the crash of 2008 have lasting impact upon CCI?

  • Observations:

    • Data shows a massive drop in CCI between 2007 and 2009
    • Data is clearly paired (Pre-test and Post-test situation)
  • Method: Use paired t-test to investigate drop in mean CCI

H_0 \colon \mu_{2007} = \mu_{2009} \,, \quad H_1 \colon \mu_{2007} > \mu_{2009}

Perform paired t-test


# Enter CCI data
score_2007 <- c(86, 86, 88, 90, 99, 97, 97, 96, 99, 97, 90, 90)
score_2009 <- c(24, 22, 21, 21, 19, 18, 17, 18, 21, 23, 22, 21)

# Perform paired t-test and print p-value
ans <- t.test(score_2007, score_2009, paired = T, alternative = "greater")
ans$p.value
[1] 2.430343e-13


The p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Drop in CCI
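
As a check of the equivalence stated earlier, t.test(x, y, paired = TRUE) and t.test(x - y) give the same result on this data:

# Paired t-test and one-sample t-test on the differences coincide
t.test(score_2007, score_2009, paired = TRUE, alternative = "greater")$p.value
t.test(score_2007 - score_2009, alternative = "greater")$p.value
# Both return 2.430343e-13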

Warning

It would be wrong to use a two-sample t-test

  • This is because the samples are paired, and hence dependent

  • This is further supported by computing the correlation

  • High correlation implies dependence


cor(score_2007, score_2009)                # Correlation is high
[1] -0.6076749

Example 2: Water quality samples

  • Researchers wish to measure water quality

  • There are two possible tests, one less expensive than the other

  • 10 water samples were taken, and each was measured both ways

method1 45.9 57.6 54.9 38.7 35.7 39.2 45.9 43.2 45.4 54.8
method2 48.2 64.2 56.8 47.2 43.7 45.7 53.0 52.0 45.1 57.5

Question: Do the tests give the same results?

  • Observation: The data is paired (twin study situation)

  • Method: Use paired t-test to investigate equality of results

H_0 \colon \mu_1 = \mu_2 \,, \quad H_1 \colon \mu_1 \neq \mu_2

Perform paired t-test


# Enter tests data
method1 <- c(45.9, 57.6, 54.9, 38.7, 35.7, 39.2, 45.9, 43.2, 45.4, 54.8)
method2 <- c(48.2, 64.2, 56.8, 47.2, 43.7, 45.7, 53.0, 52.0, 45.1, 57.5)

# Perform paired t-test and print p-value
ans <- t.test(method1, method2, paired = T)
ans$p.value
[1] 0.0006648526


p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Methods perform differently

Warning

It would be wrong to use a two-sample t-test

  • This is because the samples are paired, and hence dependent

  • This is also supported by the high sample correlation

cor(method1, method2)                # Correlation is very high
[1] 0.9015147

Warning

In this Example, performing a two-sample t-test would lead to wrong decision


# Perform Welch t-test and print p-value
ans <- t.test(method1, method2, paired = F)       # paired = F is default
ans$p.val
[1] 0.1165538


Wrong conclusion: \,\, p > 0.05 \, \implies \, cannot reject H_0 \, \implies \, Methods perform similarly


Bottom line: The data is paired, therefore a paired t-test must be used