Statistical Models

Lecture 4

Lecture 4:
The variance ratio and
two-sample t-tests

Outline of Lecture 4

  1. One-sample variance ratio test
  2. Worked example
  3. One-sample variance ratio test in R
  4. Two-sample hypothesis tests
  5. Two-sample t-test
  6. Two-sample t-test: Example
  7. The Welch t-test
  8. The t-test for paired samples

Part 1:
One-sample variance
ratio test

Task: Estimating mean and variance

  • Assume the population has normal distribution N(\mu,\sigma^2)
    • Mean \mu and variance \sigma^2 are unknown
  • Questions about \mu and \sigma^2
    1. What is my best guess of the value?
    2. How far away from the true value am I likely to be?
  • Answers:
    • The one-sample t-test answers questions about \mu (seen in Lecture 3)
    • The one-sample variance ratio test answers questions about \sigma^2

Reminder

  • The one-sample variance test uses the chi-squared distribution

  • Recall: Chi-squared distribution with p degrees of freedom is \chi_p^2 = Z_1^2 + \ldots + Z_p^2 where Z_1, \ldots, Z_p are iid N(0, 1)

One-sample one-sided variance ratio test

Assumption: Suppose given sample X_1,\ldots, X_n iid from N(\mu,\sigma^2)

Goal: Estimate variance \sigma^2 of population

Test:

  • Suppose \sigma_0 is guess for \sigma

  • The one-sided hypothesis test for \sigma is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0

What to do?

  • Consider the sample variance S^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1}

  • Since we believe H_0, the standard deviation is \sigma = \sigma_0

  • S^2 cannot be too far from the true variance \sigma^2

  • Therefore we cannot have that S^2 \gg \sigma^2 = \sigma_0^2

What to do?

  • If we observe S^2 \gg \sigma_0^2 then our guess \sigma_0 is probably wrong

  • Therefore we reject H_0 if S^2 \gg \sigma_0^2

  • The rejection condition S^2 \gg \sigma_0^2 is equivalent to \frac{(n-1)S^2}{\sigma_0^2} \gg 1 where n is the sample size

What to do?

  • We define our test statistic as \chi^2 := \frac{(n-1)S^2}{\sigma_0^2}

  • The rejection condition is hence \chi^2 \gg 1

What to do?

  • In Lecture 2, we have proven that \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2

  • Assuming \sigma=\sigma_0, we therefore have \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2

Summary: Rejection condition

  • We reject H_0 if \chi^2 = \frac{(n-1)S^2}{\sigma_0^2} \gg 1

  • This means we do not want \chi^2 to be too extreme to the right

  • As \chi^2 \sim \chi_{n-1}^2, we decide to reject H_0 if \chi^2 > \chi_{n-1}^2(0.05)

  • By definition, the critical value \chi_{n-1}^2(0.05) is such that P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05

Critical values of chi-squared

  • x^* := \chi_{n-1}^2(0.05) is point on x-axis such that P(\chi_{n-1}^2 > x^* ) = 0.05

  • Therefore, the interval (x^*,+\infty) is the rejection region

  • Example: for n = 12 the critical value is \chi_{11}^2(0.05) = 19.68 (see the sketch below)
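
The density picture from the original slide is not reproduced here; a minimal R sketch to recreate it (the R commands used are introduced in Part 3):

# Plot the chi-squared density with df = 11 and mark the critical value
curve(dchisq(x, df = 11), from = 0, to = 35,
      xlab = "x", ylab = "density",
      main = "Chi-squared density, df = 11")
abline(v = qchisq(0.95, df = 11), lty = 2)   # critical value, approx 19.68
# The rejection region is the area to the right of the dashed line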

Critical values of chi-squared – Tables

  • Find Table 13.5 in this file
  • Look at the row with Degree of Freedom n-1 (or its closest value)
  • Find critical value \chi^2_{n-1}(0.05) in column \alpha = 0.05
  • Example: n=12, DF =11, \chi^2_{11}(0.05) = 19.68

The p-value

  • Given the test statistic \chi^2, the p-value is defined as p := P( \chi_{n-1}^2 > \chi^2 )

  • Notice that p < 0.05 \qquad \iff \qquad \chi^2 > \chi_{n-1}^2(0.05)

This is because \chi^2 > \chi_{n-1}^2(0.05) iff p = P(\chi_{n-1}^2 > \chi^2) < P(\chi_{n-1}^2 > \chi_{n-1}^2(0.05) ) = 0.05
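
A quick numerical check of this equivalence (the R commands pchisq and qchisq are introduced in Part 3), again for n = 12:

# Critical value for df = 11
crit <- qchisq(0.95, df = 11)   # approx 19.68

1 - pchisq(crit, df = 11)       # p-value at the critical value is exactly 0.05
1 - pchisq(25, df = 11)         # statistic above the critical value: p < 0.05
1 - pchisq(15, df = 11)         # statistic below the critical value: p > 0.05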

One-sample one-sided variance ratio test

The procedure

Suppose given

  • Sample x_1, \ldots, x_n of size n from N(\mu,\sigma^2)
  • Guess \sigma_0 for \sigma

The one-sided hypothesis test is H_0 \colon \sigma = \sigma_0 \qquad H_1 \colon \sigma > \sigma_0 The variance ratio test consists of 3 steps

One-sample one-sided variance ratio test

The procedure

  1. Calculation: Compute the chi-squared statistic \chi^2 = \frac{(n-1) s^2}{\sigma_0^2} where sample mean and variance are \overline{x} = \frac{1}{n} \sum_{i=1}^n x_i \,, \qquad s^2 = \frac{\sum_{i=1}^n x_i^2 - n \overline{x}^2}{n-1}

One-sample one-sided variance ratio test

The procedure

  2. Statistical Tables or R: Find either
    • Critical value in Table 13.5 \chi_{n-1}^2(0.05)
    • p-value in R p := P( \chi_{n-1}^2 > \chi^2 ) (more on this later)

One-sample one-sided variance ratio test

The procedure

  3. Interpretation:
    • Reject H_0 if \chi^2 > \chi_{n-1}^2(0.05) \qquad \text{ or } \qquad p < 0.05
    • Do not reject H_0 if \chi^2 \leq \chi_{n-1}^2(0.05) \qquad \text{ or } \qquad p \geq 0.05

Part 2:
Worked example

One-sample variance ratio test: Example

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Data: Consumer Expectation (CE) and Consumer Spending (CS) in 2011
  • Assumption: CE and CS are normally distributed

One-sample variance ratio test: Example

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Remark: Monthly data on CE and CS can be matched
    • Hence consider: \quad Difference = CE - CS
    • CE and CS normal \quad \implies \quad Difference \sim N(\mu,\sigma^2)
  • Question: Test the following hypothesis: H_0 \colon \sigma = 1 \qquad H_1 \colon \sigma > 1

Motivation of test

  • If X \sim N(\mu,\sigma^2) then P( \mu - 2 \sigma \leq X \leq \mu + 2\sigma ) \approx 0.95

  • Recall: \quad Difference = (CE - CS) \sim N(\mu,\sigma^2)

  • Hence if \sigma = 1 P( \mu - 2 \leq {\rm CE} - {\rm CS} \leq \mu + 2 ) \approx 0.95

  • Meaning of variance ratio test: \sigma=1 \quad \implies \quad \text{CS index is within } \pm{2} \text{ of CE index with probability } 0.95
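
A one-line R check of the two-sigma rule used in the first bullet above:

# P(mu - 2*sigma <= X <= mu + 2*sigma) = P(-2 <= Z <= 2) for Z standard normal
pnorm(2) - pnorm(-2)
[1] 0.9544997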

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Sample mean: \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{-6-2-7-4- \ldots -22-26}{12} = -\frac{138}{12} = -11.5

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Sample variance: \begin{align*} \sum_{i=1}^{n} x_i^2 & = (-6)^2 + (-2)^2 + (-7)^2 + \ldots + (-22)^2 + (-26)^2 = 2432 \\ s^2 & = \frac{\sum_{i=1}^n x^2_i- n \bar{x}^2}{n-1} = \frac{2432-12(-11.5)^2}{11} = \frac{845}{11} = 76.8182 \end{align*}

The variance ratio test by hand

Month J F M A M J J A S O N D
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  1. Calculations: Using the above data, compute
  • Chi-squared statistic: \chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{11 \left(\frac{845}{11}\right) }{1} = 845

The variance ratio test by hand

  2. Statistical Tables:
    • Sample size is n = 12
    • Degrees of freedom are {\rm df} = n-1 = 11
    • In Table 13.5 find \chi_{11}^2(0.05) = 19.68
  3. Interpretation:
    • Test statistic is \chi^2 = 845
    • We reject H_0 because \chi^2 = 845 > 19.68 = \chi_{11}^2(0.05)

The variance ratio test by hand

  4. Conclusion:
    • We accept H_1: The standard deviation satisfies \sigma > 1
    • A better estimate for \sigma could be sample standard deviation s=\sqrt{\frac{845}{11}}=8.765
    • This suggests: With probability 0.95 \text{CS index is within } \pm 2 \times 8.765 = \pm 17.53 \text{ of CE index}
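
A quick R check of the hand calculations above, using the explicit formulas (Part 3 obtains the same \chi^2 with the built-in sd() function):

# Verify the hand calculations for the Difference data
x <- c(-6, -2, -7, -4, -4, -5, -7, -14, -16, -25, -22, -26)
n <- length(x)

sum(x) / n                                   # sample mean: -11.5
sum(x^2)                                     # sum of squares: 2432
s2 <- (sum(x^2) - n * mean(x)^2) / (n - 1)   # sample variance: 845/11
(n - 1) * s2 / 1^2                           # chi-squared statistic: 845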

Part 3:
One-sample variance
ratio test in R

The variance ratio test in R

Goal: Perform chi-squared variance ratio test in R

  • For this, we need to compute p-value p = P(\chi_{n-1}^2 > \chi^2)

  • Thus, we need to compute probabilities for chi-squared distribution in R

Probability Distributions in R

  • R can natively do calculations with known probability distributions
  • Example: Let X be r.v. with N(\mu,\sigma^2) distribution
R command Computes
pnorm(x, mean = mu, sd = sig) P(X \leq x)
qnorm(p, mean = mu, sd = sig) q such that P(X \leq q) = p
dnorm(x, mean = mu, sd = sig) f(x), where f is pdf of X
rnorm(n, mean = mu, sd = sig) n random samples from distr. of X

Note: Syntax of commands

norm = normal \qquad p = probability \qquad q = quantile \qquad d = density \qquad r = random

Example

  • Suppose average height of women is normally distributed N(\mu,\sigma^2)
  • Assume mean \mu = 163 cm and standard deviation \sigma = 8 cm


# Probability woman exceeds 180cm in height
# P(X > 180) = 1 - P(X <= 180)

1 - pnorm(180, mean = 163, sd = 8)
[1] 0.01679331

# The upper 10th percentile for women height, that is,
# height q such that P(X <= q) = 0.9

qnorm(0.90, mean = 163, sd = 8)
[1] 173.2524


# Value of pdf at 163

dnorm(163, mean = 163, sd = 8)
[1] 0.04986779


# Generate random sample of size 5

rnorm(5, mean = 163, sd = 8)
[1] 158.4961 172.1274 165.9390 182.5685 154.1374

Question: What is the height of the tallest woman, according to the model?

  • The tallest woman could be found using quantiles
  • There are roughly 3.5 billion women
  • The tallest would be in the top 1/(3.5 billion) quantile
# Find the top 1/(3.5 billion) quantile

p = 1 - 1 / (3.5e9)
qnorm(p, mean = 163, sd = 8)
[1] 212.5849
  • The current (living) tallest woman is Rumeysa Gelgi at 215 cm (Wikipedia Page)

Probability Distributions in R

Chi-squared distribution

  • Commands for the chi-squared distribution are similar
  • df = n denotes n degrees of freedom
R command Computes
pchisq(x, df = n) P(X \leq x)
qchisq(p, df = n) q such that P(X \leq q) = p
dchisq(x, df = n) f(x), where f is pdf of X
rchisq(m, df = n) m random samples from distr. of X

Example 1

  • From Tables we found quantile \chi_{11}^2 (0.05) = 19.68
  • Question: Compute such quantile in R

Example 1 – Solution

  • From Tables we found quantile \chi_{11}^2 (0.05) = 19.68
  • Question: Compute such quantile in R


# Compute 0.95 quantile for chi-squared with 11 degrees of freedom

quantile <- qchisq(0.95, df = 11)

cat("The 0.95 quantile for chi-squared with df = 11 is", quantile)
The 0.95 quantile for chi-squared with df = 11 is 19.67514

Example 2

  • The \chi^2 statistic for variance ratio test has distribution \chi_{n-1}^2

  • Question: Compute the p-value p := P(\chi_{n-1}^2 > \chi^2)

Example 2 – Solution

  • The \chi^2 statistic for variance ratio test has distribution \chi_{n-1}^2

  • Question: Compute the p-value p := P(\chi_{n-1}^2 > \chi^2)

  • Observe that p := P(\chi_{n-1}^2 > \chi^2) = 1 - P(\chi_{n-1}^2 \leq \chi^2)

  • The code is therefore

# Compute p-value for chi^2 = chi_squared and df = n

p_value <- 1 - pchisq(chi_squared, df = n)

The variance ratio test in R

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Back to the Worked Example: Monthly data on CE and CS

  • Question: Test the following hypothesis: H_0 \colon \sigma = 1 \qquad H_1 \colon \sigma > 1

The variance ratio test in R

Month J F M A M J J A S O N D
Cons. Expectation 66 53 62 61 78 72 65 64 61 50 55 51
Cons. Spending 72 55 69 65 82 77 72 78 77 75 77 77
Difference -6 -2 -7 -4 -4 -5 -7 -14 -16 -25 -22 -26


  • Start by entering data into R
# Enter Consumer Expectation and Consumer Spending data
CE <- c(66, 53, 62, 61, 78, 72, 65, 64, 61, 50, 55, 51)
CS <- c(72, 55, 69, 65, 82, 77, 72, 78, 77, 75, 77, 77)

# Compute difference
difference <- CE - CS

The variance ratio test in R

  • Compute chi-squared statistic \chi^2 = \frac{(n-1) s^2}{\sigma^2_0}
# Compute sample size
n <- length(difference)

# Enter null hypothesis
sigma_0 <- 1

# Compute sample standard deviation
s <- sd(difference)

# Compute chi-squared statistic
chi_squared <- (n - 1) * s ^ 2 / sigma_0 ^ 2

The variance ratio test in R

  • Compute the p-value, and print to screen p = P(\chi_{n-1}^2 > \chi^2) = 1 - P(\chi_{n-1}^2 \leq \chi^2)
# Compute p-value
p_value <- 1 - pchisq(chi_squared, df = n - 1)

# Print p-value
cat("The p-value for one-sided variance test is", p_value)


Running the code

The p-value for one-sided variance test is 0


  • Since p = 0 < 0.05 we reject H_0
  • Therefore the true variance seems to be \sigma^2 > 1

Part 4:
Two-sample
hypothesis tests

Overview

In Lecture 3:

  • We looked at data on CCI before and after the 2008 crash
  • In this case data for each month is directly comparable
  • Can then construct the difference between the 2007 and 2009 values
  • Analysis reduces from a two-sample to a one-sample problem

Question
How do we analyze two samples that cannot be paired?

Problem statement

Goal: compare mean and variance of 2 independent normal samples

  • First sample:
    • X_1, \ldots, X_n from normal population N(\mu_X,\sigma_X^2)
  • Second sample:
    • Y_1, \ldots, Y_m from normal population N(\mu_Y,\sigma_Y^2)
  • We may have n \neq m
    • Samples cannot be paired due to different size!

Tests available:

  • Two-sample t-test to test for difference in means
  • Two-sample F-test to test for difference in variances (next week)

Why is this important?

  • Hypothesis testing starts to get interesting with 2 or more samples

  • t-test and F-test show the normal distribution family in action

  • This is also the maths behind regression

    • Same methods apply to seemingly unrelated problems
    • Regression is a big subject in statistics

Normal distribution family in action

Two-sample t-test

  • Want to compare the means of two independent samples
  • At the same time population variances are unknown
  • Therefore both variances are estimated with sample variances
  • Test statistic is t_k-distributed with k linked to the total number of observations

Normal distribution family in action

Two-sample F-test

  • Want to compare the variance of two independent samples

  • This can be done by studying the ratio of the sample variances S^2_X/S^2_Y

  • We have already shown that \frac{(n - 1) S^2_X}{\sigma^2_X} \sim \chi^2_{n - 1} \qquad \frac{(m - 1) S^2_Y}{\sigma^2_Y} \sim \chi^2_{m - 1}

Normal distribution family in action

Two-sample F-test

  • Hence we can study statistic F = \frac{S^2_X / \sigma_X^2}{S^2_Y / \sigma_Y^2}

  • We will see that F has F-distribution (next week)

Part 5:
Two-sample t-test

The two-sample t-test

Assumptions: Suppose given samples from 2 normal populations

  • X_1, \ldots ,X_n iid with distribution N(\mu_X,\sigma_X^2)
  • Y_1, \ldots ,Y_m iid with distribution N(\mu_Y,\sigma_Y^2)

Further assumptions:

  • In general n \neq m, so that one-sample t-test cannot be applied
  • The two populations have same variance \sigma^2_X = \sigma^2_Y = \sigma^2

Note: Assuming same variance is simplification. Removing it leads to Welch t-test

The two-sample t-test

Goal: Compare means \mu_X and \mu_Y

Hypothesis set: We test for a difference in means H_0 \colon \mu_X = \mu_Y \qquad H_1 \colon \mu_X \neq \mu_Y

t-statistic: The general form is T = \frac{\text{Estimate}-\text{Hypothesised value}}{\text{e.s.e.}}

The two-sample t-statistic

  • Define the sample means \overline{X} = \frac{1}{n} \sum_{i=1}^n X_i \qquad \qquad \overline{Y} = \frac{1}{m} \sum_{i=1}^m Y_i

  • Notice that {\rm I\kern-.3em E}[ \overline{X} ] = \mu_X \qquad \qquad {\rm I\kern-.3em E}[ \overline{Y} ] = \mu_Y

  • Therefore we can estimate \mu_X - \mu_Y with the sample means, that is, \text{Estimate} = \overline{X} - \overline{Y}

The two-sample t-statistic

  • Since we are testing for difference in mean, we have \text{Hypothesised value} = \mu_X - \mu_Y

  • The Estimated Standard Error is the standard deviation of estimator \text{e.s.e.} = \text{Standard Deviation of } \overline{X} -\overline{Y}

The two-sample t-statistic

  • Therefore the two-sample t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}}

  • Under the Null Hypothesis that \mu_X = \mu_Y, the t-statistic becomes T = \frac{\overline{X} - \overline{Y} }{\text{e.s.e.}}

A note on the degrees of freedom (df)

  • The general rule is \text{df} = \text{Sample size} - \text{No. of estimated parameters}

  • Sample size in two-sample t-test:

    • n in the first sample
    • m in the second sample
    • Hence total number of observations is n + m
  • No. of estimated parameters is 2: Namely \mu_X and \mu_Y

  • Hence the degrees of freedom in the two-sample t-test are {\rm df} = n + m - 2 (more on this later)

The estimated standard error

  • Recall: We are assuming populations have same variance \sigma^2_X = \sigma^2_Y = \sigma^2

  • We need to compute the estimated standard error \text{e.s.e.} = \text{Standard Deviation of } \ \overline{X} -\overline{Y}

  • Variance of sample mean was computed in the Lemma in Slide 72 Lecture 2

  • Since \overline{X} \sim N(\mu_X,\sigma^2) and \overline{Y} \sim N(\mu_Y,\sigma^2), by the Lemma we get {\rm Var}[\overline{X}] = \frac{\sigma^2}{n} \,, \qquad \quad {\rm Var}[\overline{Y}] = \frac{\sigma^2}{m}

The estimated standard error

  • Since X_i and Y_j are independent we get {\rm Cov}(X_i,Y_j)=0

  • By bilinearity of covariance we infer {\rm Cov}( \overline{X} , \overline{Y} ) = \frac{1}{n \cdot m} \sum_{i=1}^n \sum_{j=1}^m {\rm Cov}(X_i,Y_j) = 0

  • We can then compute \begin{align*} {\rm Var}[ \overline{X} - \overline{Y} ] & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] - 2 {\rm Cov}( \overline{X} , \overline{Y} ) \\ & = {\rm Var}[ \overline{X} ] + {\rm Var}[ \overline{Y} ] \\ & = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right) \end{align*}

The estimated standard error

  • Taking the square root gives \text{S.D.}(\overline{X} - \overline{Y} )= \sigma \ \sqrt{\frac{1}{n}+\frac{1}{m}}

  • Therefore, the t-statistic is T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\text{e.s.e.}} = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
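
As a sanity check of the formula {\rm Var}[\overline{X} - \overline{Y}] = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right), here is a small Monte Carlo simulation; the sample sizes and \sigma below are illustrative choices, not part of the lecture:

# Simulate many values of Xbar - Ybar and compare variances
set.seed(1)
n <- 10; m <- 13; sigma <- 2
diffs <- replicate(10000, mean(rnorm(n, sd = sigma)) - mean(rnorm(m, sd = sigma)))

var(diffs)                # simulated variance of Xbar - Ybar
sigma^2 * (1/n + 1/m)     # theoretical variance, approx 0.708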

Estimating the variance

The t-statistic is currently T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

  • Variance \sigma^2 is unknown: we need to estimate it!

  • Define the sample variances

S_X^2 = \frac{ \sum_{i=1}^n X_i^2 - n \overline{X}^2 }{n-1} \qquad \qquad S_Y^2 = \frac{ \sum_{i=1}^m Y_i^2 - m \overline{Y}^2 }{m-1}

Estimating the variance

  • Recall that X_1, \ldots , X_n \sim N(\mu_X, \sigma^2) \qquad \qquad Y_1, \ldots , Y_m \sim N(\mu_Y, \sigma^2)

  • From Lecture 2, we know that S_X^2 and S_Y^2 are unbiased estimators of \sigma^2, i.e. {\rm I\kern-.3em E}[ S_X^2 ] = {\rm I\kern-.3em E}[ S_Y^2 ] = \sigma^2

  • Therefore, both S_X^2 and S_Y^2 can be used to estimate \sigma^2

Estimating the variance

  • We can improve the estimate of \sigma^2 by combining S_X^2 and S_Y^2

  • We will consider a (convex) linear combination S^2 := \lambda_X S_X^2 + \lambda_Y S_Y^2 \,, \qquad \lambda_X + \lambda_Y = 1

  • S^2 is still an unbiased estimator of \sigma^2, since \begin{align*} {\rm I\kern-.3em E}[S^2] & = {\rm I\kern-.3em E}[ \lambda_X S_X^2 + \lambda_Y S_Y^2 ] \\ & = \lambda_X {\rm I\kern-.3em E}[S_X^2] + \lambda_Y {\rm I\kern-.3em E}[S_Y^2] \\ & = (\lambda_X + \lambda_Y) \sigma^2 \\ & = \sigma^2 \end{align*}

Estimating the variance

We choose coefficients \lambda_X and \lambda_Y which reflect sample sizes \lambda_X := \frac{n - 1}{n + m - 2} \qquad \qquad \lambda_Y := \frac{m - 1}{n + m - 2}

Notes:

  • We have \lambda_X + \lambda_Y = 1

  • Denominators in \lambda_X and \lambda_Y are degrees of freedom {\rm df } = n + m - 2

  • This choice is made so that S^2 has chi-squared distribution (more on this later)

Pooled estimator of variance

Definition
The pooled estimator of \sigma^2 is defined as S_p^2 := \lambda_X S_X^2 + \lambda_Y S_Y^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{n + m - 2}

Note:

  • n=m implies \lambda_X = \lambda_Y
  • In this case S_X^2 and S_Y^2 have same weight in S_p^2

The two-sample t-statistic

  • The t-statistic has currently the form T = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{\sigma \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

  • We replace \sigma with the pooled estimator S_p

The two-sample t-statistic

Definition
The two sample t-statistic is defined as T := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}

Note: Under the Null Hypothesis that \mu_X = \mu_Y this becomes T = \frac{\overline{X} - \overline{Y}}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} = \frac{\overline{X} - \overline{Y}}{ \sqrt{ \dfrac{ (n-1) S_X^2 + (m-1) S_Y^2 }{n + m - 2} } \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}}
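
A minimal R sketch of these formulas; the helper function two_sample_t is a hypothetical name, and R's built-in t.test (used later with var.equal = TRUE) performs the same computation:

# Pooled two-sample t-statistic under H0: mu_X = mu_Y
# x and y are numeric vectors holding the two samples
two_sample_t <- function(x, y) {
  n <- length(x)
  m <- length(y)
  # Pooled variance: var() already uses the (n-1) divisor
  sp2 <- ((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2)
  # t-statistic
  (mean(x) - mean(y)) / (sqrt(sp2) * sqrt(1/n + 1/m))
}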

Distribution of two-sample t-statistic

Theorem
The two sample t-statistic has t_{n+m-2} distribution T := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \sim t_{n + m - 2}

Distribution of two-sample t-statistic

Proof

  • We have already seen that \overline{X} - \overline{Y} is normal with {\rm I\kern-.3em E}[\overline{X} - \overline{Y}] = \mu_X - \mu_Y \qquad \qquad {\rm Var}[\overline{X} - \overline{Y}] = \sigma^2 \left( \frac{1}{n} + \frac{1}{m} \right)

  • Therefore we can rescale \overline{X} - \overline{Y} to get U := \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ \sigma \sqrt{ \dfrac{1}{n} + \dfrac{1}{m}}} \sim N(0,1)

Distribution of two-sample t-statistic

Proof

  • We are assuming X_1, \ldots, X_n iid N(\mu_X,\sigma^2)

  • Therefore, as already shown, we have \frac{ (n-1) S_X^2 }{ \sigma^2 } \sim \chi_{n-1}^2

  • Similarly, since Y_1, \ldots, Y_m iid N(\mu_Y,\sigma^2), we get \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{m-1}^2

Distribution of two-sample t-statistic

Proof

  • Since X_i and Y_j are independent, we also have that \frac{ (n-1) S_X^2 }{ \sigma^2 } \quad \text{ and } \quad \frac{ (m-1) S_Y^2 }{ \sigma^2 } \quad \text{ are independent}

  • In particular we obtain \frac{ (n-1) S_X^2 }{ \sigma^2 } + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n-1}^2 + \chi_{m-1}^2 \sim \chi_{m + n- 2}^2

Distribution of two-sample t-statistic

Proof

  • Recall the definition of S_p^2 S_p^2 = \frac{(n-1) S_X^2 + (m-1) S_Y^2}{ n + m - 2 }

  • Therefore V := \frac{ (n+m-2) S_p^2 }{ \sigma^2 } = \frac{ (n - 1) S_X^2}{ \sigma^2} + \frac{ (m-1) S_Y^2 }{ \sigma^2 } \sim \chi_{n + m - 2}^2

Distribution of two-sample t-statistic

Proof

  • Rewrite T as \begin{align*} T & = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ S_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \\ & = \frac{\overline{X} - \overline{Y} - (\mu_X - \mu_Y)}{ \sigma \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} } } \Bigg/ \sqrt{ \frac{ (n + m - 2) S_p^2 \big/ \sigma^2}{ (n+ m - 2) } } \\ & = \frac{U}{\sqrt{V/(n+m-2)}} \end{align*}

Distribution of two-sample t-statistic

Proof

  • By construction \overline{X}- \overline{Y} is independent of S_X^2 and S_Y^2

  • Therefore \overline{X}- \overline{Y} is independent of S_p^2

  • We conclude that U and V are independent

  • In conclusion, we have shown that T = \frac{U}{\sqrt{V/(n+m-2)}} \,, \qquad U \sim N(0,1) \,, \qquad V \sim \chi_{n + m - 2}^2

  • By the Theorem in Slide 118 of Lecture 2, we conclude that T \sim t_{n+m-2}
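
The Theorem can also be checked by simulation: under the assumptions, simulated values of T should behave like a t_{n+m-2} sample. A sketch with illustrative parameter choices:

# Simulate T under H0 (mu_X = mu_Y) and compare quantiles with t_{n+m-2}
set.seed(2)
n <- 6; m <- 8; sigma <- 2
T_sim <- replicate(10000, {
  x <- rnorm(n, mean = 0, sd = sigma)
  y <- rnorm(m, mean = 0, sd = sigma)
  sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))
  (mean(x) - mean(y)) / (sp * sqrt(1/n + 1/m))
})

quantile(T_sim, c(0.025, 0.975))       # simulated 2.5% and 97.5% quantiles
qt(c(0.025, 0.975), df = n + m - 2)    # theoretical t quantiles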

The two-sample t-test

Suppose given two independent samples

  • Sample x_1, \ldots, x_n from N(\mu_X,\sigma^2) of size n
  • Sample y_1, \ldots, y_m from N(\mu_Y,\sigma^2) of size m

The two-sided hypothesis for difference in means is H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y

The one-sided alternative hypotheses are H_1 \colon \mu_X < \mu_Y \quad \text{ or } \quad H_1 \colon \mu_X > \mu_Y

Procedure: 3 Steps

  1. Calculation: Compute the two-sample t-statistic t = \frac{ \overline{x} - \overline{y}}{ s_p \ \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }} where sample means and pooled variance estimator are \overline{x} = \frac{1}{n} \sum_{i=1}^n x_i \qquad \overline{y} = \frac{1}{m} \sum_{i=1}^m y_i \qquad s_p^2 = \frac{ (n-1) s_X^2 + (m - 1) s_Y^2 }{ m + n - 2} s_X^2 = \frac{\sum_{i=1}^n x_i^2 - n \overline{x}^2}{n-1} \qquad s_Y^2 = \frac{\sum_{i=1}^m y_i^2 - m \overline{y}^2}{m-1}

Procedure: 3 Steps

  2. Statistical Tables or R: Find either
    • Critical value in Table 13.1 t^* = \begin{cases} t_{n + m - 2} (0.025) & \quad \text{ if two-sided test}\\ t_{n + m - 2} (0.05) & \quad \text{ if one-sided test} \end{cases}
    • p-value in R p = \begin{cases} 2 P( t_{n + m -2} > |t| ) & \quad \text{ if } \,\, H_1 \colon \mu_X \neq \mu_Y \\ P( t_{n + m -2} < t ) & \quad \text{ if } \,\, H_1 \colon \mu_X < \mu_Y \\ P( t_{n + m -2} > t ) & \quad \text{ if } \,\, H_1 \colon \mu_X > \mu_Y \end{cases}

Procedure: 3 Steps

  3. Interpretation:
    • For two-sided t-test we reject H_0 if |t| > t_{n + m - 2} (0.025) \quad \text{ or } \quad p < 0.05
    • For one-sided t-test we reject H_0 if t < -t_{n + m - 2} (0.05) \quad \text{ or } \quad p < 0.05 \quad \text{when } \, H_1 \colon \mu_X < \mu_Y \,, \quad \text{ and if } \quad t > t_{n + m - 2} (0.05) \quad \text{ or } \quad p < 0.05 \quad \text{when } \,\, H_1 \colon \mu_X > \mu_Y

The two-sample t-test in R

General commands

  1. Store the samples x_1,\ldots,x_n and y_1,\ldots,y_m in two R vectors
    • x_sample <- c(x1, ..., xn)
    • y_sample <- c(y1, ..., ym)
  2. Perform a two-sided two-sample t-test on x_sample and y_sample
    • t.test(x_sample, y_sample, var.equal = TRUE)
  3. Read output
    • Output is similar to one-sample t-test
    • The main quantity of interest is p-value

Comments on command t.test(x, y)

  1. R will perform a two-sample t-test on populations x and y

  2. R implicitly assumes the null hypothesis is H_0 \colon \mu_X - \mu_Y = 0

  3. mu = mu0 tells R to test null hypothesis: H_0 \colon \mu_X - \mu_Y = \mu_0

Comments on command t.test(x, y)

  1. One-sided two sample t-test can be performed by specifying
    • alternative = "greater" which tests H_1 \colon \mu_X - \mu_Y > \mu_0
    • alternative = "less" which tests H_1 \colon \mu_X - \mu_Y < \mu_0

Comments on command t.test(x, y)

  1. var.equal = TRUE tells R to assume that populations have same variance \sigma_X^2 = \sigma^2_Y

  2. In this case R computes the t-statistic with formula discussed earlier t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}

Comments on command t.test(x, y)

Warning: If var.equal = TRUE is not specified then

  • R assumes that populations have different variance \sigma_X^2 \neq \sigma^2_Y

  • In this case the t-statistic t = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }} is NOT t-distributed

  • R performs the Welch t-test instead of the classic t-test
    (more on this later)

Part 6:
Two-sample t-test
Example

Mathematicians x_1 x_2 x_3 x_4 x_5 x_6 x_7 x_8 x_9 x_{10}
Wages 36 40 46 54 57 58 59 60 62 63
Accountants y_1 y_2 y_3 y_4 y_5 y_6 y_7 y_8 y_9 y_{10} y_{11} y_{12} y_{13}
Wages 37 37 42 44 46 48 54 56 59 60 60 64 64


  • Samples: Wage data on 10 Mathematicians and 13 Accountants

    • Wages are independent and normally distributed
    • Populations have equal variance
  • Question: Is there evidence of differences in average pay?

  • Answer: Two-sample two-sided t-test for the hypothesis H_0 \colon \mu_X = \mu_Y \,,\qquad H_1 \colon \mu_X \neq \mu_Y

Calculations: First sample

  • Sample size: \ n = No. of Mathematicians = 10

  • Mean: \bar{x} = \frac{\sum_{i=1}^n x_i}{n} = \frac{36+40+46+ \ldots +62+63}{10}=\frac{535}{10}=53.5

  • Variance: \begin{align*} s^2_X & = \frac{\sum_{i=1}^n x_i^2 - n \bar{x}^2}{n -1 } \\ \sum_{i=1}^n x_i^2 & = 36^2+40^2+46^2+ \ldots +62^2+63^2 = 29435 \\ s^2_X & = \frac{29435-10(53.5)^2}{9} = 90.2778 \end{align*}

Calculations: Second sample

  • Sample size: \ m = No. of Accountants = 13

  • Mean: \bar{y} = \frac{37+37+42+ \dots +64+64}{13} = \frac{671}{13} = 51.6154

  • Variance: \begin{align*} s^2_Y & = \frac{\sum_{i=1}^m y_i^2 - m \bar{y}^2}{m - 1} \\ \sum_{i=1}^m y_i^2 & = 37^2+37^2+42^2+ \ldots +64^2+64^2 = 35783 \\ s^2_Y & = \frac{35783-13(51.6154)^2}{12} = 95.7547 \end{align*}

Calculations: Pooled Variance

  • Pooled variance: \begin{align*} s_p^2 & = \frac{(n-1) s_X^2 + (m-1) s_Y^2}{ n + m - 2} \\ & = \frac{(9) 90.2778 + (12) 95.7547 }{ 10 + 13 - 2} \\ & = 93.40746 \end{align*}

  • Pooled standard deviation: s_p = \sqrt{93.40746} = 9.6648

Calculations: t-statistic

  1. Calculation: Compute the two-sample t-statistic

\begin{align*} t & = \frac{\bar{x} - \bar{y} }{s_p \ \sqrt{\dfrac{1}{n}+\dfrac{1}{m}}} \\ & = \frac{53.5 - 51.6154}{9.6648 \times \sqrt{\dfrac{1}{10}+\dfrac{1}{13}}} \\ & = \frac{1.8846}{9.6648{\times}0.4206} \\ & = 0.464 \,\, (3\ \text{d.p.}) \end{align*}
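
These hand calculations can be checked with a few lines of R (the same wage data is entered again in the next section to run t.test):

# Check the hand computation of s_p and t
x <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)              # mathematicians
y <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)  # accountants
n <- length(x); m <- length(y)

sp <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))   # approx 9.6648
(mean(x) - mean(y)) / (sp * sqrt(1/n + 1/m))                      # approx 0.464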

Completing the t-test

  2. Referencing Tables:
    • Degrees of freedom are {\rm df} = n + m - 2 = 10 + 13 - 2 = 21
    • Find corresponding critical value in Table 13.1 t_{21}(0.025) = 2.08

Completing the t-test

  3. Interpretation:
    • We have that | t | = 0.464 < 2.08 = t_{21}(0.025)

    • Therefore the p-value satisfies p>0.05

    • There is no evidence (p>0.05) in favor of H_1

    • Hence we accept that \mu_X = \mu_Y

  4. Conclusion: Average pay levels seem to be the same for both professions

The two-sample t-test in R: Code

# Enter Wages data in 2 vectors using function c()

mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)


# Perform two-sample t-test with null hypothesis mu_X = mu_Y
# Specify that populations have same variance
# Store result of t.test in answer

answer <- t.test(mathematicians, accountants, var.equal = TRUE)


# Print answer
print(answer)


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. First line: R tells us that a Two-Sample t-test is performed
  2. Second line: Data for t-test is mathematicians and accountants


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Third line:
    • The t-statistic computed is t = 0.46359
    • Note: This coincides with the one computed by hand!
    • There are 21 degrees of freedom
    • The p-value is p = 0.6477


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Fourth line: The alternative hypothesis is that the difference in means is not zero
    • This translates to H_1 \colon \mu_X \neq \mu_Y
    • Warning: This is not saying to reject H_0 – R is just stating H_1


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Fifth line: R computes a 95 \% confidence interval for \mu_X - \mu_Y (\mu_X - \mu_Y) \in [-6.569496, 10.338727]
    • Interpretation: If you repeat the experiment (on new data) over and over, the interval [a,b] will contain \mu_X - \mu_Y about 95\% of the times


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Seventh line: R computes sample mean for the two populations
    • Sample mean for mathematicians is 53.5
    • Sample mean for accountants is 51.61538


    Two Sample t-test

data:  mathematicians and accountants
t = 0.46359, df = 21, p-value = 0.6477
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.569496 10.338727
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Conclusion: The p-value is p = 0.6477

  • Since p > 0.05 we do not reject H_0
  • Hence \mu_X and \mu_Y appear to be similar
  • Average pay levels seem to be the same for both professions

Comment on Assumptions

The previous two-sample t-test was conducted under the following assumptions:

  1. Wages data is normally distributed
  2. The two populations have equal variance

Using R, we can plot the data to see if these are reasonable (graphical exploration)

Warnings: Even if the assumptions hold

  • we cannot expect the samples to be exactly normal (bell-shaped)
    • rather, look for approximate normality
  • we cannot expect the sample variances to match
    • rather, look for a similar spread in the data

Estimating the sample distribution in R

Suppose given a data sample stored in a vector z

  • If the sample is large, we can check normality by plotting the histogram of z

  • Example: z sample of size 1000 from N(0,1) – Its histogram resembles N(0,1)

z <- rnorm(1000)                # Sample 1000 times from N(0,1) 
hist(z, probability = TRUE)     # Plot histogram, with area scaled to 1

Drawback: Small samples \implies hard to check normality from histogram

  • This is true even if the data is normal
  • Example: z below is sample of size 9 from N(0,1) – But histogram not normal
z <- c(-0.78, -1.67, -0.38,  0.92, -0.58,  
       0.61, -1.62, -0.06, 0.52)           # Random from N(0,1)
hist(z, probability = TRUE)                # Histogram not normal

Solution: Suppose given iid sample z from a distribution f

  • The command density(z) estimates the population distribution f
    (Estimate based on the sample z and smoothing - not an easy task)

  • Example: z as in previous slide. The plot of density(z) shows normal behavior

z <- c(-0.78, -1.67, -0.38,  0.92, -0.58,  
       0.61, -1.62, -0.06, 0.52)           # Random from N(0,1)
dz <- density(z)                           # Estimate the density of z
plot(dz)                                   # Plot the estimated density

The R object density(z) models a 1D function (the estimated distribution of z)

  • As such, it contains a grid of x values, with associated y values
    • x values are stored in vector density(z)$x
    • y values are stored in vector density(z)$y
  • These values are useful to set the axis range in a plot
dz <- density(z)

plot(dz,                        # Plot dz
     xlim = range(dz$x),        # Set x-axis range
     ylim = range(dz$y))        # Set y-axis range

Axes range set as the min and max values of components of dz

Checking the Assumptions on our Example

# Compute the estimated distributions
d.math <- density(mathematicians)
d.acc <- density(accountants)

# Plot the estimated distributions

plot(d.math,                                    # Plot d.math
     xlim = range(c(d.math$x, d.acc$x)),        # Set x-axis range
     ylim = range(c(d.math$y, d.acc$y)),        # Set y-axis range
     main = "Estimated Distributions of Wages") # Add title to plot
lines(d.acc,                                    # Layer plot of d.acc
      lty = 2)                                  # Use different line style
         
legend("topleft",                               # Add legend at top-left
       legend = c("Mathematicians",             # Labels for legend
                  "Accountants"), 
       lty = c(1, 2))                           # Assign curves to legend

Axes range set as the min and max values of components of d.math and d.acc

  1. Wages data looks approximately normally distributed (roughly bell-shaped)

  2. The two populations have similar variance (spreads look similar)

Conclusion: Two-sample t-test with equal variance is appropriate \implies accept H_0

Part 7:
The Welch t-test

Samples with different variance

  • We just examined the two-sample t-tests

  • This assumes independent normal populations with equal variance

\sigma_X^2 = \sigma_Y^2

  • Question: What happens if variances are different?

  • Answer: Use the Welch Two-sample t-test

    • This is a generalization of the two-sample t-test to the case \sigma_X^2 \neq \sigma_Y^2
    • In R it is performed with t.test(x, y)
    • Note that we are just omitting the option var.equal = TRUE
    • Equivalently, you may specify var.equal = FALSE

The Welch two-sample t-test

  • Welch t-test consists in computing the Welch statistic w = \frac{\overline{x} - \overline{y}}{ \sqrt{ \dfrac{s_X^2}{n} + \dfrac{s_Y^2}{m} } }

  • If sample sizes m,n > 5, then w is approximately t-distributed

    • Degrees of freedom are not integer, and depend on S_X, S_Y, n, m
  • If variances are similar, the Welch statistic is comparable to the t-statistic

w \approx t : = \frac{ \overline{x} - \overline{y} }{s_p \sqrt{ \dfrac{1}{n} + \dfrac{1}{m} }}
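
The exact degrees-of-freedom formula is not given in these slides; the sketch below uses the standard Welch-Satterthwaite approximation, applied to the wage data of Part 6, and the resulting values agree with the R output shown later:

# Welch statistic and approximate degrees of freedom (Welch-Satterthwaite)
x <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)              # mathematicians
y <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)  # accountants
n <- length(x); m <- length(y)

se2 <- var(x) / n + var(y) / m
(mean(x) - mean(y)) / sqrt(se2)                               # w, approx 0.46546
se2^2 / ((var(x)/n)^2 / (n - 1) + (var(y)/m)^2 / (m - 1))     # df, approx 19.795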

Welch t-test Vs two-sample t-test

If variances are similar:

  • Welch statistic and t-statistic are similar
  • p-value from Welch t-test is similar to p-value from two-sample t-test
  • Since p-values are similar, the two tests usually yield the same decision
  • The tests can be used interchangeably

If variances are very different:

  • Welch statistic and t-statistic are different
  • p-values from the two tests can differ a lot
  • The two tests might give different decision
  • Wrong to apply two-sample t-test, as variances are different

The Welch two-sample t-test in R

# Enter Wages data

mathematicians <- c(36, 40, 46, 54, 57, 58, 59, 60, 62, 63)
accountants <- c(37, 37, 42, 44, 46, 48, 54, 56, 59, 60, 60, 64, 64)


# Perform Welch two-sample t-test with null hypothesis mu_X = mu_Y
# Store result of t.test in answer

answer <- t.test(mathematicians, accountants)


# Print answer
print(answer)
  • Note:
    • This is almost the same code as in Slide 86
    • Only difference: we are omitting the option var.equal = TRUE in t.test


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. First line: R tells us that a Welch Two-Sample t-test is performed
    • The rest of the output is similar to classic t-test
    • Main difference is that p-value and t-statistic differ from classic t-test


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Comments on output:

  1. Third line:
    • The Welch t-statistic is w = 0.46546 (standard t-test gave t = 0.46359)
    • Degrees of freedom are fractional \rm{df} = 19.795 (standard t-test \rm{df} = 21)
    • The Welch t-statistic is approximately t-distributed with w \approx t_{19.795}
  2. Fifth line: The confidence interval for \mu_X - \mu_Y is also different


    Welch Two Sample t-test

data:  mathematicians and accountants
t = 0.46546, df = 19.795, p-value = 0.6467
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -6.566879 10.336109
sample estimates:
mean of x mean of y 
 53.50000  51.61538 

Conclusion: The p-values obtained with the 2 tests are almost the same

  • Welch t-test: p-value = 0.6467 \qquad Classic t-test: p-value = 0.6477

  • Both tests: p > 0.05, and therefore we do not reject H_0

  • Note: This was expected

    • The spread of the two populations is similar \implies \, \sigma_X^2 \approx \sigma_Y^2
    • Hence, Welch t-statistic approximates t-statistic \implies p-values are similar

Exercise

  • We compare the Effect of Two Treatments on Blood Pressure Change

  • Both treatments are given to a group of patients

  • Measurements of changes in blood pressure are taken after 4 weeks of treatment

  • Note that changes represent both positive and negative shifts in blood pressure


Treat. A -1.9 -2.5 -2.1 -2.4 -2.6 -1.9
Treat. B -1.1 -0.9 -1.4 0.2 0.3 0.6 -5 -2.4 -1.5 2.3 -2.8 2.1

# Enter changes in Blood pressure data

trA <- c(-1.9, -2.5, -2.1, -2.4, -2.6, -1.9)
trB <- c(-1.1, -0.9, -1.4, 0.2, 0.3, 0.6, -5,
         -2.4, -1.5, 2.3, -2.8, 2.1)

cat("Mean of Treatment A:", mean(trA), "Mean of Treatment B:", mean(trB))
Mean of Treatment A: -2.233333 Mean of Treatment B: -0.8


  • Sample means show both Treatments are effective in decreasing blood pressure

  • However Treatment A seems slightly better

Question: Perform a t-test to see if Treatment A is better H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

Solution: Estimated density of Treatment A

plot(density(trA))    
  • Estimated density looks bell-shaped \implies First population is normal
  • Sample seems concentrated between -3 and -1.5

Estimated density of Treatment B

plot(density(trB))    
  • Estimated density looks bell-shaped \implies Second population is normal
  • Sample seems concentrated between -7 and 4

Findings

  • Both populations are normal \implies t-test is appropriate
  • First sample seems concentrated between -3 and -1.5
  • Second sample seems concentrated between -7 and 4
  • Treatment B has larger spread
  • Therefore we suspect that populations have different variance

\sigma_A^2 \neq \sigma_B^2

Conclusion:

  • The Welch t-test is appropriate
  • Two sample t-test would not be appropriate (as it assumes equal variance)

Apply the Welch t-test

We are testing the one-sided hypothesis

H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

# Perform Welch t-test and retrieve p-value

ans <- t.test(trA, trB, alternative = "less", var.equal = F) 
ans$p.value
[1] 0.01866013


  • The p-value is p < 0.05

  • We reject H_0 \implies Treatment A is more effective

Two-sample t-test gives different decision

We are testing the one-sided hypothesis

H_0 \colon \mu_A = \mu_B \, , \qquad H_1 \colon \mu_A < \mu_B

# Perform two-sample t-test and retrieve p-value

ans <- t.test(trA, trB, alternative = "less", var.equal = T) 
ans$p.value
[1] 0.05836482


  • The p-value is p > 0.05

  • H_0 cannot be rejected \implies There is no evidence that Treatment A is better

  • Wrong conclusion, because two-sample t-test does not apply

Disclaimer

  • The previous data was synthetic, and the background story was made up!

  • Nonetheless, the example is still valid

  • To construct the data, I sampled as follows

    • Treatment A: Sample of size 6 from N(-2,1)
    • Treatment B: Sample of size 12 from N(-1.5,9)
  • We see that \mu_A < \mu_B \,, \qquad \sigma_A^2 \neq \sigma_B^2

  • This tells us that:

    • We can expect that some samples will support that \mu_A < \mu_B
    • Two-sample t-test is inappropriate because \sigma_A^2 \neq \sigma_B^2

Generating the Data

Below is the code I used to generate the data
# Set seed for random generation
# This way you always get the same random numbers when
# you run this code
set.seed(21) 

repeat {
  # Generate random samples
  x <- rnorm(6, mean = -2, sd = 1)
  y <- rnorm(12, mean = -1.5, sd = 3)

  # Round x and y to 1 decimal point
  x <- round(x, 1)
  y <- round(y, 1)
  
  # Perform one-sided t-tests for alternative hypothesis mu_x < mu_y 
  ans_welch <- t.test(x, y, alternative = "less", var.equal = F)
  ans_t_test <- t.test(x, y, alternative = "less", var.equal = T)
   
  # Check that Welch test succeeds and two-sample test fails
  if (ans_welch$p.value < 0.05 && ans_t_test$p.value > 0.05) {
    cat("Data successfully generated!!!\n\n")
    cat("Synthetic Data TrA:", x, "\n")
    cat("Synthetic Data TrB:", y, "\n\n")
    cat("Welch t-test p-value:", ans_welch$p.value, "\n")
    cat("Two-sample t-test p-value:", ans_t_test$p.value)
    break
    }
}
Data successfully generated!!!

Synthetic Data TrA: -1.9 -2.5 -2.1 -2.4 -2.6 -1.9 
Synthetic Data TrB: -1.1 -0.9 -1.4 0.2 0.3 0.6 -5 -2.4 -1.5 2.3 -2.8 2.1 

Welch t-test p-value: 0.01866013 
Two-sample t-test p-value: 0.05836482

Method:

  • Sample the data as in previous slide (round to 1 d.p. for cleaner looking data)

  • Repeat until Welch test succeeds, and two-sample t-test fails \text{p-value of Welch test } \, < 0.05 < \, \text{p-value of Two-sample t-test}

Part 8:
The t-test for
paired samples

Paired samples

  • Assume we have two samples of the same size

  • Sometimes the two samples depend on each other in some way

  1. Twin studies:
    • Twins are used as pairs (to control genetic or environmental factors)
    • Example: test effectiveness of a medical treatment against placebo
    • The two samples are clearly dependent
      (think of twins as the same person)
    • As such, the usual two-sample t-test is not applicable
      (because it assumes independence)

Paired samples

  • Assume we have two samples of the same size

  • Sometimes the two samples depend on each other in some way

  1. Pre-test and Post-test
    • Measure the outcome of a certain action
    • Example: does this module work for teaching R?
    • We can assess the effectiveness of something with a pre-test and a post-test
    • The two samples are clearly dependent
      (each individual takes a test twice)
    • As such, the usual two-sample t-test is not applicable
      (because it assumes independence)

The paired t-test

Suppose given two samples

  • Sample x_1, \ldots, x_n from N(\mu_X,\sigma^2_X)

  • Sample y_1, \ldots, y_n from N(\mu_Y,\sigma^2_Y)

The hypotheses for difference in means are H_0 \colon \mu_X = \mu_Y \,, \quad \qquad H_1 \colon \mu_X \neq \mu_Y \,, \quad \mu_X < \mu_Y \,, \quad \text{ or } \quad \mu_X > \mu_Y

The paired t-test

Assumption: The data is paired, meaning that the differences

d_i = x_i - y_i \,\, \text{ are iid} \,\, N(\mu,\sigma^2) \quad \text{where} \quad \mu := \mu_X - \mu_Y

The hypotheses for the difference in means are equivalent to

H_0 \colon \mu = 0 \,, \quad \qquad H_1 \colon \mu \neq 0 \,, \quad \mu < 0 \,, \quad \text{ or } \quad \mu > 0

These can be tested with a one-sample t-test


R commands: The paired t-test can be called with the equivalent commands

  • t.test(x, y, paired = TRUE), which tests H_0 \colon \mu_X = \mu_Y

  • t.test(x - y), which tests H_0 \colon \mu = 0

Example 1: The 2008 crisis (again!)

Month J F M A M J J A S O N D
CCI 2007 86 86 88 90 99 97 97 96 99 97 90 90
CCI 2009 24 22 21 21 19 18 17 18 21 23 22 21
  • Data: Monthly Consumer Confidence Index (CCI) in 2007 and 2009

  • Question: Did the crash of 2008 have lasting impact upon CCI?

  • Observations:

    • Data shows a massive drop in CCI between 2007 and 2009
    • Data is clearly paired (Pre-test and Post-test situation)
  • Method: Use paired t-test to investigate drop in mean CCI

H_0 \colon \mu_{2007} = \mu_{2009} \,, \quad H_1 \colon \mu_{2007} > \mu_{2009}

Perform paired t-test


# Enter CCI data
score_2007 <- c(86, 86, 88, 90, 99, 97, 97, 96, 99, 97, 90, 90)
score_2009 <- c(24, 22, 21, 21, 19, 18, 17, 18, 21, 23, 22, 21)

# Perform paired t-test and print p-value
ans <- t.test(score_2007, score_2009, paired = T, alternative = "greater")
ans$p.value
[1] 2.430343e-13


The p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Drop in CCI
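
As a check of the equivalence stated earlier, t.test(x, y, paired = TRUE) and t.test(x - y) give the same result on this data:

# Paired t-test and one-sample t-test on the differences coincide
t.test(score_2007, score_2009, paired = TRUE, alternative = "greater")$p.value
t.test(score_2007 - score_2009, alternative = "greater")$p.value
# Both return 2.430343e-13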

Warning

It would be wrong to use a two-sample t-test

  • This is because the samples are paired, and hence dependent

  • This is further supported by computing the correlation

  • High correlation implies dependence


cor(score_2007, score_2009)                # Correlation is high
[1] -0.6076749

Example 2: Water quality samples

  • Researchers wish to measure water quality

  • There are two possible tests, one less expensive than the other

  • 10 water samples were taken, and each was measured both ways

method1 45.9 57.6 54.9 38.7 35.7 39.2 45.9 43.2 45.4 54.8
method2 48.2 64.2 56.8 47.2 43.7 45.7 53.0 52.0 45.1 57.5

Question: Do the tests give the same results?

  • Observation: The data is paired (twin study situation)

  • Method: Use paired t-test to investigate equality of results

H_0 \colon \mu_1 = \mu_2 \,, \quad H_1 \colon \mu_1 \neq \mu_2

Perform paired t-test


# Enter tests data
method1 <- c(45.9, 57.6, 54.9, 38.7, 35.7, 39.2, 45.9, 43.2, 45.4, 54.8)
method2 <- c(48.2, 64.2, 56.8, 47.2, 43.7, 45.7, 53.0, 52.0, 45.1, 57.5)

# Perform paired t-test and print p-value
ans <- t.test(method1, method2, paired = T)
ans$p.value
[1] 0.0006648526


p-value is significant: \,\, p < 0.05 \, \implies \, reject H_0 \, \implies \, Methods perform differently

Warning

It would be wrong to use a two-sample t-test

  • This is because the samples are paired, and hence dependent

  • This is also supported by the high sample correlation

cor(method1, method2)                # Correlation is very high
[1] 0.9015147

Warning

In this Example, performing a two-sample t-test would lead to wrong decision


# Perform Welch t-test and print p-value
ans <- t.test(method1, method2, paired = F)       # paired = F is default
ans$p.val
[1] 0.1165538


Wrong conclusion: \,\, p > 0.05 \, \implies \, cannot reject H_0 \, \implies \, Methods perform similarly


Bottom line: The data is paired, therefore a paired t-test must be used