Central Limit Theorem

Learning Objectives

After this unit, students should be able to

describe the sampling distribution.
describe the central limit theorem.
explain why the central limit theorem is a cornerstone of statistical inference.

Revisiting Population vs Sample

Consider the following dataset of five students with their test scores out of \(5\): \(\{(A, 3), (B, 4), (C, 2), (D, 3), (E, 1)\}\). We will treat this dataset as the population, with mean \(\mu\) equal to \(2.6\).

Let us assume that we do not have the resources to process the data for all five students, but we can process three. Thus we can only work with samples of size three. Our aim in this case is to estimate the population mean using the sample mean. Since this is simply a toy example, we can exhaustively study each of the \(\binom{5}{3} = 10\) samples of size three from the population. The following table lists those samples with their sample means.

Samples	Sample Mean
\(\{ACE, DCE\}\)	\(2.00\)
\(\{ADE, BCE\}\)	\(2.33\)
\(\{ABE, DBE, ACD\}\)	\(2.67\)
\(\{ABC, DBC\}\)	\(3.00\)
\(\{ABD\}\)	\(3.33\)

We observe that the sample mean \(\bar{X}\) can be treated as a random variable. Its randomness is introduced by the sampling procedure: each of the \(10\) samples is equally likely to be drawn. The probability distribution of \(\bar{X}\) is shown below.

\(x_i\)	\(Pr[\bar{X} = x_i]\)
\(2.00\)	\(2/10 = 0.2\)
\(2.33\)	\(2/10 = 0.2\)
\(2.67\)	\(3/10 = 0.3\)
\(3.00\)	\(2/10 = 0.2\)
\(3.33\)	\(1/10 = 0.1\)

Let's compute the expected value.

\[ \begin{aligned} E[\bar{X}] &= \sum_{x_i} x_i Pr[\bar{X} = x_i] \\ &= (2.00 \times 0.2) + (2.33 \times 0.2) + (2.67 \times 0.3) + (3.00 \times 0.2) + (3.33 \times 0.1) \\ &= 2.6 \end{aligned} \]

Isn't this the population mean we wanted to discover?
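The enumeration above is small enough to reproduce by machine. The following is a minimal sketch, assuming the five scores given earlier and equally likely samples drawn without replacement; the variable names are illustrative and not part of the original notes. It uses exact fractions so that the rounding in the table (for example \(2.33\) for \(7/3\)) does not affect the result.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

# Population from the text: student -> test score out of 5.
scores = {"A": 3, "B": 4, "C": 2, "D": 3, "E": 1}
population_mean = Fraction(sum(scores.values()), len(scores))

# Every sample of size three, drawn without replacement (order ignored).
samples = list(combinations(scores, 3))

# Exact sample mean of each sample, kept as a fraction to avoid rounding.
sample_means = [Fraction(sum(scores[s] for s in sample), 3) for sample in samples]

# Sampling distribution of the sample mean: every sample is equally likely.
counts = Counter(sample_means)
distribution = {mean: Fraction(count, len(samples)) for mean, count in counts.items()}

# Expected value of the sample mean under that distribution.
expected_value = sum(mean * prob for mean, prob in distribution.items())

for mean in sorted(distribution):
    print(f"{float(mean):.2f}  {distribution[mean]}")
print("E[sample mean]  =", float(expected_value))
print("population mean =", float(population_mean))
```

Because the arithmetic is exact, the expected value of the sample mean and the population mean can be compared with `==` rather than with a tolerance.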

Sampling Distribution

Sampling introduces randomness (or indeterminism), which lets us treat any statistic computed from a sample as a random variable. The sampling distribution of a statistic is its probability distribution under random sampling. It provides us a gateway to the idea of statistical inference mentioned in Unit 3.

The sampling distribution depends on

the distribution of the underlying population
the statistic
the sampling procedure
the sample size

Well-known statistics and their sampling distributions are listed in the following table. There are a few more distributions that we will use. We will introduce them when required.

Population	Statistic	Sampling distribution
\(\mathcal{N}(\mu, \sigma^2)\)	Sample mean \(\bar{X}\)	\(\bar{X} \sim \mathcal{N}(\mu, \frac{\sigma^2}{n})\)
\(Bernoulli(p)\)	Proportion of successes \(\bar{X}\)	\(n\bar{X} \sim Binomial(n, p)\)

Both rows of the table show that the mean of the sampling distribution equals the corresponding population parameter: \(E[\bar{X}] = \mu\) for the normal population and \(E[\bar{X}] = p\) for the Bernoulli population. This is what allows us to estimate a population characteristic from a much smaller sample, which is a promising development. However, there is a significant obstacle: the sampling distribution depends on several factors, and an analyst rarely knows or controls all of them, in particular the distribution of the underlying population. Can we reliably determine the sampling distribution every time?
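The two rows of the table can also be checked empirically. The sketch below is only an illustration under assumed values: the parameters `mu`, `sigma`, `p`, the sample size `n`, the number of repetitions `trials`, and the seed are all made up for the example. It draws many samples, computes the statistic for each, and compares the result with what the table predicts.

```python
import math
import random
import statistics

rng = random.Random(0)   # fixed seed, an arbitrary choice for repeatability
n, trials = 10, 20_000   # illustrative sample size and number of repeated samples

# Row 1: normal population N(mu, sigma^2); mu and sigma are made-up values.
mu, sigma = 5.0, 2.0
normal_means = [
    statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
    for _ in range(trials)
]
print("mean of sample means    :", statistics.fmean(normal_means), "vs", mu)
print("variance of sample means:", statistics.pvariance(normal_means), "vs", sigma**2 / n)

# Row 2: Bernoulli(p) population; the success count n * X-bar should be Binomial(n, p).
p = 0.3
success_counts = [sum(rng.random() < p for _ in range(n)) for _ in range(trials)]
for k in range(n + 1):
    empirical = success_counts.count(k) / trials
    exact = math.comb(n, k) * p**k * (1 - p) ** (n - k)
    print(k, round(empirical, 4), round(exact, 4))
```

The empirical figures will not match the theoretical ones exactly; they should get closer as `trials` grows.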

Standard Error

The standard deviation of the sampling distribution of a statistic is called the standard error. For the sample mean of \(n\) independent observations from a population with standard deviation \(\sigma\), it is

\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]

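Equivalently, the variance of the sample mean is \(\sigma^2/n\): it is inversely proportional to the sample size, while the standard error shrinks only as \(1/\sqrt{n}\). To halve the standard error, we need four times as many observations.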
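How the standard error scales with the sample size can be seen in a short simulation. This is a sketch under assumptions: the population is taken to be normal with a made-up standard deviation `sigma`, and the helper `empirical_standard_error`, the seed, and the sample sizes are hypothetical choices for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)   # fixed seed, an arbitrary choice for repeatability
sigma = 3.0              # population standard deviation, a made-up value
trials = 4_000           # number of repeated samples per sample size

def empirical_standard_error(n):
    """Standard deviation of `trials` sample means, each from n draws of N(0, sigma^2)."""
    means = [
        statistics.fmean(rng.gauss(0.0, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.pstdev(means)

# Each step multiplies n by four, so the standard error should roughly halve.
for n in (25, 100, 400):
    simulated = empirical_standard_error(n)
    theoretical = sigma / math.sqrt(n)
    print(n, round(simulated, 3), round(theoretical, 3))
```

The second printed column is the simulated standard error and the third is \(\sigma/\sqrt{n}\); the two should agree up to simulation noise.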

Central Limit Theorem

The central limit theorem is considered a cornerstone of statistics. It is a key bridge between statistics and probability theory. The central limit theorem provides the approximate sampling distribution of the sample mean without assuming any particular shape for the distribution of the underlying population; only a finite mean and a finite variance are required. The statement of the theorem is as follows:

For i.i.d. samples of sufficiently large size \(n\) drawn from a population with finite mean \(\mu\) and finite standard deviation \(\sigma\), the sampling distribution of the sample mean is approximately normal with mean \(\mu\) and standard deviation \(\sigma/\sqrt{n}\). Equivalently, as \(n \to \infty\),

\[ \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \xrightarrow{d} \mathcal{N}(0, 1) \]

How large is sufficiently large? Let us defer this discussion until Unit 10. You can explore the impact of the sample size on the sampling distribution with this applet.
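The effect of the sample size can also be explored with a few lines of code. The sketch below assumes an exponential population with rate \(1\) (mean \(1\), standard deviation \(1\)), chosen only because it is strongly skewed; the function `tail_fractions`, the seed, and the sample sizes are illustrative. For a standard normal variable, about \(5\%\) of the probability lies below \(-1.645\) and about \(5\%\) above \(+1.645\), so the two tail fractions of the standardized sample mean should both move toward \(0.05\) as \(n\) grows.

```python
import math
import random
import statistics

rng = random.Random(2)   # fixed seed, an arbitrary choice for repeatability
trials = 10_000          # number of repeated samples per sample size

# Exponential population with rate 1: right-skewed, mean 1, standard deviation 1.
mu, sigma = 1.0, 1.0

def tail_fractions(n):
    """Fractions of standardized sample means below -1.645 and above +1.645."""
    lower = upper = 0
    for _ in range(trials):
        xbar = statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        z = (xbar - mu) / (sigma / math.sqrt(n))
        lower += z < -1.645
        upper += z > 1.645
    return lower / trials, upper / trials

for n in (2, 10, 50):
    print(n, tail_fractions(n))
```

For very small \(n\) the two tails are far from balanced, which reflects the skew of the population; the imbalance fades as \(n\) increases.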

A common misconception!

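Remember that the central limit theorem provides the sampling distribution of the sample mean. It does not apply to an arbitrary statistic; the sample maximum, for example, does not in general have an approximately normal sampling distribution.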
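As one illustration of this caveat, the sketch below compares the sample mean with the sample maximum for draws from a Uniform\((0, 1)\) population. The choice of population, the statistic used for comparison, the helper `skewness`, and all numeric settings are assumptions made for the example. Skewness is close to \(0\) for a normal distribution, so a value far from \(0\) signals that a normal approximation is poor.

```python
import random
import statistics

rng = random.Random(3)   # fixed seed, an arbitrary choice for repeatability
n, trials = 50, 10_000   # illustrative sample size and number of repeated samples

def skewness(values):
    """Sample skewness: the average cubed z-score (about 0 for a normal distribution)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

means, maxima = [], []
for _ in range(trials):
    sample = [rng.random() for _ in range(n)]   # n draws from Uniform(0, 1)
    means.append(statistics.fmean(sample))
    maxima.append(max(sample))

print("skewness of sample means :", round(skewness(means), 3))
print("skewness of sample maxima:", round(skewness(maxima), 3))
```

The sample means should come out nearly symmetric, while the sample maxima pile up just below \(1\) with a long left tail, and increasing `n` does not remove that asymmetry.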