# Weak Law of Large Numbers

Suppose your classroom consists of \(300\) students and you want to know the average height of those \(300\) students. Now say you measure the heights of \(50\) students, and suppose that the average height of those \(50\) students is somewhat near the average height of all \(300\) students. Here the \(300\) students are the **population** and those \(50\) students are the **sample**, so the **population size** is \(300\) and the **sample size** \((n)\) is \(50\).

So we have \(50\) observations \(X_1,X_2,\cdots,X_{50}\), and these observations are **random**: they are the result of some unknown **random process**, so we call these random observations **random variables**. These random variables are the result of a common random process, therefore they are **identically distributed**, and all of them are independent of each other, so we call them **I.I.D.** (Independent and Identically Distributed) random variables.

So the average height of those \(50\) students (the **sample mean**) is \(\displaystyle\overline{X}_{50}=\frac{X_1+X_2+\cdots+X_{50}}{50}\). What we actually strive for is the average over the total population (the average height of all \(300\) students): the **true mean**. Let's say that the true mean is \(\mu\left(=\mathbb{E}[X_i]\right)\). Note:

- The **true mean** \((\mu)\) is over the entire population.
- The **true mean** \((\mu)\) is **not** random; it's a number.
- The **sample mean** \((\overline{X}_n)\) is over the values observed during an experiment.
- The **sample mean** \((\overline{X}_n)\) is a **random variable**, because \(X_1,\cdots,X_n\) are random.

So according to the Weak Law of Large Numbers, as we increase our **sample size** \((n)\), our **sample mean** goes, **in probability**, toward the **true mean** (this is what we referred to as **Truth** in our central dogma).

\[\overline{X}_n:=\frac{1}{n}\sum _{i=1}^ n X_ i \xrightarrow [n\to \infty ] {\mathbb{P}} \mu\]

- \(:=\quad\) means "by definition"
- \(\mathbb{P}\quad\) means "in probability"

### Explanation

Let \(X_1,X_2,\dots,X_n\) be **I.I.D.** random variables with finite mean \(\mu\) and finite variance \(\sigma^2\). The **sample mean** is

\(\displaystyle\overline{X}_n=\frac{X_1+\cdots+X_n}{n}\)

## \(\displaystyle\mathbb{E}[\overline{X}_n]=\mu\)

By linearity of expectation,

\[\mathbb{E}[\overline{X}_n]=\mathbb{E}\left[\frac{X_1+\cdots+X_n}{n}\right]=\frac{\mathbb{E}[X_1]+\cdots+\mathbb{E}[X_n]}{n}=\frac{n\mu}{n}=\mu\]

## \(\displaystyle\text{Var}[\overline{X}_n]=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\)

Since the \(X_i\) are independent, their variances add:

\[\text{Var}[\overline{X}_n]=\text{Var}\left[\frac{X_1+\cdots+X_n}{n}\right]=\frac{\text{Var}[X_1]+\cdots+\text{Var}[X_n]}{n^2}=\frac{n\sigma^2}{n^2}=\frac{\sigma^2}{n}\]
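We can check \(\text{Var}[\overline{X}_n]=\sigma^2/n\) numerically. A minimal Python sketch; the Uniform(0, 1) population (for which \(\sigma^2 = 1/12\)) and the sizes are assumptions made purely for illustration:

```python
import random

random.seed(0)

# Hypothetical population: Uniform(0, 1) draws, so sigma^2 = 1/12.
sigma2 = 1.0 / 12.0
n = 100          # sample size
trials = 10000   # number of independent sample means

def sample_mean(n):
    """Mean of n i.i.d. Uniform(0, 1) draws."""
    return sum(random.random() for _ in range(n)) / n

# Simulate many sample means and compute their empirical variance.
means = [sample_mean(n) for _ in range(trials)]
avg = sum(means) / trials
emp_var = sum((m - avg) ** 2 for m in means) / trials

print(round(emp_var, 5))     # empirical Var of the sample mean
print(round(sigma2 / n, 5))  # theoretical sigma^2 / n
```

The two printed numbers agree closely: averaging \(n\) draws shrinks the variance by a factor of \(n\).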

## \(\mathbb{P}\left(|\overline{X}_n - \mu| \geq \epsilon\right) \xrightarrow [n\to \infty ] {} 0;\quad\forall\epsilon\gt 0\)

By Chebyshev's inequality,

\[\mathbb{P}\left(|\overline{X}_n - \mu| \geq \epsilon\right) \leq \frac{\text{Var}(\overline{X}_n)}{\epsilon^2}=\frac{\sigma^2}{n\epsilon^2}\xrightarrow [n\to \infty ] {} 0;\quad\forall\epsilon\gt 0\]

So for any \(\epsilon\gt0\),

\[\mathbb{P}\left(|\overline{X}_n - \mu|\geq \epsilon\right) \xrightarrow [n\to \infty ]{} 0\]

This is **convergence in probability**. Pick a very small number, like \(0.00001\). Convergence in probability says that if \(n\) is large enough, then it's highly unlikely for \(\overline{X}_n\) to be more than \(0.00001\) units away from \(\mu\).

In other words, if \(n\) is large, then it's extremely likely that \(\overline{X}_n\) is extremely close to \(\mu\).
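To see Chebyshev's bound and the convergence in action, here is a small Python simulation; the fair-coin model (\(\mu=0.5\), \(\sigma^2=0.25\)) and the value \(\epsilon=0.05\) are illustrative assumptions:

```python
import random

random.seed(1)

# Fair coin: X_i in {0, 1}, so mu = 0.5 and sigma^2 = 0.25.
mu, sigma2 = 0.5, 0.25
eps = 0.05
trials = 2000

def deviation_prob(n):
    """Empirical P(|sample mean - mu| >= eps) over many experiments."""
    hits = 0
    for _ in range(trials):
        xbar = sum(random.random() < 0.5 for _ in range(n)) / n
        if abs(xbar - mu) >= eps:
            hits += 1
    return hits / trials

results = []
for n in (100, 400, 1600):
    bound = min(sigma2 / (n * eps * eps), 1.0)  # Chebyshev upper bound
    p = deviation_prob(n)
    results.append((n, p, bound))
    print(n, p, bound)
```

As \(n\) grows, the empirical probability of a deviation of at least \(\epsilon\) falls toward \(0\), and it always stays below the \(\sigma^2/(n\epsilon^2)\) bound.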

#### Interpretation

For any fixed \(\epsilon\gt0\), the probability that the **sample mean** \((\overline{X}_n)\) falls away from the **true mean** \((\mu)\) by more than \(\epsilon\) goes to \(0\) as our sample size \((n)\to\infty\).

In our example above we have a population of \(300\) students. Among those \(300\) students we randomly select \(50\) students and measure their heights \(X_1,\cdots,X_{50}\). If the **true mean** of all \(300\) students is \(\mu(=\mathbb{E}[X_i])\), then we can say that:

- The height of the \(i^{th}\) student is \(X_i = \mu + W_i\), where \(W_i\) is the measurement noise for the \(i^{th}\) student, and the Weak Law of Large Numbers tells us that as \(n\to\infty\), **in probability** the average sample noise \(\to 0\).
- The **sample mean** \((\overline{X}_n)\) is **unlikely** to be far from the **true mean** \((\mu)\).

So according to the Weak Law of Large Numbers, if we increase the number of students in our sample from \(n=50\) to, say, \(n=100\), then we **should** get a better estimate of the **true mean**.
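A quick Python sketch of the classroom example; the height distribution (normal with mean 165 cm) is made up for illustration, and only the population/sample sizes come from the example:

```python
import random

random.seed(2)

# Hypothetical classroom: 300 student heights in cm (the numbers are made up).
population = [random.gauss(165, 8) for _ in range(300)]
true_mean = sum(population) / len(population)

def avg_abs_error(n, trials=2000):
    """Average |sample mean - true mean| when sampling n of the 300 students."""
    total = 0.0
    for _ in range(trials):
        sample = random.sample(population, n)
        total += abs(sum(sample) / n - true_mean)
    return total / trials

err50 = avg_abs_error(50)
err100 = avg_abs_error(100)
print(round(err50, 3), round(err100, 3))  # the n=100 error is smaller on average
```

On average, the \(n=100\) sample mean lands closer to the true mean than the \(n=50\) one, exactly as the Weak Law suggests.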

### There is also a **Strong Law of Large Numbers**

The Strong Law of Large Numbers:

\[\overline{X}_n:=\frac{1}{n}\sum _{i=1}^ n X_ i \xrightarrow [n\to \infty ] {\mathbb{P},\text{ a.s.}} \mu\]

- \(:=\quad\) means "by definition"
- \(\mathbb{P}\quad\) means "in probability"
- \(\text{a.s.}\quad\) means "almost surely" (with probability \(1\))

Note: \(\text{a.s.}\) convergence implies convergence in \(\mathbb{P}\).

OK, now we know that the Law of Large Numbers says that if we have a large enough sample size, then our estimator \(\overline{X}_n\) and the real parameter \(\mu\) are close: \(\overline{X}_n \xrightarrow [n\to \infty] {} \mu\). But how close, we don't know! We don't know how fast (at what rate) \(\overline{X}_n\) approaches \(\mu\). We can think of it as:

\[ \left|\overline{X}_n -\mu \right| \propto \frac{1}{f(n)} \]

where \(f(n)\) is an increasing function __w.r.t.__ \(n\). As \(f(n)\) increases, \(\left|\overline{X}_n -\mu\right|\) decreases, so we want a function \(f(n)\) that increases rapidly w.r.t. \(n\). For example, \(\log(\log(n))\) increases very slowly, so functions like this are not useful. So what is the rate at which \(\overline{X}_n\) approaches \(\mu\)? The answer is hidden in the Central Limit Theorem.
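A small Python experiment hints at the rate: for fair-coin flips, \(\left|\overline{X}_n-\mu\right|\) shrinks roughly like \(1/\sqrt{n}\), i.e. \(f(n)=\sqrt{n}\). The coin model and the chosen values of \(n\) are illustrative assumptions:

```python
import math
import random

random.seed(3)

def mean_abs_error(n, trials=2000):
    """Average |sample mean - 0.5| for n fair-coin flips."""
    total = 0.0
    for _ in range(trials):
        xbar = sum(random.random() < 0.5 for _ in range(n)) / n
        total += abs(xbar - 0.5)
    return total / trials

# If the error shrinks like 1/sqrt(n), multiplying it by sqrt(n)
# should give roughly the same number for every n.
scaled = []
for n in (100, 400, 1600):
    s = mean_abs_error(n) * math.sqrt(n)
    scaled.append(s)
    print(n, round(s, 3))
```

The scaled error stays roughly constant across \(n\), which is precisely the \(\sqrt{n}\) scaling that the Central Limit Theorem makes precise.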

## Gambler's Fallacy

The **Gambler's Fallacy**, also known as the **Monte Carlo Fallacy**, is a rather popular **mistaken belief** that:

If an **independent** event has been occurring **more** frequently than it normally does, then it's **less** likely to occur in the future.

Note that this statement is not true; it's a mistaken belief.

**Example:**

Say you start flipping a **fair** coin \((p=0.5)\), and you observe that the first \(20\) tosses are \(\text{Heads}\). Then some might say:

"According to the Law of Large Numbers the average proportion of \(\text{Heads}\) shall be \(50\%\), and we got \(20\) \(\text{Heads}\) in a row, so there is a high chance that our next toss will be \(\text{Tails}\)."

**But the above statement is incorrect.**

Even if you got \(1000\) \(\text{Heads}\) in a row, the probability of the next toss being \(\text{Tails}\) is still \(50\%\). **But why exactly is the above statement false?**

Because the **Law of Large Numbers** says that **as** \(n\to\infty\), our **sample mean** \(\to\) **true mean**. So even if we got \(1000\) \(\text{Heads}\) in a row, there are still \(\infty\) tosses left to pull our **sample mean** toward the **true mean**.
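We can check with a quick Python simulation that a streak of heads does not change the next flip's odds; the streak length \(k=5\) and the number of flips are arbitrary choices:

```python
import random

random.seed(4)

# 200,000 fair-coin flips: True means heads, False means tails.
flips = [random.random() < 0.5 for _ in range(200000)]

# Collect every flip that comes right after a run of k heads,
# then measure how often that next flip is tails.
k = 5
next_flips = [flips[i] for i in range(k, len(flips))
              if all(flips[i - j] for j in range(1, k + 1))]
p_tails = 1 - sum(next_flips) / len(next_flips)
print(len(next_flips), round(p_tails, 3))  # p_tails stays near 0.5
```

Even conditioned on a streak of \(5\) heads, tails comes up about half the time: the flips are independent, so the streak carries no information about the next toss.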

Now let's see some simulation.
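A minimal Python simulation of the Weak Law for fair-coin flips: the running sample mean settles near the true mean \(0.5\) as \(n\) grows (the checkpoints are arbitrary):

```python
import random

random.seed(5)

# Running sample mean of fair-coin flips (1 = heads, 0 = tails).
heads = 0
checkpoints = (10, 100, 1000, 10000, 100000)
means = {}
for n in range(1, 100001):
    heads += random.random() < 0.5
    if n in checkpoints:
        means[n] = heads / n
        print(n, round(means[n], 4))
```

Early sample means can wander noticeably away from \(0.5\), but by \(n=100{,}000\) the running mean is pinned very close to the true mean.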

## Recommended Watching

- Chebyshev's Inequality? (by Prof. John Tsitsiklis)
- Chebyshev's Inequality? (by Sir Ben Lambert)
- The Weak Law of Large Numbers (by Prof. John Tsitsiklis)
- Law of Large Numbers (by Sir Jeremy Jones)

### Also check out Sir's Probability Fallacies playlist

- The Gambler's Fallacy (by Sir Kevin deLaplante)