tl;dr Chebychev Inequality


Chebychev Inequality proof and use.

Robert Settlage
2022-06-12

Quick proof and example of use of the Chebychev Inequality.

Suppose we have \(x\sim f(x)\), \(g(x) \ge 0\), and \(\epsilon \gt 0\). The Chebychev Inequality is written as:

\[ \begin{equation} \tag{1} Pr(g(x) \ge \epsilon) \le \frac{E[g(x)]}{\epsilon} \end{equation} \]

Proof

\[ \begin{eqnarray} \tag{2} E[g(x)] &=& \int_{\Omega_x} g(x) f(x) dx \;\;\;\;\;\;\;\;\;\;\;\;\text{ :def of expectation}\\ &\ge& \int_{\{x: g(x)\ge \epsilon\}}g(x) f(x) dx \;\;\;\;\text{ :restrict domain, } g(x)\ge 0\\ &\ge& \int_{\{x: g(x)\ge \epsilon\}} \epsilon f(x) dx \;\;\;\;\;\;\;\;\text{ :smallest g(x) on this set is }\epsilon \\ &=& \epsilon Pr(g(x)\ge \epsilon) \;\;\;\;\;\;\;\;\;\;\;\;\text{ :by def of prob} \end{eqnarray} \]

A slight rearrangement gives the desired Chebychev Inequality (1).
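The inequality is easy to sanity check numerically. Below is a minimal sketch (my own illustration, not part of the original derivation) assuming \(f\) is a standard normal and \(g(x) = x^2\), so \(E[g(x)] = 1\) exactly:

```python
# Minimal numerical check of Pr(g(x) >= eps) <= E[g(x)]/eps.
# Assumptions (mine, for illustration): f = standard normal, g(x) = x^2.
import numpy as np

rng = np.random.default_rng(42)
x = rng.standard_normal(1_000_000)   # x ~ f(x)
g = x**2                             # g(x) >= 0

eps = 2.0
lhs = np.mean(g >= eps)              # estimate of Pr(g(x) >= eps)
rhs = np.mean(g) / eps               # estimate of E[g(x)]/eps

print(f"Pr(g(x) >= {eps}) ~ {lhs:.3f} <= E[g(x)]/eps ~ {rhs:.3f}")
# ~0.157 <= ~0.500: the bound holds, though it can be quite loose
```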

Example

Say we observe data \(\mathfrak D\) generated by some process \(f(\cdot)\), and we want to know \(p = Pr(X=\mathfrak D)\). We can use Monte Carlo to approximate this. How? Draw \(x_1, \ldots, x_N \sim f(\cdot)\) and average the indicator:

\[ \begin{equation} \hat p_N = \frac{1}{N}\sum_{i=1}^{N} \mathbb{1}_{\{x_i = \mathfrak D\}} \end{equation} \]
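As a concrete sketch (an assumed example, not from the post: take \(f\) to be Poisson with mean 4 and let the observed \(\mathfrak D\) be the value 2, so the true probability is available in closed form for comparison):

```python
# Monte Carlo estimate of p = Pr(X = D) by averaging indicators.
# Assumed example: f = Poisson(4), observed D = 2, so p is known exactly.
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(0)
lam, D, N = 4.0, 2, 100_000

x = rng.poisson(lam, size=N)     # x_i ~ f(.)
p_hat = np.mean(x == D)          # (1/N) * sum of 1{x_i = D}

p_true = exp(-lam) * lam**D / factorial(D)
print(f"p_hat = {p_hat:.4f}, true p = {p_true:.4f}")
```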

The question may then be: how good is this estimate, and how do we measure that? We could use the relative error:

\[ \begin{equation} \frac{\mid \hat p_N - p \mid}{p} \end{equation} \] Or, perhaps, we want a bit more and want to know the probability the error exceeds some bound:

\[ \begin{equation} Pr\left(\frac{\mid \hat p_N - p \mid}{p} \ge \epsilon \right) \end{equation} \]

Now this is starting to feel like Chebychev. Square the inner terms (both sides are nonnegative, so the event is unchanged) and apply (1) with \(g = (\hat p_N - p)^2/p^2\):

\[ \begin{eqnarray} Pr\left(\frac{\mid \hat p_N - p \mid^2 }{p^2} \ge \epsilon^2 \right) &\le& \frac{E\left[\left(\hat p_N - p\right)^2\right]}{\epsilon^2 p^2} \;\;\;\;\text{ :Chebychev (1)}\\ &=& \frac{Var\left(\hat p_N\right)}{\epsilon^2 p^2} \;\;\;\;\text{ :}E[\hat p_N] = p\\ &=& \frac{Var\left(\mathbb{1}_{\{x_i = \mathfrak D\}}\right)}{N \epsilon^2 p^2} \;\;\;\;\text{ :sum of iid Bernoulli, } Var(\hat p_N) = N\,Var(\mathbb{1})/N^2\\ &=& \frac{\left(1 - p\right) p}{N \epsilon^2 p^2} \\ &=& \frac{1 - p}{N \epsilon^2 p} \end{eqnarray} \] This leads to some insights. As \(N\) increases, the error bound decreases. Additionally, if \(p = Pr(X=\mathfrak D)\) is small, the relative error is larger for a fixed \(N\), so rare events need many more samples. Finally, if we instead fix the number of successes and let \(N\) be random, the relative error is controlled regardless of how small \(p\) is. This last observation is the Las Vegas alternative to Monte Carlo.
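To see the bound in action, here is a sketch (same assumed Poisson example as above) that repeats the estimator many times and compares the empirical frequency of \(\{\text{relative error} \ge \epsilon\}\) with \((1-p)/(N\epsilon^2 p)\):

```python
# Empirical check of Pr(|p_hat - p|/p >= eps) <= (1 - p)/(N eps^2 p).
# Same assumed setup as above: f = Poisson(4), D = 2.
import numpy as np
from math import exp, factorial

rng = np.random.default_rng(1)
lam, D, eps, R = 4.0, 2, 0.05, 500
p = exp(-lam) * lam**D / factorial(D)   # known here, ~0.1465

for N in (1_000, 10_000, 100_000):
    exceed = 0
    for _ in range(R):
        p_hat = np.mean(rng.poisson(lam, size=N) == D)
        exceed += abs(p_hat - p) / p >= eps
    bound = min(1.0, (1 - p) / (N * eps**2 * p))
    print(f"N={N:>7}: empirical {exceed / R:.3f} <= bound {bound:.3f}")
```

The empirical frequency sits well under the bound at every \(N\): Chebychev is conservative, trading tightness for the fact that it needs nothing beyond a mean.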

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/rsettlage/rsettlage.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Settlage (2022, June 12). Robert E. Settlage: tl;dr Chebychev Inequality. Retrieved from https://rsettlage.github.io/mcmc/2022-06-12-chebychev-tldr/

BibTeX citation

@misc{settlage2022tldr,
  author = {Settlage, Robert},
  title = {Robert E. Settlage: tl;dr Chebychev Inequality},
  url = {https://rsettlage.github.io/mcmc/2022-06-12-chebychev-tldr/},
  year = {2022}
}