Non-Parametric Methods

Estimation of CDF

Empirical distribution function as an estimator

The empirical distribution function \(\widehat{F}_n\) estimates any CDF \(F\) by the discrete distribution that assigns mass \(1/n\) to each point of the sample \(\{X_i\}_{i=1}^n\).

Note

Let

\[\begin{split}I(X_i\leq x)=\begin{cases}1 & \text{if $X_i\leq x$}\\ 0 & \text{otherwise}\end{cases}\end{split}\]

Then

\[\widehat{F}_n(x)=\frac{1}{n}\sum_{i=1}^n I(X_i\leq x)\]
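The definition above can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `ecdf` and the toy sample are assumptions made for this example, not part of the notes.

```python
import numpy as np

def ecdf(sample, x):
    """Evaluate the empirical CDF of `sample` at the point(s) `x` (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Build the indicator matrix I(X_i <= x_j) and average over the sample axis,
    # which is exactly (1/n) * sum_i I(X_i <= x_j).
    return (sample[:, None] <= x[None, :]).mean(axis=0)

sample = np.array([3.0, 1.0, 2.0, 2.0])
values = ecdf(sample, [0.5, 1.0, 2.0, 3.5])
# values are 0, 0.25, 0.75 and 1: each sample point contributes mass 1/4.
print(values)
```

The indicator matrix costs \(O(nm)\) memory for \(m\) evaluation points; for large samples, sorting the sample once and using a binary search would be a reasonable alternative.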

Attention

  • Unbiased: \(\mathbb{E}[\widehat{F}_n(x)]=F(x)\)

  • \(\text{se}_F^2=\mathbb{V}_F(\widehat{F}_n(x))=\frac{F(x)(1-F(x))}{n}\), so \(\lim\limits_{n\to\infty}\text{mse}(\widehat{F}_n(x))=0\).

  • The empirical distribution function is a consistent estimator of \(F\) at every point \(x\), for any distribution:

    \[\widehat{F}_n(x)\xrightarrow[]{P}F(x)\]

Confidence interval for CDF estimator

Note

  • Glivenko-Cantelli Theorem: \(||\widehat{F}_n-F||_\infty=\sup_{x}|\widehat{F}_n(x)-F(x)|\xrightarrow[]{a.s.} 0\).

  • Dvoretzky-Kiefer-Wolfowitz (DKW) Inequality: For any \(\epsilon>0\),

    \[\mathbb{P}\left(\sup_x|\widehat{F}_n(x)-F(x)|>\epsilon\right) \le 2\exp(-2n\epsilon^2)\]

Tip

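  • It follows from the DKW inequality that \(\widehat{F}_n(x)\pm\epsilon_n\) is a \(1-\alpha\) confidence band for \(F\) of width \(2\epsilon_n\), where \(\epsilon_n=\sqrt{\frac{1}{2n}\ln\left(\frac{2}{\alpha}\right)}\).

    • Derivation: setting the DKW bound equal to \(\alpha\), i.e. \(2\exp(-2n\epsilon_n^2)=\alpha\), and solving for \(\epsilon_n\) gives the expression above. Hence \(\mathbb{P}\left(\widehat{F}_n(x)-\epsilon_n\le F(x)\le\widehat{F}_n(x)+\epsilon_n\text{ for all }x\right)\ge 1-\alpha\).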
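As a rough illustration of the band \(\widehat{F}_n(x)\pm\epsilon_n\), the sketch below computes it on a grid of points. The function name `dkw_band`, the simulated normal sample, and the clipping of the band to \([0,1]\) are assumptions added for this example.

```python
import numpy as np

def dkw_band(sample, x, alpha=0.05):
    """Return (lower, f_hat, upper, eps) for a 1 - alpha DKW band (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    n = sample.size
    # Empirical CDF evaluated at each grid point.
    f_hat = (sample[:, None] <= x[None, :]).mean(axis=0)
    # Half-width obtained by solving 2 * exp(-2 * n * eps**2) = alpha for eps.
    eps = np.sqrt(np.log(2.0 / alpha) / (2.0 * n))
    # Clipping is an added convenience: a CDF always lies in [0, 1].
    lower = np.clip(f_hat - eps, 0.0, 1.0)
    upper = np.clip(f_hat + eps, 0.0, 1.0)
    return lower, f_hat, upper, eps

rng = np.random.default_rng(0)
sample = rng.normal(size=200)          # simulated data, for illustration only
grid = np.linspace(-3.0, 3.0, 7)
lower, f_hat, upper, eps = dkw_band(sample, grid)
print(eps)                             # half-width shrinks like 1 / sqrt(n)
```

Note that the half-width depends only on \(n\) and \(\alpha\), not on the data or on \(x\), which is what makes the band simultaneous over all \(x\).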

Plug-in Estimator for Statistical Functionals

The plug-in estimator \(\widehat{T}_n(F)\) for any \(T(F)\) can be obtained by replacing \(F\) with \(\widehat{F}_n\).

Estimator for mean

Note

Here \(T(F)=\int x\mathop{dF}\). Since \(\widehat{F}_n\) is discrete, the integral reduces to a sum:

\[\widehat{T}_n(F)=T(\widehat{F}_n)=\frac{1}{n}\sum_{i=1}^nX_i=\bar{X}\]

  • \(\text{se}_F^2=\mathbb{V}_F(\widehat{T}_n)=\frac{\sigma^2}{n}\).

  • CLT says that this estimator is asymptotically normal.

Tip

  • \(\text{se}_F\) depends on the true distribution \(F\).

  • If the true variance \(\sigma^2\) is not known, it can be estimated with the plug-in variance estimator described in the next section.

  • Let the estimate for \(\text{se}_F\) be \(\widehat{\text{se}}_n\). Assuming asymptotic normality, we can compute a \(1-\alpha\) confidence interval as

    \[T(\widehat{F}_n)\pm z_{\alpha/2}\widehat{\text{se}}_n\]

Estimator for variance

Note

Here \(T(F)=\int (x-\mathbb{E}[X])^2\mathop{dF}\). Since \(\widehat{F}_n\) is discrete, the integral reduces to a sum:

\[\widehat{T}_n(F)=T(\widehat{F}_n)=\frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2=S^2_n\]

  • For the sample mean estimator, \(\text{se}_F^2=\sigma^2/n\), so \(\widehat{\text{se}}^2_n=S^2_n/n\).
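Combining \(\text{se}_F^2=\sigma^2/n\) with the plug-in variance gives a normal-based interval for the mean. The sketch below is one possible implementation; the function name `mean_ci` and the toy data are assumptions for illustration, and \(z_{\alpha/2}\) is taken from the standard library's `statistics.NormalDist`.

```python
import numpy as np
from statistics import NormalDist

def mean_ci(sample, alpha=0.05):
    """Normal-based 1 - alpha CI for the mean using plug-in estimates (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    mean_hat = sample.mean()                       # T(F_hat) for the mean
    var_hat = np.mean((sample - mean_hat) ** 2)    # plug-in variance S_n^2 (divides by n)
    se_hat = np.sqrt(var_hat / n)                  # estimated standard error of the mean
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # upper alpha/2 normal quantile
    return mean_hat - z * se_hat, mean_hat + z * se_hat

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
lo, hi = mean_ci(sample)
print(lo, hi)   # interval centred at the sample mean 5.0
```

With only eight observations the normal approximation is crude; the example is meant to show the mechanics rather than a recommended sample size.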

Tip

We can use the same technique to estimate any moment of \(F\).

Estimator for other functionals

For any other functional \(T(F)\), the plug-in estimator is obtained in the same way, as \(T(\widehat{F}_n)\).

Tip

  • \(\text{se}_F\) often has to be estimated in order to obtain a confidence interval.

  • Since the estimator is itself a statistic, its variance can be estimated with the resampling methods described next.

Variance Estimation of a Statistic for CI

We’re interested in estimating the variance of a statistic \(g(X_1,\cdots,X_n)\) given the sample.

Bootstrap

Key Idea

Let \(X^*=(X^*_1,\cdots,X^*_n)\) be a simulated sample obtained from the original sample \((x_1,\cdots,x_n)\) by drawing \(n\) times with replacement.

Note

  • Let \(Y=g(X^*_1,\cdots,X^*_n)\), and let \(Y_1,\cdots,Y_B\) be \(B\) independent replications of \(Y\), one per simulated sample.

  • WLLN: \(\frac{1}{B}\sum_{i=1}^BY_i\xrightarrow[]{P}\mathbb{E}[Y]\) as \(B\to\infty\).

  • Similarly, for a function \(h\): \(\frac{1}{B}\sum_{i=1}^Bh(Y_i)\xrightarrow[]{P}\mathbb{E}[h(Y)]\)

  • \(\frac{1}{B}\sum_{i=1}^B(Y_i-\bar{Y})^2=\frac{1}{B}\sum_{i=1}^B Y_i^2-\left(\frac{1}{B}\sum_{i=1}^B Y_i\right)^2\xrightarrow[]{P}\mathbb{E}[Y^2]-(\mathbb{E}[Y])^2=\mathbb{V}(Y)\)

Tip

  • We can therefore estimate the variance of a statistic by the sample variance of its values over \(B\) simulated samples.

Obtaining the variance of an estimator

Let the estimator for \(T(F)\) be \(\widehat{T}_n=g(X_1,\cdots,X_n)\).

Note

  • For \(i=1\) to \(B\):

    • Obtain a simulated sample \(X_i^*=(X^*_{i,1},\cdots,X^*_{i,n})\).

    • Compute estimate \(\widehat{T}^*_{n,i}=g(X^*_{i,1},\cdots,X^*_{i,n})\)

  • Compute bootstrap variance

    \[v_{\text{boot}}=\frac{1}{B}\sum_{i=1}^B\left(\widehat{T}^*_{n,i}-\frac{1}{B}\sum_{j=1}^B\widehat{T}^*_{n,j}\right)^2\]

  • Use estimation strategy

    \[\mathbb{V}_F(\widehat{T}_n)\approx\mathbb{V}_{\widehat{F}_n}(\widehat{T}_n)\approx v_{\text{boot}}\]
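The loop above can be sketched as follows. This is a minimal sketch, not a tuned implementation: the function name `bootstrap_variance`, the default `n_boot=2000`, the fixed seeds, and the choice of the median as the statistic \(g\) are all assumptions made for illustration.

```python
import numpy as np

def bootstrap_variance(sample, statistic, n_boot=2000, seed=0):
    """Estimate the variance of `statistic` by the bootstrap (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        # Draw n points from the original sample with replacement.
        resample = rng.choice(sample, size=n, replace=True)
        # Recompute the statistic on the simulated sample.
        estimates[i] = statistic(resample)
    # v_boot: mean squared deviation of the B replicates around their average.
    return np.mean((estimates - estimates.mean()) ** 2)

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.0, scale=1.0, size=100)   # simulated data, for illustration only
v_boot = bootstrap_variance(sample, np.median)
se_boot = np.sqrt(v_boot)                           # bootstrap standard error
print(v_boot, se_boot)
```

For the sample mean, the variance under \(\widehat{F}_n\) is known in closed form (\(S_n^2/n\)), which gives a convenient sanity check for the simulation error introduced by a finite \(B\).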

Tip

We can use \(\widehat{\text{se}}_{\text{boot}}=\sqrt{v_{\text{boot}}}\) as the estimate of \(\text{se}\) and compute the CI.

Jackknife

Note

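  • Instead of resampling with replacement, we form \(n\) new samples of size \(n-1\), the \(i\)-th one obtained by removing observation \(X_i\) from the original sample.

  • The statistic is recomputed on each leave-one-out sample, as in the bootstrap loop, giving \(\widehat{T}_{(-i)}\). Because these samples overlap heavily, their spread is rescaled: \(v_{\text{jack}}=\frac{n-1}{n}\sum_{i=1}^n\left(\widehat{T}_{(-i)}-\frac{1}{n}\sum_{j=1}^n\widehat{T}_{(-j)}\right)^2\), which is then used to compute the CI.

  • This needs exactly \(n\) evaluations of the statistic and no random sampling, so it is usually cheaper than the bootstrap when \(n<B\).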
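A leave-one-out version of the same idea might look like the sketch below. It uses the usual jackknife scaling factor \((n-1)/n\) on the sum of squared deviations, which is an assumption relative to the notes above; the function name `jackknife_variance` and the toy data are also hypothetical.

```python
import numpy as np

def jackknife_variance(sample, statistic):
    """Estimate the variance of `statistic` by the jackknife (hypothetical helper)."""
    sample = np.asarray(sample, dtype=float)
    n = sample.size
    # Recompute the statistic n times, each time leaving out one observation.
    loo = np.array([statistic(np.delete(sample, i)) for i in range(n)])
    # Usual jackknife scaling: (n - 1) / n times the sum of squared deviations.
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

sample = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
v_jack = jackknife_variance(sample, np.mean)
print(v_jack)   # for the mean this equals the unbiased sample variance divided by n
```

The procedure is deterministic and needs only \(n\) evaluations of the statistic, though it is known to behave poorly for non-smooth statistics such as the median.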