# Estimators ¶


An estimator is a function that takes random variables (a sample) and returns an estimate of an underlying parameter. In a dataset, for each column (= variable), you look for an estimator of that variable; stacking every estimate into a vector gives the vector $\hat{\theta}_n$. Common notations are:

$\begin{split} h_n(X_1,...,X_n) = \hat{\theta}_n \\ T(X_1,...,X_n) = \hat{\theta}_n \\ f(X_1,...,X_n) = \hat{\theta}_n \\ \end{split}$

But of course, things aren't that easy. The difference between the estimator's expected value and $\theta$ is called the bias (biais). If you have a choice, pick the estimator with the least bias.

• bias: $B(\hat{\theta}) = E[\hat{\theta}] - \theta$
• unbiased: $B(\hat{\theta}) = 0$
• asymptotically unbiased: $\lim_{n \rightarrow +\infty} E[\hat{\theta}] = \theta$
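As a sketch of what bias looks like in practice (using numpy, which is an assumption, not part of the original notes), the variance estimator that divides by $n$ systematically underestimates the true variance:

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0  # variance of a N(0, 2^2) population

# Average the naive (1/n) variance estimator over many independent samples
n, trials = 10, 100_000
samples = rng.normal(0.0, 2.0, size=(trials, n))
naive_var = samples.var(axis=1)  # np.var divides by n by default -> biased

# E[naive_var] ≈ (n-1)/n * true_var = 3.6, so the empirical bias is ≈ -0.4
print(naive_var.mean())
print(naive_var.mean() - true_var)
```

The averaged estimate lands near $\frac{n-1}{n}\sigma^2$ rather than $\sigma^2$, which is exactly the bias the $\frac{1}{n-1}$ correction below removes.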

You should also pick the estimator with the least mean squared deviation/error (MSD/MSE):

$MSE(\hat{\theta}) = E[(\hat{\theta}-\theta)^2] = Var(\hat{\theta})+B(\hat{\theta})^2$
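The decomposition $MSE = Var + B^2$ can be checked numerically. A minimal sketch, assuming numpy and reusing the biased $\frac{1}{n}$ variance estimator as the $\hat{\theta}$ under study:

```python
import numpy as np

rng = np.random.default_rng(1)
theta = 4.0                      # true variance of N(0, 2^2)
n, trials = 10, 200_000
X = rng.normal(0.0, 2.0, size=(trials, n))
est = X.var(axis=1)              # biased 1/n estimator of the variance

mse = np.mean((est - theta) ** 2)     # E[(theta_hat - theta)^2]
var = est.var()                       # Var(theta_hat)
bias = est.mean() - theta             # B(theta_hat)

# The identity holds exactly for the empirical quantities
print(mse, var + bias ** 2)
```

Note that the identity holds exactly when all three quantities are computed over the same set of trials, since it is an algebraic identity, not an asymptotic one.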

Notes

• when an estimator converges to the true value as $n$ grows, we call it a consistent estimator (estimateur convergent)
• an estimator is efficient (estimateur efficace) if $Var(\hat{\theta})$ is low
• an estimator that is not overly impacted by outliers is said to be robust (estimateur robuste)

These three properties, along with the bias, are the key criteria we check to decide whether an estimator is good.
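Robustness is easy to see by contrasting two estimators of location. This is an illustrative sketch (numpy assumed, data made up): the median barely moves when one corrupted value is added, while the mean jumps:

```python
import numpy as np

data = np.array([2.0, 2.1, 1.9, 2.0, 2.2])
with_outlier = np.append(data, 100.0)  # one corrupted measurement

# Both estimators agree on clean data (~2)
print(data.mean(), np.median(data))

# The mean is dragged toward the outlier; the median is robust
print(with_outlier.mean(), np.median(with_outlier))
```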

## Well-known unbiased formulas ¶

You should know how to prove these, but I don't know how to do so, hence I'm not doing it 🙄. You can find the proofs on the web anyway; there is one on Wikipedia, for instance.

• unbiased mean
$\hat{\mu} = \frac{1}{n} \sum_{i=1}^{n} X_i$
• unbiased variance (remember the $\frac{1}{n-1}$)
$S^2_{n} = \frac{1}{n-1} \sum_{i=1}^{n} (X_i-\hat{\mu})^2$
• asymptotically unbiased variance (biased for finite $n$, but the bias vanishes as $n \rightarrow +\infty$)
$\hat{\sigma}^2_{n} = \frac{1}{n} \sum_{i=1}^{n} (X_i-\hat{\mu})^2$
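The two variance formulas differ only in the divisor, and numpy (assumed here, not part of the original notes) exposes that choice through the `ddof` ("delta degrees of freedom") parameter:

```python
import numpy as np

x = np.array([1.0, 2.0, 4.0, 7.0])
mu_hat = x.mean()

biased = ((x - mu_hat) ** 2).sum() / len(x)          # 1/n   version
unbiased = ((x - mu_hat) ** 2).sum() / (len(x) - 1)  # 1/(n-1) version

# np.var reproduces both: ddof=0 divides by n, ddof=1 divides by n-1
assert biased == x.var(ddof=0)
assert unbiased == x.var(ddof=1)
print(biased, unbiased)
```

Forgetting `ddof=1` is a classic source of slightly-too-small variance estimates on small samples.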