20  Bias of Estimators

Example 20.1 Consider a large population in which 20% of individuals have an annual income ($ thousands) of 200, 40% have an income of 70, and 40% have an income of 10. For this population, the population mean is \[ \mu = 10(0.4) + 70(0.4) + 200(0.2) = 72 \] Also, the population variance is \[ \sigma^2 = [10^2(0.4) + 70^2(0.4) + 200^2(0.2)] - 72^2 = 4816 \] and the population SD is \(\sigma = \sqrt{4816} \approx 69.4\).

Suppose our goal is to estimate \(\sigma\) based on a sample of size \(n\). We’ll start by estimating the population variance \(\sigma^2\). In this example, we know the population distribution and we know the population variance \(\sigma^2 = 4816\), but in practice the population distribution and the population variance would be unknown.

One estimator of the population variance \(\sigma^2\) is \[ \hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^n\left(X_i - \bar{X}\right)^2 \]

  1. For \(n=2\), describe in detail how you would use simulation to approximate the distribution of \(\hat{\sigma}^2\), and its expected value.




  2. For \(n=2\) determine the distribution of \(\hat{\sigma}^2\). (This is not a simulation; use ideas from the first half of the course.)




  3. Find and interpret \(\textrm{P}(\hat{\sigma}^2=0)\).




  4. Compute and interpret \(\textrm{E}(\hat{\sigma}^2)\). Compare \(\textrm{E}(\hat{\sigma}^2)\) to \(\sigma^2\); what does this say?
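
One possible way to carry out the simulation in part 1, and to check it against an exact enumeration in the spirit of part 2, is sketched below in Python. This is only an illustration under stated assumptions: the names `POPULATION`, `sigma_hat_sq`, `simulate`, and `exact` are hypothetical, and drawing independently with replacement is an assumption justified by the population being large.

```python
import itertools
import random

# Population from Example 20.1: income ($ thousands) -> proportion of population
POPULATION = {10: 0.4, 70: 0.4, 200: 0.2}


def sigma_hat_sq(sample):
    """Variance estimator that divides by n (not n - 1)."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / n


def simulate(n=2, reps=100_000, seed=1):
    """Approximate the distribution and expected value of sigma_hat_sq by simulation."""
    rng = random.Random(seed)
    values, weights = zip(*POPULATION.items())
    counts = {}
    total = 0.0
    for _ in range(reps):
        # Assumption: i.i.d. draws from the population distribution
        sample = rng.choices(values, weights=weights, k=n)
        estimate = sigma_hat_sq(sample)
        counts[estimate] = counts.get(estimate, 0) + 1
        total += estimate
    distribution = {v: c / reps for v, c in sorted(counts.items())}
    return distribution, total / reps


def exact(n=2):
    """Exact distribution, found by enumerating every ordered sample of size n."""
    distribution = {}
    for combo in itertools.product(POPULATION.items(), repeat=n):
        sample = [value for value, _ in combo]
        prob = 1.0
        for _, p in combo:
            prob *= p
        estimate = sigma_hat_sq(sample)
        distribution[estimate] = distribution.get(estimate, 0.0) + prob
    mean = sum(v * p for v, p in distribution.items())
    return dict(sorted(distribution.items())), mean


if __name__ == "__main__":
    sim_dist, sim_mean = simulate()
    exact_dist, exact_mean = exact()
    print("simulated:", sim_dist, sim_mean)
    print("exact:    ", exact_dist, exact_mean)
```

Each repetition draws one sample of size \(n\), computes \(\hat{\sigma}^2\), and records it; the relative frequencies of the recorded values approximate the distribution, and their average approximates \(\textrm{E}(\hat{\sigma}^2)\). For \(n=2\) the enumeration gives an expected value of 2408, half of \(\sigma^2 = 4816\).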




Example 20.2 Continuing Example 20.1. Another commonly used definition of the sample variance is \[ S^2 = \frac{1}{n-1} \sum_{i=1}^n\left(X_i - \bar{X}\right)^2 \]

  1. For \(n=2\), determine the distribution of \(S^2\).




  2. Compute and interpret \(\textrm{E}(S^2)\). Compare \(\textrm{E}(S^2)\) to \(\sigma^2\); what does this say?




  3. The sample standard deviation is commonly defined as \(S = \sqrt{S^2}\), that is, \[ S = \sqrt{\frac{1}{n-1} \sum_{i=1}^n\left(X_i - \bar{X}\right)^2} \] For \(n=2\), determine the distribution of \(S\).




  4. Compute and interpret \(\textrm{E}(S)\). Compare \(\textrm{E}(S)\) to \(\sigma\); what does this say?
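
The calculations in this example can be checked with a short enumeration. The sketch below is illustrative only; the helper names `sample_variance`, `exact_distribution`, and `expected_value` are hypothetical, and the sample is assumed to be i.i.d. from the population in Example 20.1.

```python
import itertools
import math

# Population from Example 20.1: income ($ thousands) -> proportion of population
POPULATION = {10: 0.4, 70: 0.4, 200: 0.2}


def sample_variance(sample):
    """Sample variance S^2, dividing by n - 1."""
    n = len(sample)
    xbar = sum(sample) / n
    return sum((x - xbar) ** 2 for x in sample) / (n - 1)


def exact_distribution(statistic, n=2):
    """Exact distribution of a statistic over all ordered i.i.d. samples of size n."""
    distribution = {}
    for combo in itertools.product(POPULATION.items(), repeat=n):
        prob = math.prod(p for _, p in combo)
        value = statistic([x for x, _ in combo])
        distribution[value] = distribution.get(value, 0.0) + prob
    return dict(sorted(distribution.items()))


def expected_value(distribution):
    """Probability-weighted average of the possible values."""
    return sum(v * p for v, p in distribution.items())


s2_dist = exact_distribution(sample_variance)
s_dist = exact_distribution(lambda sample: math.sqrt(sample_variance(sample)))

if __name__ == "__main__":
    print("E(S^2) =", expected_value(s2_dist))
    print("E(S)   =", expected_value(s_dist))
```

Comparing the two printed values with \(\sigma^2 = 4816\) and \(\sigma \approx 69.4\) shows that an unbiased estimator of the variance does not automatically yield an unbiased estimator of the standard deviation after taking the square root.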




Example 20.3 Let \(X_1, \ldots, X_n\) be an (i.i.d.) random sample from a population with population mean \(\mu\). Let \(\bar{X} = (X_1 + \ldots + X_n)/n\) be the sample mean. It can be shown that \[ \textrm{E}(\bar{X}) = \mu \] There are three means above; explain what all the means mean, and what the equation says.
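
A sketch of why the equation holds, using only linearity of expectation and the fact that each \(X_i\) has expected value \(\mu\): \[ \textrm{E}(\bar{X}) = \textrm{E}\left(\frac{1}{n}\sum_{i=1}^n X_i\right) = \frac{1}{n}\sum_{i=1}^n \textrm{E}(X_i) = \frac{1}{n}(n\mu) = \mu \]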




Example 20.4 Recall that in the car dealership problem we were trying to estimate \(\mu\) for a Poisson(\(\mu\)) distribution. We considered a few different estimators of \(\mu\). Determine which of the following estimators are unbiased estimators of \(\mu\) based on a random sample \(X_1, \ldots, X_n\) of size \(n\) from a Poisson(\(\mu\)) distribution. If the estimator is not unbiased, identify its bias function.

  1. \(\bar{X}= \frac{1}{n}\sum_{i=1}^n X_i\).




  2. \(S^2 = \frac{1}{n-1}\sum_{i=1}^n\left(X_i-\bar{X}\right)^2\)




  3. \(\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n\left(X_i-\bar{X}\right)^2\). (Hint: \(\hat{\sigma}^2 = \frac{n-1}{n} S^2\).)




  4. \(\frac{n}{n+100}\bar{X}+ \frac{100}{n+100}(2.3)\). (Recall this estimator was like a weighted average of the mean from the old dealership and the sample mean for the new dealership.)




  5. \(2.3\). (This estimator ignores the sample data from the new dealership entirely and just uses the mean from the old dealership as the estimator.)
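
As a rough numerical check on the five estimators, the sketch below simulates Poisson samples and compares the simulated bias with the bias obtained from \(\textrm{E}(\bar{X}) = \mu\), \(\textrm{E}(S^2) = \sigma^2\), and the fact that a Poisson(\(\mu\)) distribution has variance \(\mu\). Everything here is an assumption made for illustration: the function names are hypothetical, and `poisson_draw` is a simple hand-written sampler (Knuth's multiplication method) used only because the Python standard library has no Poisson generator.

```python
import math
import random

PRIOR_MEAN = 2.3    # mean from the old dealership, as in the text
PRIOR_WEIGHT = 100  # weight on the old dealership in estimator 4


def poisson_draw(rng, mu):
    """One Poisson(mu) draw via Knuth's multiplication method (suitable for small mu)."""
    threshold = math.exp(-mu)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def estimators(sample):
    """The five estimators of mu from the example, evaluated on one sample."""
    n = len(sample)
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    w = PRIOR_WEIGHT / (n + PRIOR_WEIGHT)
    return {
        "xbar": xbar,
        "S^2": ss / (n - 1),
        "sigma_hat^2": ss / n,
        "weighted": (1 - w) * xbar + w * PRIOR_MEAN,
        "constant": PRIOR_MEAN,
    }


def simulated_bias(mu, n, reps=50_000, seed=1):
    """Average each estimator over many simulated samples, then subtract mu."""
    rng = random.Random(seed)
    totals = {}
    for _ in range(reps):
        sample = [poisson_draw(rng, mu) for _ in range(n)]
        for name, value in estimators(sample).items():
            totals[name] = totals.get(name, 0.0) + value
    return {name: total / reps - mu for name, total in totals.items()}


def theoretical_bias(mu, n):
    """Bias of each estimator as an estimator of mu, using Var(X_i) = mu for Poisson."""
    w = PRIOR_WEIGHT / (n + PRIOR_WEIGHT)
    return {
        "xbar": 0.0,
        "S^2": 0.0,
        "sigma_hat^2": -mu / n,
        "weighted": w * (PRIOR_MEAN - mu),
        "constant": PRIOR_MEAN - mu,
    }


if __name__ == "__main__":
    print(simulated_bias(mu=4.0, n=5))
    print(theoretical_bias(mu=4.0, n=5))
```

The simulated and theoretical values should agree up to simulation error. Note that the last two estimators are unbiased only at the single value \(\mu = 2.3\).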




Example 20.5 Continuing the Poisson(\(\mu\)) problem. Suppose \(n=3\). Imagine we had not already derived the bias function of the estimator \(\frac{n}{n+100}\bar{X}+ \frac{100}{n+100}(2.3)\).

  1. Describe in detail how you would use simulation to approximate the bias of \(\frac{n}{n+100}\bar{X}+ \frac{100}{n+100}(2.3)\) when \(\mu = 2.3\).




  2. Describe in detail how you would use simulation to approximate the bias function of \(\frac{n}{n+100}\bar{X}+ \frac{100}{n+100}(2.3)\).
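
Both parts can be sketched in a few lines of Python. Part 1 fixes \(\mu = 2.3\) and approximates a single bias value; part 2 repeats the same procedure over a grid of \(\mu\) values to trace out the bias function. The function names, the grid of \(\mu\) values, and the hand-written `poisson_draw` sampler are all assumptions made for this illustration.

```python
import math
import random

PRIOR_MEAN, PRIOR_WEIGHT = 2.3, 100


def poisson_draw(rng, mu):
    """One Poisson(mu) draw via Knuth's multiplication method (suitable for small mu)."""
    threshold = math.exp(-mu)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def weighted_estimate(sample):
    """The estimator n/(n+100) * xbar + 100/(n+100) * 2.3."""
    n = len(sample)
    xbar = sum(sample) / n
    return n / (n + PRIOR_WEIGHT) * xbar + PRIOR_WEIGHT / (n + PRIOR_WEIGHT) * PRIOR_MEAN


def approximate_bias(mu, n=3, reps=20_000, rng=None):
    """Part 1: simulate many samples at one value of mu; bias = average estimate - mu."""
    if rng is None:
        rng = random.Random(0)
    total = 0.0
    for _ in range(reps):
        sample = [poisson_draw(rng, mu) for _ in range(n)]
        total += weighted_estimate(sample)
    return total / reps - mu


def approximate_bias_function(mu_grid, n=3, reps=20_000, seed=0):
    """Part 2: repeat part 1 for each mu in a grid to approximate the bias function."""
    rng = random.Random(seed)
    return {mu: approximate_bias(mu, n, reps, rng) for mu in mu_grid}


if __name__ == "__main__":
    print("bias at mu = 2.3:", approximate_bias(2.3))
    for mu, bias in approximate_bias_function([0.5, 1.0, 2.3, 4.0, 6.0]).items():
        print(f"mu = {mu}: approximate bias = {bias:.4f}")
```

Plotting the approximate bias against \(\mu\) would then show the shape of the bias function; a finer grid gives a smoother picture at the cost of more simulation time.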




Example 20.6 Continuing the Poisson(\(\mu\)) problem. Let \(\theta=e^{-\mu}\). We know \(\bar{X}\) is an unbiased estimator of \(\mu\). We also know that \(e^{-\bar{X}}\) is the MLE of \(e^{-\mu}\). Is \(\hat{\theta}= e^{-\bar{X}}\) an unbiased estimator of \(\theta = e^{-\mu}\)?
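
One way to investigate this numerically is to compute \(\textrm{E}(e^{-\bar{X}})\) directly. The sketch below relies on the standard fact that the sum \(T = X_1 + \cdots + X_n\) of i.i.d. Poisson(\(\mu\)) variables has a Poisson(\(n\mu\)) distribution, so that \(\bar{X} = T/n\). The function names and the truncation point `terms` are hypothetical choices, and the truncated sum is only adequate when \(n\mu\) is well below `terms`.

```python
import math


def expected_theta_hat(mu, n, terms=500):
    """E[exp(-Xbar)], summing exp(-k/n) * P(T = k) where T ~ Poisson(n * mu)."""
    lam = n * mu
    pmf = math.exp(-lam)  # P(T = 0)
    total = 0.0
    for k in range(terms):
        total += math.exp(-k / n) * pmf
        pmf *= lam / (k + 1)  # recursion: P(T = k + 1) = P(T = k) * lam / (k + 1)
    return total


def bias_theta_hat(mu, n):
    """Bias of exp(-Xbar) as an estimator of theta = exp(-mu)."""
    return expected_theta_hat(mu, n) - math.exp(-mu)


if __name__ == "__main__":
    for n in (3, 10, 100):
        print(f"n = {n}: bias at mu = 2.3 is {bias_theta_hat(2.3, n):.5f}")
```

The sum can be compared with the closed form \(\exp\left(n\mu(e^{-1/n} - 1)\right)\), which follows from the Poisson moment generating function. The comparison illustrates that a nonlinear function of an unbiased estimator is generally not unbiased.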




Example 20.7 Continuing the Poisson(\(\mu\)) problem. Consider the estimator \(\hat{\sigma}^2\). Recall that the bias function is \(\text{bias}_\mu(\hat{\sigma}^2) = -\frac{\mu}{n}\). What happens to the bias function as \(n\to\infty\)? What does this mean?
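
A short worked limit, using only the bias function stated above: for any fixed \(\mu\), \[ \lim_{n\to\infty} \text{bias}_\mu(\hat{\sigma}^2) = \lim_{n\to\infty} \left(-\frac{\mu}{n}\right) = 0, \qquad \text{equivalently} \qquad \textrm{E}(\hat{\sigma}^2) = \frac{n-1}{n}\,\mu \to \mu \] An estimator with this property is often called asymptotically unbiased: it is biased for every finite \(n\), but the bias becomes negligible for large samples.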