21  Variance of Estimators

Example 21.1 Consider a large population in which 20% of individuals have an annual income ($ thousands) of 200, 40% have an income of 70, and 40% have an income of 10. For this population, the population mean is \[ \mu = 10(0.4) + 70(0.4) + 200(0.2) = 72 \] Also, the population variance is \[ \sigma^2 = [10^2(0.4) + 70^2(0.4) + 200^2(0.2)] - 72^2 = 4816 \] and the population SD is \(\sigma = \sqrt{4816} \approx 69.4\).
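
The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the names `population`, `mu`, `sigma2`, and `sigma` are illustrative choices, not notation from the text.

```python
import math

# Population distribution from Example 21.1:
# income ($ thousands) -> proportion of the population
population = {10: 0.4, 70: 0.4, 200: 0.2}

# Population mean: sum of value * proportion
mu = sum(x * p for x, p in population.items())

# Population variance via the shortcut formula E(X^2) - [E(X)]^2
sigma2 = sum(x**2 * p for x, p in population.items()) - mu**2

# Population SD
sigma = math.sqrt(sigma2)

print("mean:", mu)
print("variance:", sigma2)
print("SD:", round(sigma, 1))
```

The printed values should agree, up to floating-point rounding, with the population mean, variance, and SD stated in the example.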

Suppose our goal is to estimate \(\mu\) based on a sample of size \(n\). In this example, we know the population distribution and we know the population mean \(\mu = 72\), but in practice the population distribution and the population mean would be unknown.

One estimator of the population mean \(\mu\) is \[ \bar{X} = \frac{1}{n} \sum_{i=1}^n X_i \]

First suppose \(n=2\).

  1. Determine the distribution of \(\bar{X}\).




  2. Find and interpret \(\textrm{P}(\bar{X}\ge 105)\).




  3. Compute \(\textrm{E}(\bar{X})\). How does it relate to \(\mu\)? Why?




  4. Compute and interpret \(\textrm{Var}(\bar{X})\). How does it relate to the population variance \(\sigma^2\)?




  5. Compute and interpret \(\textrm{SD}(\bar{X})\). How does it relate to the population standard deviation \(\sigma\)?
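
For a sample this small, one way to approach the questions above is to enumerate every possible sample. The sketch below assumes the \(n=2\) draws are independent, which is reasonable only because the population is large relative to the sample. It uses exact fractions so no rounding is involved; all variable names are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Population distribution from Example 21.1, as exact fractions
population = {10: Fraction(2, 5), 70: Fraction(2, 5), 200: Fraction(1, 5)}
n = 2

# Assumption: the n draws are independent, so the probability of an
# ordered sample is the product of the individual probabilities.
dist = defaultdict(Fraction)  # sample mean -> probability
for sample in product(population, repeat=n):
    prob = Fraction(1)
    for x in sample:
        prob *= population[x]
    dist[Fraction(sum(sample), n)] += prob

# Summaries of the distribution of the sample mean
mean_xbar = sum(v * p for v, p in dist.items())
var_xbar = sum(v**2 * p for v, p in dist.items()) - mean_xbar**2
sd_xbar = float(var_xbar) ** 0.5
p_tail = sum(p for v, p in dist.items() if v >= 105)

for v in sorted(dist):
    print(f"xbar = {float(v):6.1f}   probability = {float(dist[v]):.2f}")
print("E(xbar)      =", mean_xbar)
print("Var(xbar)    =", var_xbar)
print("SD(xbar)     =", round(sd_xbar, 2))
print("P(xbar>=105) =", p_tail)
```

Comparing the printed `Var(xbar)` with \(\sigma^2 = 4816\) is a useful check on the relationship asked about in part 4. Enumeration grows as \(3^n\), so it is practical only for small \(n\).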




Example 21.2 Continuing Example 21.1, now consider \(n=8\).

  1. How could you use simulation to approximate the distribution of \(\bar{X}\), and its expected value and SD?




  2. Find and interpret \(\textrm{E}(\bar{X})\) without first finding the distribution of \(\bar{X}\), and compare to the simulation results.




  3. Find and interpret \(\textrm{SD}(\bar{X})\) without first finding the distribution of \(\bar{X}\), and compare to the simulation results. What does this measure variability of? Compare to \(n=2\).




  4. Use simulation to find and interpret \(\textrm{P}(\bar{X}\ge 105)\). Compare to \(n=2\).
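
One possible simulation for \(n=8\) is sketched below, using only the Python standard library. It assumes independent draws from the population distribution; the number of repetitions (`reps`), the seed, and the helper `simulate_xbar` are arbitrary illustrative choices rather than anything prescribed by the text.

```python
import random
import statistics

# Population distribution from Example 21.1
values = [10, 70, 200]
weights = [0.4, 0.4, 0.2]

n = 8          # sample size
reps = 20_000  # number of simulated samples (arbitrary choice)

rng = random.Random(21)  # seed fixed only so the run is reproducible


def simulate_xbar(sample_size):
    """Draw one sample of independent values and return its sample mean."""
    sample = rng.choices(values, weights=weights, k=sample_size)
    return sum(sample) / sample_size


# Each repetition yields one value of the sample mean
xbars = [simulate_xbar(n) for _ in range(reps)]

sim_mean = statistics.fmean(xbars)              # approximates E(xbar)
sim_sd = statistics.pstdev(xbars)               # approximates SD(xbar)
sim_tail = sum(x >= 105 for x in xbars) / reps  # approximates P(xbar >= 105)

print("simulated E(xbar):        ", round(sim_mean, 2))
print("simulated SD(xbar):       ", round(sim_sd, 2))
print("simulated P(xbar >= 105): ", round(sim_tail, 3))
```

The simulated mean and SD can be compared with \(\mu = 72\) and \(\sigma/\sqrt{8}\). Rerunning with `n = 2` gives a rough check against the exact \(n=2\) results.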




Example 21.3 Consider again estimating \(\mu\) for a Poisson(\(\mu\)) distribution based on a random sample \(X_1, \ldots, X_n\) of size \(n\). We have seen that both \(\bar{X}\) and \(S^2\) are unbiased estimators of \(\mu\). If we want to choose between these two estimators, how do we decide?

  1. Assume \(n=3\) and \(\mu=2.3\). Describe in full detail how you could conduct a simulation to approximate the sample-to-sample distribution of \(\bar{X}\) and its expected value and standard deviation. Then conduct the simulation and record the results. What does the standard deviation measure?




  2. Assume \(n=3\) and \(\mu=2.3\). Describe in full detail how you could conduct a simulation to approximate the sample-to-sample distribution of \(S^2\) and its expected value and standard deviation. Then conduct the simulation and record the results. What does the standard deviation measure?




  3. Compare the simulation results for \(\bar{X}\) and \(S^2\) when \(n=3\) and \(\mu = 2.3\). Based on the simulation results, which estimator of \(\mu\) is preferred when \(\mu = 2.3\): \(\bar{X}\) or \(S^2\)? Why? Then explain why this information isn’t very helpful.




  4. Assuming \(n= 3\) and \(\mu = 2.3\), compute \(\textrm{E}(\bar{X})\) and \(\textrm{SD}(\bar{X})\), and compare to the simulation results.




  5. For a general \(n\) and \(\mu >0\), find expressions for \(\textrm{E}(\bar{X})\) and \(\textrm{SD}(\bar{X})\).




  6. It can be shown that \[ \textrm{Var}(S^2) = \frac{2\mu^2}{n-1} + \frac{\mu}{n} \] Sketch a plot of the variance functions of both \(\bar{X}\) and \(S^2\) as a function of \(\mu\). Regardless of the value of \(\mu\), which estimator has smaller variance? Which of these two unbiased estimators of \(\mu\), \(\bar{X}\) or \(S^2\), is preferred?
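
The simulations described in this example could be carried out along the following lines. This is a minimal sketch under stated assumptions: the standard library has no Poisson generator, so the hypothetical helper `poisson_draw` uses Knuth's multiplication method (adequate for small \(\mu\)); `reps` and the seed are arbitrary; and the theoretical variance of \(S^2\) is taken from the formula quoted in part 6.

```python
import math
import random
import statistics

rng = random.Random(213)  # seed fixed only so the run is reproducible


def poisson_draw(mu):
    """One Poisson(mu) draw via Knuth's multiplication method (small mu)."""
    threshold = math.exp(-mu)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


mu, n, reps = 2.3, 3, 20_000

xbars, s2s = [], []
for _ in range(reps):
    sample = [poisson_draw(mu) for _ in range(n)]
    xbars.append(statistics.fmean(sample))
    # statistics.variance is the sample variance (divisor n - 1)
    s2s.append(float(statistics.variance(sample)))

# Simulated sample-to-sample summaries of each estimator
mean_xbar, sd_xbar = statistics.fmean(xbars), statistics.pstdev(xbars)
mean_s2, sd_s2 = statistics.fmean(s2s), statistics.pstdev(s2s)

# Theoretical values: Var(xbar) = mu / n for a Poisson(mu) sample,
# and Var(S^2) from the formula quoted in part 6
theory_sd_xbar = math.sqrt(mu / n)
theory_sd_s2 = math.sqrt(2 * mu**2 / (n - 1) + mu / n)

print(f"xbar: mean {mean_xbar:.3f}, SD {sd_xbar:.3f} (theory {theory_sd_xbar:.3f})")
print(f"S^2:  mean {mean_s2:.3f}, SD {sd_s2:.3f} (theory {theory_sd_s2:.3f})")
```

Changing `mu` and `n` and rerunning gives a rough numerical feel for how the two standard deviations compare across parameter values, which complements the plot requested in part 6.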