30 Variance of Estimators
- The bias function of an estimator measures the estimator’s tendency, over many hypothetical samples, to overestimate or underestimate the parameter, as a function of potential values of the parameter.
- But bias is only one consideration: it reflects only the average value of the estimator over many samples, which is just one feature of the estimator’s sample-to-sample distribution.
Example 30.1 Consider again estimating \(\mu\) for a Poisson(\(\mu\)) distribution based on a random sample \(X_1, \ldots, X_n\) of size \(n\). We have seen that both \(\bar{X}\) and \(S^2\) are unbiased estimators of \(\mu\). If we want to choose between these two estimators, how do we decide?
- Assume \(n=3\) and \(\mu=2\). Describe in full detail how you could conduct a simulation to approximate the sample-to-sample distribution of \(\bar{X}\) and its expected value and standard deviation. Then conduct the simulation and record the results. What does the standard deviation measure?
- Repeat part 1 for \(S^2\).
- Compare the simulation results for \(\bar{X}\) and \(S^2\) when \(n=3\) and \(\mu = 2\). Based on the simulation results, which estimator of \(\mu\) is preferred when \(\mu = 2\): \(\bar{X}\) or \(S^2\)? Why? But then explain why this information by itself isn’t very helpful.
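One way to carry out the simulation in parts 1 and 2 is sketched below (a NumPy sketch; the seed and 100,000 repetitions are arbitrary choices). Each row is a hypothetical sample of size \(n=3\) from a Poisson(2) distribution, and \(\bar{X}\) and \(S^2\) are computed for each sample.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu, reps = 3, 2, 100_000

# Each row is one hypothetical sample of size n from Poisson(mu)
samples = rng.poisson(mu, size=(reps, n))

xbars = samples.mean(axis=1)        # value of X-bar for each sample
s2s = samples.var(axis=1, ddof=1)   # value of S^2 for each sample

# Approximate expected value and standard deviation of each estimator
print(xbars.mean(), xbars.std())    # E(X-bar) = 2, SD(X-bar) = sqrt(2/3) ~ 0.816
print(s2s.mean(), s2s.std())        # E(S^2) = 2,  SD(S^2) = sqrt(14/3) ~ 2.160
```

Both estimators average out to about 2 (consistent with unbiasedness), but the simulated values of \(S^2\) are much more spread out from sample to sample than those of \(\bar{X}\).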
- A statistic (or estimator) is a characteristic of the sample which can be computed from the data. More precisely, a statistic is a function of \(X_1,\ldots,X_n\), but not of \(\theta\).
- Because the sample is random, a statistic is itself a random variable, and therefore has its own probability distribution, which describes how values of the statistic would vary from sample-to-sample over many (hypothetical) samples.
- Statistics exhibit sample-to-sample variability: the value of a statistic varies from sample to sample.
- For many statistics (e.g., means and proportions), though not all, statistics computed from larger random samples vary less from sample to sample than statistics computed from smaller random samples.
- For example, \(\text{Var}(\bar{X}_n) =\frac{\sigma^2}{n}\).
- When choosing between two unbiased estimators, the one with smaller variance is generally preferred.
- Remember that the variance of an estimator will be a function of the unknown parameter \(\theta\), so we need to consider the variance function for various potential values of \(\theta\).
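The relationship \(\text{Var}(\bar{X}_n) = \sigma^2/n\) can be checked with a quick simulation (a sketch; a Normal population with \(\sigma^2 = 4\) is assumed purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, reps = 4, 100_000  # assumed population variance; number of simulated samples

# Simulated variance of X-bar shrinks proportionally to 1/n
for n in (10, 100):
    xbars = rng.normal(0, sigma2 ** 0.5, size=(reps, n)).mean(axis=1)
    print(n, xbars.var())  # close to sigma2 / n: about 0.4 for n=10, 0.04 for n=100
```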
Example 30.2 Consider again estimating \(\mu\) for a Poisson(\(\mu\)) distribution based on a random sample \(X_1, \ldots, X_n\) of size \(n\). We have seen that both \(\bar{X}\) and \(S^2\) are unbiased estimators of \(\mu\). If we want to choose between these two estimators, how do we decide?
- Identify the variance function of \(\bar{X}\).
- Describe in full detail how you could use simulation to approximate the variance function of \(S^2\).
- It can be shown that \[
\textrm{Var}(S^2) = \frac{2\mu^2}{n-1} + \frac{\mu}{n}, \qquad \text{when the population distribution is Poisson($\mu$)}
\] Sketch a plot of the variance functions of both \(\bar{X}\) and \(S^2\).
- Which of these two unbiased estimators of \(\mu\), \(\bar{X}\) or \(S^2\), is preferred? Why?
- Suppose \(n=3\) and the sample is \((3, 0, 2)\). For this sample \(\bar{x} = 1.67\) and \(s^2 = 2.33\). Which number, 1.67 or 2.33, is a better estimate of \(\mu\)? Explain.
- Suppose \(n=3\) and the sample is \((3, 0, 2)\). For this sample \(\bar{x} = 1.67\) and \(s^2 = 2.33\). Which number, 1.67 or 2.33, would you choose as the estimate of \(\mu\) based on this sample? Why?
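As a numerical check on the comparison of variance functions, the two functions can be evaluated over a grid of potential values of \(\mu\) (a sketch; \(n = 3\) and the grid endpoints are arbitrary choices). Since \(\sigma^2 = \mu\) for a Poisson(\(\mu\)) distribution, \(\text{Var}(\bar{X}) = \mu/n\), while \(\text{Var}(S^2)\) follows the formula above.

```python
import numpy as np

n = 3
mus = np.linspace(0.1, 5, 50)  # grid of potential values of mu

var_xbar = mus / n                        # Var(X-bar) = mu / n (sigma^2 = mu for Poisson)
var_s2 = 2 * mus ** 2 / (n - 1) + mus / n # Var(S^2) for a Poisson population

# X-bar has the smaller variance at every value of mu on the grid
print((var_xbar < var_s2).all())  # True
```

The same ordering holds for any \(\mu > 0\), since \(\text{Var}(S^2) - \text{Var}(\bar{X}) = 2\mu^2/(n-1) > 0\), which supports preferring \(\bar{X}\) among these two unbiased estimators.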