25 Normal Distributions

Normal distributions are probably the most important distributions in probability and statistics. Any Normal distribution follows the “empirical rule”, which specifies the percentiles that give a Normal distribution its particular bell shape. The key to working with Normal distributions is to work with standardized values, that is, standard deviations from the mean.

A continuous random variable \(X\) has a Normal (a.k.a., Gaussian) distribution with mean \(\mu\in (-\infty,\infty)\) and standard deviation \(\sigma>0\) if its pdf is \[ f_X(x) = \frac{1}{\sigma\sqrt{2\pi}}\,\exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right), \quad -\infty<x<\infty. \]
If \(X\) has a Normal(\(\mu\), \(\sigma\)) distribution then \[\begin{align*} \text{E}(X) & = \mu\\ \text{SD}(X) & = \sigma \end{align*}\]
A Normal density is a particular “bell-shaped” curve which is symmetric about its mean \(\mu\). The mean \(\mu\) is a location parameter: \(\mu\) indicates where the center and peak of the distribution are.
The standard deviation \(\sigma\) is a scale parameter: \(\sigma\) indicates the distance from the mean to where the concavity of the density changes. That is, there are inflection points at \(\mu\pm \sigma\).
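The claim about inflection points can be checked numerically. The following is a minimal sketch, not part of the original notes: the values of `mu` and `sigma` are hypothetical, and the second derivative is only approximated by a finite difference.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal(mu, sigma) density, written directly from the pdf formula."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def second_derivative(f, x, h=1e-3):
    """Central finite-difference approximation to f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

mu, sigma = 70.0, 10.0  # hypothetical values chosen for illustration

def f(x):
    return normal_pdf(x, mu, sigma)

# The hand-written formula should agree with the standard library's pdf.
pdf_matches = all(
    math.isclose(f(x), NormalDist(mu, sigma).pdf(x), rel_tol=1e-12)
    for x in (40, 60, 70, 85)
)

# Concave down between mu - sigma and mu + sigma, concave up outside.
inside = second_derivative(f, mu + 0.5 * sigma)
outside = second_derivative(f, mu + 1.5 * sigma)
at_inflection = second_derivative(f, mu + sigma)

print(pdf_matches, inside < 0, outside > 0)
print(f"f'' at mu + sigma is approximately {at_inflection:.2e}")
```

The sign of the second derivative flips at \(\mu + \sigma\), which is what makes \(\sigma\) visible on a sketch of the density: it is the horizontal distance from the peak to the point where the curve stops bending downward.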

Example 25.1 The pdfs in the plot below represent the distribution of hypothetical test scores in three classes. The test scores in each class follow a Normal distribution. Identify the mean and standard deviation for each class.

The key to working with Normal distributions is to work in terms of standard deviations from the mean (that is, standardized values).
- If \(X\) has a Normal(\(\mu\), \(\sigma\)) distribution then \(Z = \frac{X-\mu}{\sigma}\) has a Normal(0, 1) distribution.
- If \(Z\) has a Normal(0, 1) distribution then \(X = \mu + \sigma Z\) has a Normal(\(\mu\), \(\sigma\)) distribution.
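A short sketch of standardization in practice, using Python's standard-library `statistics.NormalDist`. The parameters and the value `x` are hypothetical and chosen only for illustration. The point is that a probability computed on the original scale equals the probability computed on the standardized scale.

```python
from statistics import NormalDist

mu, sigma = 100.0, 15.0   # hypothetical parameters, for illustration only
x = 120.0

z = (x - mu) / sigma                      # standardized value
p_direct = NormalDist(mu, sigma).cdf(x)   # P(X <= x) on the original scale
p_standard = NormalDist(0, 1).cdf(z)      # P(Z <= z) on the standard scale

# Going the other way: mu + sigma * z recovers the original value.
x_back = mu + sigma * z

print(f"z = {z:.3f}")
print(f"P(X <= {x}) = {p_direct:.4f}, P(Z <= z) = {p_standard:.4f}")
print(f"mu + sigma * z = {x_back}")
```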
Any Normal distribution follows an “empirical rule” which relates percentiles to standard deviations from the mean and defines the particular bell shape.
A continuous random variable \(Z\) has a Standard Normal (a.k.a. Normal(0, 1)) distribution if its pdf is \[ \phi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^2/2}, \quad -\infty<z<\infty. \]
The cdf of a Normal(0, 1) distribution is \[ \Phi(z) = \int_{-\infty}^z\frac{1}{\sqrt{2\pi}}\,e^{-u^2/2}\,du, \quad -\infty<z<\infty, \] but integrals of this type do not have a closed form and must be evaluated with software.
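As one illustration of "evaluated with software", the sketch below computes \(\Phi(z)\) three ways: through the error function `math.erf` (itself a special function defined by an integral, not a closed form), by a crude trapezoid-rule integration of \(\phi\), and with `statistics.NormalDist`. The helper names, the lower limit of -10 standing in for \(-\infty\), and the number of subintervals are assumptions made for this example.

```python
import math
from statistics import NormalDist

def pdf_std(u):
    """Standard Normal pdf."""
    return math.exp(-u * u / 2) / math.sqrt(2 * math.pi)

def cdf_erf(z):
    """Standard Normal cdf written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def cdf_numeric(z, lower=-10.0, n=20_000):
    """Trapezoid-rule approximation to the integral of pdf_std from lower to z.

    The lower limit -10 stands in for -infinity; the area below it is negligible.
    """
    h = (z - lower) / n
    total = 0.5 * (pdf_std(lower) + pdf_std(z))
    total += sum(pdf_std(lower + i * h) for i in range(1, n))
    return total * h

for z in (-2.0, 0.0, 1.0):
    print(z, round(cdf_erf(z), 4), round(cdf_numeric(z), 4),
          round(NormalDist().cdf(z), 4))
```

All three columns should agree to the printed precision, since they approximate the same integral.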

(a) Percentiles highlighted in 0.5 increments of standard deviations from the mean

Table 25.1: Select percentiles for a Normal distribution in terms of standard deviations from the mean

(a) In 0.5 increments of standard deviations from the mean

Percentile	Standard deviations from the mean
0.02%	-3.5
0.1%	-3.0
0.6%	-2.5
2.3%	-2.0
6.7%	-1.5
15.9%	-1.0
30.9%	-0.5
50%	0.0
69.1%	0.5
84.1%	1.0
93.3%	1.5
97.7%	2.0
99.4%	2.5
99.9%	3.0
99.98%	3.5
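The percentiles in part (a) can be regenerated from the standard Normal cdf. This is a sketch using the standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)

# Standard deviations from the mean, from -3.5 to 3.5 in steps of 0.5.
z_values = [k / 2 for k in range(-7, 8)]

# Percentile (as a percentage) corresponding to each standardized value.
percentiles = {z: 100 * Z.cdf(z) for z in z_values}

for z, pct in percentiles.items():
    print(f"{pct:7.2f}%\t{z:+.1f}")
```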

(b) Deciles and quantiles as standard deviations from the mean

Percentile	Standard deviations from the mean
1%	-2.33
5%	-1.64
10%	-1.28
20%	-0.84
25%	-0.67
30%	-0.52
40%	-0.25
50%	0.00
60%	0.25
70%	0.52
75%	0.67
80%	0.84
90%	1.28
95%	1.64
99%	2.33
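Part (b) goes in the opposite direction: from a percentile to the number of standard deviations from the mean. A minimal sketch, assuming the standard-library method `NormalDist.inv_cdf` as the quantile function:

```python
from statistics import NormalDist

Z = NormalDist(0, 1)

probs = [0.01, 0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.50,
         0.60, 0.70, 0.75, 0.80, 0.90, 0.95, 0.99]

# Standardized value below which each proportion of the distribution lies.
z_scores = {p: Z.inv_cdf(p) for p in probs}

for p, z in z_scores.items():
    print(f"{100 * p:.0f}%\t{z:+.2f}")
```

By symmetry, the value for the \(p\)th percentile is the negative of the value for the \((100-p)\)th percentile, so only half of the table needs to be remembered.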

Figure 25.1 defines a Normal distribution by specifying all of its percentiles.
The empirical rule is often described in terms of “within [blank] standard deviations of the mean” as in the following:
- 38% of values are within 0.5 standard deviations of the mean
- 50% of values are within 0.67 standard deviations of the mean
- 68% of values are within 1 standard deviation of the mean
- 87% of values are within 1.5 standard deviations of the mean
- 95% of values are within 2 standard deviations of the mean
- 99% of values are within 2.6 standard deviations of the mean
- 99.7% of values are within 3 standard deviations of the mean
- 99.99% of values are within 4 standard deviations of the mean
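Each "within \(k\) standard deviations" statement is the probability \(\text{P}(-k < Z < k) = 2\Phi(k) - 1\) for a standard Normal \(Z\), by symmetry. The sketch below recomputes the list above; the helper name `prob_within` is made up for this example.

```python
from statistics import NormalDist

Z = NormalDist(0, 1)

def prob_within(k):
    """P(-k < Z < k) for a standard Normal Z, using symmetry: 2 * Phi(k) - 1."""
    return 2 * Z.cdf(k) - 1

within = {k: prob_within(k) for k in (0.5, 0.67, 1, 1.5, 2, 2.6, 3, 4)}

for k, p in within.items():
    print(f"{100 * p:6.2f}% of values are within {k} standard deviations of the mean")
```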

Example 25.2 The wrapper of a package of candy lists a weight of 47.9 grams. Naturally, the weights of individual packages vary somewhat. Suppose package weights have an approximate Normal distribution with a mean of 49.8 grams and a standard deviation of 1.3 grams.

Sketch the distribution of package weights. Carefully label the variable axis. It is helpful to draw two axes: one in the measurement units of the variable, and one in standardized units.
Why wouldn’t the company print the mean weight of 49.8 grams as the weight on the package?
Estimate the probability that a package weighs less than the printed weight of 47.9 grams.
Estimate the probability that a package weighs between 47.9 and 53.0 grams.
Suppose that the company only wants 1% of packages to be underweight. Find the weight that must be printed on the packages.
Find the 99th percentile of package weights. How can you use the work you did in the previous part?
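After estimating the answers by hand from the empirical rule, the estimates can be checked with software. The following is a sketch using the mean and standard deviation given in Example 25.2; the variable names are illustrative, and treating "underweight" as "below the printed weight" is an assumption about the intended reading.

```python
from statistics import NormalDist

weights = NormalDist(mu=49.8, sigma=1.3)   # values given in Example 25.2
printed = 47.9

# Standardized value of the printed weight.
z_printed = (printed - weights.mean) / weights.stdev

p_under = weights.cdf(printed)                         # P(weight < 47.9)
p_between = weights.cdf(53.0) - weights.cdf(printed)   # P(47.9 < weight < 53.0)

w_1st = weights.inv_cdf(0.01)    # weight with only 1% of packages below it
w_99th = weights.inv_cdf(0.99)   # 99th percentile of package weights

print(f"z for printed weight: {z_printed:.2f}")
print(f"P(weight < 47.9) = {p_under:.3f}")
print(f"P(47.9 < weight < 53.0) = {p_between:.3f}")
print(f"1st percentile = {w_1st:.2f} g, 99th percentile = {w_99th:.2f} g")
```

The 1st and 99th percentiles are the same number of standard deviations from the mean, in opposite directions, which is why the work for one can be reused for the other.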

Example 25.3 A bank uses an applicant’s score on some criteria to decide whether or not to approve their loan application. Based on past history, the bank has determined that

- Scores for those who repay the loan follow a Normal distribution with mean 60 and SD 8
- Scores for those who do NOT repay the loan follow a Normal distribution with mean 40 and SD 12

When someone applies for a loan the bank does not know whether the applicant will eventually repay the loan. How can the bank use the applicant’s score to decide whether or not to approve the loan?

Draw sketches of these two Normal curves on the same axes.
Suggest a general form of a decision rule, based on an applicant’s score, for deciding whether or not to approve the applicant’s loan.
Describe the two kinds of classification errors that could be made in this situation.
Determine the probability that we incorrectly reject the application of someone who would repay. Shade in the corresponding region under the Normal curve, and interpret this probability.
Determine the probability that we incorrectly approve the application of someone who would NOT repay. Shade in the corresponding region under the Normal curve, and interpret this probability.
In which direction — smaller or larger — would you need to change the decision rule’s cutoff value in order to decrease the probability that an applicant who would repay the loan is incorrectly rejected?
Would the probability of the other kind of error — incorrectly approving a loan for an applicant who would not repay it — increase or decrease with this new cutoff value?
Determine the cutoff value needed to decrease the probability that an applicant who would repay the loan is incorrectly rejected to 0.05.
Determine the other error probability with this new cutoff value.
Now repeat the two previous parts with the goal of decreasing to 0.05 the probability of incorrectly approving an applicant who would not repay the loan.
If you consider the two kinds of errors to be equally serious, how might you decide which of the three decision rules considered so far is the best?
Which error do you think the bank would consider more serious? In which direction would that move the threshold?
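The error probabilities in Example 25.3 can be computed once a decision rule is fixed. The sketch below assumes a rule of the form "approve if the score is at least some cutoff". The starting cutoff of 50 (halfway between the two means) is a hypothetical choice for illustration, not one given in the example, and the function and variable names are made up.

```python
from statistics import NormalDist

repay = NormalDist(mu=60, sigma=8)       # scores of applicants who repay
no_repay = NormalDist(mu=40, sigma=12)   # scores of applicants who do not repay

def error_probs(cutoff):
    """Error probabilities for the assumed rule 'approve if score >= cutoff'.

    Returns (P(reject | would repay), P(approve | would not repay)).
    """
    return repay.cdf(cutoff), 1 - no_repay.cdf(cutoff)

# Rule 1: a hypothetical starting cutoff halfway between the two means.
rule_1 = 50.0
# Rule 2: cutoff that makes P(reject | would repay) equal to 0.05.
rule_2 = repay.inv_cdf(0.05)
# Rule 3: cutoff that makes P(approve | would not repay) equal to 0.05.
rule_3 = no_repay.inv_cdf(0.95)

for name, cutoff in [("rule 1", rule_1), ("rule 2", rule_2), ("rule 3", rule_3)]:
    p_reject_good, p_approve_bad = error_probs(cutoff)
    print(f"{name}: cutoff = {cutoff:.2f}, "
          f"P(reject | repay) = {p_reject_good:.3f}, "
          f"P(approve | not repay) = {p_approve_bad:.3f}")
```

One possible way to compare rules when the errors are considered equally serious is to look at the sum of the two error probabilities, though weighting by how common each type of applicant is would require information the example does not give.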

Making decisions/predictions based on data often involves trade-offs.
Decreasing the probability of one kind of error often comes at the expense of increasing the probability of another kind of error.
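The trade-off can be made concrete by sweeping the cutoff. This sketch reuses the two score distributions from Example 25.3 and again assumes a rule of the form "approve if the score is at least the cutoff"; the grid of cutoffs is arbitrary.

```python
from statistics import NormalDist

repay = NormalDist(mu=60, sigma=8)       # scores of applicants who repay
no_repay = NormalDist(mu=40, sigma=12)   # scores of applicants who do not repay

cutoffs = list(range(40, 65, 5))  # arbitrary grid: 40, 45, ..., 60

# Probability of rejecting an applicant who would repay, at each cutoff.
false_reject = [repay.cdf(c) for c in cutoffs]
# Probability of approving an applicant who would not repay, at each cutoff.
false_approve = [1 - no_repay.cdf(c) for c in cutoffs]

print("cutoff  P(reject | repay)  P(approve | not repay)")
for c, fr, fa in zip(cutoffs, false_reject, false_approve):
    print(f"{c:6d}  {fr:17.3f}  {fa:22.3f}")
```

As the cutoff rises, the first column of probabilities increases while the second decreases, so no cutoff minimizes both at once.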