Table 21.1: Percentiles of times (minutes) between earthquakes; see Example 21.2.

| Percentile | Value (minutes) |
|---|---|
| 10th | 12.6 |
| 20th | 26.8 |
| 30th | 42.8 |
| 40th | 61.3 |
| 50th | 83.2 |
| 60th | 110.0 |
| 70th | 144.5 |
| 80th | 193.1 |
| 90th | 276.3 |
21 Continuous Random Variables
- The probability that a continuous random variable \(X\) equals any particular value is 0. That is, if \(X\) is continuous then \(\text{P}(X=x)=0\) for all \(x\).
- For continuous random variables, it doesn’t really make sense to talk about the probability that the random variable is equal to a particular value.
- Even though any specific value of a continuous random variable has probability 0, intervals can have positive probability. In particular, the probability that a continuous random variable is “close to” a specific value can be positive.
- For continuous random variables, tabulating frequencies of individual values and drawing impulse plots (as we did for discrete RVs) is not useful, since each simulated value will typically occur only once in the simulation.
- Instead, we summarize simulated values of continuous random variables in a histogram. A histogram groups the observed values into “close-to bins” and plots frequencies or densities for each bin.
- Typically, the areas of the bars in a histogram represent relative frequencies, in which case the axis that measures the height of the bars is called “density”.
- The continuous analog of a probability mass function (pmf) is a probability density function (pdf). However, while pmfs and pdfs play analogous roles, they are different in one fundamental way; namely, a pmf outputs probabilities directly, while a pdf does not.
- A probability density function only provides the density height; the area under the density curve determines probabilities.
- For a continuous random variable \(X\) with pdf \(f_X\), the probability that \(X\) takes a value in the interval \([a, b]\) is the area under the pdf over the region \([a,b]\): \(\text{P}(a\le X\le b) = \int_a^b f_X(x)\,dx\).
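
The points above can be checked with a small simulation. The following is a minimal sketch, assuming (purely for illustration) that \(X\) has an Exponential distribution with rate 1, so that \(f_X(x)=e^{-x}\) for \(x\ge 0\); the distribution, the interval \([0.9, 1.1]\), and the bin width are hypothetical choices, not part of the chapter.

```python
import math
import random

random.seed(21)  # fixed seed so the sketch is reproducible

# Assumption: X ~ Exponential(rate 1), chosen only for illustration.
n = 100_000
xs = [random.expovariate(1.0) for _ in range(n)]

# Relative frequency of one exact value versus an interval around it.
freq_exact = sum(x == 1.0 for x in xs) / n
freq_close = sum(0.9 <= x <= 1.1 for x in xs) / n

# Exact interval probability: area under f(x) = exp(-x) over [0.9, 1.1].
prob_close = math.exp(-0.9) - math.exp(-1.1)

# Density histogram: bar height = relative frequency / bin width,
# so the area of each bar is the relative frequency of its bin.
width = 0.5
n_bins = int(max(xs) / width) + 1
counts = [0] * n_bins
for x in xs:
    counts[int(x / width)] += 1
heights = [c / n / width for c in counts]
total_area = sum(h * width for h in heights)

print("relative frequency of exactly 1.0:", freq_exact)
print("relative frequency of [0.9, 1.1]:", freq_close)
print("area under the pdf over [0.9, 1.1]:", round(prob_close, 4))
print("total area of histogram bars:", round(total_area, 6))
```

The exact value 1.0 should essentially never appear among the simulated values, while the interval around it has a relative frequency close to the area under the pdf, and the bar areas of the density histogram add up to 1.
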
Example 21.1 Maggie and Seamus are babies who have just turned one. At their one-year visits to their pediatrician:
- Maggie is 76 cm tall and in the 75th percentile of height for females.
- Seamus is 72 cm tall and in the 10th percentile of height for males.
Explain what these percentiles mean.
- A distribution is basically just a collection of percentiles.
- Roughly, the value \(x\) is the \(p\)th percentile of the distribution of a random variable \(X\) if \(p\) percent of values of the variable are less than or equal to \(x\): \(\text{P}(X\le x) = p/100\).
- The cumulative distribution function (cdf) of a random variable fills in the blank for any given \(x\): \(x\) is the (blank) percentile. That is, for an input \(x\), the cdf outputs \(\text{P}(X\le x)\).
- The cumulative distribution function (cdf) of a random variable \(X\) (defined on a probability space with probability measure \(\text{P}\)) is the function \(F_X: \mathbb{R}\to[0,1]\) defined by \(F_X(x) = \text{P}(X\le x)\).
- A cdf is defined for all real numbers \(x\) regardless of whether \(x\) is a possible value of \(X\).
- A cdf defines a distribution by defining all of its percentiles. Given all the percentiles we can determine the probability that the random variable lies in any interval.
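
To illustrate how percentiles determine interval probabilities, here is a minimal sketch that uses the values in Table 21.1 (times between earthquakes, in minutes) as known points on a cdf. It relies on the identity \(\text{P}(a < X\le b) = F(b) - F(a)\); the names `cdf_points` and `interval_prob` are hypothetical helpers introduced only for this example.

```python
# Known points on the cdf, read from Table 21.1: F(value) = percentile / 100.
cdf_points = {
    12.6: 0.10, 26.8: 0.20, 42.8: 0.30, 61.3: 0.40, 83.2: 0.50,
    110.0: 0.60, 144.5: 0.70, 193.1: 0.80, 276.3: 0.90,
}

def interval_prob(a, b):
    """P(a < X <= b) = F(b) - F(a), for a and b that appear in the table."""
    return cdf_points[b] - cdf_points[a]

# Probability of a value between the 20th and 50th percentiles.
print(round(interval_prob(26.8, 83.2), 2))  # 0.3

# Probability of a value above the 80th percentile.
print(round(1 - cdf_points[193.1], 2))  # 0.2
```
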
Example 21.2 In a certain region, times (minutes) between occurrences of earthquakes (of any magnitude) have a distribution with percentiles displayed in Table 21.1. Suppose an earthquake just occurred and let \(X\) be the amount of time until the next earthquake; we’ll assume Table 21.1 represents (partially) the distribution of \(X\).
- Let \(F\) be the cdf of \(X\). Evaluate and interpret \(F(12.6)\) and \(F(26.8)\).
- Construct a spinner corresponding to this distribution.
- Sketch a histogram of this distribution with unequal bin widths.
- Sketch the pdf of \(X\).
- Sketch a histogram of this distribution with equal bin widths.
- Let \(Q(0.1)\) represent the value of \(x\) for which \(F(x)=0.1\). Evaluate and interpret \(Q(0.1)\).
- The quantile function (essentially the inverse cdf) fills in the following blank for a given \(p\in[0, 1]\): the \(100p\)th percentile is (blank).
- For example, evaluating the quantile function at \(p=0.25\) outputs the 25th percentile.
- For a continuous random variable with cdf \(F\), the quantile function \(Q:[0,1]\to\mathbb{R}\) is the inverse of the cdf, \(Q(p) = F^{-1}(p)\).
- A quantile function basically gives you the same information as a cdf, just in the other direction. Both a quantile function and a cdf define a distribution by specifying all the percentiles.
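
As a rough numerical companion to Example 21.2, the sketch below computes density heights for a histogram whose bins run between consecutive percentiles in Table 21.1, and approximates the quantile function between tabulated points. Two assumptions are made that the table itself does not state: the first bin starts at 0 (times are nonnegative), and values between tabulated percentiles are filled in by linear interpolation. The function name `quantile_approx` is hypothetical.

```python
# Percentiles from Table 21.1, as probabilities and values in minutes.
probs = [k / 10 for k in range(1, 10)]
values = [12.6, 26.8, 42.8, 61.3, 83.2, 110.0, 144.5, 193.1, 276.3]

# Assumption: times are nonnegative, so the first bin starts at 0.
ps = [0.0] + probs
edges = [0.0] + values

# Each bin between consecutive percentiles holds probability 0.1,
# so its density height is 0.1 / (bin width).
densities = [0.1 / (hi - lo) for lo, hi in zip(edges, edges[1:])]
for lo, hi, d in zip(edges, edges[1:], densities):
    print(f"[{lo:6.1f}, {hi:6.1f}]  density = {d:.5f}")

def quantile_approx(p):
    """Approximate Q(p) for 0 <= p <= 0.9 by linear interpolation
    between tabulated percentiles (an assumption, not an exact formula)."""
    if not 0.0 <= p <= 0.9:
        raise ValueError("the table only covers 0 <= p <= 0.9")
    for i in range(len(ps) - 1):
        if p <= ps[i + 1]:
            frac = (p - ps[i]) / (ps[i + 1] - ps[i])
            return edges[i] + frac * (edges[i + 1] - edges[i])

print(quantile_approx(0.1))   # the tabulated 10th percentile
print(quantile_approx(0.25))  # interpolated 25th percentile
```

The bins get wider as the values increase while each still holds 10% of the probability, so the density heights decrease from left to right.
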
Example 21.3 Recall Example 5.5. Let \(X\) be the random variable representing Han’s arrival time and assume that the cdf of \(X\) satisfies \[ F(x) = (x/60)^2, \qquad 0\le x\le 60 \]
- Compute and interpret \(\text{P}(X\le 30)\).
- Compute and interpret \(\text{P}(X \le 42.42)\).
- Compute and interpret \(\text{P}(X \le 51.96)\).
- What do the previous parts tell you about the percentiles of \(X\)?
- Use the answers to the previous parts to start to construct a spinner for simulating values of \(X\).
- How could you fill in more of the spinner? For example, where should you place the values 10, 20, 30, 40, 50 on the spinner axis?
- Suppose you simulate many values of \(X\). Sketch a histogram of \(X\).
- Sketch the pdf of \(X\).
- Compute and interpret the 10th percentile.
- Compute and interpret the 90th percentile.
- Find an expression for the \(100p\)th percentile, where \(0<p<1\); e.g., \(p=0.1\) corresponds to the 10th percentile.
- Use the answers to the previous parts to start to construct a spinner for simulating values of \(X\). How could you fill in more of the spinner?
- The quantile function can be used to create a spinner for a distribution. The values on the outside boundary of the spinner are scaled according to the quantile function (which is determined by the cdf). Intervals corresponding to regions of higher density (“more likely” values) are stretched out on the spinner boundary; intervals corresponding to regions of lower density (“less likely” values) are shrunk.
- Universality of the Uniform (or “one spinner to rule them all”). Let \(F\) be a cdf and \(Q\) its corresponding quantile function. Let \(U\) have a Uniform(0, 1) distribution and define the random variable \(X=Q(U)\). Then the cdf of \(X\) is \(F\).
- Universality of the uniform might look complicated but all it basically says is that you can construct a spinner by putting the 25th percentile 25% of the way around, the 75th percentile 75% of the way around, etc.
- In fact, universality of the uniform says we don’t have to create a new spinner. We can just spin the Uniform(0, 1) spinner, which tells us which percentile we want (e.g., 0.7 for the 70th percentile), and transform each resulting value by plugging it into the quantile function to obtain the percentile in the measurement units of the variable.
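
Universality of the uniform can be tried out on the cdf from Example 21.3. Solving \((x/60)^2 = p\) for \(x\in[0, 60]\) gives the quantile function \(Q(p) = 60\sqrt{p}\). The sketch below spins a simulated Uniform(0, 1) spinner, transforms each spin with \(Q\), and compares the empirical cdf of the results with \(F\). The sample size, seed, and the helper name `ecdf` are arbitrary choices made for this illustration.

```python
import math
import random

random.seed(59)  # fixed seed so the sketch is reproducible

def F(x):
    """cdf from Example 21.3, valid for 0 <= x <= 60."""
    return (x / 60) ** 2

def Q(p):
    """Quantile function: the solution of F(x) = p, for 0 <= p <= 1."""
    return 60 * math.sqrt(p)

# Spin the Uniform(0, 1) spinner and transform each spin with Q.
n = 100_000
xs = [Q(random.random()) for _ in range(n)]

def ecdf(x):
    """Proportion of simulated values that are <= x."""
    return sum(v <= x for v in xs) / n

# The empirical cdf of X = Q(U) should be close to F.
for x in (30, 42.42, 51.96):
    print(x, round(ecdf(x), 3), round(F(x), 3))

# 10th and 90th percentiles from the quantile function.
print(round(Q(0.1), 2), round(Q(0.9), 2))  # 18.97 56.92
```

The simulated proportions should land near 0.25, 0.50, and 0.75, matching the interpretation of 30, 42.42, and 51.96 as (approximately) the 25th, 50th, and 75th percentiles.
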