Table 24.1: Percentiles of Exponential distributions as multiples of the mean.

| Percentile | Relative to mean |
|---|---|
| 10% | 0.11 times the mean |
| 25% | 0.29 times the mean |
| 39.3% | 0.50 times the mean |
| 50% | 0.69 times the mean |
| 63.2% | 1.00 times the mean |
| 75% | 1.39 times the mean |
| 77.7% | 1.50 times the mean |
| 86.5% | 2.00 times the mean |
| 90% | 2.30 times the mean |
| 91.8% | 2.50 times the mean |
| 95% | 3.00 times the mean |
| 97% | 3.50 times the mean |
| 98.2% | 4.00 times the mean |
| 98.9% | 4.50 times the mean |
| 99.3% | 5.00 times the mean |
24 Exponential Distributions
Exponential distributions are often used to model the waiting times between events in a random process that occurs continuously over time.
Example 24.1 NASA tracks data on fireball events, exceptionally bright meteors that are spectacular enough to be seen over a very wide area1. Let \(X\) be the time, measured in months (assume 30 days per month), between any two fireballs and suppose \(X\) has pdf \[ f_X(x) = 2.5 e^{-2.5 x}, \; x \ge 0 \] We say that \(X\) has an Exponential distribution with rate 2.5 per month.
- Sketch the pdf of \(X\) and verify that \(f_X\) is a valid pdf.
- Without doing any integration, approximate the probability that the time until the next fireball rounded to the nearest day is 6 days.
- Compute the probability that the time until the next fireball is less than 6 days.
- Compute and interpret \(\text{P}(X > 2)\).
- Find the cdf of \(X\).
- Find the median time between fireballs.
- Suggest a shortcut formula for \(\text{E}(X)\). Then compute and interpret \(\text{E}(X)\). How does the mean compare to the median? Why?
- Compute and interpret \(\text{P}(X \le \text{E}(X))\).
- Compute \(\text{E}(X^2)\).
- Find \(\text{Var}(X)\) and \(\text{SD}(X)\).
- Suppose \(Y\) is the time between fireballs measured in days. How does all of the above change?
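Some of the computations in this example can be checked numerically. The following is a rough sketch (not a full solution, and the notes don't prescribe a language; Python with numpy/scipy is assumed here). The 6-day question uses the 30-day-month assumption, so 6 days corresponds to 0.2 months.

```python
import numpy as np
from scipy import integrate

# pdf of X, time between fireballs in months, rate 2.5 per month
f = lambda x: 2.5 * np.exp(-2.5 * x)

# f is a valid pdf: nonnegative, and it integrates to 1 over [0, infinity)
total, _ = integrate.quad(f, 0, np.inf)
print(total)  # 1.0

# P(X < 0.2): next fireball within 6 days = 6/30 = 0.2 months
p, _ = integrate.quad(f, 0, 0.2)
print(p)      # 1 - exp(-0.5), about 0.393
```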
- A continuous random variable \(X\) has an Exponential distribution with rate parameter \(\lambda>0\) if its pdf is \[ f_X(x) = \begin{cases}\lambda e^{-\lambda x}, & x \ge 0,\\ 0, & \text{otherwise} \end{cases} \]
- If \(X\) has an Exponential(\(\lambda\)) distribution then \[\begin{align*} \text{cdf:} \quad F_X(x) & = 1-e^{-\lambda x}, \quad x\ge 0\\ \text{P}(X>x) & = e^{-\lambda x}, \quad x\ge 0\\ \text{quantile:} \quad Q(p) & = -(1/\lambda) \ln(1-p),\quad 0<p<1\\ \text{E}(X) & = \frac{1}{\lambda}\\ \text{Var}(X) & = \frac{1}{\lambda^2}\\ \text{SD}(X) & = \frac{1}{\lambda} \end{align*}\]
- Percentiles for any Exponential distribution follow a particular pattern in terms of multiples of the mean:
- For any \(m>0\), \(\text{P}(X \le m\text{E}(X)) = 1-e^{-m}\).
- The \(p\)th percentile is \(-\ln(1-p)\) times \(\text{E}(X)\).
- See Table 24.1.
- Exponential distributions are often used to model the waiting time in a random process until some event occurs.
- \(\lambda\) is the average rate at which events occur over time (e.g., 2 per hour)
- \(1/\lambda\) is the mean time between events (e.g., 1/2 hour)
- If \(X\) has an Exponential(\(\lambda\)) distribution and \(a>0\) is a constant, then \(aX\) has an Exponential(\(\lambda/a\)) distribution.
- If \(X\) is measured in hours with rate \(\lambda = 2\) per hour and mean 1/2 hour
- Then \(60X\) is measured in minutes with rate \(2/60\) per minute and mean \(60(1/2)=30\) minutes.
- If \(X\) has an Exponential(1) distribution and \(\lambda>0\) is a constant then \(X/\lambda\) has an Exponential(\(\lambda\)) distribution.
- If \(U\) has a Uniform(0, 1) distribution then \(-\ln(1-U)\) has Exponential(1) distribution, and \(-(1/\lambda) \ln(1-U)\) has an Exponential(\(\lambda\)) distribution.
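The closed-form facts and the universality (inverse-transform) result above can be sketched numerically. This is an illustrative check, assuming Python with numpy/scipy; note that `scipy.stats.expon` is parametrized by `scale` \(= 1/\lambda\), not by the rate.

```python
import numpy as np
from scipy import stats

rate = 2.5                          # lambda, fireballs per month
X = stats.expon(scale=1 / rate)     # scipy uses scale = 1/lambda

# closed-form facts for Exponential(2.5)
print(X.sf(2))                      # P(X > 2) = exp(-5)
print(X.ppf(0.5))                   # median = -ln(1 - 0.5) / 2.5
print(X.mean(), X.std())            # both equal 1/lambda = 0.4

# inverse transform: -ln(1 - U)/lambda has an Exponential(lambda) distribution
rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)
x = -np.log(1 - u) / rate
print(x.mean())                     # approximately 0.4
```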
Example 24.2 Continuing Example 24.1. Let \(X\) be the waiting time (months) until the next fireball and assume \(X\) has an Exponential(2.5) distribution.
- Find the conditional probability that the waiting time from now until the next fireball is greater than 3 months given that no fireballs occur in the next month. Be sure to write a valid probability statement involving \(X\) before computing.
- Compare to \(\text{P}(X > 2)\). What do you notice?
- Memoryless property. If \(W\) has an Exponential(\(\lambda\)) distribution then for any \(w,h>0\) \[ \text{P}(W>w+h\,\vert\, W>w) = \text{P}(W>h) \]
- Given that we have already waited \(w\) units of time, the conditional probability that we wait at least an additional \(h\) units of time is the same as the unconditional probability that we wait at least \(h\) units of time.
- In this sense, the future waiting time does not “remember” how long we have already waited.
- A continuous random variable \(W\) has the memoryless property if and only if \(W\) has an Exponential distribution. That is, Exponential distributions are the only continuous2 distributions with the memoryless property.
- If \(W\) has an Exponential(\(\lambda\)) distribution then the conditional distribution of the excess waiting time, \(W - w\), given \(\{W>w\}\) is Exponential(\(\lambda\)).
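The memoryless property can be seen in simulation: among simulated waiting times that exceed \(w\), the fraction that also exceed \(w+h\) matches the unconditional probability of exceeding \(h\). A minimal sketch, assuming numpy (the choice of \(w=1\), \(h=2\), and rate 2.5 is illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
w, h, rate = 1.0, 2.0, 2.5
W = rng.exponential(scale=1 / rate, size=1_000_000)

conditional = np.mean(W[W > w] > w + h)   # P(W > w + h | W > w)
unconditional = np.mean(W > h)            # P(W > h)
print(conditional, unconditional)         # both approximately exp(-5)
```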
Example 24.3 Xiomara and Rogelio each leave work at noon from different locations to meet the other for lunch. The amount of time, \(X\), that it takes Xiomara to arrive is a random variable with an Exponential distribution with mean 10 minutes. The amount of time, \(Y\), that it takes Rogelio to arrive is a random variable with an Exponential distribution with mean 20 minutes. Assume that \(X\) and \(Y\) are independent. Let \(W\) be the time, in minutes after noon, at which the first person arrives.
- What is the relationship between \(W\) and \(X, Y\)?
- Compute and interpret \(\text{P}(W>40)\).
- Find \(\text{P}(W > w)\) and identify by name the distribution of \(W\).
- Compute and interpret \(\text{E}(W)\). Is it equal to \(\min(\text{E}(X), \text{E}(Y))\)?
- Is \(\text{P}(Y>X)\) greater than or less than 0.5? Make an educated guess for \(\text{P}(Y > X)\). Then run a simulation to approximate the probability.
- Use simulation to approximate the conditional distribution of \(W\) given \(\{Y > X\}\) and the conditional distribution of \(W\) given \(\{Y < X\}\). What do you notice?
- Exponential race (a.k.a. competing risks). Let \(W_1, W_2, \ldots, W_n\) be independent random variables. Suppose \(W_i\) has an Exponential distribution with rate parameter \(\lambda_i\). Let \(W = \min(W_1, \ldots, W_n)\) and let \(I\) be the discrete random variable which takes value \(i\) when \(W=W_i\), for \(i=1, \ldots, n\). Then
- \(W\) has an Exponential distribution with rate \(\lambda = \lambda_1 + \cdots+\lambda_n\)
- \(\text{P}(I=i) = \text{P}(W=W_i) = \frac{\lambda_i}{\lambda_1+\cdots+\lambda_n}, i = 1, \ldots, n\)
- \(W\) and \(I\) are independent
- Imagine there are \(n\) contestants in a race, labeled \(1, \ldots, n\), racing independently, and \(W_i\) is the time it takes for the \(i\)th contestant to finish the race. Then \(W = \min(W_1, \ldots, W_n)\) is the winning time of the race, and \(W\) has an Exponential distribution with rate parameter equal to sum of the individual contestant rate parameters.
- The discrete random variable \(I\) is the label of which contestant is the winner. The probability that a particular contestant is the winner is the contestant’s rate divided by the total rate. That is, the probability that contestant \(i\) is the winner is proportional to the contestant’s rate \(\lambda_i\).
- Furthermore, \(W\) and \(I\) are independent. Information about the winning time does not influence the probability that a particular contestant won. Information about which contestant won does not influence the distribution of the winning time.
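The race facts can be checked by simulation using the rates from Example 24.3: \(X\) has rate \(1/10\) per minute and \(Y\) has rate \(1/20\) per minute, so the minimum has rate \(3/20\) and Xiomara wins with probability \((1/10)/(3/20) = 2/3\). A sketch assuming numpy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000
x = rng.exponential(scale=10, size=n)   # Xiomara's arrival time (mean 10)
y = rng.exponential(scale=20, size=n)   # Rogelio's arrival time (mean 20)
w = np.minimum(x, y)                    # time of first arrival

print(w.mean())        # approx 1/(1/10 + 1/20) = 20/3, about 6.67 minutes
print(np.mean(x < y))  # P(Xiomara first) = (1/10)/(3/20) = 2/3
```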
Example 24.4 Database queries to the Cal Poly data warehouse occur randomly throughout the day. During regular business hours, queries arrive at rate 0.8 per second on average, so that the average number of queries that arrive during any \(t\) second time interval is \(0.8t\). Suppose that the number of queries that arrive during any \(t\) second time interval follows a Poisson distribution with mean \(0.8t\).
We are interested in the distribution of \(T\), the time (seconds) until the next query arrives.
- Interpret the event \(\{T>2\}\). How can you express this as an equivalent event involving the number of queries?
- Compute \(\text{P}(T > 2)\).
- Compute \(\text{P}(T > t)\) as a function of \(t>0\).
- Identify by name the distribution of \(T\); be sure to specify the values of any relevant parameters.
- Compute and interpret \(\text{E}(T)\).
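The key connection here is that \(\{T > t\}\) occurs exactly when 0 queries arrive in the next \(t\) seconds, so \(\text{P}(T > t) = \text{P}(\text{Poisson}(0.8t) = 0) = e^{-0.8t}\). A numerical sketch of this identity (assuming scipy, with its `scale` \(=1/\lambda\) convention for `expon`):

```python
import numpy as np
from scipy import stats

rate = 0.8                                 # queries per second
t = 2
print(stats.poisson(rate * t).pmf(0))      # P(0 arrivals in 2 seconds)
print(np.exp(-rate * t))                   # same value: exp(-1.6)
print(stats.expon(scale=1 / rate).sf(t))   # P(T > 2) for T ~ Exponential(0.8)
```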
- Events occur continuously over time according to a Poisson process with rate \(\lambda>0\) (per unit time) if
- The number of events that occur in any time interval of length \(t\) has a Poisson\((\lambda t)\) distribution.
- In particular, the distribution does not depend on the starting time or ending time of the interval, just its length.
- The numbers of events that occur in disjoint intervals of time are independent.
- A random process is a Poisson process with rate \(\lambda\) if and only if
- it is a counting process, that is, it is a process which counts the number of occurrences of some event over time.
- the interarrival times (times between events) \(W_1, W_2,\ldots\) are independent Exponential\((\lambda)\) random variables
- A random process is a Poisson process with rate \(\lambda\) if and only if
- It is a counting process
- The numbers of events that occur in disjoint intervals of time are independent
- For small \(h\), \[\begin{align*}
\text{P}(\text{0 events in time interval of length $h$}) & \approx 1-\lambda h\\
\text{P}(\text{1 event in time interval of length $h$}) & \approx \lambda h\\
\text{P}(\text{at least 2 events in time interval of length $h$}) & \approx 0
\end{align*}\]
- Roughly, in short intervals of time, at most one event will occur, and the probability that an event does occur is proportional to the length of the time interval.
- That is, if time is chopped up into many short intervals of equal length \(h\), a Poisson process behaves like a Bernoulli process: each short interval is a trial, an event either occurs (success) or not, the trials (intervals) are independent, the probability of success is the same for each trial (\(\lambda h\)), and the process counts the number of successes.
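The Bernoulli approximation can be demonstrated by simulation: chop one second into many intervals of length \(h\), flip a coin with success probability \(\lambda h\) in each, and count successes. The count should behave like a Poisson\((\lambda)\) random variable. A sketch assuming numpy, with the query rate \(\lambda = 0.8\) from Example 24.4:

```python
import numpy as np

rng = np.random.default_rng(0)
rate, h = 0.8, 0.001
n_intervals = int(1 / h)        # number of short intervals in one second
reps = 100_000

# total successes in one second, repeated many times
counts = rng.binomial(n_intervals, rate * h, size=reps)

print(counts.mean())            # approx lambda * 1 = 0.8 events per second
print(np.mean(counts == 0))     # approx P(Poisson(0.8) = 0) = exp(-0.8)
```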
Example 24.5 Suppose that elapsed times (months) between successive fireballs are independent, each having an Exponential(2.5) distribution. Let \(T\) be the time elapsed from now until the third fireball occurs.
- Compute \(\text{E}(T)\).
- Compute \(\text{SD}(T)\).
- Does \(T\) have an Exponential distribution? Explain.
- Use simulation to approximate the distribution of \(T\).
- A continuous random variable \(X\) has a Gamma distribution with shape parameter \(\alpha\), a positive integer3, and rate parameter4 \(\lambda>0\) if its pdf is \[\begin{align*} f_X(x) & = \frac{\lambda^\alpha}{(\alpha-1)!} x^{\alpha-1}e^{-\lambda x} , & x \ge 0,\\ & \propto x^{\alpha-1}e^{-\lambda x} , & x \ge 0 \end{align*}\] If \(X\) has a Gamma(\(\alpha\), \(\lambda\)) distribution then \[\begin{align*} \text{E}(X) & = \frac{\alpha}{\lambda}\\ \text{Var}(X) & = \frac{\alpha}{\lambda^2}\\ \text{SD}(X) & = \frac{\sqrt{\alpha}}{\lambda} \end{align*}\]
- If \(W_1, \ldots, W_n\) are independent and each \(W_i\) has an Exponential(\(\lambda\)) distribution then \((W_1+\cdots+W_n)\) has a Gamma(\(n\),\(\lambda\)) distribution.
- Each \(W_i\) represents the waiting time between two occurrences of some event, so \(W_1 + \cdots+ W_n\) represents the total waiting time until a total of \(n\) occurrences.
- Exponential distributions are continuous analogs of Geometric distributions, and Gamma distributions are continuous analogs of Negative Binomial distributions.
- For a positive integer \(d\), the Gamma(\(d/2, 1/2\)) distribution is also known as the chi-square distribution with \(d\) degrees of freedom.
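The sum-of-Exponentials fact can be checked by simulation: summing \(n\) independent Exponential\((\lambda)\) waiting times and comparing against the Gamma\((n, \lambda)\) distribution. A sketch assuming numpy/scipy, using \(n = 3\) and the fireball rate 2.5 from Example 24.5 (scipy's `gamma` takes the shape as `a` and `scale` \(= 1/\lambda\)):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, rate = 3, 2.5
reps = 100_000

# T = sum of 3 independent Exponential(2.5) interarrival times
T = rng.exponential(scale=1 / rate, size=(reps, n)).sum(axis=1)

print(T.mean(), T.std())   # approx n/lambda = 1.2 and sqrt(n)/lambda ≈ 0.693
print(np.mean(T <= 1))     # simulated P(T <= 1)
print(stats.gamma(a=n, scale=1 / rate).cdf(1))   # Gamma(3, 2.5) cdf at 1
```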
Thanks to former STAT 305 student Martin Hsu for this example and data.↩︎
Geometric distributions are the only discrete distributions with the discrete analog of the memoryless property.↩︎
There is a more general expression of the pdf which replaces \((\alpha-1)!\) with the Gamma function \(\Gamma(\alpha)=\int_0^\infty u^{\alpha-1}e^{-u} du\), that can be used to define a Gamma pdf for any \(\alpha>0\). When \(\alpha\) is a positive integer, \(\Gamma(\alpha)=(\alpha-1)!\).↩︎
Like Exponential distributions, Gamma distributions are sometimes parametrized directly by their mean \(1/\lambda\), instead of the rate parameter \(\lambda\). The mean \(1/\lambda\) is called the scale parameter.↩︎