26  Null Hypothesis Testing

Example 26.1 Did the New England Patriots deliberately deflate footballs in an NFL playoff game against the Indianapolis Colts on January 18, 2015? The following table [1] summarizes the decrease in air pressure (psi), from pre-game to halftime, for the footballs from each team that were checked at halftime by officials.

Team      Deflation (psi)                                          Size  Mean   SD
Patriots  1.00 1.65 1.35 1.80 1.40 0.90 0.65 1.40 1.55 2.00 1.60   11    1.391  0.402
Colts     0.30 0.25 0.50 0.45                                      4     0.375  0.119

  1. Compute the observed difference in sample means.




  2. If there were no difference between the Patriots and the Colts footballs, what would you expect the difference in sample means to be? Is the fact that we observed a different value necessarily evidence that the Patriots footballs tended to deflate more than the Colts footballs?




  3. Suppose we want to assess if it is plausible that the Patriots didn’t interfere with the footballs, and the observed difference in sample means is just due to chance variability. Specify how the “null distribution” of the statistic could be simulated with cards. Note: this is NOT bootstrapping; how is it different?




  4. Use cards to perform one repetition of the above simulation, record the results here, and compute the appropriate difference in means.

Team      Deflation (psi)                                          Size  Mean   SD
Patriots                                                           11
Colts                                                              4

  5. The following plot displays results from an applet used to run 10000 repetitions of the simulation (the plot on the left displays the results from just the first 100 repetitions). Estimate the probability of observing a difference in sample means as extreme as the observed difference just by chance if there were no real difference between the Patriots and Colts footballs.

Figure 26.1: Permutation simulation of null distribution in Example 26.1. (a) First 100 repetitions; (b) 10000 repetitions.

  6. Is it plausible that the observed difference in means is just due to random chance? To support your answer, interpret the probability from the previous part in context.




p-value                    Evidence to reject the null hypothesis?
Above 0.10                 Little to no evidence
Between 0.01 and 0.10      Weak evidence
Between 0.001 and 0.01     Moderate evidence
Between 0.0001 and 0.001   Strong evidence
Less than 0.0001           Very strong evidence
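
The card simulation in Example 26.1 can also be sketched in code. The following is a minimal illustration, not the applet behind Figure 26.1: it pools the 15 deflation values from the table, repeatedly shuffles them, deals 11 to a "Patriots" pile and 4 to a "Colts" pile, and records how often the shuffled difference in means is at least as large as the observed one. The function names, the seed, and the choice of a one-sided comparison (Patriots minus Colts at least as large as observed) are assumptions made for this sketch.

```python
import random

# Halftime pressure drops (psi) copied from the table in Example 26.1.
patriots = [1.00, 1.65, 1.35, 1.80, 1.40, 0.90, 0.65, 1.40, 1.55, 2.00, 1.60]
colts = [0.30, 0.25, 0.50, 0.45]


def diff_in_means(group_a, group_b):
    """Mean of group_a minus mean of group_b."""
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)


def permutation_p_value(group_a, group_b, reps=10_000, seed=26):
    """Approximate one-sided p-value by shuffling the group labels.

    Unlike bootstrapping, each repetition reshuffles the pooled values
    WITHOUT replacement, so every value is used exactly once.
    """
    rng = random.Random(seed)  # hypothetical seed, only for reproducibility
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = diff_in_means(group_a, group_b)
    at_least_as_extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        simulated = diff_in_means(pooled[:n_a], pooled[n_a:])
        # Small tolerance guards against floating-point rounding in ties.
        if simulated >= observed - 1e-9:
            at_least_as_extreme += 1
    return at_least_as_extreme / reps


observed_diff = diff_in_means(patriots, colts)
p_value = permutation_p_value(patriots, colts)
print(f"observed difference in means: {observed_diff:.3f} psi")
print(f"approximate p-value: {p_value}")
```

The approximate p-value changes slightly with the seed and the number of repetitions; it can be read against the evidence table above in the same way as the applet's result.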

Example 26.2 Carrying a heavy backpack can cause chronic shoulder, neck, and back pain. A common recommendation is that a daily backpack should not weigh more than 10 percent of the wearer’s body weight. Do Cal Poly students follow this recommendation? That is, do the backpacks of Cal Poly students weigh, on average, less than 10 percent of their body weight? In a random sample of 100 Cal Poly students, the sample mean backpack weight as a percent of body weight is 7.71 percent with a sample standard deviation of 0.366 percent.

  1. Identify the main parameter of interest with both words and an appropriate symbol.




  2. Translate the research question into a hypothesis testing problem by specifying, with both words and appropriate symbols, the null hypothesis and the alternative hypothesis.




  3. Identify the null distribution of sample means. (You might need to make some assumptions.)




  4. Compute the p-value.




  5. Interpret the p-value in context.




  6. Does the p-value provide evidence that backpacks of Cal Poly students weigh, on average, less than 10% of their body weight? Explain.




  7. Does the p-value provide evidence that backpacks of Cal Poly students weigh, on average, much less than 10% of their body weight? Explain.




  8. How could we address the question in the previous part?




  9. Do Cal Poly students follow the 10% recommendation? Is the analysis above the only way to address this question? Can you suggest another?




One-sample \(t\) test
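
The arithmetic behind Example 26.2 can be sketched as follows, using only the summary statistics given in the example. This sketch assumes the null value is 10 percent and the alternative is that the population mean is less than 10 percent. It also assumes that, with 100 students, the \(t\) distribution with 99 degrees of freedom is close enough to the standard normal that Python's standard-library `NormalDist` can stand in for a \(t\) table. The variable names and the 95% confidence level are illustrative choices.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics from Example 26.2 (percent of body weight).
n = 100
sample_mean = 7.71
sample_sd = 0.366
null_mean = 10.0  # assumed null value: the 10 percent recommendation

# Standard error of the sample mean and the standardized statistic.
standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - null_mean) / standard_error

# Left-tailed p-value, using the normal approximation to the t distribution.
p_value = NormalDist().cdf(t_stat)

# Approximate 95% confidence interval for the population mean, which speaks
# to HOW FAR below 10 percent the mean is, not just whether it is below.
z_star = NormalDist().inv_cdf(0.975)
ci_low = sample_mean - z_star * standard_error
ci_high = sample_mean + z_star * standard_error

print(f"standard error: {standard_error:.4f}")
print(f"standardized statistic: {t_stat:.2f}")
print(f"approximate p-value: {p_value:.3g}")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

The p-value only measures the strength of evidence against the null value; the interval is what indicates the size of the gap between the mean and 10 percent.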

Example 26.3 We continue with the Ames housing data set. Now suppose we want to compare sale prices of single-family homes with those of other homes (including condos, townhouses, etc.). In particular, do single-family homes tend to have higher sale prices on average than other homes?

The table below summarizes the sample data.

SingleFamily  count   mean        std        min     25%    50%    75%    max
False         505.0   161.511398  60.394023  55.000  120.0  147.4  190.0  392.5
True          2425.0  184.812041  82.821802  12.789  130.0  165.0  220.0  755.0

  1. Explain how you could use simulation to approximate the p-value.




  2. Use the sample data to conduct by hand an appropriate hypothesis test.




  3. Write a clearly worded sentence interpreting the p-value in context.




  4. Does the p-value provide evidence that single family homes tend to have higher sale prices on average than other homes? Explain.




  5. Does the p-value provide evidence that single family homes tend to have much higher sale prices on average than other homes? Explain.




  6. How could we address the question in the previous part?




Two-sample \(t\) test for a difference in population (or treatment) means.
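
The by-hand calculation in Example 26.3 can be checked with a short sketch that uses only the counts, means, and standard deviations from the summary table. It assumes an unpooled (Welch-style) standard error and a one-sided alternative that single-family homes have the higher mean. Because both samples are large, it uses the standard normal from Python's standard library in place of the \(t\) distribution. Variable names and the 95% confidence level are illustrative, and prices are in whatever units the table uses.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics copied from the table in Example 26.3.
n_single, mean_single, sd_single = 2425, 184.812041, 82.821802
n_other, mean_other, sd_other = 505, 161.511398, 60.394023

# Observed difference in sample means (single family minus other).
diff = mean_single - mean_other

# Unpooled standard error of the difference in sample means.
standard_error = sqrt(sd_single**2 / n_single + sd_other**2 / n_other)

# Standardized statistic under the null hypothesis of no difference.
t_stat = diff / standard_error

# Right-tailed p-value via the normal approximation; cdf(-t) is used
# instead of 1 - cdf(t) to keep precision in the far tail.
p_value = NormalDist().cdf(-t_stat)

# Approximate 95% confidence interval for the difference in population means.
z_star = NormalDist().inv_cdf(0.975)
ci_low = diff - z_star * standard_error
ci_high = diff + z_star * standard_error

print(f"difference in means: {diff:.2f}")
print(f"standard error: {standard_error:.2f}")
print(f"standardized statistic: {t_stat:.2f}")
print(f"approximate p-value: {p_value:.3g}")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

As in the backpack example, the p-value addresses whether there is a difference, while the interval addresses how large the difference plausibly is.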

Example 26.4 In Example 26.2 and Example 26.3 we performed a null hypothesis test and also computed a confidence interval. Discuss the advantages and disadvantages of each approach.





  1. Source: http://a.espncdn.com/pdf/2015/0506/PatriotsWellsReport.pdf