Bootstrap: the sample serves as a proxy for the unknown population. Resampling with replacement from that proxy population (i.e., the sample) generates an approximate sampling distribution for a statistic.
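A minimal Python sketch of the resampling idea described above, using only the standard library. The data values, the function names `bootstrap_distribution` and `percentile_interval`, the 2,000 resamples, and the simple index rule for the percentile cutoffs are all illustrative assumptions, not taken from the class notes.

```python
import random
import statistics


def bootstrap_distribution(sample, statistic, n_boot=2000, seed=0):
    """Resample the sample with replacement and recompute the statistic each time."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample has the same size as the original sample.
    return [statistic(rng.choices(sample, k=n)) for _ in range(n_boot)]


def percentile_interval(boot_stats, level=0.95):
    """Return the middle `level` share of the bootstrap statistics (simple index rule)."""
    ordered = sorted(boot_stats)
    alpha = (1 - level) / 2
    lower = ordered[int(alpha * len(ordered))]
    upper = ordered[int((1 - alpha) * len(ordered)) - 1]
    return lower, upper


# Made-up data, for illustration only.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]

boot_means = bootstrap_distribution(sample, statistics.mean)
boot_se = statistics.stdev(boot_means)      # bootstrap estimate of the standard error
lower, upper = percentile_interval(boot_means)

print(f"sample mean: {statistics.mean(sample):.3f}")
print(f"bootstrap SE: {boot_se:.3f}")
print(f"95% percentile interval: ({lower:.3f}, {upper:.3f})")
```

The standard deviation of the bootstrap statistics plays the role of the standard error, and the percentile interval reads its endpoints directly off the sorted bootstrap statistics.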
Class notes: Bootstrapping
Baumer, Horton, and Kaplan (2021), The bootstrap (Chp 9.3) in Modern Data Science for R.
James, Witten, Hastie, and Tibshirani (2021), The Bootstrap (section 5.2) in An Introduction to Statistical Learning.
Why would anyone ever want to bootstrap?
What are the differences between a normal CI with Boot SE, a Bootstrap-t CI, and a percentile interval?
Why do we need to bootstrap twice for the Bootstrap-t CI?
What makes a confidence interval procedure good?
Why isn’t the bootstrap method a solution for the situation of small sample sizes?
Why isn’t the bootstrap method a solution for the situation with biased / unrepresentative data?
Consider a population with a maximum value (the parameter of interest). Will the sample max have a sampling distribution which is centered on the true maximum? Why or why not? [Quintessential example of how a statistic can be biased for the parameter.]
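One way to explore the sample-maximum question is a quick simulation. The sketch below assumes a Uniform(0, `true_max`) population with `true_max = 10` and samples of size 20; the population, the sample size, and the function name `simulate_sample_max` are hypothetical choices made only for illustration.

```python
import random


def simulate_sample_max(true_max=10.0, n=20, n_sims=5000, seed=1):
    """Draw many samples from Uniform(0, true_max) and record each sample's maximum."""
    rng = random.Random(seed)
    return [
        max(rng.uniform(0, true_max) for _ in range(n))
        for _ in range(n_sims)
    ]


maxes = simulate_sample_max()
mean_of_maxes = sum(maxes) / len(maxes)

# No sample maximum can exceed the population maximum, so the sampling
# distribution sits entirely at or below the parameter.
print(f"largest simulated sample max: {max(maxes):.3f}")
print(f"average of the sample maxes:  {mean_of_maxes:.3f}")
```

Under this uniform assumption the expected sample maximum is n / (n + 1) times the true maximum, so the average of the simulated maxima falls below 10 rather than centering on it.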
In-class slides for both 10/5/21 and 10/7/21.
StatKey applets which demonstrate bootstrapping.
Confidence interval logic from the Rossman & Chance applets.
The Role of Statistical Learning in Applied Statistics: Daniela Witten talks to Rafa Irizarry, June 15, 2020.
Five ways to fix statistics, Nature, Nov 28, 2017.
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hardin47/m154-comp-stats, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".